Entering the World-Wide Web: A Guide to Cyberspace
Honolulu Community College
What is the World-Wide Web?
For fifty years, people have dreamt of the concept of a universal information database - data that would not only be accessible to people around the world, but information that would link easily to other pieces of information so that the most important data could be found quickly by a user. In the 1960s this idea was explored further, giving rise to visions of a "docuverse" that people could swim through, revolutionizing all aspects of human-information interaction, particularly in the educational field. Only now has the technology caught up with these dreams, making it possible to implement them on a global scale.
The project's official description presents the World-Wide Web as a "wide-area hypermedia information retrieval initiative aiming to give universal access to a large universe of documents". What the World-Wide Web (WWW, W3) project has done is provide users on computer networks with a consistent means to access a variety of media in a simplified fashion. Using a popular software interface to the Web called Mosaic, the Web project has changed the way people view and create information - it has created the first true global hypermedia network.
What is hypertext and hypermedia?
The operation of the Web relies on hypertext as its means of interacting with users. Hypertext is basically the same as regular text - it can be stored, read, searched, or edited - with an important exception: hypertext contains connections within the text to other documents.
For instance, suppose you were able to somehow select (with a mouse or with your finger) the word "hypertext" in the sentence before this one. In a hypertext system, you would then have one or more documents related to hypertext appear before you - a history of hypertext, for example, or the Webster's definition of hypertext. These new texts would themselves have links and connections to other documents - continually selecting text would take you on a free-associative tour of information. In this way, hypertext links, called hyperlinks, can create a complex virtual web of connections.
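The web of connections described above can be pictured as a simple graph. The following is a minimal sketch, not a description of how any Web software stores its links: the document titles and the LINKS table are invented for illustration, and the reachable function is a hypothetical helper that follows every link it can find, starting from one document.

```python
from collections import deque

# Hypothetical documents: each title maps to the titles it links to.
LINKS = {
    "What is hypertext?": ["History of hypertext", "Dictionary: hypertext"],
    "History of hypertext": ["What is hypertext?", "The docuverse"],
    "Dictionary: hypertext": [],
    "The docuverse": ["History of hypertext"],
}


def reachable(start, links):
    """Return every document reachable from start, in breadth-first order."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in seen:  # skip documents already visited
                seen.append(target)
                queue.append(target)
    return seen
```

Calling reachable("What is hypertext?", LINKS) visits all four documents, even though the starting document links directly to only two of them.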
Hypermedia is hypertext with a difference - hypermedia documents contain links not only to other pieces of text, but also to other forms of media - sounds, images, and movies. Images themselves can be selected to link to sounds or documents. Here are some simple examples of hypermedia:
- You are reading a text on the Hawaiian language. You select a Hawaiian phrase, then hear the phrase as spoken in the native tongue.
- You are a law student studying the Hawaii Revised Statutes. By selecting a passage, you find precedents from a 1920 Supreme Court ruling stored at Cornell. Cross-referenced hyperlinks allow you to view any one of 520 related cases with audio annotations.
- Looking at a company's floorplan, you are able to select an office by touching a room. The employee's name and picture appear with a list of their current projects.
- You are a scientist doing work on the cooling of steel springs. By selecting text in a research paper, you are able to view a computer-generated movie of a cooling spring. By selecting a button you are able to receive a program which will perform thermodynamic calculations.
- A student reading a digital version of an art magazine can select a work to print or display in full. If the piece is a sculpture, she can request to see a movie of the sculpture rotating. By interactively controlling the movie, she can zoom in to see more detail.
The Web, although still in its early years, allows many of these examples to work in real life. It facilitates the easy exchange of hypermedia through networked environments from anything as small as two Macintoshes connected together to something as large as the global Internet.
What is the Internet?
The Internet is the catch-all word used to describe the massive world-wide network of computers. The word "internet" literally means "network of networks". The Internet itself comprises thousands of smaller regional networks scattered throughout the globe. On any given day it connects roughly 15 million users in over 50 countries. The World-Wide Web is mostly used on the Internet, but the two terms do not mean the same thing. The Web refers to a body of information - an abstract space of knowledge - while the Internet refers to the physical side of the global network, a giant mass of cables and computers.
How was the Web created?
The Web began in March 1989, when Tim Berners-Lee of CERN (a collective of European high-energy physics researchers) proposed the project as a means of transporting research and ideas effectively throughout the organization. Effective communication had been a goal of CERN's for many years, as its members were located in a number of countries.
How popular is the Web?
From January to August 1993, the amount of network traffic (in bytes) across the National Science Foundation's (NSF's) North American network attributed to Web use grew 414-fold. The Web is now ranked 13th of all network services in terms of sheer byte traffic; in January its rank was 127. Today there are at least 100 hypertext Web servers in use throughout the world.
Since its inception, the CERN Web server traffic has doubled every four months - twice the rate of Internet expansion.
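To put these growth figures on a common scale, here is a rough calculation. It assumes smooth exponential growth and treats January to August as about seven months; both are simplifying assumptions, not figures from the traffic reports. The function names are hypothetical.

```python
import math


def doubling_time(growth_factor, months):
    """Months per doubling, assuming smooth exponential growth."""
    return months / math.log2(growth_factor)


def growth_over(months, doubling_months):
    """Growth factor after `months` if traffic doubles every `doubling_months`."""
    return 2 ** (months / doubling_months)


# 414-fold growth over roughly seven months (January to August 1993).
nsf_doubling = doubling_time(414, 7)

# CERN server traffic doubling every four months, compounded over one year.
cern_yearly = growth_over(12, 4)
```

Under these assumptions, Web traffic on the NSF network doubled in a little under a month, while a server whose traffic doubles every four months sees an eightfold increase in a year.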
World-Wide Web growth.
Honolulu Community College officially announced the opening of its hypermedia server - the first Web server in Hawaii - at the end of May 1993. By September of that year (after 105 days of service), it had received over 23,000 requests for documents and over 112,000 requests for assets from nearly 5,000 separate hosts on the network. From September 1 to 7 it received traffic from over 600 separate hosts, an all-time high. It is expected that traffic will increase further as the school year begins and student involvement in the Web increases.
Since the site's opening, HCC has received virtual visitors from Xerox, Digital Equipment Corporation, Apple Computer, Cray, IBM, MIT's Media Lab, NEC, Sony, Fujitsu, Intel, Rockwell, Boeing, Honeywell, and AT&T (which has been one of the most frequent visitors), among hundreds of other corporate sites on the Internet.
Collegiate visitors have originated from campuses such as Stanford, Harvard, Carnegie-Mellon, Cornell, MIT, Michigan State, Rutgers, Purdue, Rice, Georgia Tech, Columbia, University of Texas, and Washington University, as well as other campuses in the United Kingdom, Germany, and Denmark, to name but a few.
Governmental visitors have come from various departments in NASA, including their Jet Propulsion Laboratory, Lawrence Livermore National Laboratory, the National Institutes of Health, the Superconducting Super Collider project, and the USDA, as well as government sites in Singapore and Australia. A few dozen Army and Navy sites throughout the world have browsed around as well.
Because HCC's server began operation when there were relatively few such sites in the world, and in part due to its popularity, the growth in traffic has closely reflected the growth of the Web. Further analysis of HCC's server logs indicates the following breakdown in classifications:
Although it is impossible to know for sure, it can be guessed that the largest segment roaming the World-Wide Web consists of four-year campus populations within the United States.
What is Mosaic?
Months after CERN's original proposal, the National Center for Supercomputing Applications (NCSA) began a project to create an interface to the World-Wide Web. One of NCSA's missions is to aid the scientific research community by producing widely available, non-commercial software. Another of its goals is to investigate new research technologies in the hope that commercial interests will be able to profit from them. In these ways, the Web project was quite appropriate. The NCSA's Software Design Group began work on a versatile, multi-platform interface to the World-Wide Web, and called it Mosaic.
In the first half of 1993, the first version of NCSA's Web browser was made available to the Internet community. Because earlier beta versions were distributed, Mosaic had developed a strong yet small following by the time it was officially released.
Because of the number of traditional services it could handle, and due to its easy, point-and-click hypermedia interface, Mosaic soon became the most popular interface to the Web. Currently, versions of Mosaic can run on Suns, Silicon Graphics workstations, IBM-compatibles running Microsoft Windows, Macintoshes, and computers running various other forms of UNIX.
What can Mosaic do?
Mosaic running on every supported computer should have the following features:
- A consistent mouse-driven graphical interface.
- The ability to display hypertext and hypermedia documents.
- The ability to display electronic text in a variety of fonts.
- The ability to display text in bold, italic, or strikethrough styles.
- The ability to display layout elements such as paragraphs, numbered and bulleted lists, and quoted paragraphs.
- Support for sounds (Macintosh, Sun audio format, and others).
- Support for movies (MPEG-1 and QuickTime).
- The ability to display characters as defined in the ISO 8859 set (it can display languages such as French, German, and Hawaiian).
- Interactive electronic forms support, with a variety of basic forms elements, such as fields, check boxes, and radio buttons.
- Support for interactive graphics (in GIF or XBM format) of up to 256 colors within documents.
- The ability to make basic hypermedia links to and support for the following network services: ftp, gopher, telnet, nntp, WAIS.
- The ability to extend its functionality by creating custom servers (comparable to XCMDs in HyperCard).
- The ability to have other applications control its display remotely.
- The ability to broadcast its contents to a network of users running multiplatform groupware such as NCSA's Collage.
- Support for the current standards of HTTP and HTML.
- The ability to keep a history of travelled hyperlinks.
- The ability to store and retrieve a list of URLs for future use.
What is available on the Web?
Currently the Web offers the following through a hypertext, and in some cases, hypermedia interface:
- Anything served through Gopher
- Anything served through Wide Area Information Server (WAIS)
- Anything served through anonymous FTP sites
- Full Archie services (an FTP search service)
- Full Veronica services (a Gopher search service)
- Full CSO, X.500, and whois services (Internet phone book services)
- Full finger services (an Internet user lookup program)
- Any library system using PALS (a library database standard)
- Anything on Usenet
- Anything accessible through telnet
- Anything in hytelnet (a hypertext interface to telnet)
- Anything in techinfo or texinfo (forms of campus-wide information services)
- Anything in hyper-g (a networked hypertext system in use throughout Europe)
- Anything in the form of man pages
- HTML-formatted hypertext and hypermedia documents
How does the Web work?
The Web works under the popular client-server model. A Web server is a program running on a computer whose only purpose is to serve documents to other computers when asked to. A Web client is a program that interfaces with the user and requests documents from a server as the user asks for them. Because the server does a minimal amount of work (it does not perform any calculations) and only operates when a document is requested, it puts a minimal amount of workload on the computer running it.
Here's an example of how the process works:
- Running a Web client (also called a browser), the user selects a piece of hypertext connected to another text - "The History of Computers".
- The Web client connects to a computer specified by a network address somewhere on the Internet and asks that computer's Web server for "The History of Computers".
- The server responds by sending the text and any other media within that text (pictures, sounds, or movies) to the user's screen.
The World-Wide Web is composed of thousands of these virtual transactions taking place per hour throughout the world, creating a web of information flow.
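The request-and-response steps above can be sketched with Python's standard library. This is only an illustration under stated assumptions, not how any particular Web server or browser is built: the DOCUMENTS table, the /history.html path, and the names DocumentHandler, fetch, and demo are all made up for the example, and the server listens only on the local machine.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical document store: request path -> document text.
DOCUMENTS = {"/history.html": "<h1>The History of Computers</h1>"}


class DocumentHandler(BaseHTTPRequestHandler):
    """Server side: hand back a stored document when a client asks for it."""

    def do_GET(self):
        body = DOCUMENTS.get(self.path)
        if body is None:
            self.send_error(404, "Document not found")
            return
        data = body.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch(host, port, path):
    """Client side: request one document and return (status, text)."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode("ascii", "replace")
    finally:
        conn.close()


def demo():
    """Start a local server, fetch one document from it, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch("127.0.0.1", server.server_port, "/history.html")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() plays both roles in one process: the server waits until asked, and the client receives the stored document along with a status code of 200.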
Future Web servers will include encryption and client authentication abilities - they will be able to send and receive secure data and be more selective as to which clients receive information. This will allow freer communications among Web users and will make sure that sensitive data is kept private. It will be harder to compromise the security of commercial servers and educational servers which wish to keep information local. Improvements in security will facilitate the idea of "pay-per-view" hypermedia, a concept which many commercial interests are currently pursuing.
The language that Web clients and servers use to communicate with each other is called the HyperText Transfer Protocol (HTTP). All Web clients and servers must be able to speak HTTP in order to send and receive hypermedia documents. For this reason, Web servers are often called HTTP servers.
The phrase "World-Wide Web" is often used to refer to the collective network of servers speaking HTTP as well as the global body of information available using the protocol.
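For a feel of what "speaking HTTP" involves, the sketch below builds the text of a request and picks apart the first line of a reply. It uses the HTTP/1.0 message layout, which was standardized after this guide was written, so treat it as an illustration of the general idea rather than the exact protocol in use in 1993. The function names are hypothetical.

```python
def build_request(path):
    """Return the text of a bare HTTP/1.0 GET request for `path`."""
    # A request line, then an empty line marking the end of the request.
    return "\r\n".join([f"GET {path} HTTP/1.0", "", ""])


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.0 200 OK' into its three parts."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason
```

The request is plain text: a method (GET), the document being asked for, and a protocol version. The reply begins with a status line whose numeric code tells the client whether the document was found.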
The standard language the Web uses for creating and recognizing hypermedia documents is the HyperText Markup Language (HTML). It is loosely related to, but technically not a subset of, the Standard Generalized Markup Language (SGML), a document formatting language used widely in some computing circles.
HTML is widely praised for its ease of use. Web documents are typically written in HTML and are usually named with the suffix ".html". HTML documents are nothing more than standard 7-bit ASCII files with formatting codes that contain information about layout (text styles, document titles, paragraphs, lists) and hyperlinks. Many free software convertors are available for translating documents in foreign formats to HTML.
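Because an HTML document is ordinary text with formatting codes, a short program can read one and pick out its hyperlinks. The sketch below uses Python's standard html.parser module; the SAMPLE document and the names LinkCollector and extract_links are invented for illustration.

```python
from html.parser import HTMLParser

# A small, made-up HTML document containing two hyperlinks.
SAMPLE = """<title>Sample page</title>
<h1>Hypertext</h1>
<p>Read the <a href="history.html">history of hypertext</a> or
browse the <a href="gopher://pulua.hcc.hawaii.edu">campus Gopher</a>.</p>"""


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the hyperlink targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Here extract_links(SAMPLE) returns the two link targets in the order they appear, and SAMPLE.isascii() confirms that the whole document is plain ASCII text.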
The current HTML standard (HTML) supports basic hypermedia document creation and layout, but for current use it is still limited. The latest version of HTML, called HTML+, is still under development but will probably be completely defined by the end of 1993. HTML+ will support interactive forms, defined "hotspots" in images, more versatile layout and formatting options and styles, and formatted tables, among many other improvements.
HTML uses what are called Uniform Resource Locators (URLs) to represent hypermedia links and links to network services within documents. It is possible to represent nearly any file or service on the Internet with a URL.
The first part of the URL (before the two slashes) specifies the method of access. The second is typically the address of the computer where the data or service is located. Further parts may specify the names of files, the port to connect to, or the text to search for in a database.
Here are some examples of URLs:
- file://pulua.hcc.hawaii.edu/sound.au - Retrieves a sound file and plays it.
- file://pulua.hcc.hawaii.edu/picture.gif - Retrieves a picture and displays it, either in a separate program or within a hypermedia document.
- file://pulua.hcc.hawaii.edu/directory/ - Displays a directory's contents.
- http://pulua.hcc.hawaii.edu/directory/book.html - Connects to an HTTP server and retrieves an HTML file.
- ftp://pulua.hcc.hawaii.edu/pub/file.txt - Opens an FTP connection to pulua.hcc.hawaii.edu and retrieves a text file.
- gopher://pulua.hcc.hawaii.edu - Connects to the Gopher at pulua.hcc.hawaii.edu.
- telnet://pulua.hcc.hawaii.edu:1234 - Telnets to pulua.hcc.hawaii.edu at port 1234.
- news:alt.hypertext - Reads the latest Usenet news by connecting to a user-specified news (NNTP) host and returns the articles in hypermedia format.
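The parts of a URL described above can be pulled apart with Python's standard urllib.parse module. This is a small sketch using the example addresses from this guide; the describe function and the field names in its result are hypothetical.

```python
from urllib.parse import urlsplit


def describe(url):
    """Break a URL into its access method, host, port, and path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,  # method of access, e.g. "http" or "telnet"
        "host": parts.hostname,  # None when the URL names no computer
        "port": parts.port,      # None when no port is given
        "path": parts.path,      # file name, directory, or newsgroup
    }
```

For telnet://pulua.hcc.hawaii.edu:1234 the result has the scheme "telnet", the host "pulua.hcc.hawaii.edu", and the port 1234. For news:alt.hypertext there are no slashes and no host, so only the scheme "news" and the path "alt.hypertext" are filled in.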
Most Web browsers allow the user to specify a URL and connect to that document or service. When selecting hypertext in an HTML document, the user is actually sending a request to open a URL. In this way, hyperlinks can be made not only to other texts and media, but also to other network services. Web browsers are not simply Web clients, but are also full-featured FTP, Gopher, and telnet clients.
HTML+ will include an email URL, so hyperlinks can be made to send email automatically. For instance, selecting an email address in a piece of hypertext would open a mail program, ready to send email to that address.
What software is available?
World-Wide Web clients (browsers) are available for the following platforms and environments:
- Text-only (dumb) terminal, nearly any platform
- UNIX, text-only using curses, for SunOS 4, AIX, Alpha, Ultrix
- X11/Motif, for IRIX (Silicon Graphics), SunOS 4, RS/6000, DEC Alpha/OSF 1, DEC Ultrix.
- NeXT, for NeXTStep 3.0
- IBM compatibles, 386 and above, under Microsoft Windows
- Macintosh computers, Classic and above
- Browsers written in perl are available.
- Browsers written for the emacs environment are available.
World-Wide Web servers are available for the following platforms and environments:
- VM, VMS