web: the unparalleled genius of Sir Tim

It is 30 years since the invention of the World-Wide Web.

Tim Berners-Lee stood on the shoulders of giants, but the Web wasn’t just an amalgamation of existing ideas. He and Robert Cailliau created:

  • A HyperText Markup Language, human-readable but easy enough for machines to parse and generate. A web page is a complete HTML document.
  • A means of referencing HTML pages across the Internet, called Uniform Resource Locators. But a URL can refer to any kind of resource, not just web pages but also plain text, pictures, other files, and even notions like “the last 7 blog posts by skierpage.” (And actually URLs are a specialization of Uniform Resource Identifiers, which let you refer to other protocols, e.g. mailto:skierpage@example.com?Subject=hello.)
  • A specification of the protocol by which a client requests a URL from a server computer and the server responds with the requested document, called HyperText Transfer Protocol. (A minimal sketch of this exchange follows this list.)
  • Free open source software that implemented all this:
    • a software library implementing the protocol
    • software for an HTTP server, called httpd (the “d” is for daemon)
    • software to display and edit HTML pages
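
To make that concrete, here is a minimal sketch of the request-response exchange that HTTP specifies, written in modern Python purely as an illustration; the host and path are placeholders, not a real early-Web server. The client sends a short, human-readable request naming the path from the URL, and the server sends back the document.

    # Sketch of an HTTP GET, roughly as an early HTTP/1.0 client did it.
    # "example.org" and "/index.html" are placeholders for illustration.
    import socket

    host, path = "example.org", "/index.html"

    with socket.create_connection((host, 80)) as sock:
        # The request is plain text: method, path, protocol version,
        # a Host header, then a blank line.
        sock.sendall(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii"))

        # The response is a status line, headers, a blank line, then the HTML.
        response = b""
        while chunk := sock.recv(4096):
            response += chunk

    headers, _, body = response.partition(b"\r\n\r\n")
    print(headers.decode("ascii", "replace"))      # e.g. "HTTP/1.0 200 OK" plus headers
    print(body[:200].decode("utf-8", "replace"))   # the start of the HTML document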

Nothing new here?

As this great 30-year summary makes clear, there was a ton of prior art.

Markup languages weren’t new

HTML identifies blocks of text as <P>(aragraph), <H1> heading level 1, etc. and spans of text as <B>old, etc. The idea of marking up blocks of text instead of inserting typesetter codes for a particular printer wasn’t new, and HTML was a simplification of the existing SGML.
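
And HTML really is easy for a machine to take apart. As a hedged sketch (using Python’s standard html.parser, not anything the original Web software used), here is a few-line program that walks a scrap of markup and prints which tag each bit of text belongs to:

    # Sketch: parsing a scrap of HTML with Python's built-in parser, to show
    # how directly the block and span tags map onto the document's structure.
    from html.parser import HTMLParser

    class OutlineParser(HTMLParser):
        def __init__(self):
            super().__init__()
            self.stack = []                    # currently open tags, outermost first

        def handle_starttag(self, tag, attrs):
            self.stack.append(tag)

        def handle_endtag(self, tag):
            if self.stack and self.stack[-1] == tag:
                self.stack.pop()

        def handle_data(self, data):
            if self.stack and data.strip():
                print(self.stack[-1].upper(), repr(data.strip()))

    page = "<H1>Widget 9000</H1><P>The <B>best</B> widget money can buy.</P>"
    OutlineParser().feed(page)
    # Prints:
    # H1 'Widget 9000'
    # P 'The'
    # B 'best'
    # P 'widget money can buy.'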

Hypertext wasn’t new

In fact, in a related article John Allsopp says “Tim Berners-Lee proposed a paper about the WWW for Hypertext ’91, a conference for hypertext theory and research. It was rejected! It was considered very simple in comparison with what hypertext systems were supposed to do.” !!!!

The moment you put technical information on a screen, it is completely obvious that the reader should be able to jump to explanations of technical terms, from entries in the table of contents and index to the section that’s referenced, and from “See How to Install” to… how to install. In a former life writing technical documentation, I looked at publishing manuals using hypertext systems like Folio and OWL as well as on paper.

Yet another protocol…

Protocols to access remote computers over the Internet weren’t new. There was File Transfer Protocol to transfer files, Simple Mail Transfer Protocol to deliver email (and Post Office Protocol to retrieve it), and even a Gopher protocol to browse information on a remote computer. (Many people using the Internet at the time thought Gopher would be the glue linking between computers.)

At nearly the same time, “Wide Area Information Server (WAIS) is a client–server text searching system that uses the ANSI Standard Z39.50 ‘Information Retrieval Service Definition and Protocol Specifications for Library Applications’ (Z39.50:1988) to search index databases on remote computers.”

So what was new?

In a nutshell, linking within hypertext to another computer system, to possibly get more hypertext, blew people’s fragile little minds.

Those hypertext systems I mentioned operated within a local file. You opened Widget9000Setup.NFO and happily jumped around between sections, index, and paragraphs, but there was no “jump to manufacturer’s server on the Internet for latest service bulletins,” there was no “Here’s a hypertext list of other hypertexts on the Internet about Widget 9000 customizations.” The companies selling hypertext authoring software probably fantasized about getting everyone to buy their proprietary software to author parts of a federated set of hypertexts, but they didn’t have the vision, and a single commercial vendor would have really struggled to establish their file format as a network standard.

A server is hard but powerful

Because links in HTML can go to other computers, the Web requires a separate computer server to respond to requests (although you can open local files on your computer in your browser without a server). The hypertext software makers must have laughed. “So you have to install and run an extra software daemon to respond to all these requests for bits of hypertext? That is the stupidest and most overkill approach imaginable! Just give people a file with all 70 pages and illustrations of our Widget 9000 instruction manual in it.” Remember, by definition there were no manufacturers’ web sites yet, and people and computers communicated over slow modems. Making requests to other computers was theoretically useful, but not just to get the next little chunk of hypertext.

Because everything Sir Tim developed at CERN was open source, the HTTP protocol was relatively simple, and Unix was very common on servers, it turned out that having to run an HTTP server wasn’t a big barrier.
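
It is even less of a barrier now. As a sketch of how small “an HTTP server” can be, Python’s standard library will serve the files in the current directory in a couple of lines (a modern convenience used only to make the point, not the CERN httpd):

    # Serve the current directory over HTTP on port 8000, roughly the job
    # an early httpd did for a site's documents.
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    HTTPServer(("", 8000), SimpleHTTPRequestHandler).serve_forever()
    # Then browse to http://localhost:8000/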

Uniform/Universal/Ubiquitous Resource Locator

The URL itself is genius. There were computers on the Internet that you could contact, mostly run by computer companies and universities and labs like CERN. You could imagine a hypertext page having a “check for Widget 9000 availability” link that would connect to the company’s server as a remote terminal and maybe even simulate pressing C(heck for inventory), then typing the Widget part number and Enter – all the pecking away at keyboards that staff used to do when you asked for a book at the library or checked in for a flight. But the poor hypertext author would have to write a little script for every single computer server. A URL can encode the request as a single thing that fits into the HTML page.
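
That packing-up of a request is easy to see in miniature. A sketch with Python’s urllib (the host and parameter names here are invented for illustration): the request’s parameters are encoded into a query string, and the result is one string an author can drop into a link.

    # Sketch: encoding a "check inventory" request as a single URL string
    # that fits inside an ordinary HTML link. Host and parameters are invented.
    from urllib.parse import urlencode

    query = urlencode({"part": "widget9000", "warehouse": "main"})
    url = f"http://example.com/inventory?{query}"
    print(f'<a href="{url}">check Widget 9000 availability</a>')
    # <a href="http://example.com/inventory?part=widget9000&warehouse=main">...</a>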

It’s human visible

The Widget 9000 availability URL is probably quite complex, maybe http://acmecorp.com/coyote/stockcheck.asp?part=widget9000 . But you can see it in your browser’s location field, it probably makes sense, and it is irresistibly tempting to fiddle with, aka hack: what if I substitute a different part number?
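
Acting on that temptation takes nothing more than editing the location field; here is what the fiddling amounts to, sketched with Python’s urllib on the same made-up URL:

    # Sketch: "hacking" the visible URL by substituting the part parameter.
    from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

    url = "http://acmecorp.com/coyote/stockcheck.asp?part=widget9000"
    parts = urlparse(url)
    params = parse_qs(parts.query)            # {'part': ['widget9000']}
    params["part"] = ["widget9001"]           # what about the next model up?
    print(urlunparse(parts._replace(query=urlencode(params, doseq=True))))
    # http://acmecorp.com/coyote/stockcheck.asp?part=widget9001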

Similarly, you can view the source of an HTML page. I taught myself the rudiments of HTML just by guessing or recognizing what tags like TITLE, P, A HREF=, etc. did. You could write the markup for something simple by hand. (Those golden days are gone now that most web pages are generated on-site by over-complex content management systems, and each loads 10 JavaScript libraries, 7 ad networks, and “Like this on Facebook,” “Tweet this,” and “Pin it” buttons.)

The Web could subsume other systems

Because the client (usually a browser under the command of a person) makes requests to a server, the Web can subsume or impersonate other systems. A simple computer program can output a Gopher category list or a directory listing as a basic HTML page with a bulleted list of links (more on this). As the Web gained mindshare among developers, people built the bridges to all the other protocols, and so a browser turned into a do-anything tool, and URLs became the lingua franca for any kind of request across the Internet.
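
Those bridges could be almost trivially small. As a hedged sketch of the idea (not CERN’s or anyone’s actual gateway code), here is a program that renders a directory listing as the kind of basic HTML page of links described above:

    # Sketch: a tiny "gateway" that turns a directory listing into an HTML
    # page with a bulleted list of links, the way early gateways wrapped
    # Gopher menus, FTP directories, and other non-Web sources.
    import html
    from pathlib import Path

    def listing_as_html(directory: str) -> str:
        links = "\n".join(
            f'  <li><a href="{html.escape(item.name)}">{html.escape(item.name)}</a></li>'
            for item in sorted(Path(directory).iterdir())
        )
        return f"<h1>Index of {html.escape(directory)}</h1>\n<ul>\n{links}\n</ul>"

    print(listing_as_html("."))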
