Part II of “A Brief(ish) History of the Web Universe” aka “The Boring Posts”. No themes, no punch, just history that I hope I can use to help explain where my own perspectives on a whole bunch of things draw from…
Tim Berners-Lee was working at CERN which was, by most measures, pretty large. Budgets and official policies were, as they are in many large organizations, pretty rigid and a little bureaucratic. CERN was about particle physics, not funding Tim’s idea. More than that, many there didn’t recognize the value of a lot of things that had, in practice, already become necessary.
The paradox of the Web was that this very very hard problem to connect heterogeneous information from heterogeneous computers on heterogeneous networks all over the world, a very very hard problem, was solved by a small, non-official, open approach by a team with no resources, or, practically none. – Ben Segal from CERN who brought in TCP/IP ‘under the radar’
In 1989, despite TCP/IP being actually necessary at this point for CERN to function, a memo went out reminding everyone that it was “unsupported”. A lot of the best things in history turn out to effectively have been people of good will working together outside the system to get things done that needed getting done.
So, in September 1990, Mike Sendall, Tim Berners-Lee’s boss at CERN, found a way to give Tim the approval to develop his idea in a way that many good bosses through history tend to: while no one wanted to fund development of “Tim’s idea,” there was some interest in Steve Jobs’ new NeXT computer (which, it appears, was also brought in initially despite, rather than as part of, CERN policy and plans). And so, under the guise of testing the capabilities and suitability of the NeXT computer for development at CERN, Tim would be able to create a prototype browser and server.
The NeXT had great tools for developing GUI applications and Tim was able to build a pretty nice prototype GUI with read-write facilities pretty quickly. It let him figure out precisely what he was proposing.
As explained in Part I of this history, there was already a lot going on, standardized or in place by then – for example SGML. Because of this, CERN somewhat unsurprisingly, already had a bunch of SGML documents in the form of a thing called “SGMLGuid” (sometimes just GUID). Unfortunately, the earliest capture of this I can find is from 1992 but here’s what SGMLGuid looked like.
SGML itself had gotten quite complex as it tackled ever more problems, and Tim didn’t really know SGML. But he saw the clear value in having a language that was at least familiar looking for SGML authors, and of having some existing corpus of information. As he said later:
Who would bother to install a client if there wasn’t already exciting information on the Web? Getting out of this chicken and egg situation was the task before us….
Thus, he initially started with a kind of a subset of ~15 GUID tags (plus the critical <a> tag for expressing hyperlinks which is Tim’s own creation and at the very core of the idea – an exact number is hard to say because the earliest document on this isn’t until 1992). As explained on w3.org’s origins page:
The initial HTML parser ignored tags which it did not understand, and ignored attributes which it did not understand from the CERN-SGML tags. As a result, existing SGML documents could be made HTML documents by changing the filename from xxx.sgml to xxx.html. The earliest known HTML document is dated 3 December 1990:
There was not a lot of discussion of this at <a href=Introduction.html>ECHT90</a>, but there seem to be two leads:
<li><a href=People.html#newcombe>Steve newcombe’s</a> and Goldfarber’s “Hytime” committee
looking into SGML, and
<li>An ISO working group known as MHEG, “Multimedia/HyperText Expert Group”.
led by one Francis Kretz (Thompsa SA? Rennes?).
There’s a lot of history hiding out in this first surviving HTML actually.
First, note that this and many others weren’t “correct” documents by many counts we’d think of today: There was no doctype, no <html> element, no <head>, <title> or <body> etc. There actually wasn’t an HTML standard yet at that point so at some level it’s kind of amazing how recognizable it remains today – not just to humans, but to browsers. Your browser will display that page just fine. HTML, as it was being defined however, also wasn’t valid SGML necessarily. The W3C site points out that the final closing tag is an error (it transposes the letters).
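The "ignore what you don't understand" behavior from the w3.org passage quoted earlier is easy to sketch. What follows is only an illustration, assuming Python's standard `html.parser` module; the `KNOWN_TAGS` table and the `TolerantParser` and `parse` names are hypothetical and are not the 1990 tag set or Tim's actual parser.

```python
from html.parser import HTMLParser

# Hypothetical table of understood tags and their understood attributes.
# This is an illustration only, not the real early HTML tag list.
KNOWN_TAGS = {"a": {"href", "name"}, "li": set(), "h1": set(), "p": set()}


class TolerantParser(HTMLParser):
    """Keeps text and links, silently skipping unknown tags and attributes."""

    def __init__(self):
        super().__init__()
        self.text = []     # character data, in document order
        self.links = []    # href values found on <a> tags
        self.ignored = []  # names of start tags we did not understand

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN_TAGS:
            # Unknown tag: remember it for the demo, but otherwise ignore it.
            self.ignored.append(tag)
            return
        # Drop any attribute this tag is not known to carry.
        kept = {k: v for k, v in attrs if k in KNOWN_TAGS[tag]}
        if tag == "a" and "href" in kept:
            self.links.append(kept["href"])

    def handle_data(self, data):
        # Text survives whether or not the surrounding tag was understood.
        self.text.append(data)


def parse(markup):
    parser = TolerantParser()
    parser.feed(markup)
    parser.close()
    return "".join(parser.text), parser.links, parser.ignored
```

Feeding it something like `<chapter>See <a href=Introduction.html bogus=1>ECHT90</a></chapter>` keeps the text and the link and quietly drops the `chapter` tag and the `bogus` attribute, which is roughly why renaming an SGML file could be enough to get something readable.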
More interestingly still for purposes here, I’d like to note that the very first surviving HTML document was, quite literally, about HyperMedia. It’s part of notes from what they called the hypertext conference and, unsurprisingly, that is what everyone was talking about. To understand why I think this matters, let’s rewind just a little and tie in some things from Part I with some things that weren’t….
The Timing of the Web
Is that title about Tim or Time? Both.
Remember VideoWorks/Director from Part I? It illustrated that perhaps the concept of Time was really important to hypermedia. However, they weren’t the only ones to see this. In fact, even as early as 1984 people were already seeing a gap between documents and hypermedia/multimedia and talking about how to solve it. It turns out SGML, the reigning standard approach of the time for documents as markup actually wasn’t quite suited to the task. It needed revision.
So, Goldfarb (original GML creator, presumably mistyped above) went back to the drawing board with some more people to figure out how to apply it to a musical score. What they came up with was called SMDL (Standard Music Description Language), an effort that was ANSI approved in late 1985. However, no practical demonstration of SMDL was completed until 1990 and as part of a Master’s thesis rather than a product (this dissonance over what makes a “standard” appears and reappears over and over in standards history).
It’s key though because you could definitely say that by the mid-late 80’s, it was becoming obvious to many that the problem of time and linking objects in time was a more generalized problem. Video, for example, might have been a neat thing on the desktop but don’t forget that in the 1980’s, cable television was spreading along with computers and multimedia — and much faster. By this time, a number of folks were beginning to imagine something like “interactive TV” as the potentially Next Really Big Thing (even before the Web). Sun Microsystems established a group, “Green” to figure out the next big thing, who thought it would be interactive consumer electronics (like interactive TVs).
And so in 1989, just about the time Tim was putting together his ideas on the Web, the grander problem of Time/SGML was moved out of SMDL into a new ANSI project known as “HyTime” (Hypermedia/Time-based Structuring Language), which had a lot of key players and support from major businesses.
It really looked like maybe it was going somewhere. Remember Ted Nelson from Part I? In 1988, Autodesk had decided to fund him directly and commercialize his ideas, which had become known as Project Xanadu. An Autodesk press release said:
In 1964, Xanadu was a dream in a single mind. In 1980, it was the shared goal of a small group of brilliant technologists. By 1989, it will be a product. And by 1995 it will begin to change the world.
Nelson/Autodesk were some of the big names on that HyTime committee. Ironically, I think they got the years pretty close, but the technology wrong.
At approximately the same time, MPEG (the Moving Picture Experts Group) and MHEG (the Multimedia and Hypertext Experts Group – also mentioned in that initial post above) were established. MHEG’s focus, like a lot of other things, included hypermedia documents but, unlike SGML, it required an MHEG engine: basically, a VM. The files they’d trade would be compiled binary rather than text-based. While they were authorable as documents, they were documents about interactive objects.
And so this is what people were talking about at the conference which Tim was summarizing in that early surviving HTML document. Both HyTime and MHEG were already thinking about how to standardize this, in part because there was already a lot of media around and people were already building multimedia applications with it.
So the world around him was moving forward and there were lots of interesting ideas on all fronts. Tim had a prototype in hand. HTML as understood by the NeXT had no forms, no tables, and you couldn’t even nest things – it was flat. Not only did it have no CSS, it had no colors (his screen was black and white). But, for the most part, many of his tags were simple formatting. You can debate whether an H1 is semantic, but in Tim’s interface it was under styling. That is, you could “style” things as an H1, more or less WYSIWYG style, and the editor would flatten it all out when serializing the markup.
Tim imagined (and has repeated since) that the most important thing was the URI, then HTTP, then stuff like HTML and later CSS. URIs, in theory, can work for anything as long as you have a concept of a file that is addressable. HTTP was built with a feature called ‘content type negotiation’ which allows the client to say what it’s prepared to handle and the server to give it back something appropriate. As Tim explains this feature in Weaving the Web:
In general … the client and server have to agree on the format of data they both will understand. If they both knew WordPerfect for example, they would swap WordPerfect documents directly. If not, they could try to translate to HTML as a default.
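To make the negotiation idea concrete, here is a minimal sketch, assuming a modern-style `Accept` header with optional `q` weights (a later refinement, not necessarily what the 1990 prototype did). The function names, the WordPerfect media type string and the fallback to `text/html` are illustrative assumptions; wildcards such as `*/*` are deliberately not handled.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept-style header, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal weights keep the order the client used.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(accept_header, available, default="text/html"):
    """Pick the client's most preferred type the server can actually provide.

    Falls back to `default` (HTML here, echoing the quote above) when the
    two sides have nothing else in common.
    """
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in available:
            return media_type
    return default
```

With a client header of `application/vnd.wordperfect, text/html;q=0.5`, a server that has both formats would hand back the WordPerfect file, while a server that only has HTML would hand back `text/html`: the two sides swap the richest thing they both understand and otherwise meet at HTML.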
So the weird intricacies of HTML or the things above weren’t drastically important at the time because Tim didn’t imagine HTML would be for everything. In fact, to help address his chicken and egg problem described above, Tim just made his browser give URIs to, and auto-translate, some popular existing protocols like NNTP, Gopher and WAIS into HTML. But perhaps even this is over-simplifying just a bit – as he also explained:
I expected HTML to be the basic warp and weft of the Web but documents of all types: video, computer aided design, sound, animation and executable programs to be the colored threads that would contain much of the content. It would turn out that HTML would become amazingly popular for the content as well…
It would turn out…
One of the most interesting things about invention is the stuff that the inventor didn’t expect people would do with it. It would turn out that HTML would become really popular for content for a number of reasons. One reason, undoubtedly, is that the simplest thing to do is simply to provide HTML in the first place with no alternatives. More importantly, perhaps, to re-iterate the point from part I: The line between documents and ‘more than documents’ was clearly fuzzy.
To illustrate: Even with the NeXT browser “in hand”, it was very hard to show people value. Very few people had a NeXT, even at CERN – after all, it was a pilot for establishing whether the new-fangled machines would be useful. Lugging it around only went so far. There was a new project at CERN to provide a directory, and Tim and early partners like Robert Cailliau convinced CERN to publish the directory via the Web.
This is interesting because address book applications were something that a lot of the modern computers of the time had, but a phonebook was a bunch of printed pages. Who wouldn’t have liked that application? It might have been potentially “easy” to create a nice HyperCard stack and auto-transform to HTML based on content type negotiation – but which part was document and which part was application? It was actually much easier to just deliver HTML which could be generated any number of ways – and with the current digital expectations of the day, on the machines they were using, that was just fine. Thus, the simple line mode browser that made the fewest assumptions possible was born as something that could be distributed to all the CERN machines (and all the world’s machines – more on this below).
The line mode browser was, frankly, boring. It was wildly inferior to the NeXT interface which was itself wildly inferior to something like OWL’s Guide. But it worked, and as usual, that matters. Let me repeat that: Shipping something useful matters.
If you’ve never heard of Maslow’s Hammer, you’re probably at least familiar with the software version of it: We like to say “If the only tool you have is a hammer, everything looks like a nail”. Usually when we say it we’re trying to say “use the right tool for the job”. However, there’s a corollary there that is just as true and often goes unnoticed: If someone only has a butter knife it would turn out that they can suddenly screw in some kinds of screws.
It would also turn out that that’s not entirely a bad thing: If you need to unclog something, a butter knife works. If you need a lever to lift something small in a tight spot, a butter knife works in a pinch. If you need a paperweight on a windy day, guess what turns out to work pretty well? Perhaps that wasn’t the butter knife’s original intent, but it is universally true. And guess what else turns out to be true? A butter knife and some other things were probably an “almost” approximation for some tool that didn’t yet exist. What’s more, having a few of those “almost” tools frequently helps inspire something better. Steven Johnson calls this “the adjacent possible” in his TED Talk “Where good ideas come from” and I think it’s as true of the Web as it is of anything.
However it came about, it turns out as well that the line mode browser was kind of perfectly timed, for a number of reasons. To keep things in perspective, this was 1990. While computers were starting to catch on, in 1990 they were still very expensive. As deployed, many of them didn’t even have operating systems with anything remotely like what we would call a graphical UI as the norm yet. Of those that did, few even had modems. And of those with modems, many still connected at 1200 or 2400 baud. We weren’t connected, nor even completely GUI, yet. Those who were connecting most frequently were often doing so through large, expensive and frequently outdated systems which had been a really big investment years before.
Because of this, what the line mode browser definitely did was to allow Tim and others to show the Web to the people who would, in short order, start writing the modern browsers with GUIs and increasingly recognizable features, and to keep a small but steady stream of new potential enthusiasts checking it out. Sadly perhaps, another thing it did was to omit the authoring piece that was present on the NeXT machine. That set in motion a trend where people perceived the Web as a way to consume rather than publish and contribute, and likely spurred a greater focus on authoring HTML. “Sadly Perhaps,” but then again, perhaps that’s precisely what was necessary in order for it to mature anyway. It’s hard to say in retrospect.
With a few new enthusiasts, in 1991 Tim created a mailing list: www-talk. For a while a very, very small but steadily growing group of people discussed the early “Web” they were trying to build. As more people came into the group they wanted more and different things – it should be more like HyTime, links should really work differently, it should actually be SGML rather than just “inspired by” or “look like” it, and so on.
What happened next just keeps getting more interesting. Continue to Part III: The Early Web.