The XML Revolution
An interview with Charles Goldfarb

More than just a tutorial, The XML Handbook is a tour of the XML universe, presenting both the theory and the practical application of this key technology. As one of the inventors of SGML, co-author Charles Goldfarb has a unique authority on the subject. Here he speaks to Tim Anderson.

What is your personal involvement in the history of XML?

Charles Goldfarb: I invented the first markup language, IBM's GML, with Ed Mosher and Ray Lorie in 1969. I actually coined the term "markup language" because GML was an initialism for "Goldfarb, Mosher and Lorie", and I had to find a credible meaning for the letters. It was a technology transfer between the research side and the development side, and sometimes developers forget who gave them the ideas. This was an attempt to watermark the concept, and it worked pretty successfully for 30 years or so. So I did SGML in 1974, and both HTML and XML are instances of SGML.

Why is it important for people to learn about XML?

Goldfarb: XML is revolutionising the way business is conducted in the world, not just on the Web but in the actual operations of companies. It's the holy grail of computing: it has solved the problem of universal data interchange between dissimilar systems.

What do you see as the most significant uses of XML?

Goldfarb: There are three main areas. The first is presentation-oriented--being able to create a web site, for example, and have it rendered in a number of different styles depending on whether you are using wireless transfer to a PDA or a full-function monitor on a desk. The second area is messaging. XML is being used to simplify the protocols by which different programs communicate with one another, by having messages which are self-describing in XML. The content of the message--the payload--can be anything at all. But rather than having a complex set of commands by which the programs communicate, XML lets you use a simple interface like the SOAP proposal. So that's a big advantage. The other area is enterprise application integration--the ability to have information come from a foreign source and appear in your system as though it were part of it. That foreign source could be another computer from a business partner, or it could be a legacy application that isn't part of your ERP system, or it could be data that's syndicated from other web sites.

Are you talking about the future, or is this already happening?

Goldfarb: This is already happening. If you look at my web site, you can see 40 or 50 products or initiatives using XML announced on a daily basis.

As competing vendors adopt XML, is there a danger of fragmentation?

Goldfarb: That is always the problem with a standard. But I think some of the appearance of fragmentation comes from the unique characteristic of XML and SGML in general, which is that it is designed to accommodate change. It's a standard way to say what you're doing rather than to do something. People get confused, understandably so. There is the XML language itself, then various related standards having to do with the implementation of XML, such as the DOM (Document Object Model) or SAX (Simple API for XML), and then applications of XML such as vocabularies for different industries, and finally frameworks like BizTalk, which are conventions for building XML systems. I try to sort this out in The XML Handbook. Where the competition and fragmentation come in is chiefly in these frameworks, as each major vendor tries to show off the functionality, power and attractiveness of its own toolset.

What have you tried to achieve with The XML Handbook?

Goldfarb: My belief is that XML is too important a technology to just be left to technicians. I feel that XML has such an enormous potential impact on the way business is conducted that management executives, who have got to make decisions based on what XML can do for them, need to have their technical awareness lifted somewhat above the norm. By the same token, developers and programmers need to have a good understanding of how XML can be used and what it does at a high level, rather than just getting a book full of code examples that they can copy. I'm the editor of a whole series of books on XML, and there are plenty of books for programmers, but the Handbook gives the general overview that you need to communicate on this subject. We also have what I think are the most accurate tutorials on the XML language and related standards.

In this third edition, what are the most significant areas of revision?

Goldfarb: We did a general overhaul of the book in order to better balance the focus between publishing-oriented presentation material and message-oriented middleware. We updated and extended the tutorials to reflect the fact that XPath and XSLT have become final approved recommendations. And we also did a lot more on the Schema Definition Language as that standard is starting to stabilise--although it's still not final. There have also been a lot of important changes in the applications and tools areas. We've beefed up the material on schemas and vocabularies. We've added to our coverage of topic maps: the whole problem of how to index, navigate and search an incredibly large information space.

I was glad to see a Foreword from Microsoft and a Prologue from Sun as a way of bridging industry divides.

Goldfarb: Yes, and especially when you consider the people involved. Jean Paoli is the guy who made XML happen in Microsoft. And Jon Bosak was the guy who honchoed the whole thing through the W3C, which was an amazing political task. There is a little conspiracy going on here: to try to get people to understand why a data-centric computing world is much better for humans than a program-centric one.

Why are there so many chapters written by specific XML vendors?

Goldfarb: I have companies in the industry who are sponsors of the book and contribute their experts' time to help develop chapters on different subjects, and in return we illustrate those chapters with the vendor's experience or product. But the focus is on the subject matter. We try to get ahead of the curve. Because I am essentially the person who started all of this, I've got a good and deep relationship with all the vendors, and they are willing to trust me to do the final writing. This saves us an enormous amount of time and gives us insight into things we otherwise wouldn't have known about.

How do you see XML developing in, say, the next two to five years?

Goldfarb: XML is going to be pervasive. How visible it will be to an ordinary web user, other than by its benefits, is hard to say and not terribly relevant. But one of the major areas is the movement from the web as a source of pages to the web as a source of services. The idea is that when you go to a web site and request something, that is not a request for a stored page the way it was in the early days, but it's a request to get some information or to conduct a transaction. Your web server may turn around and offload some of that to other web servers, or it may use data from other web servers, and these programs will talk to one another and conduct business. That's very different from just asking for a page. There are directory standards being created called UDDI (Universal Description, Discovery and Integration) that will locate the appropriate servers to do the task you need done, choose your business partners based on information in these directories, and consummate the transaction. If you are given to apocalyptic visions this could be "Terminator Three" if it ever goes bad... computers conducting business without human intervention. The way a business is described in XML is going to determine whether other computers will do business with it. And that says developers have got to have a real sense of responsibility here. One of the objectives we had in The XML Handbook was to give developers not just the mechanics but also the underlying reasoning behind XML, so that they can see the entire task that lies before them, not just the particular job that they've got to code today.
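
Editor's sketch: the idea of programs exchanging self-describing XML messages, rather than pages, can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not anything taken from the interview or the Handbook: the `request`, `item`, `reply` and `line` element names, the `PRICES` table and the `handle` function are all invented for illustration and do not follow SOAP, UDDI or any real vocabulary. It uses only the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical request message. The element and attribute names are
# invented for this sketch; they are not part of SOAP, UDDI or any
# published vocabulary.
REQUEST = """<request service="quote">
  <item sku="A-100" quantity="3"/>
  <item sku="B-200" quantity="1"/>
</request>"""

# Made-up price list standing in for a back-end system.
PRICES = {"A-100": 2.50, "B-200": 10.00}


def handle(message: str) -> str:
    """Read a self-describing request and build an XML reply.

    The receiver finds what it needs by element and attribute name,
    so the sender could add further elements without breaking it.
    """
    root = ET.fromstring(message)
    reply = ET.Element("reply", service=root.get("service", ""))
    total = 0.0
    for item in root.findall("item"):
        sku = item.get("sku")
        quantity = int(item.get("quantity", "1"))
        line_total = PRICES[sku] * quantity
        total += line_total
        ET.SubElement(reply, "line", sku=sku, total=f"{line_total:.2f}")
    reply.set("total", f"{total:.2f}")
    return ET.tostring(reply, encoding="unicode")


if __name__ == "__main__":
    print(handle(REQUEST))
```

The reply is itself XML, so another program could consume it in the same way. A real service would also need error handling for unknown items and an agreed vocabulary between the parties.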

Tim Anderson is a developer and IT journalist, with regular columns on web development and application development in the computer press.

© 1998-2001, Inc. and its affiliates