Current Web Architecture


Table of Contents

Introduction

Basic Web Architecture

Web Architecture Extensibility

Other Transfer Protocols

Other Open Standards


Introduction

This section of the Internet Tool Survey describes the current architecture of the World Wide Web (WWW). The NCSA Glossary is a useful starting point for Web terms. Another is the ILC glossary of Internet Terms.

The following sections describe the basic Web architecture, its extensibility, other transfer protocols, and other open standards.


Basic Web Architecture

The basic web architecture is two-tiered and characterized by a web client that displays information content and a web server that transfers information to the client. This architecture depends on three key standards: HTML for encoding document content, URLs for naming remote information objects in a global namespace, and HTTP for staging the transfer.
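As a rough illustration of the two tiers, the following Python sketch runs a tiny server and a client in one process and transfers a single HTML document over HTTP. The page content, the handler name PageHandler, and the helper fetch_once are all invented for this example; it is a minimal sketch under those assumptions, not a description of any particular server or browser.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical document served by the example server.
PAGE = b"<html><body><h1>Hello, Web</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Server tier: answers every GET request with the same HTML document."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch_once():
    """Start a throwaway server, act as the client for one request, shut down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client tier: open a connection, send GET, read the response.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/index.html")
        response = conn.getresponse()
        body = response.read()
        conn.close()
        return response.status, response.getheader("Content-Type"), body
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_once() returns the status code, the Content-Type header, and the document bytes that a browser would go on to render.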

HTML is an application of the Standard Generalized Markup Language (SGML, ISO 8879), an international standard approved in 1986 that specifies a formal meta-language for defining document markup systems. An SGML Document Type Definition (DTD) specifies the valid tag names and element attributes. An HTML document consists of content delimited by hierarchically nested start and end tags; tag names are case-insensitive, and a start tag may carry element attributes. These attributes may be required, optional, or empty. In addition, documents can be linked to one another or internally by establishing source and target anchor points. Many HTML documents are authored by hand or produced by word-processor-to-HTML converters, but several WYSIWYG editors now support HTML styles; see the listing at W3C and the Internet Tools Survey section on Authoring HTML.

HTML files are viewed using a WWW client browser, the primary user interface to the Web. HTML allows images, sounds, video streams, form fields, and simple text formatting to be embedded in a document. References to other objects, called hyperlinks, are embedded using URLs (see below). When the user selects a hyperlink, the browser takes an action based on the URL's type: for example, it retrieves a file, connects to another Web site and displays an HTML file stored there, or launches an application such as an e-mail or newsgroup reader.
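To make the tag, attribute, and anchor terminology concrete, here is a small sketch that uses Python's standard html.parser module to walk a sample document and collect its source and target anchors. The sample markup, the class name LinkCollector, and the helper collect are assumptions made up for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample document with an external link, an internal link,
# a named target anchor, and an embedded image.
SAMPLE = """<HTML><BODY>
<H1>Survey</H1>
<P>See the <A HREF="http://example.org/glossary.html">glossary</A>
or jump to <a href="#intro">the introduction</a>.</P>
<A NAME="intro"></A>
<IMG SRC="logo.gif" ALT="logo">
</BODY></HTML>"""


class LinkCollector(HTMLParser):
    """Records start tags, hyperlink sources (HREF), and named targets (NAME)."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        self.tags.append(tag)
        attributes = dict(attrs)
        if tag == "a":
            if "href" in attributes:
                self.links.append(attributes["href"])
            if "name" in attributes:
                self.targets.append(attributes["name"])


def collect(html_text):
    """Parse html_text and return the populated collector."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser
```

After collect(SAMPLE), the links list holds one link to another document and one link into the same document, and the targets list holds the named anchor that the second link points at.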

URLs are location dependent and contain four distinct parts: the protocol type, the machine name, the directory path, and the file name. There are several kinds of URLs, including file, FTP, Gopher, News, and HTTP URLs. A URL may be relative to a directory, or may refer to an offset within a document. Arguments to CGI programs (see below) may be embedded in a URL after the ? character.
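The parts of a URL described above can be pulled apart with Python's standard urllib.parse module. The sketch below is only illustrative: the example.org addresses and the helper name url_parts are assumptions, and the dictionary keys simply mirror the four parts named in the text, plus the query arguments and the document offset.

```python
import posixpath
from urllib.parse import parse_qs, urljoin, urlsplit


def url_parts(url):
    """Split a URL into protocol, machine, directory, file, query, and fragment."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "machine": parts.hostname,
        "directory": directory,
        "file": filename,
        "query": parse_qs(parts.query),  # arguments after the ? character
        "fragment": parts.fragment,      # offset into the document, after #
    }


# Hypothetical base document used to resolve relative URLs.
BASE = "http://www.example.org/docs/web/arch.html"

# A relative URL is resolved against the directory of the base document.
IMAGE_URL = urljoin(BASE, "images/fig1.gif")
PARENT_URL = urljoin(BASE, "../index.html")
```

For instance, url_parts("http://www.example.org/cgi-bin/search?term=sgml&max=10") reports the file as "search" and the query as the two arguments term and max.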


Web Architecture Extensibility

This basic web architecture is evolving quickly to serve a wider variety of needs beyond static document access and browsing. The Common Gateway Interface (CGI) extends the architecture to three tiers by adding a back-end server that provides services to the Web server on behalf of the Web client, permitting dynamic composition of web pages. Helpers/plug-ins and Java/JavaScript provide other interesting Web architecture extensions.
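A CGI program can be as small as the sketch below. Under CGI, the Web server passes the part of the URL after the ? character to the program in the QUERY_STRING environment variable, and the program writes a header, a blank line, and the generated document to standard output. The function name render_page and the "name" argument are hypothetical choices for this example; a real gateway program would typically consult a database or other back-end service at this point.

```python
import html
import os
from urllib.parse import parse_qs


def render_page(query_string):
    """Compose a CGI response (header, blank line, HTML body) from URL arguments."""
    params = parse_qs(query_string)
    # "name" is a hypothetical argument; fall back to a default when it is absent.
    name = params.get("name", ["world"])[0]
    # Escape user-supplied text before placing it inside the HTML document.
    body = f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>"
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The Web server sets QUERY_STRING before invoking the program.
    print(render_page(os.environ.get("QUERY_STRING", "")), end="")
```

Requesting a URL such as /cgi-bin/hello?name=Ada (a made-up path) would cause the server to run the program with QUERY_STRING set to name=Ada and return the composed page to the browser.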

JavaScript is a scripting language designed for creating dynamic, interactive Web applications that link together objects and resources on both clients and servers. A client-side JavaScript script can recognize and respond to user events such as mouse clicks, form input, and page navigation, and can query the state or alter the performance of an applet or plug-in. A server-side JavaScript script can exhibit behavior similar to Common Gateway Interface (CGI) programs. JavaScript scripts are embedded in HTML documents using