HTML

HTML (Hypertext Markup Language), the language of the WWW, describes the structure of a document, connects documents of all kinds via hyperlinks, and allows multimedia elements to be embedded.

$\bullet$
Fundamental to HTML is the separation of structure and layout: HTML only describes the content and logical structure of a document. The exact layout (fonts used, line breaks, etc.) is delegated to the browser.

$\triangleright$
Many extensions of the HTML language that allowed more explicit control of the layout did not comply with these goals. The current standard, HTML 4.0, returned to the roots and removed such language elements. Instead, the layout can be specified using commands of a style sheet language, which can be attached to a document. A special version of HTML (4.0 Transitional) allows old and new constructs to be mixed, providing a smooth transition phase.
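
$\diamond$
As a sketch of how a style sheet can be attached to a document, the header below assumes CSS as the style sheet language. The file name layout.css and the rule itself are hypothetical and only illustrate the mechanism.

```html
<HEAD>
  <TITLE>Style sheet example</TITLE>
  <!-- External style sheet; layout.css is a hypothetical file name -->
  <LINK rel="stylesheet" type="text/css" href="layout.css">
  <!-- Alternatively, rules can be embedded directly in the header -->
  <STYLE type="text/css">
    H1 { font-family: sans-serif; color: navy }
  </STYLE>
</HEAD>
```

The body of the page then contains only structural tags such as H1, while the font and color come from the style sheet.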

$\triangleright$
HTML pages are written as plain (ASCII) text and can be created with any text editor. Special graphical HTML editors allow web pages to be created much as in a word processor and generate the HTML code automatically. Another possibility is to write LaTeX documents and convert them to HTML with latex2html.

$\bullet$
An HTML page consists of the actual text with additional HTML elements (tags), which describe the structure of the text.

$\bullet$
Although HTML allows all kinds of characters (using the Unicode standard), one often uses a simple character set such as Latin-1 or even ASCII. Additional letters, or characters that would otherwise be interpreted as part of an HTML tag, can be entered by a special name as &NAME; or by a numeric code as &#NNN;.

$\diamond$
Character    HTML encoding
Ä Ö Ü        &Auml; &Ouml; &Uuml;
ä ö ü ß      &auml; &ouml; &uuml; &szlig;
æ å é        &aelig; &aring; &eacute;
< > & "      &lt; &gt; &amp; &quot;
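
$\diamond$
The following fragment sketches how such encodings appear in running text. The sample sentences are invented for illustration; the last line uses the numeric form, where 228 is the Latin-1 code of the letter ä.

```html
<!-- Named entities for letters outside ASCII -->
<P>Gr&uuml;&szlig;e aus K&ouml;ln</P>

<!-- Characters that would otherwise be read as markup -->
<P>The condition a &lt; b &amp; b &gt; c uses no literal angle brackets.</P>

<!-- Numeric code: &#228; denotes the same letter as &auml; -->
<P>M&#228;rz</P>
```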

$\bullet$
Line breaks and additional white space in the HTML text are ignored by the browser. Instead, the text lines are wrapped to fit the size of the browser window.
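
$\diamond$
A small illustration with invented sample text: the two paragraphs below differ only in spacing and line breaks in the source, so a browser should display them identically.

```html
<!-- Source with irregular spacing and line breaks -->
<P>This    sentence is
spread over

several     lines.</P>

<!-- The same words on a single line -->
<P>This sentence is spread over several lines.</P>
```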

$\bullet$
HTML tags mark a part of the text as a particular structural element. Each tag is written between < and >, and an opening and a closing tag usually enclose the text in the form

<TAG>text text text</TAG>

The tag name is not case-sensitive.

$\diamond$
Element                  Meaning
<H1>Introduction</H1>    main heading
<em>notice:</em>         emphasized text
A<SUB>ij</SUB>           subscripts

$\bullet$
For some elements the closing tag is optional.

$\diamond$
Element    Meaning
<P>        paragraph
<LI>       list item
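
$\diamond$
As a sketch with invented sample text, the two lists and the two pairs of paragraphs below are equivalent. In the second version of each, the optional closing tags are left out, and the next opening tag (or the end of the enclosing element) ends the previous element.

```html
<!-- Closing tags written out -->
<UL>
  <LI>first item</LI>
  <LI>second item</LI>
</UL>
<P>First paragraph.</P>
<P>Second paragraph.</P>

<!-- Optional closing tags omitted -->
<UL>
  <LI>first item
  <LI>second item
</UL>
<P>First paragraph.
<P>Second paragraph.
```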

$\bullet$
Some elements have no closing tag, since they do not enclose any text.

$\diamond$
Element    Meaning
<BR>       line break
<HR>       horizontal rule
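
$\diamond$
A minimal sketch of both elements in use. The name and address are placeholders.

```html
<!-- BR forces a new line inside a paragraph -->
<P>
Peter Miller<BR>
12 Sample Street<BR>
Sample Town
</P>

<!-- HR draws a horizontal rule between two parts of the page -->
<HR>
<P>Text below the horizontal rule.</P>
```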

$\bullet$
Tags can have additional attributes with corresponding values:

     <TAG ATTRIBUTE1="value1" ATTRIBUTE2="value2" ...>
  

$\diamond$
<IMG src="flower.gif" alt="Flower">

The IMG tag embeds into the text the image given by the attribute src. The attribute alt contains a text that can be shown instead of the image (e.g. by simple text-oriented browsers).

$\bullet$
HTML documents can contain comments that are ignored by the browser. Comments are enclosed in <!-- and -->.

$\diamond$
<!--This is a comment
spanning two lines -->

$\bullet$
An HTML page consists of a header, which can contain the title of the document and additional information (e.g. author, keywords), and a body with the visible content. The whole page is enclosed by the <HTML> tag.

     <HTML>
     <HEAD>
       <TITLE>The title of the document</TITLE>
     </HEAD>
     <BODY>
       The content of the document
     </BODY>
     </HTML>
  

$\triangleright$
The document title is usually shown in the title bar of the browser window.
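
$\diamond$
The additional header information mentioned above can be given with META tags. The following page is a sketch only: the author name and the keywords are placeholders.

```html
<HTML>
<HEAD>
  <TITLE>The title of the document</TITLE>
  <!-- Additional information; the values are placeholders -->
  <META name="author" content="Author Name">
  <META name="keywords" content="HTML, tutorial, example">
</HEAD>
<BODY>
  <H1>Main heading</H1>
  <P>The content of the document</P>
</BODY>
</HTML>
```

Only the content of the body is displayed in the browser window. The META information is not shown on the page.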

$\diamond$
The following table shows the most important HTML tags describing the text structure. The column "Representation" shows how the corresponding element is usually displayed in a browser.


Tag              Meaning                     Representation
<H1>             main heading                own paragraph, very large font size, bold
<H2> ... <H6>    subheadings                 own paragraph, large font size, bold
<P>              paragraph                   line break, larger line spacing
<BR>             line break                  new line
<UL>             unordered list              own paragraph, indented
<LI>             list element                one (or more) lines per element, starts with a bullet (or square etc.)
<OL>             numbered list               like <UL>, but elements are numbered
<PRE>            preformatted text           typewriter font, line breaks and white space are respected
<EM>             emphasized text             italics
<STRONG>         strongly emphasized text    bold

$\triangleright$
Lists can be nested.
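
$\diamond$
The fragment below sketches several of these tags together, including a numbered list nested inside an item of a bulleted list. All text content is invented for illustration.

```html
<H2>Shopping list</H2>
<UL>
  <LI>Fruit
    <!-- A nested list is placed inside the list element it belongs to -->
    <OL>
      <LI>apples</LI>
      <LI>pears</LI>
    </OL>
  </LI>
  <LI><EM>Optional:</EM> bread</LI>
  <LI><STRONG>Do not forget:</STRONG> milk</LI>
</UL>

<!-- Inside PRE, spacing and line breaks are kept as written -->
<PRE>
item      amount
apples    6
pears     4
</PRE>
```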

$\bullet$
References to other documents (links) are marked by the <A> tag:

<a href="URL">Link Text</a>

URL is the address of the link, Link Text the text in the document that is marked as a link.

$\diamond$
URL can be a local file (../contents.html) or an arbitrary WWW address (http://..., ftp://... ).

$\triangleright$
Browsers usually display a link by underlining the link text and using a different text color. A recently visited link is shown in another color.

$\triangleright$
An image can be used as the anchor of a link instead of text. Browsers show such a link by a colored border around the image or through a change of the cursor (usually to a hand).
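
$\diamond$
The three kinds of links described above can be sketched as follows. The address www.example.org and the file flowers.html are placeholders; flower.gif is the image from the earlier IMG example.

```html
<!-- Link to a local file, given relative to the current page -->
<A href="../contents.html">Table of contents</A>

<!-- Link to an arbitrary WWW address (placeholder URL) -->
<A href="http://www.example.org/">An external page</A>

<!-- An image as the anchor of a link -->
<A href="flowers.html"><IMG src="flower.gif" alt="Flower"></A>
```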

$\bullet$
Images can be embedded into the text with the tag

    <IMG src="URL" alt="text">   
  

$\triangleright$
Images can come in many different file formats. Most browsers support at least GIF, PNG and JPEG. Other formats may require a special plugin.

$\triangleright$
The possibility (especially with GIF images) of marking a color as "transparent" allows seamless embedding of non-rectangular images.

$\triangleright$
As long as there is no widely used standard for the direct representation of mathematical formulae, one can include them as GIF images (e.g. created with LaTeX) with a transparent background.

$\triangleright$
The GIF format allows several images to be combined into a short animation (animated GIF). Such animations are included like ordinary images with the IMG tag.

$\bullet$
An HTML page can contain programs written in Java (applets) using the tag

<applet code="programfile" width="WWW" height="HHH">
</applet>

As with images, the browser reserves a rectangular screen area of width WWW and height HHH (in pixels), which the applet uses as its main window.

$\triangleright$
Java applets contain compiled code, usually in a file with the extension .class.
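
$\diamond$
A sketch of a concrete applet tag, assuming a compiled applet in a hypothetical file Clock.class. Text placed between the opening and closing tag serves as alternative content for browsers that cannot run applets.

```html
<!-- Clock.class is a hypothetical file name; the area is 200 x 100 pixels -->
<applet code="Clock.class" width="200" height="100">
  This text is shown by browsers that cannot run Java applets.
</applet>
```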

$\triangleright$
The following table lists some advanced features of HTML pages. For further details, see the extensive literature or one of the online tutorials on the web.


Element        Function
table          elements ordered in rows and columns
imagemap       links that are connected with special parts of an image
frame          partitioning of a page into complete subpages
style sheet    layout specification
forms          graphical components for data input
JavaScript     script language for dynamic HTML pages


Peter Junglas 8.3.2000