HTML
HTML (Hypertext Markup Language), the language of the
WWW, describes the structure of a document, connects documents of all
kinds via hyperlinks, and allows multimedia elements to be embedded.
- Fundamental to HTML is the separation of structure and layout: HTML only
describes the content and logical structure of a document. The exact layout
(fonts used, line breaks, etc.) is left to the browser.
- Many extensions of the HTML language that allowed more explicit control of
the layout did not comply with this goal. The current standard, HTML 4.0,
returns to the roots and drops such language elements. Instead, the
layout can be specified in a style sheet language and attached to the
document. A special version of HTML (4.0 Transitional) allows old and new
constructs to be mixed, which permits a smooth transition phase.
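As a minimal sketch of this separation, assuming CSS as the style sheet language, a document header can either link to an external style sheet or carry a few rules itself. The file name style.css and the rules shown are illustrative assumptions only.

```html
<HEAD>
<TITLE>Style sheet example</TITLE>
<!-- external style sheet; "style.css" is a hypothetical file name -->
<LINK rel="stylesheet" type="text/css" href="style.css">
<!-- rules embedded directly in the document header -->
<STYLE type="text/css">
  H1 { color: navy; font-family: sans-serif }
  EM { font-style: normal; font-weight: bold }
</STYLE>
</HEAD>
```

The markup in the body stays purely structural (H1, EM); only the style rules decide how these elements look.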
- HTML pages are plain (ASCII) text and can be written with any text
editor. Special graphical HTML editors let you create web pages much as in a
word processor and generate the HTML code automatically. Another
possibility is to write LaTeX documents and convert them to HTML with
latex2html.
- An HTML page consists of the actual text with additional HTML elements
(tags), which describe the structure of the text.
- Though HTML allows all kinds of characters (using the Unicode standard),
one often uses a simple character set like Latin-1 or even ASCII. Characters
outside that set, and characters that are reserved for HTML markup, can be
entered by name as &NAME; or by numeric code as &#NNN;.
Character | HTML encoding
Ä Ö Ü | &Auml; &Ouml; &Uuml;
ä ö ü ß | &auml; &ouml; &uuml; &szlig;
æ å é | &aelig; &aring; &eacute;
< > & " | &lt; &gt; &amp; &quot;
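The following fragment is a small sketch of both notations in use. The sentence content is made up for illustration; the numeric code 228 is the Latin-1/Unicode position of ä.

```html
<!-- named entities for umlauts and for characters reserved by HTML -->
<P>Gr&uuml;&szlig;e aus K&ouml;ln: 3 &lt; 5 &amp; 5 &gt; 3</P>
<!-- numeric form: &#228; denotes the same character as &auml; -->
<P>B&#228;r and B&auml;r are displayed identically.</P>
```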
- Line breaks and additional white space in the HTML text are not preserved
by the browser: any run of them is treated as a single space. Instead, the
text lines are wrapped to fit the size of the browser window.
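To illustrate, the two paragraphs in this sketch differ only in their source layout and should be displayed identically by a browser (the sample sentence is arbitrary):

```html
<P>The quick brown fox
      jumps over

   the lazy dog.</P>

<P>The quick brown fox jumps over the lazy dog.</P>
```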
- HTML tags mark a part of the text as a structural element. A tag name
is enclosed in < and >, and a pair of tags usually encloses text in the form
<TAG>text text text</TAG>
Tag names are not case-sensitive.
Element | Meaning
<H1>Introduction</H1> | main heading
<em>notice:</em> | emphasized text
A<SUB>ij</SUB> | subscripts
- For some elements the closing tag is optional.
Element | Meaning
<P> | paragraph
<LI> | list item
- Some elements have no closing tag, since they don't have a corresponding
text.
Element | Meaning
<BR> | line break
<HR> | horizontal rule
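A short sketch that combines optional closing tags with elements that have no closing tag at all. The text is illustrative, and the UL list tag used here is described further below; unlike LI, it does need its closing tag.

```html
<P>First paragraph; the next P tag ends it implicitly.
<P>Second paragraph,<BR>
continued on a new line.
<HR>
<UL>
<LI>first item
<LI>second item
</UL>
```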
- Tags can have additional attributes with corresponding values:
<TAG ATTRIBUTE1="value1" ATTRIBUTE2="value2" ...>
- <IMG src="flower.gif" alt="Flower">
The IMG tag embeds the image named by the attribute src into the text.
The attribute alt contains a text that can be shown instead of the image
(e.g. by simple text-oriented browsers).
- HTML documents can contain comments that are ignored by the
browser. Comments are enclosed in <!-- and -->.
- <!--This is a comment
spanning two lines -->
- An HTML page consists of a header, which can contain the title of the
document and additional information (e.g. author, keywords), and a body with
the visible content. The whole page is enclosed by the HTML tag.
<HTML>
<HEAD>
<TITLE>The title of the document</TITLE>
</HEAD>
<BODY>
The content of the document
</BODY>
</HTML>
- The document title is usually shown in the title bar of the browser
window.
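A slightly fuller sketch of a complete page, assuming the HTML 4.0 Transitional document type mentioned earlier. The META lines show one common way to record author and keywords in the header; their values are placeholders.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE>The title of the document</TITLE>
<!-- author and keywords are placeholder values -->
<META name="author" content="A. N. Author">
<META name="keywords" content="HTML, tutorial, example">
</HEAD>
<BODY>
<H1>The title of the document</H1>
<P>The content of the document</P>
</BODY>
</HTML>
```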
- The following table shows the most important HTML tags describing the text
structure. The column "Representation" shows how the corresponding element
is usually displayed in a browser.
Tag | Meaning | Representation
<H1> | main heading | own paragraph, very large font size, bold
<H2> ... <H6> | subheadings | own paragraph, large font size, bold
<P> | paragraph | line break, larger line spacing
<BR> | line break | new line
<UL> | item list | own paragraph, indented
<LI> | list element | one (or more) lines per element, starts with a bullet (or square etc.)
<OL> | numbered list | like <UL>, but elements are numbered
<PRE> | preformatted text | typewriter font, line breaks and white space are respected
<EM> | emphasized text | italics
<STRONG> | strongly emphasized text | bold
- Lists can be nested.
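As a sketch of nesting, the numbered list below sits inside the first item of a bulleted list (the list contents are arbitrary examples):

```html
<H2>Shopping</H2>
<UL>
  <LI>Fruit
    <OL>
      <LI>apples</LI>
      <LI>pears</LI>
    </OL>
  </LI>
  <LI><EM>Fresh</EM> bread</LI>
</UL>
```

The inner OL must be closed before the surrounding LI and UL end, so that the tags nest properly.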
- References to other documents (links) are marked by the
<A> tag:
<a href="URL">Link Text</a>
URL is the address of the link, Link Text the text in the
document that is marked as a link.
- URL can be a local file (../contents.html) or an arbitrary WWW
address (http://..., ftp://... ).
- Browsers usually display a link by underlining the link text and using a
different text color. A recently visited link is shown in another color.
- An image can be used as an anchor of a link instead of a text. Browsers show
such a link by a colored border around the image or through a change of the
cursor (usually to a hand).
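The three kinds of link described above might look as follows. The host www.example.org and the image file home.gif are placeholders, not part of the original text.

```html
<!-- relative link to a local file -->
<a href="../contents.html">Table of contents</a>
<!-- absolute WWW address; www.example.org is a placeholder host -->
<a href="http://www.example.org/">An external page</a>
<!-- image used as the anchor; "home.gif" is a hypothetical file -->
<a href="../contents.html"><IMG src="home.gif" alt="Home"></a>
```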
- Images can be embedded into the text with the tag
<IMG src="URL" alt="text">
- Images can come in many different file formats. Most browsers support at
least GIF, PNG and JPEG. Other formats may require a special plugin.
- The possibility (especially in GIF images) of marking one color as
"transparent" allows a seamless embedding of non-rectangular images.
- As long as there is no widely used standard for the direct representation of
mathematical formulae, one can include them as GIF images (e.g. created with
LaTeX) with a transparent background.
- The GIF format allows several images to be combined into a short animation
(animated GIF). Such animations are included like ordinary images
with the IMG tag.
- An HTML page can contain programs written in Java (applets)
using the tag
<applet code="programfile" width="WWW" height="HHH">
</applet>
As for images, the browser reserves a rectangular screen area of
width WWW and height HHH (in pixels), which is used by
the applet as its main window.
- Java applets contain compiled code, usually in a file with the extension
.class.
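A concrete sketch, assuming a hypothetical compiled applet Clock.class in the same directory as the page. Text placed between the opening and closing tag serves as alternate content for browsers that do not run applets.

```html
<!-- "Clock.class" is a hypothetical applet file; sizes are in pixels -->
<applet code="Clock.class" width="200" height="100">
Your browser does not run Java applets.
</applet>
```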
- The following table lists some advanced features of HTML pages. For a
further description, see the extensive literature or one of the online
tutorials on the web.
Element | Function
table | elements ordered in rows and columns
imagemap | links that are connected with special parts of an image
frame | partitioning of a page into complete subpages
style sheet | layout specification
forms | graphical components for data input
JavaScript | script language for dynamic HTML pages

Peter Junglas 8.3.2000