HTML Primer for Ackanomic

1. Generalities

In an HTML document, <angle brackets> are used to mark commands, called tags, embedded within the file. For instance, <HR> draws a horizontal line across the page.

When you want to put a literal < or > in an HTML document, you must escape it with a character code. HTML uses the ampersand (&) as the indicator of a character code. It is followed by some characters that describe the specific character, and the code is ended by a semicolon (;). Thus, literal ampersands also need to be escaped in HTML text. HTML uses &lt; for <, &gt; for >, and &amp; for &.
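As a small illustration of these character codes (the sentence itself is made up for the example), text that mentions a tag and an ampersand would be typed into the file like this:

```html
<P>To draw a line, type &lt;HR&gt; in your file. Q&amp;A pages need the ampersand escaped too.</P>
```

A browser shows the codes as the literal characters <, >, and &, rather than treating them as the start of a tag or of another character code.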

Most HTML tags have two parts, and operate on the text between those parts. In these cases, the second part has a / in front of the tag word. For instance, <H1> is the code used to make that big title "HTML Primer for Ackanomic" above. That text appears after the tag, and then </H1> appears, marking the end of the title. (There are 6 sizes of such headers, H1 through H6. The section headers in this document are H2's.) Such pairs of tags are often called "blocks".
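For example, the title and the first section header of this document could be marked up roughly as follows. This is only a sketch of the idea, not a copy of the page's actual source:

```html
<H1>HTML Primer for Ackanomic</H1>
<H2>1. Generalities</H2>
```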

In an HTML document, all whitespace (space characters, line feeds, carriage returns, and tabs) is normally collapsed into a single space character (or a return due to automatic wrapping). There are a few ways to work around this:

&nbsp; can be used to represent a non-breaking space.
<BR> can be used to force a line break at a certain location.
<P> can be used to force a paragraph break (usually, a line break with a blank line following) at a certain location.
A piece of text can be enclosed in <PRE> and </PRE> to mark it as preformatted. In such text, whitespace is not compacted, carriage returns and line feeds mark ends-of-lines, and most browsers will use a fixed-width font to display the text.
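Here is a short sketch that uses all four of these workarounds together. The player names and scores are invented for the example:

```html
<P>Rule&nbsp;101 keeps its number on the same line as the word before it.<BR>
This sentence starts on a new line.
<P>This sentence starts a new paragraph.
<PRE>
Player     Score
Alice         12
Bob            7
</PRE>
```

Inside the PRE block the columns line up because the spaces are kept and the font is normally fixed-width; outside it, the extra spaces would be collapsed.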

These codes are the bare minimum needed to convert a piece of text to a piece of HTML. Oh, one other thing: Most HTML commands are not case sensitive -- <BR> and <br> and <Br> are all equivalent tags.

2. HTML headers

Every HTML document should enclose the entire HTML code in <HTML> and </HTML> -- an HTML block. This HTML block usually contains two blocks, called HEAD and BODY. HEAD contains some special tags; in my Ackanomic pages it usually only contains a TITLE. BODY contains all the HTML for the body of a page. So a simple HTML document might look like:

<HTML>
<HEAD><TITLE>This is the Title</TITLE></HEAD>
<BODY>Wheeee! Silly body text!</BODY>
</HTML>

A more recent addition to HTML is a set of meta-commands that lie outside the HTML block. There are a variety of possible uses for such tags, but one of them is to indicate what type of HTML a particular document contains. There have been a variety of different HTML standards and extensions, and by citing a particular standard, it is possible to say for certain that a particular document is (or isn't) proper HTML. HTML validators, in particular, use such codes to decide which HTML tags are allowed in your document. In my files, I have been citing the W3 Consortium's HTML 3.2 standard, by starting each file with this line:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
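Combined with the simple document shown earlier in this section, a complete file would then look something like this sketch:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD><TITLE>This is the Title</TITLE></HEAD>
<BODY>Wheeee! Silly body text!</BODY>
</HTML>
```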

3. Making links; URL types

One of the most useful HTML tags is the A tag, which marks links. It is one of many HTML tags which has arguments. Arguments in an HTML tag look like:

<tagname arg1="value" arg2="value" arg3=value>Stuff inside tag</tagname>

Note that the values may or may not be enclosed in quotes. Usually the values are quoted, because there are strict limits on what characters can appear in an unquoted value.

The A tag has two common arguments.

The HREF argument gives another web address (called a URL) to which the enclosed text is linked. So if I want to create a link to Acka's current score page, it looks like:

<A HREF="http://wilma.che.utexas.edu/~devjoe/acka/scores.html">Link Text</A>

The NAME argument gives that location on a web page a name that other A tags can link to. This is how to make links to the middle of a page, like the links to individual proposals in the proposal archive. The URL for such a link is the URL for that page, followed by a # and the value of the NAME argument you are linking to.

An A tag can have an HREF argument, a NAME argument, or both. But always remember to close the tag; otherwise everything until your next A tag will be part of the link.
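A minimal sketch of a named location and a link to it follows. The file name proposals.html and the name p1234 are hypothetical, chosen only for illustration:

```html
<!-- In proposals.html: give this spot in the page a name -->
<A NAME="p1234">Proposal 1234</A>

<!-- In another page in the same directory: link to that spot -->
<A HREF="proposals.html#p1234">See Proposal 1234</A>
```

Both A tags are closed right after the text they enclose, so nothing else on the page becomes part of the link.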

The URL I used above is an absolute URL. There are two ways to abbreviate URLs that are often used to save space or to make pages more portable.

The first way is by leaving off the protocol ("http:") and the host name ("//wilma.che.utexas.edu"). A link like this is assumed to use the same protocol and host name as was used for the page containing the link. So any page accessed by http on wilma could use HREF="/~devjoe/acka/scores.html" to refer to the scores page.

The second way is by just giving a filename, i.e., HREF="scores.html". In this case, the file is assumed to be in the same directory as the page containing the link; everything after the last / in the URL of the current page is replaced by the HREF value. Most of the Acka pages use these links when possible, so copying these pages to another server won't require changing every link on every page -- there are a few absolute links that will need to be changed, though. You can also go down more directory levels this way -- so "images/ackpffft.gif" (used in the "go back" link at the bottom of this page) refers to the file ackpffft.gif in the images subdirectory of the current page's directory, which can still be addressed from anywhere on the web by its full absolute URL.

There is actually a third way: when linking to a NAME= location in the same page, you can even leave off the name of the page and just say HREF="#gizz". This only works within one web page, so it isn't used much.
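The three abbreviated forms can be compared side by side in this sketch, which reuses the URLs mentioned in the text (the link text is made up):

```html
<!-- Same protocol and host as the current page -->
<A HREF="/~devjoe/acka/scores.html">Scores</A>

<!-- Same directory as the current page -->
<A HREF="scores.html">Scores</A>

<!-- A named location within the current page -->
<A HREF="#gizz">Jump to gizz</A>
```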

Most URLs use the http protocol, the standard web protocol, but there are URLs that use other protocols. These can be used to provide links that do other things besides load web pages:

ftp URLs download a file from a (normally anonymous) ftp site. The syntax is ftp://hostname/path/to/file -- exactly like http.
mailto URLs call a mail program, if the user's browser has a way to start one. Syntax is mailto:somebody@somewhere.foo.bar.com
telnet URLs call a telnet program, if the user's browser can do so. Syntax is telnet://hostname:port

There are others, but I don't think I've used any in the Acka web pages.
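Links using these other protocols are written exactly like http links. In the sketch below, the host names ftp.example.com and game.example.com, the file path, and the port number are placeholders, not real Acka resources:

```html
<A HREF="ftp://ftp.example.com/pub/acka/rules.txt">Download the rules</A>
<A HREF="mailto:somebody@somewhere.foo.bar.com">Send mail</A>
<A HREF="telnet://game.example.com:4000">Connect by telnet</A>
```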

4. Inline images

Above I mentioned the URL for an image link. Images are linked into a document with the IMG tag. The IMG tag takes a SRC argument, which has the URL for the image. Other arguments are optional, but it is preferable to at least use an ALT argument. The value of an ALT argument is displayed by browsers which cannot display your image for some reason (the file is corrupted, or is in a format that the browser cannot display, or the network is inaccessible, or the browser is text-only like lynx). So the tag I described above, for the "ACK!" part of the "go back" link, looks like:

<IMG SRC="/deadgames/acka/images/ackpffft.gif" ALT="ACK" BORDER=0>

I didn't mention the BORDER argument. Setting the border width to zero keeps browsers like Netscape from putting a colored border around the image to indicate the presence of a link (similar to how link text is underlined). BORDER=0 is especially useful when two images are used side-by-side (as in this case), or when images with transparent backgrounds are used as links (also true here).

Two other common arguments in an IMG tag are WIDTH and HEIGHT. These tell browsers how big (in pixels) the image is going to be before the browser actually loads the image, which allows it to format the rest of the page before that image is loaded. This is particularly useful on pages with big images or lots of images.

The ALIGN argument in an IMG tag can tell your browser to put the picture in a different location -- most commonly, ALIGN=right is used to put an image on the right side of a page.
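An IMG tag using these optional arguments might look like the sketch below. The pixel dimensions are invented for the example and are not the real size of the image:

```html
<IMG SRC="images/ackpffft.gif" ALT="ACK" WIDTH=120 HEIGHT=60 ALIGN=right>
```

Note that IMG has no closing tag, since it does not enclose any text.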

5. Other tags

Some tags are used to indicate some special way to display text.

The STRONG and EM tags are used to highlight certain sections of text; most browsers display these as bold and italic, respectively. There are also B and I tags that explicitly mean bold and italic.

The CENTER tag centers all the text or other elements within it.
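A short sketch of these text-display tags in use, with made-up sentences:

```html
<CENTER>
This rule is <STRONG>very</STRONG> important, and <EM>everyone</EM> should read it.
This word is <B>bold</B> and this one is <I>italic</I>.
</CENTER>
```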

The UL and OL tags are used to create lists. OL stands for ordered list; UL for unordered list. Inside an OL or UL block, the LI tag marks the start of a list item. The items in an ordered list are typically displayed as numbered or lettered or roman-numeraled, etc. The items in an unordered list are marked with bullets. You can nest lists; most browsers will use different types of numbers/markers for each level of nesting, and each level is also indented differently.
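As an illustration, here is a sketch of an ordered list with an unordered list nested inside its second item. The item text is made up:

```html
<OL>
<LI>First item
<LI>Second item
  <UL>
  <LI>A nested, bulleted item
  <LI>Another nested item
  </UL>
<LI>Third item
</OL>
```

The indentation in the file is only for the reader's benefit; the browser works out the nesting from the tags.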

The BLOCKQUOTE tag indents a section of text. It is used in the proposal and rule files to mark the text of each proposal/rule.

HTML of the form

<!--some text-->

is a comment. This is useful for temporarily removing some text or HTML coding from a page without actually deleting it.
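For instance, in this sketch (with made-up text) the second paragraph stays in the file but is not displayed:

```html
<P>This paragraph is displayed.</P>
<!-- <P>This paragraph is temporarily hidden.</P> -->
```

The text being hidden should not itself contain two hyphens in a row, since that sequence is what ends a comment.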

Tables use the TABLE tag, as well as several other tags that can only appear within a TABLE block. (Tables are used on the gallery page, for instance.) A TABLE block normally consists of several TR blocks, each of which represents a table row. A TR block normally consists of several TD blocks, each of which represents a table cell. There are arguments that can be used in TABLE, TR, and TD tags to control the appearance of a table. Look at the source to the gallery page to see how this works.
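A minimal table might look like the sketch below. It is not taken from the gallery page; the names, scores, and the choice of BORDER and ALIGN arguments are only for illustration:

```html
<TABLE BORDER=1>
<TR><TD>Player</TD><TD>Score</TD></TR>
<TR><TD>Alice</TD><TD ALIGN=right>12</TD></TR>
<TR><TD>Bob</TD><TD ALIGN=right>7</TD></TR>
</TABLE>
```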

The map page uses a client-side imagemap. Originally, imagemaps operated via a CGI program on the server, but the modern form allows the client to interpret the imagemap. To make a client-side imagemap, add a USEMAP argument to the IMG tag for the image. USEMAP takes a value, which is the URL for the map, usually included in the same document. The map is enclosed in a MAP block. The MAP tag takes a NAME argument (like the A tag) which is used to refer to the specific map by the USEMAP argument. The MAP block contains several AREA tags. Each AREA tag has an HREF argument (the thing linked to, as in an A tag), a SHAPE argument (usually rect for rectangle, but there are others), and a COORDS argument (the coordinates for that link, in pixels). Also common is an ALT argument, for non-graphical browsers or if the image is broken. See the map page's source to see how this is used.
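The pieces fit together roughly as in this sketch. It is not the map page's actual source: the image name, the map name ackamap, the coordinates, and the linked files are all hypothetical:

```html
<IMG SRC="images/map.gif" ALT="Map of Acka" USEMAP="#ackamap" BORDER=0>
<MAP NAME="ackamap">
<AREA SHAPE=rect COORDS="0,0,99,49" HREF="north.html" ALT="North">
<AREA SHAPE=rect COORDS="0,50,99,99" HREF="south.html" ALT="South">
</MAP>
```

For a rect, the four COORDS values are the left, top, right, and bottom edges of the rectangle, measured in pixels from the top left corner of the image.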

The BODY tag in an HTML document can take several arguments, controlling the default colors of text on the page, and the background color or image. The most common are BACKGROUND="/deadgames/acka/images/some url" for a background image, and BGCOLOR="color" for a solid-colored background, where color is either a color name (there is a list of accepted color names somewhere) or an RGB spec, which looks like #RRGGBB where RR is the red intensity as a two-digit hexadecimal number, etc. Other possible arguments include TEXT (default text color), LINK (default color for link text), VLINK (default color for visited links, on browsers set to show the distinction), and ALINK (active link, the color that is flashed when you click on a link).
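As a sketch, a BODY tag that sets all of these colors with RGB specs could look like this. The particular colors are just an example: white background, black text, blue links, purple visited links, and red active links:

```html
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
Body text goes here.
</BODY>
```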

6. HTML Validators

There are HTML validators available for free use on the web. These programs present a form where you can enter the URL of a web page you wish to have checked. The validator will download your page, check it, and show you a page with the results of the check -- if the page does not validate for some reason, it will show you approximately where and what the problem is. These validators are invaluable for finding simple errors like unclosed blocks, unescaped < or > in text, etc. I have been using the one at http://ugweb.cs.ualberta.ca/~gerald/validate/ for checking the Acka web pages. I never got around to checking every page this way, but all the ones that have a !DOCTYPE meta-tag at the beginning have been checked at some time or other.