Advanced HTML:
Moving to the future with XHTML

This article first appeared in Create Online 32. It was written for print, not web, so be prepared to scroll...

Clean up your coding and learn how to make your work forward compatible. In this Masterclass for experienced web designers and producers, we’ll explore how to begin using XHTML to build tighter pages.

In this feature, we’re going to show you how to improve the standard of the code you use in your sites and build pages that will work well in the new generations of browsers that are coming into widespread use. For many developers, this means finally coming to grips with the latest version of HTML: XHTML.

Moving towards the future

The most important step to take in moving to XHTML is to try to separate the content and presentation. This isn’t to say that content and form are unrelated, but the idea is that whatever you’re building, try to think of your content in a more structural way and mark it up using the appropriate HTML tags. This will help greatly in making your pages easier to style and making them useful in a wider range of user agents.

As much as possible, (X)HTML should be used to define the content and structure of a document, while the appearance of the page should be applied using Cascading Style Sheets (CSS). This is the hard part at present. Because of legacy browsers, in particular NN4, much of the presentation still has to be done with HTML; we’re talking especially about things like page margins, the layout, and areas of background colour. At the same time, we can gradually change from the old hacks to newer CSS techniques where some loss of design consistency in bad browsers can be tolerated.

We’re in a period of transition; web design is moving from an era where backward compatibility with older browsers was paramount into a new one where forward compatibility is seen as just as important. The aim is to create pages that look good and represent the brand effectively while also supporting a wide range of other clients such as PDAs, speech synthesisers and other assistive technology. The trick is to do all this using the same document. This is possible only with thoughtful use of design techniques and through building well-structured pages that follow web standards.

Up until now, we’ve been hampered by trying to maintain visual consistency with older browsers, but in the last couple of years we’ve seen the version 3 and now the version 4 browsers slip lower in importance. The idea of allowing these legacy browsers to gracefully degrade the design while still allowing access to the content is taking a firmer hold. Recently, there have been some commercial sites that just don’t work at all in NN4; this can’t be commended in itself, but may be indicative of a change of feeling in the industry towards the legacy browsers.

When contemplating a move to completely standards-compliant design, the most troublesome area is that of using tables for layout. These have long been viewed as problematic for accessibility. The problem there doesn’t actually stem from the tables themselves, but from the way that the content is linearised within a text-only environment. In moving forward to XHTML and future languages, layout tables are an anachronism, and most web sites would see benefits from discarding the layout tables if the same design behaviour could be achieved in the target browsers. (For one thing, pages can be as little as one third of the file size when redesigned using CSS instead of tables.) We’re not quite there yet, though.

The problem is that the alternatives to tables – CSS-positioned DIVs and other elements – lack many of the design features that tables make so rewarding to use in web design. Proportional multi-column layouts, precise alignment of content within cells and reliable control of background colour within browsers from version three onwards are some of the features that are most missed by web designers moving to CSS positioning. CSS layout does allow the same layout techniques; it just requires much more knowledge to make them work.

The visual editors don’t help. Through the usual feature-bloat that comes with mature applications, they encourage bad design and bad coding. Only with the advent of Dreamweaver MX is there a visual authoring tool that makes any effort to help developers trying to code to standards.

The good news for those web designers who would like to at least start on this process is that working to standards doesn’t necessarily mean forgoing tables. Where CSS positioning represents a good design solution, by all means use it; where tables are better, use them. Within the tables, we need to pay much more attention to the structure of the content.

While still using tables for basic design, it’s a good idea to experiment with CSS alternatives to spacer images and deeply nested tables in order to make the tables simpler. This will make pages faster and cleaner while still maintaining a fair approximation of the layout in the older browsers.
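
As a rough sketch of the idea, here is one spacer image replaced by CSS padding. The class name, the file name spacer.gif and the 20-pixel gap are all made up for illustration, and older browsers may only approximate the padding:

```html
<style type="text/css">
  /* "news" is a hypothetical class name; 20px is an arbitrary gap */
  td.news { padding-left: 20px; }
</style>

<!-- Before: a transparent spacer image in its own cell forces the gap -->
<table cellspacing="0" cellpadding="0">
  <tr>
    <td><img src="spacer.gif" width="20" height="1" alt="" /></td>
    <td>Latest news</td>
  </tr>
</table>

<!-- After: one cell fewer, with the gap supplied by CSS padding -->
<table cellspacing="0" cellpadding="0">
  <tr>
    <td class="news">Latest news</td>
  </tr>
</table>
```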

Improve your HTML structure

The first step is to stop putting in tags that don’t make sense from a structural perspective, and replace them with valid HTML and CSS. How much of this you can do within commercial projects depends on how close to the design you need NN4 to get. Many of the things that NN4 gets wrong or can’t render won’t affect the user experience all that much; in this transitional period, there’s room for flexibility in how exactly the design has to be reproduced in a six-year-old browser. Here’s a checklist to help you clean up your code:

Text formatting

It’s time to stop using <FONT> tags if you haven’t already. Go through and remove all the <FONT> tags from text in your pages. These control typeface, size and colour. Instead of <FONT> tags, use CSS to change the appearance of type contained inside the common container tags; we’ll see in the second part of this article next month how to implement CSS in a way that allows you to support good browsers well without wrecking the page in bad browsers.
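
A minimal sketch of the swap, assuming a plain paragraph of body copy. The typeface, size and colours are illustrative values only, and in a real page the style rules would live in the head of the document or in an external style sheet:

```html
<!-- Before: typeface, size and colour are buried in the mark-up -->
<p><font face="Verdana, Arial, sans-serif" size="2" color="#333333">Welcome to our site.</font></p>

<!-- After: the paragraph carries only content; a style rule supplies the look -->
<style type="text/css">
  /* Illustrative values; text colour is paired with a background colour */
  p {
    font-family: Verdana, Arial, sans-serif;
    font-size: 0.8em;
    color: #333333;
    background-color: #ffffff;
  }
</style>

<p>Welcome to our site.</p>
```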

CSS can be misused almost as badly as <FONT> tags. It’s bad practice to use <SPAN> tags or classes to format headings or subheadings on the page without an accompanying heading tag; text styled this way means little outside of a visual environment, and it isn’t structural mark-up.
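
To make the difference concrete, here is a small sketch. The class name "subhead" and its style values are hypothetical:

```html
<style type="text/css">
  /* "subhead" is a made-up class name; the values are illustrative */
  .subhead {
    font-size: 1.2em;
    font-weight: bold;
    color: #003366;
    background-color: #ffffff;
  }
</style>

<!-- Avoid: looks like a heading on screen, but has no structural meaning -->
<span class="subhead">Opening hours</span>

<!-- Prefer: a real heading element, styled through the same class -->
<h2 class="subhead">Opening hours</h2>
```

Both versions can look identical in a CSS-capable browser, but only the second still reads as a heading when the style sheet is ignored.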

Background colour and background images

Where possible, remove BGCOLOR and BACKGROUND attributes from tables, rows and cells as well as the body tag. Use CSS classes instead; you’ll find that as well as being more controllable within pages and across the site, this is better from a design perspective. In CSS, you can anchor background images to the left, centre or right of a container, such as a table, allowing you to use smaller images and fewer nested tables.

An important consideration is that if you use CSS for controlling the colour of foreground elements like text, you also need to use CSS for background colour in those areas; don’t mix CSS text formatting with HTML background colours. This way if CSS formatting isn’t displayed, both the text and the background revert to default. Otherwise, users of older CSS-challenged browsers like IE3, NN3 and NN4 may find that they can’t read the text, especially if you used a dark background colour or image.
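
The following sketch shows one way this might look for a single table cell. The class name "promo", the image corner.gif and the colours are placeholders, not taken from a real site:

```html
<style type="text/css">
  /* "promo" is a hypothetical class; corner.gif is a placeholder file name */
  .promo {
    color: #ffffff;                 /* text colour and...                     */
    background-color: #003366;      /* ...its background, both set in CSS     */
    background-image: url(corner.gif);
    background-repeat: no-repeat;
    background-position: right top; /* anchor the image to the right edge     */
  }
</style>

<!-- No bgcolor or background attributes on the table, row or cell -->
<table width="100%" cellspacing="0" cellpadding="4">
  <tr>
    <td class="promo">Special offers this month</td>
  </tr>
</table>
```

If the style sheet is not applied, the cell falls back to default text on a default background, so it stays readable.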

Don’t use <B> and <I> to format text to bold and italic

Use <STRONG> and <EM> respectively instead. Most web designers heard this long ago, but it’s a chore applying these instead of the default tags you get using the bold and italic formatting buttons in most web editors. There’s good news, however: Dreamweaver MX now uses these automatically when you apply bold or italic using the Properties palette.
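
The change is a straight substitution, as this short example with made-up sentence content shows:

```html
<!-- Before: purely visual tags -->
<p>Orders <b>must</b> be received by <i>noon</i>.</p>

<!-- After: the same emphasis, expressed structurally -->
<p>Orders <strong>must</strong> be received by <em>noon</em>.</p>
```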

Don’t use tables to simulate lists

Use the available list styles instead. You can use CSS to define an image to be used in place of a bullet. Netscape 4 shows regular bullets instead, which is fine.
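
A minimal sketch, assuming a small bullet graphic called bullet.gif and a hypothetical class name "features":

```html
<style type="text/css">
  /* bullet.gif and "features" are placeholder names */
  ul.features { list-style-image: url(bullet.gif); }
</style>

<ul class="features">
  <li>Fast delivery</li>
  <li>Secure ordering</li>
  <li>Free returns</li>
</ul>
```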

Clean up the <BODY> tag

Try to avoid using the LEFTMARGIN, TOPMARGIN, MARGINWIDTH, MARGINHEIGHT attributes to control browser margin, as these prevent your pages from validating; apply margin:0 to the body tag through CSS instead, if you can convince your boss/client that it doesn’t really look that bad in NN4. Don’t use TEXT, LINK, ALINK or VLINK to control the colour of text in the page; CSS properties for these work well. Don’t use the BGCOLOR or BACKGROUND attributes to control colour and background images; the CSS properties give much better control here just as they do within tables.
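
As a sketch of the clean-up, the old attributes are shown in a comment and the CSS equivalents follow. All of the colour values are arbitrary examples:

```html
<!-- Before (will not validate):
<body leftmargin="0" topmargin="0" marginwidth="0" marginheight="0"
      text="#000000" link="#003399" vlink="#663399" alink="#cc0000"
      bgcolor="#ffffff">
-->

<!-- After: the style rules belong in the head, or in an external style sheet -->
<style type="text/css">
  body      { margin: 0; color: #000000; background-color: #ffffff; }
  a:link    { color: #003399; }
  a:visited { color: #663399; }
  a:active  { color: #cc0000; }
</style>

<body>
  <p>Page content goes here.</p>
</body>
```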

Use semantic mark-up for text

Use <Hx> tags wherever possible for headings and subheads, as these have special significance to screen readers and search engines. Format them with CSS to give the correct visual result. Learn to use tags like <CITE> and <ABBR> to mark up meaningful pieces of text like citations and abbreviations. Many tags, including these and the link tag <A>, allow the insertion of a title attribute that appears as a tool tip for many users. Touches like this add a great deal to the user’s experience on a site without conflicting with the visual design.
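
Here is a short illustration of these tags working together. The headings, link target and wording are invented for the example:

```html
<h1>Advanced HTML</h1>
<h2>Moving to the future with XHTML</h2>

<p>
  This article first appeared in <cite>Create Online</cite>. It looks at how
  <abbr title="Cascading Style Sheets">CSS</abbr> can take over the job of
  presentation. See the
  <a href="links.html" title="Validators and further reading">accompanying links</a>
  for more.
</p>
```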

Use <DIV> tags as containers

To use <DIV> tags with CSS as an alternative to table layout, look at your design as a set of rectangular areas; navbar, ads, title bar, content area, and so on. Define each area with its own <DIV> tag and assign an ID attribute to it with a meaningful name. You’ll find that this helps when building the CSS for the page, as well as improving readability in the code. The ID attributes allow you to apply individual CSS rules to each section that can include, for example, positioning, background colour and background image.
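
A rough sketch of this approach for a three-area page follows. The ID names, measurements and colours are hypothetical, and a real layout would need testing in each target browser:

```html
<style type="text/css">
  /* All ids, sizes and colours here are illustrative assumptions */
  #titlebar { height: 80px; color: #ffffff; background-color: #003366; }
  #navbar   { position: absolute; top: 80px; left: 0; width: 150px; }
  #content  { margin-left: 170px; }
</style>

<div id="titlebar">Site name and logo</div>
<div id="navbar">Links to the main sections</div>
<div id="content">The article itself</div>
```

Because each area has its own ID, moving the navigation bar to the other side of the page would only mean changing the style rules, not the mark-up.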

It’s this sort of structural mark-up that allows radical changes to site design to be implemented simply by editing a style sheet file. When support for CSS matures a little more, we can define separate style sheets for groups of user agents like PDAs, voice readers and printing, as well as desktop browsers, making site versioning potentially unnecessary.
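
One way this could eventually look uses the media attribute of the link tag, placed in the head of the page. The file names are placeholders, and how many of these media types a given user agent honours is something to test rather than assume:

```html
<!-- Placeholder file names; the media values are CSS2 media types -->
<link rel="stylesheet" type="text/css" media="screen" href="screen.css" />
<link rel="stylesheet" type="text/css" media="print" href="print.css" />
<link rel="stylesheet" type="text/css" media="handheld" href="handheld.css" />
<link rel="stylesheet" type="text/css" media="aural" href="aural.css" />
```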

Transitioning to XHTML

After CSS, the most important technology to be familiar with is XHTML, the newest version of HTML. Its importance is not that it adds significant new capabilities to HTML 4.0, but that it represents a transitional step to a future that uses XML as the means of describing content. XHTML is a reformulation of HTML 4.0 in XML 1.0 that is designed to offer backward compatibility with existing browsers as well as forward compatibility with future browsers. Most conscientious web developers should probably start using it wherever possible from now on.

XHTML is supported by the current generation of standards-compliant browsers, including Internet Explorer 6.0 for Windows, Internet Explorer 5.0 on the Mac, Opera 6.0 and Netscape 6.0. While it is much stricter than HTML 4.0, and developers are encouraged to ensure that their pages validate correctly, XHTML is written in pretty much the same way as HTML, with the addition of some simple principles that are easy to pick up. Dreamweaver MX adds good support for XHTML, making it possible to start producing compliant pages straight out of the box.

[Figure: A minimal document in XHTML]

Here are the rules to create pages according to XHTML 1.0:

  1. You need to include a <!DOCTYPE> declaration at the top of the page that identifies the tags to be used and the rules for validating them. This is a concept from XML. These rules are contained in an actual text file called a DTD (Document Type Definition). Most <!DOCTYPE> declarations reference a document on the W3C server that contains these rules. If the declaration is absent, or the URL for the DTD is invalid, then the page can’t be validated as XHTML. There are three DTDs for XHTML 1.0: Transitional, Frameset and Strict. You need to reference the correct DTD depending on how you have built your page and expect it to be validated. Most commercial pages will need to use the Transitional DTD.
    This is the declaration to use for pages to be validated against XHTML 1.0 Transitional:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  2. You must also have this line following the <!DOCTYPE> declaration. It takes the place of the opening HTML tag in HTML 4.0:
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  3. There may be an optional <?xml ?> declaration that precedes the <!DOCTYPE> declaration as the very first line in the document. This can be used to define the character encoding used by the page. This is added by default by Dreamweaver when you tell it to make a page XHTML compatible. It’s probably best to omit it or remove it when you see it, though, as it can cause problems in certain browsers.
  4. All tags must be correctly nested; you can’t have any interlocking tags, just like HTML 4.0.
  5. All tags have to be closed, including <TD>, <P>, <LI> and other tags that you may not be used to closing. Many coders have been doing this for a while anyway in order to work correctly with CSS.
  6. Empty tags must be closed; these are tags like <IMG> and <BR>. Rather than including an equivalent closing tag with no content between it and the opening tag (this is valid too, but can give uncertain results in existing browsers), you can use a minimised form, where the opening tag ends with a space and a forward slash, like this: <br />. The space before the forward slash isn’t required, but adding it prevents certain rendering problems in bad browsers.
  7. All mark-up must be in lower case, including both tag names and attribute names. This may require a bit of effort if you preferred using upper case to help distinguish tags from content. There’s no longer a choice if you want your code to work in XHTML, though.
  8. All attribute values must be quoted, including numeric values such as widths. Single or double quotes are both allowed; double quotes are the usual choice.
  9. Attribute minimisation is not permitted; this means attributes like the CHECKED attribute for checkboxes must be written out as a full name/value pair, like this: checked="checked"
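
Putting the rules above together, a minimal XHTML 1.0 Transitional page might look something like this sketch. The title, text, form action and character encoding are assumptions chosen for illustration; the DOCTYPE and html lines are the ones given in rules 1 and 2:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <!-- Encoding declared here rather than in an XML declaration (rule 3) -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>A minimal XHTML document</title>
</head>
<body>
  <!-- Rules 4, 5 and 7: nested correctly, closed, and in lower case -->
  <p>Every element is closed,<br />
  including the empty ones (rule 6).</p>

  <!-- "subscribe.cgi" is a placeholder for a real form handler -->
  <form action="subscribe.cgi" method="post">
    <p>
      <!-- Rules 8 and 9: quoted values, no minimised attributes -->
      <input type="checkbox" name="news" id="news" checked="checked" />
      <label for="news">Send me the newsletter</label>
      <input type="submit" value="Subscribe" />
    </p>
  </form>
</body>
</html>
```

Running a page like this through the validator is a quick way to confirm that the declaration and the rest of the mark-up agree with each other.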

There are a few other rather more obscure requirements, but the important thing is to give it a shot. In order to help you learn XHTML, the W3C provides online validation engines for both HTML and CSS. You’ll find pointers to these in the accompanying links. The idea is to write your code, validate it, then fix the bugs and repeat the process until the validation reports no errors.

Try reading your source as if it were the page instead, and see if you can figure out what it’s about. If your HTML structure is good, it’ll make a lot of sense. If you can’t see the content for a forest of table code, maybe your page is trying to tell you something.

Copyright Ian Anderson 2002. All rights reserved.