CIS 89C Client-Side Programming with JavaScript

Unit 11 - Document Object Model

Reference for the first half of these notes:

Professional JavaScript for Web Developers

by Nicholas C. Zakas

Chapter 6 DOM basics

Topics for the first half of these notes:

Opening statement

What is the DOM - Zakas page 159

Introduction to XML - Zakas page 159

An API for XML - Zakas page 162

Hierarchy of nodes - Zakas page 163

Language Specific DOMs - Zakas page 166

DOM support - Zakas page 167

Using the DOM - Zakas page 167

Accessing relative nodes - Zakas page 167

Checking the node type - Zakas page 169

Dealing with attributes - Zakas page 169

Accessing specific nodes - Zakas page 171

Creating and manipulating nodes - Zakas page 173

DOM HTML features - Zakas page 178

Attributes as properties - Zakas page 178

Table methods - Zakas page 179

DOM Traversal - Zakas page 182

NodeIterator - Zakas page 182

TreeWalker - Zakas page 187

Detecting DOM Conformance - Zakas page 189

DOM Level 3 - Zakas page 191

Summary - Zakas page 191

Reference for the second half of these notes:

JavaScript, A Beginner's Guide, Second Edition

by John Pollock

Module 9 The Document Object

Topics for the second half of these notes:

9.1 Defining the Document Object - Pollock page 224

9.2 Using the Properties of the Document Object - Pollock page 224

9.3 Using the Methods of the Document Object - Pollock page 256

General reference for the model and its use in JavaScript:

JavaScript, The Definitive Guide, 5th Edition

by David Flanagan

I use this reference book whenever I use the Document Object Model.

***********************************************************

Opening statement

The Document Object Model (DOM) is used to contain the information of a web page.

We can look at the information in the Document Object Model.

For example, we can look at what the user has typed into an input text area.

We can also change the Document Object Model, to change what is in the web page.

So, for example, we can read the input data from two input text areas, convert these strings into numbers, multiply the two numbers, and put the answer into the web page, where the user can see the answer.
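The multiplication example just described might be sketched as follows. This is only a sketch under assumptions: the ids "first", "second", and "answer" are hypothetical names for two input text areas and an output element, and the function name multiplyInputs is made up for illustration. The document is passed in as a parameter so the same code can also be tried outside a browser.

```javascript
// Sketch only: "first", "second", and "answer" are hypothetical ids.
function multiplyInputs(doc) {
  // Read the strings the user typed into the two input text areas.
  const firstText = doc.getElementById("first").value;
  const secondText = doc.getElementById("second").value;

  // Convert the strings into numbers and multiply them.
  const product = Number(firstText) * Number(secondText);

  // Put the answer into the web page, where the user can see it.
  doc.getElementById("answer").textContent = String(product);
  return product;
}

// In a browser you would call: multiplyInputs(document);
```

Reading a value and writing textContent are both examples of using the DOM: the first looks at the model, the second changes it.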

The Document Object Model makes possible many of the things we do with JavaScript. The Document Object Model is language independent, so it can be used in Java or other programming languages, as well as in JavaScript.

Of course, we will use the Document Object Model in this course.

We will not use the more advanced aspects of the DOM, but we will encounter the DOM throughout much of the rest of this course.

McDuffie introduces the DOM very briefly in the first chapter, but then just uses it without any additional introduction.

I think we need an introduction to the DOM, so it is clear what we are doing when we encounter the DOM objects.

This discussion follows the book:

Professional JavaScript for Web Developers

by Nicholas C. Zakas

Chapter 6 DOM basics

So you should take careful notes in this discussion, because there is no corresponding discussion in McDuffie that focuses on this subject.

You will find the needed pieces of the DOM in McDuffie.

***********************************************************

What is the DOM

The Document Object Model (DOM) provides an object-based representation for an eXtensible Markup Language (XML) document.

Remember, one of the first things we did in this course was to discuss changing from HTML to XHTML.

We did this because XHTML is an XML language.

The DOM can be used, to some extent, with ordinary HTML, but there are many areas where this does not work.

So we are using XHTML, which is HTML rewritten to follow the XML rules.

We can see all the elements and data of the web page by looking at the DOM.

We can change the web page by changing the DOM.

*****

Introduction to XML

Let's review XML.

First, the history of XML:

Before there were any computers or linotype machines, type was set by hand.

After the type for a page was set, a sample page was printed.

The editor would look at the page, and write suggested changes on the page.

Then the page would be changed to meet the requirements of the editor.

The markup symbols used by editors became standardized, so everyone was using, more or less, the same markup language.

When we had computers, and programming products were produced, there were programming manuals.

There are almost no programming manuals today, but they previously were very important.

To be able to build and print programming manuals, the language script was created.

With script I could mark the beginning of a paragraph or a heading, or other simple markups in a programming manual for the program product I was working on building.

A program used these markups to make paragraphs, boldface headings, et cetera.

The script language was somewhat improved and renamed the Generalized Markup Language (GML), which was released as a product, so everyone could use it.

Research on markup languages led to the creation of SGML.

SGML is a markup language used to create markup languages.

SGML (Standard Generalized Markup Language) is in general use today.

HTML is specified in SGML.

XML is specified in SGML.

XHTML is specified in SGML.

The <!DOCTYPE > tag at the beginning of your web pages is SGML.

It says that your document is XHTML.

Why do we use XHTML, rather than the older HTML?

HTML:

opening tag closing tag

Sometimes the closing tag is required.

<script>          </script>

Sometimes the closing tag is forbidden.

<img>

Sometimes the closing tag is optional.

<p>               </p>

Sometimes the order of the closing tags does not matter.

Sometimes attribute values are required.

<img src="dog.gif">

Sometimes attribute values are not required.

<hr noshade>

Sometimes attribute values do not require quotes.

Sometimes attribute values require quotes.

"what is it?" (quotes required due to spaces)

HTML is not case sensitive.

All these differences are allowed by SGML, and are specified in HTML.

But these differences are a pain to parse.

They make the browsers larger and slower.

They are very hard to manage in the DOM.

So we will use XHTML.

XML:

Every tag must have a closing tag, or

a tag may end with a slash before the closing greater than.

<p>               </p>

<hr />

Inner elements must be closed before their containing elements are closed:

<b><i>bold Italic text</i></b>

(The opening <i> and closing </i> are within the opening <b> and closing </b>.)

All attributes require values.

<hr noshade="noshade">

Attribute values must always be within single or double quotes.

Tag names and attribute names are case sensitive.

XHTML:

XHTML is XML, and follows all the XML rules.

Every tag name and attribute name must be lower case.

Tags ending with a slash should have a space before the slash.

<hr />

Reference discussion of SGML and XML:

http://www.w3.org/TR/NOTE-sgml-xml.html

Notes on using XML:

The first line is the XML prolog:

<?xml version="1.0"?>

We are also specifying the encoding:

<?xml version="1.0" encoding="UTF-8"?>

Like HTML, XML uses SGML comments:

<!--  This is a comment.  It may not contain double dashes. -->

You can use the SGML CDATA container, for text you do not want the parser to look at.  This would be data containing < > and other text that the parser might give special meaning.

<![CDATA[This stuff is just text a < b while b > c ]]>

It starts with <![CDATA[ and ends with ]]>

Notice that SGML tag names are upper case.

We are using the HTML link tag to include a style sheet:

<link rel="stylesheet"

      type="text/css"

      href="sample.css"

      title="sample" />

XML style sheets can be specified with:

<?xml-stylesheet type="text/css" href="sample.css" ?>

This kind of XML tag is called a Processing Instruction (PI).

*****

An API for XML

The DOM is called an API (Application Programming Interface) for XML.

That is because the program can change the XML tags, and with them, the web page, by just changing the DOM.

Also the application can see the XML tag information by looking at the DOM.

This API works for any programming language. We are using JavaScript.

*****

Hierarchy of nodes

The DOM is a tree, with one root node at the top, and the other nodes as children of nodes, all the way down to the leaves at the bottom.

(Trees always have their root up and their leaves down in programming.)

Example:

<?xml version="1.0" encoding="UTF-8"?>

<html>

<head>

<title>

sample

</title>

</head>

<body>

Hello

</body>

</html>

The first line is the XML prolog.

All the other XML elements are represented by nodes in the DOM tree.

Document               html
                 _______|_______
                 |             |
Elements        head          body
                 |             |
Element        title           |
                 |             |
Text           sample        Hello

Each element in the document is represented by a node.

The root node, at the top, is called the Document node.

The other XML elements are called Element nodes.

The text is in Text nodes.

We will use these types of nodes. We will also use Attr nodes, which are used for attributes.

Other node types are:

DocumentType node            for <!DOCTYPE>
DocumentFragment node        an incomplete document
CDATASection node            unparsed text
EntityReference node         for entity references such as &copy;
ProcessingInstruction node   for processing instructions
Comment node                 for <!-- comments -->
Notation node                not used much

Nodes have properties

Node properties:

nodeName          results vary by type of node
nodeValue         results vary by type of node
nodeType          one of the node types listed above
ownerDocument     refers to the root (Document) node
firstChild        refers to the first child
lastChild         refers to the last child
childNodes        NodeList of all children
previousSibling   refers to the previous child of the same parent
nextSibling       refers to the next child of the same parent
attributes        NamedNodeMap of the attributes (Element nodes only)

A NodeList works like an array with numerical subscripts.

A NamedNodeMap works like an array in which each value can be accessed with either the attribute name or a number as the subscript.
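As a rough illustration of these properties, the sketch below walks a tree of nodes and builds an indented outline, using only nodeType, nodeName, nodeValue, and childNodes. The function name outline is hypothetical, and the value 3 is the standard numeric nodeType for Text nodes.

```javascript
const TEXT_NODE = 3; // numeric nodeType value for Text nodes

// Sketch: return an array of lines, one per node, indented by depth.
function outline(node, depth = 0) {
  const indent = "  ".repeat(depth);
  let lines = [];
  if (node.nodeType === TEXT_NODE) {
    // For a Text node, nodeValue holds the text itself.
    lines.push(indent + '"' + node.nodeValue.trim() + '"');
  } else {
    // For an Element node, nodeName holds the tag name.
    lines.push(indent + node.nodeName);
  }
  // childNodes has a length and numerical subscripts, like an array.
  for (let i = 0; i < node.childNodes.length; i++) {
    lines = lines.concat(outline(node.childNodes[i], depth + 1));
  }
  return lines;
}

// In a browser you could try: console.log(outline(document).join("\n"));
```

The results really do vary by type of node: an Element reports its tag name in nodeName, while a Text node carries its content in nodeValue.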

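Node methods:

hasChildNodes()                        returns a Boolean, true if the node has children
appendChild(node)                      adds the node as the new last child
removeChild(node)                      removes the node from the list of children
replaceChild(newnode, oldnode)         replaces the child oldnode with newnode
insertBefore(newnode, referencenode)   inserts newnode in front of the child referencenode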
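To show how these methods fit together, here is a small sketch of a helper that makes a node the first child of a parent. The name prependChild is hypothetical (it is not one of the Node methods listed in these notes); the helper is built from hasChildNodes(), insertBefore(), and appendChild().

```javascript
// Sketch: make newNode the FIRST child of parent.
function prependChild(parent, newNode) {
  if (parent.hasChildNodes()) {
    // Put the new node in front of the current first child.
    parent.insertBefore(newNode, parent.firstChild);
  } else {
    // No children yet, so the new node becomes both first and last.
    parent.appendChild(newNode);
  }
  return newNode;
}

// In a browser, for example:
//   const note = document.createElement("p");
//   prependChild(document.body, note);
```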

*****

Language Specific DOMs

Besides the general node information we just discussed, there may be things specific to the XML language.

There are things that are specifically for XHTML.

These include:

HTMLDocument

HTMLElement

Most types of HTML elements also have their own element type, for example HTMLDivElement for div elements.

We will see some of these HTML specific things, as well as general XML things.

These usually work with HTML, but are more likely to work with XHTML.

***********************************************************

DOM support

DOM support is different in different browsers.

For example, Internet Explorer puts a text node only where there is actual text, but Firefox also puts a text node wherever there is white space (such as a line break) between elements, even if there is no visible text there.
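One common way to write code that does not depend on this difference is to skip the text nodes and look only at Element nodes. The sketch below assumes that approach; the function name elementChildren is hypothetical, and 1 is the standard numeric nodeType for Element nodes.

```javascript
const ELEMENT_NODE = 1; // numeric nodeType value for Element nodes

// Sketch: return only the children that are Element nodes,
// ignoring any text nodes the browser may have created.
function elementChildren(parent) {
  const result = [];
  for (let i = 0; i < parent.childNodes.length; i++) {
    const child = parent.childNodes[i];
    if (child.nodeType === ELEMENT_NODE) {
      result.push(child);
    }
  }
  return result;
}
```

With this helper, the list of children should come out the same whether or not the browser has added white-space text nodes.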

Generally speaking, Mozilla and Firefox have the best support, supporting DOM levels 1, 2, and parts of level 3.

Opera and Safari are close behind.

Lagging behind with incomplete level 1 support is Internet Explorer.

(Information as of when the Zakas book was published.)

***********************************************************

Part 1 lecture notes end here -

continue later from page 167 in Zakas, as needed.

***********************************************************

Part 2 lecture notes begin here - from Pollock

***********************************************************

9.1 Defining the Document Object

The browser creates a Document Object for each web page.
The Document Object contains properties for everything in the web page.

This is a very important Object.

You can look at the Document Object to find out things about the web page.

You can change the Document Object to change the web page.

You have been using the method:

document.write()

There are many other methods, which help you use the Document Object to see what is in the web page, or to change the web page.

***********************************************************

9.2 Using the Properties of the Document Object

The Pollock book lists properties and methods of the document Object.

The Flanagan reference book gives an exact list of properties and methods.

You can look up all the properties and methods in these books.

We will look at a few of them in this lecture.

The description in this lecture is more precise than that given by Pollock; it comes from the Flanagan book.

A Document Object represents one XML web page.

In our case, the XML web page we will be using is an XHTML web page.

There is one Document Object for each page that is open in the browser window, or in a tab, or in a frame, or in an iframe.

The Document Object is one of the Node objects in the Document Object Model.

Properties in a Document Object:

name             readonly?  type of object     description
                            referenced
defaultView      readonly   Window             window this page is displayed in
doctype          readonly   DocumentType       document type
                                               (null if no !DOCTYPE specified)
documentElement  readonly   Element            root element, "html" for HTML pages
implementation   readonly   DOMImplementation  a set of global document methods
styleSheets      readonly   CSSStyleSheet[]    an array of the style sheets for this page
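A few of these properties can be read as in the sketch below. The function name describeDocument and the names of the fields it returns are hypothetical; the properties it reads (doctype, documentElement, styleSheets) are the ones in the list above.

```javascript
// Sketch: collect a few facts about a document object.
function describeDocument(doc) {
  return {
    // doctype is null when the page has no <!DOCTYPE>.
    doctypeName: doc.doctype ? doc.doctype.name : "(none)",
    // documentElement is the root element of the page.
    rootElement: doc.documentElement.nodeName,
    // styleSheets works like an array, so it has a length.
    styleSheetCount: doc.styleSheets.length
  };
}

// In a browser you could try: console.log(describeDocument(document));
```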

This does not look like the list of properties given in Pollock.

This list is from Flanagan.

This is a real Document Object.

There is a subordinate Object, the HTMLDocument Object.

It is the HTMLDocument Object that Pollock is discussing.

So, you see that the DOM is relatively complex.

Why is Pollock different than Flanagan?

1) Flanagan is more recent, reflecting the move from HTML to XML based documents.

2) Flanagan is a more precise reference book.

Some of the properties in the HTMLDocument Object are:

name          readonly?  type of object   description
                         referenced
anchors       readonly   HTMLCollection   reference to a collection, which works like
                                          an array of all the <a> elements that have
                                          a name attribute
body          no         HTMLElement      reference to the body element
                                          or outermost frameset element
lastModified  readonly   String           date/time of last modification, which is
                                          optionally sent by the server
forms         readonly   HTMLCollection   reference to a collection, which works like
                                          an array of all the <form> elements
images        readonly   HTMLCollection   reference to a collection, which works like
                                          an array of all the <img> elements
                                          (images created with an object tag are
                                          NOT included)
links         readonly   HTMLCollection   reference to a collection, which works like
                                          an array of all the <a> and <area> elements
                                          that have an href attribute
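Because each of these collections works like an array, each one has a length and numerical subscripts. The sketch below counts the entries in four of the collections; the function name countCollections is hypothetical.

```javascript
// Sketch: count how many elements each collection holds.
function countCollections(doc) {
  const names = ["anchors", "forms", "images", "links"];
  const counts = {};
  for (const name of names) {
    // doc[name] is the collection, for example doc.forms.
    counts[name] = doc[name].length;
  }
  return counts;
}

// In a browser you could try: console.log(countCollections(document));
// A single entry is reached with a subscript, for example document.forms[0].
```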

Something interesting:

The window object has a location property, which contains the URL requested to load the web page.

The document object has a URL property, which contains the URL loaded.

1) They are usually the same.

2) Sometimes they are different, if the load was redirected to a different page.

3) document.URL is readonly. window.location can be changed.

If you change the URL in window.location, the new page will load, replacing the current page.
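Following the description in these notes, the two URLs could be read and the location changed as in this sketch. The function names describeUrls and loadPage, and the field names requested and loaded, are hypothetical. The window is passed in as a parameter so the code can also be tried outside a browser.

```javascript
// Sketch: report both URLs as strings.
function describeUrls(win) {
  return {
    requested: String(win.location), // location converts to the URL text
    loaded: win.document.URL
  };
}

// Sketch: assigning to window.location loads a new page.
function loadPage(win, newUrl) {
  win.location = newUrl;
}

// In a browser you could try: console.log(describeUrls(window));
```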

This is a good example of the sort of stuff you find in the Document Object Model.

So, there is one document object for the current XHTML page.

Is it created as a Document Object or as an HTMLDocument Object?

The answer is yes.

When it is created, it is created as an HTMLDocument.

However, an HTMLDocument object inherits the properties of a Document object.

So, the document object has all the properties of an HTMLDocument, and also all the properties of a Document object.

Also, a Document object inherits all the properties from a Node object, so you have all those properties also.

Also, a Node object inherits all the properties from an Object object, so you have all those properties also.
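This chain of inheritance can be listed by a program. The sketch below walks up the chain from an object and collects the name of each type it passes. The function name inheritanceChain is hypothetical, and the three Sketch classes are stand-ins invented so the example can run outside a browser; they are not the real DOM types.

```javascript
// Sketch: list the names of the types an object inherits from.
function inheritanceChain(obj) {
  const names = [];
  let proto = Object.getPrototypeOf(obj);
  while (proto !== null) {
    names.push(proto.constructor.name);
    proto = Object.getPrototypeOf(proto);
  }
  return names;
}

// Stand-in classes that mimic the shape of the real chain.
class SketchNode {}
class SketchDocument extends SketchNode {}
class SketchHTMLDocument extends SketchDocument {}

console.log(inheritanceChain(new SketchHTMLDocument()).join(" -> "));
// prints: SketchHTMLDocument -> SketchDocument -> SketchNode -> Object
```

In a browser, inheritanceChain(document) should list HTMLDocument, Document, and Node in that order; current browsers typically show an additional EventTarget entry between Node and Object.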

So let's look again at the question: Is the book's description of the properties of the document object correct? It is pretty close. For the most part, the book lists HTMLDocument properties.

Those are some of the most interesting properties of the document object.

So, let's look at another difference between what we see in Pollock, and what we see in Flanagan.

Pollock looks at a single, specific object, the document object for the current page.

The name of this object is: document

Flanagan looks at what is created when you create a document object, rather than looking at a specific single document object.

Somehow, I do not think we will learn much of the Object models in this class.

It is just too much.

We will do well to learn the fundamentals of JavaScript, a little of the Document Object Model, and management of HTML forms.

***********************************************************

9.3 Using the Methods of the Document Object

Some of the properties in an object are a reference to a function.

The function referenced by a property of an object is called a method.

A method can use the keyword this to refer to the current object.

This allows methods to retrieve or change information saved in the properties of the object.
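A minimal sketch of a method that uses this is shown below. The object name counter and its properties are hypothetical, chosen only for illustration.

```javascript
const counter = {
  count: 0,
  // increment is a property that refers to a function, so it is a method.
  increment: function () {
    this.count = this.count + 1; // "this" is the counter object
    return this.count;
  }
};

counter.increment();
counter.increment();
console.log(counter.count); // prints 2
```

The method changes information saved in a property of its own object, which is exactly what this makes possible.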

We have been using the method:

document.write("hello");

Here document is the object for the current HTML document loaded in our window, and write is a method. The write() method inserts the text "hello" into the current page.

This causes the word to appear in the current page, at the location in the page where the JavaScript document.write("hello"); is located.

getElementById() is a very interesting method, which allows us to obtain a reference to any element in our web page, provided it has an id.

It is very easy to give an element an id attribute in HTML, and then use getElementById() to refer to that element.
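As a sketch of this idea, the function below looks up an element by its id and changes the text it displays. The function name showMessage and the id "greeting" in the usage comment are hypothetical. Note the exact spelling getElementById, with a capital E, B, and I and a lower case d at the end.

```javascript
// Sketch: put a message into the element that has the given id.
function showMessage(doc, id, message) {
  const element = doc.getElementById(id);
  if (element === null) {
    // getElementById returns null when no element has that id.
    return false;
  }
  element.textContent = message;
  return true;
}

// In a browser, with <p id="greeting"></p> in the page:
//   showMessage(document, "greeting", "hello");
```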

Be careful spelling the methods. They often must have specific letters capitalized. Remember:

XHTML requires lower case.

JavaScript requires the case, exactly as specified.

***********************************************************

Summary

The Document Object Model and other related models are used to represent the XML or HTML page. By looking at the properties of the objects in the model, you can see what is in the page. By changing the properties of the objects in the model, you can change what is in the page.