Searching elements in DOM

  1. Methods
    1. document.getElementById
      1. Implicit id-valued variables
    2. document/node.getElementsByTagName
    3. Limit search by parent element
    4. document.getElementsByName
    5. document/node.getElementsByClassName
    6. document/node.querySelector, querySelectorAll
  2. XPath in modern browsers
  3. Query results are alive!
  4. Practice
    1. Label links
    2. Show children count
    3. More
  5. Summary

Most of the time, to react to a user-triggered event, we need to find and modify elements on the page.

The childNodes, children and other navigation properties are helpful, but they only allow moving between adjacent elements.

Fortunately, there are more global ways to query the DOM.

Methods

document.getElementById

The fastest way to obtain an element is to query it by id.

The following example queries the document for a div with id='info'. It doesn’t matter where the node is in the document, it will be found.

<body>
  <div id="info">Info</div>
  <script>
    var div = document.getElementById('info')
    alert( div.innerHTML )
  </script>
</body>

Note that there should be only one element with a given id value. Of course, you can violate that and give several elements the same id in the markup, but then the behavior of getElementById will be unreliable and inconsistent across browsers. So it’s better to stick to the standard and keep ids unique.

If no element is found, null is returned.
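Because of the null result, code that reads properties of the found element should check for it first. Below is a minimal sketch of such a check. The helper name getHtmlById and its fallback parameter are made up for illustration, and the document is passed in as an argument only to keep the snippet self-contained.

```javascript
// Hypothetical helper (not a DOM method): returns the innerHTML of the
// element with the given id, or the fallback when no such element exists.
function getHtmlById(doc, id, fallback) {
  var elem = doc.getElementById(id);
  // getElementById returns null when nothing matches
  if (elem === null) {
    return fallback;
  }
  return elem.innerHTML;
}

// In a browser page:
//   alert( getHtmlById(document, 'info', '(not found)') )
```

Without the check, reading innerHTML of a missing element would throw an error, because null has no properties.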

Implicit id-valued variables

Browsers implicitly create a global variable for every id.

For instance, run the following code. It will output “test”, because a is an automatically generated reference to the element.

<div id="a">test</div>
<script>
  alert(a.innerHTML)
</script>

In Internet Explorer that may lead to errors, see the example below.

<div id="a">test</div>
<script>
  a = 5 // (x)
  alert(a)
</script>

If you run it in IE, it won’t work. Line (x) raises an error, because:

  1. a references the DIV (that’s ok).
  2. IE-generated references can’t be overwritten (ah, bad, bad!).

But it will work if you use var a instead of just a:

<div id="a">test</div>
<script>
  var a = 5 
  alert(a) // all fine
</script>

Yeah. IE taught us another good practice: declare variables with var. And also, just for fun…

We know that window is the global object. JavaScript looks names up in window as a last resort.

Then what is window.window? Is it true that window === window.window?

Logically, they should be the same, for consistency, but… Open the solution to learn more and see why it is important.

Solution

In all browsers except IE, window.window is just another way to reference, well, window. So window === window.window is true.

And window.window.window is also the same as window.window.

But in IE, the top-level window is a special object with special features, while window.window is something closer to the standard window object.

You can check it out (in IE):

alert(window === window.window) // false
alert(window.window === window.window.window) // true

Why might that be important?

There are features and bugs which appear if you use a variable without var, because IE uses its own outer window object to handle it.

The most notable are:

  1. Reassigning a variable with the same name as the id of an element. IE will generate an error:
    <div id="a">...</div>
    <script> 
      a = 5    // error in IE! Ok if "var a = 5"
      alert(a) // will never happen
    </script>
    
  2. recursion through outer window variable - the following code dies on IE<9:
    <script>
    // recurse is explicitly defined on the outer window
    window.recurse = function(times) {
      if (times !== 0) recurse(times-1)
    }
    
    recurse(13)
    </script>
    

    The bug with recursion is fixed in IE9.

document/node.getElementsByTagName

This method finds all elements with the given tag name and returns an array-like list of them. In HTML documents the case of the tag name doesn’t matter.

// get all div elements
var elements = document.getElementsByTagName('div')

The following example demonstrates how to obtain a list of all INPUT tags of the document and loop over results:

<table id="myTable">
  <tr>
    <td>Your age:</td>

    <td>
      <label>
        <input type="radio" name="age" value="young" checked/> under 18
      </label>
      <label>
        <input type="radio" name="age" value="mature"/> from 18 to 50
      </label>
      <label>
        <input type="radio" name="age" value="senior"/> older than 60
      </label>
    </td>
  </tr>

</table>

<script>
  var elements = document.getElementsByTagName('input')
  for(var i=0; i<elements.length; i++) {
    var input = elements[i]  
    alert(input.value+': '+input.checked)
  }
</script>

It is also possible to get the first element by indexing the result directly:

var element = document.getElementsByTagName('input')[0]

There is a way to get all elements by specifying '*' instead of the tag:

// get all elements in the document
document.getElementsByTagName('*')

Limit search by parent element

getElementsByTagName can be called on a document, but also on a DOM element.

The example below demonstrates that by calling getElementsByTagName inside another element:

<ol id="people">
  <li>John</li>
  <li>Rodger</li>
  <li>Hugo</li>
</ol>
<script>

  var elem = document.getElementById('people')

  var list = elem.getElementsByTagName('li')
  alert(list[0].innerHTML)

</script>

elem.getElementsByTagName('li') finds all LI elements inside elem. The element before the dot is called the search context.

document.getElementsByName

For elements which support the name attribute, it is possible to query them by name.

In the age table example above, the inputs could also be obtained with:

var elements = document.getElementsByName('age')
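A typical use of such a query is to find out which radio button of a group is checked. The sketch below shows one way to do that. The helper name getCheckedValue is hypothetical, and it accepts any array-like list of inputs, such as the one returned by getElementsByName.

```javascript
// Hypothetical helper: returns the value of the first checked input in
// an array-like list, or null if none of them is checked.
function getCheckedValue(inputs) {
  for (var i = 0; i < inputs.length; i++) {
    if (inputs[i].checked) {
      return inputs[i].value;
    }
  }
  return null;
}

// In a browser page:
//   alert( getCheckedValue(document.getElementsByName('age')) )
```

The helper only relies on length, indexes, checked and value, so it works the same way with the result of getElementsByTagName('input').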

document/node.getElementsByClassName

This method is supported in all modern browsers excluding IE<9.

It performs a search by class name, not by the raw attribute value. In particular, it understands multiple classes.

The following example demonstrates how it finds an element using one of the classes.

Please use a browser other than IE<9 to run it.

<div class="a b c">Yep</div>
<script>
alert( document.getElementsByClassName('a')[0].innerHTML )
</script>

Like getElementsByTagName, it can be called for a DOM element.

document/node.querySelector, querySelectorAll

The methods querySelector and querySelectorAll select elements by a CSS 3 selector.

querySelector returns only the first matching element (in depth-first tree order), while querySelectorAll returns all of them.

They work in all modern browsers, including IE8+. There are limitations on IE8 support:

  1. IE8 must be in IE8 mode, not compatibility mode.
  2. IE8 supports CSS 2.1 selectors rather than CSS 3. That’s less powerful, but fine for most cases.

The following query gets all LI elements that are last children and have UL as their direct parent. Note that :last-child is a CSS 3 pseudo-class, so IE8, which is limited to CSS 2.1 selectors, is not expected to handle this particular query.

<ul>
  <li>The</li>
  <li>Test</li>
</ul>
<ul>
  <li>Is</li>
  <li>Passed</li>
</ul>
<script>
  var elements = document.querySelectorAll('UL > LI:last-child')

  for(var i=0; i<elements.length; i++) {
    alert(elements[i].innerHTML )
  }
</script>

querySelector('...') gives the same element as querySelectorAll('...')[0], except that it returns null rather than undefined when nothing matches.

XPath in modern browsers

All modern browsers support powerful XPath queries, a general DOM-searching tool from the world of XML. Most browsers can run them against HTML as well.

The following example demonstrates a generic non-IE syntax for finding all H3 containing ‘XPath’ using an XPath query:

var result = document.evaluate(
  "//h3[contains(text(),'XPath')]",
  document.documentElement,
  null,
  XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
  null
)

for (var i=0; i<result.snapshotLength; i++) {
    alert(result.snapshotItem(i).innerHTML)
}

The only exception is IE (including IE9), which supports XPath for XML document objects only. That’s fine for documents loaded from the server with XMLHttpRequest (AJAX), but to search in the page itself, you’ll need to explicitly load the page into an XML document object.

In real life, querySelector usually solves the task more conveniently, but it’s always good to keep the alternatives in mind.

Query results are alive!

All DOM queries which may match multiple elements return an array-like collection with a length and numeric indexes. It is also possible to loop over it with for, just like an array.

But the indexes and the length property are the only similarities between an Array and the returned collection of elements, which has the special type NodeList or HTMLCollection.

So it doesn’t have push, pop and other methods of a JavaScript array.

Instead, the query result is live for all getElementsBy* methods. When you select elements and then modify the document, the query result is updated automatically.

The following example demonstrates how the collection length changes when elements are removed.

<div id="outer">
  <div id="inner">Info</div>
</div>
<script>
  var outerDiv = document.getElementById('outer')
  var divs = document.getElementsByTagName('div')

  alert(divs.length) // 2: outer and inner

  outerDiv.innerHTML = '' // clear inner div

  alert(divs.length) // 1: only outer remains
</script>

The liveness applies to collections only. If you hold a reference to an element, the reference will not become null. For example, elem = document.getElementById('inner') still refers to the element after the outer div is cleared.

Also, querySelectorAll is special here. For the sake of performance, it returns a non-live NodeList. That’s an exception to the general rule.
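Since a collection has no array methods, and a live one can change while you work with it, it is sometimes handy to copy it into a plain array. The sketch below uses a hypothetical helper named toArray (it is not part of the DOM). A plain loop is used so that the helper does not depend on array methods accepting DOM collections.

```javascript
// Hypothetical helper: copies any array-like object (one with length and
// numeric indexes) into a real Array.
function toArray(collection) {
  var result = [];
  for (var i = 0; i < collection.length; i++) {
    result.push(collection[i]);
  }
  return result;
}

// In a browser page:
//   var divs = toArray(document.getElementsByTagName('div'))
//   divs.pop() // works now, divs is a real array
```

The copy is a snapshot: it keeps referencing the same elements and its length does not change when the document is modified later.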

Practice

Consider the following HTML:

<!DOCTYPE HTML>
<html>
<body>
<label>The table</label>

<form name="age-form">

  <table id="age-table">
    <tr>
      <td id="age-header">Your age:</td>
      <td>
        <label>
          <input type="radio" name="age" value="young"/> under 18
        </label>
        <label>
          <input type="radio" name="age" value="mature"/> 18 to 50
        </label>
        <label>
          <input type="radio" name="age" value="senior"/> after 60
        </label>
      </td>
    </tr>
  </table>

</form>
</body>
</html>

The tasks below are based on the HTML above.

For the document tutorial/browser/dom/searchTask.html:

Find all labels inside the table. The result should be an array (or pseudo-array) of labels.

Solution

The solution:

var table = document.getElementById('age-table')
var labels = table.getElementsByTagName('label')


For the document tutorial/browser/dom/searchTask.html:

Write a function checkInsideTable(id) which returns true if an element with given id is inside the table with id="age-table".

If there is no such element, it should return false.

Like this:

checkInsideTable('age-header')  // true
checkInsideTable('top')         // false
checkInsideTable('non-existent-id') // false

Solution

First, we need to get the DOM element and the table by id:

var elem = document.getElementById(id)
var table = document.getElementById('age-table')

Then we need to go through the parent elements: elem.parentNode, elem.parentNode.parentNode, etc. This can be done in a while loop until the next parent is null.

The function can be written as follows:

function checkInsideTable(id){
  var elem = document.getElementById(id)
  var table = document.getElementById('age-table')
  
  while (elem != table && elem) {
    elem = elem.parentNode
  }
  
  return !!elem
}

After the while loop we have either elem == table or elem == null. So converting elem to a boolean gives the result.

Label links

Make all external links yellow by giving them the class “external”.

<style>
.external { background-color: yellow }
</style>
<ul>
  <li><a href="http://google.com">http://google.com</a></li>
  <li><a href="/tutorial">/tutorial.html</a></li>
  <li>
   <a href="ftp://ftp.com/file.zip">ftp://ftp.com/file.zip</a>
  </li>
  <li><a href="http://nodejs.org">http://nodejs.org</a></li>
</ul>

Solution

The solution source: tutorial/browser/dom/markLinks.html

To skip links that lead to the current domain, location.protocol and location.host are used. They hold the current scheme (http:) and domain (javascript.info).
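The check described above could be sketched as follows. This is not the linked solution, only one possible approach under stated assumptions: the helper name isExternalHref is made up, a link without "://" is treated as relative and therefore internal, and the protocol argument is expected in the form location.protocol uses, with a trailing colon.

```javascript
// Hypothetical helper: decides whether an href attribute value leads
// outside the site identified by protocol (e.g. 'http:') and host.
function isExternalHref(href, protocol, host) {
  // No scheme separator: assume a relative link such as "/tutorial"
  if (href.indexOf('://') === -1) {
    return false;
  }
  var origin = protocol + '//' + host;
  // A different scheme or host means the link is external
  if (href.indexOf(origin) !== 0) {
    return true;
  }
  // The origin must be followed by the end of the string or by a path,
  // query or fragment, otherwise the host only shares a prefix
  var next = href.charAt(origin.length);
  return !(next === '' || next === '/' || next === '?' || next === '#');
}

// In a browser page:
//   var links = document.getElementsByTagName('a')
//   for (var i = 0; i < links.length; i++) {
//     var href = links[i].getAttribute('href')
//     if (isExternalHref(href, location.protocol, location.host)) {
//       links[i].className = 'external'
//     }
//   }
```

The attribute value is read with getAttribute('href') so that relative links stay relative and are easy to recognize.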

Show children count

Here’s a tree: tutorial/browser/dom/treeSource.html.

Write code that adds a bracketed descendant count to each list item (LI). Skip those LI which don’t have other list items inside.

Put the code at the bottom of BODY so it runs during page rendering.

Solution

Generally, it could be a good idea to modify the markup, so that the title is in a <span class="title">, the descendant count has an element with its own class too, and so on. That could be good for applying CSS as well.

But on the other hand, the fewer tags there are, the faster it runs. There’s no silver bullet, only a silver fork.

The solution is here: tutorial/browser/dom/tree.html.

More

For an arbitrary document, we do the following:

var aList1 = document.getElementsByTagName('a');
var aList2 = document.querySelectorAll('a');

document.body.appendChild(document.createElement('a'));

alert(aList1.length - aList2.length);

What will be the output? Why?

Solution

The output will be 1, because getElementsByTagName returns a live collection, which is automatically updated with the new a. Its length increases by 1.

In contrast, querySelectorAll returns a static list of nodes. It references the same elements no matter what we do with the document. So its length remains the same.

Summary

There are 5 main ways of querying the DOM:

  1. getElementById
  2. getElementsByTagName
  3. getElementsByName
  4. getElementsByClassName (except IE<9)
  5. querySelector, querySelectorAll (except IE<8 and IE8 in compat mode)

All of them except getElementById and getElementsByName can search inside any other element. The getElementsBy* methods return live collections, while querySelectorAll returns a static list.

XPath is kind-of supported in most browsers, but very rarely used.
