Tag Archives: DOM

Working with XML files in VBA (VBA XML)

1 Star2 Stars3 Stars4 Stars5 Stars (13 votes, average: 4.92 out of 5)
Loading...

XML files are one of the most common type of data files apart from text and CSV (comma-separated values) files. Reading data files which are not hierarchical (as XML files or JSON) is relatively easy. You can read in the data row by row and process columns separately. With XML (and JSON) the task is not as easy as the data is hierarchical (parent-child relationships exist between records in the schema) and the number of underlying nodes may vary as opposed to tabular data which usually has a constant number of columns separated with a specific delimiter.

Fortunately, we can use the MSXML2.DOMDocument object in VBA. Let’s however, as always, start with a short introduction as to how XML files a structure before we dive into the examples.

Loading XML document in VBA

The MSXML2.DOMDocument object allows you to easily traverse through an XML structure an extract any XML node and/or attribute needed. Let’s look at the example below.

Below we start by loading the XML document. Notice that I am selecting the load to be performed synchronously and not validation be carried out on parsing the document. Feel free to change these options if needed.

Alternatively load an XML from a string:

That’s it. You have loaded the XML document into memory into the DOMDocument object. The document has been parsed and you can easily traverse the enclosed elements. See next section.

XML DOM nodes in VBA

For the below I will use the following examples XML:

The XML document will provide you with the root of the entire DOM (of type XDoc.DocumentElement). Each DocumentElement (XML DOM node) facilitates the following node references:

Node Reference Type Description
parentNode [XDoc.DocumentElement] The parent node, one node higher in the DOM hierarchy
firstChild [XDoc.DocumentElement] The first child node, first node lower in the DOM hierarchy
lastChild [XDoc.DocumentElement] The last child node, last node lower in the DOM hierarchy
childNodes [Array of type XDoc.DocumentElement] All child nodes of the current node, all nodes lower in the DOM hierarchy
nextSibling [XDoc.DocumentElement] Next sibling node i.e. node on the same level in the DOM hierarchy, having the same parent node
previousSibling [XDoc.DocumentElement] Previous sibling node i.e. node on the same level in the DOM hierarchy, having the same parent node

All the above references allow you to free move within the XML DOM.

ChildNodes

Let’s start by extracting the first list and printing it’s XML and text contents. The basics to moving around the XML DOM is using ChildNodes.

This is the result

Traversing through the whole XML in VBA

Now that we got the basics let’s print out the whole contents of the XML DOM including the basenames (node names).

This is the result:

Easy right? Using the basics above we can easily move around the document. But this still seems like a lot of coding right? Well there is an easier way of moving / extracting items using the DOMDocument object – called XPath.

XML Document example node references

Now that we have a hang of our XML document, based on the example XML I provided above I mapped a reference to how to obtain various elements of our XML file by using node references:

  • DistributionLists [FirstChild]
    • List [ChildNodes(0)]
      • Name: Recon [ChildNodes(0).ChildNodes(0).innerText]
      • TO: John;Bob;Rob;Chris [ChildNodes(0).ChildNodes(1).innerText]
      • CC: Jane;Ashley
      • BCC: Brent
    • (…)

    • List [ChildNodes(1)]
      • Name: Performance Report [ChildNodes(1).ChildNodes(0).innerText]
      • TO: Huck;Ashley
      • CC: Tom;Andrew
      • BCC: John;Seema

XPath in VBA

Instead of traversing the elements/nodes in your XML using the .ChildNodes/.FirstChild/NextChild properties we can also use XPath. XPath is a query language used for selecting XML nodes in an XML document. It is represented by a single string. It allows you to extract any number of nodes (0 or more) which match the specified XPath query.

If you want to learn XPath I can recommend this overview:
https://www.w3schools.com/xml/xpath_syntax.asp

Now let’s jump into an example:

Example 1: Extract all Lists

Example 2: Extracting all TO fields

Example 3: Extracting the first and last Name field

Example 3: Extracting all child List nodes (Name, TO, CC, BCC)

XML Attributes in VBA

Let’s tackle one last example – attributes. Let’s slightly modify the XML above and include an example attribute named attribute.

Using XPath (or traversing the DOM) we can easily extract the attribute as shown below.

Creating XML documents

Creating documents is also quite straight forward in VBA.