Tag Archives: read


Working with XML files in VBA (VBA XML)


XML files are one of the most common types of data files apart from text and CSV (comma-separated values) files. Reading data files that are not hierarchical (unlike XML or JSON files) is relatively easy: you can read the data row by row and process the columns separately. With XML (and JSON) the task is harder, because the data is hierarchical (parent-child relationships exist between records in the schema) and the number of underlying nodes may vary, whereas tabular data usually has a constant number of columns separated by a specific delimiter.

Fortunately, we can use the MSXML2.DOMDocument object in VBA. As always, however, let's start with a short introduction to how XML files are structured before we dive into the examples.

Loading XML document in VBA

The MSXML2.DOMDocument object allows you to easily traverse an XML structure and extract any XML node and/or attribute needed. Let's look at the example below.

Below we start by loading the XML document. Notice that I am setting the load to be performed synchronously and disabling validation while the document is parsed. Feel free to change these options if needed.
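A minimal sketch of this loading step, assuming late binding through CreateObject and a hypothetical file path (C:\Temp\example.xml); the variable name XDoc simply follows the naming used elsewhere in this post.

```vba
Sub LoadXmlFromFile()
    ' Late binding: no "Microsoft XML" reference has to be ticked in Tools > References
    Dim XDoc As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")

    XDoc.async = False            ' synchronous: Load returns only after parsing has finished
    XDoc.validateOnParse = False  ' do not validate against a DTD/schema while parsing

    ' Hypothetical path (assumption): replace with your own file
    If XDoc.Load("C:\Temp\example.xml") Then
        Debug.Print "Loaded, root element: " & XDoc.DocumentElement.BaseName
    Else
        ' parseError describes why the document could not be loaded
        Debug.Print "Load failed: " & XDoc.parseError.reason
    End If
End Sub
```

Checking the Boolean returned by Load is optional, but it makes a missing file or malformed XML visible instead of leaving you with an empty document.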

Alternatively load an XML from a string:
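As a sketch, loading from a string could look as follows. The XML fragment is illustrative only (an assumption); LoadXML is the MSXML method that parses a string, while Load expects a file path or URL.

```vba
Sub LoadXmlFromString()
    Dim XDoc As Object, xmlText As String
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False

    ' Illustrative XML fragment (assumption, not taken from a real file)
    xmlText = "<DistributionLists><List><Name>Recon</Name></List></DistributionLists>"

    ' LoadXML returns True when the string was parsed successfully
    If XDoc.LoadXML(xmlText) Then
        Debug.Print XDoc.DocumentElement.XML
    Else
        Debug.Print "Parse failed: " & XDoc.parseError.reason
    End If
End Sub
```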

That’s it. You have loaded the XML document into memory into the DOMDocument object. The document has been parsed and you can easily traverse the enclosed elements. See next section.

XML DOM nodes in VBA

For the examples below I will use the following example XML:
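As a sketch, the example document can be built as a string directly in VBA. The element names and values are the ones listed in the node reference section of this post; the exact layout of the file and the helper name ExampleXml are assumptions.

```vba
' Hypothetical helper: returns the example XML as a single string
Function ExampleXml() As String
    Dim s As String
    s = "<DistributionLists>"
    s = s & "<List><Name>Recon</Name><TO>John;Bob;Rob;Chris</TO>"
    s = s & "<CC>Jane;Ashley</CC><BCC>Brent</BCC></List>"
    s = s & "<List><Name>Performance Report</Name><TO>Huck;Ashley</TO>"
    s = s & "<CC>Tom;Andrew</CC><BCC>John;Seema</BCC></List>"
    s = s & "</DistributionLists>"
    ExampleXml = s
End Function

Sub ShowExampleXml()
    Dim XDoc As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False

    If XDoc.LoadXML(ExampleXml()) Then
        ' Print the parsed document and the number of List elements under the root
        Debug.Print XDoc.XML
        Debug.Print "Lists: " & XDoc.DocumentElement.ChildNodes.Length
    End If
End Sub
```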

The XML document gives you the root of the entire DOM through its DocumentElement property (XDoc.DocumentElement). Each XML DOM node (an IXMLDOMNode object in MSXML) provides the following node references:

Node Reference Type Description
parentNode [IXMLDOMNode] The parent node, one node higher in the DOM hierarchy
firstChild [IXMLDOMNode] The first child node, i.e. the first node one level lower in the DOM hierarchy
lastChild [IXMLDOMNode] The last child node, i.e. the last node one level lower in the DOM hierarchy
childNodes [IXMLDOMNodeList] All child nodes of the current node, i.e. all nodes one level lower in the DOM hierarchy
nextSibling [IXMLDOMNode] The next sibling node, i.e. the next node on the same level in the DOM hierarchy, having the same parent node
previousSibling [IXMLDOMNode] The previous sibling node, i.e. the previous node on the same level in the DOM hierarchy, having the same parent node

All of the above references allow you to move freely within the XML DOM.

ChildNodes

Let's start by extracting the first list and printing its XML and text contents. The basis for moving around the XML DOM is the ChildNodes collection.
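A minimal sketch of this step, assuming the example XML is loaded from an inline string (so the block runs on its own) and that the root element is DistributionLists with List children:

```vba
Sub PrintFirstList()
    Dim XDoc As Object, firstList As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False

    ' Inline copy of the example XML (assumed layout)
    XDoc.LoadXML "<DistributionLists>" & _
        "<List><Name>Recon</Name><TO>John;Bob;Rob;Chris</TO><CC>Jane;Ashley</CC><BCC>Brent</BCC></List>" & _
        "<List><Name>Performance Report</Name><TO>Huck;Ashley</TO><CC>Tom;Andrew</CC><BCC>John;Seema</BCC></List>" & _
        "</DistributionLists>"

    ' ChildNodes is zero-based: item 0 under the root element is the first List
    Set firstList = XDoc.DocumentElement.ChildNodes(0)

    Debug.Print firstList.XML    ' the List element including its markup
    Debug.Print firstList.Text   ' only the text content of the List and its children
End Sub
```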

This is the result

Traversing through the whole XML in VBA

Now that we have the basics, let's print out the whole contents of the XML DOM, including the base names (node names).
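One way to sketch this traversal is a pair of nested For Each loops over ChildNodes. It assumes the two-level layout of the example XML (List elements containing simple fields); a document of arbitrary depth would need a recursive procedure instead.

```vba
Sub PrintWholeDocument()
    Dim XDoc As Object, listNode As Object, fieldNode As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False

    ' Inline copy of the example XML (assumed layout)
    XDoc.LoadXML "<DistributionLists>" & _
        "<List><Name>Recon</Name><TO>John;Bob;Rob;Chris</TO><CC>Jane;Ashley</CC><BCC>Brent</BCC></List>" & _
        "<List><Name>Performance Report</Name><TO>Huck;Ashley</TO><CC>Tom;Andrew</CC><BCC>John;Seema</BCC></List>" & _
        "</DistributionLists>"

    ' Outer loop: every List under the root; inner loop: every field of that List
    For Each listNode In XDoc.DocumentElement.ChildNodes
        Debug.Print listNode.BaseName
        For Each fieldNode In listNode.ChildNodes
            Debug.Print "  " & fieldNode.BaseName & ": " & fieldNode.Text
        Next fieldNode
    Next listNode
End Sub
```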

This is the result:

Easy, right? Using the basics above we can move around the whole document. But this still seems like a lot of coding, doesn't it? Well, there is an easier way of navigating to and extracting items with the DOMDocument object: XPath.

XML Document example node references

Now that we have the hang of our XML document, here is a map, based on the example XML above, of how to reach various elements of the XML file using node references:

  • DistributionLists [FirstChild]
    • List [ChildNodes(0)]
      • Name: Recon [ChildNodes(0).ChildNodes(0).Text]
      • TO: John;Bob;Rob;Chris [ChildNodes(0).ChildNodes(1).Text]
      • CC: Jane;Ashley
      • BCC: Brent
    • (…)
    • List [ChildNodes(1)]
      • Name: Performance Report [ChildNodes(1).ChildNodes(0).Text]
      • TO: Huck;Ashley
      • CC: Tom;Andrew
      • BCC: John;Seema

XPath in VBA

Instead of traversing the elements/nodes in your XML using the .ChildNodes/.FirstChild/.NextSibling properties, we can also use XPath. XPath is a query language for selecting nodes in an XML document. A query is expressed as a single string and returns every node (zero or more) that matches it.

If you want to learn XPath I can recommend this overview:
https://www.w3schools.com/xml/xpath_syntax.asp

Now let's jump into some examples:

Example 1: Extract all Lists
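A sketch of this query using SelectNodes. The XPath string and the inline XML are assumptions based on the example structure; SelectionLanguage is set explicitly because older MSXML versions default to XSLPattern rather than XPath.

```vba
Sub XPathAllLists()
    Dim XDoc As Object, lists As Object, listNode As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False
    XDoc.setProperty "SelectionLanguage", "XPath"

    ' Inline copy of the example XML (assumed layout)
    XDoc.LoadXML "<DistributionLists>" & _
        "<List><Name>Recon</Name><TO>John;Bob;Rob;Chris</TO><CC>Jane;Ashley</CC><BCC>Brent</BCC></List>" & _
        "<List><Name>Performance Report</Name><TO>Huck;Ashley</TO><CC>Tom;Andrew</CC><BCC>John;Seema</BCC></List>" & _
        "</DistributionLists>"

    ' SelectNodes returns a node list with every match (possibly empty)
    Set lists = XDoc.SelectNodes("//DistributionLists/List")
    Debug.Print "Lists found: " & lists.Length

    For Each listNode In lists
        Debug.Print listNode.XML
    Next listNode
End Sub
```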

Example 2: Extracting all TO fields
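A sketch along the same lines, assuming the TO elements sit directly under each List as in the example XML:

```vba
Sub XPathAllToFields()
    Dim XDoc As Object, toNode As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False
    XDoc.setProperty "SelectionLanguage", "XPath"

    ' Shortened inline copy of the example XML (assumed layout)
    XDoc.LoadXML "<DistributionLists>" & _
        "<List><Name>Recon</Name><TO>John;Bob;Rob;Chris</TO></List>" & _
        "<List><Name>Performance Report</Name><TO>Huck;Ashley</TO></List>" & _
        "</DistributionLists>"

    ' One TO node per List is returned; print the text of each
    For Each toNode In XDoc.SelectNodes("//DistributionLists/List/TO")
        Debug.Print toNode.Text
    Next toNode
End Sub
```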

Example 3: Extracting the first and last Name field
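A possible sketch using SelectSingleNode with positional predicates. Note the assumption that SelectionLanguage is set to XPath: XPath positions are 1-based and last() is an XPath function, whereas the older XSLPattern dialect indexes from 0.

```vba
Sub XPathFirstAndLastName()
    Dim XDoc As Object, firstName As Object, lastName As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False
    XDoc.setProperty "SelectionLanguage", "XPath"

    ' Shortened inline copy of the example XML (assumed layout)
    XDoc.LoadXML "<DistributionLists>" & _
        "<List><Name>Recon</Name></List>" & _
        "<List><Name>Performance Report</Name></List>" & _
        "</DistributionLists>"

    ' List[1] is the first List, List[last()] the last one
    Set firstName = XDoc.SelectSingleNode("//DistributionLists/List[1]/Name")
    Set lastName = XDoc.SelectSingleNode("//DistributionLists/List[last()]/Name")

    ' SelectSingleNode returns Nothing when there is no match
    If Not firstName Is Nothing Then Debug.Print firstName.Text   ' Recon
    If Not lastName Is Nothing Then Debug.Print lastName.Text     ' Performance Report
End Sub
```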

Example 4: Extracting all child List nodes (Name, TO, CC, BCC)
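As a sketch, the wildcard * selects every child element of each List regardless of its name; the query string and inline XML are assumptions based on the example structure.

```vba
Sub XPathAllListChildren()
    Dim XDoc As Object, childNode As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False
    XDoc.setProperty "SelectionLanguage", "XPath"

    ' Inline copy of the example XML (assumed layout)
    XDoc.LoadXML "<DistributionLists>" & _
        "<List><Name>Recon</Name><TO>John;Bob;Rob;Chris</TO><CC>Jane;Ashley</CC><BCC>Brent</BCC></List>" & _
        "<List><Name>Performance Report</Name><TO>Huck;Ashley</TO><CC>Tom;Andrew</CC><BCC>John;Seema</BCC></List>" & _
        "</DistributionLists>"

    ' "*" matches any element, so Name, TO, CC and BCC are all returned
    For Each childNode In XDoc.SelectNodes("//DistributionLists/List/*")
        Debug.Print childNode.BaseName & ": " & childNode.Text
    Next childNode
End Sub
```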

XML Attributes in VBA

Let’s tackle one last example – attributes. Let’s slightly modify the XML above and include an example attribute named attribute.

Using XPath (or traversing the DOM) we can easily extract the attribute as shown below.
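A sketch of both approaches. It assumes the attribute named attribute is placed on the List elements and holds made-up values; which element carries the attribute in the modified XML is an assumption.

```vba
Sub ReadAttributes()
    Dim XDoc As Object, attrNode As Object, listNode As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False
    XDoc.validateOnParse = False
    XDoc.setProperty "SelectionLanguage", "XPath"

    ' Assumed variant of the example XML: the attribute sits on each List
    XDoc.LoadXML "<DistributionLists>" & _
        "<List attribute=""first""><Name>Recon</Name></List>" & _
        "<List attribute=""second""><Name>Performance Report</Name></List>" & _
        "</DistributionLists>"

    ' Option 1: select the attribute nodes directly with XPath (@ prefix)
    For Each attrNode In XDoc.SelectNodes("//DistributionLists/List/@attribute")
        Debug.Print attrNode.Text
    Next attrNode

    ' Option 2: traverse the DOM and ask each element for its attribute
    For Each listNode In XDoc.DocumentElement.ChildNodes
        Debug.Print listNode.getAttribute("attribute")
    Next listNode
End Sub
```

getAttribute returns Null when the attribute is missing, so check with IsNull before using the value in documents where the attribute is optional.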

Creating XML documents

Creating documents is also quite straightforward in VBA.
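A minimal sketch, assuming we want to build a small document shaped like the example XML; the element names, the attribute value and the commented-out save path are illustrative only.

```vba
Sub CreateXmlDocument()
    Dim XDoc As Object, root As Object, listNode As Object, nameNode As Object
    Set XDoc = CreateObject("MSXML2.DOMDocument")

    ' Create the root element and attach it to the document
    Set root = XDoc.createElement("DistributionLists")
    XDoc.appendChild root

    ' Create a List element with an attribute and attach it to the root
    Set listNode = XDoc.createElement("List")
    listNode.setAttribute "attribute", "example"
    root.appendChild listNode

    ' Create a Name element with text content inside the List
    Set nameNode = XDoc.createElement("Name")
    nameNode.Text = "Recon"
    listNode.appendChild nameNode

    Debug.Print XDoc.XML

    ' To write the document to disk (hypothetical path):
    ' XDoc.Save "C:\Temp\lists.xml"
End Sub
```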


VBA Read file in Excel (txt, xml, csv, binary)


You can read binary or text data from a file in VBA using the Open statement. For CSV files and other structured data schemas you can use the Jet.OLEDB driver or Microsoft Queries.

Text and binary files are common ways of storing data, as opposed to databases or regular Excel files. Looking through various resources, I could not find a single one that demonstrates the various methods for PROPERLY reading files in VBA.

It is important to remember that you shouldn't read all files using the same approach. Be aware of the structure of the file. If it is a structured CSV, use an ADODB connection; if you only need a couple of rows, read the file row by row or in chunks; otherwise read the whole file. If you want performance, always select the right approach.

Reading text files in VBA

VBA Read text files (line by line)

To read an entire text file line by line use the code below.
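A minimal sketch using the Open statement together with Line Input; the file path is hypothetical and the variable names are assumptions.

```vba
Sub ReadTextFileLineByLine()
    Dim fileName As String, textRow As String, fileNo As Integer

    fileName = "C:\Temp\test.txt"   ' hypothetical path (assumption)
    fileNo = FreeFile               ' next file number not currently in use

    Open fileName For Input As #fileNo
    Do While Not EOF(fileNo)
        Line Input #fileNo, textRow ' reads one line, without the line break
        Debug.Print textRow
    Loop
    Close #fileNo
End Sub
```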

VBA Read text files (read whole file)

To read an entire text file in one go (not line by line) use the code below.
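One way to sketch this is to open the file in Binary mode, size a string buffer to the file length with LOF and fill it with a single Get. The path is hypothetical, and the sketch assumes a single-byte (ANSI) encoded text file.

```vba
Sub ReadWholeTextFile()
    Dim fileName As String, textData As String, fileNo As Integer

    fileName = "C:\Temp\test.txt"   ' hypothetical path (assumption)
    fileNo = FreeFile

    Open fileName For Binary Access Read As #fileNo
    textData = Space$(LOF(fileNo))  ' buffer as long as the file, in bytes
    Get #fileNo, , textData         ' fill the whole buffer in one read
    Close #fileNo

    Debug.Print textData
End Sub
```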

VBA Read specific number of lines from a text file

When you want to read only specific lines from a text file, you can adapt the line-by-line code as shown below. It reads a given number of lines (noLines) from a text file, beginning at a specific start line number (sLine). If you set noLines to 0 it will read all lines until the end of the file.
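A sketch of such an adaptation. The parameter names sLine and noLines come from the description above; the function name, the 1-based line numbering and the file path are assumptions.

```vba
' Returns noLines lines starting at line sLine (1-based); noLines = 0 reads to the end
Function ReadLines(fileName As String, sLine As Long, noLines As Long) As String
    Dim fileNo As Integer, textRow As String
    Dim lineNo As Long, result As String

    fileNo = FreeFile
    Open fileName For Input As #fileNo
    Do While Not EOF(fileNo)
        Line Input #fileNo, textRow
        lineNo = lineNo + 1
        If lineNo >= sLine Then
            ' Stop once the requested number of lines has been collected
            If noLines > 0 And lineNo >= sLine + noLines Then Exit Do
            result = result & textRow & vbCrLf
        End If
    Loop
    Close #fileNo

    ReadLines = result
End Function

Sub ReadLinesExample()
    ' Hypothetical file: prints lines 3, 4 and 5
    Debug.Print ReadLines("C:\Temp\test.txt", 3, 3)
End Sub
```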

Reading CSV files in VBA

Reading CSV files (read whole file and process each row)

Read the whole CSV file into a string, then split it into rows and process each row:
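A sketch of this approach using Split. It assumes a hypothetical path, comma as the delimiter and no quoted fields containing commas or line breaks; files with such fields need a proper CSV parser or the ADODB/QueryTable approach.

```vba
Sub ReadCsvRows()
    Dim fileNo As Integer, textData As String
    Dim csvRows() As String, csvCols() As String, i As Long

    ' Read the whole file into one string
    fileNo = FreeFile
    Open "C:\Temp\test.csv" For Binary Access Read As #fileNo   ' hypothetical path
    textData = Space$(LOF(fileNo))
    Get #fileNo, , textData
    Close #fileNo

    ' Normalise Windows line breaks, then split into rows
    textData = Replace(textData, vbCrLf, vbLf)
    csvRows = Split(textData, vbLf)

    For i = LBound(csvRows) To UBound(csvRows)
        If Len(csvRows(i)) > 0 Then
            ' Naive split: does not handle commas inside quoted fields
            csvCols = Split(csvRows(i), ",")
            Debug.Print "Row " & (i + 1) & ": " & (UBound(csvCols) + 1) & " column(s), first = " & csvCols(0)
        End If
    Next i
End Sub
```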

Reading CSV files (whole file to Worksheet)

Read whole file to an Excel Worksheet:
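A sketch using an Excel QueryTable with a TEXT connection, assuming a hypothetical file path, a comma delimiter and the active sheet as the destination.

```vba
Sub CsvToWorksheet()
    Dim ws As Worksheet
    Set ws = ActiveSheet

    ' "TEXT;" tells Excel to treat the source as a text file (path is hypothetical)
    With ws.QueryTables.Add(Connection:="TEXT;C:\Temp\test.csv", _
                            Destination:=ws.Range("A1"))
        .TextFileParseType = xlDelimited
        .TextFileCommaDelimiter = True
        .Refresh BackgroundQuery:=False   ' import now and wait until it has finished
    End With
End Sub
```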

To refresh the CSV upload (in case the CSV was updated) simply run:
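A sketch of the refresh step, assuming the import created the first (or only) QueryTable on the active sheet:

```vba
Sub RefreshCsv()
    Dim ws As Worksheet
    Set ws = ActiveSheet

    ' Re-read the source file of the first query table on the sheet, if there is one
    If ws.QueryTables.Count > 0 Then
        ws.QueryTables(1).Refresh BackgroundQuery:=False
    End If
End Sub
```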

Reading binary files in VBA

With binary files you will often work with data that, unlike Integers, does not have a fixed byte length. For example, you may want to read Strings from a binary file together with other data types. In such cases use a user-defined Type when writing to the file.
Below is a simple example of reading a file to which a Type variable, containing an Integer and a String, was saved.
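A sketch of the idea. The Type name, field names, values and file location are assumptions; the file is written first so that the read has something to work on. The Type declaration has to sit at the top of the module, before any procedures.

```vba
' Declare at the top of the module, before any Sub or Function
Private Type TestType
    intVar As Integer
    strVar As String
End Type

Sub WriteThenReadBinary()
    Dim fileNo As Integer, fileName As String
    Dim written As TestType, readBack As TestType

    fileName = Environ$("TEMP") & "\test.bin"    ' hypothetical location
    If Dir(fileName) <> "" Then Kill fileName    ' start from an empty file

    written.intVar = 4
    written.strVar = "Hello!"

    ' Write the whole Type variable in one Put
    fileNo = FreeFile
    Open fileName For Binary Access Write As #fileNo
    Put #fileNo, , written
    Close #fileNo

    ' Read it back into a variable of the same Type in one Get
    fileNo = FreeFile
    Open fileName For Binary Access Read As #fileNo
    Get #fileNo, , readBack
    Close #fileNo

    Debug.Print readBack.intVar, readBack.strVar
End Sub
```

Reading into the same Type that was used for writing matters: VBA stores the length of the variable-length String inside the record, so Get knows how many bytes belong to strVar.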

Reading XML files in VBA

Functions needed to read files in VBA

Function Description
FreeFile Gets the next free file number available for the Open statement / FileOpen function. Using this function is important, especially when operating on multiple files simultaneously.
BOF(fileNumber) Not a built-in VBA file function (BOF is a property of ADO/DAO recordsets). To check whether you are at the beginning of an open file, test whether Seek(fileNumber) returns 1.
EOF(fileNumber) Returns True if you have reached the end of the file described by the file number.
Loc(fileNumber) Returns the current read/write position within an open file.
LOF(fileNumber) Returns the size in bytes of the file represented by the file number.

VBA Read File Summary

Reading files in VBA is not hard and usually requires just a few lines of code. It is, however, important to use the appropriate approach for the file at hand. Not all files need to be read line by line, which is usually inefficient. Equally, you need not always read the entire file if you only need the first or last few rows. Working with XML files is also not a challenge if you read through my post on how to work with XML files.