Web scraping is very useful for getting the information you need directly off websites. Sometimes, however, simple browser automation is not fast enough. Having created both the IE and Parallel classes, I decided to jump at the opportunity to create a simple example of how multithreaded browser automation can be achieved. Daniel Ferry achieved the same here. However, he used VBScript, and a lot of code was required to manage the “swarm”, as he called it. What I want to show you instead is how you can combine the Parallel class and the IE class to achieve the same functionality in a much more concise and manageable way. So let’s jump straight in.
What does it do?
The browser automation procedure queries Google and copies the text of the first resulting link into the Excel workbook. In the multithreaded example, a “swarm” of 4 threads (or more if needed) is maintained to carry out simultaneous Google queries. This significantly reduces the overall execution time, because new IE browser objects are created and put to work while others are waiting for a callback.
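To illustrate why a swarm of workers shortens the total run time, here is a minimal, language-neutral sketch. It is written in Python rather than VBA and does not use the Parallel or IE classes. `fake_query` is a hypothetical stand-in that simply sleeps to imitate a browser waiting for a page to load, and the 0.1 second wait and the pool size of 4 are assumptions chosen for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(term, wait=0.1):
    """Hypothetical stand-in for one browser query.

    Sleeps to imitate waiting for the page, then returns a made-up result.
    """
    time.sleep(wait)
    return f"first result for {term}"


def run_sequential(terms):
    """Run the queries one after another, as a single browser would."""
    return [fake_query(term) for term in terms]


def run_swarm(terms, workers=4):
    """Run the queries on a pool of workers; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_query, terms))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a call to fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    terms = [f"query {i}" for i in range(8)]
    _, seq_time = timed(run_sequential, terms)
    _, swarm_time = timed(run_swarm, terms)
    print(f"sequential: {seq_time:.2f}s, swarm of 4: {swarm_time:.2f}s")
```

Because each worker spends most of its time waiting, the waits overlap: eight simulated queries on four workers take roughly the time of two queries in a row instead of eight. The same idea applies to the IE objects, although real page loads vary far more than a fixed sleep.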
Multithreaded browser automation: Video
Instead of going into the details, I encourage you to watch this short video comparing single-threaded and multithreaded IE automation:
Feel free to download the workbook here:
Check out the deterministic IE automation class here:
EXCEL: Simple class for using IE automation in VBA
Check out the Parallel class multithreading tool here:
EXCEL: VBA Multithreading Tool