vba proxy server

Web Scraping Proxy HTTP request using VBA

Visual Basic for Application (VBA) is great for making your first steps in Web Scraping as Excel is ubiquitous and a great training arena to learn Web Scraping. Web Scraping comes down to making HTTP requests to websites. At some point however you will find that some websites will cut you off or prevent multiple HTTP requests. This is were you need to use proxy servers to hide your HTTP requests behind multiple IPs. Using a proxy server is easy in VBA so let’s study a couple of real life examples.

What is a proxy server?

A proxy server is a server that acts as an intermediary for Internet/Intranet HTTP requests. A client machine connects to the proxy server, requesting a service, file, connection, web page, or other resource available from a different server and the proxy server serves that request. Proxy servers often provide anonymity.
proxy server

VBA HTTP Request using ServerXMLHttp

Before we do a proxy HTTP request using VBA let’s look a how a regular HTTP request looks like. Typically I tend to use the ServerXMLHTTP request object, as I will show in the next section, it can be easily extended to run HTTP requests via a proxy IP.

Function GetResult(url As String) As String
    Dim XMLHTTP As Object, ret As String
    Set XMLHTTP = CreateObject("MSXML2.ServerXMLHTTP.4.0")
    XMLHTTP.Open "GET", url, False
    XMLHTTP.send
    ret = XMLHTTP.ResponseText
    GetResult = ret
End Function

Sub Example()
    'Print the HTML of Google.com
    Debug.Print GetResult "http://www.google.com" 
End Sub

VBA HTTP Request via proxy

Let us now extend the above example to run the same HTTP request via a proxy server. Notice the new line highlighted below:

Function GetResult(url As String) As String
    Dim XMLHTTP As Object, ret As String
    Set XMLHTTP = CreateObject("MSXML2.ServerXMLHTTP.4.0")
    XMLHTTP.Open "GET", url, False
    XMLHTTP.setProxy 2, "xxx.xxx.xxx:80"
    XMLHTTP.send
    ret = XMLHTTP.ResponseText
    GetResult = ret
End Function

The xxx.xxx.xxx is the IP of your proxy server and 80 is the default HTTP port. You can easily google free online proxy IPs which can easily be used to run proxy requests.

Last thoughts on VBA Proxy HTTP requests

Right.. so do play a little bit with this new piece of knowledge as this is one of the most important aspects of Web Scraping (unless you want to get your IP blocked). Here are some other tips to keep in mind:

  • Use at least 5-10 proxy servers or else your proxies will quickly be blocked too
  • Be sure to handle proxy server errors e.g. 500, 403. A good approach is to retest any server error with an additional proxy server
  • Test your proxies before use. Especially free proxy servers have a lot of down time. Although 1 proxy IP may have worked yesterday.. it may not be so today. I tend to run a simple HTTP request on all proxies and remove any raising server errors before use
  • Don’t cross the line: Web Scraping is useful when you got a lot of queries you want to run automatically instead of running them manually. Don’t cross the line by content-stealing or abusing websites TOS!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.