Scraping HTML Regex Tester

Greetings,

I’m having a bit of trouble with the Scraping HTML Regex Tester. I’m learning regex on the fly, mostly using this site: https://regexr.com/.

I’m also trying to learn a bit of web scraping using the scraping tool. Unfortunately, the documentation page does not work.

One issue I’m having is that regex that works on the website above does not work in the Regex Tester tool. I mostly have to type in characters by trial and error to see what happens, and borrow from some of the examples in your video. For instance, for practice I’m pulling all the names of UFC fighters from this page: http://www.ufc.com/fighter/Weight_Class

The fighter name is inside an anchor with class="fighter-name". Unfortunately, the way the page is set up, there is a link before the class name. I finally figured out how to get around this and reach the actual fighter name, but I can’t figure out how to remove all of the whitespace after the fighter name. I used:

\s+([^’]*?)

This gets to the fighter name, but the match includes a bunch of trailing whitespace that I can’t seem to remove. Everything I’ve tried that works on the regex website doesn’t work within the tool. Any help on this, or a link to a documentation file that works? Also, any good recommendations for more web scraping tutorials? Thanks!
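
For comparison, here is a minimal Python sketch of the result being asked for: the name captured without the padding around it. The markup and the names `html` and `pattern` are assumptions made up from the description above, not copied from the real page, and the Regex Tester tool may use a different regex flavor, so the pattern might need adjusting there.

```python
import re

# Hypothetical markup modelled on the description above (href before class,
# name padded with whitespace); the real page may differ.
html = """
<a href="/fighter/jane-doe" class="fighter-name">
    Jane Doe
</a>
<a href="/fighter/john-smith" class="fighter-name">
    John Smith
</a>
"""

# \s* before the group skips the leading padding. The group is lazy, and the
# \s*< after it absorbs the trailing padding, so the capture has to end at
# the last non-whitespace character before the closing tag.
pattern = re.compile(r'class="fighter-name"[^>]*>\s*([^<]*?)\s*<')

names = pattern.findall(html)
print(names)
```

The idea is to anchor the end of the capture on something that follows the name (here the `<` of the closing tag) and let `\s*` consume the whitespace outside the group, rather than trying to strip it inside the group.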
