Web scraping is a way to fetch a small piece of content from a web page and do something with it. The Nokogiri gem for Ruby makes this easy, and the ActionMailer gem makes it easy to email the scraped content to yourself. If you run the program from this post using cron or some other task scheduler, you could receive Canon SLR lens deals right in your inbox!
First, you'll need to install the Nokogiri and ActionMailer gems (gem install nokogiri actionmailer). This post deals specifically with Gmail accounts, so the following steps are specific to Gmail; for other SMTP email servers you may be able to use your regular password.
The SimpleMailer class below sets up the SMTP server settings, including your Gmail account name on line 7 and the app-specific password on line 8. If you're interested, you can read more about ActionMailer.
The camera_price_buster.rb file demonstrates how you can use Nokogiri to scrape the content of a web page and then use SimpleMailer to send the result to an email address. In this case, I'm scraping the Canon SLR Lenses page on camerapricebuster.co.uk to fetch any items marked as currently being at their lowest price ever. This is indicated by the presence of the green image on the row in the lens list (see screenshot).
I use an XPath query to pick out the rows containing this image into an array, then loop through the array and extract the pertinent information: lens name, price, and the URL of the price comparison for that lens on camerapricebuster.co.uk.
Originally I used the explicit path below, but the one with the // prefix still locates the data I need if the table is moved around in the page structure, so in this instance it's the better choice.
Armed with the XPath for each row containing the lens details, I add each lens's details to the string variable body and finally include that as the body of the email I send. You could easily put this on a cron job to run daily and get notified by email of the day's deals. The resulting output that gets emailed looks something like this:
Found 9 items with lowest-ever prices

Lens: Canon EF 16 35mm f4L IS USM Lens
Price: £840.97
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-16-35mm-f4L-IS-USM-Lens

Lens: Canon EF 24mm f2.8 IS USM Lens
Price: £412.20
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-24mm-f2.8-IS-USM-Lens

Lens: Canon EF 24 105mm f3.5 5.6 IS STM Lens
Price: £431.10
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-24-105mm-f3.5-5.6-IS-STM-Lens

Lens: Canon EF 50mm f1.2L Lens
Price: £1031.40
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-50mm-f1.2L-Lens

Lens: Canon EF 50mm f2.5 Macro Lens
Price: £182.70
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-50mm-f2.5-Macro-Lens

Lens: Canon EF 85mm f1.2 L USM II Lens
Price: £1394.10
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-85mm-f1.2-L-USM-II-Lens

Lens: Canon EF 100mm f2.8 Macro USM Lens
Price: £346.50
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-100mm-f2.8-Macro-USM-Lens

Lens: Canon TSE 24mm Mk II f3.5L Lens
Price: £1331.10
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-TSE-24mm-Mk-II-f3.5L-Lens

Lens: Canon CN E 50mm T1.3 L F Cine Lens
Price: £3358.00
URL: http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-CN-E-50mm-T1.3-L-F-Cine-Lens
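Output in that shape could be assembled with something like the sketch below. The build_body helper and the hash keys :name, :price and :url are assumptions for illustration; the exact line breaks in the original email may differ.

```ruby
# Builds the plain-text email body from a list of lens hashes.
# Each hash is assumed to carry :name, :price and :url keys.
def build_body(lenses)
  body = "Found #{lenses.size} items with lowest-ever prices\n\n".dup
  lenses.each do |lens|
    body << "Lens: #{lens[:name]}\n"
    body << "Price: #{lens[:price]}\n"
    body << "URL: #{lens[:url]}\n\n"
  end
  body
end

sample = [
  {
    name: "Canon EF 50mm f1.2L Lens",
    price: "£1031.40",
    url: "http://www.camerapricebuster.co.uk/Canon/Canon-EF-lenses/Canon-EF-50mm-f1.2L-Lens"
  }
]
puts build_body(sample)
```

The resulting string is what would be handed to the mailer as the message text.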
If the structure of the web page changes, the script may break. With web scraping there is always a risk that a change to the source page will cause your scraping logic to fail, but that's something you must accept, since you're not in control of the source data.