Now that you’ve made your first server, it’s time to make a client. This project is an embedded web scraper. It takes data from an existing website and uses it to affect a physical output. It’s conceptually similar to devices made by Ambient Devices, Nabaztag, and others—but it’s all yours.
In this project, you’ll make a networked air-quality meter. You’ll need an analog panel meter, like the kind you find in speedometers and audio VU meters. I got mine at a yard sale, but you can often find them in electronics surplus stores or junk shops. The model recommended in the parts list is less picturesque than mine, but it will do for a placeholder until you find one you love.
Figure 4-8 shows how it works: the microcontroller makes a network connection to a PHP script through the Ethernet shield. The PHP script connects to another web page, reads a number from that page, and sends the number back to the microcontroller. The microcontroller uses that number to set the level of the meter. The web page in question is AIRNow, www.airnow.gov, the U.S. Environmental Protection Agency’s site for reporting air quality. It reports hourly air quality status for many U.S. cities, listed by ZIP code. When you’re done, you can set a meter from your home or office to see the current air quality in your city (assuming you live in the U.S.).
First, you need to generate a changing voltage from the microcontroller to control the meter. Microcontrollers can’t output analog voltages, but they can generate a series of very rapid on-and-off pulses that can be filtered to give an average voltage. The larger the fraction of each pulse cycle that the pin spends on (the duty cycle), the higher the average voltage. This technique is called pulse-width modulation (PWM). In order for a PWM signal to appear as an analog voltage, the circuit receiving the pulses has to react much more slowly than the rate of the pulses. For example, if you pulse-width modulate an LED, it will appear dimmer rather than flickering, because your eye can’t detect the on-off transitions when they come faster than about 30 times per second. Analog voltmeters are very slow to react to changing voltages, so PWM works well as a way to control these meters. By connecting the positive terminal of the meter to an output pin of the microcontroller, connecting the negative terminal to ground, and pulse-width modulating the output pin, you can easily control the position of the meter’s needle. Figure 4-9 shows the whole circuit for the project.
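To get a feel for the numbers, here is a minimal C++ sketch of the arithmetic. It is an illustration, not code from this project. It assumes an 8-bit PWM level from 0 to 255 (the range that Arduino’s analogWrite() accepts on boards like the Uno), and it takes the supply voltage and the top of the AQI scale as parameters. The function names are made up for this example, and your meter may need a different full-scale value.

```cpp
#include <algorithm>

// Average voltage of an ideal PWM signal: supply voltage times duty cycle.
// pwmLevel is an 8-bit level (0..255); out-of-range values are clamped.
double averageVoltage(int pwmLevel, double supplyVolts) {
    int clamped = std::max(0, std::min(pwmLevel, 255));
    return supplyVolts * clamped / 255.0;
}

// Hypothetical helper: scale an AQI reading (0..maxAqi) onto the
// 0..255 PWM range. Integer division rounds down.
int aqiToPwmLevel(int aqi, int maxAqi) {
    int clamped = std::max(0, std::min(aqi, maxAqi));
    return clamped * 255 / maxAqi;
}
```

With a 5-volt supply and an AQI scale that tops out at 500, a reading of 250 maps to a PWM level of 127, which averages roughly 2.5 volts at the meter.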
Next, you need to get the data from AIRNow’s site in a form the microcontroller can read. The microcontroller can read in short strings serially, and converting those ASCII strings to a binary number is fairly simple. Using a microcontroller to parse through all the text of a web page is possible, but a bit complicated. However, it’s the kind of task for which PHP was made. The program that follows reads the AIRNow page, extracts the current air-quality index (AQI) reading, and makes a simpler summary page that’s easy to read with the microcontroller. The Ethernet controller is the microcontroller’s gateway to the Internet, allowing it to open a TCP connection to your web host, where you will install this PHP script.
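The ASCII-to-number conversion mentioned above works one character at a time: multiply the running total by 10 and add the value of each new digit. Here is a minimal sketch in plain C++ that you can compile on a desktop computer. The function name asciiToInt is hypothetical, and on the microcontroller the characters would arrive from the network connection rather than from a string.

```cpp
#include <string>

// Convert the first run of ASCII digits in text to an integer.
// Characters before the first digit are skipped; conversion stops at
// the first non-digit after it. Returns -1 if there are no digits.
int asciiToInt(const std::string& text) {
    int value = 0;
    bool sawDigit = false;
    for (char c : text) {
        if (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');  // '0'..'9' are consecutive in ASCII
            sawDigit = true;
        } else if (sawDigit) {
            break;  // the number has ended
        }
    }
    return sawDigit ? value : -1;
}
```

Returning -1 for "no digits found" is a design choice for this example; it works here because an AQI reading is never negative.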
You could also run this script on one of the computers on your local network. As long as the microcontroller is connected to the same network, you’ll be able to connect to it and request the PHP page. For information on installing PHP or finding a web-hosting provider that supports PHP, see www.php.net/manual/en/tutorial.php#tutorial.requirements.
Figure 4-10 shows AIRNow’s page for New York City (http://airnow.gov/?action=airnow.local_city&zipcode=10003&submit=Go). AIRNow’s page is formatted well for extracting the data. The AQI number is clearly shown in text, and if you remove all the HTML tags, it appears on a line by itself, always following the line Current Conditions.
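The "line that follows a marker line" idea can be sketched in a few lines of C++. This is only an illustration of the logic and is not the scraping script itself. It assumes the HTML tags have already been removed, and the helper names trim and lineAfter are made up for this example.

```cpp
#include <sstream>
#include <string>

// Strip leading and trailing spaces, tabs, and carriage returns.
std::string trim(const std::string& s) {
    const char* blanks = " \t\r";
    std::string::size_type first = s.find_first_not_of(blanks);
    if (first == std::string::npos) return "";
    std::string::size_type last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}

// Return the first non-blank line that comes after a line containing
// marker, or an empty string if the marker or a following line is missing.
std::string lineAfter(const std::string& text, const std::string& marker) {
    std::istringstream lines(text);
    std::string line;
    bool foundMarker = false;
    while (std::getline(lines, line)) {
        std::string cleaned = trim(line);
        if (foundMarker && !cleaned.empty()) return cleaned;
        if (cleaned.find(marker) != std::string::npos) foundMarker = true;
    }
    return "";
}
```

Given invented sample text such as "Air Quality Index\nCurrent Conditions\n\n  42\nGood\n", calling lineAfter with the marker "Current Conditions" returns "42". If the site changes its layout and the marker disappears, the function returns an empty string, which is a useful signal that the scraper needs attention.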
One of the most difficult things about maintaining applications like this, which scrape data from an existing website, is the probability that the designers of the website could change the format of their page. If that happens, your application could stop working, and you’ll need to rewrite your code. In fact, it happened between the first and second editions of this book. This is a case where it’s useful to have the PHP script do the scraping of the remote site. It’s more convenient to rewrite the PHP than it is to reprogram the microcontroller once it’s in place.
Next, it’s time to connect to the PHP script through the Net using the Ethernet module. This time, you’ll use the shield as a client, not a server. Before you start programming, plan the sequence of messages. Using the Ethernet module as a network client is very similar to using Processing as a network client. In both cases, you have to know the correct sequence of messages to send and how the responses will be formatted. You also have to write a program to manage the exchange of messages. Whether you’re writing that program in Processing, in Arduino, or in another language on another microcontroller, the steps are still the same:
1. Open a connection to the web server.
2. Send an HTTP GET request.
3. Wait for a response.
4. Process the response.
5. Wait an appropriate interval and do it all again.
Figure 4-11 is a flowchart of what happens in the microcontroller program. The major decisions (if statements in your code) are marked by diamonds; the methods are marked by rectangles. Laying out the whole program in a flowchart like this will help you keep track of what’s going on at any given point. It also helps you to see what methods depend on a particular condition being true or not.
The circuit for this project also uses LEDs attached to I/O pins to keep track of the state of the program: one indicates that the microcontroller is connected, another that it’s disconnected, a third that it got a valid reading, and a fourth that it’s resetting.
This program will check the PHP script every two minutes. If there’s a new value for the air quality, it’ll read it and set the meter. If it can’t get a connection, it will try again two minutes later. Because it’s a client and not a server, there’s no web interface to the project, only the meter.
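One common way to schedule a repeating task like this without halting the program is to compare the current time against the time of the last attempt. The following is a minimal C++ sketch of that check, not the project’s actual code. It assumes a 32-bit millisecond counter like the one Arduino’s millis() returns, and the function name timeToRequest is made up for this example.

```cpp
#include <cstdint>

// Return true when at least intervalMs milliseconds have passed since
// lastAttemptMs. Unsigned subtraction gives the right answer even after
// the 32-bit millisecond counter rolls over to zero.
bool timeToRequest(std::uint32_t nowMs, std::uint32_t lastAttemptMs,
                   std::uint32_t intervalMs) {
    std::uint32_t elapsed = static_cast<std::uint32_t>(nowMs - lastAttemptMs);
    return elapsed >= intervalMs;
}
```

For a two-minute interval, intervalMs would be 120000. Recording the time of each attempt, whether or not it succeeded, gives the "try again two minutes later" behavior described above.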
For this sketch, you’re going to need Michael Margolis’ TextFinder library for Arduino. Download it from www.arduino.cc/playground/Code/TextFinder, unzip it, and save the TextFinder folder to the libraries folder of your Arduino sketches directory (the default location is Documents/Arduino/libraries/ on OS X, My Documents\Arduino\libraries\ on Windows 7, and ~/Documents/Arduino/libraries/ on Ubuntu Linux). If the libraries directory doesn’t exist, create it and put TextFinder inside. Restart Arduino, and the TextFinder library should show up in the Sketch→Import Library menu. TextFinder lets you find a substring of text from the incoming stream of bytes. It’s useful for both Ethernet and serial applications, as you’ll see.
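To see the general idea of searching a stream of bytes for a substring, here is a minimal sketch in plain C++. It is not TextFinder’s source code and doesn’t use its API. The function name findInStream is hypothetical, and a desktop std::istream stands in for the Ethernet or serial connection.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Read bytes from the stream until target has been seen.
// Returns true with the stream positioned just past the target,
// or false if the stream ends first. Only the last target.size()
// bytes are kept in memory, so the input can be arbitrarily long.
bool findInStream(std::istream& in, const std::string& target) {
    if (target.empty()) return true;
    std::string window;
    char c;
    while (in.get(c)) {
        window.push_back(c);
        if (window.size() > target.size()) {
            window.erase(0, 1);  // drop the oldest byte
        }
        if (window == target) return true;
    }
    return false;
}
```

Because the function consumes the stream as it searches, whatever you read next is the text immediately after the match. With an invented response such as "HTTP/1.1 200 OK\r\n\r\nAQI: 42", searching for "AQI:" leaves the stream positioned just before the number.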
TextFinder was modified and included in version 1.0.1 of Arduino, after this edition was published. You can find its methods in the Stream class, from which Client, Server, and Serial inherit methods. For alternate versions of these sketches that use Stream instead of TextFinder, see the GitHub repository for this book’s code, at https://github.com/tigoe/MakingThingsTalk2. Check the Reference section of www.arduino.cc for details of the Stream class.