The story of how data is collected is sometimes more interesting than the data itself. I have grouped these datasets into one section of my page because they were all collected in similar, if not identical, ways. In this case, the visualization of the data is also more interesting than the data itself.
Dataset 1: Visualization of Wireless Access Points
The goal here was to study how widespread wireless access points have become, even in the most remote places. Technically, I could have gone anywhere to collect data, but it was convenient and fitting to attempt this first as a pilot on a trip from Victorville (on our way from Las Vegas) to home. My recording instrument initially consisted of a Gateway notebook equipped with an internal Intel 802.11g wireless card. For continuous recording, the computer was powered by a DC-to-AC power converter plugged into the vehicle. As the wireless card discovered access points, NetStumbler v0.4.0 recorded parameters about each access point to a file.
But this data is useless if we do not know where each of these access points was first discovered!
For the actual data collection phase, I recorded data on a trip from home to Laughlin, NV. This time, my recording instrument was the same as in the pilot phase, with two added devices: a BlueTake Bluetooth connection device and an iBlue GPS. On our way from home to Nevada, I recorded access points and their locations. On the way back, I recorded just our location once per second.
Note: I have stalker-proofed both datasets. You will notice that my route ends in the middle of a road. That is not where I live, but is kind of close.
The initial format exported by NetStumbler is rather...asinine. The format itself is difficult to work with, and much of the data is encoded. A gnarly Perl script is necessary to convert it to a simple CSV file. You can download the script and modify it if you feel it can help you work with this type of data.
The Data Before

And after...VOILA!

The resulting data can be fed into GPSVisualizer for output in Google Earth. Points are plotted in two different colors: dark green for protected APs and red for unprotected APs (in this example it didn't work). The size of each point varies with signal strength: the stronger the signal, the larger the point:

A simpler representation can be created using Google Maps, without color and size differentiation.
Dataset 2: Visualization of Route
The original data from the GPS is in a format known as NMEA. Each GPS reading has multiple lines in the file associated with it, and each line contains different information about the reading. To tell them apart, every line begins with $GP followed by a three-character suffix that indicates what kind of data is recorded on that line:
| NMEA Sentences Recorded by VisualGPS for iBlue GPS | |
| Suffix | Data Contained on the Line |
| GGA | GPS fix data: location, diagnostic information, and elevation. |
| GLL | Geographic position only (latitude and longitude) * |
| RMC | Recommended Minimum Specific GNSS Data |
| VTG | Course Over Ground and Ground Speed |
Source: http://www.geoaps.com/NMEA.htm
| Raw Data as in NMEA File |
| $GPGGA,193847.798,3509.2965,N,11434.0415,W,1,06,01.4,152.2,M,-27.9,M,,*5A |
| $GPRMC,193847.798,A,3509.2965,N,11434.0415,W,37.77,185.16,020806,,,A*4D |
| $GPVTG,185.16,T,,,37.77,N,69.94,K,A*7D |
| $GPGLL,3509.2965,N,11434.0415,W,193847.798,A,A*49 |
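The sentence layout described above can be checked mechanically. The following is a minimal sketch, not the script used for this page: the helper names parse_nmea and nmea_checksum are hypothetical, and it relies on the standard NMEA convention (not spelled out above) that the two hexadecimal digits after the "*" are the XOR of every character between "$" and "*".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one NMEA line into its talker ("GP"),
# three-character sentence type ("GLL"), data fields, and checksum.
# Returns an empty list for lines that do not look like NMEA sentences.
sub parse_nmea {
    my ($line) = @_;
    $line =~ s/\s+\z//;    # drop the trailing CR/LF
    my ($body, $sum) = $line =~ /^\$([^*]+)\*([0-9A-Fa-f]{2})$/
        or return;
    my ($id, @fields) = split /,/, $body, -1;    # -1 keeps empty trailing fields
    return unless length($id) == 5;
    return (
        talker   => substr($id, 0, 2),
        type     => substr($id, 2, 3),
        fields   => \@fields,
        checksum => uc $sum,
        body     => $body,
    );
}

# Assumed rule: the checksum is the XOR of every character between
# "$" and "*", written as two uppercase hexadecimal digits.
sub nmea_checksum {
    my ($body) = @_;
    my $xor = 0;
    $xor ^= ord($_) for split //, $body;
    return sprintf '%02X', $xor;
}

# The GLL line from the raw data sample.
my $sample = '$GPGLL,3509.2965,N,11434.0415,W,193847.798,A,A*49';
my %gll = parse_nmea($sample);
printf "%s %s: %d fields, checksum %s\n",
    $gll{talker}, $gll{type}, scalar @{ $gll{fields} },
    nmea_checksum($gll{body}) eq $gll{checksum} ? 'ok' : 'BAD';
```

Run on the GLL line from the sample, this prints "GP GLL: 7 fields, checksum ok". A check like this is one way to spot lines that were cut off or corrupted during recording before they reach GPSVisualizer or R.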
The complete GPS data file is almost 6.5 MB, and GPSVisualizer can only accept files up to 3 MB. Fortunately, I do not need all of the data that was recorded. I wrote a Perl script to keep the geographic position sentences (GPGLL) and throw out the rest. It can be downloaded here. The original dataset that was run through the Perl script is route.dat. This script only makes the GPS data readable by GPSVisualizer. For analysis in R, keep reading.
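#!/usr/bin/perl
use strict;
use warnings;
# Copy only the GPGLL (geographic position) sentences from route.dat.
open(my $in,  '<', 'route.dat')   or die "Cannot read route.dat: $!";
open(my $out, '>', 'routeLL.dat') or die "Cannot write routeLL.dat: $!";
while (my $line = <$in>) {
    # Characters 4-6 hold the sentence type, e.g. "GLL" in "$GPGLL".
    next unless length($line) >= 6;
    print {$out} $line if substr($line, 3, 3) eq 'GLL';
}
close($in);
close($out);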
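The filter above hard-codes the GLL suffix. A minimal sketch of a more general version follows, assuming the wanted sentence types are passed as a comma-separated command-line argument and the data arrives on standard input. The name make_filter and this calling convention are hypothetical, not part of the downloadable script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a test that accepts only lines whose
# three-character sentence type (characters 4-6, after "$GP") was requested.
sub make_filter {
    my @types = @_;
    my %keep = map { ( uc($_) => 1 ) } @types;
    return sub {
        my ($line) = @_;
        return length($line) >= 6
            && substr($line, 0, 1) eq '$'
            && exists $keep{ substr($line, 3, 3) };
    };
}

# Assumed usage: perl filter.pl GLL,RMC < route.dat > routeLL.dat
if (@ARGV) {
    my $keep = make_filter(split /,/, shift @ARGV);
    while (my $line = <STDIN>) {
        print $line if $keep->($line);
    }
}
```

Reading line by line keeps memory use flat regardless of the size of the recording, which matters little at 6.5 MB but costs nothing.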
There are a lot of cool tools on the Internet for visualizing spatial data, and GPSVisualizer is one of them. The site offers a variety of output options. For the graphic below, I plotted my route from home to the Colorado River in black, on top of a topographical image. GPSVisualizer also lets users convert GPS data among various formats, geocode an address, and upload a NetStumbler file.

Another great visualization tool is Google Maps. The following image was created by GPSVisualizer and is actually "virtual", so the user can zoom in, zoom out, and pan the image to see my route down to the street level. The options I chose were to display cities, streets, and my route over topographical images. To upload an image like this to a page like this one, the user needs to create an API key for the Google Maps API.
Finally, Google Earth provides much of the same functionality as the Google Maps virtual image. I have not worked extensively with Google Earth, so I cannot comment on exactly what the differences are, but GPSVisualizer can create an "overlay" for your route as well. The Google Earth file for my trip is here, along with the rather boring Mojave Desert overlay.
But to generate a data file from route.dat that can be read in R, we need a different script.
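#!/usr/bin/perl
# This script takes an NMEA file and converts it into a CSV format that can
# be read by both R and humans. In the future, the user could specify which
# NMEA sentence to extract.
use strict;
use warnings;
my ($inFile, $outFile) = @ARGV;
die "Usage: $0 <input file> <output file>\n" unless defined $outFile;
open(my $in,  '<', $inFile)  or die "Cannot read $inFile: $!";
open(my $out, '>', $outFile) or die "Cannot write $outFile: $!";
# NMEA stores coordinates as (d)ddmm.mmmm: whole degrees followed by minutes.
# Convert to signed decimal degrees; south and west are negative.
sub to_degrees {
    my ($value, $hemi) = @_;
    my $deg = int($value / 100);
    my $dec = $deg + ($value - 100 * $deg) / 60;
    return ($hemi eq 'S' || $hemi eq 'W') ? -$dec : $dec;
}
print {$out} "Time,Lat,Lon\n";
while (my $line = <$in>) {
    next unless length($line) >= 6 && substr($line, 3, 3) eq 'GLL';
    my @tokens = split(',', $line);
    # A GPGLL sentence looks like:
    # $GPGLL,<1>,<2>,<3>,<4>,<5>,<6>,<7>
    # 1/2: Lat and Hemi, 3/4: Long and Hemi,
    # 5: Time UTC, 6: Status, 7: Mode and checksum
    # R only needs 1, 3 and 5, with 2 and 4 folded in as signs.
    next unless @tokens >= 6 && $tokens[1] ne '' && $tokens[3] ne '';
    my @values = (
        $tokens[5],
        to_degrees($tokens[1], $tokens[2]),
        to_degrees($tokens[3], $tokens[4]),
    );
    print {$out} join(',', @values), "\n";
}
close($in);
close($out);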
The resulting dataset, routelatlong.Rdata, can be read in R.
Datasets
Dataset (Version: 09/18/06)
External Links
Sites that Use GPSVisualizer*
GPS Coordinate Grabber - A GPS coordinate scraper.
GPS Games - games played using a handheld GPS where the game field is the entire planet.
Movingcache.com - like geocaching (where "treasures" are hidden at specific coordinates), but the caches can move.
GPSBabel: By Robert Lipe et al. Claims to "flatten the Tower of Babel that the authors of various programs for manipulating GPS data have imposed upon us," and indeed, it supports MANY file formats and can connect to your Garmin or Magellan device. Available for Windows and Unix, including Mac OS X.