GPS and WiFi Datasets
Field of Application: mapping, GPS, 802.11, WiFi, Internet, navigation, visualization.
Goal: understanding geographic data and visualization of such data.
Explanation of the Data
The datasets above contain two types of data recorded on a road trip. On the way out from home, I recorded wireless access points (APs) and their locations; on the way back, I recorded only our GPS coordinates, once per second. The datasets are grouped together here because they were recorded in the same way and in the same context.
Dataset Format - ryantrip-wifi-gps
| Column # | Variable Name | Type/Units | Layman's Description |
| 1 | Latitude | degrees | Latitude reading at which the AP was discovered. |
| 2 | Longitude | degrees | Longitude reading at which the AP was discovered. |
| 3 | SSID | text | The "name" of the AP. |
| 4 | Type | factor | We won't use this. It specifies the type of infrastructure on which the AP operates: either as part of a network (BSS), or as an ad-hoc cluster of systems. |
| 5 | BSSID | xx:xx:xx:xx:xx:xx | Usually called a MAC address. A unique identifier assigned to every Ethernet device manufactured. |
| 6 | TimeGMT | time | Greenwich Mean Time |
| 7 | SNR | decibels (dB) | "Signal to Noise Ratio" |
| 8 | Sig | decibels (dB) | Signal strength. |
| 9 | Noise | decibels (dB) | Measurement of the amount of noise/randomness (useless information) detected in transmission. |
| 10 | Secure | boolean | True if WEP (Wired Equivalent Privacy) is enabled on the AP, false otherwise (unsecured). |
| 11 | Channelbits | flag | ??? |
| 12 | Bcnintvl | integer | Can be adjusted to improve power consumption by clients. Useless in this study. |
| 13 | DataRate | integer, Mbps | The speed at which data is transferred over the air. (megabits per second) |
| 14 | LastChannel | integer | Wifi data is transferred over one of 11 channels. The value of this variable indicates which channel was used. |
TASKS for ryantrip-wifi-gps dataset.
These are just some things to think about. These questions can be answered using statistics and data analysis. There are multiple ways to approach most of these, mostly dictated by the reader's depth of knowledge in stats.
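For example, to draw a boxplot of signal strength (column 8) for the readings whose signal exceeds a threshold c that you set beforehand:
boxplot(data[which(data$Sig > c),8])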
What percentage of access points in this dataset are not secured?
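One possible way to approach this question is sketched below on a small made-up data frame. The `wifi` object and its values are assumptions for illustration only; the column names follow the format table above. In case the same AP appears in more than one row, the sketch keeps one row per BSSID before computing the share.

```r
# Toy stand-in for the wifi dataset; all values are invented for illustration.
wifi <- data.frame(
  SSID   = c("Orange1", "Wayport_Access", "linksys", "linksys", "homenet"),
  BSSID  = c("00:11:22:33:44:01", "00:11:22:33:44:02", "00:11:22:33:44:03",
             "00:11:22:33:44:03", "00:11:22:33:44:04"),
  Secure = c(TRUE, FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Keep one row per access point, identified by its BSSID.
uniqueAPs <- wifi[!duplicated(wifi$BSSID), ]

# Share of access points without WEP, as a percentage.
pctUnsecured <- 100 * mean(!uniqueAPs$Secure)
pctUnsecured
```

Taking the mean of a logical vector gives the proportion of TRUE values, so no explicit counting is needed.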
Businesses, department stores, rest stops, hotels, restaurants, etc. sometimes provide wireless internet service to their customers (for free, or for pay), and sometimes have their own private access points for conducting their own transactions. The SSIDs for access points at each of these places usually follow some pattern. By using this information, we can very roughly approximate the location of the facility in question (or the nearest cross-streets), or at least identify that there is one nearby.
The table below lists some common SSID naming schemes in this dataset:
| SSID Scheme | Business/Venue |
| Orangen (n is an integer, or may be blank) | The Home Depot |
| Wayport_Access | McDonald's |
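The naming schemes in the table can be turned into regular expressions and matched against the SSID column. The following is a minimal sketch, assuming the SSIDs are available as a character vector; the `ssids` values and the `schemes` lookup are hypothetical, and only the two patterns themselves come from the table.

```r
# Hypothetical SSIDs; only the two naming schemes come from the table above.
ssids <- c("Orange", "Orange12", "Wayport_Access", "linksys", "OrangeCounty")

# Map each venue to a regular expression describing its SSID scheme.
schemes <- c(
  "The Home Depot" = "^Orange[0-9]*$",   # "Orange" followed by optional digits
  "McDonald's"     = "^Wayport_Access$"
)

# Label each SSID with the first matching venue, or NA if none matches.
venue <- rep(NA_character_, length(ssids))
for (name in names(schemes)) {
  hit <- is.na(venue) & grepl(schemes[[name]], ssids)
  venue[hit] <- name
}
data.frame(ssid = ssids, venue = venue)
```

Anchoring the patterns with `^` and `$` keeps an unrelated name such as "OrangeCounty" from being counted as a Home Depot AP.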
Dataset Format - routelatlong
| Column # | Variable Name | Type/Units |
| 1 | Time UTC | hhmmss.cc |
| 2 | Latitude | Degrees and decimal minutes, ddmm.mm |
| 3 | Longitude | Degrees and decimal minutes, ddmm.mm |
| 4 | Speed | Exercise |
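The Time UTC column packs hours, minutes, and seconds into a single hhmmss.cc value, which is awkward to do arithmetic on. A small sketch of how it could be unpacked into seconds since midnight follows. It assumes the column is stored as a number; `toSeconds` is a hypothetical helper and the sample readings are made up.

```r
# Convert hhmmss.cc time stamps to seconds since midnight (UTC).
# `toSeconds` is a hypothetical helper, not something shipped with the dataset.
toSeconds <- function(t) {
  hh <- t %/% 10000          # leading two digits: hours
  mm <- (t %/% 100) %% 100   # middle two digits: minutes
  ss <- t %% 100             # seconds, including the fractional part
  3600 * hh + 60 * mm + ss
}

times <- c(153045.00, 153046.00, 153048.50)  # made-up readings
secs <- toSeconds(times)
diff(secs)  # gaps between consecutive readings, in seconds
```

Looking at `diff(secs)` is one way to check whether the readings really are one second apart.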
TASKS for routelatlong dataset.
These are just some things to think about. These questions can be answered using statistics and data analysis. There are multiple ways to approach most of these, mostly dictated by the reader's depth of knowledge in stats.
After setting your working directory, load the R dataset routelatlonsp.Rdata using the code:
load("routelatlonsp.Rdata")
This loads a dataframe called data into your workspace.
speedTrans <- data$Speed*___?____
Plot a histogram of speed.
hist(speedTrans)
What do you notice? (It might help to redraw the plot after removing the cases where speed is 0.) What does this suggest about the roads and routes that make up the trip?
You can remove cases where speed is 0 using the code:
hist(data[data$Speed != 0, 4])
# decDeg: convert the latitude and longitude columns of a data frame
# from ddmm.mm (degrees and decimal minutes) to signed decimal degrees.
decDeg <- function(data, latcol, loncol) {
  toDecimal <- function(x) {
    Deg <- trunc(abs(x) / 100)      # whole degrees
    DecMin <- abs(x) - 100 * Deg    # minutes, including the fractional part
    sign(x) * (Deg + DecMin / 60)   # sign taken from x, so 0-degree values keep their sign
  }
  data[, latcol] <- toDecimal(data[, latcol])
  data[, loncol] <- toDecimal(data[, loncol])
  dataTran <- data
  names(dataTran)[latcol] <- "latitude"
  names(dataTran)[loncol] <- "longitude"
  dataTran
}
To convert the dataset, enter the command
newDatasetName <- decDeg(data,2,3)
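The conversion that decDeg performs can be checked by hand on a single value. The sketch below uses a hypothetical stand-alone helper, `nmeaToDecimal`, and made-up inputs. Like the function above, it treats the first digits as whole degrees and everything after them, including the digits after the decimal point, as minutes.

```r
# Hypothetical helper mirroring the degrees-and-minutes to decimal-degrees arithmetic.
nmeaToDecimal <- function(x) {
  deg <- trunc(abs(x) / 100)       # whole degrees
  minutes <- abs(x) - 100 * deg    # minutes, including the fractional part
  sign(x) * (deg + minutes / 60)
}

north <- nmeaToDecimal(4530.00)    # 45 degrees 30 minutes -> 45.5
west  <- nmeaToDecimal(-9315.00)   # 93 degrees 15 minutes, negative -> -93.25
c(north, west)
```

Comparing a few converted rows of the real dataset against values worked out this way is a quick sanity check before exporting.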
To write the new dataset as a CSV file, enter
write.csv(newDatasetName, file="routelatlonsp.csv")
Now feed that data file to GPSVisualizer. That part is left as an exercise.
-- END OF EXERCISES --