This past weekend I did a little experimenting with Google Sheets. I used a function called REGEXREPLACE to return the number of search results for a phrase:
“I live n miles from work”
where n takes values between 1 and 100 in 5-unit intervals (except for the first interval, which is only 4 units: 1, 5, 10, and so on up to 100). This gave me a table of the number of web pages in the Google index containing the above phrase for each distance. I used both the numeral (e.g., 5) and the written value (e.g., five) to be reasonably complete about it.
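For anyone who wants to rebuild the list of search phrases, here is a minimal Python sketch of the values described above. It is not what I ran in Google Sheets; the function names are made up for illustration, and the spelling choices (hyphenated words such as "forty-five", "one hundred" for 100, and singular "mile" for n = 1) are assumptions.

```python
# Hypothetical helpers for building the search phrases; not the original Sheets setup.
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 1 to 100 (assumed spelling conventions)."""
    if not 1 <= n <= 100:
        raise ValueError("only 1..100 is supported")
    if n == 100:
        return "one hundred"
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def distances():
    """Return 1, 5, 10, ..., 100: a first step of 4, then steps of 5."""
    return [1] + list(range(5, 101, 5))


def search_phrases(n):
    """Return the numeral and written-out versions of the phrase for n."""
    unit = "mile" if n == 1 else "miles"
    return [f"I live {n} {unit} from work",
            f"I live {number_to_words(n)} {unit} from work"]


if __name__ == "__main__":
    for n in distances():
        for phrase in search_phrases(n):
            print(phrase)
```

That gives 21 distances and two phrases per distance, so 42 searches in total.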
Here is a graph summarizing the results:
Each dot on the graph is the number of web page results for the search phrase (“I live n miles from work”).
What does this tell us? Well, the phrases “I live 1 mile from work” and “I live 5 miles from work” seem to be the most common, but that doesn’t say much about how far people actually live from work. This is not a random sample, after all.
The more interesting thing to me is the zig-zag pattern where most multiples of 10 (20, 30, etc.) are higher than their neighbouring multiples of 5 (25, 35, 45, etc.). This pattern is almost certainly not because people are actually more likely to live 30 rather than 25 miles from work, or 40 rather than 35 miles from work. So what’s going on?
It seems these data are telling us something about rounding behaviour: when thinking about the distance between where we live and where we work, we seem more likely to estimate that distance to the nearest 10 than to the nearest 5. This is worth keeping in mind when we ask people questions about distance, particularly if people who round to the nearest 10 are somehow different from people who round to the nearest 5. Newer residents of a city may round to the less precise 10 compared to more established residents, for example. Understanding this rounding behaviour may be useful for improving how we understand perceptions of distance, but it’s also a good reminder to interpret these estimates with some caution, particularly for short distances; rounding from 44 to 40 is an error of only about 9% of the true distance, but rounding from 24 to 20 is nearly double that, at about 17%.
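The rounding arithmetic is easy to check with a small Python sketch. The function names are hypothetical, and the error is computed two ways because the percentage depends on the baseline: measured against the true distance, 44 to 40 is about 9% and 24 to 20 is about 17%; measured against the reported (rounded) distance, the same gaps are 10% and 20%.

```python
import math


def round_to_multiple(distance, step):
    """Round to the nearest multiple of step, with halves rounding up."""
    return step * math.floor(distance / step + 0.5)


def error_vs_true(true_distance, reported):
    """Rounding error as a fraction of the true distance."""
    return abs(true_distance - reported) / true_distance


def error_vs_reported(true_distance, reported):
    """Rounding error as a fraction of the reported (rounded) distance."""
    return abs(true_distance - reported) / reported


if __name__ == "__main__":
    for true_distance in (44, 24):
        reported = round_to_multiple(true_distance, 10)
        print(true_distance, "->", reported,
              f"{error_vs_true(true_distance, reported):.1%} of true,",
              f"{error_vs_reported(true_distance, reported):.1%} of reported")
```

Either way, the same 4-mile gap matters much more at short distances.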
It remains to be seen if the error is the same with respect to travel time…