Open data sets

Here I will post links to data that I have either collected myself or cleaned up for general use.  All data are free to use under the Creative Commons Attribution license.  I’ll include geographical data, time series data, and pretty well anything that I find interesting.

1. Canadian populated places

I extracted this file from the GeoNames database, which is an outstanding source of free location data online.  The data set includes latitude and longitude of about 20,000 place locations in Canada (cities, towns, villages, etc.).  It also includes a feature code which is documented here.
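
If you want a quick feel for how the feature code can be used, here is a minimal Python sketch that counts places by code.  It assumes a comma-separated layout with columns called name, latitude, longitude and feature_code; those names are my guess for illustration and may not match the file, so adjust them to the real header.  The sample rows are made up for the example (approximate coordinates, illustrative code assignments), although PPL, PPLA and PPLC are real GeoNames codes for a populated place, the seat of a first-order administrative division, and a national capital.

```python
import csv
import io
from collections import Counter

# Illustrative rows only. The column names and layout are assumptions,
# not the actual structure of the downloadable file. Coordinates are
# approximate and the feature code assignments are for demonstration.
SAMPLE = """name,latitude,longitude,feature_code
Ottawa,45.41,-75.70,PPLC
Toronto,43.70,-79.42,PPLA
Victoria,48.43,-123.37,PPLA
Hamilton,43.25,-79.85,PPL
"""


def read_places(text):
    """Parse CSV text into a list of dicts with numeric coordinates."""
    places = []
    for row in csv.DictReader(io.StringIO(text)):
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
        places.append(row)
    return places


def count_by_feature_code(places):
    """Count how many places carry each feature code."""
    return Counter(place["feature_code"] for place in places)


places = read_places(SAMPLE)
counts = count_by_feature_code(places)
print(counts.most_common())
```

To run this against the real file, replace SAMPLE with the file's contents (for example, by reading it with open()) and change the column names to whatever the header actually uses.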

2. World homicide data

This data set would be more useful if I had kept track of the sources of information, but I didn’t, so it is of limited research use.  Nonetheless, it’s a fun data set to play with.  The column headings are self-explanatory.

3. Hamilton cyclist survey (2009)

This is based on work that my first graduate student carried out between 2008 and 2010.  She collected responses from 403 self-selected respondents via an online survey tool, community events, and postings at public facilities around Hamilton in the summer of 2009.  I’ve removed some questions that could make respondents identifiable (mainly related to geography).  The survey questions are here.

4. Canadian word list

This is a list of words from a Canadian dictionary.  I can’t remember where I got it from, but maybe it’s useful?

5. Hockey GM data

Some data on the employment tenure of NHL GMs (1950 to 2016) that I cobbled together from the web.