The 2 principles of ‘N principles’

I was watching a documentary recently in which the content was organized into ‘10 principles’.  I am not sure whether those principles came from the producers or from the subject of the documentary, but it led me to ask whether this number ‘10’ was meaningful, or just a number selected with little to no correspondence to the actual number of principles on the topic.  We see ‘N principles’ all the time, such as the “three principles of cell theory”, the “five principles of humanity” and the “eight principles of Pilates”, but is the number of principles really meaningful?

If everyone who made up such ‘principles’ were guided purely by the actual number of principles defined by the problem they are trying to categorize, we might expect the frequency distribution of principles to be completely uniform.  In other words, there should be as many sets of ‘4 principles’ as ‘343,212 principles’.  This is because the number of available subjects to categorize in the universe is infinite, and the categorization process is probably complex enough to defy any universal rule or ‘principle’ of principle making.

However, in practice, we humans prefer simplicity (that’s why we come up with principles in the first place!).  Our brains are finite, and we like to reduce the complexities of the world into as few variables, parameters and categories as possible.  So in fact the frequency of ‘N principles’ is most likely inversely proportional to N:

[Figure: theoretical graph of principle frequencies]
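As a rough illustration of that theoretical curve, here is a minimal Python sketch that assumes the frequency of ‘N principles’ is exactly proportional to 1/N over the range 2 to 20.  The function name `inverse_model` and the choice to normalize the values so they sum to 1 are my own assumptions for the example, not something taken from the graph.

```python
def inverse_model(n_values):
    """Relative frequency of 'N principles', assuming frequency is proportional to 1/N."""
    weights = {n: 1.0 / n for n in n_values}
    total = sum(weights.values())
    # Normalize so the frequencies over the chosen range sum to 1.
    return {n: w / total for n, w in weights.items()}


freqs = inverse_model(range(2, 21))
for n, f in freqs.items():
    print(f"{n:2d} principles: {f:.3f}")
```

Under this assumption ‘2 principles’ should be twice as common as ‘4 principles’ and ten times as common as ‘20 principles’.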

Now, is this true in practice?  To find out I used the Google search engine to count the search results for the phrase “The N principles”, where N ranges from 2 to 20 (in both numeric and written form).  The number of search results gives a rough measure of the amount of content on the internet for a given search term: larger values mean more content for a given value of N.  Here is a plot of the log of the number of search results by N:

[Figure: search results by principles]

This seems roughly consistent with the general idea that we prefer fewer principles over many, but note that there are anomalies.  The numbers 7, 10, 12, 17 and 20 all seem to be over-represented.  It is possible that these findings are partly the result of some outlier content.  For example, it seems very likely that many of the “7 principles” search results come from one book, “The Seven Principles for Making Marriage Work”.  The same seems to be true for the “17 principles”.  However, I wonder if the 10, 12 and 20 principle anomalies may reflect an inclination in favour of certain numbers because they are more memorable, or seem more authoritative in some way.  If you identify ‘9 principles’ in a system, maybe you would split one of the principles into two to get a nice even ‘10 principles’.  This seems feasible given that on the graph ‘9 principles’ seems to be an anomaly in the other direction (less content than expected).
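One way to make “over-represented” a bit less of an eyeball judgment is to fit the 1/N model and look at what is left over.  The sketch below is only an assumption about how that could be done, not how the plot above was made: it fits a straight line to log(count) against log(N) by ordinary least squares and reports the residuals.  The counts are synthetic (an exact 1/N curve with an invented bump at 10 and an invented dip at 9), not the real Google numbers, and the helper name `log_residuals` is hypothetical.

```python
import math


def log_residuals(counts):
    """Fit log(count) = a + b*log(N) by least squares; return (slope b, residuals by N)."""
    ns = sorted(counts)
    xs = [math.log(n) for n in ns]
    ys = [math.log(counts[n]) for n in ns]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    residuals = {n: y - (intercept + slope * x) for n, x, y in zip(ns, xs, ys)}
    return slope, residuals


# Synthetic data for illustration only: counts proportional to 1/N,
# with '10 principles' inflated and '9 principles' deflated by hand.
synthetic = {n: 1_000_000 / n for n in range(2, 21)}
synthetic[10] *= 3
synthetic[9] *= 0.5

slope, resid = log_residuals(synthetic)
most_over = max(resid, key=resid.get)
most_under = min(resid, key=resid.get)
print(f"fitted slope: {slope:.2f}")
print(f"most over-represented N: {most_over}, most under-represented N: {most_under}")
```

A slope near -1 would be consistent with the inverse-proportion idea, and large positive or negative residuals point at the values of N worth a closer look.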

Naturally one might ask whether or not this model is useful.  Well, it probably isn’t.  Nevertheless, I used Google Trends to explore whether or not Google searches for ‘N principles’ correlate with the volume of content on the internet as seen in the figure above.  This allows me to identify any gaps between what people are searching for and what is actually available on the internet.  Here is the graph:

[Figure: searches by principles]

The data are from Google Trends based on Google searches from 2004 to 2016.  Searches for ‘N principles’ with N greater than 14 did not produce enough data for Google Trends to estimate relative search frequency.  However, we can see that 5, 7, 8, 10, 12 and 14 principles seem the most popular.  Of these, 8 and 14 seem the most interesting, since neither was an outlier in the web content search above.  It’s also interesting how infrequently people search for ‘2 principles’.  Perhaps this means that ‘2 principles’ is an oversimplification much of the time?
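The comparison between what is searched for and what exists could be sketched along the following lines.  This is a hedged example rather than the method behind the graph: it converts both series to shares (Google Trends only reports relative interest anyway) and subtracts the content share from the search share for each N.  The function names and all of the numbers are invented placeholders.

```python
def share(values):
    """Convert raw values into fractions of their total."""
    total = sum(values.values())
    return {k: v / total for k, v in values.items()}


def demand_supply_gap(searches, content):
    """Search share minus content share for each N present in both series.

    A positive gap means N is searched for more than its share of content suggests.
    """
    common = sorted(set(searches) & set(content))
    demand = share({n: searches[n] for n in common})
    supply = share({n: content[n] for n in common})
    return {n: demand[n] - supply[n] for n in common}


# Placeholder numbers for illustration only (not real Google data).
content = {2: 500, 8: 100, 10: 300, 14: 50}
searches = {2: 5, 8: 40, 10: 45, 14: 10}

gap = demand_supply_gap(searches, content)
most_undersupplied = max(gap, key=gap.get)
most_oversupplied = min(gap, key=gap.get)
for n, g in gap.items():
    print(f"{n:2d} principles: gap {g:+.3f}")
```

With real data in place of the placeholders, the N with the largest positive gap would be the candidate for the most under-served number of principles.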

Conclusions

I have two conclusions (aka principles):

1.  People who categorize problems/systems/practices into ‘principles’ seem to overly favour numbers like 10 and 12.  As such, there are probably times when the actual number of useful principles is higher or lower than that.  This could be handy to know when someone is telling you about the ‘10 principles of training a cat’ at a dinner party (‘excuse me, but are you sure there aren’t really nine?’).

2. If you want to capitalize on the gap between supply (the number of sets of principles on the internet) and demand (the number of sets of principles searched for) then you should pick 8 or 14 principles, and whatever you do, don’t pick two principles.