Will I "Really Like" this Movie?

Navigating Movie Website Ratings to Select More Enjoyable Movies

Sometimes When You Start To Go There You End Up Here

There are some weeks when I’m stumped as to what I should write about in this weekly trip to Mad Moviedom. Sometimes I’m in the middle of an interesting study that isn’t quite ready for publication. Sometimes an idea isn’t quite fully developed. Sometimes I have an idea but I find myself blocked as to how to present it. When I find myself in this position, one avenue always open to me is to create a quick study that might be halfway interesting.

This is where I found myself this week. I had ideas that weren’t ready to publish yet, so my fallback was a quick study of which movie decades offer the best “really like” viewing potential. Here are the results of my first pass:

“Really Like” Decades
Based on Number of “Really Like” Movies
As of April 6, 2017
                     My Rating
Decade   Really Liked   Didn’t Really Like   Total   “Really Like” Probability
All             1,108                  888   1,996
2010’s            232                  117     349   60.9%
2000’s            363                  382     745   50.5%
1990’s            175                   75     250   62.0%
1980’s             97                   60     157   58.4%
1970’s             56                   49     105   54.5%
1960’s             60                   55     115   53.9%
1950’s             51                   78     129   46.6%
1940’s             55                   43      98   55.8%
1930’s             19                   29      48   46.9%

These results are mildly interesting. The 2010’s, 1990’s, 1980’s, and 1940’s are above-average decades for me. There is an unusually high number of movies in the sample that were released in the 2000’s. Remember that movies stay in my sample for 15 years from the year I last watched them. After 15 years they are removed from the sample and put into the pool of movies available to watch again. The good movies get watched again and the others, hopefully, are never seen again. Movies last seen after 2002 have not yet gone through this process, which separates out the “really like” movies to be watched again and permanently weeds the movies I didn’t “really like” out of the sample. The contrast between the 2000’s and the 2010’s is a good measure of the impact of undisciplined versus disciplined movie selection.

As I’ve pointed out in recent posts, I’ve made some changes to my algorithm. One of the big changes is that, instead of counting the number of “really like” movies, I now count the number of ratings for those movies. After doing my decade study based on the number of movies, I realized I should have used the number-of-ratings method to be consistent with my new methodology. Here are the results based on the new methodology:

“Really Like” Decades
Based on Number of “Really Like” Ratings
As of April 6, 2017
                     My Rating
Decade    Really Liked   Didn’t Really Like           Total   “Really Like” Probability
All      2,323,200,802        1,367,262,395   3,690,463,197
2010’s     168,271,890          166,710,270     334,982,160   57.1%
2000’s   1,097,605,373          888,938,968   1,986,544,341   56.6%
1990’s     610,053,403          125,896,166     735,949,569   70.8%
1980’s     249,296,289          111,352,418     360,648,707   65.3%
1970’s      85,940,966           25,372,041     111,313,007   67.7%
1960’s      57,485,708           15,856,076      73,341,784   68.0%
1950’s      28,157,933           23,398,131      51,556,064   59.5%
1940’s      17,003,848            5,220,590      22,224,438   67.4%
1930’s       9,385,392            4,517,735      13,903,127   64.6%
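
For anyone who wants to check the totals rows of the two tables above, here is a minimal Python sketch that computes the overall “really like” share both ways. The variable and function names are my own illustration, and the only inputs are the “All” rows. Note that the per-decade probability column is not this raw ratio (232 of 349 is about 66.5%, not 60.9%), so the sketch covers the totals only.

```python
# Totals ("All" rows) copied from the two tables above.
movies_liked, movies_total = 1_108, 1_996
ratings_liked, ratings_total = 2_323_200_802, 3_690_463_197


def share(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of movies in the sample that are "really like" movies.
movie_share = share(movies_liked, movies_total)
# Share of all ratings that belong to "really like" movies.
rating_share = share(ratings_liked, ratings_total)

print(f"Share of movies that are 'really like':   {movie_share:.1f}%")
print(f"Share of ratings on 'really like' movies: {rating_share:.1f}%")
```

The two shares differ only because “really like” movies carry more ratings per movie on average than the rest of the sample.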

While the decade-by-decade results differ, the big reveal is that 63.0% of the ratings are for “really like” movies, while only 55.5% of the movies are “really like” movies. It starkly reinforces the impact of the law of large numbers: movie website indicators of “really like” movies are more reliable when the number of ratings driving those indicators is larger. The following table illustrates this better:

“Really Like” Decades
Based on Average Number of “Really Like” Ratings per Movie
As of April 6, 2017
                     My Rating
Decade   Really Liked   Didn’t Really Like          Total   “Really Like” % Difference
All      2,096,751.63         1,539,709.90   1,848,929.46   36.2%
2010’s     725,309.87         1,424,874.10     959,834.27   -49.1%
2000’s   3,023,706.26         2,327,065.36   2,666,502.47   29.9%
1990’s   3,486,019.45         1,678,615.55   2,943,798.28   107.7%
1980’s   2,570,064.84         1,855,873.63   2,297,125.52   38.5%
1970’s   1,534,660.11           517,796.76   1,060,123.88   196.4%
1960’s     958,095.13           288,292.29     637,754.64   232.3%
1950’s     552,116.33           299,976.04     399,659.41   84.1%
1940’s     309,160.87           121,409.07     226,779.98   154.6%
1930’s     493,968.00           155,783.97     289,648.48   217.1%
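
The averages and percentage differences in this table can be reproduced from the first two tables: divide each decade’s rating count by its movie count, then compare the two averages. The sketch below is one way to do that in Python. The dictionary layout and the function name are illustrative assumptions on my part, and the figures are copied from the tables above.

```python
# Per decade: (liked movies, other movies, ratings on liked, ratings on other),
# copied from the movie-count and rating-count tables above.
data = {
    "All":   (1108, 888, 2_323_200_802, 1_367_262_395),
    "2010s": (232, 117, 168_271_890, 166_710_270),
    "2000s": (363, 382, 1_097_605_373, 888_938_968),
    "1990s": (175, 75, 610_053_403, 125_896_166),
    "1980s": (97, 60, 249_296_289, 111_352_418),
    "1970s": (56, 49, 85_940_966, 25_372_041),
    "1960s": (60, 55, 57_485_708, 15_856_076),
    "1950s": (51, 78, 28_157_933, 23_398_131),
    "1940s": (55, 43, 17_003_848, 5_220_590),
    "1930s": (19, 29, 9_385_392, 4_517_735),
}


def average_ratings(liked_movies, other_movies, liked_ratings, other_ratings):
    """Return (avg ratings per liked movie, avg per other movie, % difference)."""
    avg_liked = liked_ratings / liked_movies
    avg_other = other_ratings / other_movies
    pct_diff = 100.0 * (avg_liked / avg_other - 1.0)
    return avg_liked, avg_other, pct_diff


results = {decade: average_ratings(*row) for decade, row in data.items()}

for decade, (avg_liked, avg_other, pct_diff) in results.items():
    print(f"{decade:<6}{avg_liked:>16,.2f}{avg_other:>16,.2f}{pct_diff:>9.1f}%")
```

A positive percentage means the “really like” movies in that decade average more ratings per movie than the others, and a negative one means the reverse.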

With the exception of the 2010’s, the average number of ratings per movie is larger for the “really like” movies. In fact, the averages are dramatically different for the decades prior to 2000. My educated guess is that the post-2000 years will end up fitting the pattern of the other decades once those years mature.

So what is the significance of this finding? It clearly suggests that waiting to decide whether to see a new movie until a sufficient number of ratings has come in will produce a more reliable result. The unanswered question is how many ratings are enough.

The finding also reinforces the need to have something like Oscar performance to act as a second measure of quality for movies that will never have “enough” ratings for a reliable result.

Finally, the path from “there to here” is not always found on a map.

