Will I "Really Like" this Movie?

Navigating Movie Website Ratings to Select More Enjoyable Movies

The Art of Selecting “Really Like” Movies: Older Never Before Seen

Last week I stated in my article that I could pretty much identify whether a movie has a good chance of being a “really like movie” within six months of its release. If you need any further evidence, here are my top ten movies that I’ve never seen that are older than six months.

My Top Ten Never Seen Movie Prospects
Never Seen Movies: More Than 6 Months Past Release Date

| Movie Title | Last Data Update | Release Date | Total # of Ratings | “Really Like” Probability |
| --- | --- | --- | --- | --- |
| Hey, Boo: Harper Lee and ‘To Kill a Mockingbird’ | 2/4/2017 | 5/13/2011 | 97,940 | 51.7% |
| Incendies | 2/4/2017 | 4/22/2011 | 122,038 | 51.7% |
| Conjuring, The | 2/4/2017 | 7/19/2013 | 241,546 | 51.7% |
| Star Trek Beyond | 2/4/2017 | 7/22/2016 | 114,435 | 51.7% |
| Pride | 2/4/2017 | 9/26/2014 | 84,214 | 44.6% |
| Glen Campbell: I’ll Be Me | 2/9/2017 | 10/24/2014 | 105,751 | 44.6% |
| Splendor in the Grass | 2/5/2017 | 10/10/1961 | 246,065 | 42.1% |
| Father of the Bride | 2/5/2017 | 6/16/1950 | 467,569 | 42.1% |
| Imagine: John Lennon | 2/5/2017 | 10/7/1998 | 153,399 | 42.1% |
| Lorenzo’s Oil | 2/5/2017 | 1/29/1993 | 285,981 | 42.1% |

The movies with a high “really like” probability in this group have already been watched. Of the remaining movies, four are roughly 50/50 and the rest have the odds stacked against them. In other words, if I watch all ten movies, I probably won’t “really like” half of them. The dilemma is that I probably would “really like” the other half. The reality is that I won’t watch any of these ten movies as long as there are movies I’ve already seen with better odds. Is there a way to improve the odds for any of these ten movies?
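As a rough check on the "half of them" estimate, the probabilities in the top ten table can be summed to get an expected count. This is only a sketch in Python: the function name is illustrative, and it treats each probability as an independent chance, which is an assumption and not something the algorithm itself is known to guarantee.

```python
# "Really like" probabilities copied from the top ten table above.
probabilities = [0.517, 0.517, 0.517, 0.517, 0.446, 0.446,
                 0.421, 0.421, 0.421, 0.421]

def expected_really_like(probs):
    """Expected number of "really like" movies if every movie is watched."""
    return sum(probs)

expected = expected_really_like(probabilities)
print(f"Expected 'really like' movies out of {len(probabilities)}: {expected:.1f}")
```

The sum comes to about 4.6 movies out of ten, a little under half.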

You’ll note that all ten movies have probabilities based on fewer than 500,000 ratings. Will some of these movies improve their probabilities as they receive more ratings? Maybe. Maybe not. To explore this possibility further, I divided my database into quintiles based on the total number of ratings. The quintile with the most ratings, the most credible quintile, provides results that define the optimal performance of my algorithm.
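The post does not show how the quintiles were built, so the following is one plausible way to do it, sketched in Python: rank the movies by total ratings and cut the ranking into five equal-sized groups. The function name, the movie titles, and the rating counts are all hypothetical and are there only to make the example runnable.

```python
def assign_quintiles(rating_counts):
    """Map each title to a quintile by rank: 1 = fewest ratings, 5 = most."""
    ordered = sorted(rating_counts, key=rating_counts.get)
    n = len(ordered)
    # Integer arithmetic splits the ranking into five near-equal groups.
    return {title: rank * 5 // n + 1 for rank, title in enumerate(ordered)}

# Hypothetical titles and rating counts, for illustration only.
sample = {
    "Movie A": 12_000, "Movie B": 95_000, "Movie C": 210_000,
    "Movie D": 480_000, "Movie E": 750_000, "Movie F": 1_100_000,
    "Movie G": 1_900_000, "Movie H": 2_500_000, "Movie I": 3_400_000,
    "Movie J": 5_000_000,
}

quintiles = assign_quintiles(sample)
for title, q in quintiles.items():
    print(f"{title}: Quintile {q}")
```

With ten sample movies, each quintile ends up holding two titles. A rank-based split like this keeps the groups the same size, which matches the roughly 400 movies per quintile reported in the post.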

Quintile 5

Ratings Range: > 2,872,053

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg. Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 152 | 134 | 88% | 8.6 | 8.5 | -0.1 |
| Movies Seen Once | 246 | 119 | 48% | 7.5 | 6.9 | -0.7 |
| All Movies in Range | 398 | 253 | 64% | 7.9 | 7.5 | |

All of the movies in Quintile 5 have more than 2,872,053 ratings. My selection of movies that I had seen before is clearly better than my selection of movies I watched for the first time. The selection is better because the algorithm results led me to the better movies and my memory did some additional weeding. My takeaway is that, when considering movies I’ve never seen before, I should put my greatest trust in the algorithm if the movie falls in this quintile.

Let’s look at the next four quintiles.

Quintile 4

Ratings Range: 1,197,745 to 2,872,053

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg. Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 107 | 85 | 79% | 8.3 | 8.3 | 0.1 |
| Movies Seen Once | 291 | 100 | 34% | 7.1 | 6.4 | -0.7 |
| All Movies in Range | 398 | 185 | 46% | 7.4 | 6.9 | |

Quintile 3

Ratings Range: 516,040 to 1,197,745

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg. Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 122 | 93 | 76% | 7.8 | 8.0 | 0.2 |
| Movies Seen Once | 278 | 102 | 37% | 7.1 | 6.6 | -0.6 |
| All Movies in Range | 400 | 195 | 49% | 7.3 | 7.0 | |

Quintile 2

Ratings Range: 179,456 to 516,040

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg. Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 66 | 46 | 70% | 7.4 | 7.5 | 0.2 |
| Movies Seen Once | 332 | 134 | 40% | 7.0 | 6.4 | -0.6 |
| All Movies in Range | 398 | 180 | 45% | 7.1 | 6.6 | |

Quintile 1

Ratings Range: < 179,456

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg. Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 43 | 31 | 72% | 7.0 | 7.5 | 0.5 |
| Movies Seen Once | 355 | 136 | 38% | 6.9 | 6.2 | -0.7 |
| All Movies in Range | 398 | 167 | 42% | 6.9 | 6.4 | |

Look at the progression of the algorithm projections as the quintiles get smaller. The gap between the movies seen more than once and those seen only once narrows as the number of ratings gets smaller. Notice that the difference between my ratings and the projected ratings for Movies Seen Once is fairly constant for all quintiles, either -0.6 or -0.7. But for the Movies Seen More than Once, the difference grows positively as the number of ratings gets smaller. This suggests that, for Movies Seen More than Once, the higher than expected ratings I give movies in Quintiles 1 & 2 are primarily driven by my memory of the movies rather than the algorithm.
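The progression described above can be tabulated directly from the quintile tables. The Python sketch below copies the projected average ratings and the rating differences for Movies Seen More than Once, then computes the gap in projected ratings between the two groups. The variable and function names are illustrative and not part of the original analysis.

```python
# Rows copied from the quintile tables above:
# (quintile, projected avg for movies seen more than once,
#  projected avg for movies seen once,
#  my rating minus projected rating for movies seen more than once)
QUINTILE_ROWS = [
    (5, 8.6, 7.5, -0.1),
    (4, 8.3, 7.1, 0.1),
    (3, 7.8, 7.1, 0.2),
    (2, 7.4, 7.0, 0.2),
    (1, 7.0, 6.9, 0.5),
]

def projection_gaps(rows):
    """Return {quintile: projected rating gap, seen more than once vs. once}."""
    return {q: round(multi - once, 1) for q, multi, once, _ in rows}

gaps = projection_gaps(QUINTILE_ROWS)
for q, _, _, diff in QUINTILE_ROWS:
    print(f"Quintile {q}: projection gap {gaps[q]:.1f}, rewatch diff {diff:+.1f}")
```

On the published figures, the projection gap shrinks from about 1.1 in Quintile 5 to 0.1 in Quintile 1, while the difference for rewatched movies climbs from -0.1 to +0.5.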

What does this mean for my top ten never before seen movies listed above? All of the top ten are in either Quintile 1 or Quintile 2. As they grow into the higher quintiles, some may emerge with higher “really like” probabilities. Certainly Star Trek Beyond, which is only 7 months old, can be expected to grow into the higher quintiles. But what about Splendor in the Grass, which was released in 1961 and, at 55 years old, might not move into Quintile 3 until another 55 years pass?
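The quintile placement of the top ten can be checked against the published ranges. This Python sketch uses the four boundaries listed with the quintile tables and the rating counts from the top ten table. The function name is illustrative, and placing a count that falls exactly on a boundary into the higher quintile is an assumption, since the published ranges leave the edges ambiguous.

```python
from bisect import bisect_right
from collections import Counter

# Upper boundaries of Quintiles 1 through 4, from the ranges listed above.
QUINTILE_BOUNDS = [179_456, 516_040, 1_197_745, 2_872_053]

def quintile_for(total_ratings):
    """Return the quintile (1-5) for a movie's total number of ratings.

    Assumption: a count exactly on a boundary goes to the higher quintile.
    """
    return bisect_right(QUINTILE_BOUNDS, total_ratings) + 1

# Total ratings copied from the top ten table.
top_ten = {
    "Hey, Boo: Harper Lee and 'To Kill a Mockingbird'": 97_940,
    "Incendies": 122_038,
    "Conjuring, The": 241_546,
    "Star Trek Beyond": 114_435,
    "Pride": 84_214,
    "Glen Campbell: I'll Be Me": 105_751,
    "Splendor in the Grass": 246_065,
    "Father of the Bride": 467_569,
    "Imagine: John Lennon": 153_399,
    "Lorenzo's Oil": 285_981,
}

counts = Counter(quintile_for(n) for n in top_ten.values())
for q in sorted(counts):
    print(f"Quintile {q}: {counts[q]} movies")
```

Six of the ten land in Quintile 1 and four in Quintile 2. Father of the Bride, at 467,569 ratings, is the closest to the 516,040 threshold for Quintile 3.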

This suggests that a secondary movie quality indicator is needed, one that is separate from the movie recommender sites already in use. It sounds like I’ve just added another project to my 2017 “really like” project list.

 

 
