Will I "Really Like" this Movie?

Navigating Movie Website Ratings to Select More Enjoyable Movies


What Am I Actually Going to Watch This Week? Netflix Helps Out with One of My Selections.

The core mission of this blog is to share ideas on how to select movies to watch that we’ll “really like”. At times, though, I’ve gotten bogged down in how to build the “really like” model. I’d like to reorient the dialogue back to the primary mission: which “really like” movies am I going to watch and, more importantly, why?

Each Wednesday I publish the ten movies on my Watch List for the week. These movies usually represent the ten movies with the highest “really like” probability that are available to me to watch on platforms that I’ve already paid for. This includes cable and streaming channels I’m paying for and my Netflix DVD subscription. I rarely use a movie on demand service.

Now, ten movies are too many, even for the Mad Movie Man, to watch in a week. The ten-movie Watch List instead serves as a menu for the three or four movies I actually most want to watch during the week. So, how do I select those three or four movies?

The first and most basic question to answer is who, if anyone, I’m watching the movie with. Friday night is usually the night that my wife and I sit down and watch a movie together. The rest of the week I’ll watch two or three movies by myself. So, right from the start, I have to find a movie that my wife and I will both enjoy. This week that movie is Hidden Figures, the 2016 Oscar-nominated film about the role three black female mathematicians played in John Glenn’s orbit of the Earth in the early 1960’s.

This movie became available to Netflix DVD subscribers on Tuesday, May 9, and I received my Hidden Figures DVD that day. Something I’ve learned over the years is that Netflix ships on Monday the DVDs that become available on Tuesday. For this to happen, you have to time the return of your old DVD to arrive on the Saturday or Monday before the Tuesday release. This gives you the best chance to avoid “long wait” queues.

I generally use Netflix DVD to see new movies that I don’t want to wait another 3 to 6 months to see or for old movies that I really want to see but aren’t available on my usual platforms.

As of the first quarter of 2017, Netflix reported only 3.94 million subscribers to its DVD service. I am one of them. The DVD service is the only way you can still access Netflix’s best-in-the-business 5-star system for rating movies. It is easily the most reliable predictor of how you’ll rate a movie or TV show. Unfortunately, Netflix streaming customers no longer have the benefit of the 5-star system; they have been moved to a less granular “thumbs up”/“thumbs down” rating system. To be fair, I haven’t gathered any data on this new system yet, so I’ll reserve judgement as to its value. As for the DVD service, it will have me as a customer as long as it maintains the 5-star recommender system as one of the benefits of being a DVD subscriber.

The 5-star system is a critical assist in finding a movie for both my wife and me. Netflix allows you to set up profiles for other members of the family. After my wife and I watch a movie, she gives it a rating and I give it a rating, each entered under our separate profiles. This allows a unique predicted rating for each of us based on our individual taste in movies. For example, Netflix predicts that I will rate Hidden Figures a 4.6 out of 5 and that my wife will rate it a 4.9. In other words, according to Netflix, this is a movie that both of us will not only “really like” but absolutely “love”.

Hidden Figures has a “really like” probability of 61.4%. Its Oscar Performance probability is 60.7%, based on its three nominations. Its probability based solely on feedback from the recommender sites I use is 69.1%. At this point in time, it is a Quintile 1 movie from a credibility standpoint, which means the 69.1% probability rests on a limited number of ratings and isn’t very credible yet. That’s why the blended 61.4% “really like” probability sits much closer to the Oscar Performance probability of 60.7%. I fully expect that, as more people see Hidden Figures and enter their ratings, the “really like” probability for this movie will move higher.
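To see how those three numbers fit together, assume the model simply takes a credibility-weighted average of the recommender-site probability and the Oscar Performance probability. Backing the weight out of the figures above (it is not a number stated anywhere in this post) gives roughly:

```latex
P_{\text{really like}} \;=\; w \cdot P_{\text{sites}} \;+\; (1 - w) \cdot P_{\text{Oscar}}
\qquad\Longrightarrow\qquad
61.4\% \;\approx\; 0.08 \times 69.1\% \;+\; 0.92 \times 60.7\%
```

In other words, a Quintile 1 movie like Hidden Figures appears to get only a small weight on its site-driven probability, and that weight should grow as the ratings count grows.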

Friday Night Movie Night this week looks like a “really like” lock…thanks to Netflix DVD.

 

 

Can You Increase Your Odds of Having a “Really Like” Experience at the Movie Theater?

Last Friday, my wife and I were away from home visiting two different sets of friends. One group we met for lunch. The second group we were meeting in the evening. With some time to spare between visits, we decided to go to a movie. The end of April usually has slim pickings for “really like” movies at the theater. With the help of IMDB and Rotten Tomatoes, I was able to surface a couple of prospects but only one that both my wife and I might “really like”. We ended up seeing a terrific little movie, Gifted.

My experience got me thinking about the probabilities of seeing “really like” movies at the movie theater. These movies have the least data on which to base a decision, and yet I can’t recall many movies that I’ve seen in the theater that I haven’t “really liked”. Was this reality or merely perception?

I created a subset of my database of movies that I’ve seen within 3 months of their release. Of the 1,998 movies in my database, 99 movies, or 5%, met the criteria. Of these 99 movies, I “really liked” 86% of them. For the whole database, I “really liked” 60% of the movies I’ve watched over the last 15 years. My average score for the 99 movies was 7.8 out of 10. For the remaining 1,899 movies my average score was 6.8 out of 10.

How do I explain this? My working theory is that when a movie comes with an additional cash outlay, i.e. theater tickets, I become a lot more selective about what I see. But how can I be more selective with less data? I think it’s by selecting safe movies, the movies I know I am going to like. When I went to the movie theater a couple of months ago to see Beauty and the Beast, I knew I was going to love it, and I did. Those are the types of movie selections I tend to reserve for the theater experience.

There are occasions, like last Friday, when a specific movie isn’t drawing me to the theater; instead, I’m drawn by the movie theater experience itself. Can I improve my chances of selecting a “really like” movie in those instances?

Last week I mentioned that I needed to better define what I need my “really like” probability model to do. One of the things it needs to do is provide better guidance for new releases. The current model has a gap when it comes to new releases: because the data is scarce, most new releases will be Quintile 1 movies in the model. In other words, very little of the input based on my own taste in movies, i.e. Netflix, Movielens, and Criticker, is factored into the “really like” probability.

A second gap in the model is that new releases haven’t yet been considered for Academy Awards. The model treats them as if they aren’t award-worthy, even though some of them will be Oscar-nominated.

I haven’t finalized a solution to these gaps, but I’m experimenting with one. As a substitute for the Oscar performance factor in my model, I’m considering a combined IMDB/Rotten Tomatoes probability factor. These two ratings are viable indicators of the quality of a new release. This factor would be used until the movie goes through the Oscar nomination process, at which point it would convert to the Oscar performance factor.
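As a rough sketch of how that substitution could work, consider the following. The function name, the 50/50 blend of IMDB and Rotten Tomatoes, and the example numbers are my own placeholders, not the model’s actual mechanics:

```python
def quality_probability(imdb_rating, rt_pct, oscar_probability=None):
    """Pick the quality factor for a movie: a combined IMDB/Rotten Tomatoes
    proxy before Oscar nominations exist, the Oscar performance factor after."""
    if oscar_probability is not None:
        # The movie has been through the nomination process: use Oscar performance.
        return oscar_probability
    # Pre-nomination: blend IMDB (0-10 scale) and Rotten Tomatoes (0-100 scale)
    # into a single 0-1 score and treat it as a stand-in probability.
    return 0.5 * (imdb_rating / 10.0) + 0.5 * (rt_pct / 100.0)

# A movie rated 8.2 on IMDB with an 86% Rotten Tomatoes score:
print(round(quality_probability(8.2, 86), 2))                  # 0.84 before nominations
print(quality_probability(8.2, 86, oscar_probability=0.607))   # 0.607 once Oscar data exists
```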

I’ve created a 2017 new release list of the new movies I’m tracking. You can find it on the sidebar with my Weekly Watch List movies. This list uses the new “really like” probability approach I’m testing for new releases. Check it out.

If you plan on going to the movies this weekend to see Guardians of the Galaxy Vol. 2, it is probably because you really liked the first one. Based on IMDB and Rotten Tomatoes, you shouldn’t be disappointed. It is Certified Fresh 86% on Rotten Tomatoes and 8.2 on IMDB.

 

 

“Really Like” Movies: Is That All There Is?

After scoring a movie that I’ve watched, one of my rituals is to read a critic’s review of the movie. If the movie is contemporaneous with Roger Ebert’s tenure as the world’s most-read critic, he becomes my critic of choice. I choose Ebert, first of all, because he is a terrific writer. He has a way of seeing beyond the entertainment value of a movie and observing how it fits into the culture of its time. I also choose Ebert because I find that he “really likes” many of the movies I “really like”. He acts as a validator of my film taste.

The algorithm that I use to find “really like” movies to watch is also a validator. It sifts through a significant amount of data about a movie I’m considering and validates whether I’ll probably “really like” it or not based on how I’ve scored other movies. It guides me towards movies that will be “safe” to watch. That’s a good thing. Right? I guess so. Particularly, if my goal is to find a movie that will entertain me on a Friday night when I might want to escape the stress of the week.

But what if I want to experience more than a comfortable escape? What if I want to develop a more sophisticated movie palate? That won’t happen if I only watch movies that are “safe”. Is it possible that my algorithm is limiting my movie options by guiding me away from movies that might expand my taste? My algorithm suggests that because I “really liked” Rocky I & II, I’ll “really like” Rocky III as well. While that’s probably a true statement, the movie won’t surprise me. I’ll enjoy the movie because it is a variation of a comfortable and enjoyable formula.

By the same token, I don’t want to start watching a bunch of movies that I don’t “really like” in the name of expanding my comfort zone. I do, however, want to change the trajectory of my movie taste. In the end, perhaps it’s an algorithm design issue. Perhaps, I need to step back and define what I want my algorithm to do. It should be able to walk and chew gum at the same time.

I mentioned that I use Roger Ebert’s reviews because he seemed to “really like” many of the same movies that I “really liked”. It’s important to note that, over his lifetime, Roger Ebert “really liked” many more movies than I have. Many of those movies are outside my “really like” comfort zone. Perhaps I should aspire to “really like” the movies that Ebert did rather than take comfort that Ebert “really liked” the movies that I did.

 

A Movie Watch List is Built by Thinking Fast and Slow

In early 2012 I read a book by Daniel Kahneman titled Thinking, Fast and Slow. Kahneman is a psychologist who studies human decision making and, more precisely, the thinking process. He suggests that the human mind has two thinking processes. The first is the snap judgement that evolved to identify threats and react to them quickly in order to survive; he calls this “thinking fast”. The second is the rational thought process that weighs alternatives and evidence before reaching a decision; this he calls “thinking slow”. In the book, Kahneman also discusses what he calls the “law of least effort”: the mind naturally gravitates to the easiest solution or action rather than to the more reliable, evidence-based one. He suggests that the mind is most subject to the “law of least effort” when it is fatigued, which more often than not leads to less than satisfactory decisions.

How we select the movies we watch, I believe, is generally driven by the “law of least effort”. For most of us, movie watching is a leisure activity. Other than on social occasions, we watch movies when we are too tired to do anything else in our productive lives. Typically, what we watch is driven by what’s available at the time we decide to watch. From the movies available, we pick whatever seems like a movie we’d like at that moment. We choose by “thinking fast”. Sometimes we are happy with our choice. Other times, we get halfway through the movie and start wondering, over-optimistically I might add, if this dreadful movie will ever be over.

It doesn’t have to be that way. One tool I use is a Movie Watch List that I update each week using a “thinking slow” process. My current watch list can be found on the sidebar under Ten Movies on My Watch List This Week. Since you may read this blog entry sometime in the future, here’s the watch list I’ll be referring to today:

Ten Movies On My Watch List This Week
As Of March 22, 2017

| Movie Title | Release Year | Where Available | Probability I Will “Really Like” |
| --- | --- | --- | --- |
| Fight Club | 1999 | Starz | 84.8% |
| Amélie | 2002 | Netflix – Streaming | 72.0% |
| Hacksaw Ridge | 2016 | Netflix – DVD | 71.5% |
| Emigrants, The | 1972 | Warner Archive | 69.7% |
| Godfather: Part III, The | 1990 | Own DVD | 68.7% |
| Pride and Prejudice | 1940 | Warner Archive | 67.3% |
| Steel Magnolias | 1989 | Starz | 67.1% |
| Paper Moon | 1973 | HBO | 63.4% |
| Confirmation | 2016 | HBO | 57.0% |
| Beauty and the Beast | 2017 | Movie Theater | 36.6% |

The movies that make it to this list are carefully selected from the movies available in the coming week on the viewing platforms I can access. I use my algorithm to guide me toward movies with a high “really like” probability. I determine who I’m likely to watch movies with during the upcoming week; if I’m going to watch movies with others, I make sure there are movies on the list that they might like. And, finally, I do some “thinking fast” and identify the movies I really want to see and the movies that, instinctively, I am reluctant to see.
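In code terms, the weekly mechanics boil down to a filter and a sort. This is only an illustrative sketch, with invented field names and sample data standing in for my actual database:

```python
# A minimal sketch of building a weekly Watch List: keep movies available on
# platforms I already pay for, sort by "really like" probability, take ten.
MY_PLATFORMS = {"Starz", "HBO", "Netflix - Streaming", "Netflix - DVD", "Own DVD"}

candidates = [
    {"title": "Fight Club", "platform": "Starz", "probability": 0.848},
    {"title": "Amelie", "platform": "Netflix - Streaming", "probability": 0.720},
    {"title": "Paper Moon", "platform": "HBO", "probability": 0.634},
    {"title": "Some Pay-Per-View Title", "platform": "On Demand", "probability": 0.800},
]

watch_list = sorted(
    (m for m in candidates if m["platform"] in MY_PLATFORMS),
    key=lambda m: m["probability"],
    reverse=True,
)[:10]

for movie in watch_list:
    print(f'{movie["title"]}: {movie["probability"]:.1%} on {movie["platform"]}')
```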

The movies on my list above in green are those movies that I really want to see. The movies in turquoise are those movies I’m indifferent to but are highly recommended by the algorithm. The movies in red are movies that I’m reluctant to see.

So, you may ask, why do I have movies that I don’t want to see on my watch list? Well, it’s because I’m the Mad Movie Man. These are movies that my algorithm suggests have a high “really like” probability. In the case of Fight Club, for example, I’ve seen the movie before and was turned off by the premise. On the other hand, it is the movie that my algorithm, based on highly credible data, indicates is the surest “really like” bet of all the movies I haven’t seen in the last 15 years. Either my memory is faulty, or my tastes have changed, or there is a flaw in my algorithm or in the data coming from the websites I use. It may just be that it falls among the 15% of movies I won’t like. So, I put these movies on my list because I need to know why the mismatch exists. I have to admit, though, that it is hard getting these red movies off the list because I often succumb to the “law of least effort” and watch another movie I’d much rather see.

Most of our family is gathering together in the coming week, so Beauty and the Beast and Hacksaw Ridge are family movie candidates. In case my wife and I watch a movie together this week, Amélie, Pride and Prejudice, and Steel Magnolias are on the list.

The point in all this is that by having a Watch List of movies with a high “really like” probability you are better equipped to avoid the “law of least effort” trap and get more enjoyment out of your leisure time movie watching.

 

The Art of Selecting “Really Like” Movies: Oscar Provides a Helping Hand

Sunday is Oscar night! From my perspective, the night is a little bittersweet. The movies that have been nominated offer up “really like” prospects to watch in the coming months. That’s a good thing. Oscar night, though, also signals the end of the best time of the year for new releases. Between now and November, there won’t be much more than a handful of new Oscar-worthy movies released to the public. That’s a bad thing. There is only a 35.8% chance I will “really like” a movie that doesn’t earn a single Academy Award nomination. On the other hand, a single minor nomination increases the “really like” probability to 56%, and if a movie wins one of the major awards (Best Picture, Director, Actor, Actress, Screenplay), the probability increases to 69.7%.
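Put as a simple lookup, that historical relationship is just the three figures quoted above (the tier names are my own shorthand):

```python
# Approximate chance I will "really like" a movie, by Academy Award performance,
# using the percentages quoted above.
OSCAR_TIER_PROBABILITY = {
    "no nominations": 0.358,
    "minor nomination": 0.56,
    "major award win": 0.697,  # Best Picture, Director, Actor, Actress, or Screenplay
}

print(OSCAR_TIER_PROBABILITY["minor nomination"])  # 0.56
```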

At the end of last week’s post, I expressed a desire to come up with a “really like” movie indicator that is independent of the website data-driven indicators. The statistical significance of Academy Award performance would seem to provide the perfect solution. All movies released over the past 90 years have been considered for Oscar nominations, so a movie released in 1936 is statistically equivalent to a movie released in 2016 in terms of Academy Award performance.

By using the Total # of Ratings Quintiles introduced last week, credibility weights can be assigned to each quintile to allocate between the website data-driven probabilities and the Oscar performance probabilities. These ten movies, each last seen more than 15 years ago, illustrate how the allocation works.

My Top Ten Seen Before Movie Prospects
Not Seen in Last 15 Years

| Movie Title | Total Ratings Quintile | Website Driven Probability | Oscar Driven Probability | Net “Really Like” Probability |
| --- | --- | --- | --- | --- |
| Deer Hunter, The | 4 | 97.1% | 73.8% | 88.5% |
| Color Purple, The | 4 | 97.9% | 69.3% | 87.4% |
| Born on the Fourth of July | 4 | 94.0% | 73.8% | 86.6% |
| Out of Africa | 4 | 94.0% | 73.8% | 86.6% |
| My Left Foot | 3 | 94.0% | 73.8% | 83.9% |
| Coal Miner’s Daughter | 3 | 97.9% | 69.3% | 83.6% |
| Love Story | 3 | 92.7% | 72.4% | 82.6% |
| Fight Club | 5 | 94.0% | 55.4% | 81.9% |
| Tender Mercies | 2 | 94.0% | 73.8% | 81.2% |
| Shine | 3 | 88.2% | 73.8% | 81.0% |

The high degree of credible website data in Quintiles 4 and 5 weights the Net Probability closer to the website-driven probability. The Quintile 3 movies are weighted 50/50, and the resulting Net Probability ends up at the midpoint between the website-driven probability and the Oscar-driven probability. The movie in Quintile 2, Tender Mercies, which has a less credible website-driven probability, tilts closer to the Oscar-driven probability.
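Here is a sketch of that credibility weighting. The quintile weights are approximations I backed out of the table above (only the 50/50 weight for Quintile 3 is stated explicitly), so treat them as illustrative rather than as the model’s actual parameters:

```python
# Blend the website-driven and Oscar-driven probabilities using a credibility
# weight that rises with the Total # of Ratings Quintile. Weights are rough
# estimates reverse-engineered from the table above, not published values.
CREDIBILITY_WEIGHT = {1: 0.10, 2: 0.35, 3: 0.50, 4: 0.65, 5: 0.70}

def net_probability(quintile, website_prob, oscar_prob):
    w = CREDIBILITY_WEIGHT[quintile]
    return w * website_prob + (1 - w) * oscar_prob

# Tender Mercies (Quintile 2) tilts toward the Oscar-driven 73.8%...
print(round(net_probability(2, 0.940, 0.738), 3))  # ~0.81
# ...while My Left Foot (Quintile 3) lands at the midpoint of 94.0% and 73.8%.
print(round(net_probability(3, 0.940, 0.738), 3))  # 0.839
```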

The concern I raised last week about the “really like” viability of older movies I’ve never seen before goes away with this change. Take a look at my revised older movie top ten now.

My Top Ten Never Seen Movie Prospects
Never Seen Movies = > Release Date + 6 Months

| Movie Title | Total Ratings Quintile | Website Driven Probability | Oscar Driven Probability | Net “Really Like” Probability |
| --- | --- | --- | --- | --- |
| Yearling, The | 1 | 42.1% | 73.8% | 71.4% |
| More the Merrier, The | 1 | 26.9% | 73.8% | 70.2% |
| 12 Angry Men (1997) | 1 | 42.1% | 69.3% | 67.2% |
| Lili | 1 | 26.9% | 69.3% | 66.0% |
| Sleuth | 1 | 42.1% | 66.8% | 64.9% |
| Of Mice and Men (1939) | 1 | 42.1% | 66.8% | 64.9% |
| In a Better World | 1 | 41.5% | 66.8% | 64.9% |
| Thousand Clowns, A | 1 | 11.8% | 69.3% | 64.9% |
| Detective Story | 1 | 11.8% | 69.3% | 64.9% |
| Body and Soul | 1 | 11.8% | 69.3% | 64.9% |

Strong Oscar performing movies that I’ve never seen before become viable prospects. Note that all of these movies are Quintile 1 movies. Because of their age and lack of interest from today’s movie website visitors, these movies would never achieve enough credible ratings data to become recommended movies.

There is now an ample supply of viable, Oscar-worthy, “really like” prospects to hold me over until next year’s Oscar season. Enjoy your Oscar night in La La Land.

 

The Art of Selecting “Really Like” Movies: Older Never Before Seen

Last week I stated in my article that I could pretty much identify whether a movie has a good chance of being a “really like” movie within six months of its release. If you need any further evidence, here are my top ten movies that I’ve never seen that are older than six months.

My Top Ten Never Seen Movie Prospects
Never Seen Movies = > Release Date + 6 Months

| Movie Title | Last Data Update | Release Date | Total # of Ratings | “Really Like” Probability |
| --- | --- | --- | --- | --- |
| Hey, Boo: Harper Lee and ‘To Kill a Mockingbird’ | 2/4/2017 | 5/13/2011 | 97,940 | 51.7% |
| Incendies | 2/4/2017 | 4/22/2011 | 122,038 | 51.7% |
| Conjuring, The | 2/4/2017 | 7/19/2013 | 241,546 | 51.7% |
| Star Trek Beyond | 2/4/2017 | 7/22/2016 | 114,435 | 51.7% |
| Pride | 2/4/2017 | 9/26/2014 | 84,214 | 44.6% |
| Glen Campbell: I’ll Be Me | 2/9/2017 | 10/24/2014 | 105,751 | 44.6% |
| Splendor in the Grass | 2/5/2017 | 10/10/1961 | 246,065 | 42.1% |
| Father of the Bride | 2/5/2017 | 6/16/1950 | 467,569 | 42.1% |
| Imagine: John Lennon | 2/5/2017 | 10/7/1998 | 153,399 | 42.1% |
| Lorenzo’s Oil | 2/5/2017 | 1/29/1993 | 285,981 | 42.1% |

The movies with a high “really like” probability in this group have already been watched. Of the remaining movies, three are 50/50 and the rest have the odds stacked against them. In other words, if I watch all ten movies, I probably won’t “really like” half of them. The dilemma, of course, is that I probably would “really like” the other half. The reality is that I won’t watch any of these ten movies as long as there are movies I’ve already seen with better odds. Is there a way to improve the odds for any of these ten movies?

You’ll note that all ten movies have probabilities based on fewer than 500,000 ratings. Will some of these movies improve their probabilities as they receive more ratings? Maybe. Maybe not. To explore this possibility further, I divided my database into quintiles based on the total number of ratings. The quintile with the most ratings, the most credible quintile, provides results that define the optimal performance of my algorithm.

Quintile 5
# Ratings Range: > 2,872,053

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 152 | 134 | 88% | 8.6 | 8.5 | -0.1 |
| Movies Seen Once | 246 | 119 | 48% | 7.5 | 6.9 | -0.7 |
| All Movies in Range | 398 | 253 | 64% | 7.9 | 7.5 | |

All of the movies in Quintile 5 have more than 2,872,053 ratings. My selection of movies that I had seen before is clearly better than my selection of movies I watched for the first time. This better selection is because the algorithm results led me to the better movies and my memory did some additional weeding. My takeaway is that, when considering movies I’ve never seen before, I should put my greatest trust in the algorithm when the movie falls in this quintile.

Let’s look at the next four quintiles.

Quintile 4
# Ratings Range: 1,197,745 to 2,872,053

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 107 | 85 | 79% | 8.3 | 8.3 | 0.1 |
| Movies Seen Once | 291 | 100 | 34% | 7.1 | 6.4 | -0.7 |
| All Movies in Range | 398 | 185 | 46% | 7.4 | 6.9 | |

Quintile 3
# Ratings Range: 516,040 to 1,197,745

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 122 | 93 | 76% | 7.8 | 8.0 | 0.2 |
| Movies Seen Once | 278 | 102 | 37% | 7.1 | 6.6 | -0.6 |
| All Movies in Range | 400 | 195 | 49% | 7.3 | 7.0 | |

Quintile 2
# Ratings Range: 179,456 to 516,040

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 66 | 46 | 70% | 7.4 | 7.5 | 0.2 |
| Movies Seen Once | 332 | 134 | 40% | 7.0 | 6.4 | -0.6 |
| All Movies in Range | 398 | 180 | 45% | 7.1 | 6.6 | |

Quintile 1
# Ratings Range: < 179,456

| | # of Movies | # “Really Like” Movies | % “Really Like” Movies | Proj. Avg. Rating All Sites | My Avg Rating | My Rating to Proj. Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| Movies Seen More than Once | 43 | 31 | 72% | 7.0 | 7.5 | 0.5 |
| Movies Seen Once | 355 | 136 | 38% | 6.9 | 6.2 | -0.7 |
| All Movies in Range | 398 | 167 | 42% | 6.9 | 6.4 | |

Look at the progression of the algorithm projections as the quintiles get smaller. The gap between the movies seen more than once and those seen only once narrows as the number of ratings gets smaller. Notice that the difference between my ratings and the projected ratings for Movies Seen Once is fairly constant across all quintiles, either -0.6 or -0.7. But for Movies Seen More than Once, the difference grows more positive as the number of ratings gets smaller. This suggests that, for Movies Seen More than Once, the higher-than-expected ratings I give movies in Quintiles 1 and 2 are driven primarily by my memory of the movies rather than by the algorithm.
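That comparison is straightforward to compute directly. Here is a sketch, assuming each movie record carries a total ratings count, a projected rating, my rating, and a seen-before flag; all of the field names and the sample data are invented for the example:

```python
from statistics import mean

def quintile_gaps(movies, num_quintiles=5):
    """For each total-ratings quintile (1 = fewest ratings), compare my average
    rating to the projected rating, split by whether I had seen the movie before."""
    ranked = sorted(movies, key=lambda m: m["num_ratings"])
    size = len(ranked) // num_quintiles
    for q in range(num_quintiles):
        chunk = ranked[q * size:] if q == num_quintiles - 1 else ranked[q * size:(q + 1) * size]
        for seen, label in ((True, "seen before"), (False, "seen once")):
            group = [m for m in chunk if m["seen_before"] is seen]
            if group:
                gap = mean(m["my_rating"] for m in group) - mean(m["projected"] for m in group)
                print(f"Quintile {q + 1} ({label}): my rating minus projected = {gap:+.1f}")

# Tiny invented sample just to show the call; the real database has ~2,000 movies.
sample = [
    {"num_ratings": 150_000, "projected": 7.0, "my_rating": 7.6, "seen_before": True},
    {"num_ratings": 400_000, "projected": 7.1, "my_rating": 6.5, "seen_before": False},
    {"num_ratings": 1_500_000, "projected": 8.0, "my_rating": 8.1, "seen_before": True},
    {"num_ratings": 3_000_000, "projected": 7.5, "my_rating": 6.9, "seen_before": False},
]
quintile_gaps(sample, num_quintiles=2)
```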

What does this mean for my top ten never-before-seen movies listed above? All of the top ten are in Quintile 1 or 2. As they grow into the higher quintiles, some may emerge with higher “really like” probabilities. Certainly Star Trek Beyond, which is only 7 months old, can be expected to grow into the higher quintiles. But what about Splendor in the Grass, which was released in 1961 and, at 55 years old, might not move into Quintile 3 until another 55 years pass?

This suggests that a secondary movie-quality indicator is needed, one that is separate from the movie recommender sites already in use. It sounds like I’ve just added another project to my 2017 “really like” project list.

 

 

The Art of Selecting “Really Like” Movies: New Movies

I watch a lot of movies, a fact that my wife, and occasionally my children, like to remind me of. Unlike the average, non-geeky movie fan, though, I am constantly analyzing the process I go through to determine which movies I watch. I don’t like to watch mediocre, or worse, movies. I’ve pretty much eliminated bad movies from my selections. But every now and then a movie I “like” rather than “really like” will get past my screen.

Over the next three weeks I’ll outline the steps I’m taking this year to improve my “really like” movie odds. Starting this week with New Movies, I’ll lay out a focused strategy for three different types of movie selection decisions.

The most challenging “really like” movie decision I make is which movies that I’ve never seen before are likely to be “really like” movies. There is only a 39.3% chance that watching a movie I’ve never seen before will result in a “really like” experience. My goal is to improve those odds by the end of the year.

The first step I’ve taken is to separate movies I’ve seen before from movies I’ve never seen in establishing my “really like” probabilities. As a frame of reference, there is a 79.5% chance that I will “really like” a movie I’ve seen before. By setting my probabilities for movies I’ve never seen off of the 39.3% baseline, I have created a tighter screen for those movies. This should result in my watching fewer never-before-seen movies than I’ve typically watched in previous years. Of the 20 movies I’ve watched so far this year, only two were never-before-seen movies.

The challenge in selecting never-before-seen movies is that, because I’ve watched close to 2,000 movies over the last 15 years, I’ve already watched the “cream of the crop” from those years. From 2006 to 2015, there were 331 movies that I rated as “really like” movies; that is 33 movies a year, or fewer than 3 a month. Last year I watched 109 movies that I had never seen before. So, except for the roughly 33 new movies that came out last year that, statistically, might be “really like” movies, I watched 76 movies that didn’t have a great chance of being “really like” movies.

Logically, the probability of selecting a “really like” movie that I’ve never seen before should be highest for new releases. I just haven’t seen that many of them. I’ve only seen 6 movies that were released in the last six months and I “really liked” 5 of them. If, on average, there are 33 “really like” movies released each year, then, statistically, there should be a dozen “really like” movies released in the last six months that I haven’t seen yet. I just have to discover them. Here is my list of the top ten new movie prospects that I haven’t seen yet.

My Top Ten New Movie Prospects
New Movies = < Release Date + 6 Months

| Movie Title | Release Date | Last Data Update | “Really Like” Probability |
| --- | --- | --- | --- |
| Hacksaw Ridge | 11/4/2016 | 2/4/2017 | 94.9% |
| Arrival | 11/11/2016 | 2/4/2017 | 94.9% |
| Doctor Strange | 11/4/2016 | 2/6/2017 | 78.9% |
| Hidden Figures | 1/6/2017 | 2/4/2017 | 78.7% |
| Beatles, The: Eight Days a Week | 9/16/2016 | 2/4/2017 | 78.7% |
| 13th | 10/7/2016 | 2/4/2017 | 78.7% |
| Before the Flood | 10/30/2016 | 2/4/2017 | 51.7% |
| Fantastic Beasts and Where to Find Them | 11/18/2016 | 2/4/2017 | 51.7% |
| Moana | 11/23/2016 | 2/4/2017 | 51.7% |
| Deepwater Horizon | 9/30/2016 | 2/4/2017 | 45.4% |
| Fences | 12/25/2016 | 2/4/2017 | 45.4% |

Based on my own experience, I believe you can identify most of the new movies that will be “really like” movies within 6 months of their release, which is how I’ve defined “new” for this list. I’m going to test this theory this year.

In case you are interested, here is the ratings data driving the probabilities.

My Top Ten New Movie Prospects
Movie Site Ratings Breakdown

| Movie Title | # of Ratings All Sites | IMDB (Age 45+) * | Rotten Tomatoes ** | Criticker * | Movielens * | Netflix * |
| --- | --- | --- | --- | --- | --- | --- |
| Hacksaw Ridge | 9,543 | 8.2 | CF 86% | 8.3 | 8.3 | 8.6 |
| Arrival | 24,048 | 7.7 | CF 94% | 8.8 | 8.1 | 9.0 |
| Doctor Strange | 16,844 | 7.7 | CF 90% | 8.2 | 8.3 | 7.8 |
| Hidden Figures | 7,258 | 8.2 | CF 92% | 7.7 | 7.3 | 8.2 |
| Beatles, The: Eight Days a Week | 1,689 | 8.2 | CF 95% | 8.0 | 7.3 | 8.0 |
| 13th | 295,462 | 8.1 | CF 97% | 8.3 | 7.5 | 8.0 |
| Before the Flood | 1,073 | 7.8 | F 70% | 7.6 | 8.2 | 7.8 |
| Fantastic Beasts and Where to Find Them | 14,307 | 7.5 | CF 73% | 7.3 | 6.9 | 7.6 |
| Moana | 5,967 | 7.7 | CF 95% | 8.4 | 8.0 | 7.0 |
| Deepwater Horizon | 40,866 | 7.1 | CF 83% | 7.8 | 7.6 | 7.6 |
| Fences | 4,418 | 7.6 | CF 95% | 7.7 | 7.1 | 7.2 |

\* All ratings except Rotten Tomatoes are calibrated to a 10.0 scale.
\** CF = Certified Fresh, F = Fresh

Two movies, Hacksaw Ridge and Arrival, are already probable “really like” movies and should be selected to watch when available. The # of Ratings All Sites is a key column. The ratings from Movielens and Netflix need volume before they can credibly reach their true level; until there is a credible amount of data, the rating you get is closer to what an average movie would get. A movie like Fences, at 4,418 ratings, hasn’t reached the critical mass needed to migrate to the higher ratings I would expect it to reach. Deepwater Horizon, on the other hand, with 40,866 ratings, has reached a fairly credible level and may not improve upon its current probability.
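What is happening with those low-volume ratings is, in effect, credibility shrinkage: with few ratings, a site’s estimate stays pulled toward its overall average and only drifts toward the movie’s true level as volume grows. Here is a toy illustration of the idea; the site average of 7.0 and the prior strength of 10,000 ratings are arbitrary assumptions, not numbers from any of these sites:

```python
def shrunk_rating(observed_avg, num_ratings, site_avg=7.0, prior_strength=10_000):
    """Blend a movie's observed average toward a site-wide average.
    The more ratings a movie has, the more its own average dominates."""
    weight = num_ratings / (num_ratings + prior_strength)
    return weight * observed_avg + (1 - weight) * site_avg

# Fences-like case: few ratings, so the estimate still sits near the site average.
print(round(shrunk_rating(8.0, 4_418), 2))    # ~7.31
# Deepwater Horizon-like case: more ratings, so the estimate is mostly its own.
print(round(shrunk_rating(7.6, 40_866), 2))   # ~7.48
```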

I’m replacing my monthly forecast on the sidebar of this website with the top ten new movie prospects exhibit displayed above. I think it is a better reflection of the movies that have the best chance of being “really like” movies. Feel free to share any comments you might have.

 

Create, Test, Analyze, and Recreate

Apple’s iPhone just turned 10 years old. Why has it been such a successful product? It might be because the product hasn’t stayed static. The latest version is the iPhone 7 Plus. As a product, it is constantly reinventing itself to improve its utility. It is always fresh. Apple, like most producers of successful products, probably follows a process whereby they:

  1. Create.
  2. Test what they’ve created.
  3. Analyze the results of their tests.
  4. Recreate.

They never dust off their hands and say, “My job is done.”

Now, I won’t be so presumptuous as to claim to have created something as revolutionary as the iPhone. But, regardless of how small your creation is, its success requires you to follow the same steps outlined above.

My post last week outlined the testing process I put my algorithm through each year. This week I will provide some analysis and take some steps toward a recreation. The result of my test was that using my “really like” movie selection system significantly improved the overall quality of the movies I watch. On the negative side, the test showed that once you pass some optimal number of movies in a year, the additional movies you watch are of diminishing quality as the remaining pool of “really like” movies shrinks.

A deeper dive into these results begins to clarify the key issues. Separating movies that I’ve seen at least twice from those that were new to me is revealing.

| | Seen More than Once, 1999 to 2001 | Seen More than Once, 2014 to 2016 | Seen Once, 1999 to 2001 | Seen Once, 2014 to 2016 |
| --- | --- | --- | --- | --- |
| # of Movies | 43 | 168 | 231 | 158 |
| % of Total Movies in Timeframe | 15.7% | 51.5% | 84.3% | 48.5% |
| IMDB Avg Rating | 7.6 | 7.6 | 6.9 | 7.5 |
| My Avg Rating | 8.0 | 8.4 | 6.1 | 7.7 |
| % Difference | 5.2% | 10.1% | -12.0% | 2.0% |

There is so much interesting data here I don’t know where to start. Let’s start with the notion that the best opportunity for a “really like” movie experience is a “really like” movie you’ve already seen. I’ve highlighted in teal the percentage by which My Avg Rating outperforms the IMDB Avg Rating in both timeframes. The fact that, from 1999 to 2001, I was able to watch movies that I “really liked” more than the average IMDB voter, without the assistance of any movie recommender website, suggests that memory of a “really like” movie is a pretty reliable “really like” indicator. The 2014 to 2016 results suggest that my “really like” system can help prioritize the movies that memory tells you that you will “really like” seeing again.

The data highlighted in red and blue clearly display the advantages of the “really like” movie selection system. It’s for the movies you’ve never seen that movie recommender websites are worth their weight in gold. With limited availability of movie websites from 1999 to 2001, my selection of new movies underperformed the IMDB Avg Rating by 12%, and those movies represented 84.3% of all the movies I watched during that timeframe. From 2014 to 2016 (the data in blue), my “really like” movie selection system recognized that there is a limited supply of new “really like” movies; as a result, less than half of the movies I watched in that timeframe were movies I’d never seen before. Of the new movies I did watch, there was a significant improvement over the 1999 to 2001 timeframe both in quality, as represented by the IMDB Avg Rating, and in my enjoyment of the movies, as represented by My Avg Rating.

Still, while the 2014 to 2016 new movies were significantly better than the new movies watched from 1999 to 2001, is it unrealistic to expect My Ratings to be better than IMDB by more than 2%? To gain some perspective on this question, I profiled the new movies I “really liked” in the 2014 to 2016 timeframe and contrasted them with the movies I didn’t “really like”.

Movies Seen Once, 2014 to 2016

| | “Really Liked” | Didn’t “Really Like” |
| --- | --- | --- |
| # of Movies | 116 | 42 |
| % of Total Movies in Timeframe | 73.4% | 26.6% |
| IMDB Avg Rating | 7.6 | 7.5 |
| My Avg Rating | 8.1 | 6.3 |
| “Really Like” Probability | 82.8% | 80.7% |

The probability results for these movies suggest that I should “really like” between 80.7% and 82.8% of the movies in the sample. I actually “really liked” 73.4%, not too far off the probability expectations. The IMDB Avg Rating for the movies I didn’t “really like” is only a tick lower than the rating for the “really liked” movies. Similarly, the “Really Like” Probability is only a tick lower for the Didn’t “Really Like” movies. My conclusion is that there is some, but not much, opportunity to improve selection of new movies through a more disciplined approach. The better approach would be to favor “really like” movies that I’ve seen before and give new movies more time for their data to mature.

Based on my analysis, here is my action plan (a short code sketch of steps 1 and 3 follows the list):

  1. Set separate probability standards for movies I’ve seen before and movies I’ve never seen.
  2. Incorporate the probability revisions into the algorithm.
  3. Set a minimum probability threshold for movies I’ve never seen before.
  4. When the supply of “really like” movies gets thin, only stretch for movies I’ve already seen and memory tells me I “really liked”.
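Here is a minimal sketch of what steps 1 and 3 might look like in practice; the threshold values are placeholders chosen for illustration, not the standards I will ultimately set:

```python
# Separate "really like" screens for movies I've seen before and movies I haven't.
# Baseline chances from the analysis above: 79.5% for seen-before, 39.3% for never-seen.
SEEN_BEFORE_MINIMUM = 0.60   # placeholder threshold
NEVER_SEEN_MINIMUM = 0.70    # placeholder: a deliberately tighter screen

def passes_screen(probability, seen_before):
    minimum = SEEN_BEFORE_MINIMUM if seen_before else NEVER_SEEN_MINIMUM
    return probability >= minimum

print(passes_screen(0.65, seen_before=True))    # True: seen before, clears the 60% bar
print(passes_screen(0.65, seen_before=False))   # False: never seen, needs 70%
```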

Create, test, analyze and recreate.

 

A New Year’s Ritual: Looking Back to Help Move Forward

I’m a big fan of the New Year’s ritual of taking stock of where you’ve been and resolving to make some adjustments to make the coming year better. This New Year marks the completion of my third year of working with an algorithm to help me select better movies to watch. Since establishing my database, I’ve used each New Year to take two snapshots of my viewing habits.

The first snapshot is of the movies that have met the fifteen-year limit I’ve imposed on my database. This year it’s 2001 that is frozen in time. I became a user of IMDB in June 2000. That makes 2001 the first full year that I used a data-based resource to supplement my movie selection process, which at the time was still guided primarily by the weekly recommendations of Siskel & Ebert.

The second snapshot is of the data supporting the movie choices I made in 2016. By looking at a comparison of 2001 with 2016, I can gain an appreciation of how far I’ve come in effectively selecting movies. Since this is the third set of snapshots I’ve taken I can also compare 1999 with 2014 and 2000 with 2015, and all years with each other.

Here are the questions I had and the results of the analysis. In some instances, the analysis suggests additional targets for research.

Am I more effective now than I was before in selecting movies to watch?

There is no question that the creation of online movie recommender websites, and the systematic use of them to select movies, improves overall selection. The comparison below of the two snapshots for each of the last three years demonstrates significant improvement.

| Year | # of Movies | My Avg Rating | Year | # of Movies | My Avg Rating | % Rating Diff. |
| --- | --- | --- | --- | --- | --- | --- |
| 2001 | 109 | 6.0 | 2016 | 144 | 7.4 | 23.3% |
| 2000 | 106 | 6.9 | 2015 | 106 | 8.4 | 21.7% |
| 1999 | 59 | 6.4 | 2014 | 76 | 8.8 | 37.5% |
| 1999–2001 | 274 | 6.4 | 2014–2016 | 326 | 8.1 | 25.1% |

One area of concern is a pattern in the 2014 to 2016 data, though it could be random, suggesting a diminishing return in the overall quality of movies watched as the number of movies watched increases.

Am I more likely to watch movies I “really like”?

Again, the answer is a resounding “Yes”.

| Year | # of Movies | # “Really Liked” | % “Really Liked” |
| --- | --- | --- | --- |
| 1999 | 59 | 25 | 42.4% |
| 2000 | 106 | 50 | 47.2% |
| 2001 | 109 | 40 | 36.7% |
| 2014 | 76 | 76 | 100.0% |
| 2015 | 106 | 91 | 85.8% |
| 2016 | 144 | 100 | 69.4% |

The concern raised about diminishing returns from increasing the number of movies watched is in evidence here as well. In 2014 I “really liked” all 76 movies I watched. Is it worth my time to watch another 30 movies, as I did in 2015, if I will “really like” 15 of them? Maybe. Maybe not. Is it worth my while to watch an additional 68 movies, as I did in 2016, if I will “really like” only 24? Probably not.

How do I know that I am selecting better movies and not just rating them higher?

As a control, I’ve used the IMDB average rating as an objective measure of quality.

| Year | IMDB Avg Rating | My Avg Rating | Difference |
| --- | --- | --- | --- |
| 1999 | 7.0 | 6.4 | -0.6 |
| 2000 | 7.1 | 6.9 | -0.2 |
| 2001 | 6.9 | 6.0 | -0.9 |
| 2014 | 7.8 | 8.8 | +1.0 |
| 2015 | 7.6 | 8.4 | +0.8 |
| 2016 | 7.4 | 7.4 | 0.0 |

The average IMDB voter agrees that the movies I watched from 2014 to 2016 are much better than the movies I watched from 1999 to 2001. What is particularly interesting is that the movies I chose to watch from 1999 to 2001, without the benefit of any website recommending movies I’d personally like, were movies I ended up liking less than the average IMDB voter did. From 2014 to 2016, with the benefit of tools like Netflix, Movielens, and Criticker, I selected movies that I liked better than the average IMDB voter did. The 2016 results feed the diminishing-returns narrative, suggesting that the more movies I watch, the more my overall ratings migrate toward average.

My 2017 “Really Like” resolution.

My selection algorithm is working effectively. But the combination of a diminishing number of “really like” movies that I haven’t seen in the last fifteen years and my desire to grow the size of my database may be causing me to reach for movies that are less likely to result in a “really like” experience. Therefore, I resolve to establish within the next month a minimum standard below which I will not reach.

Now that’s what New Year’s is all about, the promise of an even better “really like” movie year.

How Do You Know a Tarnished Penny Isn’t a Tarnished Quarter?

One of my first posts on this site was The Shiny Penny, in which I espoused the virtues of older movies. I still believe that, and yet here I am, almost eleven months later, wondering whether my movie selection algorithm does a good enough job of surfacing those “tarnished quarters”. A more accurate statement of the problem is that older movies generate less data for the movie websites I use in my algorithm, which in turn produces fewer recommended movies.

Let me explain the issue by comparing IMDB voting with my own ratings for each movie decade. Since I began developing my algorithm around 2010, I’m going to use 2010 as the year I began disciplining my movie choices with an algorithm. Also, you might recall from previous posts that my database consists of movies I’ve watched in the last fifteen years. Each month I remove movies from the database that fall outside the fifteen years and make them available for me to watch again. One other clarification: I use the IMDB ratings for ages 45+ to better match my demographic.

To familiarize you with the format I’ll display for each decade, here’s a look at the 2010’s:

Database Movies Released in the 2010’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 340 | 100.0% | 10,369 | 7.3 | 7.3 |
| Viewed Before Algorithm | 0 | 0.0% | | | |

The 340 movies I’ve seen from the 2010’s represent 17.2% of all the movies I’ve seen in the last 15 years, and there are still three more years to go in the decade. If recommended movies were distributed evenly across all nine decades, this percentage would be closer to 11%. Because the “shiny pennies” are the most available to watch, there is a tendency to watch more of the newer movies. I also believe that many newer movies fit the selection screen before their data matures but might not fit the screen after the data matures. The Avg # of Voters column is an indicator of how mature the data is. Keep this in mind as we look at subsequent decades.

The 2000’s represent my least disciplined movie watching. 38.4% of all of the movies in the database come from this decade. The decision to watch specific movies was driven primarily by what was available rather than what was recommended.

Database Movies Released in the 2000’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 81 | 10.6% | 10,763 | 7.2 | 6.8 |
| Viewed Before Algorithm | 680 | 89.4% | 10,405 | 7.1 | 6.4 |

One thing to remember about movies in this decade is that only movies watched in 2000 and 2001 have dropped out of the database. As a result, only 10.6% of the movies were selected to watch with some version of the selection algorithm.

The next three decades represent the reliability peak in terms of the algorithm.

Database Movies Released in the 1990’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 115 | 46.7% | 18,179 | 7.4 | 8.1 |
| Viewed Before Algorithm | 131 | 53.3% | 11,557 | 7.2 | 7.0 |

Database Movies Released in the 1980’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 68 | 44.4% | 14,025 | 7.5 | 7.6 |
| Viewed Before Algorithm | 85 | 55.6% | 12,505 | 7.4 | 7.0 |

Database Movies Released in the 1970’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 38 | 38.0% | 18,365 | 7.8 | 7.6 |
| Viewed Before Algorithm | 62 | 62.0% | 9,846 | 7.5 | 6.5 |

Note that the average number of voters per movie is higher for these three decades than for the movies released after 2000. In each decade there is a growing gap in the number of voters per movie between the movies recommended by the algorithm and those seen before I used the algorithm. This may be indicative of the amount of data needed to produce a recommendation. You also see larger gaps in my enjoyment of the movies chosen through the disciplined selection process compared with those seen prior to the use of the algorithm. My theory is that younger movie viewers will only watch the classics from these decades, and as a result those are the movies that generate sufficient data for the algorithm to be effective.

When we get to the four oldest decades in the database, it becomes clear that the number of movies with enough data to fit the algorithm is minimal.

Database Movies Released in the 1960’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 23 | 20.0% | 14,597 | 8.0 | 8.3 |
| Viewed Before Algorithm | 92 | 80.0% | 6,652 | 7.7 | 6.6 |

Database Movies Released in the 1950’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 22 | 18.0% | 11,981 | 8.0 | 8.4 |
| Viewed Before Algorithm | 100 | 82.0% | 5,995 | 7.7 | 5.9 |

Database Movies Released in the 1940’s

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 21 | 22.1% | 8,021 | 8.0 | 7.9 |
| Viewed Before Algorithm | 74 | 77.9% | 4,843 | 7.8 | 6.5 |

Database Movies Released Pre-1940

| | # of Movies | % of Movies | Avg # of Voters | Avg. IMDB Rating | My Avg. Rating |
| --- | --- | --- | --- | --- | --- |
| Viewed After Algorithm | 7 | 14.0% | 12,169 | 8.0 | 7.5 |
| Viewed Before Algorithm | 43 | 86.0% | 4,784 | 7.9 | 6.2 |

The results are even more stark. For these oldest decades, today’s movie viewers and critics are drawn to the classics but probably not much else. It is clear that the selection algorithm is effective for movies with enough data. The problem is that the “really like” movies from these decades that don’t generate data don’t get recommended.

Finding tarnished quarters with a tool that requires data, when data diminishes as movies age, is a problem. Another observation is that the algorithm works best for movies released from the 1970’s to the 1990’s, probably because the data there is mature and plentiful. Is there value in letting the shiny pennies that look like quarters get a little tarnished before watching them?

Merry Christmas to all and may all of your movies seen this season be “really like” movies.

 

 
