Will I "Really Like" this Movie?

Navigating Movie Website Ratings to Select More Enjoyable Movies

Some Facts Are Not So Trivial

As I’ve mentioned before on these pages, I always pay a visit to the IMDB trivia link after watching a movie. Often I will find a fun but ultimately trivial fact such as the one I discovered after viewing Beauty and the Beast. According to IMDB, Emma Watson was offered the Academy Award winning role of Mia in La La Land but turned it down because she was committed to Beauty and the Beast. Coincidentally, the heretofore non-musical Ryan Gosling was offered the role of the Beast and turned it down because he was committed to that other musical, La La Land. You really can’t fault either of their decisions. Both movies have been huge successes.

On Tuesday I watched the “really like” 1935 film classic Mutiny on the Bounty. My visit to the trivia pages for this film unearthed facts that were more consequential than trivial. For example, it was the first movie about historically factual events, with actors playing historically factual people, to win the Academy Award for Best Picture. The previous eight winners were all based on fiction. Real life became a viable source for great films, as the next two Best Picture winners, The Great Ziegfeld and The Life of Emile Zola, were also biographies. Interestingly, it would be another 25 years before another non-fictional film, Lawrence of Arabia, would win Best Picture.

Mutiny on the Bounty also has the distinction of being the only movie ever to have three actors nominated for Best Actor: Clark Gable, Charles Laughton, and Franchot Tone. Everyone expected one of them to win but, after splitting the votes amongst themselves, none of them did. Oscar officials vowed never to let that happen again. For the next Academy Awards, in 1937, they created two new awards, for Actor and Actress in a Supporting Role. Since then, there have been only six other instances in which two actors from the same movie were nominated for Best Actor.

Before leaving Mutiny on the Bounty, there is one more non-trivial fact to relate. The characters of Captain Bligh and First Mate Fletcher Christian grow to hate each other over the course of the plot. To further that requisite hate, Irving Thalberg, one of the producers, purposely cast the overtly gay Charles Laughton as Bligh and the notorious homophobe Gable as Fletcher Christian. This crass manipulation of the actors’ prejudices seems to have worked: the hate between the two men was evident on the set and clearly translated to the screen. This kind of morally corrupt behavior was not uncommon in the boardrooms of the Hollywood studio system at the time.

Some other older Best Picture winners come with facts that are not trivial but consequential, either to the film industry or to the outside world:

  • It Happened One Night, another Clark Gable classic, became in 1935 the first of only three films to win the Oscar “grand slam”, that is, all five major awards: Best Picture, Director, Actor, Actress, and Screenplay. The other two are One Flew Over the Cuckoo’s Nest and Silence of the Lambs.
  • Gone with the Wind, along with being the first Best Picture winner filmed in color, is the longest movie, at four hours, to win Best Picture. Hattie McDaniel became the first black actor to be nominated for and win an Oscar, for her role in the film.
  • In Casablanca, there is a scene where the locals drown out the Nazi song “Watch on the Rhine” with their singing of the “Marseillaise”. In that scene you can see tears running down the cheeks of many of the locals. For many of these extras the tears were real since they were actual refugees from Nazi tyranny. Ironically, many of the Nazis in the scene were also German Jews who had escaped Germany.
  • According to IMDB, to prepare for his 1946 award-winning portrayal of an alcoholic in The Lost Weekend, “Ray Milland actually checked himself into Bellevue Hospital with the help of resident doctors, in order to experience the horror of a drunk ward. Milland was given an iron bed and he was locked inside the “booze tank.” That night, a new arrival came into the ward screaming, an entrance which ignited the whole ward into hysteria. With the ward falling into bedlam, a robed and barefooted Milland escaped while the door was ajar and slipped out onto 34th Street where he tried to hail a cab. When a suspicious cop spotted him, Milland tried to explain, but the cop didn’t believe him, especially after he noticed the Bellevue insignia on his robe. The actor was dragged back to Bellevue where it took him a half-hour to explain his situation to the authorities before he was finally released.”
  • In the 1947 film Gentlemen’s Agreement about anti-Semitism, according to IMDB, “The movie mentions three real people well-known for their racism and anti-Semitism at the time: Sen. Theodore Bilbo (D-Mississippi), who advocated sending all African-Americans back to Africa; Rep. John Rankin (D-Mississippi), who called columnist Walter Winchell  “the little kike” on the floor of the House of Representatives; and leader of “Share Our Wealth” and “Christian Nationalist Crusade” Gerald L. K. Smith, who tried legal means to prevent Twentieth Century-Fox from showing the movie in Tulsa. He lost the case, but then sued Fox for $1,000,000. The case was thrown out of court in 1951.”

One of the definitions of “trivia” is “an inessential fact; a trifle.” The fact that IMDB lists these items under its Trivia link does not make them trivia. The facts presented here either promoted creative growth in the film industry or made a significant statement about society. Some facts are not so trivial.

Sometimes When You Start To Go There You End Up Here

There are some weeks when I’m stumped as to what I should write about in this weekly trip to Mad Moviedom. Sometimes I’m in the middle of an interesting study that isn’t quite ready for publication. Sometimes an idea isn’t quite fully developed. Sometimes I have an idea but I find myself blocked as to how to present it. When I find myself in this position, one avenue always open to me is to create a quick study that might be halfway interesting.

This is where I found myself this week. I had ideas that weren’t ready to publish yet. So, my fallback study was going to be a quick study of which movie decades present the best “really like” viewing potential. Here are the results of my first pass at this:

“Really Like” Decades
Based on Number of “Really Like” Movies
As of April 6, 2017
Decade    Really Liked    Didn’t Really Like    Total    “Really Like” Probability
All            1,108              888            1,996
2010’s           232              117              349    60.9%
2000’s           363              382              745    50.5%
1990’s           175               75              250    62.0%
1980’s            97               60              157    58.4%
1970’s            56               49              105    54.5%
1960’s            60               55              115    53.9%
1950’s            51               78              129    46.6%
1940’s            55               43               98    55.8%
1930’s            19               29               48    46.9%

These results are mildly interesting. The 2010’s, 1990’s, 1980’s, and 1940’s are above-average decades for me. There is an unusually high number of movies in the sample that were released in the 2000’s. Remember that movies stay in my sample for 15 years from the year I last watched them. After 15 years they are removed from the sample and put back into the pool of movies available to watch again. The good movies get watched again, and the other movies are, hopefully, never seen again. Movies last seen after 2002 haven’t yet gone through this process of separating out the “really like” movies to be watched again and permanently weeding the didn’t-“really like” movies from the sample. The contrast between the 2000’s and the 2010’s is a good measure of the impact of undisciplined versus disciplined selection.

As I’ve pointed out in recent posts, I’ve made some changes to my algorithm. One of the big changes is that I’ve replaced the number of “really like” movies with the number of ratings those “really like” movies have received. After doing my decade study based on the number of movies, I realized I should have used the number-of-ratings method to be consistent with my new methodology.
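For the technically inclined, here is a minimal sketch of the difference between the two methods: the count-based version weights every movie equally, while the ratings-based version weights each movie by the number of website ratings it has received. The sample movies, field layout, and numbers below are hypothetical placeholders, not my actual database.

```python
# Sketch: a decade's "really like" probability computed two ways,
# by movie count and by total number of website ratings.
# The movies below are hypothetical placeholders.

movies = [
    # (decade, did I "really like" it?, number of website ratings)
    ("1990s", True,  3_486_019),
    ("1990s", False, 1_678_615),
    ("2000s", True,  3_023_706),
    ("2000s", False, 2_327_065),
    ("2000s", False, 1_500_000),
]

def decade_probability(movies, decade, weight_by_ratings=False):
    liked = not_liked = 0.0
    for dec, really_liked, n_ratings in movies:
        if dec != decade:
            continue
        weight = n_ratings if weight_by_ratings else 1
        if really_liked:
            liked += weight
        else:
            not_liked += weight
    total = liked + not_liked
    return liked / total if total else None

print(decade_probability(movies, "2000s"))                          # count-based
print(decade_probability(movies, "2000s", weight_by_ratings=True))  # ratings-weighted
```

Here are the results based on the new methodology: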

“Really Like” Decades
Based on Number of “Really Like” Ratings
As of April 6, 2017
Decade    Really Liked         Didn’t Really Like    Total                “Really Like” Probability
All       2,323,200,802        1,367,262,395         3,690,463,197
2010’s      168,271,890          166,710,270           334,982,160        57.1%
2000’s    1,097,605,373          888,938,968         1,986,544,341        56.6%
1990’s      610,053,403          125,896,166           735,949,569        70.8%
1980’s      249,296,289          111,352,418           360,648,707        65.3%
1970’s       85,940,966           25,372,041           111,313,007        67.7%
1960’s       57,485,708           15,856,076            73,341,784        68.0%
1950’s       28,157,933           23,398,131            51,556,064        59.5%
1940’s       17,003,848            5,220,590            22,224,438        67.4%
1930’s        9,385,392            4,517,735            13,903,127        64.6%

While the results are different, the big reveal is that 63.0% of the ratings are attached to “really like” movies, while only 55.5% of the movies themselves are “really like” movies. It starkly reinforces the impact of the law of large numbers. Movie website indicators of “really like” movies are more reliable when the number of ratings driving those indicators is larger. The following table illustrates this better:

“Really Like” Decades
Based on Average Number of “Really Like” Ratings per Movie
As of April 6, 2017
Decade    Really Liked      Didn’t Really Like    Total             % Difference (Liked vs. Didn’t)
All       2,096,751.63      1,539,709.90          1,848,929.46      36.2%
2010’s      725,309.87      1,424,874.10            959,834.27      -49.1%
2000’s    3,023,706.26      2,327,065.36          2,666,502.47      29.9%
1990’s    3,486,019.45      1,678,615.55          2,943,798.28      107.7%
1980’s    2,570,064.84      1,855,873.63          2,297,125.52      38.5%
1970’s    1,534,660.11        517,796.76          1,060,123.88      196.4%
1960’s      958,095.13        288,292.29            637,754.64      232.3%
1950’s      552,116.33        299,976.04            399,659.41      84.1%
1940’s      309,160.87        121,409.07            226,779.98      154.6%
1930’s      493,968.00        155,783.97            289,648.48      217.1%

With the exception of the 2010’s, the average number of ratings per movie is larger for the “really like” movies. In fact, they are dramatically different for the decades prior to 2000. My educated guess is that the post-2000 years will end up fitting the pattern of the other decades once those years mature.

So what is the significance of this finding? It clearly suggests that waiting to decide whether to see a new movie until a sufficient number of ratings have come in will produce a more reliable result. The unanswered question is how many ratings are enough.

The finding also reinforces the need to have something like Oscar performance to act as a second measure of quality for movies that will never have “enough” ratings for a reliable result.

Finally, the path from “there to here” is not always found on a map.

The Wandering Mad Movie Mind

Last week in my post I spent some time leading you through my thought process in developing a Watch List. There were some loose threads in that article that I’ve been tugging at over the last week.

The first thread was the high “really like” probability that my algorithm assigned to two movies, Fight Club and Amelie, that I “really” didn’t like the first time I saw them. It bothered me to the point that I took another look at my algorithm. Without boring you with the details, I had an “aha” moment and was able to reengineer my algorithm in such a way that I can now develop unique probabilities for each movie. Prior to this I was assigning the same probability to groups of movies with similar ratings. The result is a tighter range of probabilities clustered around the base probability. The base probability is defined as the probability that I would “really like” a movie randomly selected from the database. If you look at this week’s Watch List, you’ll notice that my top movie, The Untouchables, has a “really like” probability of 72.2%. In my revised algorithm that is a high probability movie. As my database gets larger, the extremes of the assigned probabilities will get wider.
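Without giving away the details of the actual formula, the flavor of the change can be sketched as a credibility-weighted blend: each movie’s own signal is shrunk toward the base probability, and the shrinkage eases as the data behind the movie grows. The weight formula, the constant k, and the numbers below are illustrative assumptions, not the values in my algorithm.

```python
# Illustrative sketch only: shrink a movie's own "really like" estimate
# toward the base probability, trusting the movie's data more as the
# number of ratings behind it grows.  The constant k, the base probability,
# and the example numbers are assumptions, not the actual algorithm's values.

def blended_probability(movie_estimate, n_ratings, base_probability=0.555, k=100_000):
    credibility = n_ratings / (n_ratings + k)
    return credibility * movie_estimate + (1 - credibility) * base_probability

print(round(blended_probability(0.90, 2_460), 3))      # thin data: stays near the base
print(round(blended_probability(0.90, 2_000_000), 3))  # heavy data: approaches 0.90
```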

One of the by-products of this change is that the rating assigned by Netflix is now the most dominant driver of the final probability. This is as it should be. Netflix has by far the largest database of any site I use and, because of this, it produces the most credible and reliable ratings of any of the rating websites. Which brings me back to Fight Club and Amelie. The probability for Fight Club went from 84.8% under the old formula to 50.8% under the new formula. Amelie went from 72.0% to 54.3%. On the other hand, a movie that I’m pretty confident I will like, Hacksaw Ridge, changed only slightly, from 71.5% to 69.6%.

Another thread I tugged at this week was in response to a question from one of the readers of this blog. The question was why Beauty and the Beast was earning a low “really like” probability of 36.6% when I felt there was a high likelihood that I was going to “really like” it. In fact, I saw the movie this past week and it turned out to be a “really like” instant classic. I rated it a 93 out of 100, which is a very high rating from me for a new movie. In my algorithm, new movies are underrated for two reasons. First, because they generate so few ratings in their early months (Netflix has only 2,460 ratings for Beauty and the Beast so far), the credibility of the movie’s own data is so small that the “really like” probability is driven almost entirely by the Oscar-performance part of the algorithm. That leads to the second reason: new movies haven’t been through the Oscar cycle yet, so their Oscar-performance probability is that of a movie that didn’t earn an Oscar nomination, or 35.8%. This is why Beauty and the Beast was at only a 36.6% “really like” probability on my Watch List last week.
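As a back-of-the-envelope illustration of why the number lands so close to the 35.8% floor: with only a couple of thousand ratings, the weight on the website-driven signal is tiny, so the blend is almost all Oscar-performance probability. The website estimate and the weight below are made-up numbers chosen only to show the effect, not values pulled from the algorithm.

```python
# Illustration only: a tiny credibility weight pins a brand-new release
# near the 35.8% "no nominations yet" probability.
# The website estimate (0.80) and the 1.8% weight are assumed, not actual.

oscar_probability = 0.358     # probability for a movie with no nominations
website_estimate = 0.80       # hypothetical, based on data too thin to trust
credibility_weight = 0.018    # tiny, because Netflix has only ~2,460 ratings so far

blended = (credibility_weight * website_estimate
           + (1 - credibility_weight) * oscar_probability)
print(round(blended, 3))      # ~0.366, i.e. the 36.6% on last week's Watch List
```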

I’ll leave you this week with a concern. As I mentioned above, Netflix is the cornerstone of my whole “really like” system. You can appreciate, then, my heart palpitations when it was announced a couple of weeks ago that Netflix is abandoning its five-star rating system in April. It is replacing it with a thumbs up or down rating with a percentage next to it, perhaps a little like Rotten Tomatoes. While I am keeping an open mind about the change, it has the potential of destroying the best movie recommender system in the business. If it does, I will be one “mad” movie man, and that’s not “crazy” mad.

A Movie Watch List is Built by Thinking Fast and Slow

In early 2012 I read a book by Daniel Kahneman titled Thinking, Fast and Slow. Kahneman is a psychologist who studies human decision making and, more precisely, the thinking process. He suggests that the human mind has two thinking processes. The first is the snap judgment that evolved to identify threats and react to them quickly in order to survive. He calls this “thinking fast”. The second is the rational thought process that weighs alternatives and evidence before reaching a decision. This he calls “thinking slow”. In the book, Kahneman discusses what he calls the “law of least effort”. He believes that the mind naturally gravitates to the easiest solution or action rather than to the more reliable, evidence-based solution. He suggests that the mind is most subject to the “law of least effort” when it is fatigued, which leads to less than satisfactory decision making more often than not.

How we select the movies we watch, I believe, is generally driven by the “law of least effort”. For most of us, movie watching is a leisure activity. Other than on social occasions, we watch movies when we are too tired to do anything else in our productive lives. Typically, the movies we watch are driven by what’s available at the time we decide to watch. From the movies available, we decide what seems like a movie we’d like at that moment in time. We choose by “thinking fast”. Sometimes we are happy with our choice. Other times, we get halfway through the movie and start wondering, over-optimistically I might add, if this dreadful movie will ever be over.

It doesn’t have to be that way. One tool I use is a Movie Watch List that I update each week using a “thinking slow” process. My current watch list can be found on the sidebar under Ten Movies on My Watch List This Week. Since you may read this blog entry sometime in the future, here’s the watch list I’ll be referring to today:

Ten Movies On My Watch List This Week
As Of March 22, 2017
Movie Title                   Release Year    Where Available        Probability I Will “Really Like”
Fight Club                    1999            Starz                  84.8%
Amélie                        2002            Netflix – Streaming    72.0%
Hacksaw Ridge                 2016            Netflix – DVD          71.5%
Emigrants, The                1972            Warner Archive         69.7%
Godfather: Part III, The      1990            Own DVD                68.7%
Pride and Prejudice           1940            Warner Archive         67.3%
Steel Magnolias               1989            Starz                  67.1%
Paper Moon                    1973            HBO                    63.4%
Confirmation                  2016            HBO                    57.0%
Beauty and the Beast          2017            Movie Theater          36.6%

The movies that make it to this list are carefully selected based on the movies that are available in the coming week on the viewing platforms I can access. I use my algorithm to guide me towards movies with a high “really like” probability. I determine who I’m likely to watch movies with during the upcoming week. If I’m going to watch movies with others, I make sure that there are movies on the list that those others might like. And, finally, I do some “thinking fast” and identify those movies that I really want to see and those movies that, instinctively, I am reluctant to see.
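Mechanically, the “thinking slow” part of the process is a filter-and-sort: keep the movies available on platforms I can reach this week, sort them by “really like” probability, and take the top ten. A minimal sketch, with made-up candidates and placeholder field names:

```python
# Sketch of the weekly Watch List filter.  The candidate list and the
# set of available platforms are hypothetical placeholders.

candidates = [
    {"title": "Fight Club", "platform": "Starz", "probability": 0.848},
    {"title": "Hacksaw Ridge", "platform": "Netflix - DVD", "probability": 0.715},
    {"title": "Paper Moon", "platform": "HBO", "probability": 0.634},
    {"title": "Some Unreachable Release", "platform": "Not Available", "probability": 0.900},
]

available_platforms = {"Starz", "Netflix - DVD", "HBO", "Movie Theater"}

def build_watch_list(candidates, platforms, size=10):
    viewable = [m for m in candidates if m["platform"] in platforms]
    viewable.sort(key=lambda m: m["probability"], reverse=True)
    return viewable[:size]

for movie in build_watch_list(candidates, available_platforms):
    print(f'{movie["title"]}: {movie["probability"]:.1%}')
```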

The movies on my list above in green are those movies that I really want to see. The movies in turquoise are those movies I’m indifferent to but are highly recommended by the algorithm. The movies in red are movies that I’m reluctant to see.

So, you may ask, why do I have movies that I don’t want to see on my watch list? Well, it’s because I’m the Mad Movie Man. These are movies that my algorithm suggests have a high “really like” probability. In the case of Fight Club, for example, I’ve seen the movie before and was turned off by the premise. On the other hand, it is a movie that my algorithm, based on highly credible data, indicates is the surest “really like” bet of all the movies I haven’t seen in the last 15 years. Either my memory is faulty, or my tastes have changed, or there is a flaw in my algorithm, or there is a flaw in the data coming from the websites I use. It may just be that it is among the 15% of movies I won’t like. So, I put these movies on my list because I need to know why the mismatch exists. I have to admit, though, that it is hard getting these red movies off the list because I often succumb to the “law of least effort” and watch another movie I’d much rather see.

Most of our family is gathering together in the coming week, so Beauty and the Beast and Hacksaw Ridge are family movie candidates. In case my wife and I watch a movie together this week, Amélie, Pride and Prejudice, and Steel Magnolias are on the list.

The point of all this is that, by having a Watch List of movies with a high “really like” probability, you are better equipped to avoid the “law of least effort” trap and get more enjoyment out of your leisure-time movie watching.

Playing Tag with Movielens, Redux

Last July I wrote an article introducing my use of tags in Movielens to organize my movies. I can’t impress upon you enough how useful this tool is to someone as manic about movies as I am.

Regular readers of this blog know that I’ve shifted my focus to Oscar-nominated movies. My research revealed that movies that haven’t received a single Academy Award nomination have only a 35.8% chance of being a “really like” movie. On the other hand, even a single minor nomination increases the “really like” odds to around 55%. My algorithm now incorporates Oscar recognition.

Based on this finding, I’ve altered my tagging strategy. I created an “Oscar” tag that I attach to any movie I run across that has received even a single nomination. Many of these movies are older without enough credible data in the movie ratings websites to earn reliable recommendations. The probabilities in my algorithm for these Quintile 1 & 2 movies are driven by their Oscar performance.

Movies that pique my interest that weren’t Oscar nominated are tagged separately. Now, because these movies have no Oscar nominations, a Quintile 1 or 2 movie is going to have a “really like” probability closer to the 35% mark that reflects its “no nomination” status. It can only climb to a high enough probability to be considered for my weekly watch list if it is highly recommended by the movie websites and it falls into a high enough credibility quintile that its Oscar status doesn’t matter much.

I apply one of two tags to non-Oscar nominated movies. If they have fewer than 25 ratings in Movielens, I tag them as “might like”. Realistically, they have no chance of being highly recommended in my algorithm until the number of ratings received from Movielens raters becomes more robust.

Those non-Oscar nominated movies that have more than 25 ratings are tagged as “prospect”. Movies with the “prospect” tag that are highly rated by the websites and have enough ratings to reach higher credibility quintiles can reach a “really like” probability high enough to be considered for the watch list. For example, a quintile 5 movie like The American President can earn a 75% “really like” probability even though it was never nominated for an Academy Award.

I also have created tags for movies I don’t want to see even though they are highly rated. If I’ve already seen a movie and I don’t want to see it again, I tag it “not again”. If I’ve never seen a movie but it’s just not for me, I tag it “not interested”. Movielens also has the capability of hiding movies that you don’t want to see in any of your searches for movies. I take advantage of this feature to hide my “not again” and “not interested” tagged movies.
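Taken together, the tagging rules above amount to a small decision tree. Here is a hedged sketch of that logic; the 25-rating threshold comes from the text, while the function and field names are just placeholders:

```python
# Sketch of the Movielens tagging rules described above.  The 25-rating
# threshold is from the text; everything else is a placeholder.

def assign_tag(oscar_nominated, n_movielens_ratings,
               refuse_to_rewatch=False, not_my_kind_of_movie=False):
    if refuse_to_rewatch:
        return "not again"        # seen it, never again; hidden from searches
    if not_my_kind_of_movie:
        return "not interested"   # never seen, never will; also hidden
    if oscar_nominated:
        return "Oscar"            # even a single nomination qualifies
    if n_movielens_ratings < 25:
        return "might like"       # too little data to earn a recommendation yet
    return "prospect"             # enough data to potentially reach the watch list

print(assign_tag(oscar_nominated=True, n_movielens_ratings=10))     # Oscar
print(assign_tag(oscar_nominated=False, n_movielens_ratings=12))    # might like
print(assign_tag(oscar_nominated=False, n_movielens_ratings=4000))  # prospect
```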

So, I’ve tagged all of these movies. Now what do I do with them? That will be covered in next week’s topic, “Building a Watch List”.

And the 2018 Academy Award for Best Picture Goes To?

We are 11 days removed from the 2017 Best Picture award debacle, and already Awards Circuit has projected its first list of nominees for the 2018 Best Picture race. Obviously it is way too early to make predictions like this with a high degree of accuracy. Many of the movies are still being filmed and can’t be realistically evaluated. It does, though, give us an idea of the movies that evaluators believe have Oscar pedigree.

Here is the Awards Circuit list with its current production status:

2018 Projected Academy Award Nominees for Best Picture
Movie                                     Release Status     Short Description
Untitled Paul Thomas Anderson Project     In Production      1950’s drama set in the London fashion world
Suburbicon                                Post-Production    1950’s crime mystery set in a small family town
Darkest Hour                              Nov. 24 release    Churchill biopic set in the early days of WW II
The Kidnapping of Edgardo Mortara         Pre-Production     Historical drama set in 19th century Italy
Battle of the Sexes                       Post-Production    Billie Jean King-Bobby Riggs 1973 tennis match
The Current War                           In Production      Edison-Westinghouse scientific competition
Mudbound                                  Post-Production    Post-WW II drama set in rural Mississippi
Downsizing                                Dec. 22 release    Social satire about less is more
Marshall                                  Oct. 13 release    Biopic about a young Thurgood Marshall
The Snowman                               Oct. 13 release    Adaptation of Jo Nesbo crime thriller

If this first list is representative of the entire year, 2018 is going to be a year of looking back in time. Only two of the ten movies listed here take place in a contemporary setting, Downsizing and The Snowman.

I’m probably most interested in Battle of the Sexes. Emma Stone plays Billie Jean King and is projected as a Best Actress nominee by Awards Circuit. Can she win Best Actress in back-to-back years?

I’m least interested in the Paul Thomas Anderson movie, even if it includes a rare star turn by Daniel Day Lewis. I hated There Will Be Blood and wasn’t a big fan of Boogie Nights.

In any event, that’s my gut reaction to the Best Picture projections. Is there any data to support my gut? I’m trying out a new data point called an Anticipation Score. The website Criticker provides averages of my ratings for movies involving specific directors, screenwriters, and actors. By tabulating the scores for the film makers involved in each movie, I can create an Anticipation Score based on my historical rating of their work. I’m including the two lead actors for each movie. For example, Battle of the Sexes is directed by Jonathon Dayton, written by Simon Beaufoy, and stars Emma Stone and Steve Carell. I’ve seen two of Jonathon Dayton’s movies and given them an average rating of 65.5 out of 100. I’ve seen five movies written by Simon Beaufoy, for an average of 68.6. I’ve seen seven Emma Stone movies, averaging 81.57, and eight Steve Carell movies, averaging 73. When you add all four numbers together, they total an Anticipation Score of 288.67. This represents my potential enjoyment of the movie if each artist entertains me at the average level that they have in the past.
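In other words, the Anticipation Score is nothing fancier than the sum of my average historical ratings for the director, the screenwriter, and the two leads. Here is the Battle of the Sexes arithmetic as a sketch, using the averages quoted above:

```python
# Anticipation Score: the sum of my average historical ratings (0-100)
# for a film's director, screenwriter, and two lead actors.

def anticipation_score(artist_averages):
    return sum(artist_averages)

battle_of_the_sexes = anticipation_score([
    65.50,  # Jonathon Dayton, director (2 movies seen)
    68.60,  # Simon Beaufoy, screenwriter (5 movies seen)
    81.57,  # Emma Stone, lead (7 movies seen)
    73.00,  # Steve Carell, lead (8 movies seen)
])
print(round(battle_of_the_sexes, 2))  # 288.67
```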

Here’s the entire list ranked by Anticipation Score:

My Anticipation Score
Movie                                     Director          Writer           Lead Actor 1     Lead Actor 2       Score
The Kidnapping of Edgardo Mortara         S. Spielberg      T. Kushner       M. Rylance       O. Isaac           323.39
Downsizing                                A. Payne          J. Taylor        M. Damon         K. Wiig            313.14
Darkest Hour                              J. Wright         A. McCarten      G. Oldman        K. Scott-Thomas    293.39
Battle of the Sexes                       J. Dayton         S. Beaufoy       E. Stone         S. Carell          288.67
Suburbicon                                G. Clooney        E. Coen          M. Damon         O. Isaac           283.01
The Snowman                               T. Alfredson      H. Amini         M. Fassbender    R. Ferguson        216.67
The Current War                           A. Gomez-Rejon    M. Mitnick       B. Cumberbatch   M. Shannon         212.13
Untitled Paul Thomas Anderson Project     P. T. Anderson    P. T. Anderson   D. D. Lewis      L. Manville        159.13
Mudbound                                  D. Rees           D. Rees          C. Mulligan      J. Clarke          137.23
Marshall                                  R. Hudlin         J. Koskoff       C. Boseman       S. K. Brown        86.5

My gut reactions to Battle of the Sexes and the Paul Thomas Anderson movie are borne out in the data, although these movies are neither the best nor the worst of the rankings. The two movies at the bottom of the list are there because I have never seen movies directed or written by the two film makers involved. In the case of Marshall, although I’ve seen Sterling K. Brown on TV shows, I haven’t seen any movies that he has been in. As a result, the Anticipation Score for Marshall is based solely on the Chadwick Boseman movies that I’ve seen.

I think my Anticipation Score formula needs some tweaking to take into account the volume of movies seen for each artist. The fact that I’ve seen 21 Spielberg movies should be recognized in addition to the average rating I give each of his movies.
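One hedged way to do that tweak would be to shrink each artist’s average toward an overall mean, trusting the average more as the number of their movies I’ve seen grows. The overall mean and the constant k below are illustrative assumptions, not settled parts of the formula:

```python
# Hypothetical volume adjustment for the Anticipation Score: shrink each
# artist's average rating toward an overall mean, with more weight on the
# average as the number of their movies I've seen grows.
# The overall mean (70) and k (5) are illustrative assumptions.

def credibility_adjusted_average(artist_average, movies_seen, overall_mean=70.0, k=5):
    weight = movies_seen / (movies_seen + k)
    return weight * artist_average + (1 - weight) * overall_mean

print(round(credibility_adjusted_average(80.0, 21), 1))  # 21 movies seen: barely moves
print(round(credibility_adjusted_average(80.0, 1), 1))   # 1 movie seen: pulled toward 70
```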

In any event, keep your eye out for these movies as we get back into Oscar season, beginning in October.

Is Meryl Streep’s Oscar Record Out of Reach?

With the presentation of Academy Awards completed last Sunday, I am able to tabulate the last Actors of the Decade winners. For the male actors, the winner is Daniel Day Lewis.

Top Actors of the Decade
2007 to 2016 Releases
Actor                Lead Actor Noms    Lead Actor Wins    Supporting Noms    Supporting Wins    Total Academy Award Points
Daniel Day Lewis     2                  2                  0                  0                  12
Jeff Bridges         2                  1                  1                  0                  10
Leonardo DiCaprio    2                  1                  0                  0                  9
Colin Firth          2                  1                  0                  0                  9
Eddie Redmayne       2                  1                  0                  0                  9
George Clooney       3                  0                  0                  0                  9

This result is pretty incredible when you consider that Daniel Day Lewis appeared in only three movies during the entire decade. His three Academy Award Best Actor wins stand alone in the history of the category. It might be interesting to measure Oscar nominations per movie made. I’d be surprised if we found any actor who is even close to Daniel Day Lewis.

As for the Best Female Actor, once again, it is Meryl Streep.

Top Actresses of the Decade
2007 to 2016 Releases
Actress              Lead Actress Noms    Lead Actress Wins    Supporting Noms    Supporting Wins    Total Academy Award Points
Meryl Streep         5                    1                    1                  0                  19
Cate Blanchett       3                    1                    1                  0                  13
Jennifer Lawrence    3                    1                    1                  0                  13
Marion Cotillard     2                    1                    0                  0                  9
Sandra Bullock       2                    1                    0                  0                  9
Natalie Portman      2                    1                    0                  0                  9

When the 28-year-old Emma Stone accepted her Best Actress in a Leading Role award, she commented that she still has a lot to learn. It is that kind of attitude, and a commensurate work ethic, that it will take for a young actress today to make a run at Meryl Streep’s record of 20 Oscar nominations. Consider that the actresses Streep chased early in her career, Katherine Hepburn and Bette Davis, received their first nominations some 45 years before Streep earned her first nomination. It has been 38 years since Meryl Streep received her first nomination. We should be on the lookout for the next actress of a generation. Is there a contender already out there?

Let’s look first at the career Oscar performance of Streep, Hepburn, and Davis.

Acting Nomination Points
Lead Actress = 1 point, Supporting Actress = .5 points

Actress              Age 30    Age 40    Age 50    Age 60    Age 70    Age 80
Meryl Streep         1         7         11        14.5      18
Katherine Hepburn    2         4         6         9         10        11
Bette Davis          3         8         10        11        11        11

I chose not to equate a supporting actress role with a lead actress role to be fair to Hepburn and Davis. With the studios in control of the movies they appeared in, stars didn’t get the chance to do supporting roles. Bette Davis had a strong career before age 50. Katherine Hepburn was strong after age 50. Meryl Streep has outperformed both of them before 50 and after 50. It is not unreasonable to expect more nominations in her future.
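For clarity, the scoring behind the table is as simple as it looks: one point per Lead Actress nomination and half a point per Supporting Actress nomination, accumulated through a given age. A minimal sketch with a made-up career:

```python
# Nomination points: lead nomination = 1 point, supporting nomination = 0.5.
# The sample career below is made up for illustration.

def points_through_age(nominations, age_cutoff):
    """nominations: list of (age_at_nomination, 'lead' or 'supporting')."""
    return sum(1.0 if role == "lead" else 0.5
               for age, role in nominations if age <= age_cutoff)

career = [(29, "lead"), (33, "supporting"), (37, "lead"), (45, "lead")]
print(points_through_age(career, 40))  # 2.5 points through age 40
print(points_through_age(career, 50))  # 3.5 points through age 50
```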

As for today’s actresses, I looked at actresses with multiple nominations in different age groups to see if anyone is close to tracking Streep.

Actress              Age as of 12/31/2016    Comparison Age    Points at Comparison Age    Streep at Comparison Age
Cate Blanchett       47                      50                5.5                         11
Viola Davis          51                      50                2                           11
Kate Winslet         41                      40                5.5                         7
Michelle Williams    36                      40                3                           7
Amy Adams            42                      40                3                           7
Natalie Portman      35                      40                2.5                         7
Marion Cotillard     41                      40                2                           7
Jennifer Lawrence    26                      30                3.5                         1
Emma Stone           28                      30                1.5                         1
Keira Knightley      31                      30                1.5                         1
Rooney Mara          31                      30                1.5                         1

Except for the 30-ish actresses, none are keeping pace. You might argue that Kate Winslet is in striking distance but, given Streep’s strength after 40, that’s probably not good enough.

Of the young actresses, Jennifer Lawrence has had a very strong start to her career. With 3 lead actress nominations and 1 supporting nomination over the next 14 years, she would join Bette Davis as the only actresses to keep pace with Meryl Streep through age 40. Then all she would have to do is average between 3.5 and 4 points every 10 years for another 30 years or more.

Good luck with that. Alongside Joe DiMaggio’s 56-game hitting streak, it may become a record that will never be broken.

The Art of Selecting “Really Like” Movies: Oscar Provides a Helping Hand

Sunday is Oscar night!! From my perspective, the night is a little bittersweet. The movies that have been nominated offer up “really like” prospects to watch in the coming months. That’s a good thing. Oscar night, though, also signals the end of the best time of the year for new releases. Between now and November, there won’t be much more than a handful of new Oscar worthy movies released to the public. That’s a bad thing. There is only a 35.8% chance I will “really like” a movie that doesn’t earn a single Academy Award nomination. On the other hand, a single minor nomination increases the “really like” probability to 56%. If a movie wins one of the major awards (Best Picture, Director, Actor, Actress, Screenplay), the probability increases to 69.7%.

At the end of last week’s post, I expressed a desire to come up with a “really like” movie indicator that was independent of the website data driven indicators. The statistical significance of Academy Award performance would seem to provide the perfect solution. All movies released over the past 90 years have been considered for Oscar nominations. A movie released in 1936 has statistical equivalence to a movie released in 2016 in terms of Academy Award performance.

By using the Total # of Ratings quintiles introduced last week, credibility weights can be assigned to each quintile to allocate between the website-data-driven probabilities and the Oscar-performance probabilities. These ten movies, all seen more than 15 years ago, illustrate how the allocation works.

My Top Ten Seen Before Movie Prospects 
Not Seen in Last 15 Years
Movie Title                     Total Ratings Quintile    Website Driven Probability    Oscar Driven Probability    Net “Really Like” Probability
Deer Hunter, The                4                         97.1%                         73.8%                       88.5%
Color Purple, The               4                         97.9%                         69.3%                       87.4%
Born on the Fourth of July      4                         94.0%                         73.8%                       86.6%
Out of Africa                   4                         94.0%                         73.8%                       86.6%
My Left Foot                    3                         94.0%                         73.8%                       83.9%
Coal Miner’s Daughter           3                         97.9%                         69.3%                       83.6%
Love Story                      3                         92.7%                         72.4%                       82.6%
Fight Club                      5                         94.0%                         55.4%                       81.9%
Tender Mercies                  2                         94.0%                         73.8%                       81.2%
Shine                           3                         88.2%                         73.8%                       81.0%

The high degree of credible website data in Quintiles 4 & 5 weights the Net Probability closer to the Website driven probability. The Quintile 3 movies are weighted 50/50 and the resulting Net Probability ends up at the midpoint between the Data Driven probability and the Oscar driven probability. The movie in Quintile 2, Tender Mercies, which has a less credible probability from the website driven result, tilts closer to the Oscar driven probability.
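The only weight the description above pins down is the 50/50 split for Quintile 3; the other weights in the sketch below are illustrative assumptions chosen to match the direction described (heavier on website data in Quintiles 4 and 5, heavier on Oscar performance in Quintiles 1 and 2), not the actual values in my algorithm.

```python
# Sketch of the quintile-based allocation described above.  Only the
# Quintile 3 weight (50/50) is stated; the other weights are illustrative
# assumptions, not the algorithm's actual values.

WEBSITE_WEIGHT_BY_QUINTILE = {1: 0.10, 2: 0.35, 3: 0.50, 4: 0.65, 5: 0.70}

def net_probability(website_prob, oscar_prob, quintile):
    w = WEBSITE_WEIGHT_BY_QUINTILE[quintile]
    return w * website_prob + (1 - w) * oscar_prob

# My Left Foot (Quintile 3) lands at the midpoint of its two probabilities.
print(round(net_probability(0.940, 0.738, 3), 3))  # 0.839
# Tender Mercies (Quintile 2) tilts toward its Oscar-driven probability.
print(round(net_probability(0.940, 0.738, 2), 3))
```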

The concern I raised last week about the “really like” viability of older movies I’ve never seen before goes away with this change. Take a look at my revised older movie top ten now.

My Top Ten Never Seen Movie Prospects 
Never Seen Movies =  > Release Date + 6 Months
Movie Title               Total Ratings Quintile    Website Driven Probability    Oscar Driven Probability    Net “Really Like” Probability
Yearling, The             1                         42.1%                         73.8%                       71.4%
More the Merrier, The     1                         26.9%                         73.8%                       70.2%
12 Angry Men (1997)       1                         42.1%                         69.3%                       67.2%
Lili                      1                         26.9%                         69.3%                       66.0%
Sleuth                    1                         42.1%                         66.8%                       64.9%
Of Mice and Men (1939)    1                         42.1%                         66.8%                       64.9%
In a Better World         1                         41.5%                         66.8%                       64.9%
Thousand Clowns, A        1                         11.8%                         69.3%                       64.9%
Detective Story           1                         11.8%                         69.3%                       64.9%
Body and Soul             1                         11.8%                         69.3%                       64.9%

Strong Oscar performing movies that I’ve never seen before become viable prospects. Note that all of these movies are Quintile 1 movies. Because of their age and lack of interest from today’s movie website visitors, these movies would never achieve enough credible ratings data to become recommended movies.

There is now an ample supply of viable, Oscar-worthy, “really like” prospects to hold me over until next year’s Oscar season. Enjoy your Oscar night in La La Land.

The Art of Selecting “Really Like” Movies: Older Never Before Seen

Last week I stated in my article that I could pretty much identify whether a movie has a good chance of being a “really like” movie within six months of its release. If you need any further evidence, here are my top ten movies that I’ve never seen that are older than six months.

My Top Ten Never Seen Movie Prospects 
Never Seen Movies =  > Release Date + 6 Months
Movie Title                                          Last Data Update    Release Date    Total # of Ratings    “Really Like” Probability
Hey, Boo: Harper Lee and ‘To Kill a Mockingbird’     2/4/2017            5/13/2011       97,940                51.7%
Incendies                                            2/4/2017            4/22/2011       122,038               51.7%
Conjuring, The                                       2/4/2017            7/19/2013       241,546               51.7%
Star Trek Beyond                                     2/4/2017            7/22/2016       114,435               51.7%
Pride                                                2/4/2017            9/26/2014       84,214                44.6%
Glen Campbell: I’ll Be Me                            2/9/2017            10/24/2014      105,751               44.6%
Splendor in the Grass                                2/5/2017            10/10/1961      246,065               42.1%
Father of the Bride                                  2/5/2017            6/16/1950       467,569               42.1%
Imagine: John Lennon                                 2/5/2017            10/7/1998       153,399               42.1%
Lorenzo’s Oil                                        2/5/2017            1/29/1993       285,981               42.1%

The movies with a high “really like” probability in this group have already been watched. Of the remaining movies, there are three movies that are 50/50 and the rest have the odds stacked against them. In other words, if I watch all ten movies I probably won’t “really like” half of them. The dilemma is that I would probably “really like” half of them if I do watch all ten. The reality is that I won’t watch any of these ten movies as long as there are movies that I’ve already seen with better odds. Is there a way to improve the odds for any of these ten movies?

You’ll note that all ten movies have probabilities based on fewer than 500,000 ratings. Will some of these movies improve their probabilities as they receive more ratings? Maybe. Maybe not. To explore this possibility further, I divided my database into quintiles based on the total number of ratings. The quintile with the most ratings, the most credible quintile, provides results that define the optimal performance of my algorithm.

Quintile 5

# Ratings Range > 2,872,053

Category                      # of Movies    # “Really Like” Movies    % “Really Like” Movies    Proj. Avg. Rating (All Sites)    My Avg. Rating    My Rating to Proj. Rating Diff.
Movies Seen More than Once    152            134                       88%                       8.6                              8.5               -0.1
Movies Seen Once              246            119                       48%                       7.5                              6.9               -0.7
All Movies in Range           398            253                       64%                       7.9                              7.5

All of the movies in Quintile 5 have more than 2,872,053 ratings. My selection of movies that I had seen before is clearly better than my selection of movies I watched for the first time. This better selection is because the algorithm results led me to the better movies and my memory did some additional weeding. My takeaway is that, when considering movies I’ve never seen before, I should put my greatest trust in the algorithm if the movie falls in this quintile.

Let’s look at the next four quintiles.

Quintile 4

# Ratings Range 1,197,745 to 2,872,053

Category                      # of Movies    # “Really Like” Movies    % “Really Like” Movies    Proj. Avg. Rating (All Sites)    My Avg. Rating    My Rating to Proj. Rating Diff.
Movies Seen More than Once    107            85                        79%                       8.3                              8.3               0.1
Movies Seen Once              291            100                       34%                       7.1                              6.4               -0.7
All Movies in Range           398            185                       46%                       7.4                              6.9
Quintile 3

# Ratings Range 516,040 to 1,197,745

Category                      # of Movies    # “Really Like” Movies    % “Really Like” Movies    Proj. Avg. Rating (All Sites)    My Avg. Rating    My Rating to Proj. Rating Diff.
Movies Seen More than Once    122            93                        76%                       7.8                              8.0               0.2
Movies Seen Once              278            102                       37%                       7.1                              6.6               -0.6
All Movies in Range           400            195                       49%                       7.3                              7.0
Quintile 2

# Ratings Range 179,456 to 516,040

Category                      # of Movies    # “Really Like” Movies    % “Really Like” Movies    Proj. Avg. Rating (All Sites)    My Avg. Rating    My Rating to Proj. Rating Diff.
Movies Seen More than Once    66             46                        70%                       7.4                              7.5               0.2
Movies Seen Once              332            134                       40%                       7.0                              6.4               -0.6
All Movies in Range           398            180                       45%                       7.1                              6.6
Quintile 1

# Ratings Range < 179,456

Category                      # of Movies    # “Really Like” Movies    % “Really Like” Movies    Proj. Avg. Rating (All Sites)    My Avg. Rating    My Rating to Proj. Rating Diff.
Movies Seen More than Once    43             31                        72%                       7.0                              7.5               0.5
Movies Seen Once              355            136                       38%                       6.9                              6.2               -0.7
All Movies in Range           398            167                       42%                       6.9                              6.4

Look at the progression of the algorithm projections as the quintiles get smaller. The gap between the movies seen more than once and those seen only once narrows as the number of ratings gets smaller. Notice that the difference between my ratings and the projected ratings for Movies Seen Once is fairly constant for all quintiles, either -0.6 or -0.7. But for the Movies Seen More than Once, the difference grows positively as the number of ratings gets smaller. This suggests that, for Movies Seen More than Once, the higher than expected ratings I give movies in Quintiles 1 & 2 are primarily driven by my memory of the movies rather than the algorithm.

What does this mean for my top ten never-before-seen movies listed above? All of the top ten are in Quintile 1 or 2. As they grow into the higher quintiles, some may emerge with higher “really like” probabilities. Certainly Star Trek Beyond, which is only 7 months old, can be expected to grow into the higher quintiles. But what about Splendor in the Grass, which was released in 1961 and, at 55 years old, might not move into Quintile 3 until another 55 years pass?

It suggests that another secondary movie quality indicator is needed that is separate from the movie recommender sites already in use. It sounds like I’ve just added another project to my 2017 “really like” project list.

The Art of Selecting “Really Like” Movies: New Movies

I watch a lot of movies, a fact that my wife, and occasionally my children, like to remind me of. Unlike the average, non-geeky movie fan, though, I am constantly analyzing the process I go through to determine which movies I watch. I don’t like to watch mediocre, or worse, movies. I’ve pretty much eliminated bad movies from my selections. But every now and then a movie I “like” rather than “really like” will get past my screen.

Over the next three weeks I’ll outline the steps I’m taking this year to improve my “really like” movie odds. Starting this week with New Movies, I’ll lay out a focused strategy for three different types of movie selection decisions.

The most challenging “really like” movie decision I make is which movies that I’ve never seen before are likely to be “really like” movies. There is only a 39.3% chance that watching a movie I’ve never seen before will result in a “really like” experience. My goal is to improve those odds by the end of the year.

The first step I’ve taken is to separate movies I’ve seen before from movies I’ve never seen in establishing my “really like” probabilities. As a frame of reference, there is a 79.5% chance that I will “really like” a movie I’ve seen before. By setting my probabilities for movies I’ve never seen off of the 39.3% probability, I have created a tighter screen for those movies. This should result in me watching fewer never-before-seen movies than I’ve typically watched in previous years. Of the 20 movies I’ve watched so far this year, only two were never-before-seen movies.

The challenge in selecting never-before-seen movies is that, because I’ve watched close to 2,000 movies over the last 15 years, I’ve already watched the “cream of the crop” from those 15 years. From 2006 to 2015, there were 331 movies that I rated as “really like” movies; that is about 33 movies a year, or fewer than 3 a month. Last year I watched 109 movies that I had never seen before. So, except for the 33 new movies that came out last year that, statistically, might be “really like” movies, I watched 76 movies that didn’t have a great chance of being “really like” movies.

Logically, the probability of selecting a “really like” movie that I’ve never seen before should be highest for new releases. I just haven’t seen that many of them. I’ve only seen 6 movies that were released in the last six months and I “really liked” 5 of them. If, on average, there are 33 “really like” movies released each year, then, statistically, there should be a dozen “really like” movies released in the last six months that I haven’t seen yet. I just have to discover them. Here is my list of the top ten new movie prospects that I haven’t seen yet.

My Top Ten New Movie Prospects 
New Movies =  < Release Date + 6 Months
Movie Title                                Release Date    Last Data Update    “Really Like” Probability
Hacksaw Ridge                              11/4/2016       2/4/2017            94.9%
Arrival                                    11/11/2016      2/4/2017            94.9%
Doctor Strange                             11/4/2016       2/6/2017            78.9%
Hidden Figures                             1/6/2017        2/4/2017            78.7%
Beatles, The: Eight Days a Week            9/16/2016       2/4/2017            78.7%
13th                                       10/7/2016       2/4/2017            78.7%
Before the Flood                           10/30/2016      2/4/2017            51.7%
Fantastic Beasts and Where to Find Them    11/18/2016      2/4/2017            51.7%
Moana                                      11/23/2016      2/4/2017            51.7%
Deepwater Horizon                          9/30/2016       2/4/2017            45.4%
Fences                                     12/25/2016      2/4/2017            45.4%

Based on my own experience, I believe you can identify most of the new movies that will be “really like” movies within 6 months of their release, which is how I’ve defined “new” for this list. I’m going to test this theory this year.

In case you are interested, here is the ratings data driving the probabilities.

My Top Ten New Movie Prospects 
Movie Site Ratings Breakdown
Ratings *
Movie Title                                # of Ratings All Sites    IMDB (Age 45+)    Rotten Tomatoes **    Criticker    Movielens    Netflix
Hacksaw Ridge                              9,543                     8.2               CF 86%                8.3          8.3          8.6
Arrival                                    24,048                    7.7               CF 94%                8.8          8.1          9.0
Doctor Strange                             16,844                    7.7               CF 90%                8.2          8.3          7.8
Hidden Figures                             7,258                     8.2               CF 92%                7.7          7.3          8.2
Beatles, The: Eight Days a Week            1,689                     8.2               CF 95%                8.0          7.3          8.0
13th                                       295,462                   8.1               CF 97%                8.3          7.5          8.0
Before the Flood                           1,073                     7.8               F 70%                 7.6          8.2          7.8
Fantastic Beasts and Where to Find Them    14,307                    7.5               CF 73%                7.3          6.9          7.6
Moana                                      5,967                     7.7               CF 95%                8.4          8.0          7.0
Deepwater Horizon                          40,866                    7.1               CF 83%                7.8          7.6          7.6
Fences                                     4,418                     7.6               CF 95%                7.7          7.1          7.2
*All Ratings Except Rotten Tomatoes Calibrated to a 10.0 Scale
** CF = Certified Fresh, F = Fresh

Two movies, Hacksaw Ridge and Arrival, are already probable “really like” movies and should be selected to watch when available. The # of Ratings All Sites is a key column. The ratings from Movielens and Netflix need ratings volume before they can credibly reach their true level. Until there is a credible amount of data, the rating you get is closer to what an average movie would get. A movie like Fences, at 4,418 ratings, hasn’t reached the critical mass needed to migrate to the higher ratings I would expect that movie to reach. Deepwater Horizon, on the other hand, with 40,866 ratings, has reached a fairly credible level and may not improve upon its current probability.
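The point about Fences versus Deepwater Horizon is really a shrinkage effect: with few ratings, a site’s number sits close to what an average movie would get and only migrates toward the movie’s true level as volume builds. A rough sketch of that idea, with a made-up prior mean, constant, and “true” rating:

```python
# Illustration of the credibility point above: with few ratings, an
# observed rating stays near the site-wide average and only approaches the
# movie's underlying level as the number of ratings grows.
# The prior mean (7.0), k (25,000), and "true" rating (8.0) are made up.

def expected_observed_rating(true_rating, n_ratings, prior_mean=7.0, k=25_000):
    weight = n_ratings / (n_ratings + k)
    return weight * true_rating + (1 - weight) * prior_mean

print(round(expected_observed_rating(8.0, 4_418), 2))   # few ratings: still near 7.0
print(round(expected_observed_rating(8.0, 40_866), 2))  # more ratings: closer to 8.0
```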

I’m replacing my monthly forecast on the sidebar of this website with the top ten new movie prospects exhibit displayed above. I think it is a better reflection of the movies that have the best chance of being “really like” movies. Feel free to share any comments you might have.