Create, Test, Analyze, and Recreate
Apple’s iPhone just turned 10 years old. Why has it been such a successful product? Perhaps because the product hasn’t stayed static. The latest version is the iPhone 7 Plus. As a product, it constantly reinvents itself to improve its utility. It is always fresh. Apple, like most producers of successful products, probably follows a process whereby they:
- Create a product.
- Test what they’ve created.
- Analyze the results of their tests.
- Recreate the product based on that analysis.
They never dust off their hands and say, “My job is done.”
Now, I won’t be so presumptuous as to claim to have created something as revolutionary as the iPhone. But, regardless of how small your creation, its success requires following the same steps outlined above.
My post last week outlined the testing process I put my algorithm through each year. This week I will provide some analysis and take some steps toward a recreation. The results of my test were that using my “really like” movie selection system significantly improved the overall quality of the movies I watch. On the negative side, the test showed that once you hit some optimal number of movies in a year, the quality of any additional movies diminishes as the remaining pool of “really like” movies shrinks.
A deeper dive into these results begins to clarify the key issues. Separating movies that I’ve seen at least twice from those that were new to me is revealing.
| | Seen More than Once, 1999 to 2001 | Seen More than Once, 2014 to 2016 | Seen Once, 1999 to 2001 | Seen Once, 2014 to 2016 |
| --- | --- | --- | --- | --- |
| # of Movies | 43 | 168 | 231 | 158 |
| % of Total Movies in Timeframe | 15.7% | 51.5% | 84.3% | 48.5% |
| IMDB Avg Rating | 7.6 | 7.6 | 6.9 | 7.5 |
| My Avg Rating | 8.0 | 8.4 | 6.1 | 7.7 |
There is so much interesting data here that I don’t know where to start. Let’s start with the notion that the best opportunity for a “really like” movie experience is the “really like” movie you’ve already seen. For movies I’ve seen more than once, My Avg Rating outperforms the IMDB Avg Rating in both timeframes. The fact that, from 1999 to 2001, I was able to watch movies that I “really liked” more than the average IMDB voter, without the assistance of any movie recommender website, suggests that memory of a “really like” movie is a pretty reliable “really like” indicator. The 2014 to 2016 results suggest that my “really like” system can help prioritize the movies that memory tells you that you will “really like” seeing again.
The data for movies seen once clearly display the advantages of the “really like” movie selection system. It’s for the movies you’ve never seen that movie recommender websites are worth their weight in gold. With limited availability of movie websites from 1999 to 2001, my selection of new movies underperformed the IMDB Avg Rating by about 12%, and new movies represented 84.3% of all of the movies I watched during that timeframe. From 2014 to 2016, my “really like” movie selection system recognized that there is a limited supply of new “really like” movies. As a result, fewer than half of the movies I watched from 2014 through 2016 were movies I’d never seen before. Of the new movies I did watch, there was a significant improvement over the 1999 to 2001 timeframe both in terms of quality, as represented by the IMDB Avg Rating, and in my enjoyment of the movies, as represented by My Avg Rating.
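As a sanity check, the relative gaps between My Avg Rating and the IMDB Avg Rating can be computed directly from the rounded figures in the table above, which is where the roughly 12% underperformance comes from:

```python
# Relative over/underperformance of My Avg Rating vs. the IMDB Avg Rating,
# using the rounded figures for movies seen once from the table above.

def pct_gap(my_rating, imdb_rating):
    """Percentage by which my rating beats (or trails) the IMDB average."""
    return (my_rating - imdb_rating) / imdb_rating * 100

print(f"1999 to 2001: {pct_gap(6.1, 6.9):+.1f}%")  # prints "1999 to 2001: -11.6%"
print(f"2014 to 2016: {pct_gap(7.7, 7.5):+.1f}%")  # prints "2014 to 2016: +2.7%"
```

With these rounded inputs the 1999 to 2001 gap works out to about 11.6%, and the 2014 to 2016 gap to just under 3%.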
Still, while the 2014 to 2016 new movies were significantly better than the new movies watched from 1999 to 2001, is it unrealistic to expect My Avg Rating to beat the IMDB average by more than 2%? To gain some perspective on this question, I profiled the new movies I “really liked” in the 2014 to 2016 timeframe and contrasted them with the movies I didn’t “really like”.
| Movies Seen Once, 2014 to 2016 | “Really Liked” | Didn’t “Really Like” |
| --- | --- | --- |
| # of Movies | 116 | 42 |
| % of Total Movies in Timeframe | 73.4% | 26.6% |
| IMDB Avg Rating | 7.6 | 7.5 |
| My Avg Rating | 8.1 | 6.3 |
| “Really Like” Probability | 82.8% | 80.7% |
The probability results suggest that I should “really like” between 80.7% and 82.8% of the movies in this sample. I actually “really liked” 73.4%, not too far off the probability expectations. The IMDB Avg Rating for the movies I didn’t “really like” is only a tick lower than the rating for the “really liked” movies, and the “Really Like” Probability is similarly only a tick lower. My conclusion is that there is some, but not much, opportunity to improve my selection of new movies through a more disciplined approach. The better approach is to favor “really like” movies that I’ve seen before and to give new movies more time for their data to mature.
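The comparison of expected and actual rates is simple weighted arithmetic on the table figures:

```python
# Expected "really like" rate implied by the probabilities in the table
# above, versus the rate actually observed.

n_liked, n_not_liked = 116, 42
p_liked, p_not_liked = 0.828, 0.807  # "Really Like" Probability per group

total = n_liked + n_not_liked
expected_rate = (n_liked * p_liked + n_not_liked * p_not_liked) / total
actual_rate = n_liked / total

print(f"Expected: {expected_rate:.1%}")  # prints "Expected: 82.2%"
print(f"Actual:   {actual_rate:.1%}")    # prints "Actual:   73.4%"
```

The probability-weighted expectation lands at about 82.2%, roughly nine points above the 73.4% I actually “really liked”.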
Based on my analysis, here is my action plan:
- Set separate probability standards for movies I’ve seen before and movies I’ve never seen.
- Incorporate the probability revisions into the algorithm.
- Set a minimum probability threshold for movies I’ve never seen before.
- When the supply of “really like” movies gets thin, only stretch for movies I’ve already seen and that memory tells me I “really liked”.
Create, test, analyze and recreate.