Looking into my Netflix data

16 Dec 2023

Genres watched

Data prep

Step 1: Organise the data according to genre.

My initial plan was to use an official Netflix API to pull the genre values for each title in my watch history, but it turns out there is no longer a public one. Netflix shut it down, saying in a statement that it was doing so “To better focus our efforts and to align them with the needs of our global member base.”

Reelgood was a possible alternative, but it was too costly. So instead I consulted IMDb and copied their genre categories. Finally, I manually assigned a primary category and, in some cases, a secondary category to each title in my list, because lots of content does not fit neatly into one genre. As an aside, there might be some room for debate with IMDb about what differentiates a Thriller from a Mystery, but I digress. I also briefly considered using the IMDb API but abandoned that idea because I thought it would slow me down, and I was done with the manual categorisation of 121 titles in about 30 minutes.
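A minimal sketch of how hand-assigned primary and secondary genres could be tallied into a top 5. The titles and labels below are placeholders, and `count_genres` is a hypothetical helper, not the code actually used for this post.

```python
from collections import Counter

# Placeholder hand-labelled entries: title -> (primary genre, optional secondary genre).
# These labels are illustrative only, not the real categorisation.
GENRES = {
    "Example Show A": ("Drama", "Comedy"),
    "Example Show B": ("Documentary", None),
    "Example Movie C": ("Comedy", "Mystery"),
    "Example Show D": ("Drama", None),
}

def count_genres(labels, include_secondary=True):
    """Tally how many titles carry each genre."""
    counts = Counter()
    for primary, secondary in labels.values():
        counts[primary] += 1
        # A secondary genre is optional, so skip titles that have none.
        if include_secondary and secondary:
            counts[secondary] += 1
    return counts

# most_common(5) returns at most five (genre, count) pairs, highest count first.
top = count_genres(GENRES).most_common(5)
print(top)
```

Counting secondary genres alongside primary ones means a dramedy adds to both Drama and Comedy. Passing `include_secondary=False` ranks by primary genre only.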

Findings

My top 5 favourite genres were:

I had many internal debates about the categories and where some shows should go. For instance, is Orange Is The New Black a crime, comedy or drama show? Is Black Mirror perhaps a sci-fi, dark-comedy show? To what extent was The Crown a historical documentary or a high-profile family drama? And to all these questions my answer is yes. It all applies. Much of my favourite content is layered and straddles multiple genres. I loved a good dramedy (drama + comedy) like Pain Hustlers or BEEF and was equally delighted by the unique comedy + mystery combo in movies like They Cloned Tyrone and Shimmer Lake.

With Documentaries being my 2nd favourite genre, I appreciate Netflix’s ability to source and deliver compelling serialised documentaries. I was utterly fascinated by the depth and story-telling in Free Money, MH370: The Plane That Disappeared, Beckham, Big Vape and Senzo: Murder of a Soccer Star (desperately hoping season 2 of Senzo is on the way!).

Bingeing habits 🍿

Data prep

Step 2: Organise the data according to date started and date finished

Here I had to ensure I had clean and well-formatted data in the ‘date_watched’ field of my dataframe. Additionally, it was important to have a ‘title’ and an ‘episode_name’ field so that I could count unique show or movie titles separately from the unique episodes that belong to a series.
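A rough sketch of this prep step using only the Python standard library rather than a dataframe. The column names (`Title`, `Date`), the `": "` separator, the month/day/year date format and the sample rows are all assumptions for illustration, so adjust them to match your own export.

```python
import csv
import io
from datetime import datetime

# Assumed export layout: a "Title" column such as "Show: Season 1: Episode"
# and a "Date" column in month/day/year form. The rows are placeholders.
SAMPLE_CSV = """Title,Date
Example Show: Season 1: Pilot,3/14/23
Example Show: Season 1: Second Episode,3/15/23
Example Movie,9/2/23
"""

def load_history(text, date_format="%m/%d/%y", separator=": "):
    """Return one dict per row with title, episode_name and date_watched."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        parts = raw["Title"].split(separator)
        rows.append({
            "title": parts[0],
            # Movies and once-off shows have no episode part.
            "episode_name": parts[-1] if len(parts) > 1 else None,
            "date_watched": datetime.strptime(raw["Date"], date_format).date(),
        })
    return rows

history = load_history(SAMPLE_CSV)
unique_titles = {row["title"] for row in history}
episodes = [row for row in history if row["episode_name"]]
print(len(unique_titles), len(episodes))
```

One caveat: a title that itself contains the separator (MH370: The Plane That Disappeared, for instance) would be split in the wrong place, so entries like that still need a manual check.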

I briefly considered creating an additional field for ‘season’ so I could drill into the ways I watched seasons of the same title. However, the naming convention around seasons is not always consistent. For example:

the traditional-ish bunch

title ~ season number ~ episode name

title ~ subtitle ~ season number ~ episode name

title ~ subtitle ~ season number ~ part ~ episode name

the limited series bunch

title ~ series type ~ episode name

title ~ subtitle ~ series type ~ episode name

the bunch which refused to use the word ‘season’

title ~ part ~ episode name

title ~ volume ~ episode name

and then there's bojack
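To show why a single ‘season’ field gets messy, here is a hedged sketch that looks for the markers in the patterns above. The exact spellings ("Season 2", "Part 1", "Volume 3", "Limited Series"), the `": "` separator and the helper names are assumptions, and anything that follows none of these patterns would still slip through.

```python
import re

# Marker spellings are assumptions based on the patterns listed above.
SEASON_MARKER = re.compile(
    r"(?:(?:Season|Part|Volume)\s+\d+|Limited Series)",
    re.IGNORECASE,
)

def find_season_markers(segments):
    """Return the segments that look like a season, part, volume or series type."""
    return [seg for seg in segments if SEASON_MARKER.fullmatch(seg.strip())]

def split_entry(raw, separator=": "):
    """Split a raw history entry into title, season-like markers and episode name."""
    segments = raw.split(separator)
    return {
        "title": segments[0],
        "markers": find_season_markers(segments),
        # A single segment means a movie or once-off show with no episode.
        "episode_name": segments[-1] if len(segments) > 1 else None,
    }

print(split_entry("Example Show: Season 2: Part 1: The Finale"))
print(split_entry("Example Show: Limited Series: Episode 1"))
```

Returning a list of markers rather than one season value keeps entries like "Season 2" plus "Part 1" intact instead of forcing them into a single number.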

Findings

It’s well known that many video streaming platforms like Netflix and YouTube are built to keep you watching, so you could say binge-watching is a key marker of success for these platforms. Let’s just say the house always wins and the cliff-hangers employed worked on me. Aside from Netflix actively trying to keep me locked in by any means necessary, my curiosity and love for good story-telling also contribute to my binge-sessions.

In 2023 I watched 450 unique titles. Of these, 67 were once-off shows or movies and 383 were episodes in a series. The data revealed I am a loyalist for the most part, and if I find a show I like, I watch it all before moving on. In March specifically, I watched an entire season of 3 different game shows. It makes sense that bingeing and game shows are a powerful combo, as viewers look forward to seeing the next challenge for contestants and the drama of who is safe, who gets eliminated, and who eventually wins the prize. The cause-and-effect nature of these shows is exciting and is a built-in cliffhanger, because where else in our lives can we observe such immediate results? I recall the game shows The Mole and Is It Cake? used this strategy brazenly by always introducing the next episode’s challenge at the end of the current episode, to the point where each series started to feel like one really long take. It was like drowning in an uninterrupted stream of content with no moment to pause.

In September my viewing peaked again with 55 titles watched, which coincided with the much-anticipated drop of the final seasons of Top Boy and Sex Education. Instead of drowning, I felt like I was savouring these shows. I sometimes even had the self-restraint to only watch two episodes a day 😇.

As a viewer it is both exciting and overwhelming when an entire season of Top Boy, Sex Education or The Crown is dropped in one go. These shows often have really long production and post-production periods, which only add to the anticipation. Although it may feel like winning the lottery when you stumble upon a whole new season of your favourite show, I feel there are also some downsides to Netflix’s binge-release strategy:

Places I went with Netflix Airlines ✈️

Data prep

Step 3: Organise data according to geographic origin

For this visualisation I categorised my viewing history manually by recalling the language and location of each title. Again I briefly considered using the IMDb API here but abandoned that idea because only a handful of shows were from outside Hollywood, so there was minimal classification to do.

For this visualisation the map, labels, and dots were the easy part. I spent the longest time trying to get the lines drawn neatly. I eventually settled on code that was meant to draw lines between points in a similar latitude or longitude range but, upon closer inspection, the locations are simply connected in the order in which they appear in the data, from top to bottom. Sometimes trying to write sophisticated code is a road to nowhere.
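Connecting locations in the order they appear in the data can be sketched in a few lines. The points and the `connect_in_order` helper below are hypothetical placeholders, not the plotting code behind the map.

```python
# Placeholder points as (label, latitude, longitude); the coordinates are made up.
LOCATIONS = [
    ("A", 10.0, 20.0),
    ("B", -5.0, 35.0),
    ("C", -30.0, 25.0),
]

def connect_in_order(points):
    """Pair each point with the one after it, giving the line segments to draw."""
    return list(zip(points, points[1:]))

segments = connect_in_order(LOCATIONS)
for start, end in segments:
    # Each segment runs from one row of the data to the next row.
    print(f"{start[0]} -> {end[0]}")
```

Each segment can then be handed to whatever plotting library draws the map, so sorting the data differently is enough to change how the lines look.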

Findings

I think TV and movies are great ways to explore new perspectives and places and venture out of your comfort zone with relative ease and at a low cost. I’ve truly enjoyed going off the beaten path of Hollywood and being immersed in strange new contexts and storylines.

It’s awesome to see the incredible amount of non-American content Netflix has available. I have managed to watch several Brazilian, South African and Kenyan titles but, most bizarrely, 2023 has been the year of Australian content for me. I particularly enjoyed the funny Gen-X show Why Are You Like This? and the dry humour from Fisk, which might make you nostalgic for The Office (IMO George has cemented his place as an iconic receptionist next to Pam.) An honourable mention must also go to the migrant-drama Stateless for excellent acting and gripping story-telling.

Looking at the data through a language lens, most of what I watched was unremarkably in English. What I did notice is that any foreign-language content I watch seems to overlap with the languages I am juggling in Duolingo. I’m always so impressed by the movies that come out of Kenya for how gritty and real they are. Veve in particular had a tight and compelling story paired with some excellent acting performances - who knew Savara from Sauti Sol could act like that and seamlessly weave singing into his character, genius. Although I was almost completely reliant on the subtitles, I still enjoyed the French comedy Nothing to Hide (also can someone explain to me if all foreign-language films have an English title? Because that would mean we are most certainly judging a book by its cover, no?)