Facebook comment preview craze using Django

In my spare time I tend to check Facebook every now and then. With my family and most of my friends living in a different country, it is a great way to see how they are doing.

Aside from all the weird videos and "tag a friend that has to give you food" posts, I saw an interesting new concept. The idea is that you comment a URL and Facebook shows a random link preview for it. The first one I saw used Pokémon.

This is a fun idea and great for generating traffic: their main website has over 35,000 likes. Technically it's not very smart, though. The same person can comment three times and is likely to get three different results.

Another example I saw - but didn't take a screenshot of - was using songs. "Which song describes you best", it said. The format was a bit different: you had to type the URL with some parameters. For example "example-site.com/(jilles-soeters)" returned "Justin Bieber - Baby" - how accurate. When I tried it again I got a different song. In this post I'll show you how I made a cooler version of these "apps" in no time using Django.

The idea

I did not want to copy Pokémon or songs, so I went for actors and actresses. It would be cool if I could somehow make it a bit intelligent. I aimed for guessing the gender from the name. This way Laura wouldn't get Johnny Depp. Another thing that would be cool is matching names. So if my friend Jack used it, he would get an actor named Jack (e.g. Jack Nicholson).
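
A rough plain-Python sketch of that idea, before any Django is involved. The sample list, the pick_actor helper and its guessed_gender argument are made up for illustration; the real app reads actors from the database and gets the gender from a library.

```python
import random

# Hypothetical sample data: (name, gender) pairs standing in for the database.
ACTORS = [
    ('Jack Nicholson', 'M'),
    ('Johnny Depp', 'M'),
    ('Meryl Streep', 'F'),
    ('Cate Blanchett', 'F'),
]


def pick_actor(first_name, guessed_gender=None, actors=ACTORS):
    """Prefer a name match, then a gender match, then anyone."""
    # 1. Actors whose name starts with the visitor's first name
    prefix = first_name.lower()
    name_matches = [a for a in actors if a[0].lower().startswith(prefix)]
    if name_matches:
        return random.choice(name_matches)
    # 2. No name match: narrow down by gender if we have a guess
    gender_matches = [a for a in actors if a[1] == guessed_gender]
    # 3. Fall back to the full list when the gender is unknown
    return random.choice(gender_matches or actors)
```

With this data, pick_actor('Jack') can only return Jack Nicholson, while pick_actor('Laura', 'F') returns one of the two actresses.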

Once I had the idea, it was time to model the data.

The Django models

For this very simple app I only needed 2 models. The first model would be an Actor of course. I'd like to show the actor name and an image. Actors would also have to be categorised as male or female.

[models.py]

from django.db import models

class Actor(models.Model):  
    GENDERS = (('M', 'Male'), ('F', 'Female'))

    name = models.CharField(max_length=50)
    img_url = models.URLField()
    gender = models.CharField(max_length=1, choices=GENDERS)

This very simple model should do the trick. Now we only need one more model: a Person using our "app". We'd get this person's name from the URL and assign them an Actor. That's exactly what our model will do:

class Person(models.Model):  
    name = models.CharField(max_length=75)
    # on_delete is mandatory from Django 2.0 onwards
    actor = models.ForeignKey(Actor, on_delete=models.CASCADE)
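
To see why the Person model matters, here is a minimal sketch of the same idea with a plain dict standing in for the Person table. The names assignments and actor_for are hypothetical and only illustrate the "look up first, assign once" behaviour.

```python
import random

# Stands in for the Person table: visitor name -> assigned actor name.
assignments = {}


def actor_for(name, actors):
    """Return the stored actor for `name`, assigning a random one on first use."""
    if name not in assignments:
        assignments[name] = random.choice(actors)
    return assignments[name]


actors = ['Jack Nicholson', 'Johnny Depp', 'Meryl Streep']
first = actor_for('jilles-soeters', actors)
# Every later lookup for the same name gives back the same actor
repeats = {actor_for('jilles-soeters', actors) for _ in range(10)}
```

After the first call, repeats only ever contains that one actor, which is exactly what the copycat apps were missing.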

That's it for the models. Now we need to get the actors and actresses.

Gathering the data

A quick Google search for "best actors" and "best actresses" turned up two great IMDb lists.

Great, all I needed now was to put those 2 lists into my database (still a simple SQLite DB for this small app). I had used BeautifulSoup4 before. It is "a Python library for pulling data out of HTML and XML files". Perfect.

Both IMDb lists share the same HTML structure, so one script works for both URLs.

As you can see in the image, we need the elements with the list_item class. From each of those we only need the img[src] attribute and the text of the b > a link. The script to populate the data is around 20 lines and lives in a ./scripts/ folder next to the models.py file.
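
To make that structure concrete without touching the network, here is a small sketch that pulls the same two values out of a made-up snippet. It uses Python's built-in html.parser instead of BeautifulSoup so it runs without extra packages; the sample markup and the ListItemParser class are assumptions that only mirror the structure described above, not IMDb's real page.

```python
from html.parser import HTMLParser

# Hypothetical markup mirroring the described structure; not copied from IMDb.
SAMPLE_HTML = """
<div class="list_item">
  <img src="https://example.com/jack.jpg">
  <b><a href="/name/1">Jack Nicholson</a></b>
</div>
<div class="list_item">
  <img src="https://example.com/johnny.jpg">
  <b><a href="/name/2">Johnny Depp</a></b>
</div>
"""


class ListItemParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.items = []        # collected (name, img_url) tuples
        self._img_url = None   # image of the list_item currently being read
        self._in_b = False
        self._in_name_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div' and 'list_item' in (attrs.get('class') or '').split():
            # A new list item starts: forget the previous image
            self._img_url = None
        elif tag == 'img' and self._img_url is None:
            self._img_url = attrs.get('src')
        elif tag == 'b':
            self._in_b = True
        elif tag == 'a' and self._in_b:
            self._in_name_link = True

    def handle_endtag(self, tag):
        if tag == 'b':
            self._in_b = False
        elif tag == 'a':
            self._in_name_link = False

    def handle_data(self, data):
        # Only text inside a "b > a" link counts as the actor's name
        if self._in_name_link and data.strip():
            self.items.append((data.strip(), self._img_url))


parser = ListItemParser()
parser.feed(SAMPLE_HTML)
actors = parser.items
```

Here actors ends up as a list of (name, image URL) pairs, one per list_item.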

[scripts/populate_db.py]

# The HTML parser
from bs4 import BeautifulSoup  
# Fetching imdb's source
from urllib import request

# The model to save our data to
from ..models import Actor

men_list = 'http://www.imdb.com/list/ls050274118/'  
women_list = 'http://www.imdb.com/list/ls000055475/'


# Entry point called by manage.py runscript (a django-extensions command)
def run():

    def fetch_data_and_populate(imdb_list_url, gender):
        html = request.urlopen(imdb_list_url).read()
        soup = BeautifulSoup(html, 'html.parser')
        # A list with BeautifulSoup items to iterate over
        actor_items = soup.find_all('div', {'class': 'list_item'})
        new_actors = []
        for actor in actor_items:
            # Finds .list_item b a and its content
            name = actor.b.a.text
            # Finds .list_item's img tag and its source
            img_url = actor.img['src']
            # Create a new Actor object
            new_actors.append(Actor(name=name, img_url=img_url, gender=gender))

        # Save all models in 1 query instead of 100
        Actor.objects.bulk_create(new_actors)

    fetch_data_and_populate(men_list, 'M')
    fetch_data_and_populate(women_list, 'F')

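Now all that's left is to run the script :) Note that runscript is not built into Django; it comes from the django-extensions package, which has to be listed in INSTALLED_APPS.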

python manage.py runscript populate_db  

It takes a few seconds and voila, the actors!

And of course, the actresses!

Time to start the next step.

Guessing the gender from a name

One of the things I love about Python is PyPI. I guess you can compare it to npm for Node. I found a great library called "gender-guesser". After downloading it I tested it quickly using the Python REPL. If it recognised my rather uncommon name as male I would be sold.

Awesome! We're sold. Let's dive into the actual app code!

Creating the view

We only have one view, the actor view. I decided that domain.com/actor/firstname-lastname would be best. It's easy to understand for the average Facebook user and the URL should be fairly unique for every user.
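
A small sketch of how such a URL part breaks down into a lookup key and a first name. Both helper names are hypothetical, and lower-casing the key is an extra assumption that is not in the view itself; it would make Jilles-Soeters and jilles-soeters count as the same visitor.

```python
def lookup_key(slug):
    """Normalise the URL part so casing doesn't create a second visitor."""
    return slug.lower() or 'unknown'


def first_name(slug):
    """'jilles-soeters' -> 'Jilles' (capitalised for the gender lookup)."""
    return slug.split('-')[0].title()
```

For example, first_name('jilles-soeters') gives 'Jilles', and an empty URL part falls back to the key 'unknown'.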

[views.py]

# Random number generator to select a random entry 
from random import randint  
# Our awesome gender guesser
import gender_guesser.detector as gender_detector

# Django supplied Class Based View to show a template
from django.views.generic import TemplateView

from .models import Actor, Person


class ActorView(TemplateView):

    template_name = 'actor_view.html'

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)

        # The name from the URL
        self.name = kwargs['name'] or 'unknown'
        # Check if we already have this person. 
        # We don't want a new result for the same name!
        visitor = Person.objects.filter(name=self.name).first()

        if not visitor:
            visitor = self.create_new_actor()

        context['actor'] = visitor.actor
        return context

    def create_new_actor(self):
        # We need just the first name to match against our actors
        first_name = self.get_first_name()

        # Filter for actors whose name starts with the first name
        # e.g. John would get ['John Wayne', 'Johnny Depp', 'John Goodman']
        # istartswith is a case-insensitive startswith
        name_matches = Actor.objects.filter(name__istartswith=first_name)

        count = len(name_matches)
        if count == 1:
            # If we have 1 Actor, use that one
            actor = name_matches.first()
        elif count > 1:
            # If there's more than one, use a random one *from those*
            actor = name_matches[randint(0, count - 1)]
        else:
            # No name match: fall back to guessing the gender
            guesser = gender_detector.Detector()
            guessed_gender = guesser.get_gender(first_name)
            if guessed_gender in ('male', 'mostly_male'):
                # Woohoo we can get a gender accurate actor!
                actor = self.get_random_actor('M')
            elif guessed_gender in ('female', 'mostly_female'):
                actor = self.get_random_actor('F')
            else:
                # 'unknown' or 'andy' (androgynous): just use any actor
                actor = self.get_random_actor()

        return Person.objects.create(name=self.name, actor=actor)

    def get_first_name(self):
        # Returns 'Jilles' for 'jilles-soeters'
        # gender-guesser returns 'unknown' if the name doesn't start with a capital letter
        return self.name.split('-')[0].title()

    @staticmethod
    def get_random_actor(gender=None):
        # Get all our Actor objects, perhaps filtered by gender
        qs = Actor.objects.filter(gender=gender) if gender else Actor.objects.all()
        total_count = qs.count()
        random_actor = qs[randint(0, total_count - 1)]
        # Return a random one
        return random_actor
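
One detail worth spelling out: gender-guesser can return more than male, female and unknown. Its documentation also lists mostly_male, mostly_female and andy (androgynous). Below is a minimal sketch of one way to map all six values onto the model's 'M' / 'F' choices. The to_model_gender helper is hypothetical, and grouping the "mostly" values with their plain counterparts is an assumption.

```python
# Assumed grouping of gender-guesser's result strings onto Actor.GENDERS keys
GENDER_MAP = {
    'male': 'M',
    'mostly_male': 'M',
    'female': 'F',
    'mostly_female': 'F',
}


def to_model_gender(guess):
    """Return 'M', 'F', or None when the guess is 'unknown' or 'andy'."""
    return GENDER_MAP.get(guess)
```

The result could be passed straight to get_random_actor, since None already means "any gender" there.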

This is all the logic we need to make an intelligent guess. Now we just need to create the actual template. Facebook really just cares about our Open Graph meta tags to show the preview.

[templates/actor_view.html]

<!doctype html>  
<head>  
<meta property="og:image" content="{{ actor.img_url }}" />  
<meta property="og:title" content="{{ actor.name }}" />  
</head>  
...

The actor object comes from the view, which sets context['actor'] = visitor.actor.

Creating our URL

This is the last step before releasing our app into the wild: a very simple URL handler, but with a regex that accepts names like "François" and "Esmé".

from django.conf.urls import url

from .views import ActorView

urlpatterns = [  
    # (?i) has to come first: Python 3.11+ rejects inline flags placed mid-pattern
    url(r'(?i)^actor/(?P<name>[a-zàâçéèêëîïôûùüÿñæœ-]{0,75})', ActorView.as_view()),
]
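
The pattern can be tried out with plain re, outside Django. This is only a sketch: name_from_path is a hypothetical helper, and the (?i) flag is written at the very start of the pattern because Python 3.11 and later reject global inline flags anywhere else.

```python
import re

# Same character class as the URL pattern, with the case-insensitive flag first
NAME_URL = re.compile(r'(?i)^actor/(?P<name>[a-zàâçéèêëîïôûùüÿñæœ-]{0,75})')


def name_from_path(path):
    """Return the captured name, or None if the path doesn't match."""
    match = NAME_URL.match(path)
    return match.group('name') if match else None
```

Because of the {0,75} quantifier, name_from_path('actor/') returns an empty string rather than None, which is why the view falls back to 'unknown'.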

That's it! I ran Django's migrations and deployed the app to my super professional domain name *ahem*.

The result

I created a new post, shared it only with myself so I wouldn't bother my friends, and tried some names.

I'm pretty happy with the result. Django is awesome. The name guessing library is great and I had a lot of fun making this! The repo is on GitHub here if you want to play around with it yourself :)


If you liked it, don't hesitate to message me on Twitter @Jilles.