I'm not a robot

I have a confession - when I started using Duolingo 1,105 days ago, I was looking for a quick fix. I thought it would be an easy and painless way to consistently revise my Mandarin, making my GCSE a breeze. Tragically, spending a couple of minutes a day being bullied by an owl wasn’t quite the magic path to total fluency I had been optimistically hoping for. And yet - I’m still using Duolingo every single day, which is no mean feat considering the sizeable graveyard of educational apps taking up my phone storage, and the fact that I don’t even study Mandarin anymore. When I saw a blog post about Birdbrain (one of Duolingo’s AI models, used to generate personalised lesson content), I realised there was so much more going on behind the cute characters with their goofy phrases.

Way back in 2000, the chief scientist of Yahoo! (one of the most valuable internet companies in the world, at the time) visited Carnegie Mellon to give a talk: 10 problems Yahoo! didn’t know how to solve. In the audience was Luis von Ahn, a postgraduate student looking for a topic for his first research project, who found inspiration in Yahoo!'s struggles with spam emails.

Widespread adoption of email in the late nineties quickly made it marketers' new favourite method for low-cost outreach. By 2000, regulation still hadn’t caught up with the new technology and spam was rife - not only adverts and promotions, but also more malicious campaigns like phishing schemes and the ILOVEYOU computer worm. Yahoo! Mail was growing rapidly, and whilst the company encouraged this growth in the hope of dominating the market, the spam problem became harder to solve with each new account. They had tried limiting the number of emails a single account could send in one day, but there was nothing they could do to stop people just making loads of accounts (and sending the spam from all of them). A human is relatively slow at setting up an email account, and can only create so many before getting exceptionally bored, but spammers were writing programs to automate the process - getting the computer to generate new email accounts and send innumerable spam emails.
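
For the programmers reading, here's a rough sketch of why a per-account limit doesn't help once sign-ups can be automated. Everything in it is made up for illustration: the limit of 500, the `spam_run` function and the `bot…@example.com` addresses are my own assumptions, not anything Yahoo! actually used.

```python
from collections import Counter

DAILY_LIMIT = 500  # hypothetical cap on emails per account per day


def spam_run(num_accounts, emails_per_account, daily_limit=DAILY_LIMIT):
    """Count how many emails get through when every account is capped."""
    sent_today = Counter()  # emails sent so far, keyed by account
    delivered = 0
    for n in range(num_accounts):
        account = f"bot{n}@example.com"  # made-up address for each automated sign-up
        for _ in range(emails_per_account):
            # The limit is checked per account, so it resets with every new account.
            if sent_today[account] < daily_limit:
                sent_today[account] += 1
                delivered += 1
    return delivered


one_account = spam_run(num_accounts=1, emails_per_account=2000)
many_accounts = spam_run(num_accounts=1000, emails_per_account=500)
print(one_account, many_accounts)
```

A single account is stopped at the cap however many emails it tries to send, but a thousand automated accounts each staying under the cap get everything delivered. The limit only bites if creating accounts is slow, which is exactly what the spammers' programs fixed.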

Luis figured that as long as there was a way to differentiate between a human trying to set up a legitimate email account and a spammer’s computer, Yahoo!’s problem would become significantly more manageable. But how do you identify the humans? How can they prove that they’re not a robot?

Luis sent Yahoo! an email explaining his proposed solution: a Completely Automated Public Turing Test to Tell Computers and Humans Apart, or CAPTCHA for short. Computers can generate a CAPTCHA, taking a random assortment of letters and distorting them, so humans can read it but computers can’t. By inserting a CAPTCHA on the sign up page, the user’s “humanness” would be automatically verified every time someone tried to set up a new email.
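
To make the idea concrete, here's a minimal sketch of the two halves of a CAPTCHA check: generating a random challenge and verifying the typed answer. It's my own simplification rather than Luis's actual implementation, the function names `new_challenge` and `is_human` are invented, and it skips the crucial part (drawing the letters as a distorted image), because that needs an image library.

```python
import secrets
import string


def new_challenge(length=6):
    """Pick a random assortment of uppercase letters for the user to read back."""
    return "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))


def is_human(expected, typed):
    """Pass the check only if the typed answer matches, ignoring case and stray spaces."""
    return typed.strip().upper() == expected


challenge = new_challenge()
# A real CAPTCHA would now render `challenge` as a warped, noisy image,
# so a person can read it but a program can't simply copy the text.
print(is_human(challenge, challenge.lower()))   # a correct answer, typed in lowercase
print(is_human(challenge, challenge + "X"))     # a wrong answer
```

The server keeps the expected answer to itself and only ever shows the distorted picture, so the sign-up goes through only when someone has actually read the letters.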

Yahoo! were using CAPTCHAs within a week.

And soon, so was everyone else. Other websites saw these squiggly letters on the front page of Yahoo! (again - a much bigger deal back then), and quickly developed their own versions to validate their users - sign ups, comment sections, mailing lists, you name it. But Luis was bothered by this widespread adoption. It wasn’t long before people started considering CAPTCHAs the annoying hurdles we know today, and Luis estimated CAPTCHA was wasting 500,000 hours of humanity’s time every single day. If people were going to spend their time completing CAPTCHAs, couldn’t some computational value at least be extracted from the effort?

ReCAPTCHA launched in 2007 - this time, the wonky words being shown came from scanned book pages, so every reCAPTCHA typed would be helping to digitise the world’s literary works.

A small social networking site called Facebook added a reCAPTCHA to their sign up page, and soon they were gaining so many new users that a year's worth of the New York Times archive could be digitised in a single week. Within the first year, 440 million words had been deciphered through reCAPTCHA, and in 2009 it was bought by Google (who used it for the words in their Google Books scans that were too distorted to be understood by Optical Character Recognition).

Having sold reCAPTCHA to Google, Luis started working on Duolingo with postgrad Severin Hacker. Neither was a native English speaker, so both had first-hand experience of how beneficial learning another language could be - in most countries, knowing English doubles a person's earning potential. Both were adamant about providing free education, and ensured the company’s key focus would always be on constantly improving the learning experience, drawing on both psychology and computer science to make their product the best it can be.

Luis is still running Duolingo, now the world's most popular education platform with 500 million users. Educating the world for free is an ambitious goal, and one that many have failed at before. But Duolingo are trying to do it differently, focussing on making people want to learn - the fun characters and phrases, leaderboards, and challenges make the learning experience feel more like a game than a lesson - and like many games, it's an addictive one. If the heartbreak of losing a streak is the extra motivation boost needed to keep someone studying, then Duolingo have mastered the art of education.

If you'd like to learn more about Duolingo's latest ventures, I'd recommend their blog, and for a fuller version of Luis's story, this podcast interview.
