Earlier this week I wrote about the first company based on open source to reach a turnover of one billion dollars. But of course, there are lots of multi-billion dollar turnover companies that are based on open source - Google, Facebook, Twitter etc. - it's just that they don't make money off it directly.
Another company that is open source through and through has done rather well recently. Everyone and their dog must know by now that Instagram is being bought by Facebook for $1 billion, but less well known is the fact that the company is built entirely on open source, as its own engineers explain in a post describing their setup:
We run Ubuntu Linux 11.04 (“Natty Narwhal”) on Amazon EC2. We’ve found previous versions of Ubuntu had all sorts of unpredictable freezing episodes on EC2 under high traffic, but Natty has been solid. We’ve only got 3 engineers, and our needs are still evolving, so self-hosting isn’t an option we’ve explored too deeply yet, though is something we may revisit in the future given the unparalleled growth in usage.
There are a number of things to note there. First, that Instagram is running on Ubuntu, not always appreciated enough as an enterprise system capable of running a company's entire computing infrastructure, and not just its desktops. Second, that Ubuntu is being run on Amazon's EC2: I suspect this is becoming more and more popular, since it combines the rock-solid stability of GNU/Linux with the effortless scalability of the cloud computing approach. Finally, Instagram has only three engineers running the entire operation - further testimony to the generally trouble-free nature of open source, even for a large company, and the low staffing costs that this implies.
Recently, we moved to using Amazon’s Elastic Load Balancer, with 3 NGINX instances behind it that can be swapped in and out (and are automatically taken out of rotation if they fail a health check).
That's a further sign of the continuing rise of nginx for large-scale operations.
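For readers who want to see what "taken out of rotation if they fail a health check" means in practice, here is a minimal Python sketch of the idea. It is not how Amazon's Elastic Load Balancer or nginx is implemented; the `Backend` and `Rotation` classes and the `nginx-1` style names are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A hypothetical backend server, e.g. one of the three nginx instances."""
    name: str
    healthy: bool = True


class Rotation:
    """Round-robin over backends, skipping any that failed a health check."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = 0  # counter used for round-robin selection

    def mark(self, name, healthy):
        # Record the result of a health check for the named backend.
        for backend in self.backends:
            if backend.name == name:
                backend.healthy = healthy

    def in_rotation(self):
        # Only healthy backends are eligible to receive traffic.
        return [b.name for b in self.backends if b.healthy]

    def pick(self):
        live = self.in_rotation()
        if not live:
            raise RuntimeError("no healthy backends available")
        choice = live[self._next % len(live)]
        self._next += 1
        return choice


rotation = Rotation([Backend("nginx-1"), Backend("nginx-2"), Backend("nginx-3")])
rotation.mark("nginx-2", False)   # nginx-2 fails its health check
print(rotation.in_rotation())     # nginx-2 no longer receives traffic
```

Marking `nginx-2` healthy again puts it straight back into rotation, which is what makes instances easy to swap in and out.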
We run Django on Amazon High-CPU Extra-Large machines, and as our usage grows we’ve gone from just a few of these machines to over 25 of them (luckily, this is one area that’s easy to horizontally scale as they are stateless).
Most of our data (users, photo metadata, tags, etc) lives in PostgreSQL; we’ve previously written about how we shard across our different Postgres instances. Our main shard cluster involves 12 Quadruple Extra-Large memory instances (and twelve replicas in a different zone.)
So much for that old FUD that open source solutions can't handle the biggest enterprise tasks.
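The quoted post does not spell out how the sharding works, so the following is only a rough Python sketch of one common approach: map each user to a logical shard, then map logical shards onto the physical database instances. The figure of 12 instances comes from the quote above; the number of logical shards, the modulo scheme, and the function names are my own assumptions, not Instagram's actual design.

```python
NUM_LOGICAL_SHARDS = 1024  # assumption: an arbitrary illustrative value
NUM_INSTANCES = 12         # from the quoted post: 12 main database instances


def logical_shard(user_id: int) -> int:
    """Map a user ID to a logical shard (hypothetical modulo scheme)."""
    return user_id % NUM_LOGICAL_SHARDS


def instance_for(user_id: int) -> int:
    """Map a user ID to the index of the physical instance holding its shard."""
    return logical_shard(user_id) % NUM_INSTANCES


for uid in (0, 13, 1029):
    print(uid, "-> logical shard", logical_shard(uid), "-> instance", instance_for(uid))
```

The point of the extra level of indirection is that logical shards can later be moved between machines without changing which shard a given user belongs to.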
The rest of the post provides fascinating insights into how Instagram handles its large number of users and all their data. But the key takeaway is this: if you are a startup, you would be crazy not to use open source products, probably running them in the cloud for future scalability. Indeed, I suspect that there are very few startups of any kind that would use anything else by now. After all, why pay for costly closed-source infrastructure software when all the high-profile success stories are running open source from top to bottom? Do you really want to bet they're doing it wrong?
Instagram may not be a billion-dollar turnover success story like Red Hat, but it is indubitably yet another huge success for open source, and it's unlikely to be the last.