4 killer apps that were built using DevOps

March 15, 2016 andrew.morris

 

It used to take nine months or longer to deploy an application, with months’ worth of lead time in the run-up.

It’s marginally quicker to walk from Istanbul to Edinburgh than it was to make any significant updates to your application.

So some smart people sat down to figure out how to reduce deployment time. Reducing deployment time is incredibly powerful because it shortens the feedback loop, which means that the features that customers want (or don’t want) are built (or removed) as quickly as possible.

What resulted snugly fits under what we now know as DevOps – a buzzword for the kind of IT culture that fosters collaboration and integration with the aim of accelerating the application lifecycle.

Several players have really shown the market what is possible and inspired many other companies to change their own cultures.

Here are the top four.


Flickr

Flickr was the first to announce they were following DevOps principles, achieving 10 deployments a day all the way back in 2009.

At that time, about 40,000 photos per second were being uploaded to their platform. Under those conditions, you have to be flexible.

Flickr accomplishes this through an automated testing cycle at all levels of the software stack in a realistic staging environment. If the code passes the test, it is then tagged, released, built, and pushed into production.
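The gate described above, where code is only tagged, released, built and pushed once the tests pass, can be sketched roughly as follows. This is a minimal illustration in Python and not Flickr’s actual tooling: the function name `run_pipeline` and the stage labels are assumptions made up for this example.

```python
from typing import Callable, List

# Hypothetical stage labels, in the order the text lists them.
POST_TEST_STAGES = ["tag", "release", "build", "push to production"]


def run_pipeline(tests: List[Callable[[], bool]]) -> List[str]:
    """Return the stages a change would pass through.

    Every test must pass before any later stage runs; a single
    failure stops the change from going any further.
    """
    for test in tests:
        if not test():
            # A failed test blocks the deployment entirely.
            return []
    return ["test"] + POST_TEST_STAGES


if __name__ == "__main__":
    passing = [lambda: True, lambda: 1 + 1 == 2]
    failing = [lambda: True, lambda: False]
    print(run_pipeline(passing))
    print(run_pipeline(failing))
```

The point of the sketch is the ordering: nothing reaches production unless every automated check has passed first, which is what makes many small deployments a day safe.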

John Allspaw and Paul Hammond, who were instrumental in institutionalising the winning DevOps culture at Flickr, gave a famous presentation about how they achieved this: http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooper....

These guys demonstrated the huge commercial advantages that can be achieved with DevOps and paved the way for others to follow.

Netflix

Netflix took the DevOps baton and ran with it.

They even built their own in-house PaaS-style platform that permits members from each team to deploy the infrastructure they need whenever they want.

A lot of Netflix’s infrastructure is open source and available on GitHub. Their game plan is to one day release all of their resources as open source so that other companies can benefit from their hard work.

In November 2015 they announced the release of ‘Spinnaker’, their ‘open source multi-cloud Continuous Delivery platform for releasing software changes with high velocity and confidence’.

Jason Chan, cloud security architect at Netflix, summarises their view on DevOps: “we have to make humans more effective via automated decision-making, automated data gathering, and analysis. You really need to help get what’s most important in front of people as quickly and easily as possible, so you’re using your human resources as effectively as possible.”

Somewhat uniquely, Netflix try to make life as difficult as possible for themselves, to ensure that they can deal with the worst eventualities. They created what they call the ‘Netflix Simian Army’: chaotic (virtual) primates that randomly induce failure in Netflix’s systems.

The ‘Chaos Monkey’, for example, randomly incapacitates servers. The ‘Chaos Gorilla’ replicates the effects of an entire AWS availability zone going down. The ‘Conformity Monkey’ finds instances that don’t adhere to accepted best practice and shuts them down.
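To make the idea concrete, here is a toy sketch of the Chaos Monkey principle in Python. It is not Netflix’s code and does not touch any real cloud API: the fleet is just an in-memory dictionary, and the function name `unleash_chaos_monkey` and the instance names are hypothetical.

```python
import random
from typing import Dict, Optional


def unleash_chaos_monkey(instances: Dict[str, bool],
                         rng: random.Random) -> Optional[str]:
    """Pick one running instance at random and mark it as down.

    `instances` maps a (hypothetical) instance name to True while
    it is running. Returns the victim's name, or None if nothing
    is running.
    """
    running = sorted(name for name, up in instances.items() if up)
    if not running:
        return None
    victim = rng.choice(running)
    instances[victim] = False  # simulate the server being incapacitated
    return victim


if __name__ == "__main__":
    fleet = {"web-1": True, "web-2": True, "web-3": True}
    victim = unleash_chaos_monkey(fleet, random.Random())
    print(f"Chaos Monkey took down {victim}")
    print(f"Still running: {[n for n, up in fleet.items() if up]}")
```

A system that keeps serving customers while this runs regularly has shown that it can cope with losing a server, rather than merely being assumed to.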

It must drive the DevOps guys completely…nuts. Right?

Etsy

But ultimately it’s not just about the frequency of deployment, as Etsy have demonstrated over the years. At around 50 deploys a day, they don’t deploy as often as some of the big guns (e.g. Amazon), although that is still pretty frequent!

Just check out what Etsy’s own team had to say: “[w]e want to be able to scale the number of deploys we’re doing with how quickly the rest of the teams are moving. So if a designated ops or development team starts feeling some pain, we’ll look at how we can improve the process. We want to make sure we’re getting the features out we want to get out and if that means we have to deploy faster, then we’re going to solve that problem. So it’s not around the number of deploys.”

Every developer has their own kernel-based virtual machine, which means that everyone has their own full Etsy stack, as well as access to every single monitoring dashboard. At the same time, there is a policy that everything that can be tracked in a graph, is tracked in a graph – ‘Kale’ is a piece of bespoke software that searches for anomalous patterns in the mounds of data that Etsy produces.

Testing is emphasised and made easy through the use of ‘Try’, a tool they developed that allows for quick, reliable testing (over 14,000 per day). Lastly, they developed their own deployment tool (the excellently-dubbed ‘Deployinator’), which allows for single-click deployments.

This is underpinned by the use of IRC to create a collaborative culture from the ground up. There is no DevOps group at Etsy – it’s integrated into the whole IT department.

Has it worked?

Etsy’s 54 million members would probably say ‘yes’.

Amazon

Amazon used to waste about 40 per cent of its server capacity. That’s what happens when you have dedicated servers and are trying to predict the future (i.e. guess traffic demands for the coming months).

The online retailer moved to its own cloud (AWS), which allowed engineers to simply flex capacity up or down as needed. This helped make Amazon’s server usage much more efficient but, more importantly, set them up for a transition to a culture of continuous deployment that put them in the record books of DevOps.

Within a year of Amazon’s move to AWS, engineers were deploying code every 11.7 seconds, on average. That’s around 7,400 deployments every 24 hours. They have had 75% fewer outages since 2006, 90% fewer outage minutes, and only 0.001% of deployments cause a problem.
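The daily figure follows directly from the 11.7-second interval. As a quick check, here is the arithmetic in Python; the function name `deployments_per_day` is just a label chosen for this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in 24 hours


def deployments_per_day(seconds_between_deploys: float) -> float:
    """Average number of deployments in 24 hours at a given interval."""
    return SECONDS_PER_DAY / seconds_between_deploys


if __name__ == "__main__":
    per_day = deployments_per_day(11.7)
    print(f"{per_day:,.0f} deployments per day")
```

At one deployment every 11.7 seconds this works out to roughly 7,385 deployments a day. Over the nine months mentioned at the start of the article (taken here as about 270 days, which is an assumption), that adds up to just under two million.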

And that’s how you become one of the most successful companies in the world.

How times have changed

Nowadays, in the time it might take you to walk from Istanbul to Edinburgh, Amazon will have deployed about 2,000,000 times.

Takeaway: companies today move fast.

Correction: successful companies today move fast.

The above companies have fundamentally changed customer expectations. All companies are now software-driven and, to compete, they must quickly and efficiently respond to the marketplace, preferably on the basis of rapid A/B testing, monitoring and short feedback loops.

 
