One development team’s journey into DevOps (Part 2)


In my last post, we talked about the first principle in continuous delivery: building a good safety net. We started with a fast and reliable rollback system so we could quickly get healthy again when we made mistakes.

For this post, I am going to talk about step two: building a culture of test automation. This one seems pretty obvious, right? After all, who would argue with having good test automation for our product’s features so that we could be sure we were doing the job well?  Well, nobody would really argue, but resistance showed up in lots of passive forms:

  • “We have deadlines to meet and won’t get all of our code done if we also have to write a bunch more test automation.”
  • “Talk to ‘the other guys’ who have the job to make that stuff.”
  • “Running test automation will slow down our end-to-end build and therefore our team’s productivity will go down.”

Given these reactions from the team, we developed a sort of “Excuse Buster” set of principles and responses:

  1. Deadlines can be like the call of a siren – watch out!  This one seems obvious, and it is the subject of pretty much every book or article you read about good software development.  We all know that if you don’t bake quality in from the start, it is far more costly to try to beat it in towards the end.  Despite this, the illusion of the nearest milestone is so tempting – maybe just this time I can slam my code into the library and it will be fine.  I know what I am doing! Before adopting a DevOps model, this kind of thinking simply killed us.  Our build success rates were very low, and because so many changes were going in each day, tracking down those build problems was exhausting and slow.  The team simply thrashed.  The other disastrous side effect is that you actually create far more uncertainty about when you will be done in the later parts of your cycle, which is also the worst time to be discussing your ability to hit deadlines with your stakeholders.  Testing up front allows you to really predict where you will land – you know what you have, and you know that it works.
  2. The ‘Other Guys’ aren’t coming.  They are you!  In the end, this is about accountability – do you as a developer feel truly accountable for delivering high-quality, highly consumable features for a user?  If you do, you will want to spend time on great test automation to prove to yourself that you can be proud of your work.  If you don’t, then you think it is someone else’s job (which really means you think we are only doing this because some manager said so).  Again, the readings are clear on this topic – if you think from the beginning about how to test something and prove that it works well, your designs and implementations will benefit from that thinking process alone.  Add to that a good test harness that proves it day after day and ensures that changes all over the system don’t ripple side effects into your feature, and you can sleep with a smile on your face each night.
  3. Builds have to be like breathing – you don’t even think about it.  If you run test automation suites after your builds are finished, it is certainly true that it will take longer before a “green build” can be declared.  The real question you have to ask is “how many builds are going to break, and what’s the productivity cost of each?”  As I said before, our team simply thrashed – we had 30 or 40 code changes going in each day.  So we would finish a build, run a minimal “sniff test,” and then hand it over to the test team.  They would install it in several environments (which would take many hours), start to test, and then after an hour or so run into some significant blocking issue.  The chief programmer would then start collecting evidence (blood spatter, hair samples) in order to find the guilty party and bring them to justice.  That might take a couple more hours.  Then the person would make the fix and check it into the library.  Respin.  Install again.  Begin the tests again.  Meanwhile, the rest of the team was prevented from checking in other changes that were needed to expose other tests.  In the end, this was an incredibly costly way to do business – far too many builds, far too long a cycle to find a blocking problem, and that cost multiplied quickly given the number of different environments and topologies that we needed to try.  Implementing a strong automated test system, so that we were very confident in each build before it was used, tested and enhanced, saved us countless hours per week across hundreds of team members worldwide.  It’s the difference between playing offense on your project and having to play defense.  (A minimal sketch of this kind of automated build gate follows this list.)
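
To make that idea concrete, here is a minimal sketch in Python of a build gate: a build is only declared “green” when the automated suites pass, so nothing gets handed off for installation while a blocking problem is still in the library.  The make and pytest commands and the tests/smoke and tests/regression suite names are placeholders chosen for illustration – they are not a description of our actual pipeline.

    #!/usr/bin/env python3
    """Sketch of a build gate: only declare a build "green" when the
    automated suites pass.  The build and test commands below are
    placeholders, not an actual pipeline."""

    import subprocess
    import sys

    # Hypothetical stages: compile, then the fast smoke suite, then the
    # broader regression suite.  Order matters - fail fast on cheap checks.
    STAGES = [
        ("build", ["make", "build"]),
        ("smoke tests", ["pytest", "tests/smoke", "-q"]),
        ("regression tests", ["pytest", "tests/regression", "-q"]),
    ]

    def main() -> int:
        for name, cmd in STAGES:
            print(f"--- running {name}: {' '.join(cmd)}")
            result = subprocess.run(cmd)
            if result.returncode != 0:
                # A red stage stops the line before anyone installs the build.
                print(f"BUILD NOT GREEN: {name} failed (exit {result.returncode})")
                return result.returncode
        print("BUILD GREEN: safe to hand off for install and further testing")
        return 0

    if __name__ == "__main__":
        sys.exit(main())

The point of a gate like this is not the tooling itself – it is that nobody has to decide whether a build is worth installing; the answer falls out of the automation.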

The method for change

In the end, we chose to dictate this from the top to overcome our passive resistance.  We needed to give our team members “permission” to invest in test automation by stating it as a mandatory part of their job.  Code check-ins could not happen without this also being present.  This eliminated much of the deadline tradeoff problem and also aligned the accountability and pride where we wanted it – with each of our team members.  To overcome the startup costs of putting this in place (the ROI problem), we also defined a baseline of tests that had to be done immediately – before any further enhancements could be checked in.  This baseline, although not exhaustive, represented our main use cases and allowed us to start to capture 80 to 90 percent of our problems immediately.  After that, all check-ins from that point forward were required to include test automation to cover that feature or that bug fix.  Code committers reviewing those check-ins were asked to also review the test automation to ensure appropriate coverage.
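
Here is a minimal sketch of what that kind of check-in gate can look like, again in Python: a change that touches product code but brings no test automation with it is rejected before it reaches the library.  The src/ and tests/ directory layout, the origin/main branch name and the git commands are assumptions made for illustration – they are not a description of our actual tooling.

    #!/usr/bin/env python3
    """Sketch of a check-in gate: reject a change that touches product code
    but adds or updates no test automation.  The repository layout and
    branch name below are assumptions, not an actual setup."""

    import subprocess
    import sys

    def changed_files(base="origin/main"):
        # Files touched by this change relative to the target branch.
        out = subprocess.run(
            ["git", "diff", "--name-only", base, "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return [line for line in out.stdout.splitlines() if line]

    def main() -> int:
        files = changed_files()
        touches_code = any(f.startswith("src/") for f in files)
        touches_tests = any(f.startswith("tests/") for f in files)

        if touches_code and not touches_tests:
            print("Rejected: product code changed, but no test automation "
                  "was added or updated alongside it.")
            return 1
        print("OK: change includes test coverage (or touches no product code).")
        return 0

    if __name__ == "__main__":
        sys.exit(main())

A gate like this only checks that tests arrived with the change; the human reviewers still have to judge whether those tests actually cover the feature or the bug fix, which is exactly the review step described above.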

Next time, I will talk about how we are applying the output of our cloud tools to change the way we do our development and support work.

What did you use to get your team to embrace test automation as a way of life?  Carrot or stick?


About Kendall Lock

Kendall is the Development Executive responsible for Cloud Provisioning and Orchestration Solutions, leading teams within the US, Canada, Ireland, Germany, Italy, India and China. In addition to delivering a family of SmartCloud offerings, Kendall has also driven the formation of the Common Cloud Stack, a shared Cloud management infrastructure combining assets and technologies from across IBM, built upon OpenStack as a foundation. Kendall has held a wide variety of positions in his IBM career, starting as a developer and later becoming the lead architect for IBM's Workgroup products. He was part of the team that acquired Lotus and then joined Lotus as a new manager, bringing his leadership to Lotus Notes as well as to connectivity with all of IBM's previous OfficeVision platforms. Kendall has held numerous executive positions ranging from Monitoring and Availability products to Application Management solutions. He was also the Director of IBM's Laboratory in Rome, Italy, responsible for both development and services within that region. Kendall is a huge music fan and an occasional basketball player. Kendall is a graduate of the University of Texas - Hook 'Em Horns!
This entry was posted in Managing the Cloud.

3 Responses to One development team’s journey into DevOps (Part 2)

  1. Elliot says:

    #1 is by far the most important. Good planning is planning for success. Poor management almost always gets the blame if a project goes south….

    • Kendall says:

      Agreed – you have to accept the investment that is needed and plan for it…including some learning curve if this is your first rodeo. What techniques do you use to plan appropriately?

  2. Rich Brumpton says:

    Great pair of posts, can we hope for more parts in the future?
