Testing Infrastructure Code BoF at LISA14
I need to pretty these up... but here are my notes from the BoF I ran at LISA14 on testing infrastructure code. "TK" is Chef's Test Kitchen; "Cf3" is CFEngine 3.
How are we testing code right now?
- Unit testing Cf3 bundles + lint checks in a git pre-commit hook (sketch below)
- Integration tests in Vagrant
- Coverage isn't great -- e.g., how do you test FC (Fibre Channel) card setup if the test machine has no FC card?
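A minimal sketch of what that pre-commit lint hook might look like -- my reconstruction, not code from the BoF; it assumes CFEngine's cf-promises is on the PATH and policy lives under policy/:

```ruby
#!/usr/bin/env ruby
# .git/hooks/pre-commit -- syntax-check every CFEngine policy file
# before allowing the commit. The policy/ path is a placeholder.
failed = Dir.glob('policy/**/*.cf').reject do |policy|
  system('cf-promises', '-f', policy)  # false if the lint check fails
end

unless failed.empty?
  warn "cf-promises rejected: #{failed.join(', ')}"
  exit 1  # a non-zero exit aborts the commit
end
```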
- Testing Puppet modules on a whole test machine
- Production web traffic gets pointed at it
- verified as it goes through test, stage, production
- Cf3, using workflows to progress changes
- Not sure if this was test -> staging -> prod, or git branch/merge/etc
- Chef cookbooks tested with Git, Gerrit (code review), and Jenkins
- Occasional problems with Test Kitchen
- Chef + Test Kitchen + Jenkins + chef lint
- struggling with integration testing -- so many VMs to spin up
- Stage -> Test -> Prod + Beaker
- struggling with how to test (scope, what level, how often, etc)
- ChefSpec -- test, lint, integration (unit-test sketch below)
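For reference, a bare-bones ChefSpec unit test looks something like this (the cookbook and package names are placeholders):

```ruby
# spec/unit/recipes/default_spec.rb
require 'chefspec'

describe 'mycookbook::default' do
  # Converge the recipe in memory -- no VM involved, so it's fast.
  let(:chef_run) do
    ChefSpec::SoloRunner.new(platform: 'ubuntu', version: '14.04')
                        .converge(described_recipe)
  end

  it 'installs the ntp package' do
    expect(chef_run).to install_package('ntp')
  end
end
```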
- Pester: TDD framework for PowerShell
- fairly manual
- Article on Pester, GitHub repo
- ChefSpec, then TK for end-to-end
- local commit -> verified by chef lint.
- Repeat as necessary...
- Finally, Test Kitchen; lots of VMs, takes a while
- then push to repo, merge to master, whatever
- Lots of little checks, then one larger, longer test right before the final merge (ServerSpec sketch below)
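The longer end-to-end step usually means Test Kitchen converging a VM and then running ServerSpec assertions against it; a minimal suite might look like this (the ntp example is mine, not from the BoF):

```ruby
# test/integration/default/serverspec/default_spec.rb
require 'serverspec'
set :backend, :exec  # run the checks directly on the converged machine

describe package('ntp') do
  it { should be_installed }
end

describe service('ntp') do
  it { should be_enabled }
  it { should be_running }
end
```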
- Cf3 pre-commit hook for linting (like the hook sketched above)
- Integration tests on a testing branch, run on long-running VMs to verify that things converge on existing machines
- So rather than (say) a vagrant destroy and fresh build every time, have a machine that runs for (say) weeks and push tests to that. Make sure your code works on machines like those in production -- you're not reinstalling every time Cf3 runs!
- Plans to start unit testing in Docker for speed
- Nice!
- Tests will run against yesterday's image, snapshotted
- So: run a Docker container for a while (weeks? a rolling sort of thing?), then snapshot it and test against that. Spin the container back up and it has the cumulative changes of the last few weeks, ready to test again.
- Testing against a pristine image takes much longer. (Snapshot sketch below.)
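My after-the-fact sketch of that rolling-snapshot flow; the container and image names are invented, and it assumes CFEngine is installed inside the container:

```ruby
#!/usr/bin/env ruby
# Snapshot a long-running container, then run a convergence test
# against the accumulated state instead of a pristine image.
container = 'cf3-soak'                                 # hypothetical long-running container
image     = "cf3-soak:#{Time.now.strftime('%Y%m%d')}"  # dated snapshot tag

system('docker', 'commit', container, image) or abort 'snapshot failed'

# Converge against weeks of accumulated drift; -K tells cf-agent to
# ignore its time-based locks so the run happens immediately.
system('docker', 'run', '--rm', image, 'cf-agent', '-K') or abort 'converge failed'
```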
- Chef; dev background, so fast feedback is king
- Gerrit code review. Very, very rigorous
- RuboCop, ChefSpec, Foodcritic (Rakefile sketch below)
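Those three usually get wired into a single Rake task so Gerrit/Jenkins can run one command; a minimal sketch (the task layout is mine, not necessarily theirs):

```ruby
# Rakefile -- one entry point for the lint/unit gate.
require 'rspec/core/rake_task'
require 'rubocop/rake_task'
require 'foodcritic'

RuboCop::RakeTask.new(:rubocop)              # style lint
FoodCritic::Rake::LintTask.new(:foodcritic)  # Chef-specific lint
RSpec::Core::RakeTask.new(:spec)             # ChefSpec unit tests

task default: [:rubocop, :foodcritic, :spec]
```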
- Testing with Bats works fine
- "I don't test. If I have an hour, I'll spend it writing a better
monitor. Tests are bullshit; production rules."
- Nathen trolling the room -- but he has a point. Don't forget that the app is the final result.
- Why test? Because it's a way to avoid reactive work. It also tells you where something broke, making problems easier to track down. And it forces you to examine the task and become more familiar with it.
How confident are you that your tests are useful, match the real world and catch problems?
- Yep, that's a problem.
- Always a work in progress. "You'll level up in time."
- Need to build confidence; loop through the process many times, but also build layers:
- Does this code do what I think it does?
- And does that actually accomplish the larger task I'm trying to do?
- The test that's written and executed is a hundred times better than a test that's never written and never executed.
- If post-mortems are well-documented, it helps a lot -- add tests for whatever caused the last outage.
- Get code review for your tests!
- Go to your team and brainstorm failure scenarios. Do this in isolation first, then come back together -- avoid everyone converging on the same ideas.
How to get people to start testing?
- Step zero: consistent dev environment.
- Have it standardized and ready to go. Reduce the friction. Use Docker/Vagrant/etc. for this, and use the file-sharing facilities these tools offer. (Vagrantfile sketch below.)
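If you go the Vagrant route, the standardized environment can be as small as this (box name and provisioning are placeholders); the synced folder is the file-sharing piece:

```ruby
# Vagrantfile -- one consistent dev environment for the whole team.
Vagrant.configure('2') do |config|
  config.vm.box = 'ubuntu/trusty64'  # placeholder base box
  # Share the working tree into the VM so edits on the host are
  # immediately visible to tests running in the guest.
  config.vm.synced_folder '.', '/vagrant'
  config.vm.provision 'shell', inline: 'apt-get update -y'
end
```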
- This is something I'm struggling with because I'm not a Ruby dev.
- Bats is nice and easy, but you have to get it going yourself.
- Testing in pipeline along the way (not sure what that note means)
- Have a consistent image for production, so you're limiting the surface of what you have to test -- same libs, same OS, etc.
Unicorn lands from outer space and offers you the best test framework in the universe. What does it look like?
- Faster and more automated.
- Easy to pick up and well-documented.
- Understands higher-level abstractions -- more DWIM (do what I mean)
- Cross-platform support (across OSes, not tied to one config mgt tool, etc.)
- No false positives -- and no false negatives!
- Unified. There's too much sprawl right now -- different layers, directories for tests, conventions, etc. Maybe this problem needs more/better tooling.
- Integrated with Bosun or other monitoring -- here's the test, please spit out Nagios configs (or whatever).
- Easy -- so you're not tempted to say "Fuck it, I'll spend the time improving monitoring instead." Or have it generate a Nagios check (like the previous example).
- Analyze config mgt code and generate tests from that code -- even if it only got you partway and you had to go back and fix it, this would be a great start.
- One person had a tool that did something like this: Puppet generates YAML listing the things it wants, and his code parsed that and generated ServerSpec tests. He can't release it, but it was pretty simple -- ~100 lines of code. (Hypothetical reconstruction below.)
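A hypothetical reconstruction of what such a generator might look like; the YAML schema and file names are invented, since the actual tool wasn't shared:

```ruby
#!/usr/bin/env ruby
# Read a YAML manifest of desired resources (invented schema) and
# emit a ServerSpec file asserting each one.
require 'yaml'

manifest = YAML.load_file(ARGV.fetch(0, 'resources.yml'))

File.open('generated_spec.rb', 'w') do |spec|
  spec.puts "require 'serverspec'"
  spec.puts 'set :backend, :exec'
  Array(manifest['packages']).each do |pkg|
    spec.puts "describe package('#{pkg}') do"
    spec.puts '  it { should be_installed }'
    spec.puts 'end'
  end
  Array(manifest['services']).each do |svc|
    spec.puts "describe service('#{svc}') do"
    spec.puts '  it { should be_running }'
    spec.puts 'end'
  end
end
```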
- Not having to switch frameworks to test another config mgt tool.
- Assertions that TK does this -- Steve Murawski built a DSC plugin for TK. That contradicts what I heard from a Chef dev at an earlier meetup...