Testing Infrastructure Code BoF at LISA14
I need to pretty these up... but here are my notes from the BoF I ran at LISA14 on testing infrastructure code. "TK" is Chef's Test Kitchen; "Cf3" is CFEngine 3.
How are we testing code right now?
- Unit testing Cf3 bundles + lint checks in a git pre-commit hook (sketch below)
- Integration tests in Vagrant
- Coverage isn't great -- e.g., how do you test FC (Fibre Channel) card setup if the test machine has no FC card?
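A minimal sketch of what that pre-commit lint hook might look like -- my reconstruction, not code from the BoF; it assumes CFEngine's cf-promises is on the PATH and policy lives under policy/:

```ruby
#!/usr/bin/env ruby
# .git/hooks/pre-commit -- syntax-check every CFEngine policy file
# before allowing the commit. The policy/ path is a placeholder.
failed = Dir.glob('policy/**/*.cf').reject do |policy|
  system('cf-promises', '-f', policy)  # false if the lint check fails
end

unless failed.empty?
  warn "cf-promises rejected: #{failed.join(', ')}"
  exit 1  # a non-zero exit aborts the commit
end
```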
- Testing Puppet modules on a whole test machine
- Production web traffic gets pointed at it
- verified as it goes through test, stage, production
- Cf3, using workflows to progress changes
- Not sure if this was test -> staging -> prod, or git branch/merge/etc
- Chef cookbooks tested with Git, Gerrit (code review), and Jenkins
- Occasional problems with Test Kitchen
- Chef + Test Kitchen + Jenkins + chef lint
- struggling with integration testing -- so many VMs to spin up
- Stage -> Test -> Prod + Beaker
- struggling with how to test (scope, what level, how often, etc)
- ChefSpec -- test, lint, integration (unit-test sketch below)
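For reference, a bare-bones ChefSpec unit test looks something like this (the cookbook and package names are placeholders):

```ruby
# spec/unit/recipes/default_spec.rb
require 'chefspec'

describe 'mycookbook::default' do
  # Converge the recipe in memory -- no VM involved, so it's fast.
  let(:chef_run) do
    ChefSpec::SoloRunner.new(platform: 'ubuntu', version: '14.04')
                        .converge(described_recipe)
  end

  it 'installs the ntp package' do
    expect(chef_run).to install_package('ntp')
  end
end
```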
- Pester: TDD framework for PowerShell
- fairly manual
- Article on Pester, GitHub repo
- ChefSpec, then TK for end-to-end
- local commit -> verified by chef lint.
- Repeat as necessary...
- Finally, Test Kitchen; lots of VMs, takes a while
- then push to repo, merge to master, whatever
- Lots of little checks, then one larger, longer test right before the final merge (ServerSpec sketch below)
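The longer end-to-end step usually means Test Kitchen converging a VM and then running ServerSpec assertions against it; a minimal suite might look like this (the ntp example is mine, not from the BoF):

```ruby
# test/integration/default/serverspec/default_spec.rb
require 'serverspec'
set :backend, :exec  # run the checks directly on the converged machine

describe package('ntp') do
  it { should be_installed }
end

describe service('ntp') do
  it { should be_enabled }
  it { should be_running }
end
```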
- Cf3 pre-commit hook for linting (like the hook sketched above)
- Integration tests on a testing branch, run on long-running VMs to verify that things converge on existing machines
- So rather than (say) a vagrant destroy and fresh build every time, have a machine that runs for (say) weeks and push tests to that. Make sure your code works on machines like those in production -- you're not reinstalling every time Cf3 runs!
- Plans to start unit testing in Docker for speed
- Nice!
- Tests will run against yesterday's image, snapshotted
- So: run a Docker container for a while (weeks? a rolling sort of thing?), then snapshot it and test against that. Spin the container back up and it has the cumulative changes of the last few weeks, ready to test again.
- Testing against a pristine image takes much longer. (Snapshot sketch below.)
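My after-the-fact sketch of that rolling-snapshot flow; the container and image names are invented, and it assumes CFEngine is installed inside the container:

```ruby
#!/usr/bin/env ruby
# Snapshot a long-running container, then run a convergence test
# against the accumulated state instead of a pristine image.
container = 'cf3-soak'                                 # hypothetical long-running container
image     = "cf3-soak:#{Time.now.strftime('%Y%m%d')}"  # dated snapshot tag

system('docker', 'commit', container, image) or abort 'snapshot failed'

# Converge against weeks of accumulated drift; -K tells cf-agent to
# ignore its time-based locks so the run happens immediately.
system('docker', 'run', '--rm', image, 'cf-agent', '-K') or abort 'converge failed'
```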
- Chef; dev background, so fast feedback is king
- Gerrit code review. Very, very rigorous
- RuboCop, ChefSpec, Foodcritic (Rakefile sketch below)
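Those three usually get wired into a single Rake task so Gerrit/Jenkins can run one command; a minimal sketch (the task layout is mine, not necessarily theirs):

```ruby
# Rakefile -- one entry point for the lint/unit gate.
require 'rspec/core/rake_task'
require 'rubocop/rake_task'
require 'foodcritic'

RuboCop::RakeTask.new(:rubocop)              # style lint
FoodCritic::Rake::LintTask.new(:foodcritic)  # Chef-specific lint
RSpec::Core::RakeTask.new(:spec)             # ChefSpec unit tests

task default: [:rubocop, :foodcritic, :spec]
```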
- Testing with Bats works fine
- "I don't test. If I have an hour, I'll spend it writing a better
monitor. Tests are bullshit; production rules."
- Nathen trolling the room -- but he has a point. Don't forget that the app is the final result.
- Why test? Because it's a way to avoid reactive work. It also tells you where something broke, making problems easier to track down. And it forces you to examine the task and become more familiar with it.
How confident are you that your tests are useful, match the real world and catch problems?
- Yep, that's a problem.
- Always a work in progress. "You'll level up in time."
- Need to build confidence; loop through the process many times, but also build layers:
- Does this code do what I think it does?
- And does that actually accomplish the larger task I'm trying to do?
- The test that's written and executed is a hundred times better than a test that's never written and never executed.
- If post-mortems are well-documented, it helps a lot -- add tests for whatever caused the last outage.
- Get code review for your tests!
- Go to your team and brainstorm failure scenarios. Do this in isolation first, then come back together -- avoid everyone converging on the same ideas.
How to get people to start testing?
- Step zero: consistent dev environment.
- Have it standardized and ready to go. Reduce the friction. Use Docker/Vagrant/etc. for this, and use the file-sharing facilities these tools offer. (Vagrantfile sketch below.)
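If you go the Vagrant route, the standardized environment can be as small as this (box name and provisioning are placeholders); the synced folder is the file-sharing piece:

```ruby
# Vagrantfile -- one consistent dev environment for the whole team.
Vagrant.configure('2') do |config|
  config.vm.box = 'ubuntu/trusty64'  # placeholder base box
  # Share the working tree into the VM so edits on the host are
  # immediately visible to tests running in the guest.
  config.vm.synced_folder '.', '/vagrant'
  config.vm.provision 'shell', inline: 'apt-get update -y'
end
```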
- This is something I'm struggling with because I'm not a Ruby dev.
- Bats is nice and easy, but you have to get it going yourself.
- Testing in pipeline along the way (not sure what that note means)
- Have a consistent image for production, so you're limiting the surface of what you have to test -- same libs, same OS, etc.
Unicorn lands from outer space and offers you the best test framework in the universe. What does it look like?
- Faster and more automated.
- Easy to pick up and well-documented.
- Understands higher-level abstractions -- more DWIM (do what I mean)
- Cross-platform support (across OSes, not tied to one config mgt tool, etc.)
- No false positives -- and no false negatives!
- Unified. There's too much sprawl right now -- different layers, directories for tests, conventions, etc. Maybe this problem needs more/better tooling.
- Integrated with Bosun or other monitoring -- here's the test, please spit out Nagios configs (or whatever).
- Easy -- so you're not tempted to say "Fuck it, I'll spend the time improving monitoring instead." Or have it generate a Nagios check (like the previous example).
- Analyze config mgt code and generate tests from that code -- even if it only got you partway and you had to go back and fix it, this would be a great start.
- One person had a tool that did something like this: Puppet generates YAML listing the things it wants, and his code parsed that and generated ServerSpec tests. He can't release it, but it was pretty simple -- ~100 lines of code. (Hypothetical reconstruction below.)
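A hypothetical reconstruction of what such a generator might look like; the YAML schema and file names are invented, since the actual tool wasn't shared:

```ruby
#!/usr/bin/env ruby
# Read a YAML manifest of desired resources (invented schema) and
# emit a ServerSpec file asserting each one.
require 'yaml'

manifest = YAML.load_file(ARGV.fetch(0, 'resources.yml'))

File.open('generated_spec.rb', 'w') do |spec|
  spec.puts "require 'serverspec'"
  spec.puts 'set :backend, :exec'
  Array(manifest['packages']).each do |pkg|
    spec.puts "describe package('#{pkg}') do"
    spec.puts '  it { should be_installed }'
    spec.puts 'end'
  end
  Array(manifest['services']).each do |svc|
    spec.puts "describe service('#{svc}') do"
    spec.puts '  it { should be_running }'
    spec.puts 'end'
  end
end
```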
- Not having to switch frameworks to test another config mgt tool.
- Assertions that TK does this -- Steve Murawski built a DSC plugin for TK. That contradicts what I heard from a Chef dev at an earlier meetup...