Automated Testing Implementation Roundtable

Notes on a roundtable talk at GDC 2019

  • Again, see
  • How do you define a sufficient level of test coverage?
    • Aim to cover 95% of the code by weight, where weight is based on what gets run/used the most
      • Cover the hotspots/inner loops
    • Add tests as a part of every refactoring, bug fix
    • Best place to start is to figure out your current coverage—try to correlate that with number of bugs coming out of each subsystem
      • In the areas that are producing bugs, improve your coverage
    • Can go low on coverage, but high on assertions
      • Run auto-play overnight and find all the assertion failures
  • How to change company culture to write more tests
    • Show success stories: such-and-such system has heavy tests and never has any bugs
    • Easier to get people to test new code rather than going through old stuff
    • Make adding a test to verify a fix a prerequisite to committing
  • Practical tips for testing against client/server setups
    • See talk “Automated Testing of Gameplay Features in Sea of Thieves” by Robert Masella
    • Can spin up the server and client process on the same machine, pass control between them
    • Alternatively, mock the input/output on one side
    • Can just run AI in networked mode to run overnight, see what fails
    • Can you record the I/O from the server for replay?
  • Tips for structuring code for testability
    • Bringing dependency injection in helps
    • Writing testable code is an education problem—read Working Effectively with Legacy Code
      • Even Google, with their very strong testing culture, has a constant investment in improving education around testing
    • Code reviews go a long way toward making this part of the culture
      • Start of the code review: Show me the tests!
      • Let junior devs review code from seniors (educational—helps trickle down experience)
    • Improve testing by modularizing the components, so that you can mock out both sides of the communication
    • Google Testing blog is good for education
  • At what point is a unit test too trivial? (Testing by rote—is it worth testing “plumbing” code?)
    • Purpose of tests can go beyond catching bugs: “living documentation”/specification explaining what the system is doing
    • Can you find a broader test that covers this?
    • Keep the tests, but maybe don’t run them all all the time
      • What’s the cost of running the test?
      • What would be the business cost of this failing?
  • Testing against a number of hardware permutations
    • Screenshot tests: it’s hard to know whether the screenshot is okay/expected for this GPU
    • Can improve things by giving your QA multiple configurations, make sure it looks okay as they change configs
    • Can do “non-deterministic” screenshot tests for cases where you can’t automate it: have a human tester look through the results once a week
    • Get a quantitative measure of how different two screenshots are
    • On mobile, make sure you talk to the community team to find out what actual chipsets you have problems with
      • May be getting an outsized number of issues from particular hardware—go buy it
  • Property-based testing or fuzzing for games
  • Improve run-time of tests (or stop people from skipping tests)
    • Have a server run the tests, not individual devs
      • Compile tests (run automatically as you build)
      • Commit tests (run when you merge)
      • Nightly tests (may be very long running)
    • Gate all pull requests on having been tested
    • Make the people who skip the tests fix the bugs
    • Figure out why the tests are slow
      • Devs don’t treat tests like production code (don’t pay attention to runtime)
  • UI visual regression testing
    • Use screenshots with image detection scripts, just like you would for a rendering engine
  • Versioning test results (comparing images—how do you decide the “master”?)
    • Test runner automatically keeps artifacts
    • Don’t bother storing images if they were a 100% pass
    • All pass/fail results for all tests get recorded in a database
      • Allows you to do things like skip tests that have never failed, query flaky tests, etc.
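
To make the results-database idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the function names, and the definition of "flaky" (a test that both passed and failed on the same revision) are assumptions for illustration; the roundtable did not describe any particular schema.

```python
import sqlite3

def open_results_db(path=":memory:"):
    """Open the database and create the results table if it is missing.

    Hypothetical schema: one row per test execution.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " test_name TEXT NOT NULL,"
        " revision TEXT NOT NULL,"
        " passed INTEGER NOT NULL)"
    )
    return conn

def record_result(conn, test_name, revision, passed):
    """Store a single pass/fail outcome for a test at a given revision."""
    conn.execute(
        "INSERT INTO results VALUES (?, ?, ?)",
        (test_name, revision, int(passed)),
    )
    conn.commit()

def never_failed(conn):
    """Tests whose every recorded run passed (candidates for running less often)."""
    rows = conn.execute(
        "SELECT test_name FROM results"
        " GROUP BY test_name"
        " HAVING MIN(passed) = 1"
        " ORDER BY test_name"
    )
    return [row[0] for row in rows]

def flaky(conn):
    """Tests that both passed and failed on the same revision (assumed definition)."""
    rows = conn.execute(
        "SELECT DISTINCT test_name FROM results"
        " GROUP BY test_name, revision"
        " HAVING MIN(passed) = 0 AND MAX(passed) = 1"
        " ORDER BY test_name"
    )
    return [row[0] for row in rows]
```

A test that passes on one revision and fails on the next is treated as a regression rather than as flaky, since the code changed between the two runs.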

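One way to get a quantitative measure of how different two screenshots are is a normalized mean absolute difference over all color channels. The sketch below is only an illustration: the image representation (rows of RGB tuples with 0-255 channels), the function names, and the default tolerance are assumptions, and a real pipeline would more likely load images with an imaging library and might prefer a perceptual metric.

```python
def image_difference(img_a, img_b):
    """Return the mean absolute channel difference, scaled to the range 0.0-1.0.

    Each image is a list of rows; each row is a list of (r, g, b) tuples
    with channel values from 0 to 255 (an assumed in-memory format).
    """
    same_height = len(img_a) == len(img_b)
    if not same_height or any(len(ra) != len(rb) for ra, rb in zip(img_a, img_b)):
        raise ValueError("screenshots must have the same dimensions")

    total = 0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            for channel_a, channel_b in zip(pixel_a, pixel_b):
                total += abs(channel_a - channel_b)
                count += 1
    # An empty image has nothing to differ on.
    return total / (255 * count) if count else 0.0

def screenshots_match(img_a, img_b, tolerance=0.01):
    """Treat screenshots as matching when the difference is within the tolerance.

    The default tolerance is an arbitrary placeholder and would need tuning
    per GPU or per test.
    """
    return image_difference(img_a, img_b) <= tolerance
```

Storing the numeric difference alongside the pass/fail result, rather than only the verdict, lets a human reviewer sort the weekly batch by how far each screenshot drifted.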