
Leadership In Test: A Guide To Infrastructure And Environments

Editor’s Note: Welcome to the Leadership In Test series from software testing guru & consultant Paul Gerrard. The series is designed to help testers with a few years of experience—especially those on agile teams—excel in their test lead and management roles.

In the previous article, I took you through how to manage performance testing. Following on from that, in this article we’re going to talk about infrastructure, testing infrastructure, and test environments.

Sign up to The QA Lead newsletter to get notified when new parts of the series go live. These posts are extracts from Paul’s Leadership In Test course which we highly recommend to get a deeper dive on this and other topics. If you do, use our exclusive coupon code QALEADOFFER to score $60 off the full course price!

Infrastructure is the term we use to describe all the hardware, cloud services, networking, supporting software, and our application under test required to develop, test, deploy and operate our systems.

It makes sense not to limit our definition to technology though. The data centres, office space, desks, desktops, laptops, tablet computers, and mobile phones with their own software stacks installed are all part of the ecosystem required to develop, test and deploy systems.

There’s even more to it if you include developer tools, DevOps tools and procedures, test tools, and the business processes and subject-matter expertise required.

The most mundane things—down to the access codes or smart cards used to gain access to buildings—can turn critical if they aren’t in place.

Infrastructure, in all its variety, exists to support development, testing, deployment, and operations of your systems. It is either critical to testing, or it needs testing—by someone.

We’ll look at tools for development, testing, and collaboration in the next article. In this article, we’ll consider what most people regard as test environments, and look briefly at what is often termed infrastructure testing.

Let’s go.

Test Environments

All testing makes an implicit, critical, simplifying assumption: that our tests will be run in a known environment.

What Is An Environment?

All systems need to be tested in context. What this means is that for a test to be meaningful, the system under test must be installed, set up, deployed, or built in a realistic environment that simulates the real world in which the system will be used.

We might use test scenarios that push the systems’ capability in terms of functionality, performance, or security, for example, but these are properties of the tests, not the environment.

A fully realistic environment would also replicate the business, technical, and organisational context. Much of this comprises data that is used to drive business processes, configure the system, and provide reference data.

But perfectly realistic environments are usually impractical or far too expensive (even testers of high criticality systems such as aeroplanes, nuclear reactors, or brain scanners have to compromise at some point). Almost all testing takes place in environments that simulate, with some acceptable level of compromise, the real world.

Cars are tested on rolling roads, wind tunnels, vibration-beds, and private test tracks before they are tested on the open road. Computer systems are tested in software labs by programmers and software testers before end-users are engaged to try them out in a production-like environment.

Getting Realistic Environments To Test In

Simulated environments are fallible just like our requirements and test models, but we just have to live with that.

We must stage tests that are meaningful in the environments we have available, and ensure that test outcomes really do mean what we interpret them to mean.

The reliability of test outcomes is dependent on the environment in which tests are run. If a test is run in an environment that is incorrectly set up:

  • A test that fails may imply the system is defective when in fact it is correct.
  • A test that passes may imply the system is correct when in fact it is defective.

Both situations are highly undesirable, of course.

Getting Environments Set Up And Delivered In Time

Even with the emergence of cloud infrastructure, test environments can be difficult and expensive to set up and maintain.

Often, just when support teams are working on the new production environment, the testers demand test environments (and perhaps several of them). Late in projects, there always seem to be competing demands on support teams.

Environments for developers, or for any later test activity, may be delivered late or not at all, or they may not be configured or controlled as required. Inevitably, this will delay testing and/or undermine confidence in any outcome from testing.

A critical early task is to establish the need and requirements for each environment to be used for testing, including a mechanism for managing changes to that environment.

Infrastructure as code is a recent evolution in how environments are constructed: tools follow procedures and use declarative code to define the environment setup.

Although base operating system platforms—servers—can be created very easily in the cloud, or as virtual machines in your own environment, fully specified, special-purpose servers with all required software, data, configurations, and interfaces take more effort.

However, once set up, they provide a highly efficient means of environment creation. Infrastructure code can be source-controlled just like any application code and managed through change.
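To make the idea concrete, here is a minimal, tool-agnostic sketch of infrastructure as code: the environment is described declaratively, and a provisioning step turns that description into concrete actions. The environment name, server roles, and image names are invented for illustration; real tools such as Terraform, Ansible, or Docker Compose apply the same principle at full scale.

```python
# A declarative description of a test environment. Because it is plain
# code/data, it can be source-controlled and change-managed like any
# application code.
TEST_ENV_SPEC = {
    "name": "system-test-1",
    "servers": [
        {"role": "web", "image": "app-web:1.4.2", "count": 2},
        {"role": "db", "image": "postgres:15", "count": 1},
    ],
}

def plan(spec):
    """Turn a declarative spec into an ordered list of provisioning steps."""
    steps = []
    for server in spec["servers"]:
        for i in range(server["count"]):
            steps.append(
                f"create {spec['name']}-{server['role']}-{i} from {server['image']}"
            )
    return steps

for step in plan(TEST_ENV_SPEC):
    print(step)
```

Because the same spec always yields the same plan, every environment generated from it is identical, which is exactly the consistency property the rest of this section relies on.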

A major tenet of continuous delivery is that, as soon as possible, some software—not even anything useful—should be pushed through the delivery pipeline to prove the processes work. 

Of course, this needs viable environments for builds, continuous integration, system-level testing, and deployment. The aim is to deploy test and production environments without constraint. Once the environment definitions and deployment processes are in place to do this, generating environments becomes an automated and routine task.

In any case, the definitions of these environments should be an early deliverable from the project.

Development Environments

Developer testing focuses on the construction of software components that deliver features internally to the application or at the user or presentation layer. 

Tests tend to be driven by a knowledge of the internal structure of the code and tests may not use or require ‘realistic’ data to run. Tests of low-level components or services are usually run through an API using either custom-built or proprietary drivers or tools.

The range of development tools, platforms, and what are usually called Integrated Development Environments (IDEs) is huge. In this article, we can only touch upon some of the principal test-related requirements and features of environments.

To support the development and the testing in scope for developers, environments need to support the following activities. This is just a selection—there may be additional activities, or variations of these, in your situation:

  • A ‘Sandbox’ environment to experiment with new software. Sandboxes are often used to test new libraries, to develop throwaway prototype code, or to practice programming techniques. All common programming languages have hundreds or thousands of software libraries. Sandboxes are used to install and test software that is not yet part of the main thread of development in order to evaluate it and to practice using it. These environments may be treated as disposable environments.
  • Local development environment. This is where developers maintain a local copy of some or all of the source code for their application from a shared code repository and can create builds of the system for local testing. This environment allows developers to make changes to code in their local copy and to test their changes. Some tests are ad-hoc and perhaps never repeated; others are automated. Automated tests are usually retained indefinitely, particularly if they follow a test-driven approach.
  • Shared (Continuous) Integration environment. When developers trust that their code is ready, they push their changes to the shared, controlled code repository. Using the repository, the CI environment performs automated builds and executes automated tests. At this point, the new or changed code is integrated and tested. The CI system runs automated tests on demand, hourly or daily, and the whole team gets notifications and can see the status of tests of the latest integrated build. Failures are exposed promptly and dealt with as a matter of urgency.
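The CI pattern above can be sketched with a trivial automated check. The test itself does nothing interesting; the point is the shape: tests run unattended against the integrated build, and a non-zero exit code is what makes a failure visible to the whole team. The module and test names are illustrative, not from any real project.

```python
import sys
import unittest

class SmokeTest(unittest.TestCase):
    def test_build_is_viable(self):
        # In a real pipeline this would import and exercise the freshly
        # integrated build; here we only prove the harness itself works.
        self.assertEqual(1 + 1, 2)

if __name__ == "__main__":
    # Run the suite and translate the result into an exit code the CI
    # system can act on (notify the team, mark the build red).
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SmokeTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    sys.exit(0 if result.wasSuccessful() else 1)
```

A CI server simply schedules scripts like this on every push (or hourly/daily) and publishes the exit status, which is how "failures are exposed promptly".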

A development or CI environment supports developer testing, but other application servers, web services, messaging, or database servers that complete the system might not be available. 

If these interfacing systems don’t exist because they haven’t been built, or because they belong to a partnering company and there is only a live system and no test version, then developers have to stub or mock these interfaces to at least be able to test their own code. 

Mocking tools can be sophisticated, but mocked interfaces cannot usually support tests that require integrated data across multiple systems.
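As a small illustration of the stubbing described above, here is a sketch using Python's standard `unittest.mock`. The shipping service and its client code are hypothetical; the pattern is what matters: the developer's own logic is tested without the real partner system ever being called.

```python
from unittest.mock import Mock

def order_total_with_shipping(order, shipping_service):
    """Our own code under test: it depends on an external service."""
    quote = shipping_service.get_quote(order["weight_kg"])
    return order["subtotal"] + quote["price"]

# The real shipping service belongs to a partner and has no test
# instance, so we mock the one call our code makes to it.
mock_shipping = Mock()
mock_shipping.get_quote.return_value = {"price": 4.99}

total = order_total_with_shipping(
    {"subtotal": 20.00, "weight_kg": 2}, mock_shipping
)
print(round(total, 2))  # prints 24.99
# We can also verify our code called the interface as expected.
mock_shipping.get_quote.assert_called_once_with(2)
```

Note the limitation flagged in the text: the mock returns whatever we told it to, so this proves nothing about data consistency across the real integrated systems.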

If an interface to a test database server is available to developers, the test data they use might be minimal, not integrated or consistent, and not representative of production data. 

Development databases that are shared across a team are usually unsatisfactory—developers might reuse, corrupt, or delete each other’s data if there isn’t a good regime for managing this shared resource.

System-Level Test Environments

System-level testing focuses on the integration of components and sub-systems in collaboration. 

The purpose of these environments is to provide a platform to support the objectives of larger-scale integration, validation of functionality, and the operations of systems in the context of the user or business processes. 

Environments might also be dedicated to the non-functional aspects of the system, such as performance, security, or service management.


One of the most common testing pitfalls occurs when a system tester experiences some kind of failure in their environment but, no matter how hard they try, the developer cannot reproduce the failure in the development environment.

“It works on my machine!”

“Yes, of course, it does.”

This is almost certainly caused by some lack of consistency in the two environments. The difference in behaviour might be caused by the configuration, or some software version difference, or the difference in data in the database.

Differences in data are the first thing to check: they are usually easy to identify and can often be resolved quickly.

When there is a software version or configuration discrepancy, testers and developers can waste a lot of time tracing the cause of the difference in behaviour. 

When these problems occur it often means there is a failure in communication between dev and test. It could also indicate that there is a loss of configuration control in the developer or test environment setup or the deployment process.
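When chasing such a discrepancy, a structured comparison of the two environments' component manifests narrows the search quickly. The manifests below are hand-written stand-ins; in practice they might come from a package-manager listing or an infrastructure-as-code definition.

```python
# Component/version manifests for two environments (illustrative values).
dev_env = {"app": "2.3.1", "openssl": "3.0.13", "libxml2": "2.9.14"}
test_env = {"app": "2.3.1", "openssl": "3.0.8", "libxml2": "2.9.14"}

def diff_environments(a, b):
    """Return {component: (version_in_a, version_in_b)} for every mismatch."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }

for name, (dev_v, test_v) in diff_environments(dev_env, test_env).items():
    print(f"{name}: dev={dev_v} test={test_v}")
```

A script like this, run routinely rather than only during incident hunts, turns "loss of configuration control" from a mystery into a report.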

Infrastructure as code and automated environment provisioning go a long way towards making environment consistency problems a thing of the past.

Types Of Dedicated Test Environments

To support system, acceptance, and non-functional testing, environments need to support the following activities (there may be more in your organization):

  • (Functional) System testing environment. In this environment, the system is validated against the requirements documented for the system as a whole. Requirements may be large text documents, with tabulated sets of test cases defined for a system test. In agile projects, this environment may be necessary to allow testers to explore the integrated system without limiting themselves to specific features.
  • End-to-end test environment. Where the CI environment allows components to be integrated with subsystems, the business processes may require other interfacing systems (not under the control of developers) to be available. Full-scope environments are required to conduct large-scale integration, as well as business process or overall acceptance testing. Usually the data is a copy of live data, or at least of appropriate scale. Where large-scale integration needs to be proven, the flows of data and control are exercised using longer user journeys and independent reconciliations of the data across integrated systems.
  • Performance environment. These environments must provide a meaningful platform for evaluating the performance of a system (or selected sub-systems). Compromises in the architecture may be possible where servers are redundant or cloned. But the data volumes need to be of production scale even if the data is synthetic. Certainly, the environment needs to be of a scale that supports production transaction volumes, to enable useful predictions of the performance of systems in production.
  • Availability, Resiliency, Manageability (ARM) environments. In some respects, these environments are similar to the performance environments but, depending on the test objective, variations may be inevitable. The goal of availability testing is to verify that the system can operate for extended periods without failing. Resilience testing (often called failover testing) checks that system components, when they do fail, do not cause an unacceptable disruption to the delivered service. Manageability or operations testing aims to demonstrate that system administrative, management and backup and recovery procedures operate effectively.

Data In Environments

In some very large projects, there can be as many as 20 or even 30 large-scale environments dedicated to different aspects of testing, training, data migration, and trial cutovers. In smaller projects there will be fewer, perhaps only one shared environment or a continuous delivery regime—and all testing might be implemented automatically in environments that are instantiated for a single-use, then torn down.

All environments need data, but the scale and degree of realism of that data can vary. Here are some common patterns in the way test data is acquired and managed. These patterns focus on ownership (local or shared), the means of creation (manual, automated, or copied from production), and scale:

  • Local, manually created, small-scale data—suitable for ad-hoc testing by developers or testers.
  • Local, automated, synthetic data. Suitable for developer automated tests or environments where the functionality of specific modules or features can be covered.
  • Shared, manually created data. Used in integration and system test environments, often where test data has evolved in parallel with manually run tests. Backed up and restored when required.
  • Shared, automatically created data. Used in integration and system test environments where test data has evolved in parallel with automated or manually run tests. Generated and/or restored from backups, when required.
  • Shared large-scale synthetic/random data. Performance and ARM testing require coherent data in large volume. This data does not usually need to be meaningful – randomised data works fine and is generated when required, or generated initially and restored from backups.
  • Shared large-scale meaningful data. End-to-end, acceptance, or user testing usually needs meaningful data at scale. Sometimes copies or extracts from live data are used. Beware of falling foul of data-protection regulations if you don’t scramble or anonymise such data, however.
  • Re-testing and regression testing. You will require a known, controlled dataset, in a known state, so it is usually restored from backups. This applies to any of the environments above as these tests need to be re-run with data in a known state to reproduce failures reliably.
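Several of the patterns above can be combined in one small sketch: large-scale synthetic data, generated from a fixed seed so that every run produces an identical dataset, which is exactly the "known, controlled dataset in a known state" that re-testing and regression testing require. The field names and value ranges are illustrative.

```python
import random

def generate_customers(n, seed=42):
    # A fixed seed means identical data on every run: no need to restore
    # from backups to get the dataset back into a known state.
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "region": rng.choice(["north", "south", "east", "west"]),
            "credit_limit": rng.randrange(500, 10_000, 100),
        }
        for i in range(n)
    ]

batch = generate_customers(100_000)
print(len(batch))
```

Because the data is randomised rather than copied from production, it carries no personal data and no regulatory exposure, at the cost of not being meaningful for end-to-end or user testing.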

Infrastructure Testing

At the start of this article, we looked at what infrastructure includes and since then we’ve primarily focused on the technical components, namely the software systems, and presumed that the hardware—real or virtual—is available.

When we build systems initially, we presume that the infrastructure exists and that it functions correctly, is performant, secure, and resilient, etc.

We can test all these aspects when we have integrated our application, and no doubt expose shortcomings in the infrastructure at a relatively late stage in our projects. But finding infrastructure problems so late in a project is usually extremely disruptive.

  • Changes to address failures in infrastructure might require significant redesign and changes in our application.
  • Results from our application or whole-system testing will need repeating.
  • If 3rd party components such as database, web, networking or messaging services fail, we are at the mercy of the suppliers (or the open source community) who support them.

To ensure that our confidence in infrastructure components is well-founded, we can rely on our (or others’) experience of using them in the past. Or we need to assess—through testing—their reliability before we commit to using them in the design and construction of our system.

Depending on the infrastructure under investigation, the environment we use may vary—from a single server to a near-complete infrastructure platform. 

Although some tests will be manual, mostly we will use tools, drivers, or robots to simulate the transaction load our application would generate. We would need to mock or stub out these interfaces:

  • Interfaces that are currently not available
  • Interfaces to components that we trust and are easy to simulate
  • Interfaces not in scope that do not affect the infrastructure under test.

Infrastructure, needless to say, usually does not operate through a user interface or GUI.

The integration of our application to infrastructure will mostly take the form of messaging or remote service calls. Often the traffic to be simulated requires API calls to either web or application servers, message or database servers, or services delivered through the cloud or remote locations.
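A load driver of the kind described above can be sketched in a few lines: many concurrent "transactions" are fired at the infrastructure and their latencies recorded. The transaction here is a stand-in function that merely sleeps; in a real driver it would be an API call to a web, message, or database server.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def transaction():
    """Stand-in for one API call to the infrastructure under test."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulate the round-trip time of a real call
    return time.perf_counter() - start

def run_load(n_transactions, concurrency):
    """Fire n_transactions with the given concurrency; return latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: transaction(), range(n_transactions)))

latencies = run_load(n_transactions=50, concurrency=10)
print(f"median latency: {statistics.median(latencies) * 1000:.1f} ms")
```

Ramping `concurrency` and `n_transactions` upwards until latencies degrade is one simple way to probe the "ultimate capacity" mentioned below.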

Performance and ARM objectives may be known, in which case tests can be performed to ensure these objectives are met.

However, infrastructure is often shared with applications other than our own, so knowing its ultimate capacity helps to gauge how much capacity will remain when our application is deployed.

In this case, infrastructure testing addresses the risk to our own application, and perhaps to other applications that will be based on the same infrastructure in the future.
