
Running Docker-Based Integration Tests in Scala

By Matteo Di Pirro

In this blog post, we’ll explore two ways of writing Docker-based tests in Scala: sbt-docker-compose, an SBT plugin, and TestContainers-scala, a Scala wrapper of the TestContainers Java library.

The focus of the article will not be on showing you how to be proficient in either of those technologies. As a matter of fact, there are many other sources of information out there for that, including the official documentation. Instead, we’ll focus on the readability and usability of the resulting testing code. As we’ll see, even with a simple running example, the difference in code readability and easiness of use will be substantial.

Before diving into code and implementation details, let’s revise why testing is important in modern software.

Why Testing is so Important

Usually, when asked why testing is important, many of us would say it’s because tests are how programmers make sure their applications comply with the requirements. This is certainly true, but writing tests is fundamental for two other reasons:

  • Refactoring. It is very difficult to refactor an application without a trustworthy test base. In fact, without tests, refactoring gives us no confidence that even the smallest change won’t break the application;

  • Documentation. Tests can also be a valuable documentation tool. For example, if we’re developing a new open-source library, we can use the tests to provide our users with examples of how to use our code. This is actually very common nowadays.

Nonetheless, we often pay less attention to tests than to “production” code (that is, the code fulfilling the use cases of the application). This is not a wise choice. The worse the quality of the tests, the less useful they are for refactoring and documentation purposes. Just imagine heading into the tests of a library to find out how to use it and finding ourselves in a badly written set of testing modules. Chances are we’ll just discard the library and go straight to the next one. Similarly, if we cannot understand what we’re testing, we might not feel confident enough to refactor our application, or even to develop new features.

In a few words, if tests are badly written and not readable, the maintenance cost of our software increases over time. The more “spaghetti” tests we add to our code base, the more difficult it becomes to add new features, fix bugs, and reduce technical debt.

Types of Tests

Writing high-quality tests is surely easier said than done. As a matter of fact, nowadays we typically have to write several different tests (Figure 1).

Figure 1. A simple testing pyramid.

Figure 1 depicts a simple testing pyramid. There are many alternatives to the one we are showing here. In very few words, a testing pyramid describes the different types of tests, from unit tests at the bottom to end-to-end (“E2E”) tests at the top. Unit tests should be the simplest and fastest to run: they validate the behavior of units of code in isolation, that is, without interaction with other components of the code base.

The higher we are in the pyramid the less we’re interested in the behavior of a single unit and the more we focus on the interaction among different components. In integration testing, for example, we group different modules and test them as a single unit. E2E testing, instead, tests whole use cases.

The more dependencies and complexity we add to our tests, the slower and more expensive it becomes to run them.

Testing the Application Context

Nowadays, testing is even more complex as applications normally depend on many third-party services. For example, it is common for software to depend on other applications (think of a microservices architecture), on a DBMS (DataBase Management System), on a cloud storage service (to store files), on a cache (such as Redis), and so on. This set of “infrastructural” dependencies represents the so-called application context.

One question immediately comes to mind: “How can we test our application in an environment as close as possible to the application context used in the Production environment?”

We can get away with in-memory implementations of many of the dependencies listed above. For instance, the file system can serve both as a cache implementation and as a file storage backend. Similarly, we can leverage H2 as a DBMS and hard-code HTTP responses in a mock HTTP server to emulate third-party applications.
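For instance, a cache double can be as small as a map behind the trait the production code already depends on (a sketch; the trait and class names here are ours, not from any library):

```scala
// A minimal port/adapter sketch: production code depends only on the trait,
// so tests can swap in the in-memory implementation. Names are illustrative.
trait KeyValueCache:
  def put(key: String, value: String): Unit
  def get(key: String): Option[String]

// Production might back this trait with Redis; tests can use a mutable Map.
final class InMemoryCache extends KeyValueCache:
  private val store = scala.collection.mutable.Map.empty[String, String]
  def put(key: String, value: String): Unit = store(key) = value
  def get(key: String): Option[String] = store.get(key)
```

Swapping `InMemoryCache` for a real, Redis-backed implementation is then a one-line change in the test setup.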

Still, does this give us the level of confidence we’re looking for? Is an in-memory environment a trustworthy double of a Production environment? We’d say it’s not. Luckily, Docker comes to the rescue. As a matter of fact, we can leverage Docker containers to double our infrastructure dependencies, building a Production-like testing environment on our local workstation (or in our Continuous Integration/Delivery pipeline).

If you’re not familiar with Docker, don’t worry. For what concerns this article, a Docker image is simply a way to pack an application, together with all the libraries and files it needs to run. When we run a Docker image, we get a Docker container.
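For reference, the image for an application like ours can be described in a handful of lines (a hedged sketch: the base image, JAR path, and file names below are assumptions, not taken from this project’s build):

```dockerfile
# Start from a JRE base image (assumed; pick one matching your JVM version).
FROM eclipse-temurin:17-jre
# Copy the application JAR built by SBT (the path is illustrative).
COPY target/scala-3.3.0/app.jar /opt/app.jar
# The port our example server listens on.
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/opt/app.jar"]
```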

Docker containers are suitable as test doubles only from integration tests up. Unit tests should not test any dependencies among different components of the software.

Using Docker in Scala Tests

In the rest of this article, we’ll see how we can leverage Docker containers to double the application context. As we said above, we’ll pay great attention to the code readability and maintainability.

Based on our experience, we have four different ways to integrate Docker in Scala tests:

  1. Manually running Docker containers: we prepare the application context ourselves, by running Docker commands that bring up Docker containers. If we have many containers to set up, we might rely on Docker Compose, a tool to define and run more than one container at the same time. In any case, this approach is not automated and is quite error-prone. Furthermore, it forces us to duplicate configuration settings in the Docker Compose file (or in the shell Docker command) and in our test code. For example, assume one container needs a port mapping to make it accessible on localhost. If that’s the case, we’ll have to duplicate the mapped port in both places. In the long run, this might lead to inconsistencies and misalignments, causing bugs that are difficult to spot.

  2. Wrapping Docker commands in the test code: if the main issue with the aforementioned approach is the duplication of the container settings, the solution might seem straightforward: wrapping Docker commands in the test code. In the past, we have worked on projects that took this approach, but, although possible, this road is a dangerous one. As a matter of fact, we’d end up with test classes with two different responsibilities: testing some features and taking care of configuring, creating, and destroying Docker containers. In the long run, the resulting code is likely to end up as a real mess, making it very difficult to refactor or extend the existing test cases.

So, what’s the solution? The key idea here is not to reinvent the wheel but to rely on frameworks and plugins dedicated to interacting with Docker:

  1. Using sbt-docker-compose: an SBT plugin that automatically creates and destroys Docker containers based on a Docker Compose file;

  2. Using TestContainers(-Scala): TestContainers-Scala is a Scala wrapper of the Java TestContainers library.

In the rest of this blog post, we’ll focus on the comparison between sbt-docker-compose and TestContainers-Scala, but, first, we’ll take a look at our running example.

The Running Example

The code we’ll be testing is a very simple web application, accepting incoming connections (one at a time), and replying with a hardcoded 200 OK response:

import java.io.PrintWriter
import java.net.ServerSocket

@main def bootServer(): Unit =
  val text = "HTTP/1.0 200 OK…<html>...<body><p>Hello, ScalaMatters!</p></body></html>"
  val port = 8080
  val listener = ServerSocket(port)
  while (true) {
    val sock = listener.accept()
    PrintWriter(sock.getOutputStream, true).println(text)
    sock.shutdownOutput()
  }

The application uses Scala 3, but the code is very simple. As we said above, it accepts one incoming connection at a time and responds with a pre-defined HTML page (Figure 2).
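As a quick aside, the same serve-and-read round trip can be sanity-checked without Docker at all, by binding the logic to an ephemeral port and connecting with a plain socket (a self-contained sketch; the helper names are ours):

```scala
import java.io.PrintWriter
import java.net.{ServerSocket, Socket}
import scala.io.Source

// Serve a single connection on a background thread and return the bound port.
// Port 0 asks the OS for a free ephemeral port, like "0:8080" does in Compose.
def serveOnce(text: String): Int =
  val listener = new ServerSocket(0)
  val t = new Thread(() => {
    val sock = listener.accept()
    new PrintWriter(sock.getOutputStream, true).println(text)
    sock.shutdownOutput()
    sock.close()
    listener.close()
  })
  t.setDaemon(true)
  t.start()
  listener.getLocalPort

// Connect as a client and read the whole response until EOF.
def fetch(port: Int): String =
  val sock = new Socket("localhost", port)
  try Source.fromInputStream(sock.getInputStream).mkString
  finally sock.close()
```

Calling `fetch(serveOnce(...))` returns the served text, which is exactly what our Docker-based tests will assert on, only through a real container instead of an in-process thread.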

You might be wondering why we chose such a simple application. The reason is that it’s all the code we need to show the differences between sbt-docker-compose and TestContainers-Scala in terms of code readability and maintainability.

For the testing part, we’ll build a Docker image for it and run it using both sbt-docker-compose and TestContainers-Scala. This way, we’ll examine the differences in the testing flow and the resulting test classes. We’ll use ScalaTest.

As we said above, this blog post will not show in detail how to configure SBT to build Docker images or to use either sbt-docker-compose or TestContainers-Scala. If you are not familiar with those topics, at the end of the blog post you’ll find a link to a GitHub repository containing all the code shown here.


sbt-docker-compose

sbt-docker-compose is an SBT plugin that integrates Docker Compose into our build. In particular, it lets us write a .yaml file describing all the containers we want to start (i.e. our application context). Then, it runs all the tests in our test base against those containers.

In this section, we’ll use it together with ScalaTest. More generally, sbt-docker-compose publishes some information regarding the running containers (such as port mappings, if any) to our ScalaTest suites so that we know where to reach them.

Nonetheless, this plugin is a bit cumbersome to configure. Furthermore, it actually suffers from the same issues we saw above when talking about running Docker containers ourselves via shell commands: as we’ll see, we’ll have to duplicate some configuration settings in the Scala code and in the Docker Compose file, which is precisely what we wanted to avoid in the first place.

As a matter of fact, configuring the containers programmatically (i.e. via Scala code) is not possible. Furthermore, knowing when a container is ready to accept connections is also difficult and requires, as we’ll see, manual workarounds.

This being said, likely the biggest showstopper here is that sbt-docker-compose was abandoned and is not actively maintained anymore. Many people forked the original project to add support for new versions of Scala and Docker Compose. If you decide to use it, you might have to go down the same road.

Testing Flow

As we saw above, to test our simple application we’ll build a Docker image. To avoid hardcoding the application (i.e. Docker image) version in the Docker Compose file, we can rely on the latest tag. Since we can’t easily edit the Docker Compose file programmatically, latest is the only way to avoid hardcoding the version, which could otherwise lead to inconsistencies (what happens if we forget to upgrade it before running the tests?).

Hence, the first step is to configure SBT to always publish a Docker image tagged with latest. Secondly, we have to create a JAR file containing all the tests we want to run (sbt packageBin). Lastly, we can run our tests, using dedicated SBT tasks provided by the plugin (such as sbt dockerComposeTest).
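As a sketch, assuming sbt-native-packager’s DockerPlugin is used to build the image, the relevant build.sbt fragment could look like the following (the settings shown are assumptions about the build, not requirements of sbt-docker-compose itself):

```scala
// build.sbt (fragment, assuming sbt-native-packager's DockerPlugin)
enablePlugins(JavaAppPackaging, DockerPlugin)

// Always tag the image with "latest" so the Compose file never needs editing.
Docker / packageName := "scala-matters"
dockerUpdateLatest := true

// Publish the test JAR so sbt-docker-compose can run the packaged tests.
Test / packageBin / publishArtifact := true
```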

Running the aforementioned command will start all the containers described in the Docker Compose file and then will start the tests we packaged as a JAR file. However, sbt-docker-compose will not wait for the containers to be either ready or healthy. As we’ll see, we’ll have to treat our system as an eventually consistent one to cope with that limitation.

The flow described above is, in our opinion, a bit cumbersome for many reasons:

  • Building a latest image just for testing is not ideal. As a matter of fact, we might not want to publish that version after testing, which complicates our build system and continuous integration pipeline. We worked on many projects where the latest tag was neither published nor used, and, when using sbt-docker-compose, we had to manually prevent SBT from publishing it.

  • Using a dedicated SBT task to run the tests leads to poor IDE support. In fact, we can’t even run them using the buttons in our IDEs, which makes the testing experience clumsier. Furthermore, running a single test (rather than the whole test base) or debugging tests is much more complex as well.

The Docker Compose File

The Docker Compose file for our tests is pretty simple:

```yaml
version: '3.8'

services:
  scalaMatters:
    image: scala-matters:latest<localBuild>
    ports:
      - "0:8080"
```

The code snippet above is a standard Docker Compose file. First, we specify the version of the Compose file. Then, we list the services, that is the containers Compose will bring up. In this case, we only have one, named scalaMatters, representing our web application. The service definition comprises the full name of the Docker image (notice the latest tag) and the port mapping. In particular, 0:8080 tells Docker Compose to map port 8080 of the container to a random port in our workstation (aka the host).

Using random ports is very important to avoid the so-called integrated tests, that is tests relying on the workstation configuration. In fact, had we specified a fixed port (for example 8000:8080) then we’d have had to make sure that port 8000 in the host was available. This makes the tests more difficult to run in any environment.

On a side note, the <localBuild> “annotation” is an sbt-docker-compose-specific one and tells the plugin not to pull the image from anywhere but, instead, to use a locally available one or fail otherwise.

The Test Suite

After seeing the Docker Compose file, it’s time to take a look at the test suite. It’s a fairly large one, so we’ll split it into two different parts and analyze them separately.

The first part is all about the test suite and test case definitions:

```scala
class SampleSbtDockerComposeSpec
    extends FixtureAnyFunSuite
    with fixture.ConfigMapFixture
    with Eventually
    with IntegrationPatience
    with ...:

  test("Response successful and body as expected") { configMap =>
    val hostInfo = getHostInfo(configMap)
    val client = SimpleHttpClient()
    val req = basicRequest.get(uri"http://$hostInfo")
    eventually {
      val resp = client.send(req)
      resp.code.isSuccess shouldBe true
      resp.body.value should include("Hello, ScalaMatters!")
    }
  }
```

Again, the test suite is written in ScalaTest. If you’re not familiar with it, let us introduce it step by step.

In ScalaTest, a test suite is just a class with some traits mixed-in. In our case, the first trait is FixtureAnyFunSuite. This sets the testing style of the suite, that is, how test cases are written. For instance, FixtureAnyFunSuite lets us define them using the test method.

Secondly, we have fixture.ConfigMapFixture. In software testing, we can define “fixture” as data used to set the application to a known fixed state. In our case, sbt-docker-compose publishes the container’s port mapping using a ConfigMap (essentially a Scala class wrapping a Map[String, Any]) as a fixture.

Thirdly, we mix in the Eventually trait. As we mentioned above, with sbt-docker-compose we have no general way to know when the containers we need are ready to be used.

If you know Docker Compose well, you might argue that there is, in fact, a way to know when a container is healthy: the healthcheck directive. However, sbt-docker-compose does not honor it: it will not wait for the containers to be healthy before starting the tests. Furthermore, healthcheck only lets us specify a command to verify the container’s health. This is rather limiting and, in some cases, might not be possible or might require additional packages to be installed in the Docker image.

Therefore, we have to treat our system as an eventually consistent one and cope with this in our test cases. One way is to use timeouts to wait for the application to be up and running. In our example, instead, we chose to rely on Eventually. As we’ll see, this allows us to evaluate the same assertions multiple times (more generally, to re-run a set of instructions multiple times). This way we test whether, eventually, our application becomes ready and behaves as expected. This is surely not ideal, since we’d like the tests to run only once the application is known to be ready, without making our code more complex just to work around the limitations of the plugin we’re using.
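Conceptually, Eventually is just a retry loop around a block of assertions: run the block, and if it throws, sleep and try again. A hand-rolled sketch (ours, purely for illustration; ScalaTest’s real implementation is far more configurable) could look like this:

```scala
// A toy version of ScalaTest's `eventually`: re-run `block` until it stops
// throwing, or give up after `retries` further attempts. Illustrative only.
def retryUntilReady[A](retries: Int, delayMs: Long)(block: => A): A =
  try block
  catch
    case e: Throwable if retries > 0 =>
      Thread.sleep(delayMs)
      retryUntilReady(retries - 1, delayMs)(block)
```

ScalaTest’s IntegrationPatience plays the role of the `retries`/`delayMs` parameters here, stretching the default timeout and polling interval to suit slower integration scenarios.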

Lastly, IntegrationPatience simply configures Eventually, setting the number of retries and the interval between two subsequent runs.

In the test body, we retrieve the host/port configuration of our container from the configMap fixture (more on this shortly) and use it to build a GET request to our web application (val req = basicRequest.get(uri"http://$hostInfo")). The eventually block is where the magic happens, and the request is sent multiple times until either the two subsequent assertions work or the timeout set by IntegrationPatience elapses.

As for the assertions, we check that the response is a successful one (resp.code.isSuccess shouldBe true) and that its body contains the string “Hello, ScalaMatters!” (resp.body.value should include("Hello, ScalaMatters!")). should and shouldBe are ScalaTest syntax to express assertions in a more human-readable way and are known as “matchers”.

If you think that’s a lot of boilerplate code for a very simple test, hold on and brace yourselves, the worst is yet to come. Let’s take a look at the implementation of getHostInfo.

```scala
class SampleSbtDockerComposeSpec extends ...:
  val ServiceName = "scalaMatters"
  val ServiceHostKey = s"$ServiceName:8080"
  ...

  def getHostInfo(configMap: ConfigMap): String =
    getContainerSetting(configMap, ServiceHostKey)

  def getContainerSetting(configMap: ConfigMap, key: String): String =
    if configMap.keySet.contains(key) then configMap(key).toString
    else
      throw TestFailedException(
        message = s"Cannot find the expected Docker Compose service key '$key'",
        failedCodeStackDepth = 10
      )
```

The getContainerSetting method does nothing particularly special, but it’s quite low-level. It simply checks whether the ConfigMap provided as a fixture contains a given key, returning the corresponding value, if any. If there’s no such value, the method throws an exception.

Nonetheless, this is a lot of low-level boilerplate code. After all, it shouldn’t be us writing it! We’d prefer the plugin to handle that for us. As a matter of fact, we’re likely to duplicate that code across different projects or, worse, across multiple test suites within the same project.

Furthermore, we’re also forced to duplicate some information we already wrote in the Docker Compose file, namely the name of the service (see ServiceName) as well as the container port (see ServiceHostKey). This is exactly what we wanted to avoid in the first place.


TestContainers-Scala

Let’s now turn to TestContainers-Scala. As we already saw, it’s a Scala wrapper of a Java library designed to manage Docker containers in our tests. The main difference from sbt-docker-compose is that TestContainers creates disposable containers for each test suite, or even each test case. Lastly, TestContainers lets us configure the execution of our Docker images in different ways: using a Docker Compose file or programmatically (either by hand or using some pre-defined modules).

One of the main advantages of this library over sbt-docker-compose is its better interoperability with the build system. For example, in SBT, we can use the “usual” tasks to run tests, such as test and testOnly. Debugging a test is also simpler.

The out-of-the-box integration with SBT is useful for another purpose. As we’ll see in a second, we can configure our build to publish the name and version of the Docker image as an environment variable, to avoid code duplication as much as possible.
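For instance, assuming the build already knows the image coordinates, a build.sbt fragment along these lines can hand them to the tests (the variable name DOCKER_IMAGE_FULL_NAME is our own convention, not a library requirement):

```scala
// build.sbt (fragment). Forked test JVMs receive the image name through the
// environment, so the Scala test code never hardcodes it.
Test / fork := true
Test / envVars := Map(
  "DOCKER_IMAGE_FULL_NAME" -> s"scala-matters:${version.value}"
)
```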

The Container Definition

In our example, we’ll tell TestContainers how to run our Docker image programmatically, via the so-called “container definition”:

```scala
class AppContainer private (underlying: GenericContainer) extends GenericContainer(underlying):
  lazy val port: Int = mappedPort(AppContainer.Port)

object AppContainer:
  private val Port = 8080

  object Def extends GenericContainer.Def[AppContainer](
    AppContainer(
      GenericContainer(
        dockerImage = System.getenv("DOCKER_IMAGE_FULL_NAME"),
        exposedPorts = Seq(Port),
        waitStrategy = HostPortWaitStrategy()
      )
    )
  )
```

The container definition is a Scala object extending TestContainer’s GenericContainer.Def and specifying the configuration of the Docker container. In our example, we define the name of the Docker image (notice the code reads it from an environment variable, set at build time by SBT), the container port, and the waiting strategy.

This latter setting is very important, as TestContainers uses it to decide when the container is ready and, therefore, when to start the test suite (or test case). This approach is very declarative. Furthermore, we can use it to specify very complex conditions, from a simple check on the container port (as we did in our example) to inspections of the container’s logs.
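For example, switching from the port check to a log inspection is a matter of swapping the strategy in the container definition (a fragment, not runnable on its own; the log pattern below is invented for illustration):

```scala
// Fragment of the GenericContainer configuration: wait for a matching log
// line instead of the mapped port.
import org.testcontainers.containers.wait.strategy.Wait

GenericContainer(
  dockerImage = System.getenv("DOCKER_IMAGE_FULL_NAME"),
  exposedPorts = Seq(8080),
  waitStrategy = Wait.forLogMessage(".*Server started.*\\n", 1)
)
```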

Lastly, we define a class, named AppContainer, providing a well-typed interface to access the information of the running container. In this case, we only expose the port mappings. However, we could use this pattern to expose other settings, such as credentials.

The Test Suite

It is now time to look at our test suite. Notice how much simpler it is, with far less boilerplate code.

```scala
class SampleTestContainersSpec extends AnyFunSuite with TestContainerForAll with ...:

  override val containerDef: AppContainer.Def.type = AppContainer.Def

  test("The app returns a success code and the string 'Hello, ScalaMatters!'") {
    withContainers { app =>
      val host = app.host
      val port = app.port
      val client = SimpleHttpClient()
      val req = basicRequest.get(uri"http://$host:$port")
      val resp = client.send(req)
      resp.code.isSuccess shouldBe true
      resp.body.value should include("Hello, ScalaMatters!")
    }
  }
```

We removed Eventually and IntegrationPatience and mixed in the TestContainerForAll trait, which reuses the same Docker container for all the test cases in the suite.

As a consequence, the code of the test case becomes much simpler. In particular, we’re not treating the system as an eventually consistent one anymore, for TestContainers-Scala takes care of waiting for our application to be ready.

Lastly, notice how we get an instance of the AppContainer class using the withContainers method. This way we can access the network configuration without fetching it from an untyped map. Instead, we’re using a much cleaner, more readable (and well-typed!) interface.


Conclusion

In this blog post, we saw how important testing code is. Unfortunately, too often we don’t pay enough attention to it, ending up with “spaghetti” tests so coupled and unreadable that they make the entire code base difficult to refactor and extend. Testing libraries, frameworks, and plugins play a key role in enhancing, or reducing, the readability and maintainability of the test base.

Here, we compared two alternatives for writing Docker-based tests in Scala. First, we met sbt-docker-compose, which is abandoned and difficult to use. Secondly, we dove into TestContainers-Scala, a Scala wrapper of a Java library.

We discussed that the former is generally faster than the latter, as it brings up the containers just once, rather than (at least) once per test suite. However, the code produced by the latter is arguably more readable and more reusable across different projects.

Generally speaking, whenever you come across a new library or framework, do not evaluate only its performance. Instead, once you’re done with your evaluation project, take a step back and look at the resulting code. How readable is it? How much boilerplate is there? How easy will it be for your coworkers to pick the technology up and work with it? These are questions programmers and software engineers cannot shy away from. They need precise answers because, as Alan Page noted:

When you hear the phrase “it’s just test code”, to me that’s a code smell.



Thank you so much to Matteo for submitting a blog to Scala Matters! Your contribution to the Scala community means a lot.

If you would like to submit a blog or a video, contact Patrycja or email

Alternatively, follow Scala Matters and UMATR for more Scala blogs and updates!

