Kafka, Schema Registry, JUnit and Test Containers — Part III: Reducing test time by removing all topics between tests
Introduction
In Part II of this series, we saw how to create a JUnit Test Extension to make testing with test containers easier. We just need to add the extension to a test to leverage both containers, and we can even add more utility methods to the KafkaTestCluster utility class as the need arises. The problem is that every test class that uses this extension pays more than 20 seconds of start-up time for both Kafka and the Schema Registry. How can we improve this solution?
Multiple Tests and Performance
If we look at the Test Containers documentation, one of the recommended ways to avoid the penalty of starting containers multiple times is the Singleton Container Pattern. To use it in our Test Extension, we simply declare our test containers as static fields and start them in a static initializer; we no longer need to implement the BeforeAllCallback and AfterAllCallback interfaces.
private static final Network network = Network.newNetwork();

private static final KafkaContainer kafkaContainer = new KafkaContainer(
        DockerImageName.parse("confluentinc/cp-kafka:7.5.2"))
        .withNetwork(network);

private static final GenericContainer<?> schemaRegistry = new GenericContainer<>(
        DockerImageName.parse("confluentinc/cp-schema-registry:7.5.2"))
        .withNetwork(network)
        .withExposedPorts(8081)
        .withEnv("SCHEMA_REGISTRY_HOST_NAME", "schema-registry")
        .withEnv("SCHEMA_REGISTRY_LISTENERS", "http://0.0.0.0:8081")
        .withEnv("SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS",
                "PLAINTEXT://" + kafkaContainer.getNetworkAliases().get(0) + ":9092")
        .waitingFor(Wait.forHttp("/subjects").forStatusCode(200));

// Started once per JVM and shared by every test class that uses the extension
static {
    kafkaContainer.start();
    schemaRegistry.start();
}
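With the singleton pattern in place, a test class only has to register the extension to get a running cluster. As a rough sketch of the usage (the KafkaTestCluster name comes from Part II; exposing getKafkaSettings() statically is an assumption of this example):

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

// Hypothetical usage; the extension class and accessor names follow Part II
@ExtendWith(KafkaTestCluster.class)
class MyKafkaApplicationTest {

    @Test
    void shouldProcessRecords() {
        // The containers are already running, so only the first test class
        // in the JVM pays the start-up cost
        var settings = KafkaTestCluster.getKafkaSettings();
        // ... exercise the application under test against the shared cluster ...
    }
}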
Clean Up Strategies
Having the containers run continuously is not enough for all the tests to pass. Since the containers are now shared between test classes, we need to ensure that the actions of one test don’t affect the next. A simple way to do this is to use random topic names and application IDs in our Kafka applications, so they can never collide. This forces us to make the topic name and application ID configurable, which is a good practice in my opinion.
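As a minimal sketch of this approach (the names here are illustrative), each test can derive its topics and application ID from a random suffix:

import java.util.UUID;

// Illustrative: a unique suffix per test keeps topics and application IDs
// from colliding on the shared containers
String testId = UUID.randomUUID().toString().substring(0, 8);
String inputTopic = "input-" + testId;
String outputTopic = "output-" + testId;
String applicationId = "app-under-test-" + testId;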
An alternative is to use Kafka’s Admin API and delete all topics and consumer groups between tests. In our extension, this means implementing the BeforeAllCallback interface and using the Admin API to clear any leftover state before the next test class starts.
@Override
public void beforeAll(ExtensionContext context) throws Exception {
    try (AdminClient adminClient = AdminClient.create(getKafkaSettings())) {
        // Delete every topic except _schemas, which the Schema Registry needs
        ListTopicsResult result = adminClient.listTopics();
        Set<String> topicNames = result.names().get();
        topicNames.remove("_schemas");
        // all().get() blocks until the deletions have completed
        adminClient.deleteTopics(TopicCollection.ofTopicNames(topicNames))
                .all().get();
        // Delete all consumer groups so offsets do not leak between tests
        List<String> consumerGroups = adminClient
                .listConsumerGroups().all().get()
                .stream()
                .map(ConsumerGroupListing::groupId)
                .collect(Collectors.toList());
        adminClient.deleteConsumerGroups(consumerGroups)
                .all().get();
    }
}

Conclusion
With this technique, we avoid the start-up time of the Kafka and Schema Registry containers for each test class. This is just scratching the surface of what can be done in a test extension, but it is best to keep the extension small: a lot of magical code in your tests makes them harder to read.
Personally, I prefer to use random topic names and application IDs in my tests. It makes my Kafka applications more adaptable. The topic names, application IDs, and other settings, such as the Kafka Streams storage directory, can all be configured via external settings. I use these settings in the unit tests and then leverage them when deploying the applications to production. It also provides additional flexibility if, for instance, I need to add a filter between two applications: I can simply configure the output and input topics and deploy a new application that does the filtering, all without changing any of the existing applications.
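As an illustration of what such external settings might look like (the variable names are invented for this sketch, and environment variables are just one possible mechanism):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

// Illustrative sketch: resolve topics and Kafka Streams settings from the
// environment, with defaults suitable for local test runs
String inputTopic = System.getenv().getOrDefault("APP_INPUT_TOPIC", "input");
String outputTopic = System.getenv().getOrDefault("APP_OUTPUT_TOPIC", "output");

Properties streamsConfig = new Properties();
streamsConfig.put(StreamsConfig.APPLICATION_ID_CONFIG,
        System.getenv().getOrDefault("APP_ID", "app-under-test"));
streamsConfig.put(StreamsConfig.STATE_DIR_CONFIG,
        System.getenv().getOrDefault("APP_STATE_DIR", "/tmp/kafka-streams"));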
I hope this taster of unit tests with Test Containers and Kafka has gotten you interested in both technologies. If you would like to see me expand on any of them, reach out. For my next articles, I am tempted to write about Kafka Connect. What do you think?
Articles in the series: