Tools and Resources for Testing Apache Kafka®

Apache Kafka® has extensive tooling that helps developers write good tests and build continuous integration pipelines:

Dev/Test/Prod Environments

Keep your development, test, and production resources separate from one another.

Types of Testing

Unit testing: Cheap and fast to run, giving developers a fast feedback cycle. Their fine granularity means the exact location of a problem is pinpointed clearly. Unit tests typically run as a single process and do not touch the network or disk.
Integration testing: For testing with other components such as a Kafka broker; you can run against simulated Kafka components or real ones. Integration tests are often slower to run, and their failures are harder to troubleshoot than those of unit tests.
Performance, soak, and chaos testing: For optimizing your client applications, validating long-running behavior, and verifying resilience against failures.

Clients that write to Kafka are called producers, and clients that read from Kafka are called consumers. Often, applications and services act as both producers and consumers. The streaming database ksqlDB and the Kafka Streams application library are also clients, as they have embedded producers and consumers. Here are test utilities you can use with each type of client (a MockProducer sketch follows the list below):

Unit testing: ksql-test-runner (ksqlDB), TopologyTestDriver (Kafka Streams), MockProducer and MockConsumer (JVM producer and consumer), rdkafka_mock (librdkafka producer and consumer)
Integration testing: Testcontainers (ksqlDB, Kafka Streams, and the JVM clients), trivup (librdkafka clients), or a real cluster in Confluent Cloud (any client)
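
For example, MockProducer lets you unit test producing code entirely in memory, with no broker, network, or disk. The following is a minimal sketch; the AlertPublisherTest class, the publishAlert method, and the alerts topic are hypothetical names invented for this illustration.

```java
import org.apache.kafka.clients.producer.MockProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AlertPublisherTest {

    // Hypothetical application code under test: it depends only on the
    // Producer interface, so a MockProducer can be swapped in.
    static void publishAlert(Producer<String, String> producer, String symbol, String message) {
        producer.send(new ProducerRecord<>("alerts", symbol, message));
    }

    public static void main(String[] args) {
        // autoComplete=true completes each send immediately and successfully.
        MockProducer<String, String> mockProducer =
                new MockProducer<>(true, new StringSerializer(), new StringSerializer());

        publishAlert(mockProducer, "KAFKA", "price above threshold");

        // The mock records everything "sent" in memory for assertions.
        ProducerRecord<String, String> sent = mockProducer.history().get(0);
        if (!sent.topic().equals("alerts") || !sent.key().equals("KAFKA")) {
            throw new AssertionError("unexpected record: " + sent);
        }
        System.out.println("publishAlert produced: " + sent);
    }
}
```

Because the application code depends only on the Producer interface, the same method can run unchanged against a real KafkaProducer in production.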

Learn more about application testing

Schema Registry and Data Formats like Avro, Protobuf, JSON Schema

Without schemas, data contracts are defined only loosely and out-of-band (if at all), which carries the high risk that consumers of data will break as producers change their behavior over time. Data schemas help to put in place explicit “data contracts” to ensure that data written by Kafka producers can always be read by Kafka consumers, even as producers and consumers evolve their schemas.

This is where Confluent Schema Registry helps: it provides centralized schema management and compatibility checks as schemas evolve, and it supports Avro, Protobuf, and JSON Schema.
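
To make compatibility concrete, the sketch below uses plain Avro's SchemaValidatorBuilder rather than Schema Registry itself: it checks that a new schema version can still read data written with the old one (backward compatibility). The User schemas are invented for this illustration.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaValidationException;
import org.apache.avro.SchemaValidator;
import org.apache.avro.SchemaValidatorBuilder;

import java.util.Collections;

public class CompatibilitySketch {
    public static void main(String[] args) throws SchemaValidationException {
        Schema v1 = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\","
            + "\"fields\":[{\"name\":\"id\",\"type\":\"long\"}]}");

        // v2 adds a field WITH a default, so records written as v1 stay readable.
        Schema v2 = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\","
            + "\"fields\":[{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"email\",\"type\":\"string\",\"default\":\"\"}]}");

        // canReadStrategy: the new schema must be able to read data written
        // with every existing schema (backward compatibility).
        SchemaValidator backward = new SchemaValidatorBuilder()
            .canReadStrategy()
            .validateAll();

        backward.validate(v2, Collections.singletonList(v1)); // throws if incompatible
        System.out.println("v2 is backward compatible with v1");
    }
}
```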

Schema Registry
Unit testing: MockSchemaRegistryClient
Integration testing: EmbeddedSingleNodeKafkaCluster, Testcontainers
Compatibility testing: Schema Registry Maven Plugin
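
As a minimal sketch of the unit-testing row above, the example below wires MockSchemaRegistryClient into an Avro serializer/deserializer pair, assuming the Confluent kafka-avro-serializer and kafka-schema-registry-client artifacts are on the classpath. The Pageview schema and the pageviews topic are invented for this illustration.

```java
import io.confluent.kafka.schemaregistry.client.MockSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import io.confluent.kafka.serializers.KafkaAvroDeserializer;
import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

import java.util.Map;

public class MockSchemaRegistrySketch {
    public static void main(String[] args) {
        // In-memory stand-in for Schema Registry: no network, no running service.
        SchemaRegistryClient registry = new MockSchemaRegistryClient();

        // Serializer and deserializer share the same mock client, so the
        // schema ID registered on serialize resolves on deserialize.
        KafkaAvroSerializer serializer = new KafkaAvroSerializer(registry);
        KafkaAvroDeserializer deserializer = new KafkaAvroDeserializer(registry);

        // The URL is required by the config but never actually contacted
        // when a client instance is passed in directly.
        Map<String, String> config = Map.of("schema.registry.url", "http://unused:8081");
        serializer.configure(config, false);   // false = configuring the value serde
        deserializer.configure(config, false);

        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Pageview\","
            + "\"fields\":[{\"name\":\"url\",\"type\":\"string\"}]}");
        GenericRecord pageview = new GenericData.Record(schema);
        pageview.put("url", "https://example.com");

        byte[] bytes = serializer.serialize("pageviews", pageview);
        GenericRecord roundTripped =
            (GenericRecord) deserializer.deserialize("pageviews", bytes);
        System.out.println(roundTripped); // {"url": "https://example.com"}
    }
}
```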

Learn more about application testing

Other Testing

Harden your application with performance testing, soak testing, and chaos testing.

You can benchmark with basic Apache Kafka command line tools like kafka-producer-perf-test and kafka-consumer-perf-test (docs). For the other types of testing, you can consider: Trogdor, Testcontainers modules like Toxiproxy, and Pumba for Docker environments.
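
kafka-producer-perf-test should be your first stop for benchmarking. Purely for orientation, here is a deliberately naive, hand-rolled throughput probe; it assumes a broker at localhost:9092 and a topic named perf-test (both invented for this illustration) and reports only aggregate throughput, not the percentile latencies the real tool gives you.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class NaiveProducerBench {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        int numRecords = 100_000;
        String payload = "x".repeat(100); // ~100-byte record values

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            long start = System.nanoTime();
            for (int i = 0; i < numRecords; i++) {
                producer.send(new ProducerRecord<>("perf-test", payload));
            }
            producer.flush(); // block until all in-flight sends complete
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d records in %.2fs (%.0f records/s)%n",
                numRecords, seconds, numRecords / seconds);
        }
    }
}
```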

Using Realistic Data

To dive into more involved scenarios, test your client application, or build a Kafka demo for your teammates, you may want to use more realistic datasets. One option is to copy data from a production environment to a test environment (with Cluster Linking, Confluent Replicator, Kafka's MirrorMaker, etc.) or to pull data from another live system.

Alternatively, you can generate mock data for your topics from predefined schema definitions, including complex records with multiple fields. If you use the Datagen Connector, you can format the data as Avro, JSON, or Protobuf; a sample connector configuration is sketched below.
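
As a sketch, a Datagen source connector configuration might look like the following, assuming the connector plugin is installed on a running Kafka Connect worker; the connector name and topic are invented for this illustration, and quickstart selects one of the connector's predefined schemas.

```json
{
  "name": "datagen-pageviews",
  "config": {
    "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
    "kafka.topic": "pageviews",
    "quickstart": "pageviews",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "max.interval": "100",
    "iterations": "10000000",
    "tasks.max": "1"
  }
}
```

Posting this JSON to the Connect REST API (POST /connectors) starts a stream of generated pageview records into the pageviews topic.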