Harnessing Apache Kafka on Kubernetes with Strimzi
Strimzi is a CNCF incubating project (it moved to the incubating stage on Feb 08, 2024) that lets us run Apache Kafka on a Kubernetes cluster.
Apache Kafka is a leading tool for building real-time, event-driven applications. You can also use it to build and run highly scalable, fault-tolerant data streaming applications and data pipelines. However, deploying and managing Kafka infrastructure is tricky and complicated, and this is where Strimzi comes into the picture: it simplifies the whole experience of running Kafka on Kubernetes.
In this blog, we will look at what Strimzi is and how it helps both developers and engineering organizations.
Why Strimzi to run Kafka on Kubernetes?
Strimzi lets you run an Apache Kafka cluster on Kubernetes in various deployment configurations. It uses the operator pattern, which makes running Kafka on Kubernetes much easier.
As a developer, you can use Strimzi to run a Kafka cluster on Minikube on your own machine and do end-to-end application development locally. You don't need real Kafka servers running in the cloud or on-premises, which saves the company potential cost.
For production, you can customize the Strimzi configuration to your requirements. For instance, you can use rack awareness to distribute Kafka brokers across Availability Zones (AZs) in the cloud, or use Kubernetes-native taints and tolerations to run Kafka on dedicated nodes. Kafka can also be exposed outside Kubernetes using NodePort, LoadBalancer, or Ingress, all of which can be secured using TLS certificates.
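As a rough sketch of what such production settings can look like, the snippet below writes a fragment of a Kafka custom resource to a file. It is deliberately incomplete (replicas, storage, and node pools are omitted), and the cluster name my-cluster, the file name, the external listener on port 9094, and the dedicated=Kafka taint are assumptions made for illustration only.

```shell
# Write an illustrative FRAGMENT of a Strimzi Kafka resource to a file.
# It is not a complete manifest: replicas, storage and node pools are omitted.
cat > kafka-production-fragment.yaml <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster            # assumed cluster name
spec:
  kafka:
    # Rack awareness: derive each broker's rack from the node's zone label
    rack:
      topologyKey: topology.kubernetes.io/zone
    listeners:
      # Plain listener for clients running inside the Kubernetes cluster
      - name: plain
        port: 9092
        type: internal
        tls: false
      # External listener exposed through a LoadBalancer, encrypted with TLS
      - name: external
        port: 9094
        type: loadbalancer
        tls: true
    template:
      pod:
        # Tolerate a hypothetical node taint dedicated=Kafka:NoSchedule
        tolerations:
          - key: dedicated
            operator: Equal
            value: Kafka
            effect: NoSchedule
EOF
echo "wrote kafka-production-fragment.yaml"
```

A toleration only allows the pods onto tainted nodes; pinning brokers to those nodes would typically also need node affinity, which is left out here to keep the sketch short.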
Strimzi's Kubernetes-native management extends beyond the brokers to Kafka topics, users, and Kafka Connect, all through custom resources. This lets you manage complete Kafka applications using your existing Kubernetes processes and tools such as kubectl.
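To illustrate managing a topic as a custom resource, here is a minimal sketch that writes a KafkaTopic manifest to a file. The topic name my-topic, the cluster name my-cluster, the partition count, and the retention value are assumptions chosen for the example, not recommendations.

```shell
# Write a KafkaTopic manifest that declares a topic for the Topic Operator.
cat > my-topic.yaml <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    # Tells the operator which Kafka cluster this topic belongs to
    strimzi.io/cluster: my-cluster
spec:
  partitions: 3
  replicas: 1                 # 1 suits a single-broker development cluster
  config:
    retention.ms: 7200000     # keep messages for two hours (example value)
EOF
echo "wrote my-topic.yaml"
```

On a cluster where Strimzi is running, the file could then be applied with kubectl apply -f my-topic.yaml -n kafka and inspected with kubectl get kafkatopic -n kafka.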
Core Principles of Strimzi
- High Security: Ensures a secure Kafka cluster with support for TLS, certificate management, and OAuth authentication
- Simplicity and Flexibility: Offers a straightforward yet highly configurable setup, enabling Kafka access through Kubernetes-native NodePort, Ingress, and LoadBalancer options
- Dedicated Node Deployment: Runs Kafka on dedicated nodes within a Kubernetes cluster
- Kubernetes Integration: Allows interaction and management of the cluster using Kubernetes-native tools like kubectl, operator-based approaches, and GitOps
- Seamless Integration: Integrates smoothly with other projects like OpenTelemetry, Prometheus, OPA, and more
Architecture - Strimzi Operators
Strimzi provides three operators to manage Kafka on Kubernetes:
- Cluster Operator - This is the main operator, responsible for deploying the Kafka cluster and managing broker configuration. It also handles Kafka version upgrades by rolling out the newer version one broker at a time, and it supports other operands such as Kafka Connect.
- Topic Operator - Using the KafkaTopic custom resource, this operator creates, deletes, and manages the topics that users declare for the Kafka cluster.
- User Operator - Using the KafkaUser custom resource, this operator manages the cluster's users and their ACL (Access Control List) permissions on topics.
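As an illustration of what the User Operator consumes, the sketch below writes a KafkaUser manifest that grants read access to a single topic. The user name my-user, the topic my-topic, and the cluster my-cluster are hypothetical, and the example assumes the Kafka cluster has a TLS listener and simple authorization enabled, which a bare development cluster may not have.

```shell
# Write a KafkaUser manifest with TLS authentication and a read-only ACL.
cat > my-user.yaml <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: my-user
  labels:
    # Tells the operator which Kafka cluster this user belongs to
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: tls                 # the operator issues a client certificate
  authorization:
    type: simple              # ACL-based authorization
    acls:
      # Allow this user to describe and read one specific topic
      - resource:
          type: topic
          name: my-topic
          patternType: literal
        operations:
          - Describe
          - Read
        host: "*"
EOF
echo "wrote my-user.yaml"
```

A real consumer would usually also need an ACL on its consumer group; that is omitted here for brevity.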
Advantages of using Strimzi
- Local Development Support: Developers can use Strimzi with Minikube on their local machines to quickly set up a Kafka cluster for development and testing.
- Enhanced Security: Strimzi offers TLS encryption and strong authentication/authorization to safeguard data streams.
- Exceptional Performance: Kafka delivers high throughput and low latency, and running it under Strimzi keeps those properties available for efficient real-time data processing and analytics.
- Robust Data Handling: A Strimzi-managed cluster keeps Kafka's message ordering, replay capabilities, and log compaction, providing reliable data storage and processing.
- Simplified Deployment: Strimzi streamlines Kafka cluster management with custom resources, allowing easy configuration through YAML files. This approach speeds up deployment and minimizes errors.
- Scalability and Reliability: Leveraging Kafka's scalability and fault tolerance, Strimzi supports seamless data streaming as your infrastructure grows.
Setting up Kafka using Strimzi on Minikube
As prerequisites, you need Docker Desktop and Minikube configured on your local machine. First, start Minikube by running minikube start in a terminal, then follow the steps below to set up Kafka on it.
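Before starting, it can help to confirm that the tools are actually installed. The helper below is a small sketch, not part of Strimzi: missing_tools is a hypothetical function name, and the 4 GB memory setting is only a suggested starting point for a single-node Kafka cluster.

```shell
# Print the name of every required tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing="$(missing_tools docker minikube kubectl)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found; start the cluster with:"
  echo "  minikube start --memory=4096"
fi
```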
(1) Create a new Kubernetes namespace for Kafka deployment
kubectl create namespace kafka
kubectl config set-context --current --namespace=kafka # To default to kafka ns
(2) Deploy the Strimzi operator in the newly created kafka namespace.
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
(3) Now, deploy the Apache Kafka cluster by applying a Kafka custom resource, an instance of the CRD (Custom Resource Definition) installed by Strimzi
kubectl apply -f https://strimzi.io/examples/latest/kafka/kraft/kafka-single-node.yaml -n kafka
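The operator and the Kafka cluster both take a little while to become ready, so it can be useful to wait for them rather than poll by hand. The two helper functions below are a sketch under stated assumptions: the function names and the KUBECTL override variable are hypothetical, the 300-second timeout is arbitrary, and the resource names match the ones used in this walkthrough.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Block until the Strimzi Cluster Operator deployment has rolled out.
# $1: namespace (defaults to kafka)
wait_for_operator() {
  "$KUBECTL" rollout status deployment/strimzi-cluster-operator \
    -n "${1:-kafka}" --timeout=300s
}

# Block until the Kafka custom resource reports the Ready condition.
# $1: cluster name (defaults to my-cluster), $2: namespace (defaults to kafka)
wait_for_kafka() {
  "$KUBECTL" wait "kafka/${1:-my-cluster}" --for=condition=Ready \
    --timeout=300s -n "${2:-kafka}"
}
```

Running wait_for_operator after step (2) and wait_for_kafka after step (3) makes each step finish only once its resources are usable.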
To verify the Kafka setup on Minikube, run the following commands:
~ » kubectl get po -n kafka
NAME                                          READY   STATUS    RESTARTS   AGE
kafka-consumer                                1/1     Running   0          4m33s
my-cluster-dual-role-0                        1/1     Running   0          7m22s
my-cluster-entity-operator-6b5c9f5764-s5xbc   2/2     Running   0          6m58s
strimzi-cluster-operator-865f986d89-tplcs     1/1     Running   0          8m47s

~ » kubectl get kafka -n kafka
NAME         DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS   READY   METADATA STATE   WARNINGS
my-cluster                                                  True    KRaft
Voilà! That is all it takes.
Let's test connectivity by sending a message to this Kafka cluster and receiving it back.
Producer, sending a message:
~ » kubectl -n kafka run kafka-producer -ti --image=quay.io/strimzi/kafka:0.43.0-kafka-3.8.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic my-topic
If you don't see a command prompt, try pressing enter.
>hello, Vinod! Demo for Strimzi
Consumer, receiving the message:
~ » kubectl -n kafka run kafka-consumer -ti --image=quay.io/strimzi/kafka:0.43.0-kafka-3.8.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic my-topic --from-beginning
If you don't see a command prompt, try pressing enter.
hello, Vinod! Demo for Strimzi
Conclusion
Strimzi simplifies deploying and managing Apache Kafka on Kubernetes, making it easier for organizations and developers to handle complex data streaming needs. Its Kubernetes-native approach enhances Kafka's flexibility, scalability, and reliability, whether for local development or large-scale production. By embracing Strimzi, you can harness the full power of Kafka on Kubernetes, ensuring your data infrastructure is both robust and future-proof. Start exploring Strimzi today to see how it can transform your Kafka deployments.
References
https://strimzi.io/
https://strimzi.io/docs/operators/latest/overview