In this course we’ll use message brokers such as RabbitMQ and Kafka, running locally via Docker, to build data pipelines that introduce Broadway’s functionality. We’ll show how easy it is to leverage these tools within Broadway to create pipelines capable of processing tens of thousands of messages per second.
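For instance, a local RabbitMQ broker with its management UI can be started with a single Docker command (the container name and image tag below are only illustrative; the course walks through its own setup):

```sh
# Start a throwaway RabbitMQ broker with the management plugin.
# 5672 is the AMQP port the pipeline connects to; 15672 serves the web UI.
docker run -d --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3-management
```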
We’ll start with a simple producer and consumer, then extend it with various producer-consumers. From there we’ll experiment with increasing the load at various stages and see how the Erlang VM’s fault tolerance combines with Broadway’s back-pressure to handle it. Then we’ll show how to instrument your pipeline with telemetry to get valuable insights into how it’s performing. Finally, we’ll introduce the batching capabilities that come baked into Broadway.
By the time you’re finished, you’ll have implemented your own ETL pipeline, learned how to integrate it with an existing one, and seen how easy it is to switch message brokers without changing the underlying implementation.
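To give you a feel for where we’re headed, here is a minimal sketch of the kind of pipeline we’ll build. It assumes the broadway and broadway_rabbitmq packages are in your mix deps; the queue name, concurrency, and batch settings are placeholders:

```elixir
defmodule MyApp.Pipeline do
  use Broadway

  alias Broadway.Message

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        # Pull messages from a local RabbitMQ queue (queue name is a placeholder).
        module: {BroadwayRabbitMQ.Producer, queue: "events", on_failure: :reject_and_requeue},
        concurrency: 1
      ],
      processors: [
        default: [concurrency: 10]
      ],
      batchers: [
        default: [batch_size: 100, batch_timeout: 1_000]
      ]
    )
  end

  @impl true
  def handle_message(_processor, %Message{} = message, _context) do
    # Transform step: update the payload before it reaches the batcher.
    Message.update_data(message, &String.upcase/1)
  end

  @impl true
  def handle_batch(:default, messages, _batch_info, _context) do
    # Load step: e.g. insert the batch into the database, then return the messages.
    messages
  end
end
```

Swapping out the producer module is how we’ll later switch brokers without touching handle_message/3 or handle_batch/4.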
Tutorial objectives:
- Set up and configure message brokers with Docker
- Build scalable data pipelines using Broadway
- Implement producer-consumer patterns
- Handle load management and back-pressure
- Instrument pipelines with telemetry
- Master Broadway’s batching capabilities
- Create and integrate ETL pipelines
Target audience:
- Intermediate users looking to build robust data processing pipelines with Broadway.
Tutorial prerequisites:
- Docker
- Elixir
- Familiarity with:
  - Broadway
  - RabbitMQ
  - Kafka
  - Redpanda
- Some Ecto knowledge