Title: Building Real-Time Analytics Systems: From Events to Insights with Apache Kafka and Apache Pinot (Final)
Author: Mark Needham
Publisher: O’Reilly Media, Inc.
Year: 2023
Pages: 218
Language: English
Format: True EPUB (Retail Copy)
Size: 10.1 MB
Gain deep insight into real-time analytics, including the features of these systems and the problems they solve. With this practical book, data engineers at organizations that use event-processing systems such as Kafka, Google Pub/Sub, and AWS Kinesis will learn how to analyze data streams in real time. The faster you derive insights, the quicker you can spot changes in your business and act accordingly.
Author Mark Needham from StarTree provides an overview of the real-time analytics space and an understanding of what goes into building real-time applications. The book's second part offers a series of hands-on tutorials that show you how to combine multiple software products to build real-time analytics applications for an imaginary pizza delivery service.
Tools and platforms such as Apache Kafka (for data streaming), Apache Flink (stream processing), Apache Pinot (data analytics), and Apache Superset (data visualization) provide an excellent foundation for real-time analytics and have seen tremendous uptake over the last few years. At the same time, getting started with implementing your first use cases can be challenging, and you might ask yourself questions such as these: Which tools should you choose for which purpose? How do you put the individual pieces together into a coherent solution? What challenges arise when putting them into production, and how do you overcome them?
Mark’s book is a treasure trove of guidance around these and many other concerns. Starting with the foundations (What even is real-time analytics?), he provides a comprehensive overview of the software ecosystem in this space, discusses Apache Pinot as one of the leading real-time analytics platforms, and dives into production considerations as well as more specific aspects such as geospatial queries and upsert operations (a notoriously tricky part of most analytics stores).
This book is a practical guide for implementing real-time analytics applications on top of existing data infrastructure. It is aimed at data engineers, data architects, and application developers who have some experience working with streaming data or would like to get acquainted with it.
In Chapters 1 and 2, we give an introduction to the topic and an overview of the types of real-time analytics applications that you can build. We also describe the types of products and tools that you’ll likely be using, explain how to pick the right tool for the job, and point out when a tool might not be necessary.
In Chapter 3, we introduce a fictional pizza company that already has streaming infrastructure set up but hasn’t yet implemented any real-time functionality. The next seven chapters will show how to implement different types of real-time analytics applications for this pizza company. If you’re interested in getting your hands dirty, these chapters will be perfect for you, and hopefully you’ll pick up some ideas (and code!) that you can use in your own projects.
The book concludes with considerations for putting applications into production, a look at some real-world use cases of real-time analytics, and a gaze into our real-time analytics crystal ball to see what might be coming in this field over the next few years.
You will:
• Learn common architectures for real-time analytics
• Discover how event processing differs from real-time analytics
• Ingest event data from Apache Kafka into Apache Pinot
• Combine event streams with OLTP data using Debezium and Kafka Streams
• Write real-time queries against event data stored in Apache Pinot
• Build a real-time dashboard and order tracking app
• Learn how Uber, Stripe, and Just Eat use real-time analytics