Title: Snowflake Data Engineering (Final Release)
Author: Maja Ferle
Publisher: Manning Publications
Year: 2025
Pages: 370
Language: English
Format: pdf (true)
Size: 19.4 MB
A practical introduction to data engineering on the powerful Snowflake cloud data platform.
Data engineers create the pipelines that ingest raw data, transform it, and funnel it to the analysts and professionals who need it. The Snowflake cloud data platform provides a suite of productivity-focused tools and features that simplify building and maintaining data pipelines. In Snowflake Data Engineering, Snowflake Data Superhero Maja Ferle shows you how to get started.
In Snowflake Data Engineering you will learn how to:
• Ingest data into Snowflake from both cloud and local file systems
• Transform data using functions, stored procedures, and SQL
• Orchestrate data pipelines with streams and tasks, and monitor their execution
• Use Snowpark to run Python code in your pipelines
• Deploy Snowflake objects and code using continuous integration principles
• Optimize performance and costs when ingesting data into Snowflake
Snowflake Data Engineering reveals how Snowflake makes it easy to work with unstructured data, set up continuous ingestion with Snowpipe, and keep your data safe and secure with best-in-class data governance features. Along the way, you’ll practice the most important data engineering tasks as you work through relevant hands-on examples. Throughout, author Maja Ferle shares design tips drawn from her years of experience to ensure your pipeline follows the best practices of software engineering, security, and data governance.
Data engineering is the practice of building solutions that extract data from source systems, transform the data into useful information, and present the harmonized data to users for downstream consumption. Data engineers are responsible for building data pipelines that enable data analysts, data scientists, and other users to access the data they need to do their jobs. Providing high-quality data on time is essential for effective analytics, which is why data engineers play a critical role in the data analytics domain.
This book teaches data engineering skills on the powerful Snowflake platform. It starts by guiding you in building your first simple data pipeline and then expands the pipeline with increasingly complex features, including performance optimization, data governance, security, orchestration, and augmenting your data with generative AI.
Foreword by Joe Reis.
About the technology:
Pipelines that ingest and transform raw data are the lifeblood of business analytics, and data engineers rely on Snowflake to help them deliver those pipelines efficiently. Snowflake is a full-service cloud-based platform that handles everything from near-infinite storage and fast elastic compute services to inbuilt AI/ML capabilities like vector search, text-to-SQL, code generation, and more. This book gives you what you need to create effective data pipelines on the Snowflake platform.
Who should read this book: This book is for readers who have some familiarity with Snowflake, such as navigating the Snowsight user interface and using worksheets to execute queries and commands. Readers should have a basic understanding of data warehousing and data ingestion techniques. Previous use of ETL or ELT technologies for data ingestion is beneficial but not required.
Since Snowflake is a relational database, knowledge of SQL querying, including data definition language (DDL) and data manipulation language (DML) operations, is vital. Readers who plan to use Snowpark with Python must also know how to write Python code. Readers who plan to stage data for loading must know how to set up a cloud object store bucket or container and upload files with any of the supported providers (Amazon S3, Azure Blob Storage, Google Cloud Storage).