DuckDB in Action
Author |
: Mark Needham |
Publisher |
: Simon and Schuster |
Total Pages |
: 310 |
Release |
: 2024-09-10 |
ISBN-10 |
: 1638355592 |
ISBN-13 |
: 9781638355595 |
Rating |
: 4/5 (95 Downloads) |
Synopsis DuckDB in Action by : Mark Needham
Dive into DuckDB and start processing gigabytes of data with ease, all with no data warehouse. DuckDB is a cutting-edge SQL database that makes it incredibly easy to analyze big data sets right from your laptop. In DuckDB in Action you’ll learn everything you need to know to get the most out of this awesome tool, keep your data secure on prem, and save hundreds on your cloud bill. From data ingestion to advanced data pipelines, you’ll learn everything you need to get the most out of DuckDB, all through hands-on examples.
Open up DuckDB in Action and learn how to:
• Read and process data from CSV, JSON and Parquet sources, both local and remote
• Write analytical SQL queries, including aggregations, common table expressions, window functions, special types of joins, and pivot tables
• Use DuckDB from Python, both with SQL and its "Relational" API, interacting with databases as well as data frames
• Prepare, ingest and query large datasets
• Build cloud data pipelines
• Extend DuckDB with custom functionality
Pragmatic and comprehensive, DuckDB in Action introduces the DuckDB database and shows you how to use it to solve common data workflow problems. You won’t need to read through pages of documentation; you’ll learn as you work. Get to grips with DuckDB's unique SQL dialect, learning to seamlessly load, prepare, and analyze data using SQL queries. Extend DuckDB with both Python and built-in tools such as MotherDuck, and gain practical insights into building robust and automated data pipelines.
About the technology
DuckDB makes data analytics fast and fun! You don’t need to set up Spark or run a cloud data warehouse just to process a few hundred gigabytes of data. DuckDB is easily embeddable in any data analytics application, runs on a laptop, and processes data from almost any source, including JSON, CSV, Parquet, SQLite and Postgres.
About the book
DuckDB in Action guides you example-by-example from setup, through your first SQL query, to advanced topics like building data pipelines and embedding DuckDB as a local data store for a Streamlit web app. You’ll explore DuckDB’s handy SQL extensions, get to grips with aggregation, analysis, and data without persistence, and use Python to customize DuckDB. A hands-on project accompanies each new topic, so you can see DuckDB in action.
What's inside
• Prepare, ingest and query large datasets
• Build cloud data pipelines
• Extend DuckDB with custom functionality
• Fast-paced SQL recap: From simple queries to advanced analytics
About the reader
For data pros comfortable with Python and CLI tools.
About the author
Mark Needham is a blogger and video creator at @LearnDataWithMark. Michael Hunger leads product innovation for the Neo4j graph database. Michael Simons is a Java Champion, author, and Engineer at Neo4j.
Author |
: Bo Ingram |
Publisher |
: Simon and Schuster |
Total Pages |
: 390 |
Release |
: 2024-11-12 |
ISBN-10 |
: 1638356122 |
ISBN-13 |
: 9781638356127 |
Rating |
: 4/5 (27 Downloads) |
Synopsis ScyllaDB in Action by : Bo Ingram
Build, maintain, and run databases that are easy to scale and quick to query, all with ScyllaDB. ScyllaDB in Action is your guide to everything you need to know about ScyllaDB, from your very first queries to running it in a production environment. It starts you with the basics of creating, reading, and deleting data and expands your knowledge from there. You’ll soon have mastered everything you need to build, maintain, and run an effective and efficient database.
Inside ScyllaDB in Action you’ll learn how to:
• Read, write, and delete data in ScyllaDB
• Design database schemas for ScyllaDB
• Write performant queries against ScyllaDB
• Connect and query a ScyllaDB cluster from an application
• Configure, monitor, and operate ScyllaDB in production
This book teaches you ScyllaDB the best way: through hands-on examples. Dive into the node-based architecture of ScyllaDB to understand how its distributed systems work, how you can troubleshoot problems, and how you can constantly improve performance.
About the technology
ScyllaDB is a versatile NoSQL database that can move large volumes of data fast. Very, very, very fast. This drop-in replacement for Cassandra takes full advantage of modern multi-core hardware and scales to handle large real-time data workloads with incredibly low latency. It features built-in monitoring and management tools, and its efficient use of computing resources can save a lot of money on high-volume applications.
About the book
ScyllaDB in Action demonstrates how to integrate ScyllaDB into data-intensive applications. You’ll work through a hands-on project step by step as you use ScyllaDB to store data and learn to configure, monitor, and safely operate a distributed database. Along the way, you’ll discover how ScyllaDB’s unique “shard per core” approach helps you deliver impressive performance in real-time systems.
What's inside
• Design schemas for ScyllaDB
• Write performant queries
• Get an instant speed boost over Cassandra
About the reader
For backend and infrastructure engineers who know the basics of SQL.
About the author
Bo Ingram is a staff software engineer at Discord working in database infrastructure. He has extensive experience working with ScyllaDB as an operator and developer. The technical editor on this book was Piotr Wiktor Sarna.
Table of Contents
Part 1
1 Introducing ScyllaDB
2 Touring ScyllaDB
Part 2
3 Data modeling in ScyllaDB
4 Data types in ScyllaDB
5 Tables in ScyllaDB
Part 3
6 Writing data to ScyllaDB
7 Reading data from ScyllaDB
Part 4
8 ScyllaDB’s architecture
9 Running ScyllaDB in production
10 Application development with ScyllaDB
11 Monitoring ScyllaDB
12 Moving data in bulk with ScyllaDB
Appendix Docker
Author |
: Simon Aubury |
Publisher |
: Packt Publishing Ltd |
Total Pages |
: 382 |
Release |
: 2024-06-24 |
ISBN-10 |
: 1803232536 |
ISBN-13 |
: 9781803232539 |
Rating |
: 4/5 (39 Downloads) |
Synopsis Getting Started with DuckDB by : Simon Aubury
Analyze and transform data efficiently with DuckDB, a versatile, modern, in-process SQL database
Key Features
• Use DuckDB to rapidly load, transform, and query data across a range of sources and formats
• Gain practical experience using SQL, Python, and R to effectively analyze data
• Learn how open source tools and cloud services in the broader data ecosystem complement DuckDB’s versatile capabilities
• Purchase of the print or Kindle book includes a free PDF eBook
Book Description
DuckDB is a fast in-process analytical database. Getting Started with DuckDB offers a practical overview of its usage. You'll learn to load, transform, and query various data formats, including CSV, JSON, and Parquet. The book covers DuckDB's optimizations, SQL enhancements, and extensions for specialized applications. Working with examples in SQL, Python, and R, you'll explore analyzing public datasets and discover tools enhancing DuckDB workflows. This guide suits both experienced and new data practitioners, quickly equipping you to apply DuckDB's capabilities in analytical projects. You'll gain proficiency in using DuckDB for diverse tasks, enabling effective integration into your data workflows.
What you will learn
• Understand the properties and applications of a columnar in-process database
• Use SQL to load, transform, and query a range of data formats
• Discover DuckDB's rich extensions and learn how to apply them
• Use nested data types to model semi-structured data and extract and model JSON data
• Integrate DuckDB into your Python and R analytical workflows
• Effectively leverage DuckDB's convenient SQL enhancements
• Explore the wider ecosystem and pathways for building DuckDB-powered data applications
Who this book is for
If you’re interested in expanding your analytical toolkit, this book is for you. It will be particularly valuable for data analysts wanting to rapidly explore and query complex data, data and software engineers looking for a lean and versatile data processing tool, along with data scientists needing a scalable data manipulation library that integrates seamlessly with Python and R. You will get the most from this book if you have some familiarity with SQL and foundational database concepts, as well as exposure to a programming language such as Python or R.
Author |
: Mark Pollack |
Publisher |
: "O'Reilly Media, Inc." |
Total Pages |
: 315 |
Release |
: 2012-10-24 |
ISBN-10 |
: 1449323952 |
ISBN-13 |
: 9781449323950 |
Rating |
: 4/5 (50 Downloads) |
Synopsis Spring Data by : Mark Pollack
You can choose several data access frameworks when building Java enterprise applications that work with relational databases. But what about big data? This hands-on introduction shows you how Spring Data makes it relatively easy to build applications across a wide range of new data access technologies such as NoSQL and Hadoop. Through several sample projects, you’ll learn how Spring Data provides a consistent programming model that retains NoSQL-specific features and capabilities, and helps you develop Hadoop applications across a wide range of use-cases such as data analysis, event stream processing, and workflow. You’ll also discover the features Spring Data adds to Spring’s existing JPA and JDBC support for writing RDBMS-based data access layers.
• Learn about Spring’s template helper classes to simplify the use of database-specific functionality
• Explore Spring Data’s repository abstraction and advanced query functionality
• Use Spring Data with Redis (key/value store), HBase (column-family), MongoDB (document database), and Neo4j (graph database)
• Discover the GemFire distributed data grid solution
• Export Spring Data JPA-managed entities to the Web as RESTful web services
• Simplify the development of HBase applications, using a lightweight object-mapping framework
• Build example big-data pipelines with Spring Batch and Spring Integration
Author |
: Johan Vos |
Publisher |
: Simon and Schuster |
Total Pages |
: 262 |
Release |
: 2022-02-08 |
ISBN-10 |
: 1617296325 |
ISBN-13 |
: 9781617296321 |
Rating |
: 4/5 (21 Downloads) |
Synopsis Quantum Computing in Action by : Johan Vos
Quantum computing is on the horizon, ready to impact everything from scientific research to encryption and security. But you don't need a physics degree to get started in quantum computing. Quantum Computing in Action shows you how to leverage your existing Java skills into writing your first quantum software so you're ready for the revolution. Rather than a hardware manual or academic theory guide, this book is focused on practical implementations of quantum computing algorithms. Using Strange, a Java-based quantum computer simulator, you'll go hands-on with quantum computing's core components including qubits and quantum gates as you write your very first quantum code. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
Author |
: Bill Chambers |
Publisher |
: "O'Reilly Media, Inc." |
Total Pages |
: 594 |
Release |
: 2018-02-08 |
ISBN-10 |
: 1491912294 |
ISBN-13 |
: 9781491912294 |
Rating |
: 4/5 (94 Downloads) |
Synopsis Spark: The Definitive Guide by : Bill Chambers
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library.
• Get a gentle overview of big data and Spark
• Learn about DataFrames, SQL, and Datasets (Spark’s core APIs) through worked examples
• Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames
• Understand how Spark runs on a cluster
• Debug, monitor, and tune Spark clusters and applications
• Learn the power of Structured Streaming, Spark’s stream-processing engine
• Learn how you can apply MLlib to a variety of problems, including classification or recommendation
Author |
: Holden Karau |
Publisher |
: "O'Reilly Media, Inc." |
Total Pages |
: 289 |
Release |
: 2015-01-28 |
ISBN-10 |
: 1449359051 |
ISBN-13 |
: 9781449359058 |
Rating |
: 4/5 (58 Downloads) |
Synopsis Learning Spark by : Holden Karau
Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates. Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
• Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
• Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
• Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
• Learn how to deploy interactive, batch, and streaming applications
• Connect to data sources including HDFS, Hive, JSON, and S3
• Master advanced topics like data partitioning and shared variables
Author |
: Rui Pedro Machado |
Publisher |
: "O'Reilly Media, Inc." |
Total Pages |
: 324 |
Release |
: 2023-12-08 |
ISBN-10 |
: 1098142349 |
ISBN-13 |
: 9781098142346 |
Rating |
: 4/5 (46 Downloads) |
Synopsis Analytics Engineering with SQL and dbt by : Rui Pedro Machado
With the shift from data warehouses to data lakes, data now lands in repositories before it's been transformed, enabling engineers to model raw data into clean, well-defined datasets. dbt (data build tool) helps you take data further. This practical book shows data analysts, data engineers, BI developers, and data scientists how to create a true self-service transformation platform through the use of dynamic SQL. Authors Rui Machado from Monstarlab and Hélder Russa from Jumia show you how to quickly deliver new data products by focusing more on value delivery and less on architectural and engineering aspects. If you know your business well and have the technical skills to model raw data into clean, well-defined datasets, you'll learn how to design and deliver data models without any technical influence.
With this book, you'll learn:
• What dbt is and how a dbt project is structured
• How dbt fits into the data engineering and analytics worlds
• How to collaborate on building data models
• The main tools and architectures for building useful, functional data models
• How to fit dbt into data warehousing and laking architecture
• How to build tests for data transformations
Author |
: Dr. Gernot Starke |
Publisher |
: Packt Publishing Ltd |
Total Pages |
: 236 |
Release |
: 2019-10-07 |
ISBN-10 |
: 1839219262 |
ISBN-13 |
: 9781839219269 |
Rating |
: 4/5 (69 Downloads) |
Synopsis arc42 by Example by : Dr. Gernot Starke
Document the architecture of your software easily with this highly practical, open-source template.
Key Features
• Get to grips with leveraging the features of arc42 to create insightful documents
• Learn the concepts of software architecture documentation through real-world examples
• Discover techniques to create compact, helpful, and easy-to-read documentation
Book Description
When developers document the architecture of their systems, they often invent their own specific ways of articulating structures, designs, concepts, and decisions. What they need is a template that enables simple and efficient software architecture documentation. arc42 by Example shows how it's done through several real-world examples. Each example in the book, whether it is a chess engine, a huge CRM system, or a cool web system, starts with a brief description of the problem domain and the quality requirements. Then, you'll discover the system context with all the external interfaces. You'll dive into an overview of the solution strategy to implement the building blocks and runtime scenarios. The later chapters also explain various cross-cutting concerns and how they affect other aspects of a program.
What you will learn
• Utilize arc42 to document a system's physical infrastructure
• Learn how to identify a system's scope and boundaries
• Break a system down into building blocks and illustrate the relationships between them
• Discover how to describe the runtime behavior of a system
• Know how to document design decisions and their reasons
• Explore the risks and technical debt of your system
Who this book is for
This book is for software developers and solutions architects who are looking for an easy, open-source tool to document their systems. It is a useful reference for those who are already using arc42. If you are new to arc42, this book is a great learning resource. Those of you who want to write better technical documentation will benefit from the general concepts covered in this book.
Author |
: William Ayd |
Publisher |
: Packt Publishing Ltd |
Total Pages |
: 405 |
Release |
: 2024-10-31 |
ISBN-10 |
: 1836205864 |
ISBN-13 |
: 9781836205869 |
Rating |
: 4/5 (69 Downloads) |
Synopsis Pandas Cookbook by : William Ayd
From fundamental techniques to advanced strategies for handling big data, visualization, and more, this book equips you with skills to excel in real-world data analysis projects.
Key Features
• This book targets features in pandas 2.x and beyond
• Practical, easy to implement recipes for quick solutions to common problems in data using pandas
• Master the fundamentals of pandas to quickly begin exploring any dataset
Book Description
The pandas library is massive, and it's common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands as one would do during an actual analysis. This book guides you, as if you were looking over the shoulder of an expert, through situations that you are highly likely to encounter. With this latest edition, unlock the full potential of pandas 2.x onwards. Whether you're a beginner or an experienced data analyst, this book offers a wealth of practical recipes to help you excel in your data analysis projects. This cookbook covers everything from fundamental data manipulation tasks to advanced techniques for handling big data, visualization, and more. Each recipe is designed to address common real-world challenges, providing clear explanations and step-by-step instructions to guide you through the process. Explore cutting-edge topics such as idiomatic pandas coding, efficient handling of large datasets, and advanced data visualization techniques. Whether you're looking to sharpen or expand your skills, the "Pandas Cookbook" is your essential companion for mastering data analysis and manipulation with pandas 2.x, and beyond.
What you will learn
• The pandas type system and how to best navigate it
• Import/export DataFrames to/from common data formats
• Data exploration in pandas through dozens of practice problems
• Grouping, aggregation, transformation, reshaping, and filtering data
• Merge data from different sources through pandas SQL-like operations
• Leverage the robust pandas time series functionality in advanced analyses
• Scale pandas operations to get the most out of your system
• The large ecosystem that pandas can coordinate with and supplement
Who this book is for
This book is for Python developers, data scientists, engineers, and analysts. pandas is the ideal tool for manipulating structured data with Python and this book provides ample instruction and examples. Not only does it cover the basics required to be proficient, but it goes into the details of idiomatic pandas