Getting Started with Kudu

Book by Jean-Marc Spaggiari

Get up to speed with Apache Kudu, the column-oriented data store for Hadoop that not only simplifies the architecture of several existing use cases, but also enables use cases that were not possible before. With this practical guide, enterprise architects working on big data implementations will learn how Kudu's architecture and features solve a unique problem in the Hadoop ecosystem. For example, Kudu makes Hadoop viable for real-time IoT use cases, and it makes a transition from a massively parallel processing (MPP) SQL database engine plausible. If you're familiar with other storage layer projects such as HDFS, HBase, Spanner, and Cassandra, you'll quickly learn, and appreciate, the unique contribution Kudu makes to this ecosystem.

- Explore how Kudu is compatible with data processing frameworks in the Hadoop environment
- Understand Kudu's architecture, internals, installation, and deployment
- Learn how to fully administer a Kudu cluster
- Become acquainted with low-level client APIs, how to integrate with SQL engines like Impala, and frameworks for integration
- Learn about table and schema design
- Get use cases, examples, best practices, and sample code