Agile Data Science 2.0: Building Full-Stack Data Analytics Applications with Spark

Book by Russell Jurney
Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they're to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools.

Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, ElasticSearch, d3.js, scikit-learn, and Apache Airflow. You'll learn an iterative approach that lets you quickly change the kind of analysis you're doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization.

- Build value from your data in a series of agile sprints, using the data-value pyramid
- Extract features for statistical models from a single dataset
- Visualize data with charts, and expose different aspects through interactive reports
- Use historical data to predict the future via classification and regression
- Translate predictions into actions
- Get feedback from users after each sprint to keep your project on track