Price: ¥364.00 (shipping fee to be determined)

Advanced Analytics with Spark: Patterns for Learning from Data at Scale (English) Paperback – April 14, 2015

Average rating: 4.5 out of 5 stars (22 reviews on Amazon US)
5 stars: 16
4 stars: 4
3 stars: 1
2 stars: 1
1 star: 0


A newer edition of this item is available:

Advanced Analytics with Spark: Patterns for Learning from Data at Scale
¥302.80
Pre-order item: release date not yet available


Product details

  • Publisher: O'Reilly Media, Inc, USA (April 14, 2015)
  • Paperback: 276 pages
  • Language: English
  • ISBN: 1491912766
  • Barcode: 9781491912768
  • Product dimensions: 17.8 x 1.5 x 23.3 cm
  • ASIN: 1491912766

Customer reviews

There are no customer reviews on Amazon China yet.

Most helpful customer reviews of this item on Amazon US (beta) (may include reviews from the "Early Reviewer Rewards Program")

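Amazon US: average 4.5 out of 5 stars, 22 reviews
28 of 29 people found this review helpful
5.0 out of 5 stars Great introduction to real world data science at scale April 25, 2015
Reviewer: Ram - posted on Amazon US
Format: Paperback, Verified Purchase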
This book fills an important gap in large scale data science.
Spark has emerged as the big data platform of choice for data scientists, both for its ease of use and for its performance / optimization. In a few lines of Scala code, Spark allows you to write iterative algorithms that scale out very well. For a data scientist who wants to explore large scale data sets, Spark is a great starting point (this is incredible progress in the Spark community given the project is just about 4 years old). However, Spark itself is moving fast and maturing with time, and Spark, Scala, and distributed algorithms are typically not in the arsenal of many data scientists today.
What this book does is teach you how to think about data science problems at scale, in the context of Spark. Through well-chosen examples covering both supervised and unsupervised learning, the authors take you step by step from a practical problem definition (say, how to recommend music given a user's listening history) to which features are relevant, which machine learning algorithm to use, how to tune parameters to optimize the solution, and how you can use Spark to do all of this in an interactive / iterative manner. As a bonus, they also point you to well-engineered data sets that you can use to follow along with the discussion and learn by trying out the examples yourself.
By embracing the feature engineering, data cleaning / error handling, and tuning / feedback steps, the authors manage to show how real world data science works, how you can do full stack data science using Spark, and how much you gain from the interactive nature of the Spark REPL.
Overall, I highly recommend this book, and though it is the first book on Data Science using Spark, it sets a high standard for subsequent efforts.
13 of 13 people found this review helpful
5.0 out of 5 stars If you are looking for an intro to data science, data analysis and machine learning at scale - this is the right book August 3, 2015
Reviewer: Adam Lieskovsky - posted on Amazon US
Format: Paperback, Verified Purchase
TL;DR If you are looking for an intro to data science, data analysis and machine learning at scale - this is the right book. Sure, there are other, maybe more popular, books from O'Reilly covering these topics, but their authors use R and Python, and those books are not focused on performance and scalability. For more detail on Spark itself, you can also take a look at the introductory Spark book, Learning Spark.

This book presents 9 case studies of data analysis applications in various domains. The topics are diverse and the authors always use real world datasets. Besides learning Spark and data science, you will also have the opportunity to gain insight into topics like taxi traffic in NYC, deforestation or neuroscience. Readers without any previous exposure to machine learning might struggle to understand certain chapters, so I think it's a good idea to actually try the examples yourself while reading and to Google for further details about the methods used. Many of the chapters end with only basic models, which barely outperform the baselines, so if you want to, there is a lot of room for improvement and further work.

Spark itself provides its users with APIs in three languages - Java, Scala and Python. This book successfully covers each one of these, although you can feel a slight preference for Scala throughout the book. For Scala starters, the authors always explain the special constructs or syntax features they use, which is a nice touch. The Introduction and Appendix chapters provide basic information about the Spark core, RDDs (resilient distributed datasets) and the options for running Spark, whether on a cluster (Mesos, YARN, Spark's own) or in standalone settings. Throughout the book you can find some really worthwhile tips about Spark and data analysis, such as using a serializer other than Java's default (they recommend Kryo), and an overview of data cleansing and the whole machine learning pipeline. To sum up, I recommend this book to every data scientist, because it demonstrates advanced topics like workload distribution and scaling through enjoyable examples.
23 of 25 people found this review helpful
4.0 out of 5 stars Advanced and Scala heavy June 17, 2015
Reviewer: Brian Castelli - posted on Amazon US
Format: Kindle Edition, Verified Purchase
This is a solid book, with practical case study examples that one can follow. It really is an "advanced" book. One can learn quite a bit from this volume, but if you're a beginner you should start with something else. For beginners, I recommend Learning Spark (http://www.amazon.com/gp/product/B00SW0TY8O). I was disappointed that the authors of this advanced volume focused almost exclusively on Scala. This focus leads to unnecessary complexity in at least a few places. I would have liked to see more examples using PySpark, Spark's library for Python.
3 of 3 people found this review helpful
4.0 out of 5 stars This book is a good overview of potential uses of Spark March 13, 2016
Reviewer: origin415 - posted on Amazon US
Format: Paperback, Verified Purchase
This book is a good overview of potential uses of Spark, introducing different features through a sequence of vignettes. That said, it does not go in-depth into any particular aspect of Spark.

The vignettes introduce a variety of topics that Spark can tackle (recommendations, graph analysis, Monte Carlo methods), each by analyzing a publicly available dataset. The analysis conducted is explained well and is very useful as an introduction to the techniques used. As an overview of the capabilities of Spark, this method excels. In addition, all code is available on the authors' GitHub, though there are some discrepancies between the code in the repo and the book (beyond what's expected when comparing a static book and a changing git repo). This is useful for following along and replicating the analysis, as well as for altering the techniques to explore the data further.

On the other hand, this is not an in-depth introduction to Spark as a whole. There is an appendix introducing some Spark basics, but you'll get much further with Spark's own documentation, or the other O'Reilly book, Learning Spark. Without this aspect, it becomes harder to generalize these analyses for your own purposes.
5.0 out of 5 stars Five Stars January 28, 2017
Reviewer: Daniel C Deng - posted on Amazon US
Format: Paperback, Verified Purchase
Very advanced indeed