

Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you'll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters. Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You'll learn about recent changes to Hadoop, and explore new case studies on Hadoop's role in healthcare systems and genomics data processing.

- Learn fundamental components such as MapReduce, HDFS, and YARN
- Explore MapReduce in depth, including steps for developing applications with it
- Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
- Learn two data formats: Avro for data serialization and Parquet for nested data
- Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
- Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
- Learn the HBase distributed database and the ZooKeeper distributed configuration service

| Best Sellers Rank | #419,293 in Books; #20 in Parallel Computer Programming; #104 in Data Mining (Books); #374 in Software Development (Books) |
| Customer Reviews | 4.5 out of 5 stars (295 reviews) |
S**N
Great book for new API
I have around 14 years of Java experience, and this was my first book on Hadoop; until now I had been reading from the internet. I agree that at times it's hard to understand things in one shot, but once you reread them, they become clear. The best thing about this book is that it covers everything in the new API. I believe Tom White worked really hard to make the overall picture easy to understand. One suggestion: keep a computer and an IDE open while reading, otherwise you will get lost in the theory (which is great nonetheless), and when you try to code later, things will go over your head. Another thing is that there are various things you'll understand only after reaching the 4th or 5th chapter. If you are learning Hadoop from scratch and have read only the first few chapters, you may find it difficult, but once you reach the 4th or 5th chapter, I'm sure you will find this book amazing.
A**N
Nice update for an already great book
Nice update for an already great book. As other reviews have already mentioned, if you're new to Hadoop, you'll probably need to read a few chapters to really get a good picture of what Hadoop is and how it works. This is not necessarily a problem with the book, just the nature of the beast when learning something as complex as Hadoop. If you're new to Hadoop, take the time to work through the first few chapters, and expect to come back and reread parts later. If you're an experienced engineer looking to get up to speed with what has changed in recent versions, you'll definitely find what you're looking for here. I appreciate that this book covers high-level concepts and also dives deep into the technical details you will need for the design, implementation, and day-to-day running of Hadoop and its various associated technologies.
G**K
Excellent overview
A good first book in the ecosystem, reasonably organized, not difficult to read. Recommended.
S**N
Very good
A very full treatment of Hadoop, but NOT a simple read, so don't buy this if you want a beginner's guide. I have enough test hardware and expertise to set up a couple of virtual stacks, and I used this book to then set up a Hadoop cluster and various web interfaces. Within this limit, a very good book.
R**K
Large Data Computing Challenge in one Comprehensive Large Gulp
A very comprehensive and well-developed book that is a great introduction to large data management. If you want to enter this field of computing, this is the reference to get. It covers all the important bases in only 700 pages. There is a catch, though: the cost of the hardware is prohibitive and can easily exceed $70,000. So companies that run large distributed systems on Hadoop are generally closed shops, and Catch-22 applies: you will have to build or borrow a small Hadoop system and learn your way around it to demonstrate your expertise and get hired by a company that uses it. However, this book can ease your way into Hadoop computing, and I highly recommend it.
I**Z
Great book, highly recommended
This book is so good that it forced me to write my first book review on Amazon :). Everything is clearly described and thoroughly explained, both for Hadoop and for the related Apache projects. Read (and reread) at least the whole of section I thoroughly; it is very important for understanding the rest of the book.
J**O
I did like the book
I'm a Hadoop newbie and purchased this book to get an inside view. I did like the book. Since my objective was to check out Hadoop and Spark, I skipped some chapters, but I consider it a good book.
M**M
Can be difficult to read in some areas
I've been a Java developer for about 20 years, and in code development for 25+ years. Don't get me wrong with the 3 stars: this is a good general book on the subject, and I was considering giving it more stars. Just be warned that the book expects you to already have a solid understanding of certain topics. Hadoop seems to be very parameter-driven, and this book drives through those parameters. For me, it was hard to take in all the information without multiple rereads.
M**S
Excellent coverage of the Hadoop ecosystem.
This book is a good reference for the Hadoop ecosystem, with a lot of code examples. It goes into the details of the MapReduce paradigm and practical topics such as how to set up a cluster. It also has dedicated chapters for the key tools that make up the Hadoop ecosystem, like Sqoop, Flume, Pig, Hive, and HBase. Many of the code examples use Java (since Hadoop is built in Java). I'm not familiar with Java, but I could get the gist of the explanations just fine. The code is very well explained. It's a great read!
A**.
Everything perfect
It arrived in perfect condition. An essential book for getting to know Hadoop in depth, and Amazon is the only website where it can be found in Spain.
C**N
A must-read book for beginners
Excellent reference for those getting started with Big Data and Hadoop. For technical people interested in knowing how to get things done.
C**N
Excellent book (must have)
Excellent book with lots of detail on configuration, including configuration examples according to cluster size (which are difficult to find in books). I would recommend it to anyone from beginner to advanced, as the most-used tools are explained in depth.
A**R
Comprehensive with strong technical insights
Working in the big data domain, especially in the open source area, requires the right technical knowledge to analyze and solve problems as they arise. I find this book quite thorough in these respects. I am not finished with it yet, but I can strongly recommend it to anyone interested in digging deep into the Hadoop ecosystem.