The quickest and easiest way to learn Hadoop
Contains over 13 hours of video, equivalent to 4 days of live training.
Important note for Windows users: Hadoop is difficult to install on Windows, so in the course we show you how to set up a virtual machine running Linux. No prior knowledge of Linux is needed.
Having problems? Check the errata.
| Chapter | Duration | Description |
|---|---|---|
| Welcome | 10m 49s | A brief overview chapter, with a preview of the work we're going to be doing. |
| Introducing Hadoop | 16m 12s | An overview of what Hadoop is and an introduction to the concept of map-reduce. |
| The map-reduce programming model | 20m 45s | A deeper look at the map-reduce programming model. |
| Operating modes & installation environment | 25m 10s | Understanding the operating modes of Hadoop and getting ready to install (including setting up a virtual machine if needed). |
| Installing Hadoop | 40m 0s | Installing Hadoop and configuring it for both standalone and pseudo-distributed modes. |
| Writing our first map-reduce job | 52m 36s | Using a generic map-reduce template to create a real Hadoop job. |
| HDFS | 24m 49s | Understanding the Hadoop file system and how to move files into and out of it from the command line. |
| Running in Pseudo-Distributed Mode | 11m 26s | Running larger jobs in pseudo-distributed mode. Viewing the Hadoop Web User Interface. |
| Map-reduce process flow 1 | 40m 36s | A look at the steps in a map-reduce job in more detail. Learning about the shuffle process and adding a combiner class. |
| Map-reduce process flow 2 | 14m 38s | An exercise to practice with the full map-reduce workflow. |
| Enhancing Map and Reduce | 23m 41s | An overview of the built-in map and reduce functions, and learning to create custom key and value data types. |
| Job Configuration | 25m 11s | Understanding Hadoop file formats, and using the tool runner template to set command line parameters. |
| Case Study 1 - Part 1 | 53m 8s | An explanation of the first major case study, using real-world data, together with a walkthrough of the first 2 tasks. |
| Case Study 1 - Part 2 | 9m 16s | Walkthrough of task 3 in our case study. |
| Case Study 1 - Part 3 | 9m 13s | Walkthrough of task 4 in our case study. |
| Chaining Multiple Map-Reduce Jobs | 27m 27s | Learning to automate the chaining of jobs with the JobControl object. Using the sequence file format. |
| Pre and Post Processing | 47m 39s | Using the ChainMapper and ChainReducer objects to add additional Map steps. |
| Optimising Map-Reduce jobs | 29m 46s | Looking at multiple ways to improve the efficiency of map-reduce jobs. |
| Log Files & Counters | 36m 28s | Learning to use log files and counters as tools to debug map-reduce code. |
| Working with relational databases | 56m 11s | Reading from and writing to relational databases using JDBC. |
| Unit testing | 40m 56s | Using JUnit with the MRUnit library to test map-reduce code. |
| Secondary Sorting | 36m 11s | Understanding how to sort the values before the reduce phase. |
| Joining data | 51m 56s | Joining 2 data sets together with a reduce-side join. |
| Using Amazon Elastic Map Reduce | 40m 38s | Using the Amazon EMR cloud-based Hadoop platform to run map-reduce jobs. |
| Case Study 2 | 42m 45s | Our second major case study, based on a real-world use of Hadoop. |
| Course Summary | 14m 47s | A review of what we've learned, and ideas of where to go next. |