Spark Module 3 Machine Learning SparkML

Machine Learning for Big Data

The course will take on average 3 days to complete, including practical work

  • Learn the basics of Machine Learning and how to apply it to big data with SparkML
  • Supervised vs Unsupervised Learning
  • Linear Regressions
  • Logistic Regressions
  • Decision Trees
  • K-Means Clusters
  • Random Forests
  • Recommender Systems
We assume you're already familiar with Spark Core from modules 1 and 2.

Contents

Introduction 24m 2s

What is Machine Learning, Supervised vs Unsupervised Learning and the Model Building Process

Building a Linear Regression 30m 40s

Assembling vectors of features and Model Fitting

Training Data 26m 33s

Training vs Test and Holdout Data, Using data from Kaggle, RMSE and R2 tests
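
As a rough illustration of the two metrics named above, here is a minimal plain-Python sketch of RMSE and R2 computed on a handful of made-up test-set values. It does not use SparkML; the function names and the sample numbers are assumptions chosen only for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: the typical size of a prediction error."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(actual, predicted):
    """R2: share of the variance in `actual` explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical held-out (test) labels and the model's predictions for them.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

print(rmse(actual, predicted))
print(r2(actual, predicted))
```

A lower RMSE and an R2 closer to 1 both indicate a better fit. Measuring them on data the model was not trained on is what makes the comparison meaningful.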

Model Fitting Parameters 25m 41s

Setting Linear Regression Parameters

Feature Selection 36m 23s

Correlation of features, Identifying duplicate features, data preparation
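
One common way to spot duplicate features is to look for pairs whose correlation is close to 1 or -1. The sketch below computes a Pearson correlation in plain Python, assuming two hypothetical columns that record the same height in different units. It is not the SparkML API, only an illustration of the idea.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical features: the same measurement in centimetres and in inches.
height_cm = [150.0, 160.0, 170.0, 180.0]
height_in = [h / 2.54 for h in height_cm]

# A value very close to 1 suggests one of the two columns is redundant.
print(pearson(height_cm, height_in))
```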

Non Numeric Data 25m 48s

Using OneHotEncoding and Vectors
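
To show what one-hot encoding does to a non-numeric column, here is a small plain-Python sketch. The `one_hot` helper and the colour values are hypothetical, and Spark's own encoder works on indexed columns and sparse vectors, which this sketch ignores.

```python
def one_hot(values):
    """Map each category to a 0/1 vector with a single 1 in its own slot."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in values:
        vector = [0] * len(categories)
        vector[index[value]] = 1
        vectors.append(vector)
    return categories, vectors


# Hypothetical categorical column.
categories, vectors = one_hot(["red", "green", "red", "blue"])

print(categories)
for vector in vectors:
    print(vector)
```

Each category gets its own position, so the model never treats "red" as numerically larger or smaller than "blue".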

Pipelines 19m 42s

How to build a pipeline in SparkML

Case Study 34m 51s

A full practical exercise

Logistic Regression 26m 12s

True and False Negatives and Positives, Coding a Logistic Regression Model
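
The four outcome types can be counted by comparing each predicted label with the actual one. The following is a minimal plain-Python sketch with made-up 0/1 labels. The `confusion_counts` helper is hypothetical and is not part of SparkML.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["tp"] += 1  # predicted positive, really positive
        elif p == 1 and a == 0:
            counts["fp"] += 1  # predicted positive, really negative
        elif p == 0 and a == 0:
            counts["tn"] += 1  # predicted negative, really negative
        else:
            counts["fn"] += 1  # predicted negative, really positive
    return counts


# Hypothetical actual labels and classifier predictions.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]

counts = confusion_counts(actual, predicted)
print(counts)
```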

Decision Trees 46m 21s

Building a decision tree model, Interpreting a tree and Random Forests

Unsupervised Learning: K-Means Clustering 10m 49s

K-Means Clustering and how to implement it in SparkML

Recommender Systems 29m 7s

Matrix Factorisation and how to build a model in SparkML

Copyright ©2024 VirtualPairProgrammers.com