Apache Spark with Scala - Learn Spark from a Big Data Guru
Get Started with Apache Spark
Course Overview (4:13)
Text Lecture: How to Take this Course and How to Get Support
Introduction to Spark (2:26)
Java 9 Warning
Set up Spark project with IntelliJ IDEA (7:02)
Install Java and Git (4:20)
Run our first Spark job (2:57)
Troubleshooting: Running Hadoop on Windows
RDD
RDD Basics (2:45)
Create RDDs (2:32)
Text Lecture: Create RDDs
Map and Filter Transformation (8:43)
Solution to Airports by Latitude Problem (1:34)
FlatMap Transformation (4:52)
Set Operation (8:00)
Sampling With Replacement and Sampling Without Replacement
Solution for the Same Hosts Problem (1:36)
Actions (8:07)
Solution to Sum of Numbers Problem (1:46)
Important Aspects about RDD (1:36)
Summary of RDD Operations (2:26)
Caching and Persistence (5:14)
Spark Architecture and Components
Spark Architecture (3:00)
Spark Components (5:26)
Pair RDD
Introduction to Pair RDD (1:38)
Create Pair RDDs (3:44)
Filter and MapValues Transformations on Pair RDD (4:57)
Reduce By Key Aggregation (5:19)
Sample solution for the Average House problem (3:20)
Group By Key Transformation (4:50)
Sort By Key Transformation (2:37)
Sample Solution for the Sorted Word Count Problem (2:08)
Another Solution for the Sorted Word Count Problem
Data Partitioning (4:18)
Join Operations (5:01)
Extra Learning Material: How are Big Companies using Apache Spark
Advanced Spark Topics
Accumulators (3:50)
Solution to StackOverflow Survey Follow-up Problem (1:00)
Broadcast Variables (6:43)
Spark SQL
Introduction to Spark SQL (3:54)
Spark SQL in Action (13:27)
Spark SQL practice: House Price Problem (1:43)
Spark SQL Joins (6:33)
Strongly Typed Dataset (7:03)
Use Dataset or RDD (3:02)
Dataset and RDD Conversion (2:32)
Performance Tuning of Spark SQL (2:50)
Extra Learning Material: Avoid These Mistakes While Writing Apache Spark Programs
Running Spark in a Cluster
Introduction to Running Spark in a Cluster (4:14)
Package Spark Application and Use spark-submit (8:13)
Run Spark Application on Amazon EMR (Elastic MapReduce) cluster (13:37)
Additional Learning Materials
Text Lecture: Future Learning
Coupons to Our Other Courses
Use Dataset or RDD