Introduction
  • Getting Started
  • Overview of Big Data sandboxes or virtual machine images
  • Pre-requisites
  • Choosing Virtualization Software (very important)
  • Installing VMware Fusion on Mac
  • Installing Oracle VirtualBox on Mac
Cloudera Quickstart VM on VMware Fusion
  • Setup Cloudera Quickstart VM - VMware image
  • Review retail_db and gen_logs in Cloudera Quickstart VM
Cloudera Quickstart VM on VirtualBox
  • Download Cloudera Quickstart VM for VirtualBox
  • Setup Cloudera Quickstart VM for VirtualBox
  • Review retail_db and gen_logs in Cloudera Quickstart VM
Hortonworks Sandbox on VMware Fusion
  • Setup Hortonworks Sandbox on VMware - Mac
  • Setup MySQL Database - retail_db
  • Setup gen_logs application to generate logs
Hortonworks Sandbox on VirtualBox
  • Setup Hortonworks Sandbox on VirtualBox
  • Reset admin password
  • Setup MySQL Database - retail_db
  • Setup gen_logs application to generate logs
Setup Eclipse IDE for MapReduce
  • Setup Eclipse with Maven Plugin - Introduction
  • Setup Eclipse with Maven Plugin
  • Create Java application using Maven project
  • Develop word count program - Introduction
  • Develop word count program
  • Run word count program
  • Setup GitHub project - Introduction
  • Download and setup GitHub project
  • Validate GitHub project
Setup Eclipse IDE for Scala and Spark
  • Setup Scala and sbt - Introduction
  • Setup and Validate Scala
  • Run simple Scala application
  • Setup sbt and run Scala application
  • Setup Scala IDE for Eclipse - Introduction
  • Install Scala IDE for Eclipse
  • Integrate sbt with Scala IDE for Eclipse
  • Develop Spark applications using Scala IDE - Introduction
  • Develop Spark applications using Scala IDE and sbt
  • Run Spark applications on cluster
Setup Development Environment for Scala and Spark using IntelliJ
  • Introduction
  • Setup Java and JDK
  • Install Scala with IntelliJ IDE
  • Develop Hello World Program using Scala
  • Setup sbt and run application HelloWorld
  • Add Spark dependencies to the application
  • Setting up winutils.exe on Windows (64-bit)
  • Setup Data Sets - retail_db
  • Develop first Spark application - Get revenue for each order from order_items
  • Build JAR file using sbt
  • Download and install Spark using 7z on Windows
  • Configure environment variables for Spark on Windows
  • Running Spark job using spark-shell
  • Validating Spark job from JAR file using spark-submit