Apache Sqoop - Introduction and Setup Environment
  • Introduction
  • Setup Options
  • Setup Cloudera QuickStart VM
  • Setup Hortonworks Sandbox
  • Data Sets and Big Data labs for practicing Sqoop - from ITVersity
  • Using Windows - PuTTY
  • Using Windows - Cygwin
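
  Whichever setup option you pick, the hands-on work is done from a terminal session on the cluster gateway or VM. A minimal sketch, with a placeholder host and user (substitute your own lab or VM details):

    # Connect to the lab gateway / VM over SSH (host and user are placeholders)
    ssh your_user@gateway.example.com

    # On Windows, PuTTY opens the same SSH session through a GUI,
    # while Cygwin provides a Unix-like shell where the ssh command above works as-is.
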
Introduction to Sqoop
  • Introduction to Sqoop
  • Validate Source Database - MySQL
  • Review JDBC Jar file to connect to MySQL
  • Getting Help for Sqoop using the Command Line
  • Overview of Sqoop User Guide
  • Validate Sqoop and MySQL integration using "sqoop list-databases"
  • List tables in MySQL using "sqoop list-tables"
  • Run Queries in MySQL using "sqoop eval"
  • Understanding Logs in Sqoop
  • Redirecting Sqoop Logs into files
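
  A minimal sketch of the validation commands covered in this section, assuming a MySQL host, the retail_db source database, and credentials that are only placeholders here:

    # Placeholder connection details -- adjust host, port, database, user and password
    CONN="jdbc:mysql://mysql.example.com:3306/retail_db"
    AUTH="--username retail_user --password retail_password"

    # Validate Sqoop and MySQL integration
    sqoop list-databases --connect "jdbc:mysql://mysql.example.com:3306" $AUTH
    sqoop list-tables --connect "$CONN" $AUTH

    # Run an ad hoc query against MySQL
    sqoop eval --connect "$CONN" $AUTH --query "SELECT count(1) FROM orders"

    # Sqoop logs go to stderr; redirect them into files for later review
    sqoop list-tables --connect "$CONN" $AUTH 1>sqoop.out 2>sqoop.err

  $AUTH is deliberately left unquoted so it expands into the separate --username and --password arguments.
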
Apache Sqoop - Importing Data into HDFS
  • Overview of Sqoop Import Command
  • Perform Sqoop Import of orders - --table and --target-dir
  • Perform Sqoop import of order_items - --warehouse-dir
  • Sqoop Import - Managing HDFS Directories - append or overwrite or fail
  • Sqoop Import - Execution Flow
  • Reviewing logs of Sqoop Import
  • Sqoop Import - Specifying Number of Mappers
  • Review the Output Files
  • Sqoop Import - Supported File Formats
  • Validating Avro Files using avro-tools
  • Sqoop Import - Using Compression
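
  A sketch of the import variations above, assuming the same placeholder connection and an HDFS home directory under /user/training (both assumptions):

    CONN="jdbc:mysql://mysql.example.com:3306/retail_db"    # placeholder
    AUTH="--username retail_user --password retail_password"

    # Import orders into an explicit target directory
    sqoop import --connect "$CONN" $AUTH \
      --table orders \
      --target-dir /user/training/sqoop_import/orders

    # Import order_items under a warehouse directory (table name becomes a subdirectory)
    sqoop import --connect "$CONN" $AUTH \
      --table order_items \
      --warehouse-dir /user/training/sqoop_import/retail_db

    # Overwrite the target directory, use 2 mappers, write compressed Avro
    sqoop import --connect "$CONN" $AUTH \
      --table orders \
      --target-dir /user/training/sqoop_import/orders \
      --delete-target-dir \
      --num-mappers 2 \
      --as-avrodatafile \
      --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec

    # Inspect the Avro output (part file name will vary; on some setups
    # avro-tools must be invoked as a jar)
    hdfs dfs -get /user/training/sqoop_import/orders/part-m-00000.avro .
    avro-tools tojson part-m-00000.avro | head
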
Apache Sqoop - Importing Data into HDFS - Customizing
  • Sqoop Import - Customizing - Introduction
  • Sqoop Import - Specifying Columns
  • Sqoop Import - Using boundary query
  • Sqoop Import - Filter unnecessary data
  • Sqoop Import - Using Split By
  • Sqoop Import - Importing Query Results
  • Sqoop Import - Dealing with Composite Keys
  • Sqoop Import - Dealing with Primary Key or Split By using a Non-Numeric Field
  • Sqoop Import - Dealing with Tables without a Primary Key
  • Sqoop Import - Autoreset to One Mapper
  • Sqoop Import - Default Delimiters using Text File Format
  • Sqoop Import - Specifying Delimiters - Import NYSE Data with \t as delimiter
  • Sqoop Import - Dealing with NULL Values
  • Sqoop Import - import-all-tables
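
  The customizations listed above, sketched against the same placeholder connection; order_items_nopk is an assumed table name used only to illustrate the no-primary-key case:

    CONN="jdbc:mysql://mysql.example.com:3306/retail_db"    # placeholder
    AUTH="--username retail_user --password retail_password"

    # Project selected columns and filter rows at the source
    sqoop import --connect "$CONN" $AUTH \
      --table orders \
      --columns order_id,order_date,order_status \
      --where "order_status = 'COMPLETE'" \
      --target-dir /user/training/sqoop_import/orders_complete \
      --delete-target-dir

    # Free-form query import: the literal $CONDITIONS token and --split-by are mandatory
    sqoop import --connect "$CONN" $AUTH \
      --query 'SELECT o.* FROM orders o WHERE $CONDITIONS' \
      --split-by order_id \
      --target-dir /user/training/sqoop_import/orders_query \
      --delete-target-dir

    # Table without a primary key: supply --split-by or fall back to a single mapper
    sqoop import --connect "$CONN" $AUTH \
      --table order_items_nopk \
      --autoreset-to-one-mapper \
      --warehouse-dir /user/training/sqoop_import/retail_db

    # Custom field delimiter and explicit NULL representations in the text output
    sqoop import --connect "$CONN" $AUTH \
      --table orders \
      --fields-terminated-by '\t' \
      --null-string '\\N' --null-non-string '-1' \
      --target-dir /user/training/sqoop_import/orders_tab \
      --delete-target-dir
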
Apache Sqoop - Importing Data into Hive Tables
  • Quick Overview of Hive
  • Sqoop Import - Create Hive Database
  • Creating empty Hive Table using create-hive-table
  • Sqoop Import - Import orders table to Hive Database
  • Sqoop Import - Managing Table using Hive Import - Overwrite
  • Sqoop Import - Managing Table using Hive Import - Error out - create-hive-table
  • Sqoop Import - Understanding Execution Flow while importing into Hive Table
  • Sqoop Import - Review files in Hive Tables
  • Sqoop Delimiters vs. Hive Delimiters - Text Files
  • Sqoop Import - Hive File Formats
  • Sqoop Import all tables - Hive
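
  A sketch of the Hive-facing imports, assuming a Hive database named training_sqoop has already been created (the name is only an example):

    CONN="jdbc:mysql://mysql.example.com:3306/retail_db"    # placeholder
    AUTH="--username retail_user --password retail_password"

    # Create an empty Hive table whose layout mirrors the MySQL table
    sqoop create-hive-table --connect "$CONN" $AUTH \
      --table orders \
      --hive-table training_sqoop.orders

    # Import orders directly into a Hive table, replacing any existing data
    sqoop import --connect "$CONN" $AUTH \
      --table orders \
      --hive-import \
      --hive-database training_sqoop \
      --hive-table orders \
      --hive-overwrite \
      --num-mappers 2

    # Import every table from the source database into the Hive database
    sqoop import-all-tables --connect "$CONN" $AUTH \
      --hive-import \
      --hive-database training_sqoop \
      --autoreset-to-one-mapper
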
Apache Sqoop - Exporting Data from HDFS to RDBMS
  • Introduction
  • Prepare data for Export
  • Creating Table in MySQL
  • Sqoop Export - Perform Simple Export - --table and --export-dir
  • Sqoop Export - Execution Flow
  • Sqoop Export - Specifying Number of Mappers
  • Sqoop Export - Troubleshooting Issues
  • Sqoop Export - Merging or Upserting Overview
  • Sqoop Export - Quick Overview of MySQL for Upsert
  • Sqoop Export - Using update-mode - updateonly (default)
  • Sqoop Export - Using update-mode - allowinsert
  • Sqoop Export - Specifying Columns
  • Sqoop Export - Specifying Delimiters
  • Sqoop Export - Using Stage Table
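
  A sketch of the export path, assuming a target database retail_export (used again in the next section) and a hypothetical daily_revenue table that already exists in MySQL:

    CONN="jdbc:mysql://mysql.example.com:3306/retail_export"    # placeholder
    AUTH="--username retail_user --password retail_password"

    # Simple export: HDFS directory into an existing MySQL table
    sqoop export --connect "$CONN" $AUTH \
      --table daily_revenue \
      --export-dir /user/training/sqoop_export/daily_revenue \
      --num-mappers 2

    # Upsert: update rows that match the key, insert the rest
    sqoop export --connect "$CONN" $AUTH \
      --table daily_revenue \
      --export-dir /user/training/sqoop_export/daily_revenue \
      --update-key order_date \
      --update-mode allowinsert

    # Selected columns, non-default input delimiter, and a staging table
    sqoop export --connect "$CONN" $AUTH \
      --table daily_revenue \
      --columns order_date,revenue \
      --export-dir /user/training/sqoop_export/daily_revenue \
      --input-fields-terminated-by '\t' \
      --staging-table daily_revenue_stage \
      --clear-staging-table
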
Apache Sqoop - Incremental Imports and Jobs
  • Overview of Sqoop Jobs
  • Adding Password File
  • Creating Sqoop Job
  • Running Sqoop Job
  • Overview of Incremental Imports
  • Incremental Import - Using where
  • Incremental Import - Append Mode
  • Incremental Import - Create training_orders_incr in retail_export
  • Incremental Import - Create Job
  • Incremental Import - Execute Job
  • Incremental Import - Add additional data (order_id > 30000)
  • Incremental Import - Rerun the job and validate results
  • Incremental Import - Using mode lastmodified
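
  A sketch of a password file plus an incremental append job; paths, job name, and credentials are placeholders:

    # Keep the database password in a restricted HDFS file instead of on the command line
    echo -n "retail_password" > sqoop.password
    hdfs dfs -put sqoop.password /user/training/sqoop.password
    hdfs dfs -chmod 400 /user/training/sqoop.password

    # Create a job that appends only rows whose order_id exceeds the stored last value
    sqoop job --create orders_incr \
      -- import \
      --connect "jdbc:mysql://mysql.example.com:3306/retail_db" \
      --username retail_user \
      --password-file /user/training/sqoop.password \
      --table orders \
      --target-dir /user/training/sqoop_import/orders_incr \
      --incremental append \
      --check-column order_id \
      --last-value 0

    # Execute and inspect the job; Sqoop records the new last value after each run
    sqoop job --exec orders_incr
    sqoop job --show orders_incr

    # For lastmodified mode, track a timestamp column and merge on a key instead:
    #   --incremental lastmodified --check-column order_date --merge-key order_id
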