Pig For Wrangling Big Data

By: Udemy
Mode: Online
Fees: ₹ 1,799

Quick Facts

Particulars              Details
Medium of instruction    English
Mode of learning         Self study
Mode of delivery         Video and text based

Course and certificate fees

Fees information: ₹ 1,799
Certificate availability: Yes
Certificate providing authority: Udemy

The syllabus

You, This Course and Us

  • You, This Course and Us

Where does Pig fit in?

  • Pig and the Hadoop ecosystem
  • Install and set up
  • How does Pig compare with Hive?
  • Pig Latin as a data flow language
  • Pig with HBase

Pig Basics

  • Operating modes, running a Pig script, the Grunt shell
  • Loading data and creating our first relation
  • Scalar data types
  • Complex data types - The Tuple, Bag and Map
  • Partial schema specification for relations
  • Displaying and storing relations - The dump and store commands

Pig Operations And Data Transformations

  • Selecting fields from a relation
  • Built-in functions
  • Evaluation functions
  • Using the distinct, limit and order by keywords
  • Filtering records based on a predicate

Advanced Data Transformations

  • Group by and aggregate transformations
  • Combining datasets using Join
  • Concatenating datasets using Union
  • Generating multiple records by flattening complex fields
  • Using Co-Group, Semi-Join and Sampling records
  • The nested Foreach command
  • Debug Pig scripts using Explain and Illustrate

Optimizing Data Transformations

  • Parallelize operations using the Parallel keyword
  • Join Optimizations: Multiple relations join, large and small relation join
  • Join Optimizations: Skew join and sort-merge join
  • Common sense optimizations

A real-world example

  • Parsing server logs
  • Summarizing error logs

Installing Hadoop in a Local Environment

  • Hadoop Install Modes
  • Hadoop Standalone mode Install
  • Hadoop Pseudo-Distributed mode Install

Appendix

  • [For Linux/Mac OS Shell Newbies] Path and other Environment Variables
  • Set up a Virtual Linux Instance (For Windows users)

Instructors

Ms Janani Ravi
Instructor
Udemy

Mr Vitthal Srinivasan
Instructor
Udemy
