info@npntraining.com   +91 8095918383 / +91 9535584691

Data Science Training in Bangalore using Python + R

Learn and master data analytics using the Python and R programming languages. In this program you will learn Statistics, Data Analysis Libraries, Visualization Libraries, Machine Learning, and Deep Learning using TensorFlow. The program is highly recommended for professionals who intend to build a career as successful Data Analysts.

Course Description

The Big Data Masters Program is designed to empower working professionals to develop relevant competencies and accelerate their career progression in Big Data technologies through complete Hands-on training.

Being a Big Data Architect requires you to master multiple technologies, and this program will ensure you become an industry-ready Big Data Architect who can provide solutions to Big Data projects.

At NPN Training we believe in the philosophy “Learn by doing”, hence we provide complete hands-on training with real-time project development.

Course Objectives

By the end of the course, you will:

  1. Understand what Big Data is, the challenges associated with Big Data, and how Hadoop solves the Big Data problem
  2. Understand Hadoop 2.x Architecture, Replication, Single Point of Failure, YARN
  3. Learn HDFS + YARN commands to work with the cluster
  4. Understand how MapReduce can be used to analyze big data sets
  5. Perform Structured Data Analysis using Hive
  6. Learn different performance tuning techniques in Hive
  7. Learn Data Loading techniques using Sqoop
  8. Use Scala with an intermediate level of proficiency
  9. Use the REPL (the Scala Interactive Shell) for learning
  10. Learn Functional Programming using Scala
  11. Learn Apache Spark 2.x
  12. Use DataFrames and Structured Streaming in Spark 2.x
  13. Analyze and Visualize data using Zeppelin
  14. Learn the popular NoSQL database Cassandra

Work on a real time project on Big Data

This program comes with a portfolio of industry-relevant POCs, use cases and project work. Unlike other institutes, we do not pass off use cases as projects; we clearly distinguish between a use case and a project.

Who is the target audience?

  • Software engineers and programmers who want to understand the larger Big Data ecosystem and use it to store and analyze data.
  • Project, program, or product managers who want to understand the high-level architecture and projects of Big Data.
  • Data analysts and database administrators who are curious about Hadoop and how it relates to their work.

Statistics

Course description: This section of the training will help you understand the fundamentals of statistics that underpin data analysis and machine learning.

Module 01 - Introduction to Data Science

Learning Objectives – In this module, you will be introduced to data science and the fundamentals of statistics. After this module you will understand the statistical concepts that support data analysis and machine learning.

Topics –

  • Introduction to Statistics
  • Different types of Statistics
  • Descriptive statistics
  • Inferential statistics
  • Types of data
    1. Numerical data
      • Discrete data
      • Continuous data
    2. Categorical data
    3. Ordinal data
  • Deep dive into Descriptive statistics
    1. Uni-variate Analysis
    2. Bi-variate Analysis
    3. Multivariate Analysis
    4. Function Models
    5. Significance in Data Science
  • Deep dive into Inferential statistics
    1. Sampling Distributions & Estimation
    2. Hypothesis Testing (One and Two Group Means)
    3. Hypothesis Testing (Categorical Data)
    4. Hypothesis Testing (More Than Two Group Means)
    5. Quantitative Data (Correlation & Regression)
    6. Significance in Data Science
  • Numerical Parameters to represent data
    1. Mean
    2. Mode
    3. Median
    4. Sensitivity
    5. Information Gain
    6. Entropy
  • Population and Sampling
  • Sampling techniques
  • Covariance
  • Point Estimation

Module 02 - Introduction to Statistics

Learning Objectives – In this module, you will learn the fundamentals of statistics. After this module you will understand the statistical concepts that support data analysis and machine learning.

Topics –

  • Introduction to Statistics
  • Different types of Statistics
  • Descriptive statistics
  • Inferential statistics
  • Types of data
    1. Numerical data
      • Discrete data
      • Continuous data
    2. Categorical data
    3. Ordinal data
  • Deep dive into Descriptive statistics
    1. Uni-variate Analysis
    2. Bi-variate Analysis
    3. Multivariate Analysis
    4. Function Models
    5. Significance in Data Science
  • Deep dive into Inferential statistics
    1. Sampling Distributions & Estimation
    2. Hypothesis Testing (One and Two Group Means)
    3. Hypothesis Testing (Categorical Data)
    4. Hypothesis Testing (More Than Two Group Means)
    5. Quantitative Data (Correlation & Regression)
    6. Significance in Data Science
  • Numerical Parameters to represent data
    1. Mean
    2. Mode
    3. Median
    4. Sensitivity
    5. Information Gain
    6. Entropy
  • Population and Sampling
  • Sampling techniques
  • Covariance
  • Point Estimation

R – Programming

Course description: This section covers the R programming language, including its data structures, flow control statements and functions, and prepares you for data analysis and visualization in R.

Module 01 - Getting started with R

Learning Objectives – In this module, you will learn about R fundamentals, understand different types of R Data Structures, Flow control statements and Functions. After this module you will be able to create/extract data from different R Data Structures and write your own R functions.

Topics –

  • Introduction to R – Overview and Features
  • Environment Setup
  • Understanding of different Arithmetic operation in R
  • Variables
  • Understanding of different R Data structures
  • Exploring Data Structure
    1. Vector
    2. List
    3. Matrices
    4. Arrays
    5. Factors
    6. DataFrames
  • Introduction to Vector
    1. Vector creation, data extraction and manipulation
  • Data Structure – List
    1. Introduction to List
    2. List creation and manipulation
  • Data Structure – Matrices
    1. Introduction to Matrices
    2. Matrices creation, data extraction and computations
  • Data Structure – Arrays
    1. Introduction to Arrays
    2. Arrays creation and manipulation
  • Data Structure – Factors
    1. Introduction to Factors
    2. Generating different Factor Levels
  • Data Structure – Data Frames
    1. Introduction to Data Frames
    2. Data Frame creation, data extraction and computations
    3. Data Reshaping
  • Flow Control statements in R
    1. If statement
    2. If…else statement
    3. switch statement
    4. while loop
    5. for loop
    6. repeat loop
    7. break and next
  • Exploring built-in functions in R
    1. Generating Sequence
    2. Generating Random Numbers
    3. Column Bind : cbind()
    4. Row Bind : rbind()
    5. Merge Functions
  • Exploring user defined functions in R
    1. Declaring Function
    2. Calling a function with/without arguments
    3. Lazy Evaluation of Function in R

Module 02 - Data Importing Techniques

Learning Objectives – In this module, you will learn about R fundamentals, understand different types of R Data Structures, Flow control statements and Functions. After this module you will be able to create/extract data from different R Data Structures and write your own R functions.

Topics –

  • Introduction to R – Overview and Features
  • Environment Setup
  • Understanding of different Arithmetic operation in R
  • Variables
  • Understanding of different R Data structures
  • Exploring Data Structure
    1. Vector
    2. List
    3. Matrices
    4. Arrays
    5. Factors
    6. DataFrames
  • Introduction to Vector
    1. Vector creation, data extraction and manipulation
  • Data Structure – List
    1. Introduction to List
    2. List creation and manipulation
  • Data Structure – Matrices
    1. Introduction to Matrices
    2. Matrices creation, data extraction and computations
  • Data Structure – Arrays
    1. Introduction to Arrays
    2. Arrays creation and manipulation
  • Data Structure – Factors
    1. Introduction to Factors
    2. Generating different Factor Levels
  • Data Structure – Data Frames
    1. Introduction to Data Frames
    2. Data Frame creation, data extraction and computations
    3. Data Reshaping
  • Flow Control statements in R
    1. If statement
    2. If…else statement
    3. switch statement
    4. while loop
    5. for loop
    6. repeat loop
    7. break and next
  • Exploring built-in functions in R
    1. Generating Sequence
    2. Generating Random Numbers
    3. Column Bind : cbind()
    4. Row Bind : rbind()
    5. Merge Functions
  • Exploring user defined functions in R
    1. Declaring Function
    2. Calling a function with/without arguments
    3. Lazy Evaluation of Function in R

Module 03 - Exploratory Data Analysis

Learning Objectives – In this module, you will learn about R fundamentals, understand different types of R Data Structures, Flow control statements and Functions. After this module you will be able to create/extract data from different R Data Structures and write your own R functions.

Topics –

  • Introduction to R – Overview and Features
  • Environment Setup
  • Understanding of different Arithmetic operation in R
  • Variables
  • Understanding of different R Data structures
  • Exploring Data Structure
    1. Vector
    2. List
    3. Matrices
    4. Arrays
    5. Factors
    6. DataFrames
  • Introduction to Vector
    1. Vector creation, data extraction and manipulation
  • Data Structure – List
    1. Introduction to List
    2. List creation and manipulation
  • Data Structure – Matrices
    1. Introduction to Matrices
    2. Matrices creation, data extraction and computations
  • Data Structure – Arrays
    1. Introduction to Arrays
    2. Arrays creation and manipulation
  • Data Structure – Factors
    1. Introduction to Factors
    2. Generating different Factor Levels
  • Data Structure – Data Frames
    1. Introduction to Data Frames
    2. Data Frame creation, data extraction and computations
    3. Data Reshaping
  • Flow Control statements in R
    1. If statement
    2. If…else statement
    3. switch statement
    4. while loop
    5. for loop
    6. repeat loop
    7. break and next
  • Exploring built-in functions in R
    1. Generating Sequence
    2. Generating Random Numbers
    3. Column Bind : cbind()
    4. Row Bind : rbind()
    5. Merge Functions
  • Exploring user defined functions in R
    1. Declaring Function
    2. Calling a function with/without arguments
    3. Lazy Evaluation of Function in R

Module 04 - Data Visualization using R

Learning Objectives – In this module, you will learn about R fundamentals, understand different types of R Data Structures, Flow control statements and Functions. After this module you will be able to create/extract data from different R Data Structures and write your own R functions.

Topics –

  • Introduction to R – Overview and Features
  • Environment Setup
  • Understanding of different Arithmetic operation in R
  • Variables
  • Understanding of different R Data structures
  • Exploring Data Structure
    1. Vector
    2. List
    3. Matrices
    4. Arrays
    5. Factors
    6. DataFrames
  • Introduction to Vector
    1. Vector creation, data extraction and manipulation
  • Data Structure – List
    1. Introduction to List
    2. List creation and manipulation
  • Data Structure – Matrices
    1. Introduction to Matrices
    2. Matrices creation, data extraction and computations
  • Data Structure – Arrays
    1. Introduction to Arrays
    2. Arrays creation and manipulation
  • Data Structure – Factors
    1. Introduction to Factors
    2. Generating different Factor Levels
  • Data Structure – Data Frames
    1. Introduction to Data Frames
    2. Data Frame creation, data extraction and computations
    3. Data Reshaping
  • Flow Control statements in R
    1. If statement
    2. If…else statement
    3. switch statement
    4. while loop
    5. for loop
    6. repeat loop
    7. break and next
  • Exploring built-in functions in R
    1. Generating Sequence
    2. Generating Random Numbers
    3. Column Bind : cbind()
    4. Row Bind : rbind()
    5. Merge Functions
  • Exploring user defined functions in R
    1. Declaring Function
    2. Calling a function with/without arguments
    3. Lazy Evaluation of Function in R

Module 05 - Exploring R Package

Learning Objectives – In this module, you will learn about R fundamentals, understand different types of R Data Structures, Flow control statements and Functions. After this module you will be able to create/extract data from different R Data Structures and write your own R functions.

Topics –

  • Introduction to R – Overview and Features
  • Environment Setup
  • Understanding of different Arithmetic operation in R
  • Variables
  • Understanding of different R Data structures
  • Exploring Data Structure
    1. Vector
    2. List
    3. Matrices
    4. Arrays
    5. Factors
    6. DataFrames
  • Introduction to Vector
    1. Vector creation, data extraction and manipulation
  • Data Structure – List
    1. Introduction to List
    2. List creation and manipulation
  • Data Structure – Matrices
    1. Introduction to Matrices
    2. Matrices creation, data extraction and computations
  • Data Structure – Arrays
    1. Introduction to Arrays
    2. Arrays creation and manipulation
  • Data Structure – Factors
    1. Introduction to Factors
    2. Generating different Factor Levels
  • Data Structure – Data Frames
    1. Introduction to Data Frames
    2. Data Frame creation, data extraction and computations
    3. Data Reshaping
  • Flow Control statements in R
    1. If statement
    2. If…else statement
    3. switch statement
    4. while loop
    5. for loop
    6. repeat loop
    7. break and next
  • Exploring built-in functions in R
    1. Generating Sequence
    2. Generating Random Numbers
    3. Column Bind : cbind()
    4. Row Bind : rbind()
    5. Merge Functions
  • Exploring user defined functions in R
    1. Declaring Function
    2. Calling a function with/without arguments
    3. Lazy Evaluation of Function in R

Python 3.x – Preparatory Course

Module 01 - Language Fundamentals

Learning Objectives – In this module, you will learn Python language fundamentals, including data types, indentation and control flow statements such as decisions and loops.

Topics –

  • Introduction to Python
  • Installing Python in Windows using PyCharm
  • Data Types
    1. Numbers
    2. Strings
    3. Booleans
  • Control Flow Statements
    1. Understanding Python Indentation
    2. Decisions
      • The if Statement
      • The if-else Statement
      • The if-elif-else Statement
    3. Looping
      • The while loop
      • The for loop
      • Using range() in for loops
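For illustration, a minimal Python sketch of the decision and looping constructs listed above; the marks value is a made-up example:

```python
# if / elif / else, while and for with range()
marks = 72

if marks >= 75:
    grade = "A"
elif marks >= 60:
    grade = "B"
else:
    grade = "C"
print("Grade:", grade)

count = 3
while count > 0:            # while loop
    print("countdown", count)
    count -= 1

for i in range(1, 6):       # for loop over range()
    print(i, "squared is", i * i)
```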

Check E-Learning for more Assignments + Use cases + Project work + Materials + Case studies

Module 02 - Collections

Learning Objectives – In this module, you will learn Python's built-in collections, Lists, Tuples, Sets and Dictionaries, and how to create, access, iterate over and modify them.

Topics –

  • Exploring Python Collections
  • Lists
    1. Creating Lists
    2. Accessing List Elements
    3. Iterating through list elements
    4. Searching elements within Lists
      • Check for existence
      • Counting occurrences
      • Locating elements
    5. List slices
    6. Adding and deleting elements
    7. Adding, Multiplying and Copying Lists
  • Tuples
    1. Creating tuples
      • Creating Tuples from Lists using tuple()
      • Creating empty tuples using tuple()
      • Creating Singleton Tuples
    2. Accessing Tuple elements
    3. Counting Tuple elements
    4. Iterating through tuple elements
    5. Searching elements within tuples
    6. Tuple slices
    7. Adding, Multiplying and Copying Tuples
      • Adding Tuples
      • Multiplying Tuples
      • Assigning and Copying Tuples
  • Sets
    1. Creating Sets
    2. Accessing Set elements
    3. Counting Set elements
    4. Iterating through Set elements
    5. Adding and Deleting elements
    6. Set Operations
    7. Set Union
    8. Set Intersection
    9. Set Difference
  • Dictionaries
    1. Creating Dictionaries
    2. Accessing Dictionary elements
    3. Iterating through Dictionary elements
      • Iterating through the keys of a Dictionary
      • Iterating through the values of Dictionary
      • Iterating through the key-value pairs of a Dictionary
    4. Searching elements within Dictionaries
      • Checking for the existence of a key in a Dictionary
      • Extracting the value of a key using []
      • Extracting the value of a key using dict.get()
    5. Adding and Deleting elements
      • Adding elements using []
      • Adding elements using setdefault()
    6. Deleting Elements
      • Using del to Delete an Element
      • Using popitem() to Delete Elements
      • Using pop() to Delete Elements
      • Using clear() to Delete all elements of a Dictionary
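For illustration, a minimal Python sketch touching each of the four collection types above; the values are placeholders:

```python
# Lists, tuples, sets and dictionaries in a few lines.
fruits = ["apple", "banana", "mango"]          # list: ordered, mutable
fruits.append("grape")
print("banana" in fruits, fruits.count("apple"), fruits[1:3])

point = (12.97, 77.59)                         # tuple: ordered, immutable
lat, lon = point                               # unpacking
print(lat, lon)

odds, primes = {1, 3, 5, 7}, {2, 3, 5, 7}      # sets: unique elements
print(odds | primes, odds & primes, odds - primes)   # union, intersection, difference

prices = {"apple": 30, "banana": 10}           # dictionary: key-value pairs
prices.setdefault("mango", 60)                 # add only if the key is missing
for fruit, price in prices.items():
    print(fruit, price)
print(prices.get("grape", "not available"))    # safe lookup with a default
```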

Module 03 - Functions & Lambdas Expressions

Learning Objectives – In this module, you will learn how to define and call functions in Python, work with different kinds of arguments and return values, and use lambda expressions with built-in functions such as map(), filter() and reduce().

Topics –

  • Introduction to Functions
  • Function Definition
  • Function call
  • Positional Arguments
  • Default arguments
  • Keyword arguments
  • Variable arguments
    1. Variable arguments with positional parameters
    2. Variable arguments with default arguments
    3. Variable arguments followed by default arguments
    4. Variable arguments followed by keyword arguments
  • Returning From Functions
    1. Returning tuples from functions
    2. Returning Lists from functions
    3. Returning Dictionaries from functions
  • Returning single values from functions
  • Returning Collection from functions
  • Global variables
  • Exploring Lambda Expressions
    1. Introduction to Lambda expressions
    2. Declaring Lambda expressions
    3. What is an expression
    4. Understanding when to use Lambda expressions
    5. Defaults in Lambda expressions
  • Lambdas with built in functions
    1. The map() function
    2. The filter() function
    3. The reduce() function
    4. Practical use of map(), filter() and reduce() with Lambda expressions
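For illustration, a minimal Python sketch of function arguments, return values and lambdas with map(), filter() and reduce(); the describe() function and its data are made up for the example:

```python
from functools import reduce   # reduce() lives in functools in Python 3

def describe(name, role="student", *scores, **extra):
    """Positional, default, variable and keyword arguments in one signature."""
    return {"name": name, "role": role, "average": sum(scores) / len(scores), **extra}

print(describe("Asha", "analyst", 80, 90, 70, city="Bangalore"))

numbers = [1, 2, 3, 4, 5, 6]
doubled = list(map(lambda x: x * 2, numbers))          # transform each element
evens   = list(filter(lambda x: x % 2 == 0, numbers))  # keep only matching elements
total   = reduce(lambda a, b: a + b, numbers)          # fold the list into one value
print(doubled, evens, total)
```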


[Capstone Project] - Spark Streaming

E-Commerce Data Analysis – [Real-time industry use case]

Use case Description :

  • An e-commerce company wants to build a real-time analytics dashboard to optimize its inventory and operations.
  • This dashboard should show how many products are getting purchased, shipped, delivered and cancelled every minute.
  • This dashboard will be very useful for operational intelligence.

Data Analysis – Python Libraries for Data Analysis

Module 01 - NumPy

Learning Objectives – In this module, you will learn NumPy, one of the fundamental packages for scientific computing with Python.

Topics –

  • Introduction to NumPy
  • Exploring NumPy Arrays
  • Python Lists vs NumPy Arrays
  • Exploring NumPy Operations
  • Looping through List and NumPy Arrays
  • Multiplying each elements in Lists and NumPy Arrays
  • Creating multi-dimensional array using NumPy library
  • Squaring the number of each element
  • Exploring NumPy Built in Methods
    1. ndim
    2. itemsize
    3. dtype
    4. shape
    5. reshape
    6. arange
    7. linspace
    8. eye
  • Advantages of NumPy library
    1. Less Memory
    2. Fast
    3. Convenient
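For illustration, a minimal NumPy sketch of the attributes and helpers listed above:

```python
import numpy as np

a = np.arange(12)            # 0 .. 11
m = a.reshape(3, 4)          # reshape into a 3x4 matrix

print(m.ndim, m.shape, m.dtype, m.itemsize)   # dimensions, shape, element type and size
print(np.linspace(0, 1, 5))                   # 5 evenly spaced values between 0 and 1
print(np.eye(3))                              # 3x3 identity matrix
print(m * m)                                  # element-wise multiplication, no Python loop needed

squares = [x ** 2 for x in range(4)]          # plain Python list equivalent, for comparison
print(squares, (np.arange(4) ** 2).tolist())
```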

Module 02 - [Pandas] - Series

Learning Objectives – In this module, you will be introduced to the pandas library and its Series data structure, including its key attributes and methods.

Topics –

  • Introduction to Pandas
  • Exploring Pandas fundamental data structure
    1. Series Data Structure
    2. DataFrame Data Structure
  • Different ways to create series data structure
  • Parameters and Arguments for series object
    1. Understanding usecols parameters in Series object
    2. Modifying the squeeze parameters
    3. Exploring inplace parameter
  • Exploring Series attributes
    1. The .values attribute
    2. The .index attribute
    3. The .dtype attribute
  • Exploring Series methods
    1. The head() and .tail() method
    2. The .sort_values() method
    3. The .sort_index() method
    4. Extracting Series values by Index position
    5. Extracting Series values by index label
    6. The .get() Method
    7. Math methods and Series objects
    8. The .idxmax() and .idxmin() method
    9. The .value_counts() method
    10. The .apply() method
    11. The .map() method
    12. Applying Python Built-In Functions to Series
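For illustration, a minimal pandas Series sketch exercising the attributes and methods above; the index labels and values are made up:

```python
import pandas as pd

s = pd.Series([250, 120, 340, 120, 90], index=["mon", "tue", "wed", "thu", "fri"])

print(s.values, s.index, s.dtype)   # core attributes
print(s.head(3), s.tail(2))         # first / last elements
print(s.sort_values())              # sort by value
print(s.sort_index())               # sort by index label
print(s.idxmax(), s.idxmin())       # labels of the max / min values
print(s.value_counts())             # frequency of each value
print(s.get("wed"))                 # safe lookup by label
print(s.apply(lambda x: x * 1.1))   # element-wise transformation
```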

Module 03 - [Pandas] - DataFrame - I

Learning Objectives – In this module, you will learn about the pandas DataFrame data structure.

Topics –

  • Introduction to Pandas
  • Exploring Pandas fundamental data structure
    1. Series Data Structure
    2. DataFrame Data Structure
  • Different ways to create series data structure
  • Parameters and Arguments for series object
    1. Understanding usecols parameters in Series object
    2. Modifying the squeeze parameters
    3. Exploring inplace parameter
  • Exploring Series attributes
    1. The .values attribute
    2. The .index attribute
    3. The .dtype attribute
  • Exploring Series methods
    1. The head() and .tail() method
    2. The .sort_values() method
    3. The .sort_index() method
    4. Extracting Series values by Index position
    5. Extracting Series values by index label
    6. The .get() Method
    7. Math methods and Series objects
    8. The .idxmax() and .idxmin() method
    9. The .value_counts() method
    10. The .apply() method
    11. The .map() method
    12. Applying Python Built-In Functions to Series

Module 04 - [Pandas] - DataFrame - II

Learning Objectives – In this module, you will continue working with the pandas DataFrame data structure.

Topics –

  • Introduction to Pandas
  • Exploring Pandas fundamental data structure
    1. Series Data Structure
    2. DataFrame Data Structure
  • Different ways to create series data structure
  • Parameters and Arguments for series object
    1. Understanding usecols parameters in Series object
    2. Modifying the squeeze parameters
    3. Exploring inplace parameter
  • Exploring Series attributes
    1. The .values attribute
    2. The .index attribute
    3. The .dtype attribute
  • Exploring Series methods
    1. The head() and .tail() method
    2. The .sort_values() method
    3. The .sort_index() method
    4. Extracting Series values by Index position
    5. Extracting Series values by index label
    6. The .get() Method
    7. Math methods and Series objects
    8. The .idxmax() and .idxmin() method
    9. The .value_counts() method
    10. The .apply() method
    11. The .map() method
    12. Applying Python Built-In Functions to Series
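For illustration, a minimal pandas DataFrame sketch to complement the Series material; the columns and rows are a made-up example:

```python
import pandas as pd

df = pd.DataFrame({
    "name":   ["Asha", "Ravi", "Meena", "John"],
    "city":   ["Bangalore", "Chennai", "Bangalore", "Pune"],
    "salary": [65000, 54000, 72000, 48000],
})

print(df.head())                                   # first rows
print(df.dtypes)                                   # column types
print(df.sort_values("salary", ascending=False))   # sorting by a column
print(df[df["city"] == "Bangalore"])               # filtering rows
print(df.groupby("city")["salary"].mean())         # aggregation
```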


Data Visualization – Python Libraries for Data Visualization

Course description: This section covers popular Python libraries for data visualization, which help you build plots and charts to explore data and communicate insights.

Module 01 - Matplotlib

Learning Objectives – In this module, you will learn one of the most popular data visualization libraries, Matplotlib, which makes it easy to build various types of plots and customize them to be more visually appealing and interpretable.

Topics –

  • Introduction to Matplotlib
  • Plotting Line Chart
    1. Functional Method
    2. Object Oriented Method
  • Plotting Scatter Plot
  • Histograms
  • Customization
    1. Colors attributes
    2. Understanding linewidth
    3. Line Style attributes
    4. Exploring alpha attributes
    5. Markers
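For illustration, a minimal Matplotlib sketch contrasting the functional and object-oriented methods, with a few of the customization attributes listed above:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Functional (pyplot) method: line chart with customization
plt.plot(x, np.sin(x), color="green", linewidth=2, linestyle="--", alpha=0.8)
plt.title("Functional method")
plt.show()

# Object-oriented method: scatter plot and histogram on explicit axes
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.scatter(x, np.cos(x), color="purple", alpha=0.6, marker="x")
ax2.hist(np.random.randn(500), bins=30)
fig.suptitle("Object-oriented method")
plt.show()
```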

Module 02 - Seaborn

Learning Objectives – In this module, you will learn Seaborn, a statistical data visualization library built on top of Matplotlib that makes common plots easier to create and style.

Topics –

  • Introduction to Matplotlib
  • Plotting Line Chart
    1. Functional Method
    2. Object Oriented Method
  • Plotting Scatter Plot
  • Histograms
  • Customization
    1. Colors attributes
    2. Understanding linewidth
    3. Line Style attributes
    4. Exploring alpha attributes
    5. Markers
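For illustration, a minimal Seaborn sketch, assuming Seaborn is installed; it uses the library's bundled "tips" example dataset, which is fetched on first use:

```python
import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")                  # small example dataset shipped with Seaborn
sns.histplot(data=tips, x="total_bill", bins=20) # distribution of a numeric column
plt.show()

sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")  # relationship between two variables
plt.show()
```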

Module 03 - Geographical Plotting

Learning Objectives – In this module, you will learn how to create geographical plots that visualize data on maps.

Topics –

  • Introduction to Matplotlib
  • Plotting Line Chart
    1. Functional Method
    2. Object Oriented Method
  • Plotting Scatter Plot
  • Histograms
  • Customization
    1. Colors attributes
    2. Understanding linewidth
    3. Line Style attributes
    4. Exploring alpha attributes
    5. Markers

Machine Learning

Course description: This section covers core machine learning concepts and algorithms, with hands-on implementation using scikit-learn.

Module 01 - Introduction to Machine Learning

Learning Objectives – This module is an introduction to Machine Learning.

Topics –

  • What is Machine Learning
  • Traditional Learning vs Machine Learning
  • Real life applications of Machine Learning
  • Types of Machine Learning
    1. Supervised Machine Learning
    2. Unsupervised machine Learning
    3. Reinforcement Learning
  • Supervised Learning
    1. Overview of Supervised Learning
    2. Walk through of Supervised Learning algorithms
    3. Real time Applications of Supervised Learning
    4. Pros / Cons of Supervised Learning
  • Unsupervised Learning
    1. Overview of Unsupervised Learning
    2. Walk through of Unsupervised Learning algorithms
    3. Real time Applications of Unsupervised Learning
    4. Pros / Cons of Unsupervised Learning
  • Reinforcement Learning
    1. Overview of Reinforcement Learning
    2. Walk through of Reinforcement Learning algorithms
    3. Real time Applications of Reinforcement Learning
    4. Pros / Cons of Reinforcement Learning

Module 02 - Introduction to Scikit-Learn

Learning Objectives – In this module, you will learn one of the most popular Python libraries for machine learning.

Topics –

  • Introduction to Scikit Learn
  • Features of Scikit Learn
  • Exploring popular groups of models provided by scikit-learn

Module 03 - Linear Regression [Supervised Learning]

Learning Objectives – In this module, you will learn one of the most well-known algorithms in statistics. Linear Regression predicts a continuous dependent variable from a given set of independent variables.

Topics –

  • Introduction to Linear Regression
  • Understanding of gradient descent and cost function
  • Implementation of linear regression model using scikit learn
  • Different ways to validate the linear regression models
  • Assumptions in linear regression
    1. Multicollinearity
    2. Heteroscedasticity
    3. Auto/serial correlation
    4. Normal distribution of errors
  • Introduction to cross validation
  • Advantages and Drawbacks of Linear Regression
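For illustration, a minimal scikit-learn linear regression sketch with a train/test split and cross-validation; the California housing dataset is used purely as an example and is downloaded on first use:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

reg = LinearRegression().fit(X_train, y_train)
y_pred = reg.predict(X_test)
print("R^2 :", r2_score(y_test, y_pred))
print("MSE :", mean_squared_error(y_test, y_pred))
print("5-fold CV R^2:", cross_val_score(reg, X_train, y_train, cv=5).mean())
```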

Module 04 - Logistic Regression [Supervised Learning]

Learning Objectives – In this module, you will learn about the Logistic Regression algorithm. Logistic Regression predicts a categorical dependent variable from a given set of independent variables.

Topics –

  • Introduction to Logistic Regression
  • Assumptions of Logistic Regression
  • Understanding of odds and odds ratio
  • Implementation of Logistic Regression using scikit learn
  • Understanding of TPR, TNR, Precision, Recall and the Confusion Matrix
  • Validation of the model using ROC curve
  • Advantages and Drawbacks of Logistic Regression
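For illustration, a minimal scikit-learn logistic regression sketch with a confusion matrix and precision/recall report, using a bundled example dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression(max_iter=5000)       # higher max_iter helps convergence on raw features
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(confusion_matrix(y_test, y_pred))       # counts of correct / incorrect predictions per class
print(classification_report(y_test, y_pred))  # precision, recall, F1
```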

Module 05 - K-Nearest Neighbours [Supervised Learning]

Learning Objectives – In this module, you will learn about the popular KNN algorithm, one of the simplest yet most powerful supervised machine learning algorithms.

Topics –

  • Introduction to KNN Algorithm
  • Introduction to different distance measures
  • Implementation of KNN using scikit learn
  • Validation of KNN Model
  • Advantages of KNN
  • Drawbacks of KNN
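For illustration, a minimal scikit-learn KNN sketch using Euclidean distance on a bundled example dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_train, y_train)
print("Test accuracy:", knn.score(X_test, y_test))
```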

Module 06 - Support Vector Machine [Supervised Learning]

Learning Objectives – In this module, you will learn about SVM algorithm. SVM is a powerful algorithm which can be used for both classification and regression use cases.

Topics –

  • Introduction to SVM
  • Understanding of hyperplanes
  • Benefits of SVM as compared to other algorithms
  • Kernel Trick in SVM
  • Implementation of SVM using scikit learn
  • Hyperparameter tuning in SVM
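For illustration, a minimal scikit-learn sketch of an RBF-kernel SVM with a small hyperparameter grid search; the parameter grid is a made-up example:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))     # SVMs benefit from feature scaling
grid = GridSearchCV(pipe, {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]}, cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.score(X_test, y_test))
```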

Module 07 - K-Means Clustering [Unsupervised Learning]

Learning Objectives – In this module, you will learn different clustering algorithms. We will also go through the popular K-Means clustering algorithm.

Topics –

  • Introduction to different clustering techniques
  • Understanding of hierarchical clustering
  • Introduction to K-Means clustering
  • Understanding of Euclidean distance
  • Implementation of K-Means using scikit Learn
  • Optimization of K-Means clustering
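For illustration, a minimal scikit-learn K-Means sketch that also prints the inertia values typically used for the elbow method; the data is synthetic:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    print(k, round(km.inertia_, 1))     # inertia drops sharply until the "right" K (elbow)

labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)
print(labels[:10])                      # cluster assignment for the first few points
```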

Module 08 - Principal Component Analysis [Unsupervised Learning]

Learning Objectives – In this module, you will learn about Principal Component Analysis (PCA). PCA is a dimensionality reduction technique. You will also learn its implementation using scikit-learn.

Topics –

  • Introduction to PCA
  • Introduction to Factor Analysis
  • Implementation using scikit learn
  • Advantages of PCA
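For illustration, a minimal scikit-learn PCA sketch that reduces a bundled example dataset to two components:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)
print(X_reduced.shape)                         # (150, 2)
print(pca.explained_variance_ratio_)           # variance captured by each component
```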

Module 09 - Decision Trees [Supervised Learning]

Learning Objectives – In this module, you will learn about the Decision Tree algorithm, which can be used for both classification and regression use cases. Decision Trees are widely used because of their high interpretability.

Topics –

  • Introduction to Decision Tree
  • Understanding of CART algorithm
  • Understanding of Entropy and Gini Index
  • Implementation of Decision Tree using scikit learn
  • Parameter tuning in Decision Trees
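For illustration, a minimal scikit-learn decision tree (CART) sketch using the Gini criterion on a bundled example dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))
print(export_text(tree))                  # human-readable view of the learned splits
```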

Module 10 - Random Forest Algorithm [Supervised Learning]

Learning Objectives – In this module, you will learn about the Random Forest ensemble algorithm, which can be used for both classification and regression use cases. Random Forest is widely used because of its high accuracy and ease of use.

Topics –

  • Introduction to Random Forest
  • Understanding of ensemble Modelling
  • Bagging and Boosting
  • Implementation of Random Forest using scikit learn
  • Hyperparameter tuning in Random Forest
  • Feature Importance in Random Forest
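For illustration, a minimal scikit-learn Random Forest sketch that also prints feature importances, using a bundled example dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0)

rf = RandomForestClassifier(n_estimators=100, max_depth=4, random_state=0)
rf.fit(X_train, y_train)
print("Test accuracy:", rf.score(X_test, y_test))
for name, imp in zip(data.feature_names, rf.feature_importances_):
    print(f"{name}: {imp:.3f}")           # importance of each input feature
```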

Module 11 - XGBoost [Supervised Learning]

Learning Objectives – XGBoost is one of the most popular algorithms in machine learning. It can be used for both classification and regression. In this session we will go through how to get high accuracy with this boosting algorithm.

Topics –

  • Introduction to XGBoost
  • Benefits of XGBoost as compared to other algorithms
  • Implementation of XGBoost using scikit-learn
  • Hyperparameter tuning in XGBoost
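For illustration, a minimal XGBoost sketch using its scikit-learn-style API; it assumes the xgboost package is installed, and the parameters are placeholders rather than tuned values:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1, eval_metric="logloss")
clf.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```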

Module 12 - Time series Forecasting [Supervised Learning]

Learning Objectives – In this module, you will learn about various time series forecasting methods. We will do a deep dive into ARIMA model.

Topics –

  • Introduction to Time series Forecasting
  • Understanding of different Time Series forecasting methods
  • Understanding of ARIMA
  • Implementation of ARIMA
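For illustration, a minimal ARIMA sketch using statsmodels on a synthetic monthly series; the (p, d, q) order is a placeholder, not a recommendation:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly series: upward trend plus noise
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
series = pd.Series(100 + np.arange(48) * 2 + np.random.normal(0, 5, 48), index=idx)

model = ARIMA(series, order=(1, 1, 1))   # (p, d, q)
fitted = model.fit()
print(fitted.summary())
print(fitted.forecast(steps=6))          # forecast the next 6 months
```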

Module 13 - Natural Language Processing

Learning Objectives – In this module, you will learn about Natural Language Processing. We will go through the popular NLTK packages and sentiment analysis using Python.

Topics –

  • Introduction to Natural Language Processing
  • Deep Dive to NLTK package
  • Tokenizing words and sentences
  • Stop words
  • Stemming words
  • Lemmatization
  • Word Net
  • Sentiment Analysis using NLTK
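For illustration, a minimal NLTK sketch covering tokenization, stop words, stemming, lemmatization and VADER sentiment; it assumes the required NLTK corpora can be downloaded:

```python
import nltk
from nltk.corpus import stopwords
from nltk.sentiment import SentimentIntensityAnalyzer
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import word_tokenize

# Fetch required corpora/models on first run (no-op if already present)
for pkg in ("punkt", "punkt_tab", "stopwords", "wordnet", "vader_lexicon"):
    nltk.download(pkg, quiet=True)

text = "NPN Training classes are really engaging and the projects are very useful."

tokens = word_tokenize(text)                                    # word tokenization
filtered = [w for w in tokens if w.lower() not in stopwords.words("english")]
stems = [PorterStemmer().stem(w) for w in filtered]             # stemming
lemmas = [WordNetLemmatizer().lemmatize(w) for w in filtered]   # lemmatization
sentiment = SentimentIntensityAnalyzer().polarity_scores(text)  # VADER sentiment scores
print(filtered, stems, lemmas, sentiment, sep="\n")
```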

TensorFlow – Deep Learning

Module 01 - Neural Network

Learning Objectives – In this module, you will learn about Neural Networks. We will go through the different concepts of neural networks and do a deep dive into RNN and CNN models.

Topics –

  • Introduction to Neural Network
  • Understanding of gradient descent
  • Understanding of forward and backward propagation
  • Understanding of RNN and CNN
  • Implementation of neural network using scikit learn

Module 02 - TensorFlow

Learning Objectives – In this module, you will learn the basics of deep learning. We will go through the TensorFlow API and implement a neural network using TensorFlow.

Topics –

  • Introduction to the TensorFlow API
  • Implementation of a neural network using TensorFlow
  • Benefits of the TensorFlow API
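For illustration, a minimal TensorFlow 2.x / Keras sketch of a small neural network trained on random data purely to show the API:

```python
import numpy as np
import tensorflow as tf

X = np.random.rand(200, 4).astype("float32")   # 200 samples, 4 features (made-up data)
y = (X.sum(axis=1) > 2).astype("int32")        # toy binary label

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))          # [loss, accuracy] on the toy data
```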

This program comes with a portfolio of industry-relevant POCs, use cases and project work. Unlike other institutes, we do not pass off use cases as projects; we clearly distinguish between a use case and a project.

Process we follow for project development

We follow the Agile methodology for project development:

  1. Each batch will be divided into scrum teams of 4-5 members.
  2. We will start with a Feature Study before implementing a project.
  3. The Feature will be broken down into User Stories and Tasks.
  4. For each user story, a proper Definition of Done will be defined.
  5. A test plan will be defined for testing the user story.

Real Time Data Simulator

Project description: A project that generates dynamic mock data from a schema in real time, which can then be fed to real-time processing systems such as Apache Storm or Spark Streaming.

Building Complex Real time Event Processing

Project Description:

In this project, you will build a real-time event processing system using Spark Streaming, where even sub-second delays matter for analysis. It is not aimed at ultra-low-latency (nanosecond) applications, but at use cases such as CDR (Call Detail Record) processing in telecommunications, where you can expect millisecond response times.

User Story 01 – As a developer, we should be able to simulate real-time network data

  1. Task 01 – Use Java Socket programming to generate and publish data to a port
  2. Task 02 – Publish the data with different scenarios

User Story 02 – As a developer, we should be able to consume the data using Spark Streaming

User Story 03 – As a developer, we should consume the Google API to convert latitude and longitude into the corresponding region names.

User Story 04 – Perform computations to calculate important KPIs (Key Performance Indicators) on the real-time data.

More detailed split up will be shared once you start the project.

Technologies Used:

  • Java Socket Programming
  • Google API
  • Scala Programming
  • Spark Streaming

Data Model Development Kit

Project Description :

This project helps data model developers manage Hive tables with the different table definitions, storage types, column types and column properties required for different use case development.

Roles & Responsibility

  1. Building .xml files to define the structure of the Hive tables used to store the processed data.
  2. Actively involved in development to read the .xml files, create data models and load data into Hive.

Technologies Used

Java, JAXB, JDBC, Hadoop, Hive

Sample User Stories

[Study User Story 01] – Come up with a design to represent the data model required to handle the following scenarios:

  • To handle different operations like “CREATE”, “UPDATE”, “DELETE”
  • A way to define a partition table
  • To store columns in order
  • To store column names
  • To handle updates of column type and name

[User Story 02] – HQL Generator – As a developer, we have to provide functionality to create tables

**Tasks**
– [ ] Build a Maven project and add dependencies
– [ ] Integrate loggers
– [ ] Code commit
– [ ] Create a standard package structure
– [ ] Utility to read the xml and create Java objects
– [ ] Utility code to communicate with the Hive DB
– [ ] Check for the Hive service before executing queries
– [ ] Code to construct the HQL query for create
– [ ] Exception handling

Definition of Done
– [ ] Package structure should be created.
– [ ] Table has to be created in Hive
– [ ] Validate all required schema is created
– [ ] Validation of Hadoop + Hive Services

**Test Cases**
1. If the table already exists, we need to print “Table already exists”
2. Verify the schema against the xml
3. If the services are not up and running, it should be handled and logged

 

Course hours

90 hours of extensive classroom training. 30 sessions of 3 hours each. Course duration: 5 months.

Assignments

For each module, multiple hands-on exercises, assignments and quizzes are provided in the E-Learning portal.

Real time project

We follow the Agile methodology for project development. Each project will have a Feature Study followed by User Stories and Tasks.

Mock Interview

There will be a dedicated one-to-one interview call between you and a Big Data Architect, so you experience a real mock interview.

Forum

We have a community forum for all our students wherein you can enrich your learning through peer interaction and knowledge sharing.

Certification

From the beginning of the course, you will be working on a project. On completion of the project, NPN Training certifies you as a “Big Data Architect” based on your project work.

Nov 10th

Batch: Weekend Sat & Sun
Duration: 5 Months
₹ 30,000

Enroll Now

Dec 15th

Batch: Weekend Sat & Sun
Duration: 5 Months
₹ 30,000

Enroll Now

Jan 12th

Batch: Weekend Sat & Sun
Duration: 5 Months
₹ 25,000

Enroll Now

Batches not available

Is Java a pre-requisite to learn Big Data Masters Program?

Yes, Java is a prerequisite. Some institutes say that Java is not required; that is false information.

Can I attend a demo session before enrollment?

Yes, you will sit in an actual live class to experience the quality of the training.

How will I execute the Practicals?

We will help you set up NPN Training’s Virtual Machine + Cloudera Virtual Machine on your system with local access. Detailed installation guides for setting up the environment are provided in the E-Learning portal.

Who is the instructor at NPN Training?

All the Big Data classes will be driven by Naveen sir who is a working professional with more than 12 years of experience in IT as well as teaching.

How do I access the E-Learning content for the course?

Once you have registered and paid for the course, you will have 24/7 access to the E-Learning content.

What If I miss a session?

The course validity is one year, so you can attend any missed sessions in other batches.

Can I avail an EMI option?

The total fee can be paid in 2 installments.

Are there any group discounts for classroom training programs?

Yes, we have group discount options for our training programs. Contact us using the form “Drop Us a Query” on the right of any page on the NPN Training website, or select the Live Chat link. Our customer service representatives will give you more details.

Reviews

Anindya Banerjee
Cognizant
Linkedin

After searching extensively on the internet for big data courses, I came to know about NPN Training and Naveen sir. It was a tough call for me to pick the right training institute which would provide me the right blend of practical exposure and theoretical knowledge on big data and Hadoop technologies. After attending a few classes I am mesmerized by Naveen sir’s way of teaching, his command over various topics and his study materials. Unlike other training institutes, Naveen sir believes in extensive learning and hands-on training, which sets NPN Training far apart from other institutes. I would highly recommend anyone to join NPN Training if he/she wants to make a career in big data technologies.

Sarbartha Paul
HCL Technologies
Linkedin

The best thing I liked about this institute is the way Naveen sir teaches, his way of taking care of each person’s doubts and interests, and his tendency to make others learn big data with complete hands-on experience. The theory he teaches is compact and crunchy enough to get a good hold of the basics.
Another thing that sets this institute apart is the way Naveen sir has designed its Big Data Architect Program course, which covers nearly everything that other institutes lack. The course materials are also very to the point.
In one word, Naveen sir’s way of teaching is a class apart!
I am greatly moved by his ideology and teaching, and this is probably one of the finest institutes in town as far as big data courses are concerned. It is worth joining his classroom in all aspects. Thank you for all your efforts sir!

Sai Venkata Krishna
Capgemini
Linkedin

Naveen is an excellent trainer. Naveen mainly focuses on HANDS ON and REAL TIME SCENARIOS, which helps one to understand the concepts easily. I feel that the NPN Training curriculum is the best in the market for Big Data.
Naveen is very honest in his approach and he delivers additional concepts which are not present in the syllabus of particular topics. The E-Learning and assignments are very informative and helpful. The amount you pay for the Big Data course is worth every penny.
Thank you NPN Training for your support and motivation.
