HoningDS.com offers data science training with coding challenges and real-time projects in Python and R. Many institutes offer data science courses in Hyderabad, so you need to choose one that gives you practical exposure. The training here is 80% hands-on work and 20% theoretical concepts. You will work with Kaggle datasets.

By building a GitHub portfolio and by working with important packages like dplyr during interactive coding exercises, you will definitely get an edge over other aspirants.

Contents

- 1 Best Data Science Course in Hyderabad that will get you a job
- 1.1 Highlights of this training that will blow your mind
- 1.2 Projects covered in this course that get you jobs
- 1.3 What is the demand for data scientists now?
- 1.4 Who can learn data science and upskill themselves?
- 1.5 About the faculty
- 1.6 The problem with classroom coaching
- 1.7 Go through my curriculum that delivers results.
- 1.8 How will you benefit from my sessions?
- 1.9 Commonly asked questions on the course
- 1.10 References

## Highlights of this training that will blow your mind

I believe in learning by doing and trying things out in my data science training classes. Imagine you are given a bunch of textbooks on a subject and asked to learn without any context. You get bored, right? This is where my data scientist course is unique. I provide 35 projects and coding challenges to work with. I describe the datasets and discuss the approach to finding insights from them. We will implement the solutions together during data science classes. I am sure you will agree that no other training institute in Ameerpet, Hyderabad comes even close to teaching hands-on knowledge to such an extent.

The first few days will be used to strengthen your fundamentals in various technologies. You will start with training in Python, R and SQL. We will work on projects after you have a reasonable understanding of these areas.

In the recent past, Python has become the language of choice for data scientists (https://www.kdnuggets.com/2018/05/poll-tools-analytics-data-science-machine-learning-results.html). I expect this trend to continue, with increasing developments in its ecosystem.

## Projects covered in this course that get you jobs

The first project covered during the lessons is Exploring Hacker News. We will then learn visualization using a College Majors dataset. For data cleaning and munging, we will use the Star Wars Survey dataset (https://data.world/fivethirtyeight/star-wars-survey). You get to build an analytics pipeline using Spark! I am sure you will agree with me that none of the other courses in Hyderabad have such content. We also learn to build machine learning models using the Car price prediction dataset (https://archive.ics.uci.edu/ml/datasets/automobile) and many more.

## What is the demand for data scientists now?

The demand for data scientists is huge because supply has not kept up with demand. Salaries in this field are moving above $100,000 per year in the US. In India, there is an estimated shortage of about 200,000 data scientists.

If you want to become one of the lucky, well-compensated professionals filling the current gap in this field, then you should learn data science now. Make use of this opportunity right away.

## Who can learn data science and upskill themselves?

- Graduate students looking for a trending field to get into (https://in.linkedin.com/jobs/data-scientist-jobs)

- Anyone looking for a rigorous data science bootcamp

- IT professionals who want a change from legacy technologies

- Trainees with no knowledge of Hadoop concepts

- Anyone who wants to attend an educational programme with experiential learning (https://elearningindustry.com/project-based-learning-better-traditional-classroom)

- Anyone who wants their learning to be immediately useful and job oriented

Develop the latest skills in this field for a better future. The schedule will suit anyone with a commitment to learn.

## About the faculty

I am an IT professional with 15+ years of real-time experience. I have been working in the field of big data and data science for the last 5 years. I’ve bundled up my knowledge and started building Honing Data Science. My mission is to teach you the main concepts applied across the industry. I will train you in the skills needed to work in data science jobs, for a low price.

## The problem with classroom coaching

Have you been to institutes that promised to teach you data science, but failed to get a data scientist job due to a lack of quality coaching? The reason is that you did not learn the practical knowledge needed to get a data scientist job. Note that employers don’t care what courses you have done. Not every course in the market will teach you the concepts of mathematics, statistics, information science and computer science that make you capable of a career in business analytics.

## Go through my curriculum that delivers results.

1. The print() Command

2. Syntax Errors

3. Code Comments

4. Arithmetical Operations

5. Variables

6. Variable Names

7. Integers and Floats

8. Conversion Between Types

9. Strings

10. Escaping Special Characters

11. String Operations

12. Lists

13. Indexing

14. Negative Indexing

15. Retrieving Multiple List Elements

16. List Slicing

17. List of Lists

18. Opening a File

19. Repetitive Processes

20. For Loops

21. CSV File Handling

Conditional statements

1. If Statements

2. Booleans

3. Multiple Conditions

4. The or Operator

5. Combining Logical Operators

6. Comparison Operators

7. The else Clause

8. The elif Clause

Dictionaries and frequency tables

1. Dictionaries

2. Alternative Way of Creating a Dictionary

3. Key-Value Pairs

4. Checking for Membership

5. Counting with Dictionaries

6. Finding the Unique Values

7. Proportions and Percentages

8. Looping over Dictionaries

9. Frequency Tables for Numerical Columns

Functions

1. Built-in Functions

2. Creating Our Own Functions

3. The Structure of a Function

4. Parameters and Arguments

5. Extract Values From Any Column

6. Creating Frequency Tables

7. Writing a Single Function

8. Reusability and Multiple Parameters

9. Keyword and Positional Arguments

10. Combining Functions

11. Debugging Functions

12. Interfering with the Built-in Functions

13. Variable Names and Built-in Functions

14. Default Arguments

15. Multiple Return Statements

16. Returning Multiple Variables

17. More About Tuples

18. Functions — Code Running Quirks

19. Scopes: Global and Local, Searching Order

Jupyter Notebook

1. Running Code Using the Keyboard

2. Keyboard Shortcuts

3. Hidden State

4. Text and Markdown Cells

5. Installing Jupyter Locally

6. Opening and Closing a Notebook File

7. Absolute and Relative Paths

Cleaning and preparing data in Python

1. Introducing Data Cleaning

2. Replacing Substrings with the replace Method

3. String Capitalization

4. Errors During Data Cleaning

5. Parsing Numbers from Complex Strings

Data analysis basics

1. Summarizing Numeric Data

2. String Formatting

3. Formatting Numbers Inside Strings

4. Function to Summarize Data

Object Oriented Programming Concepts

1. Introducing Classes, Objects, and Methods

2. Defining a Class

3. Creating Methods

4. Attributes and the ‘Init’ Method

5. Dunder Methods

Working with dates and times

1. The Datetime Module

3. The Datetime Class

4. Using Strptime to Parse Strings as Dates

5. Using Strftime to format dates

6. The Time Class

7. The Date Class

8. Calculations with Dates and Times

9. Epoch Time

Object-oriented programming

1. Solving Problems with Code

2. Defining Custom Classes

3. Instance Properties

4. Instance Methods

5. Class Methods

6. Inheritance

7. Overloading Inherited Behavior

Exception handling

1. Handling Exceptions

2. Overloading Comparison Operators

Lambda functions

1. String Manipulation

2. Omitting Starting or Ending Indices

3. Skipping Indices in a Slice with Steps

4. Negative Indexing

5. Searching for Substrings

6. First-Class Functions

7. Lambda Functions

Intro to NumPy

1. Vectorization

2. Understanding NumPy ndarrays

4. Selecting and Slicing Rows and Items from ndarrays

5. Selecting Columns and Custom Slicing ndarrays

6. Vector Math

7. Arithmetic NumPy Functions

8. Statistics For 1D and 2D ndarrays

9. Adding Rows and Columns to ndarrays

10. Sorting ndarrays

Boolean indexing with NumPy

1. Reading CSV files with NumPy

2. Boolean Arrays

3. Boolean Indexing with 1D & 2D ndarrays

4. Assigning Values in ndarrays

Intro to pandas

1. DataFrames

2. Select columns From a DataFrame by Label

3. Column selection shortcuts

4. Selecting Items from a Series by Label

5. Selecting Rows From a DataFrame by Label

6. Series and Dataframe Describe Methods

7. Using Boolean Indexing with pandas Objects

8. Using Boolean Arrays to Assign Values

Exploring data with pandas

1. Using iloc to select by integer position

2. Reading CSV files with pandas

3. Working with Integer Labels

4. Using pandas methods to create boolean masks

5. Boolean Operators

6. Pandas Index Alignment

7. Using Loops with pandas

Combining data with pandas

1. The Concat Function

2. Combining Dataframes with Different Shapes Using the Concat Function

3. Joining Dataframes with the Merge Function

4. Joining on Columns with the Merge Function

5. Left Joins with the Merge Function

6. Join on Index with the Merge Function

Transforming data with pandas

1. Apply a Function Element-wise Using the Map and Apply Methods

2. Apply a Function Element-wise to Multiple Columns Using Applymap Method

3. Apply Functions along an Axis using the Apply Method

4. Reshaping Data with the Melt Function

Working with strings in pandas

1. Using Apply to Transform Strings

2. Vectorized String Methods Overview

3. Exploring Missing Values with Vectorized String Methods

4. Finding Specific Words in Strings

5. Extracting Substrings from a Series

6. Extracting All Matches of a Pattern from a Series

7. Extracting More Than One Group of Patterns from a Series

Working with missing and duplicate data

1. Identifying Missing Values

2. Correcting Data Cleaning Errors that Result in Missing Values

3. Visualizing Missing Data

4. Using Data From Additional Sources to Fill in Missing Values

5. Handling Duplicate Values

6. Handle Missing Values by Dropping Columns

7. Analyzing Missing Data

8. Handling Missing Values with Imputation

9. Dropping Rows

Processing Large Datasets In Pandas

Optimizing dataframe memory footprint

1. How Pandas Represents Values in a Dataframe

2. Different Types Have Different Memory Footprints

3. Calculating the True Memory Footprint

4. Optimizing Integer and Float Columns With Subtypes

5. Converting To DateTime

6. Converting to Categorical to Save Memory

7. Selecting Types While Reading the Data

Processing dataframes in chunks

1. Processing Chunks

2. Batch Processing

3. Optimizing Performance

4. Combining Chunks Using GroupBy

Augmenting Pandas With SQLite

1. Pandas Types vs. SQLite Types

2. Setting Appropriate Types

3. Computing Primarily in Pandas

4. Reading in SQL Results Using Chunks

1. Reading CSV Files with Encodings

2. Cleaning Column Names

3. Converting String Columns to Numeric

4. Extracting Values from Start and End of Strings

5. Correcting Bad Values

6. Dropping Missing Values

7. Filling Missing Values

Data cleaning walkthrough: analyzing and visualizing the data

1. Finding Correlations With the r-value

2. Plot Accessor

3. Mapping the data With Basemap

Data munging

1. Data exploration

2. Filtering

3. Consolidating datasets

4. Counting

Data cleaning and exploration using csvkit

1. Csvkit

2. Csvstack

3. Csvlook

4. Csvcut

5. Csvstat

6. Csvgrep

Line charts

1. Visual Representation

2. Introduction to Matplotlib

3. Fixing Axis Ticks

4. Adding Axis Labels And A Title

Multiple Plots

1. Matplotlib Classes

2. Grid Positioning

3. Formatting And Spacing

4. Overlaying Line Charts

5. Adding More Lines

6. Adding A Legend

Bar plots and scatter plots

1. Bar Plot

2. Creating Bars

3. Aligning Axis Ticks And Labels

4. Horizontal Bar Plot

5. Scatter plot

6. Switching axes

7. Benchmarking correlation

Histograms and box plots

1. Frequency Distribution

2. Binning

3. Histogram In Matplotlib

4. Comparing histograms

5. Quartiles

6. Box Plot

7. Multiple Box Plots

Improving plot aesthetics

1. Aesthetics

2. Data-Ink Ratio

3. Hiding Tick Marks

4. Hiding Spines

Color, layout and annotations

1. Setting Line Color Using RGB

2. Setting Line Width

3. Layout and Ordering

4. Replacing the Legend With Annotations

5. Annotating in Matplotlib

Conditional plots

1. Histograms In Seaborn

2. Kernel Density Plot

3. Modifying The Appearance Of The Plots

4. Conditional Distributions

5. Creating Conditional Plots

6. Adding A Legend

Visualizing geographic data

1. Geographic Data and Coordinate Systems

2. Basemap

3. Converting From Spherical to Cartesian Coordinates

4. Customization Using Basemap and Matplotlib

5. Great Circles

1. Using Loops to Aggregate Data

2. The GroupBy Operation

3. Common Aggregation Methods with Groupby

4. Aggregating Specific Columns with Groupby

5. Agg() Method

6. Computing Multiple and Custom Aggregations with the Agg() Method

7. Pivot Tables

Command line basics

1. Overview of the Filesystem

2. Absolute vs. Relative Paths

3. The Home Directory

4. Making a New Directory

5. Using Command Options to Modify Behavior

6. Reviewing Available Command Options

7. Listing the Contents of a Directory

8. Removing a Directory

Working with files

1. Making a File

2. Standard Streams

3. Editing a File

4. File Permissions

5. Moving Files

6. Copying Files

7. Overview of File Extensions

8. Deleting a File

9. Bypassing Permissions as the Root User

Working with programs

1. Setting and accessing variables

2. Environment Variables

3. Calling Programs

4. The PATH Variable

5. Long Flags

Command line python scripting

1. Introduction to Command Line Python

2. Using Different Python Versions

3. Installing Packages that Extend Python

4. Overview of Virtual Environments

5. Creating and activating a Python 3 virtualenv

6. Verifying the Installed Packages

7. Importing Saved Functions into a File

8. Accessing Command Line Arguments

9. Deactivating a virtualenv

Working with the jupyter console

1. Getting help in Jupyter console

2. Persistent sessions

3. Jupyter magics

4. Autocompletion

5. Accessing the shell

6. Pasting in code

Piping and redirecting output

1. Appending

2. Redirecting from a file

3. The grep command

4. Special characters

5. The star wildcard

6. Piping output

7. Chaining commands

8. Escaping characters

GitHub training for data scientists

1. Version Control Systems

2. The .git Folder

3. Creating Files in the Repository

4. Checking File Status

5. Configuring Identity in Git

6. Committing Changes

7. Viewing File Differences

8. Making a Second Commit

9. Reviewing the Commit History

10. Viewing Commit Differences

Git remotes

1. Remote Repositories

2. Making Changes to Cloned Repositories

3. Overview of the Master Branch

4. Pushing Changes to the Remote

5. Viewing Individual Commits

6. Commits and the Working Directory

7. Switching to a Specific Commit

8. Pulling From a Remote Repo

9. Referring to the Most Recent Commit

Git branches

1. Switching Branches

2. Pushing a Branch to a Remote

3. Merging Branches

4. Deleting Branches

5. Checking Out Branches From the Remote

6. Finding Differences Across Branches

7. Branch Naming Conventions

8. Branch History

Merge conflicts

1. Aborting a Merge

2. Resolving Conflicts

3. Accepting Changes From Only One Branch

4. Ignoring Files

5. Removing Cached Files

Git installation and GitHub integration

1. Installing Git

2. Configuring Git

3. Creating a GitHub Account

4. Authenticating with GitHub

1. Introduction to Databases

2. Previewing A Table Using SELECT

3. Filtering Rows Using WHERE

4. Expressing Multiple Filter Criteria Using AND

5. Returning One of Several Conditions With OR

6. Grouping Operators With Parentheses

7. Ordering Results Using ORDER BY

8. Practice Writing A Query

Summary statistics

1. Finding a Column’s Minimum and Maximum Values in SQL

2. Calculating Sums and Averages in SQL

3. Combining Multiple Aggregation Functions

4. Customizing The Results

5. Counting Unique Values

6. Performing Arithmetic in SQL

Group summary statistics

1. Calculating Group-Level Summary Statistics

2. GROUP BY

3. Querying Virtual Columns With the HAVING Statement

4. ROUND()

5. Nesting functions

6. Casting

Subqueries

1. Subquery In SELECT

2. Returning Multiple Results In Subqueries

3. Building Complex Subqueries

4. Integrating A Subquery With The Outer Query

Querying SQLite from Python

1. Connecting to the Database

2. Introduction to Cursor Objects and Tuples

3. Working With Sequences of Values as Tuples

4. Creating a Cursor and Running a Query

5. Execute as a Shortcut for Running a Query

6. Fetching a Specific Number of Results

7. Closing the Database Connection

Joining data in SQL

1. Introducing Joins

2. Inner Joins

3. Left Joins

4. Right Joins and Outer Joins

5. Combining Joins with Subqueries

Intermediate joins in SQL

1. Working With Larger Databases

2. Joining Three Tables

3. Combining Multiple Joins with Subqueries

4. Recursive Joins

5. Pattern Matching Using Like

6. Generating Columns With The Case Statement

Building and organizing complex queries

1. Writing Readable Queries

2. The With Clause

3. Creating Views

4. Combining Rows With Union, Intersect and Except

5. Multiple Named Subqueries

Table relations and normalization

1. The SQLite Shell

2. Creating Tables

3. Primary and Foreign Keys

4. Database Normalization

5. Inserting and Deleting Rows

6. Adding Columns to a Table

7. Adding Values to Existing Rows

Designing and creating a database

1. Importing Data into SQLite

2. Planning a Normalized Schema

3. Creating Tables Without Foreign Key Relations

Using PostgreSQL

1. SQLite vs PostgreSQL

2. PostgreSQL overview

3. Psycopg2

4. Creating a table

5. SQL Transactions

6. Autocommitting

7. Executing queries

8. Creating a database

9. Deleting a database

Command line PostgreSQL

1. The psql

2. Running SQL queries

3. Special PostgreSQL commands

4. Switching databases

5. Creating users

6. Adding permissions

7. Removing permissions

8. Superusers

PostgreSQL installation

1. Installing PostgreSQL

2. Psycopg2

3. Connecting to PostgreSQL from psycopg2

Indexing

1. Query planner

2. Explain query plan

3. Data representation

4. Time complexity

5. Search and rowid

6. Indexing

7. Create an index

Multi column indexing

1. Query Plan

2. Creating a multi-column index

3. Covering index

1. What’s an API?

2. Introduction to API Requests

3. Types of Requests

4. Understanding Status Codes

5. Adding Query Parameters

6. JSON Format

7. Getting JSON From a Request

8. Content Type

9. API Authentication

10. Endpoints and Objects

11. Pagination

12. User-Level Endpoints

13. POST Requests

14. PUT/PATCH Requests

15. DELETE Requests

Web scraping

1. Web Page Structure

2. Retrieving Elements from a Page

3. Using Find All

4. Element IDs

5. Element Classes

6. Using CSS Selectors

Sampling in data science

1. Solving Problems with Statistics

2. Populations and Samples

3. Sampling Error

4. Simple Random Sampling

5. The Importance of Sample Size

6. Stratified Sampling

7. Choosing the Right Strata

8. Cluster Sampling

9. Descriptive and Inferential Statistics

Variables in Statistics

1. Quantitative and Qualitative Variables

2. Scales of Measurement

3. The Nominal Scale

4. The Ordinal Scale

5. The Interval Scale and Ratio Scales

6. Discrete and Continuous Variables

7. Real Limits

Frequency distribution

1. Frequency Distribution Tables

2. Sorting Tables for Ordinal Variables

3. Proportions and Percentages

4. Percentiles and Percentile Ranks

5. Finding Percentiles with pandas

6. Grouped Frequency Distribution Tables

7. Information Loss

8. Readability for Grouped Frequency Tables

9. Frequency Tables and Continuous Variables

Visualizing frequency distributions

1. Visualizing Distributions

2. Bar Plots

3. Horizontal Bar Plots

4. Pie Charts

5. Customizing a Pie Chart

6. Histograms

7. The Statistics Behind Histograms

8. Histograms as Modified Bar Plots

9. Binning for Histograms

10. Skewed Distributions

11. Symmetrical Distributions

Comparing frequency distributions

1. Comparing Frequency Distributions

2. Grouped Bar Plots

3. Comparing Histograms

4. Kernel Density Estimate Plots

5. Drawbacks of Kernel Density Plots

6. Strip Plots

7. Box plots

8. Outliers

The Mean

1. The Mean as a Balance Point

2. Defining the Mean Algebraically

3. Estimating the Population Mean

4. Estimates from Low-Sized Samples

5. Variability Around the Population Mean

6. The Sample Mean

The weighted mean and the median

1. Different Weights

2. The Weighted Mean

3. The Median for Open-ended Distributions

4. Distributions with Even Number of Values

5. The Median as a Resistant Statistic

6. The Median for Ordinal Scales

7. Sensitivity to Changes

The mode

1. The Mode for Ordinal, Nominal and Discrete Variables

2. Special Cases

3. Skewed Distributions

4. Symmetrical Distributions

Measures of variability

1. The Range

2. The Average Distance

3. Mean Absolute Deviation

4. Variance

5. Standard Deviation

6. Average Variability Around the Mean

7. A Measure of Spread

8. The Sample Standard Deviation

9. Bessel’s Correction

10. Standard Notation

11. Sample Variance — Unbiased Estimator

Z-scores

1. Locating Values in Different Distributions

2. Transforming Distributions

3. The Standard Distribution

4. Standardizing Samples

5. Using Standardization for Comparisons

6. Converting Back from Z-scores

1. R Programming and Data Science

2. Evaluating Expressions in R

3. Adding Notes to Your Code Using Comments

4. Assigning Values to a Variable

5. Performing Calculations Using Variables

6. Creating Vectors

7. Using a Function to Calculate the Mean

8. Performing Operations on Vectors

Working with vectors

1. Indexing Vectors by Position

2. Numeric and Character Data Types

3. Naming Elements of a Vector

4. Indexing Vectors Using Names

5. Comparing Values And Logical Data Types

6. Comparing Single Values Against Vectors

7. Logical Indexing

8. Performing Arithmetic with Vectors

9. Vector Recycling

10. Appending Elements To A Vector

Working with matrices

1. Matrices: Two-Dimensional Data Structures

2. Combining Vectors into Matrices

3. Naming Matrix Rows and Columns

4. Finding Matrix Dimensions

5. Adding Columns to Matrices

6. Indexing Matrices By Element

7. Subsetting Matrices by Rows and Columns

Working with lists

1. Lists: Objects That Can Contain Multiple Data Types

2. Anatomy of a List

3. Assigning Names to List Objects

4. Indexing Lists

5. Modifying List Elements

6. Adding Elements to Lists

7. Combining Lists

Working with dataframes

1. Introduction to Data Frames

2. Installing Packages

3. Importing Data into R

4. Tibbles: Specialized Data Frames

5. Indexing Data Frames

6. Selecting Data Columns

7. Adding a New Column

8. Filtering by a Single Condition

9. Filtering by Multiple Conditions: Meeting At Least One Criterion

10. Filtering by Multiple Conditions

11. Arranging Data Frames by Variables

Install RStudio

1. Introduction to RStudio

2. Installing R

3. Installing RStudio

4. Working in the Console

5. The Global Environment

6. Importing Data

7. Writing Scripts

Working with control structures

1. Introduction to Control Structures

2. Importing the Data

3. Selection: Writing Conditional Statements

4. Repetition: Writing For-Loops

5. Looping Over Rows of a Data Frame

6. Nested Control Structures

7. Storing For-Loop Output in Objects

8. More Than Two Cases: Writing a For-Loop

Working with vectorized functions

1. R Functions as Alternatives to Loops

2. How Does Vectorization Make Code Faster?

3. A Vectorized Function for If-Else Statements

4. Multiple Cases: Nesting Functions to Chain If-Else Statements

5. Functions for Solving “Split-Apply-Combine” Problems

6. Grouping and Summarizing Data Frames

7. Summarizing a Data Frame by Multiple Variables

8. Chaining Functions Together Using the Pipe Operator

Writing custom functions

1. Introduction to Writing Your Own Functions

2. Anatomy of a Function

3. When to Write a Function

4. Writing Functions with Two Variables as Arguments

5. Writing Functions for Conditional Execution

6. Functions with More Than Two Arguments

Working with functionals

1. Working With Functionals From the Tidyverse purrr Package

2. Using Functionals to Apply Custom Functions

3. Functionals to Return Vectors of Specified Types

4. Functionals for Two-Variable Functions

5. Functionals for Returning Vectors of Specific Types from Functions With Two Variables

6. Functionals for Functions with More Than Two Variables

String manipulation

1. Working With Strings

2. Subsetting Strings by Position

3. Splitting Strings

4. Combining Strings

5. String Manipulations for Reformatting Match Dates

6. Padding Strings

7. Creating New Variables

8. Combining Strings to Create Match Summaries

Data visualization in R with ggplot

Creating line graphs

1. Introduction to Data Visualization

2. Using Plots to Visualize Patterns in Data

3. Data Visualization and the Grammar of Graphics

4. Mapping Variables to Axes

5. Adding Geometric Objects to Visualize Data Points

6. Selecting Data for Visualization

7. Adding Graph Titles and Changing Axis Labels

8. Refining Graph Aesthetics

9. Using Your Line Graph to Understand the Data

Creating multiple line graphs

1. Visualizing Data for Multiple Populations

2. Manipulating the Data for Visualization

3. Graphing a Subset of Data

4. Exploring the Data Further

5. Manipulating Multiple Line Graph Aesthetics

6. Deciding How to Present the Data

1. Visualizing Differences Among Groups Using Bar Charts

2. Using Histograms to Understand Distributions

3. Comparing Distributions of Multiple Variables: Faceted Plots

4. Comparing Distributions of Multiple Variables: Specifying Aesthetics

5. Visualizing Averages and Variation

6. Anatomy of a Box Plot

7. Deciding on a Visualization

Scatterplots for exploratory analysis

1. Importing and Modifying Data

2. Understanding Relationships Between Variables

3. Creating Informative Scatter Plots

4. Creating Multiple Scatter Plots

5. Write a Function to Create Multiple Scatter Plots

6. Learning from Scatter Plots

Data cleaning with R

String manipulation and relational data

1. Tidy Data and Efficient Analysis

2. Parsing Numbers from Strings

3. Extracting Numeric Data From Strings: Creating New Variables

4. Splitting Strings

5. Subsetting strings

6. Relational Data: Keys and Joins

7. Inner Joins

8. Outer Joins

9. Using Joins to Create A Single Data Frame

Correlations and reshaping data

1. Visualizing Relationships Between Variables Using Scatter Plots

2. Reshaping Data for Visualization

3. Gathering Data into Columns

4. Comparing the Strength of Relationships Among Pairs of Variables

5. Correlation Analysis: Measuring the Strength of Relationships Between Variables

6. Creating and Interpreting Correlation Matrices

7. Identifying Interesting Relationships

Dealing with missing data

1. Defining “Missing Data”

2. Contagious Missing Values

3. Dropping Rows With Missing Values for one Variable

4. Complete Cases: Dropping All Rows With Missing Data

5. Using Complete Cases: When to Avoid

6. Understanding Effects of Different Techniques for Handling Missing Data

7. Imputing to Replace Missing Data

1. Probability basics

2. Calculating probability

3. Conjunctive probability

4. Dependent probabilities

5. Disjunctive probability

6. Disjunctive dependent probabilities

7. Disjunctive probabilities with multiple conditions

Calculating probabilities

1. Calculating probabilities

2. Number of combinations formula

3. Function to calculate the probability of a single combination

Probability distributions

1. Binomial distributions

2. Computing the distribution

3. Plotting the distribution

4. Simplifying the computation

5. Computing the mean of a probability distribution

6. Computing the standard deviation

7. The normal distribution

8. Cumulative distribution function

9. Calculating z-scores

10. Faster way to calculate likelihood

Significance testing

1. Hypothesis testing

2. Research design

3. Statistical significance

4. Test statistic

5. Permutation test

6. Sampling distribution

7. Dictionary representation of a distribution

8. P value

Chi-squared tests

1. Observed and expected frequencies

2. Calculating differences

3. Generating a distribution

4. Smaller samples

5. Sampling distribution equality

6. Degrees of freedom

7. Increasing degrees of freedom

8. Using SciPy

Multi-category chi-squared tests

1. Multiple categories

2. Calculating expected values

3. Calculating chi-squared

4. Cross tables

5. Finding expected values

K-nearest neighbors

1. Euclidean distance

2. Calculate distance for all observations

3. Randomizing, and sorting

5. Make predictions

Evaluating model performance

1. Testing quality of predictions

2. Error Metrics

3. Mean Squared Error

5. Root Mean Squared Error

6. Comparing MAE and RMSE

Multivariate k-nearest neighbors (KNN)

1. Removing features

2. Handling missing values

3. Normalize columns

4. Euclidean distance for multivariate case

5. scikit-learn

6. Fitting a model and making predictions

7. Calculating MSE using Scikit-Learn

Hyperparameter optimization

1. Expanding grid search

2. Visualizing hyperparameter values

3. Varying features and hyperparameters

Cross validation

1. Holdout Validation

2. K-Fold Cross Validation

3. Training models

4. Performing K-Fold Cross Validation Using Scikit-Learn

5. Exploring Different K Values

6. Bias-Variance Tradeoff

1. Linear Function

2. Slope and y-intercept

3. Math Behind Slope

4. Nonlinear function

5. Secant Lines

6. Secant Lines And Slope

7. Tangent Line

8. Limits

9. Defined vs. Undefined Limits

10. SymPy

11. Properties Of Limits

12. Undefined Limit To Defined Limit

13. Introduction To Derivatives

14. Differentiation

15. Critical Points

16. Extreme Values

17. Power Rule

18. Linearity Of Differentiation

19. Practicing Finding Extreme Values

1. Overview Of Linear Algebra

2. Solving Linear Systems By Elimination

3. Representing Functions In General Form

4. Representing An Augmented Matrix In NumPy

5. Matrix Representation

6. Row Operations

7. Simplifying Matrix To Echelon Form

8. Row Reduced Echelon Form

Vectors

1. From Matrices To Vectors

2. Geometric Intuition Of Vectors

3. Vector Operations

4. Scaling Vectors

5. Vectors In NumPy

6. Dot Product

7. Linear Combination

8. The Matrix Equation

Matrix algebra

1. Basic Matrix Operations

2. Matrix Vector Multiplication

3. Matrix Multiplication

4. Matrix Transpose

5. Identity Matrix

6. Matrix Inverse

7. Solving The Matrix Equation

8. Determinant For Higher Dimensions

9. Matrix Inverse For Higher Dimensions

Solution sets

1. Inconsistent Systems

2. Singular Matrix

3. Possible Solutions For Nonhomogeneous Systems

4. Homogeneous Systems

5. Summary of Linear Systems

Linear regression for machine learning

The linear regression model

1. Instance Based Learning Vs. Model Based Learning

2. Simple Linear Regression

3. Least Squares

4. Using Scikit-Learn To Train And Predict

5. Making Predictions

6. Multiple Linear Regression

Feature selection

1. Missing Values

2. Correlating Feature Columns With Target Column

3. Correlation Matrix Heatmap

4. Train And Test Model

5. Removing Low Variance Features

Gradient descent

1. Single Variable Gradient Descent

2. Derivative Of The Cost Function

3. Multi Parameter Gradient Descent

4. Gradient Of The Cost Function

5. Gradient Descent For Higher Dimensions
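To give a flavour of the single variable case listed above, here is a minimal sketch of gradient descent in plain Python. It assumes a model of the form y = a * x with mean squared error as the cost function; the function name, the learning rate, the step count and the toy data are all illustrative choices and not taken from the course material.

```python
def gradient_descent(xs, ys, learning_rate=0.01, steps=2000):
    """Fit y = a * x by minimising mean squared error over one parameter a."""
    a = 0.0  # initial guess for the slope
    n = len(xs)
    for _ in range(steps):
        # Derivative of the MSE cost with respect to a:
        # d/da (1/n) * sum((a*x - y)^2) = (2/n) * sum(x * (a*x - y))
        grad = (2.0 / n) * sum(x * (a * x - y) for x, y in zip(xs, ys))
        # Step against the gradient to reduce the cost
        a -= learning_rate * grad
    return a


# Toy data (made up) generated from y = 2 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
slope = gradient_descent(xs, ys)
print(round(slope, 3))
```

With this toy data the estimated slope should settle very close to 2. A learning rate that is too large makes the updates overshoot and diverge, which is one reason the rate is usually tuned by experiment.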

ordinary least squares

1. Cost Function

2. Derivative Of The Cost Function

3. Gradient Descent vs. Ordinary Least Squares

processing and transforming features

1. Categorical Features

2. Dummy Coding

3. Transforming Improper Numerical Features

4. Imputing Missing Values

logistic regression

1. Classification

3. Logistic regression

4. Logistic function

5. Training a logistic regression model

6. Plotting probabilities

7. Predict labels

evaluating binary classifiers

1. Accuracy

2. Binary classification outcomes

3. Sensitivity

4. Specificity
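As a rough illustration of the four metrics topics above, the sketch below counts the binary classification outcomes by hand and derives accuracy, sensitivity and specificity from them. The helper names and the labels are made up for this example, and the labels are assumed to be encoded as 1 (positive) and 0 (negative).

```python
def binary_outcomes(actual, predicted):
    """Count true positives, true negatives, false positives, false negatives."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(actual, predicted):
    tp, tn, fp, fn = binary_outcomes(actual, predicted)
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(actual, predicted):
    # True positive rate: share of actual positives that were caught
    tp, _, _, fn = binary_outcomes(actual, predicted)
    return tp / (tp + fn)


def specificity(actual, predicted):
    # True negative rate: share of actual negatives that were rejected
    _, tn, fp, _ = binary_outcomes(actual, predicted)
    return tn / (tn + fp)


# Made-up labels: 1 = positive class, 0 = negative class
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
print(accuracy(actual, predicted),
      sensitivity(actual, predicted),
      specificity(actual, predicted))
```

Accuracy alone can hide a model that misses most positives on imbalanced data, which is why sensitivity and specificity are usually reported alongside it.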

multiclass classification

1. Dummy variables

2. Multiclass classification

3. Training a multiclass logistic regression model

4. Testing the models

overfitting

1. Bias and Variance

2. Bias-variance tradeoff

3. Multivariate models

4. Cross validation

5. Plotting cross-validation error vs. cross-validation variance

Clustering overview

1. Initial clustering

2. Exploring the clusters

3. Plotting out the clusters

4. Finding the most extreme

k means clustering

decision trees

1. Converting Categorical Variables

2. Decision Trees as Flows of Data

3. Splitting Data to Make Predictions

4. Data Set Entropy

5. Information Gain

6. Finding the Best Split

7. Build the Whole Tree
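The entropy and information gain topics above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: rows are dictionaries, the split is made on every distinct value of one categorical column, and the column name and labels below are invented for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, column):
    """Entropy reduction from splitting on each distinct value of column."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[column], []).append(label)
    # Weighted average entropy of the groups produced by the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical data: "outlook" perfectly separates the two classes
rows = [{"outlook": "sunny"}, {"outlook": "sunny"},
        {"outlook": "rainy"}, {"outlook": "rainy"}]
labels = ["no", "no", "yes", "yes"]
print(entropy(labels), information_gain(rows, labels, "outlook"))
```

Finding the best split then amounts to computing the information gain for every candidate column and picking the largest one.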

building a decision tree

1. ID3 Algorithm

2. Determining the Column to Split On

3. Simple Recursive Algorithm

4. Storing the decision Tree

5. Printing Labels for a More Attractive Tree

6. Making Predictions With the Printed Tree

Applying decision trees

1. Using Decision Trees With scikit-learn

2. Splitting the Data into Train and Test Sets

3. Evaluating Error With AUC

4. Computing Error on the Training Set

5. Decision Tree Overfitting

6. Reducing Overfitting With a Shallower Tree

7. Tweaking Parameters, Tree Depth to Adjust AUC

8. Underfitting in Simplistic Trees

9. The Bias-Variance Tradeoff

10. Exploring Decision Tree Variance

11. Pruning Leaves to Prevent Overfitting

12. Knowing When to Use Decision Trees

random forests

1. Combining Model Predictions With Ensembles

2. Why Ensembling Works

3. Variation With Bagging

4. Selecting Random Features

5. Random Subsets in scikit-learn

6. Tweaking Parameters to Increase Accuracy

7. Reducing Overfitting

8. When to Use Random Forests

Representing neural networks

1. Nonlinear Models

2. Computational Graphs

3. A Neural Network That Performs Linear Regression

4. Generating Regression Data

5. Fitting A Linear Regression Neural Network

6. Generating Classification Data

7. Implementing A Neural Network That Performs Classification

nonlinear activation functions

1. ReLU Activation Function

2. Trigonometric Functions

3. Reflecting On The Tangent Function

4. Hyperbolic Tangent Function

Hidden Layers in neural networks

1. Generating Data That Contains Nonlinearity

2. Hidden Layer With A Single Neuron

3. Training A Neural Network Using Scikit-learn

4. Hidden Layer With Multiple Neurons

5. Multiple Hidden Layers

Binary classification in data science

Learning to participate in Kaggle competitions

1. Creating Our First Machine Learning Model

2. Making Predictions and Measuring their Accuracy

3. Using Cross Validation for More Accurate Error Measurement

4. Making Predictions on Unseen Data

5. Creating a Submission File

6. Making Our First Submission to Kaggle

feature preparation, selection and engineering

1. Training a model using relevant features

2. Submitting our Improved Model to Kaggle

3. Engineering a New Feature Using Binning

4. Finding Correlated Features

5. Final Feature Selection using RFECV

6. Training A Model Using our Optimized Columns

7. Submitting our Model to Kaggle

model selection and tuning

1. Model Selection

2. Training a Baseline Model

3. Training a Model using K-Nearest Neighbors

4. Exploring Different K Values

5. Automating Hyperparameter Optimization with Grid Search

6. Submitting K-Nearest Neighbors Predictions to Kaggle

7. Tuning our Random Forests Model with GridSearch

8. Submitting Random Forest Predictions to Kaggle

Using naive bayes for sentiment analysis

k nearest neighbors

1. Understanding the kNN Algorithm

2. Finding Similar Rows With Euclidean Distance

3. Normalizing Columns

4. Finding the Nearest Neighbor

5. Generating Training and Testing Sets

6. Using sklearn

7. Computing Error

Natural language processing

1. Tokenization

2. Preprocessing Tokens to Increase Accuracy

3. Assembling a Matrix of Unique Words

4. Counting Token Occurrences

5. Removing Columns to Increase Accuracy

6. Splitting the Data Into Train and Test Sets

7. Making Predictions With fit()

8. Calculating Prediction Error

1. A Brief History of Big Data

2. The Spark

3. Resilient Distributed Data Sets (RDDs)

4. SparkContext

5. Lazy Evaluation

6. Pipelines

7. Python and Scala

8. ReduceByKey()

9. Filter

10. PySpark Shell

transformations and actions

1. The Map Method

2. Beyond Lambda Functions

3. The FlatMap Method

4. Filter Using a Named Function

5. Actions

spark dataframes

1. Reading in Data

2. Schema

3. Pandas vs Spark DataFrames

4. Row Objects

5. Selecting Columns

6. Filtering Rows

7. Using Column Comparisons as Filters

8. Converting Spark DataFrames to pandas DataFrames

spark sql

1. Register the DataFrame as a Table

2. Querying

3. Filtering

4. Mixing Functionality

5. Multiple tables

6. Joins

7. SQL Functions

1. Installing PostgreSQL

2. Psycopg2

3. Connecting to PostgreSQL from psycopg2

functional programming

1. Comparing Object-Oriented to Functional

2. Understanding Pure Functions

3. The Lambda Expression

4. The Map Function

5. The Filter Function

6. The Reduce Function

7. Rewriting with List Comprehension

8. Writing Function Partials

9. Using Functional Composition
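Several of the functional programming topics above fit in one short sketch using only the Python standard library. The `compose` helper and the `power` function are hypothetical names written for this example; `map`, `filter`, `functools.reduce` and `functools.partial` are the standard tools.

```python
from functools import partial, reduce


def compose(*funcs):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


def power(base, exponent):
    return base ** exponent


numbers = [1, 2, 3, 4, 5, 6]

# map: apply a lambda expression to every element
squares = list(map(lambda n: n * n, numbers))

# filter: keep only the elements that pass the test
even_squares = list(filter(lambda n: n % 2 == 0, squares))

# reduce: fold the list into a single value
total = reduce(lambda acc, n: acc + n, even_squares, 0)

# The same map + filter step rewritten as a list comprehension
even_squares_lc = [n * n for n in numbers if (n * n) % 2 == 0]

# partial: fix one argument of an existing function
square = partial(power, exponent=2)

# composition: add one first, then square the result
increment_then_square = compose(square, lambda n: n + 1)

print(even_squares, total, increment_then_square(3))
```

None of these functions modify `numbers`, so each step can be tested on its own, which is the practical appeal of pure functions.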

pipeline tasks

1. Generators in Python

2. Generator Comprehension

3. Manipulating Generators in Tasks

4. Data Cleaning in Parse Log

5. Write to CSV

6. Chaining Iterators

7. Counting Unique Request Types

8. Task Reusability

Building a pipeline class

1. Inner Functions

2. Function Closures

3. Python Decorators

4. Method Decorators

5. Decorator Arguments

6. Running the Pipeline

Building a multiple dependency pipeline

1. Intro to DAGs

2. The DAG Class

3. Sorting the DAG

4. Finding Number of In Degrees

5. Enhance the Add Method

6. Adding DAG to the Pipeline
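To make the DAG topics above concrete, here is a minimal sketch of a DAG class that sorts its nodes by repeatedly removing those with zero in-degree (Kahn's algorithm). The class layout, the method names and the task names are assumptions made for this example and may differ from the version built in the sessions.

```python
from collections import deque


class DAG:
    """Directed acyclic graph stored as node -> list of downstream nodes."""

    def __init__(self):
        self.graph = {}

    def in_degrees(self):
        """Count the incoming edges of every node."""
        degrees = {node: 0 for node in self.graph}
        for targets in self.graph.values():
            for target in targets:
                degrees[target] += 1
        return degrees

    def sort(self):
        """Return the nodes in dependency order (Kahn's algorithm)."""
        degrees = self.in_degrees()
        queue = deque(n for n, d in degrees.items() if d == 0)
        ordered = []
        while queue:
            node = queue.popleft()
            ordered.append(node)
            for target in self.graph[node]:
                degrees[target] -= 1
                if degrees[target] == 0:
                    queue.append(target)
        return ordered

    def add(self, node, to=None):
        """Add a node, and optionally an edge node -> to, rejecting cycles."""
        self.graph.setdefault(node, [])
        if to is not None:
            self.graph.setdefault(to, [])
            if to not in self.graph[node]:
                self.graph[node].append(to)
                # A cycle leaves some nodes unsorted, so the lengths differ
                if len(self.sort()) != len(self.graph):
                    self.graph[node].remove(to)
                    raise ValueError("edge would create a cycle")


# Hypothetical pipeline tasks
dag = DAG()
dag.add("read_log", "parse")
dag.add("parse", "count")
dag.add("parse", "write_csv")
print(dag.sort())
```

A pipeline built on top of this can run its tasks in the order returned by `sort()`, so every task starts only after the tasks it depends on have finished.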

## How will you benefit from my sessions?

There are already many training courses and university programs that teach data science, and they provide certification too. However, most of these courses have two main issues: they are neither interesting nor practical (https://www.kdnuggets.com/2018/11/get-hired-as-data-scientist.html).

Most training programmes are obsessed with covering so many topics that you get lost in the detail. You spend most of your time learning things that are of no use in real-world scenarios. In this course, I will cover the techniques used in the majority of cases (https://www.datasciencecentral.com/profiles/blogs/40-techniques-used-by-data-scientists). This way, you will acquire useful data science knowledge much faster.

If you want to become a professional data scientist, my data science course in Hyderabad will be the best starting point (http://businessoverbroadway.com/2018/01/15/top-machine-learning-and-data-science-methods-used-at-work/). Once you go through the schedule, you will have gained a strong foundation to crack the interviews. So dive in!

## Commonly asked questions on the course

A common question that you may have is how this course differs from other courses, like the Johns Hopkins data science course or the courses that institutes in Hyderabad offer.

The main difference is that this educational programme is interactive. General educational platforms are not set up for live, hands-on coding with real-time datasets. The usual method of instruction is to set up lessons in video, in text, or in a classroom, where the students learn in a passive manner.

As per a study, students who learned passively were 1.5 times more likely to fail to reach their end goals. Hence, I ensure that you get hands-on coding as part of my lectures.

Another question that you may have is about the duration of the training. It takes about 2 months to complete the data science lessons. Depending on the depth of the content and the pace at which the trainees understand it, the duration might increase or decrease by a week.

## References

Below are some of the posts I referred to while writing this content.

- Research paper showing that students in online learning conditions perform much better than those receiving face to face instruction.
- Employees need to have data know-how in 2019 and beyond
- Essential skills needed to become a data scientist.
- Data cleaning needs to be learnt by anyone aspiring to be a data scientist.
- Usage of calculus for implementing algorithms in data science and machine learning.