Data Science

What does this course contain?


This course introduces what data science is and what data scientists do. You’ll discover how data science applies across fields and learn how data analysis supports data-driven decisions. You’ll also build machine learning models and deploy a data science solution to a web app.

What do you need?


Basic knowledge of the Python programming language.

Who is this course for?


▣ Software Engineer
▣ Computer Science Engineer
▣ Data Analyst

What are the objectives?


▣ Use principles of statistics and probability to design and execute A/B tests and recommendation engines that help businesses make data-driven decisions.
▣ Build machine learning models and make predictions.
▣ Deploy a data science solution to a web app.
▣ Manipulate and analyse distributed datasets using Apache Spark.
▣ Communicate results effectively to stakeholders.

Course Outline: Basic level Data Scientist program


Module 1:

Introduction to Data Science
The Data Science Process
Communicating to Stakeholders

Module 2:

Linear Algebra
Vectors
Linear Combination
Linear Transformation and Matrices

Module 3:

Practical Statistics
Data Types
Measures of centre (mean, median, mode)
Standard Deviation, Variance, Outliers
Probability
Binomial Distribution
Conditional Probability
Bayes' Rule
Normal Distribution theory
Sampling distributions and the Central Limit Theorem
Confidence Intervals
Hypothesis Testing
Type I and Type II errors
P-values
Null Hypothesis, Alternative Hypothesis

Module 4:

Data Engineering
ETL pipelines
Extract – CSV, JSON, XML, SQL databases
Transform – combining, cleaning, encoding, missing data, duplicate data
Dummy data, outlier data, scaling data
Feature Engineering
Load

Module 5:

Database Programming

Module 6:

Data Preparation, Data Wrangling

Module 7:

Data Visualization
Data Visualization in Data Analysis
Design of Visualizations
Univariate Exploration of Data
Bar Charts
Pie Charts
Histograms
Bivariate Exploration of Data
Scatterplots and Correlation
Overplotting, Transparency, and Jitter
Heat Maps
Violin Plots
Box Plots
Clustered Bar Charts
Faceting
Line Plots
Swarm Plots
Multivariate Exploration of Data
Feature Engineering
Explanatory Visualizations
Data Visualization in Data Analysis – Case Study

Advanced level Data Scientist program


Module 1:

Time Series Analysis & Forecasting

Module 2:

Natural Language Processing
Tokenization
Stop Words
Part-of-Speech Tagging
Named Entity Recognition
Stemming and Lemmatization
Feature extraction
Bag of words
TF-IDF
One-Hot Encoding
Word Embeddings
Modeling

Module 3:

Deep Learning
Introduction to Neural Networks
Implementing Gradient Descent
Training Neural Networks
Keras
Deep Learning with PyTorch
Image classifier project

Module 4:

Deploying a Model on the Web / Dashboard
Front-End, HTML, Flask
Deploying the model

Module 5:

Experimental Design and Recommendations
Intro to Experiment Design and Recommendation Engines
Experiment Design & A/B Testing
Concepts in Experiment Design
Types of Experiments
Types of Sampling
Measuring Outcomes
Creating Metrics
Controlling Variables
Checking Validity
Checking Bias
Ethics in Experimentation
A SMART Mnemonic for Experiment Design
Statistical Considerations in Testing
A/B Testing Case Study

Module 6:

Recommendation Engines