CS 360: Introduction to Data Science and Machine Learning
Course Code: CS 360
Course Name: Introduction to Data Science and Machine Learning
Offered to: UG/PG
Pre-requisites: NIL
Lecture: 3
Tutorial: 0
Practical: 0
Credit: 6
Reference:
1. Pattern Recognition and Machine Learning. C. Bishop.
2. The Elements of Statistical Learning. Hastie, Tibshirani, Friedman.
3. CS229: Machine Learning by Dan Boneh and Andrew Ng.
4. CS391: Machine Learning by Raymond J. Mooney.
5. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data by Hadley Wickham, Garrett Grolemund.
6. The Art of Data Science by Elizabeth Matsui and Roger Peng.
Description: The course introduces students to data science and machine learning.
Data Science:
1. R Basics - Build a foundation in R
a. How to wrangle, analyze, and visualize data; common programming commands, vector operations, etc.
b. Exploratory Data Analysis: Basic data visualization principles and how to apply them using R
2. Using models to explore your data
3. Probability
a. Important concepts in probability theory, including random variables and independence
b. Monte Carlo simulation
c. Expected values and standard errors
d. The Central Limit Theorem
4. Modeling and Inference
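The probability topics listed above (Monte Carlo simulation, expected values and standard errors, the Central Limit Theorem) can be illustrated with a minimal sketch. It is written in Python purely for illustration, although the course uses R for this part. The die-rolling setup, the function name simulate_sample_means, and the values n = 50 and 5000 trials are assumptions chosen for the example, not course material.

```python
import random
import statistics


def simulate_sample_means(n, trials, seed=0):
    """Monte Carlo: draw `trials` samples of `n` fair die rolls and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean([rng.randint(1, 6) for _ in range(n)])
        for _ in range(trials)
    ]


# A single fair die roll has expected value 3.5 and variance 35/12.
EXPECTED_VALUE = 3.5
POPULATION_SD = (35 / 12) ** 0.5

n, trials = 50, 5000  # illustrative values only
means = simulate_sample_means(n, trials)

# Standard error of the mean of n independent rolls: sigma / sqrt(n).
theoretical_se = POPULATION_SD / n ** 0.5
empirical_se = statistics.stdev(means)

# If the sample means are roughly normal (Central Limit Theorem),
# about 95% of them should lie within 1.96 standard errors of 3.5.
within = sum(abs(m - EXPECTED_VALUE) <= 1.96 * theoretical_se for m in means) / trials

print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"empirical SE: {empirical_se:.3f}, theoretical SE: {theoretical_se:.3f}")
print(f"fraction within 1.96 SE: {within:.3f}")
```

Comparing the empirical and theoretical standard errors shows how simulation can check a formula. The last figure gives a rough sense of how close the distribution of sample means is to normal.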
Machine Learning:
1. Basic concepts: supervised learning, unsupervised learning, reinforcement learning. Aspects of developing a learning system: training data, concept representation, function approximation. Discussion Sections: Linear Algebra, Probability, Vectorization
2. Decision tree learning:
a. Representing concepts as decision trees. Recursive induction of decision trees.
b. Picking the best splitting attribute: entropy and information gain.
c. Searching for simple trees and computational complexity.
d. Occam's razor. Overfitting, noisy data, and pruning.
3. Supervised learning:
a. Linear Regression
b. Logistic Regression, Perceptron, Exponential family
c. Generative learning algorithms, Gaussian discriminant analysis, MLE, MAP, Naive Bayes (review)
d. Support vector machines and kernel methods
e. k-Nearest Neighbors (review)
4. Practical machine learning advice:
a. Bias/variance tradeoff and error analysis
b. Learning theory, generalization error and model selection, VC dimension
c. Regularization and model selection
d. Experimental evaluation of learning algorithms, cross-validation, learning curves, statistical hypothesis testing
5. Deep Learning:
a. Neural network (NN) architecture
b. Forward propagation and backpropagation
6. Unsupervised learning:
a. Clustering: k-means, agglomerative clustering
b. The EM Algorithm, Mixture of Gaussians
c. Principal Components Analysis, Dimensionality Reduction
d. Independent components analysis
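The splitting criterion named under decision tree learning above (entropy and information gain) can be sketched as follows. This is a minimal illustration in Python, not the course's own material. The function names entropy, information_gain, and best_attribute, and the small weather-style dataset, are hypothetical and chosen only for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in label entropy obtained by splitting the data on `attribute`."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: should we play outside?
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "overcast", "windy": False},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
    {"outlook": "overcast", "windy": True},
]
labels = ["no", "no", "yes", "yes", "no", "yes"]

for attr in ("outlook", "windy"):
    print(attr, round(information_gain(rows, labels, attr), 3))
print("best split:", best_attribute(rows, labels, ["outlook", "windy"]))
```

A recursive tree learner would apply best_attribute at each node, split the data on the chosen attribute, and repeat on every subset until the labels are pure or no attributes remain.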