STAT-5525: Data Analytics

Description: 5525: Basic techniques in data analytics, including the preparation and manipulation of data for analysis and the creation of data files from multiple, dissimilar sources. The data mining and knowledge discovery process. Overview of data mining algorithms in classification, clustering, association analysis, probabilistic modeling, and matrix decompositions. Detailed study of classification methods, including tree-based methods, Bayesian methods, logistic regression, ensemble methods such as bagging and boosting, neural networks, support vector methods, and Bayesian networks. Detailed study of clustering methods, including k-means, hierarchical, and self-organizing map methods. Prerequisite: Graduate standing required.

5526: Techniques in unsupervised and visualized learning in high-dimensional spaces. Theoretical, probabilistic, and applied aspects of data analytics. Methods include generalized linear models in high-dimensional spaces, regularization, lasso and related methods, principal component regression (PCA), tree methods, and random forests. Clustering methods, including k-means, hierarchical clustering, biclustering, and model-based clustering, will be thoroughly examined. Distance-based learning methods include multidimensional scaling, the self-organizing map, graphical/network models, and Isomap. Supervised learning will consist of discriminant analysis, supervised PCA, support vector machines, and kernel methods.

Pathways: N/A

Course Hours: 3 credits

Prerequisites: N/A

Required By: ADS-5224, STAT-5234

Corequisites: N/A

Crosslist: N/A

Repeatability: N/A

Sections Taught: 24

Average GPA: 3.78 (A)

Strict A Rate (No A-): 63.78%

Average Withdrawal Rate: 0.11%


Instructor	Year	A	B	C	D	F	W	GPA	Sections
Bodicherla A Prakash	2018	47.3%	46.3%	6.4%	0.0%	0.0%	0.0%	3.38	2
Robert B Gramacy	2019	70.5%	29.3%	0.0%	0.0%	0.0%	0.0%	3.67	1
Anuj Karpatne	2021	66.6%	31.7%	0.0%	0.0%	0.9%	0.9%	3.60	3
Chandan K Reddy	2020	85.8%	14.2%	0.0%	0.0%	0.0%	0.0%	3.73	4
Scott C Leman	2017	100.0%	0.0%	0.0%	0.0%	0.0%	0.0%	4.00	3
Thomas H Woteki	2020	83.4%	0.0%	16.7%	0.0%	0.0%	0.0%	3.62	1
Jonathan K Alt	2020	83.3%	16.7%	0.0%	0.0%	0.0%	0.0%	3.88	1
Oliver Schabenberger	2023	100.0%	0.0%	0.0%	0.0%	0.0%	0.0%	3.95	1
Jyotishka Datta	2022	94.4%	5.6%	0.0%	0.0%	0.0%	0.0%	3.94	3
Xinwei Deng	2023	100.0%	0.0%	0.0%	0.0%	0.0%	0.0%	3.94	2
Chang Tien Lu	2012	66.6%	33.3%	0.0%	0.0%	0.0%	0.0%	3.57	1
Martin Skarzynski	2023	100.0%	0.0%	0.0%	0.0%	0.0%	0.0%	3.98	2

Grade Distribution Over Time