CIS 4523/5523: Knowledge Discovery and Data Mining
Spring 2024
SOFTWARE
General Purpose Data Mining
WEKA
(Source: Java)
MLC++
(Source: C++)
SIPINA
List from KDnuggets
(Various)
List from Data Management Center
(Various)
Classification
C5.0
(Decision tree) - R implementation
OC1
(Oblique decision tree) - Python implementation
Ripper
(Rule-based)
CBA
(Association-rule based)
bayes
(Naive Bayes)
Evidential distance-based
(nearest-neighbor)
PEBLS
(nearest-neighbor)
mlp
(Neural Network)
tiberius
(Neural Network)
svmlight
(Support Vector Machine)
Supervised learning in Python with scikit-learn
(Classification and Regression)
Classification in Matlab
caret
(Classification and Regression Training in R)
kernlab
(Clustering, Classification, Regression and more in R)
mlxtend classifier
(Python library)
Association Analysis
FIMI Repository of Algorithms
Apriori, Eclat, and FP Growth
ARTool
ARMADA
(Association rule mining in Matlab)
PAFI
arules: Mining Association Rules and Frequent Itemsets
(R library)
mlxtend frequent patterns
(Python library)
Cluster Analysis
CLUTO
Clustering in Python with scikit-learn
Cluster Analysis in Matlab
kernlab
(Clustering, Classification, Regression and more in R)
mlxtend cluster
(Python library)
Anomaly Detection
ORCA
(Distance-based)
Anomaly detection in Python with PyOD
Anomaly detection in Matlab
anomaly
(Anomaly detection in R)
Regression
Supervised learning in Python with scikit-learn
(Classification and Regression)
Regression in Matlab
caret
(Classification and Regression Training in R)
kernlab
(Clustering, Classification, Regression and more in R)
mlxtend regressor
(Python library)
Data Preprocessing
Isomap (Dimensionality Reduction - in Matlab)
Dimensionality Reduction in Python with scikit-learn
Data preprocessing in Python with scikit-learn
Dimensionality Reduction and Feature Extraction in Matlab
DataExplorer
(Automate Data Exploration and Treatment in R)
mlxtend
(modules data, feature selection, feature extraction, preprocessing)
DATA SETS
Google Dataset Search
Kaggle Datasets
IDS data sets
Data Sets for Data Mining
UCI Machine Learning Repository
IBM Quest Synthetic Data Generator
KDnuggets
DePaul Center for Data Science datasets list
Data World datasets
Columbia Dataset list