Statistical Software
- yaglm: a Python package for fitting and tuning generalized linear models that supports structured, adaptive and non-convex penalties.
- fclsp: a flexible Python package for penalized generalized linear models that supports structured, adaptive and concave penalties.
- mvmm: a Python package for multi-view mixture modeling.
- ya_pca: a Python package for PCA that focuses on rank selection.
- py_jive: a Python package for dimensionality reduction for multiple data matrices (implements AJIVE).
- ajive: an R package implementing AJIVE.
- DWD: a Python package implementing the DWD classifier with an sklearn-compatible API. Kitware now maintains this package.
- explore: a Python package for automating various exploratory analysis tasks (e.g., multiple testing control, visual diagnostics), particularly those arising from the interpretation of unsupervised learning analyses.
- what_the_cluster: a Python package implementing algorithms to determine the "optimal" number of clusters (e.g., the gap statistic).
- jackstraw: a Python package implementing jackstraw-type methods, which perform statistical tests for dimensionality reduction and other unsupervised algorithms.
- diproperm: a Python package implementing DiProPerm for high-dimensional hypothesis testing with linear classifiers.
Code to reproduce papers
Code to reproduce the results in my papers can be found on my research page and on my GitHub page.
Tutorials and other software
- spurious correlations: a Shiny app that demonstrates spurious correlations using UNC men's and women's basketball data, built for a middle/high-school science fair (this was a collaboration with a number of others).
- Python implementations of a number of optimization algorithms.
- word_embed_tutorial: a tutorial in Python for getting started with word embeddings using a corpus of Supreme Court opinions.
- ds_tutorials: exercises for getting started with data science in R/Python. Built to help undergrads compete in DataFest.