lolR

Linear Optimal Low-Rank Projection

Supervised learning techniques designed for settings where the dimensionality exceeds the sample size tend to overfit as the dimensionality of the data increases. To address this high-dimensionality, low-sample-size (HDLSS) setting, we learn a lower-dimensional representation of the data before training a classifier. That is, we project the data into a space of more manageable dimensionality, where standard classification or clustering techniques can be applied with fewer dimensions to overfit. Many previous works have focused on reducing dimensionality in the unsupervised case, yet few have devised dimensionality reduction techniques for the supervised HDLSS regime that leverage the labels associated with the data. In this package and the associated manuscript, Vogelstein et al. (2017) <arXiv:1709.01233>, we provide several methods for feature extraction, some utilizing labels and some not, along with easily extensible utilities that simplify cross-validation to identify the best feature extraction method. We also include a series of adaptable benchmark simulations to serve as a standard for future investigations into supervised HDLSS. Finally, we provide a comprehensive comparison of the included algorithms across a range of benchmark simulations and real data applications.
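The project-then-classify idea described above can be sketched in a few lines of R. This is only an illustration written with base R and MASS, not the lolR implementation and not its API: the simulated data, the sizes n, d and r, and every variable name are hypothetical. It shows one way labels can enter the projection, by combining the difference of the class means with the leading principal directions of the class-centered data, and then fits an ordinary classifier in the reduced space.

```r
# Sketch only: base R plus MASS, not the lolR implementation.
library(MASS)  # lda() for the downstream classifier

set.seed(1)
n <- 40    # samples (hypothetical)
d <- 200   # features, so d > n (the HDLSS setting)
r <- 3     # target dimension (hypothetical)

# Two classes whose means differ in the first 5 coordinates
Y <- rep(c(1, 2), each = n / 2)
X <- matrix(rnorm(n * d), nrow = n, ncol = d)
X[Y == 2, 1:5] <- X[Y == 2, 1:5] + 2

# Label-aware direction: difference of the class means
mu1 <- colMeans(X[Y == 1, , drop = FALSE])
mu2 <- colMeans(X[Y == 2, , drop = FALSE])
delta <- mu2 - mu1

# Label-aware centering: subtract each sample's own class mean
Xc <- X
Xc[Y == 1, ] <- sweep(X[Y == 1, , drop = FALSE], 2, mu1)
Xc[Y == 2, ] <- sweep(X[Y == 2, , drop = FALSE], 2, mu2)

# Top r - 1 right singular vectors of the class-centered data
V <- svd(Xc, nu = 0, nv = r - 1)$v

# Orthonormalise [delta, V] to get a d x r projection matrix
A <- qr.Q(qr(cbind(delta, V)))

# Project to r dimensions, then fit a standard classifier there
Xr <- X %*% A
fit <- lda(Xr, grouping = factor(Y))
pred <- predict(fit, Xr)$class
train_acc <- mean(as.character(pred) == as.character(Y))
```

Note that train_acc is measured on the same data used to build the projection, so it is optimistic. An honest estimate would refit both the projection and the classifier inside each cross-validation fold.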

Total downloads: 762
Downloads last month: 108
Downloads last week: 24
Average downloads per day: 4

Description file content

Package: lolR
Type: Package
Title: Linear Optimal Low-Rank Projection
Version: 2.0
Date: 2018-04-12
Maintainer: Eric Bridgeford
Description: Supervised learning techniques designed for the situation when the dimensionality exceeds the sample size have a tendency to overfit as the dimensionality of the data increases. To remedy this High dimensionality; low sample size (HDLSS) situation, we attempt to learn a lower-dimensional representation of the data before learning a classifier. That is, we project the data to a situation where the dimensionality is more manageable, and then are able to better apply standard classification or clustering techniques since we will have fewer dimensions to overfit. A number of previous works have focused on how to strategically reduce dimensionality in the unsupervised case, yet in the supervised HDLSS regime, few works have attempted to devise dimensionality reduction techniques that leverage the labels associated with the data. In this package and the associated manuscript Vogelstein et al. (2017) <arXiv:1709.01233>, we provide several methods for feature extraction, some utilizing labels and some not, along with easily extensible utilities to simplify cross-validative efforts to identify the best feature extraction method. Additionally, we include a series of adaptable benchmark simulations to serve as a standard for future investigative efforts into supervised HDLSS. Finally, we produce a comprehensive comparison of the included algorithms across a range of benchmark simulations and real data applications.
Depends: R (>= 3.4.0)
License: GPL-2
URL: https://github.com/neurodata/lol
Imports: ggplot2, abind, MASS, irlba, pls
Encoding: UTF-8
LazyData: true
VignetteBuilder: knitr
RoxygenNote: 6.0.1
Suggests: knitr, rmarkdown, parallel, randomForest, latex2exp, testthat, covr
NeedsCompilation: no
Packaged: 2018-04-13 14:39:01 UTC; eric
Author: Eric Bridgeford [aut, cre], Minh Tang [ctb], Jason Yim [ctb], Joshua Vogelstein [ths]
Repository: CRAN
Date/Publication: 2018-04-13 15:14:54 UTC

install.packages('lolR')

https://github.com/neurodata/lol
