GitHub - jlmelville/uwot: An R package implementing the UMAP dimensionality reduction method.
uwot
An R implementation of the Uniform Manifold Approximation and Projection (UMAP) method for dimensionality reduction of McInnes et al. (2018). Also included are the supervised and metric (out-of-sample) learning extensions to the basic method. Translated from the Python implementation.
News
April 21 2024: As ordained by prophecy, version 0.2.2 of uwot has been released to CRAN. RSpectra is back as a main dependency and I thought I had worked out a clever scheme to avoid the failures seen in some installations with the irlba/Matrix interactions. This release fixes the problem on all the systems I have access to (including GitHub Actions CI) but some CRAN checks remain failing. How embarrassing. That said, if you have had issues, it's possible this new release will help you too.
April 18 2024: Version 0.2.1 of uwot has been released to CRAN. Some features to be aware of: RcppHNSW and rnndescent are now supported as optional dependencies. If you install and load them, you can use them as an alternative to RcppAnnoy in the nearest neighbor search, and they should be faster. Also, a new umap2 function has been added, with updated defaults compared to umap. Please see the updated and new articles on HNSW, rnndescent, working with sparse data and umap2. I consider this worthy of moving from 0.1.x to 0.2.x, but in the interests of full disclosure, on-going irlba problems have caused a CRAN check failure, so we might be onto 0.2.2 sooner than I'd like.
Installing
From CRAN
install.packages("uwot")
From github
uwot makes use of C++ code which must be compiled. You may have to carry out a few extra steps before being able to build this package:
Windows: install Rtools and ensure C:\Rtools\bin is on your path.
Mac OS X: using a custom ~/.R/Makevars may cause linking errors. This sort of thing is a potential problem on all platforms but seems to bite Mac owners more. The R for Mac OS X FAQ may be helpful here to work out what you can get away with. To be on the safe side, I would advise building uwot without a custom Makevars.
install.packages("devtools")
devtools::install_github("jlmelville/uwot")
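Before building from source, a quick check along the following lines may help spot the two issues described above. This is only a rough sketch using base R: it assumes the user-level Makevars lives in the default ~/.R/Makevars location, and it treats finding make on the PATH as a loose proxy for a working toolchain (which is not guaranteed). The variable names are illustrative and not part of uwot.

```r
# Look for a user-level Makevars in the default location; the note above
# warns that a custom one may cause linking errors when building uwot.
makevars_path <- path.expand(file.path("~", ".R", "Makevars"))
has_custom_makevars <- file.exists(makevars_path)
if (has_custom_makevars) {
  message("Custom Makevars found at ", makevars_path,
          "; consider moving it aside before building uwot.")
}

# Check whether 'make' is visible on the PATH. On Windows this is only a
# rough hint that the Rtools directory has been added to the PATH.
has_make <- nzchar(unname(Sys.which("make")))
if (!has_make) {
  message("No 'make' found on the PATH; a compiler toolchain may be missing.")
}
```

If either message appears, it is probably worth fixing the environment before running the install commands.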
Example
library(uwot)

# umap2 is a version of the umap() function with better defaults
iris_umap <- umap2(iris)

# but you can still use the umap function (which most of the existing
# documentation does)
iris_umap <- umap(iris)

# Load mnist from somewhere, e.g.
# devtools::install_github("jlmelville/snedata")
# mnist <- snedata::download_mnist()

mnist_umap <- umap(mnist, n_neighbors = 15, min_dist = 0.001, verbose = TRUE)
plot(
  mnist_umap,
  cex = 0.1,
  col = grDevices::rainbow(n = length(levels(mnist$Label)))[as.integer(mnist$Label)] |>
    grDevices::adjustcolor(alpha.f = 0.1),
  main = "R uwot::umap",
  xlab = "",
  ylab = ""
)

# I recommend the following optional packages
# for faster or more flexible nearest neighbor search:
install.packages(c("RcppHNSW", "rnndescent"))
library(RcppHNSW)
library(rnndescent)

# Installing RcppHNSW will allow the use of the usually faster HNSW method:
mnist_umap_hnsw <- umap(mnist, n_neighbors = 15, min_dist = 0.001, nn_method = "hnsw")

# nndescent is also available
mnist_umap_nnd <- umap(mnist, n_neighbors = 15, min_dist = 0.001, nn_method = "nndescent")

# umap2 will choose HNSW by default if available
mnist_umap2 <- umap2(mnist)
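The supervised and out-of-sample extensions mentioned in the introduction are not shown in the examples above. The following is a minimal sketch of how they might be combined, using the built-in iris data so that it runs without downloading anything. The 100/50 train/test split and the variable names are arbitrary choices made for illustration.

```r
library(uwot)

set.seed(42) # make the random train/test split reproducible
train_idx <- sample(nrow(iris), 100)
iris_train <- iris[train_idx, ]
iris_test <- iris[-train_idx, ]

# Supervised UMAP: pass the labels via y. ret_model = TRUE returns a list
# holding the embedding plus what is needed to embed new data later.
iris_model <- umap(iris_train[, 1:4], y = iris_train$Species, ret_model = TRUE)

# Out-of-sample (metric learning) step: project the held-out rows into the
# embedding learned from the training rows.
iris_test_umap <- umap_transform(iris_test[, 1:4], iris_model)

# Training coordinates are in iris_model$embedding; the held-out
# coordinates are a plain matrix with one row per test observation.
dim(iris_model$embedding)
dim(iris_test_umap)
```

With only 150 observations this is a toy example; the articles linked from the documentation cover these features in more depth.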
Documentation
https://jlmelville.github.io/uwot/. For more examples see the get started doc. There are plenty of articles describing various aspects of the package.
License
Citation
If you want to cite the use of uwot, then use the output of running citation("uwot") (you can do this with any R package).
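As a small illustration, and assuming uwot is already installed, the citation can be printed as plain text or converted to a BibTeX entry with toBibtex() from the utils package that ships with R:

```r
# Fetch the citation information bundled with the installed package
uwot_citation <- citation("uwot")

# Plain-text form, suitable for pasting into a document
print(uwot_citation)

# BibTeX form, suitable for a .bib file
toBibtex(uwot_citation)
```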
See Also
- The UMAP reference implementation and publication.
- The UMAP R package (see also its github repo), predates uwot's arrival on CRAN.
- Another R package is umapr, but it is no longer being maintained.
- umappp is a full C++ implementation, and yaumap provides an R wrapper. The batch implementation in umappp is the basis for uwot's attempt at the same.
- uwot uses the RcppProgress package to show a text-based progress bar when verbose = TRUE.