PCA
Neighborhood
UMAP
https://www.youtube.com/watch?v=G9s3cE8TNZo
UMAP: Uniform Manifold Approximation and Projection for dimension reduction
overlap between clusters in the embedding would show up as classification error
compared to UMAP: slower; embedding into more than 2 dimensions is computationally impractical (so limited to visualization); the x and y axes do not correspond to directions of highest variance
faster and more robust than UMAP
low randomness
| Algorithm | Pros | Cons |
|---|---|---|
| Principal component analysis | • Relatively computationally cheap. | • Linear reduction limits information that can be captured; not as discriminably clustered as other algorithms. |
| | • Can save embedding model to then project new data points into the reduced space. | |
| t-Distributed stochastic neighbor embedding | • Produces highly clustered, visually striking embeddings. | • Global structure may be lost in favor of preserving local distances. |
| | • Non-linear reduction, captures local structure well. | • More computationally expensive. |
| | | • Requires setting hyperparameters that influence quality of the embedding. |
| | | • Non-deterministic algorithm. |
| Uniform manifold approximation and projection | • Non-linear reduction that is computationally faster than t-SNE. | • New, less prevalent algorithm. |
| | • User defined parameter for preserving local or global structure. | • Requires setting hyperparameters that influence quality of the embedding. |
| | • Solid theoretical foundations in manifold learning. | • Non-deterministic algorithm. |
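
The PCA rows of the table (cheap, linear, and reusable for new points) can be illustrated with a minimal sketch. It assumes NumPy is available and uses synthetic data. The helper names `fit_pca` and `project` are hypothetical, chosen only for this example and not taken from any library. The sketch fits PCA with an SVD, keeps the mean and the principal axes as the "saved model", and then projects a point that was not seen during fitting. Unlike the t-SNE axes described in the notes above, the PCA axes are ordered by variance.

```python
import numpy as np


def fit_pca(X, n_components=2):
    """Fit PCA via SVD; return the mean, principal axes, and per-axis variance."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variance = s**2 / (X.shape[0] - 1)
    return mean, Vt[:n_components], variance[:n_components]


def project(X_new, mean, components):
    """Project points into the reduced space of an already fitted model."""
    return (X_new - mean) @ components.T


# Synthetic data (illustrative only): 200 points in 5 dimensions,
# with a different spread along each coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

# The "model" is just the mean and the principal axes; it could be stored and reused.
mean, components, variance = fit_pca(X, n_components=2)
embedding = project(X, mean, components)  # shape (200, 2)

# A point unseen during fitting is projected with the stored model, no refit needed.
x_new = rng.normal(size=(1, 5))
new_embedding = project(x_new, mean, components)  # shape (1, 2)
```

Because the projection is a fixed linear map, refitting on the same data gives the same embedding (up to the sign of each axis), which is why non-determinism appears only in the t-SNE and UMAP rows.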