UMAP

PCA

Neighborhood

UMAP

https://www.youtube.com/watch?v=G9s3cE8TNZo

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction


Overlap between clusters would be classification error.
Compared to UMAP: slower; embedding into more than 2 dimensions is computationally impractical (so it is limited to visualization); the x and y axes do not correspond to directions of highest variance.
Faster and more robust than UMAP.
Low randomness.
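
The remarks about axes, variance, and randomness are easiest to see against PCA, where the first axis carries the most variance and repeated runs give the same result. Below is a minimal sketch using only NumPy. The helper names `pca_fit` and `pca_transform` and the toy data are made up for illustration, and this is not how any particular library implements PCA.

```python
import numpy as np

def pca_fit(X, n_components=2):
    """Fit PCA via SVD of the centered data (illustrative helper)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:n_components]
    # Variance of the data along each kept direction.
    explained_variance = (s[:n_components] ** 2) / (X.shape[0] - 1)
    return mean, components, explained_variance

def pca_transform(X, mean, components):
    """Project points into the reduced space using a fitted model."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
# Toy data (assumption): 200 points in 5-D with very different spread per feature.
X = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

mean, components, explained_variance = pca_fit(X)
embedding = pca_transform(X, mean, components)

# The fitted mean and components can be reused to project unseen points.
X_new = rng.normal(size=(10, 5))
embedding_new = pca_transform(X_new, mean, components)
```

Here the x axis of `embedding` is the direction of highest variance and the y axis the second highest, which is the property the note says the compared method lacks. Refitting on the same `X` returns the same components, since no random initialization is involved.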

Principal component analysis
	Pros:
	• Relatively computationally cheap.
	• Can save embedding model to then project new data points into the reduced space.
	Cons:
	• Linear reduction limits information that can be captured; not as discriminably clustered as other algorithms.

t-Distributed stochastic neighbor embedding
	Pros:
	• Produces highly clustered, visually striking embeddings.
	• Non-linear reduction, captures local structure well.
	Cons:
	• Global structure may be lost in favor of preserving local distances.
	• More computationally expensive.
	• Requires setting hyperparameters that influence quality of the embedding.
	• Non-deterministic algorithm.

Uniform manifold approximation and projection
	Pros:
	• Non-linear reduction that is computationally faster than t-SNE.
	• User defined parameter for preserving local or global structure.
	• Solid theoretical foundations in manifold learning.
	Cons:
	• New, less prevalent algorithm.
	• Requires setting hyperparameters that influence quality of the embedding.
	• Non-deterministic algorithm.