# PCA - Principal Component Analysis

**Problem**: you have a multidimensional set of data (such as a set of hidden unit activations) and you want to see which points are closest to others.

PCA finds a new set of orthogonal axes (the principal components), ordered from the direction of greatest variance to the direction of least variance. Projecting the data onto the first few components gives a low-dimensional view that preserves as much of the spread as possible.

from sklearn import datasets
iris = datasets.load_iris()

iris.get("feature_names")

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

https://rpubs.com/sarthakdasadia11/iris

iris.data

iris.target
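
The raw arrays are easier to reason about with a quick summary. The following is a minimal sketch, assuming the standard Iris dataset bundled with scikit-learn; the names `n_samples`, `n_features`, and `class_counts` are introduced here for illustration only.

```python
from collections import Counter

from sklearn import datasets

iris = datasets.load_iris()

# Shape of the feature matrix: (number of samples, number of features)
n_samples, n_features = iris.data.shape
print(n_samples, n_features)

# Map each integer label in iris.target to its species name and count them
class_counts = Counter(str(iris.target_names[label]) for label in iris.target)
print(dict(class_counts))
```

Each row of `iris.data` is one flower described by the four measurements listed in `feature_names`, and `iris.target` holds the species label for the same row.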

%matplotlib notebook

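from sklearn.decomposition import PCA

# Keep only the two directions of greatest variance
pca = PCA(n_components=2)

# Learn the principal components from the four Iris features
pca.fit(iris.data)

# Project every sample onto the two components; X has shape (150, 2)
X = pca.transform(iris.data)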
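
To check how much of the spread the two retained components actually capture, one option is to inspect the fitted model's `explained_variance_ratio_` attribute. The sketch below is self-contained and uses `fit_transform` as a shorthand for the separate `fit` and `transform` calls; `projected_variance` is a name introduced here for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA

iris = datasets.load_iris()
pca = PCA(n_components=2)

# Fit the model and project the data in one step
X = pca.fit_transform(iris.data)

# Each sample is now described by 2 coordinates instead of 4
print(X.shape)

# Fraction of the total variance captured by each component, largest first
print(pca.explained_variance_ratio_)

# The sample variance of each projected column matches pca.explained_variance_
projected_variance = X.var(axis=0, ddof=1)
print(np.allclose(projected_variance, pca.explained_variance_))
```

The ratios are sorted in decreasing order, which is the "greatest variance to least variance" ordering described at the top of the notebook. If the first two ratios sum to a value close to 1, a 2D scatter plot of `X` should be a reasonable stand-in for the original four-dimensional data.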