# Principal component analysis

## Revision as of 09:51, 19 April 2016

In general, Principal Component Analysis (PCA) analyzes a data set to find a set of coordinates that capture its most representative features. The term *PCA classification* is often used loosely: PCA is not itself a classification method. The classification is performed on the features extracted through PCA.

In *Dynamo*, PCA is the process of finding a reduced set of "eigenvolumes" that allow each particle in the data set to be represented approximately as a combination of these eigenvolumes. With this representation, a generic particle is described by the contribution of each eigenvolume to it, i.e., by a set of "eigencomponents", normally not many more than 20.
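
In symbols, using notation introduced here only for illustration (it is not taken from the *Dynamo* documentation): let $x_i$ be particle $i$ written as a vector of voxel values, $v_k$ the $k$-th eigenvolume, $c_{ik}$ the corresponding eigencomponent, $N$ the number of particles and $K$ the number of eigenvolumes kept. The approximation then reads:

$$
x_i \approx \sum_{k=1}^{K} c_{ik}\, v_k, \qquad i = 1, \dots, N
$$

Each particle is thus summarized by the $K$ scalars $(c_{i1}, \dots, c_{iK})$.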

Once the particles are represented by small sets of scalars, they can be classified with standard methods like k-means.
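
As a minimal sketch of this last idea, the following plain-Python k-means groups particles by their eigencomponents. It is an illustration only and not *Dynamo*'s implementation: the function names (`sq_dist`, `kmeans`), the farthest-point initialization and the toy eigencomponent values are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two eigencomponent vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(iterations):
        # Assign every point to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
        # Move every center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical eigencomponents: six particles, two components each.
eigencomponents = [[0.90, 0.10], [1.00, 0.00], [0.95, 0.05],
                   [-1.00, 0.10], [-0.90, 0.00], [-0.95, -0.05]]
labels, centers = kmeans(eigencomponents, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because each particle is reduced to a handful of scalars, the clustering itself is cheap compared with the computation of the eigencomponents.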

## Operative steps

Operatively, completing a PCA-based classification requires the following steps:

- Selecting the input: a data folder, a table and a mask.
- Computing a cross-correlation matrix. This is typically the most time-consuming part, as it involves comparing every particle in the data folder against every other particle.
- Computing the eigenvalues, eigenvolumes and eigencomponents.
- Using the eigencomponents to create a classification.
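
The computational steps above can be sketched in plain Python. This is only an illustration under several assumptions, not *Dynamo*'s implementation: particles are treated as flat lists of voxel values that are already aligned, the cross-correlation is the normalized correlation inside the mask with no shift or rotation search, and all function names (`normalize`, `dot`, `cc_matrix`, `top_eigenpairs`, `pca_from_cc`) are hypothetical. The eigenvectors of the cross-correlation matrix are obtained by power iteration with deflation to keep the sketch free of dependencies.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(particle, mask):
    """Keep the voxels inside the mask, subtract their mean, scale to unit norm."""
    values = [v for v, inside in zip(particle, mask) if inside]
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = math.sqrt(dot(centered, centered))
    return [c / norm for c in centered]

def cc_matrix(particles, mask):
    """Cross-correlation matrix: every particle against every other particle."""
    vectors = [normalize(p, mask) for p in particles]
    n = len(vectors)
    cc = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            cc[i][j] = cc[j][i] = dot(vectors[i], vectors[j])
    return vectors, cc

def top_eigenpairs(matrix, count, iterations=200):
    """Leading eigenpairs of a symmetric positive semidefinite matrix."""
    n = len(matrix)
    work = [row[:] for row in matrix]
    rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    pairs = []
    for _pair in range(count):
        vec = [rng.uniform(-1.0, 1.0) for _i in range(n)]
        value = 0.0
        for _step in range(iterations):
            product = [dot(row, vec) for row in work]
            length = math.sqrt(dot(product, product))
            if length < 1e-12:
                break
            vec = [p / length for p in product]
            value = dot(vec, [dot(row, vec) for row in work])  # Rayleigh quotient
        if value < 1e-12:
            break  # remaining eigenvalues are numerically zero
        pairs.append((value, vec))
        # Deflation: remove the eigenpair just found before looking for the next.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return pairs

def pca_from_cc(vectors, cc, count):
    """Eigenvalues, eigenvolumes and eigencomponents from the cc matrix."""
    eigenvalues, eigenvolumes = [], []
    for value, vec in top_eigenpairs(cc, count):
        scale = math.sqrt(value)
        # Eigenvolume: unit-norm combination of particles weighted by the eigenvector.
        volume = [sum(w * x[d] for w, x in zip(vec, vectors)) / scale
                  for d in range(len(vectors[0]))]
        eigenvalues.append(value)
        eigenvolumes.append(volume)
    # Eigencomponents: projection of each particle onto each eigenvolume.
    components = [[dot(x, volume) for volume in eigenvolumes] for x in vectors]
    return eigenvalues, eigenvolumes, components

# Four toy "particles" of five voxels each; the last voxel lies outside the mask.
mask = [1, 1, 1, 1, 0]
particles = [[1, 2, 3, 4, 100], [4, 3, 2, 1, -50], [1, 2, 3, 5, 0], [1, 3, 2, 4, 7]]
vectors, cc = cc_matrix(particles, mask)
eigenvalues, eigenvolumes, eigencomponents = pca_from_cc(vectors, cc, count=3)
```

Each row of `eigencomponents` is the short vector of scalars describing one particle. These rows are what a clustering method would take as input in the final step.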

## GUIs for PCA classification

There are two GUIs available to cover the pipeline:

- `dynamo_ccmatrix_project_manager`

## Tutorials

Some PDF tutorials are available inside the *Dynamo* distribution:

- General introduction to PCA-based classification.
- Command line classification.