Santiago Barreda
Assistant Professor, Department of Linguistics, UC Davis

## ldclassify [phonTools]

This function classifies the items in the data matrix (one item per row) by comparing each one to the reference patterns for the candidate categories, given as the rows of the means matrix. The category with the minimum Mahalanobis distance to the observed pattern (i.e., a given row of the data matrix) is selected as the winner. Mahalanobis distances are computed using the covariance matrix provided to the function.
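The decision rule described above can be sketched in a few lines of base R. This is only an illustration of minimum-Mahalanobis-distance classification, not the phonTools source: `classify_min_mahalanobis` is a hypothetical helper name, and the three test items and the identity covariance matrix are made up for the example. It relies on `mahalanobis()` from the `stats` package, which returns squared distances; since squaring preserves order, the winner is unchanged.

```r
## Minimal sketch of minimum-Mahalanobis-distance classification (base R only).
## classify_min_mahalanobis is a hypothetical helper, not part of phonTools.
classify_min_mahalanobis = function (data, means, covariance){
  data = as.matrix (data)
  means = as.matrix (means)
  ## squared Mahalanobis distance from every row of data to each category mean
  distances = sapply (seq_len (nrow (means)), function (i)
    mahalanobis (data, center = means[i,], cov = covariance))
  ## force one row per item and one column per category, even for a single item
  distances = matrix (distances, nrow = nrow (data))
  ## the winner is the category (column) with the smallest distance
  apply (distances, 1, which.min)
}

## two candidate categories and three made-up items to classify
means = rbind (c(0,0), c(2,2))
covariance = diag (2)
items = rbind (c(0.1,-0.2), c(1.9,2.3), c(0.4,0.4))

winners = classify_min_mahalanobis (items, means, covariance)
print (winners)   ## 1 2 1
```

With an identity covariance matrix the rule reduces to picking the nearest mean in Euclidean terms; a non-diagonal covariance matrix tilts the boundary between categories accordingly.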

In the example below, random data points are generated for two groups and then classified; most points are assigned to the group they were generated from. Code used to generate the example:

```
library (phonTools)  ## provides rmvtnorm, ldclassify, and ldboundary

## create two groups with different means and covariance patterns
group1 = rmvtnorm (200, means= c(0,0), k=2, sigma = -.4)
group2 = rmvtnorm (200, means= c(2,2), k=2, sigma = .2)
covariance = (var (group1) + var (group2)) / 2

## combine and predict category
all = rbind (group1, group2)
categories = ldclassify (all, means = rbind (c(0,0),c(2,2)), covariance)

par (mfrow = c(1,2), mar = c(4,4,4,1))
## plot real groups and boundary line between categories.
plot (group1, col = 2, pch = 16, ylim = c(-2,5), xlim = c(-2,5), main = 'Real Categories')
points (group2, col = 4, pch = 16)
ldboundary (c(0,0), c(2,2), covariance, add = TRUE)

## plot classified groups and boundary line between categories.
plot (all, col = c(2,4)[categories], pch = 16, ylim = c(-2,5), xlim = c(-2,5), main = 'Classification')
ldboundary (c(0,0), c(2,2), covariance, add = TRUE)

```
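To put a number on how well this kind of classification performs, the predicted categories can be compared with the known group membership. The sketch below does this in base R only, so it does not call phonTools: `simulate_group` is a hypothetical helper, the seed is arbitrary, and it assumes that the scalar `sigma` values in the example act as correlations between two unit-variance variables. The exact accuracy depends on the random draw.

```r
## Sketch: estimate classification accuracy for simulated data (base R only).
set.seed (1)   ## arbitrary seed, used only for reproducibility

## hypothetical helper: n bivariate normal points with unit variances and
## correlation r, shifted to the given means
simulate_group = function (n, means, r){
  sigma = matrix (c(1, r, r, 1), 2, 2)
  z = matrix (rnorm (n * 2), n, 2)
  sweep (z %*% chol (sigma), 2, means, '+')
}

group1 = simulate_group (200, c(0,0), -.4)
group2 = simulate_group (200, c(2,2), .2)
covariance = (var (group1) + var (group2)) / 2   ## pooled covariance
all_points = rbind (group1, group2)
means = rbind (c(0,0), c(2,2))

## squared Mahalanobis distance to each category mean, then pick the minimum
distances = sapply (1:nrow (means), function (i)
  mahalanobis (all_points, means[i,], covariance))
predicted = apply (distances, 1, which.min)

## true labels: the first 200 rows are group 1, the last 200 are group 2
truth = rep (1:2, each = 200)
accuracy = mean (predicted == truth)

print (table (truth, predicted))   ## confusion matrix
print (accuracy)                   ## proportion of points classified correctly
```

The off-diagonal cells of the confusion matrix count the points that fall on the wrong side of the boundary line drawn in the plots.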