Model-based clustering of functional data via mixtures of t distributions
functional data analysis, model-based clustering, multivariate t distributions, EM algorithm, multivariate functional principal components analysis
We propose a procedure, called T-funHDDC, for clustering multivariate functional data with outliers. It extends the functional high dimensional data clustering (funHDDC) method (Schmutz et al., 2020) by considering a mixture of multivariate t distributions. We define a family of latent mixture models following the approach used for the parsimonious models considered in funHDDC, in which the degrees of freedom of the multivariate t distributions may or may not be constrained to be equal across the mixture components. The parameters of these models are estimated using an expectation-maximization (EM) algorithm. In addition to proposing the T-funHDDC method, we add a family of parsimonious models to C-funHDDC, which is an alternative method for clustering multivariate functional data with outliers based on a mixture of contaminated normal distributions (Amovin-Assagba et al., 2022). We compare T-funHDDC, C-funHDDC, and other existing methods on simulated functional data with outliers and on real-world data. T-funHDDC outperforms funHDDC when applied to functional data with outliers, and its good performance makes it an alternative to C-funHDDC. We also apply the T-funHDDC method to the analysis of traffic flow in Edmonton, Canada.
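To illustrate why a mixture of multivariate t distributions tolerates outliers, the following is a minimal sketch of the E-step of the textbook EM algorithm for a finite-dimensional t mixture. It is not the T-funHDDC update, which the abstract does not spell out; the function names `t_logpdf` and `e_step` and all argument names are hypothetical, and the inputs are assumed to be ordinary vectors rather than functional data. Alongside the usual cluster responsibilities, the E-step yields a weight u = (nu + p) / (nu + delta) for each observation and component, where delta is the squared Mahalanobis distance, so distant points are down-weighted in the subsequent mean and scale updates.

```python
import math
import numpy as np


def t_logpdf(X, mu, Sigma, nu):
    """Log-density of a p-variate t distribution at each row of X.

    Also returns the squared Mahalanobis distances, which the E-step reuses.
    """
    p = X.shape[1]
    L = np.linalg.cholesky(Sigma)            # Sigma = L @ L.T
    z = np.linalg.solve(L, (X - mu).T)       # whitened residuals, shape (p, n)
    delta = np.sum(z ** 2, axis=0)           # squared Mahalanobis distances
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    const = (math.lgamma((nu + p) / 2) - math.lgamma(nu / 2)
             - 0.5 * p * math.log(nu * math.pi) - 0.5 * logdet)
    return const - 0.5 * (nu + p) * np.log1p(delta / nu), delta


def e_step(X, weights, mus, Sigmas, nus):
    """E-step for a K-component t mixture (illustrative, not T-funHDDC).

    Returns r (n x K responsibilities) and u (n x K outlier weights).
    """
    n, p = X.shape
    K = len(weights)
    log_r = np.empty((n, K))
    u = np.empty((n, K))
    for k in range(K):
        logpdf, delta = t_logpdf(X, mus[k], Sigmas[k], nus[k])
        log_r[:, k] = np.log(weights[k]) + logpdf
        u[:, k] = (nus[k] + p) / (nus[k] + delta)   # small for outlying points
    # Normalise in log space for numerical stability.
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    return r, u
```

In this sketch a point lying exactly at a component's centre receives the largest possible weight, (nu + p) / nu, while a gross outlier receives a weight close to zero for every component. Letting `nus` hold one shared value or one value per component mirrors the constrained and unconstrained degrees-of-freedom options mentioned in the abstract.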
Anton, C., & Smith, I. (2023). Model-based clustering of functional data via mixtures of t distributions. Advances in Data Analysis and Classification. https://doi.org/10.1007/s11634-023-00542-w
All Rights Reserved