Seminar

Big Data: Opportunities and Risks - An Application to Time Series Outlier Detection

Room 6.4.30, FCUL, Lisbon

Daniel Peña
Dep. Estadística - Universidad Carlos III de Madrid

The talk will discuss the changes that Big Data is producing in the methods for analyzing data. Some new procedures proposed for analyzing large data sets will be reviewed, along with the risks of applying them to this rich data environment without a statistical model. The importance of cleaning the data before any analysis will be emphasized, and a new procedure for finding outliers in large sets of multivariate time series using dynamic factor models will be proposed. The method is able to find both specific and common outliers and can be applied routinely to clean large data sets of dependent or independent data. The proposed procedure will be illustrated with examples using both simulated and real data.

14:30
CEAUL - Centro de Estatística e Aplicações da Universidade de Lisboa