Andrea Cappozzo

Assistant Professor of Statistics
Doctor Europaeus

University of Milan

Statistician, currently Assistant Professor (RTDB) at the Department of Economics, Management, and Quantitative Methods of the University of Milan. My methodological interests lie in applied statistics and statistical learning, with a particular focus on mixture modeling. I am a passionate R user and a Tidyverse fan.

Breaking news

  • New Preprint! Model-based clustering for covariance matrices via penalized Wishart mixture models (joint work with Casa, A.)

Interests

  • Mixture models
  • Robust statistics
  • Model-based clustering and classification
  • Variable selection
  • Statistical learning

Education

  • PhD in Statistics and Mathematical Finance, 2020

    University of Milano-Bicocca

  • MSc in Statistical Sciences (with honors), 2015

    University of Padua

  • BSc in Statistics and Management (with honors), 2012

    University of Padua

Experience

University
Assistant Professor, Department of Economics, Management, and Quantitative Methods (2024/02-Ongoing)
University of Milan

Assistant Professor, Department of Mathematics (2021/04-2024/02)
Politecnico di Milano

Teaching Assistant, BSc courses in Statistics and Statistical Methods (2019/02-2021/07)
University of Milano-Bicocca

Postdoctoral research fellow, Department of Statistics and Quantitative Methods (2020/04-2021/03)
University of Milano-Bicocca

Teaching Assistant, BSc course in Statistics (2017/09-2018/02)
Bocconi University

Industry
Freelance data scientist (2020/01-2020/04)
DCG, Milan

Business analyst and planner (2015/09-2016/09)
HP Inc, Barcelona

Visiting periods
Visiting PhD Student (2018/03-2019/03)
Insight Centre for Data Analytics, University College Dublin

Exchange Semester (2014/01-2014/06)
School of Economics and Management, Tilburg University

Publications

Articles in refereed journals

  1. Benetti, L., Boniardi, E., Chiani, L., Ghirri, J., Mastropietro, M., Cappozzo, A., & Denti, F. (2023)
    Variational Inference for Semiparametric Bayesian Novelty Detection in Large Datasets
    Advances in Data Analysis and Classification (online first)
    link | arXiv | code

  2. Cappozzo, A., Ieva, F., Fiorito, G. (2023)
    A general framework for penalized mixed-effects multitask learning with applications on DNA methylation surrogate biomarkers creation
    Annals of Applied Statistics 17(4), 3257-3282
    link | arXiv | code

  3. Cappozzo, A., García Escudero, L.A., Greselin, F., Mayo-Iscar, A. (2023)
    Graphical and computational tools to guide parameter choice for the cluster weighted robust model
    Journal of Computational and Graphical Statistics 32(3), 1195–1214
    link | code

  4. Casa, A., Cappozzo, A., Fop, M. (2022)
    Group-wise shrinkage estimation in penalized model-based clustering
    Journal of Classification, 39(3), 648–674
    link | arXiv | code

  5. Cappozzo, A., McCrory, C., Robinson, O. et al. (2022)
    A blood DNA methylation biomarker for predicting short-term risk of cardiovascular events
    Clinical Epigenetics 14, 121
    link | code

  6. Cappozzo, A., García Escudero, L.A., Greselin, F., Mayo-Iscar, A. (2021)
    Parameter Choice, Stability and Validity for Robust Cluster Weighted Modeling
    Stats 2021, 4(3)
    link | code

  7. Denti, F., Cappozzo, A., Greselin, F. (2021)
    A Two-Stage Bayesian Nonparametric Model for Novelty Detection with Robust Prior Information
    Statistics and Computing, 32, 18
    link | arXiv | code

  8. Cappozzo, A., Duponchel, L., Greselin, F., Murphy, T.B. (2021)
    Robust variable selection in the framework of classification with label noise and outliers: applications to spectroscopic data in agri-food
    Analytica Chimica Acta, 1153, 338245
    link | arXiv | code | cover

  9. Cappozzo, A., Greselin, F., Murphy, T.B. (2021)
    Robust variable selection for model-based learning in presence of adulteration
    Computational Statistics & Data Analysis, 158, 107186
    link | arXiv | code

  10. Cappozzo, A., Greselin, F., Murphy, T.B. (2020)
    Anomaly and Novelty detection for robust semi-supervised learning
    Statistics and Computing, 30, 1545–1571
    link | arXiv | code

  11. Cappozzo, A., Greselin, F., Murphy, T.B. (2020)
    A robust approach to model-based classification based on trimming and constraints
    Advances in Data Analysis and Classification, 14(2), 327-354
    link | arXiv | code

Submitted and working papers

  1. Cappozzo, A., Casa, A. (2024+)
    Model-based clustering for covariance matrices via penalized Wishart mixture models
    Submitted
    arXiv | code

  2. Caldera, L., Masci, C., Cappozzo, A., Forlani, M., Antonelli, B., Leoni, O., Ieva, F. (2024+)
    Uncover mortality patterns and hospital effects in COVID-19 heart failure patients: a novel Multilevel logistic cluster-weighted modeling approach
    Submitted
    arXiv | code

  3. Cappozzo, A., Casa, A., Fop, M. (2024+)
    Sparse model-based clustering of three-way data via lasso-type penalties
    Revision submitted
    arXiv | code

Awards

Young Researcher Paper Award (2023/09)
CLADAG 2023, 14th Scientific Meeting of the Classification and Data Analysis Group
Salerno, Italy

Best poster presentation (2018/09)
MBC2 Workshop on Models and Learning for Clustering and Classification
Catania, Italy

Member of the third best team in terms of algorithm predictive accuracy (2017/09)
Young CLADAG - Data science competition
Politecnico di Milano, Italy

Member of one of the four winning teams (2017/06)
Stats Under the Stars 3 - Data science competition
Università degli Studi di Firenze, Italy

Successful participant in the HP Business Academy Assessment Week (2015/07)
Hewlett Packard Española and Fundación Universidad-Empresa
Barcelona, Spain

Outreach

Member of the Italian Statistical Society, its young group y-SIS, and the Institute of Mathematical Statistics

Leaf at the Mathematics Genealogy Project. Are you a scholar in math or a related field? You should consider adding your dissertation, and let the branches grow!
