News and events


Friday, April 4th, 2025 

Bruno ARPINO
Full Professor - University of Padova.

Ties and Older People's Well-Being: Evidence and Methodological Challenges

Abstract:

Demographic transformations such as population aging, declining fertility, increasing migration, and shifts in marriage and partnership patterns are reshaping family structures and intergenerational relationships. As a result, older adults are experiencing new family dynamics, including rising kinlessness and evolving roles within multigenerational networks. Understanding how these changes impact the well-being of older people is crucial for both researchers and policymakers. 
This seminar will explore the complex relationship between family ties and older adults’ well-being, drawing on empirical studies of changing family structures and relationships. Particular attention will be given to the role of grandparental childcare—an increasingly significant aspect of intergenerational support. I will discuss the potential benefits and challenges of this caregiving role, including its implications for grandparents' health and well-being. 
A key methodological challenge in this area of research is establishing causal relationships between grandparental childcare and older people’s well-being. In the final part of the seminar, I will review the most commonly used statistical methods for causal inference and present the results of a small simulation study comparing their performance. The seminar will conclude with a discussion on future research directions, including the potential of machine learning techniques for advancing this field and the bidirectional link between family relationships and older adults’ engagement with digital technologies.

 

 

Tuesday, March 11th, 2025

Andrea Cappozzo

Associate Professor - Università Cattolica del Sacro Cuore.

Model-Based Clustering of Right-Censored Survival Data with Frailties and Random Covariates

Abstract:

We present a novel approach for clustering multilevel survival data that accounts for baseline heterogeneity and the local distributions of explanatory variables. The proposed method identifies patient clusters with distinct survival patterns and evaluates the hierarchical structure’s impact on survival within each cluster. A stochastic EM algorithm, specifically adapted for right-censored survival data, is employed to maximize the objective function. We demonstrate the effectiveness of the proposed methodology by analyzing survival times in COVID-19 patients with heart failure, successfully revealing latent patient profiles, assessing hospital-level effects within clusters, and quantifying the influence of respiratory diseases on survival.

 

 

Thursday, February 20th, 2025
Luis CARVALHO
Associate Professor - Department of Mathematics and Statistics - Boston University

Deviance Matrix Factorization

Abstract:
The singular value decomposition can be used to find a low-rank representation of a matrix under the Frobenius norm (entrywise square-error loss) and, for this reason, it enjoys a ubiquitous presence in many areas, including in Statistics with principal component and factor analyses. In this talk, we discuss a generalization of this matrix factorization, the deviance matrix factorization (DMF), that assumes broader deviance losses and thus allows for more meaningful and representative decompositions under different data domains and variance assumptions. We provide an efficient algorithm for the DMF and discuss using entrywise weights to represent missing data. We propose two tests to identify suitable decomposition ranks and data distributions and prove a few theoretical guarantees such as consistency. To showcase the practical performance of the proposed decomposition, we present a number of case studies in genetics, network analysis, and image classification. Finally, we offer a few directions for future work. This is joint work with Liang Wang.
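As background for the first sentence of the abstract, the low-rank property of the singular value decomposition can be written out as follows. This is the standard Eckart-Young-Mirsky statement, not the DMF itself; the symbols $X$, $r$, $k$, $\sigma_i$, $u_i$, $v_i$ are notation assumed here and do not come from the abstract.

```latex
X = \sum_{i=1}^{r} \sigma_i u_i v_i^\top, \qquad \sigma_1 \ge \dots \ge \sigma_r > 0,
\qquad
\hat{X}_k = \sum_{i=1}^{k} \sigma_i u_i v_i^\top
\in \arg\min_{\operatorname{rank}(M) \le k} \lVert X - M \rVert_F^2,
\qquad
\lVert X - \hat{X}_k \rVert_F^2 = \sum_{i=k+1}^{r} \sigma_i^2 .
```

Here $r$ is the rank of $X$ and $k \le r$ is the target rank; the squared Frobenius norm is the entrywise square-error loss that the DMF, per the abstract, replaces with broader deviance losses.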

 

 

Thursday, January 16th, 2025

Raya MUTTARAK

Full Professor - Dipartimento di Scienze Statistiche "Paolo Fortunati" - Università di Bologna

Exploring data and methodological approaches for assessing climate change impacts on population dynamics

Abstract:

The extreme record-breaking heat in April and May 2024 in the Asian continent, major hurricanes like Helene and Milton in the US, and severe flooding in Central Europe and Emilia Romagna, to name a few, are examples of extreme events that are documented to be attributable to anthropogenic climate change. Indeed, it is evident that the impacts of human-induced climate change on our lives, livelihoods and wellbeing are already being felt. This raises the question of whether, in which direction, and to what extent climate change also influences demographic processes by affecting fertility, mortality and migration, the three key demographic outcomes driving population change. Although it is highly plausible that climate change also affects population trends, existing global population projections have to date not taken into account the climate feedback on demographic processes. This talk aims to present current evidence on the impact of climatic factors on demographic outcomes with a focus on fertility and migration. I will explore the methodological approaches and data employed to examine the connections between climate change and demographic behaviors. Additionally, I will discuss whether population projections should incorporate the impacts of climate feedback on demographic processes.

 

Thursday, December 19th, 2024

Alessandro ZITO 

Postdoctoral research fellow in Biostatistics at the Harvard T.H. Chan School of Public Health.

Compressive Bayesian non-negative matrix factorization for mutational signatures analysis

Abstract:

Non-negative matrix factorization (NMF) is widely used in many applications for dimensionality reduction. Inferring an appropriate number of factors for NMF is a challenging problem, and several approaches based on information criteria or sparsity-inducing priors have been proposed. However, inference in these models is often complicated and computationally costly. In this talk, we will describe a novel methodology for overfitted Bayesian NMF models that uses compressive hyperpriors to force unneeded factors down to negligible values while only imposing mild shrinkage on needed ones. The method uses simple semi-conjugate priors to facilitate inference while setting the strength of the hyperprior in a data-dependent way to achieve this compressive property.  This results in a simple yet effective way to find the appropriate rank of any NMF decomposition, allowing for better interpretability of the resulting factors.  We will discuss theoretical results establishing the compressive property, and show the benefits of our method within the context of mutational signatures analysis, which has become a routine practice in cancer genomics. In particular, our framework enables the use of biologically informed priors on the signatures, yielding significantly improved accuracy. 

 

Thursday, June 27th, 2024

Donatello TELESCA

Professor of Biostatistics - UCLA

Mixed membership models and phase variability in functional data analysis

Abstract:  A common concern in the field of functional data analysis is the challenge of temporal misalignment, which is typically addressed using curve registration methods. Currently, most of these methods assume the data is governed by a single common shape or a finite mixture of population level shapes. We introduce more flexibility using mixed membership models. Individual observations are assumed to partially belong to different pure mixtures, allowing for variation across multiple functional features. We propose a Bayesian hierarchical model to estimate the underlying shapes, as well as the individual time-transformation functions and levels of membership. Motivating this work is data from EEG signals in children with autism spectrum disorder (ASD). Our method agrees with the neuroimaging literature, recovering the 1/f pink noise feature distinctly from the peak in the alpha band. Furthermore, the introduction of a regression component in the estimation of time-transformation functions quantifies the effect of age and clinical designation on the location of the peak alpha frequency (PAF). 

 

Wednesday, June 12th, 2024

Stefano CASTRUCCIO 

Associate Professor - University of Notre Dame, Indiana, USA.

Stochastic environmental modeling in a time of convergence: physics meets artificial intelligence

Abstract: It is widely acknowledged that the relentless surge of Volume, Velocity and Variety of data, as well as the simultaneous increase of computational resources, have stimulated the development of data-driven methods with unprecedented flexibility and predictive power. However, not every environmental study entails a large data set: many applications ranging from astronomy to paleo-climatology have a high associated sampling cost and are instead constrained by physics-informed partial differential equations. Throughout the past few years, a new and powerful paradigm has emerged in the machine learning literature, merging data-driven and physics-informed problems, hence providing a unified framework for a whole spectrum of problems ranging from data-rich/context-poor to data-poor/context-rich. In this talk, I will present this new framework and discuss some of the most recent efforts to reformulate it as a stochastic model-based approach, thereby allowing calibrated uncertainty quantification.

 

Tuesday, May 28th, 2024

Fulvio DE SANTIS

Full Professor of Statistics - Università La Sapienza, Rome

On the distribution of the risk function induced by a prior

Abstract:

In the frequentist approach to statistical decision theory, the risk function quantifies the average performance of a decision over the sample space. In parametric inference, the risk function depends on the parameter of the model. Hence, when a prior distribution is assigned to the parameter, the risk function is a random variable, typically summarized by its expected value, the Bayes risk. However, for a good assessment of the quality of a decision function, it might be useful to explore the whole distribution of its random risk and to consider additional summaries to complement or to replace the Bayes risk. We here discuss some classes of standard yet relevant models and decision problems where the cdf and the pdf of the random risk can be determined in closed-form or easily approximated with basic Monte Carlo. Issues related to sample size determination are also discussed. Illustrative examples are taken from the literature on clinical trials, a context where this approach is receiving increasing attention.
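The idea of treating the risk as a random variable can be illustrated with a minimal sketch. The setting below is an assumption chosen for simplicity and is not taken from the talk: a binomial proportion estimated by the sample proportion under squared-error loss, with a Beta(a, b) prior. The function names `risk` and `simulate_random_risk` are hypothetical.

```python
import random
import statistics


def risk(theta: float, n: int) -> float:
    """Frequentist risk of the sample proportion X/n under squared-error loss."""
    return theta * (1.0 - theta) / n


def simulate_random_risk(n: int, a: float, b: float, draws: int, seed: int = 0) -> list[float]:
    """Draw theta from a Beta(a, b) prior and return the induced risk values."""
    rng = random.Random(seed)
    return [risk(rng.betavariate(a, b), n) for _ in range(draws)]


# Assumed illustrative setting: n = 20 trials and a uniform Beta(1, 1) prior.
n, a, b = 20, 1.0, 1.0
risks = simulate_random_risk(n, a, b, draws=100_000)

# The usual single-number summary: the Bayes risk (mean of the random risk).
bayes_risk_mc = statistics.fmean(risks)
bayes_risk_exact = a * b / ((a + b) * (a + b + 1.0) * n)

# Additional summaries of the whole distribution of the random risk.
deciles = statistics.quantiles(risks, n=10)
median_risk, upper_decile = deciles[4], deciles[8]

print(f"Bayes risk (Monte Carlo): {bayes_risk_mc:.5f}")
print(f"Bayes risk (closed form): {bayes_risk_exact:.5f}")
print(f"median risk: {median_risk:.5f}, 90th percentile: {upper_decile:.5f}")
```

In this toy case the risk is bounded above by 1/(4n), so the quantiles show how much prior mass sits near the worst case, which the Bayes risk alone does not reveal.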

 

 

Thursday, January 25th, 2024
Daniele Durante
Assistant professor of Statistics, Department of Decision Sciences, Bocconi University

Bayesian Nonparametric Stochastic Block Modeling of Criminal Networks

Abstract:

Europol recently defined criminal networks as a modern version of the Hydra mythological creature, with covert and complex structure. Indeed, relationship data among criminals are subject to measurement errors and structured missingness patterns, and exhibit a sophisticated combination of an unknown number of core-periphery, assortative and disassortative structures that may encode key architectures of the criminal organization. The coexistence of these noisy block patterns limits the reliability of community detection algorithms routinely used in criminology, thereby leading to overly simplified and possibly biased reconstructions of organized crime architectures. In this seminar, I will present a number of model-based solutions which aim at covering these gaps via a combination of stochastic block models and priors for random partitions arising from Bayesian nonparametrics. These include Gibbs-type priors, and random partition priors driven by the urn scheme of a hierarchical normalized completely random measure. Product-partition models to incorporate criminals' attributes, and zero-inflated Poisson representations accounting for weighted edges and security strategies, will also be discussed. Collapsed Gibbs samplers for posterior computation are presented, and refined strategies for estimation, prediction, uncertainty quantification and model selection will be outlined. Results are illustrated in an application to an Italian Mafia network, where the proposed models unveil a structure of the criminal organization mostly hidden to state-of-the-art alternatives routinely used in criminology. I will conclude the seminar with ideas on how to learn the evolutionary history of the criminal organization from the relationship data among its criminals via a novel combination of latent space models for network data and phylogenetic trees.

 

 

Wednesday, December 13th, 2023

Matteo Iacopini

Lecturer in Statistics, Queen Mary University of London

Static and Dynamic BART for Rank-Order Data

Abstract:

Ranking lists are often provided at regular time intervals by one or multiple rankers in a range of applications, including sports, marketing, and politics. Most popular methods for rank-order data postulate a linear specification for the latent scores, which determine the observed ranks, and ignore the temporal dependence of the ranking lists. To address these issues, novel nonparametric static (ROBART) and autoregressive (ARROBART) models are introduced, with latent scores defined as nonlinear Bayesian additive regression tree functions of covariates. To make inferences in the dynamic ARROBART model, closed-form filtering, predictive, and smoothing distributions for the latent time-varying scores are derived. These results are applied in a Gibbs sampler with data augmentation for posterior inference. The proposed methods are shown to outperform existing competitors in simulation studies, and the advantages of the dynamic model are demonstrated by forecasts of weekly pollster rankings of NCAA football teams.

 

 

 

Monday, June 5th, 2023

Marzia A. Cremona

Université Laval - Québec (Québec) G1V 0A6 (Canada)

smoothEM: a new approach for the simultaneous assessment of smooth patterns and spikes

Abstract:

We consider functional data where an underlying smooth curve is composed not just with errors, but also with irregular spikes that (a) are themselves of interest, and (b) can negatively affect our ability to characterize the underlying curve. We propose an approach that, combining regularized spline smoothing and an Expectation-Maximization algorithm, allows one to both identify spikes and estimate the smooth component. Imposing some assumptions on the error distribution, we prove consistency of EM estimates. Next, we demonstrate the performance of our proposal on finite samples and its robustness to violations of the assumptions through simulations. Finally, we apply our proposal to data on the annual heatwaves index in the US and on weekly electricity consumption in Ireland. In both datasets, we are able to characterize underlying smooth trends and to pinpoint irregular/extreme behaviors.

Work in collaboration with Huy Dang (Penn State University) and Francesca Chiaromonte (Penn State University and Sant’Anna School of Advanced Studies)



Thursday, May 18th, 2023

Matteo SESIA

Department of Data Sciences and Operations, Marshall School of Business - University of Southern California

Conformal Inference for Frequency Estimation with Sketched Data

Abstract:

A flexible model-free method is developed to construct a confidence interval for the frequency of a queried object in a very large data set, based on a much smaller sketch of the data. The approach requires no knowledge of the data distribution or of the details of the sketching algorithm; instead, it constructs provably valid frequentist confidence intervals for random queries using a conformal inference approach. After achieving marginal coverage for random queries under the assumption of data exchangeability, the proposed method is extended to provide stronger inferences accounting for possibly heterogeneous frequencies of different random queries, redundant queries, and distribution shifts. While the presented methods are broadly applicable, this work focuses on use cases involving the count-min sketch algorithm and a non-linear variation thereof, to facilitate comparison to prior work. In particular, the developed methods are compared empirically to frequentist and Bayesian alternatives, through simulations and experiments with data sets of SARS-CoV-2 DNA sequences and classic English literature.
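For readers unfamiliar with the count-min sketch mentioned in the abstract, a minimal version of the data structure is sketched below. It covers only the sketching step, not the conformal confidence intervals developed in the talk. The class name `CountMinSketch`, the salted SHA-256 hashing, and the toy word list are assumptions made for illustration.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Approximate frequency counter backed by a depth x width table of counters."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # Salt the hash with the row number to obtain one hash function per row.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item: str, count: int = 1) -> None:
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item: str) -> int:
        # Collisions can only inflate a counter, so the minimum over rows
        # is the tightest available upper bound on the true frequency.
        return min(self.table[row][self._index(item, row)] for row in range(self.depth))


# Toy data (assumed): a short word stream in place of a very large data set.
words = "the cat sat on the mat and the dog sat too".split()
sketch = CountMinSketch(width=16, depth=4)
for word in words:
    sketch.add(word)

true_counts = Counter(words)
estimates = {word: sketch.estimate(word) for word in true_counts}
for word, count in true_counts.items():
    print(f"{word}: true={count}, sketch estimate={estimates[word]}")
```

The estimate never falls below the true count, which gives a deterministic one-sided bound; per the abstract, the talk is concerned with constructing two-sided confidence intervals around such sketch-based frequencies.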

Seminar slides

 

 

Monday, March 27th, 2023

Augusto FASANO

Scalable and accurate variational Bayes for high-dimensional binary regression models and beyond

Abstract:

Bayesian binary probit regression and its extensions to time-dependent observations and multi-class responses are popular tools in binary and categorical data regression due to their interpretability and non-restrictive assumptions. Although the theory is well established in the frequentist literature, these models are still the subject of active research in the Bayesian framework, aimed at overcoming computational issues or inaccuracies in high dimensions as well as the lack of a closed-form expression for the posterior distribution of the model parameters in many cases. We develop a novel variational approximation for the posterior distribution of the coefficients in high-dimensional probit regression with binary responses and Gaussian priors, resulting in a unified skew-normal (SUN) approximating distribution that converges to the exact posterior as the number of predictors increases. Moreover, we derive closed-form expressions for posterior distributions arising from models that account for correlated binary time-series and multi-class responses, developing computational methods that outperform state-of-the-art routines. Finally, we show that such methodological and computational results can be extended to a broad variety of routinely used regression models by leveraging SUN conjugacy.