Papers by Leonardo Grilli

[...] uses a self-evaluation test as an instrument to assess the competencies of candidates who want to enrol in the three-year degree program. The aim of this study is to assess whether the self-evaluation test scores improve the prediction of student performance when added to available student characteristics, such as the high school career. Student performance is measured by three binary indicators based on the number of credits gained after one year. For each binary outcome, the prediction is carried out using both logistic regression and random forest, with two alternative sets of predictors: (i) student characteristics; (ii) student characteristics and test scores. The predictive ability is assessed using 10-fold cross-validation. The main finding of the analysis, which refers to the academic year 2014/2015, is that the self-evaluation test scores do not help in predicting student performance once student characteristics are properly exploited.

Journal of the American Statistical Association, Jun 1, 2011
Within the principal stratification framework for causal inference, modeling partial compliance is challenging because the continuous nature of the principal strata raises subtle specification issues. In this context, we propose an approach based on the assumption that the joint distribution of the degree of compliance to the treatment and the degree of compliance to the control follows a Plackett copula, so that their association is modeled in a flexible way through a single parameter. Moreover, given the two compliances, the distribution of the outcomes is parameterized in a flexible way through a regression model which may include interaction and quadratic terms and may also be heteroscedastic. In order to estimate the parameters of the resulting model, and then the causal effect of the treatment, we adopt a maximum likelihood approach via the EM algorithm. In applying this approach, the marginal distributions of the two compliances are estimated by their empirical distribution functions, so that no constraints are posed on these distributions. Since the two compliances cannot be jointly observed, there is no direct empirical support for the association parameter. We describe a strategy for studying this parameter by a profile likelihood method and discuss an analysis of the sensitivity of the causal inference results to its value. We apply the proposed approach to data previously analyzed by Efron and Feldman (1991) and Jin and Rubin (2008). Estimated causal effects are in line with those of previous analyses, but the pattern of association between the compliances is qualitatively different, apparently due to the flexibility of the copula and to allowing regression equations in the proposed method to include interactions and heteroscedasticity.
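For reference, the standard textbook form of the Plackett copula is sketched below. The symbols u, v (the two marginal distribution functions evaluated at the compliances) and θ (the association parameter) are notation assumed here for illustration and are not taken from the paper.

```latex
% Plackett copula with association parameter \theta > 0 (\theta \neq 1)
C_\theta(u, v) =
  \frac{\bigl[1 + (\theta - 1)(u + v)\bigr]
        - \sqrt{\bigl[1 + (\theta - 1)(u + v)\bigr]^2 - 4\,\theta(\theta - 1)\,u v}}
       {2(\theta - 1)},
\qquad
C_1(u, v) = u v .
```

Here θ = 1 corresponds to independence, while θ > 1 and θ < 1 give positive and negative association respectively, which is why a single parameter suffices to describe the dependence between the two compliances.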
A finite mixture IRT model for ordinal responses with nonignorable missingness
International Federation of Classification Societies, 2015
We propose a multidimensional latent class IRT model for the analysis of ordinal responses subject to nonignorable missingness. The missingness mechanism is driven by two sets of latent classes summarizing, respectively, the propensity to respond and the abilities measured by the test items. The model allows for both item covariates and examinee covariates and it is fitted by the EM algorithm. The model is illustrated through an application to university student careers, focusing on the grades of first-year exams, where the missingness of a grade is likely to be nonignorable.
Evaluation of university students’ performance through a multidimensional finite mixture IRT model
The paper analyzes the performance of university students, with reference to first-year compulsory courses. We propose an Item Response Theory model that includes two latent variables corresponding to the student’s ability and the preference about the order of attempted exams. In this way, we explicitly account for nonignorable missing observations, since the indicators of item response also contribute to measuring the ability, so that the model is of within-item multidimensional type. The two latent variables are assumed to have a discrete distribution defining latent classes of students that are homogeneous in terms of ability and priority assigned to exams.

The European Journal of Public Health, 2016
Background: In 2010, Tuscany (Italy) implemented the Chronic Care Model (CCM) to improve general practitioner (GP) management of chronic diseases. Aim: to assess how the introduction of CCM affected GPs' compliance with standards of care for diabetes patients. Methods: A controlled before-after study was performed. Two exposed groups of GPs, one entering the study in 2010 and one in 2011, were considered. Patients with diabetes assisted by GPs of the groups were identified through the healthcare administrative data of the Regional Healthcare System and followed up from 2009 to 2012. A diabetes care indicator called Guideline Composite Indicator (GCI: annual assessment of glycated haemoglobin and at least two assessments among eye examinations, total serum cholesterol, and microalbuminuria) and an indicator of adherence to statin therapy were computed per year and by group. Impact of intervention was estimated by difference-in-differences analysis for panel data, stratified by GP performance level at baseline. Results: 483 GPs constituting the first group entered the study in 2010, 258 GPs of the second group entered it in 2011, and 1,820 GPs constituted the control group. After 1 year, the diabetes care indicator increased by 8.1%. During the second year, it showed a further increase of 1.6%. The mean impact on the adherence to statin therapy was smaller (+1%), yet statistically significant. Conclusion: The first year of the CCM implementation had a significant impact on the diabetes care indicator, and performance stabilized after the first year. Impact on the therapy indicator was smaller.
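The difference-in-differences idea mentioned in the Methods can be illustrated with a minimal sketch. This is the simplest two-group, two-period version computed on group means, not the stratified panel-data specification used in the study; the function name `did_estimate` and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means.

    Returns the change in the treated group minus the change in the
    control group, which estimates the intervention effect under the
    usual parallel-trends assumption.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical GP-level shares of patients meeting a care indicator
# (made-up values, not data from the study).
treated_pre = [0.30, 0.40, 0.35]
treated_post = [0.42, 0.50, 0.46]
control_pre = [0.32, 0.38, 0.36]
control_post = [0.35, 0.40, 0.39]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated effect: {effect:.3f}")
```

Subtracting the control group's change removes any trend common to both groups, so only the extra change in the exposed group is attributed to the intervention.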
Alternative specifications of bivariate multilevel probit ordinal response models
In recent years several methods for the analysis of ordinal multivariate multilevel data have been proposed (Muthén, 1994; Rabe-Hesketh et al., 2001; Mazzolli, 2001; Lillard and Panis 2000). The present paper highlights the interpretation of the variance-covariance parameters of the assumed multivariate distribution of the latent variables. Moreover, under the hypothesis of a multivariate Gaussian distribution, the paper illustrates some alternative specifications of the model, which have been proposed in order to use certain estimation ...
Combining multiple sources to overcome misclassification bias in epidemiological database studies
Social Science Research Network, 2016
The publication is made available under the rules and terms of the deposit licence, in accordance with the Open Access Policy of the Università degli Studi di Firenze (https://www.sba.unifi.it/upload/policy-oa-2016-1.pdf)

This work presents new analyses of the relationship between student evaluation of teaching and student, teacher and course-specific characteristics, exploiting the richness of information collected by a new survey carried out among professors of the University of Padua. The data collected in this survey highlight teachers' needs, beliefs and practices of teaching and learning. This makes it possible to introduce some subjective traits of the teachers into the study. The role of these new variables in explaining student evaluations is investigated in depth.

European journal of public health, Oct 15, 2016
Background: In 2010, Tuscany (Italy) implemented the Chronic Care Model (CCM) to improve general practitioner (GP) management of chronic diseases. Aim: to assess how the introduction of CCM affected GPs' compliance with standards of care for diabetes patients. Methods: A controlled before-after study was performed. Two exposed groups of GPs, one entering the study in 2010 and one in 2011, were considered. Patients with diabetes assisted by GPs of the groups were identified through the healthcare administrative data of the Regional Healthcare System and followed up from 2009 to 2012. A diabetes care indicator called Guideline Composite Indicator (GCI: annual assessment of glycated haemoglobin and at least two assessments among eye examinations, total serum cholesterol, and microalbuminuria) and an indicator of adherence to statin therapy were computed per year and by group. Impact of intervention was estimated by difference-in-differences analysis for panel data, stratified by GP performance level at baseline. Results: 483 GPs constituting the first group entered the study in 2010, 258 GPs of the second group entered it in 2011, and 1,820 GPs constituted the control group. After 1 year, the diabetes care indicator increased by 8.1%. During the second year, it showed a further increase of 1.6%. The mean impact on the adherence to statin therapy was smaller (+1%), yet statistically significant. Conclusion: The first year of the CCM implementation had a significant impact on the diabetes care indicator, and performance stabilized after the first year. Impact on the therapy indicator was smaller.
In this paper, we analyse some aspects of job satisfaction by means of a multilevel factor model, decomposing the factor structure into the graduate and degree programme components, using data from a survey on the 1998 graduates of the University of Florence. Due to the ordinal scale of the response variables, we adopt a multilevel factor model for ordinal variables. The results show that the factor structures at the graduate and study programme levels are not the same, although they are similar; the study programmes with extreme factor scores should be selected for a deeper investigation.

Metron-International Journal of Statistics, Dec 1, 2010
The paper investigates the consequences of sample selection in multilevel or mixed models, focusing on the random intercept two-level linear model under a selection mechanism acting at both hierarchical levels. The behavior of sample selection and the resulting biases on the regression coefficients and on the variance components are studied both theoretically and through a simulation study. Most theoretical results exploit the properties of Normal and Skew-Normal distributions. In the case of clusters of size two, analytic formulae of the bias are provided that generalize Heckman's formulae. The analysis makes it possible to outline a taxonomy of sample selection in the multilevel framework that can support the qualitative assessment of the problem in specific applications and the development of suitable techniques for diagnosis and correction.

arXiv (Cornell University), Jan 4, 2021
We consider estimating the effect of a treatment on the progress of subjects tested both before and after treatment assignment. A vast literature compares the competing approaches of modeling the post-test score conditionally on the pre-test score versus modeling the difference, namely the gain score. Our contribution resides in analyzing the merits and drawbacks of the two approaches in a multilevel setting. This is relevant in many fields, for example education with students nested into schools. The multilevel structure raises peculiar issues related to the contextual effects and the distinction between individual-level and cluster-level treatment. We derive approximate analytical results and compare the two approaches by a simulation study. For an individual-level treatment our findings are in line with the literature, whereas for a cluster-level treatment we point out the key role of the cluster mean of the pre-test score, which favors the conditioning approach in settings with large clusters.
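The two competing approaches can be written side by side in a generic two-level form. The notation below is assumed for illustration and is not taken from the paper: Y is the post-test score, X the pre-test score, T the treatment indicator, i indexes subjects and j clusters, u_j is a cluster random effect and e_ij a subject-level error.

```latex
% Conditioning approach: post-test regressed on pre-test
Y_{ij} = \alpha + \beta X_{ij} + \tau T_{ij} + u_j + e_{ij}

% Gain score approach: the difference is the response
Y_{ij} - X_{ij} = \alpha^{*} + \tau^{*} T_{ij} + u_j^{*} + e_{ij}^{*}
```

In this form the gain score model is the special case of the conditioning model in which the pre-test coefficient β is fixed at 1, so the two approaches differ in whether that coefficient is constrained or estimated.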
Bayesian estimation with INLA for logistic multilevel models

arXiv (Cornell University), Sep 21, 2016
In certain academic systems, a student can enroll for an exam immediately after the end of the teaching period or can postpone it to any later examination session, so that the grade is missing until the exam is attempted. We propose an approach for the evaluation in itinere of a student's proficiency accounting also for non-attempted exams. The approach is based on considering each exam as an item, so that responding to the item amounts to attempting the exam, and on an Item Response Theory model that includes two latent variables corresponding to the student's ability and the propensity to attempt the exam. In this way, we explicitly account for non-ignorable missing observations, as the indicators of item response also contribute to measuring the ability. The two latent variables are assumed to have a discrete distribution defining latent classes of students that are homogeneous in terms of ability and priority assigned to exams. The model, which also allows for individual covariates in its structural part, is fitted by the Expectation-Maximization algorithm. The approach is illustrated through the analysis of data about the first-year exams of freshmen of the School of Economics at the University of Florence (Italy).
Evaluation of university students’ performance through a multidimensional finite mixture IRT model
48th Scientific Meeting of the Italian Statistical Society, Apr 29, 2016
The paper analyzes the performance of university students, with reference to first-year compulsory courses. We propose an Item Response Theory model that includes two latent variables corresponding to the student’s ability and the preference about the order of attempted exams. In this way, we explicitly account for nonignorable missing observations, since the indicators of item response also contribute to measuring the ability, so that the model is of within-item multidimensional type. The two latent variables are assumed to have a discrete distribution defining latent classes of students that are homogeneous in terms of ability and priority assigned to exams.

Quality & Quantity, Jan 30, 2023
The COVID-19 pandemic spread around the world from February 2020, with disruptive effects on many aspects of people's social life. The suspension of face-to-face teaching activities in schools and universities was the first containment measure adopted by governments to deal with the spread of the virus. Remote teaching was the emergency solution implemented by schools and universities to limit the damage of closures to students' learning. In this contribution we suggest to policy makers and researchers how to assess the impact of emergency policies on remote learning in academia by analysing students' careers. In particular, we exploit the quasi-experimental setting arising from the sudden implementation of remote teaching in the second semester of academic year 2019/2020: we compare the performance of the cohort 2019/2020, which represents the treatment group, with the performance of the cohort 2018/2019, which represents the control group. We distinguish the impact of remote teaching at two levels: degree program and single courses within a degree program. We suggest using a Difference-in-Differences approach in the former case and multilevel modeling in the latter. The proposal is illustrated by analysing administrative data on freshmen of cohorts 2018/2019 and 2019/2020 for a sample of degree programs of the University of Florence (Italy).
Analysis of university teaching quality merging student ratings with professor characteristics and opinions
Statistical Methods and Applications, Apr 16, 2019
We contribute to the discussion of the paper of Piccolo and Simone by examining some issues concerning the multivariate extension of CUB and GEM models.

Statistical Methods & Applications
The extension of quantile regression to count data raises several issues. We compare the traditional approach, based on transforming the count variable using jittering, with a recently proposed approach in which the coefficients of quantile regression are modelled by parametric functions. We exploit both methods to analyse university students’ data to evaluate the effect of emergency remote teaching due to COVID-19 on the number of credits earned by the students. The coefficients modelling approach performs a smoothing that is especially convenient in the tails of the distribution, preventing abrupt changes in the point estimates and increasing precision. Nonetheless, model selection is challenging because of the wide range of options and the limited availability of diagnostic tools. Thus the jittering approach remains fundamental to guide the choice of the parametric functions.