Traditional approaches to the analysis of count data pose analytical challenges, given the increasing proportion of zeroes in the distribution. The aim of this paper was to predict the probability of "caries-free" subjects and to assess the dependence of the dmfs index on childhood sociodemographic factors through the application of regression models. Data were gathered as part of the National Pathfinder Survey of 4-year-old Italian children. Clinical data on caries disease (dmfs) and childhood sociodemographic factors were collected. The predicted probabilities for the Poisson, negative binomial and zero-inflated (Poisson and negative binomial) models were estimated using Stata commands for count outcomes. The outcome variable in the regression models was the severity of the disease (dmfs index), while variables that were statistically significant on bivariate analysis were considered as covariates. Out of 5538 children, 4344 (78.44%) had dmfs = 0. The mean dmfs index was 1.36 (range: 0–104). The statistical significance of the dispersion parameter (O = 141.6, P < 0.0001) showed the inappropriateness of the Poisson model when compared with the negative binomial model. Vuong's test indicated that the zero-inflated models (ZIP and ZINB) fitted the data significantly better than the others (P < 0.001). A significant likelihood ratio statistic indicated that the ZINB regression model fitted better than the ZIP model (P < 0.0001). The father's educational level was significant in both parts of the ZINB regression model (P < 0.05), implying that the degree of caries experience increases in children whose fathers have a low level of education, while the excess of caries-free children decreases. Moreover, the increase of the coefficients in the zero-inflated part of the ZINB regression model implies that the excess of caries-free subjects increases with later age of tooth eruption.
The observed underestimation of the frequencies of zero dmfs counts by the Poisson model is a common result when a dual-group process is not taken into account. These regression models provide a useful approach to handling count outcomes such as the dmfs/DMFS index in caries epidemiology.
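The underestimation of zero counts can be illustrated with the summary figures quoted above. The following is a minimal sketch, not the authors' analysis: it assumes intercept-only models (no covariates), and the helper names `poisson_pmf`, `zip_pmf` and `solve_zip` are hypothetical. It compares the observed share of children with dmfs = 0 to the share implied by a Poisson distribution with the same mean, then moment-matches a zero-inflated Poisson (ZIP) to show how a dual-group process can reproduce the excess zeroes.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson: with probability pi the
    subject is in the always-zero group, otherwise Y ~ Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def solve_zip(mean, p_zero, lo=1e-6, hi=100.0, iters=200):
    """Moment-match (pi, lam) so that mean = (1 - pi) * lam and
    P(Y = 0) = p_zero. Illustrative only (assumption: no covariates)."""
    # Eliminating pi gives (1 - exp(-lam)) / lam = (1 - p_zero) / mean,
    # whose left-hand side decreases in lam, so bisection applies.
    target = (1.0 - p_zero) / mean
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (1.0 - math.exp(-mid)) / mid > target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 1.0 - mean / lam, lam

# Summary figures reported in the abstract.
n_children, n_zero, mean_dmfs = 5538, 4344, 1.36

observed_zero = n_zero / n_children        # observed share with dmfs = 0
poisson_zero = poisson_pmf(0, mean_dmfs)   # share implied by Poisson(1.36)
pi_hat, lam_hat = solve_zip(mean_dmfs, observed_zero)

print(f"observed P(0)          = {observed_zero:.4f}")
print(f"Poisson-implied P(0)   = {poisson_zero:.4f}")
print(f"ZIP-implied P(0)       = {zip_pmf(0, pi_hat, lam_hat):.4f}")
```

Under these assumptions a Poisson distribution with mean 1.36 implies only about 26% zeroes, far below the 78.44% observed, whereas the two-part ZIP form matches the zero share by construction. The regression models in the paper additionally let both parts depend on covariates, which this sketch does not attempt.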
What statistical method should be used to evaluate risk factors associated with dmfs index? Evidence from the National Pathfinder Survey of 4-year-old Italian children / Solinas, Maria Giuliana; Campus, Guglielmo Giuseppe; Maida, C; Sotgiu, Giovanni; Cagetti, Mg; Lesaffre, E; Castiglia, Paolo Giuseppino. - In: COMMUNITY DENTISTRY AND ORAL EPIDEMIOLOGY. - ISSN 0301-5661. - 37:(2009), pp. 539-546.
What statistical method should be used to evaluate risk factors associated with dmfs index? Evidence from the National Pathfinder Survey of 4-year-old Italian children
SOLINAS, Maria Giuliana; CAMPUS, Guglielmo Giuseppe; SOTGIU, Giovanni; CASTIGLIA, Paolo Giuseppino
2009-01-01