Distinguishing features of and selected tensions within

    2018-11-02


    Distinguishing features of, and selected tensions within, a model-based perspective

    A model-based perspective emphasizes the processes producing outcome variation, and observed data are used to characterize that data-generating process. This is the majority perspective in statistical textbooks, including those focused specifically on multi-level modeling (Snijders & Bosker, 2012d). Attention is paid to minimizing bias and maximizing efficiency, and if weighting is used it is often for these purposes. Crucially for the topic at hand, a model-based inference perspective primarily considers independence of observations with respect to residual correlations in the observed data. Measured cluster-level characteristics may be of interest to explain such residual correlations, in which case a model structure is specified and the parameters estimated accordingly. Alternatively, the interest may simply be to account for the variance structure because the assumption that observations are independent does not hold.

    Even before we consider the contrast with a perspective focused on design-based inference, it is worth mentioning two tensions among those seeking to make model-based inference. First, in decisions on whether to condition on clusters, some may prefer a strategy specified a priori, while others may look to the data to guide the structure of the model. Often, the two strategies are blended, starting with an a priori strategy and then using sensitivity analyses to explore modifications to that strategy as colleagues and reviewers suggest them. A similar tension can be noted in deciding which variables to include as covariates: a strategy informed by expert knowledge and optimized to address the causal question of interest has clear strengths, but exploration of the data may suggest a simpler model that would serve just as well, or a more complex model that fits the data better (at the risk of overfitting).

    Second, investigators with model-based inference goals also differ in whether they view the implementation and interpretation challenges of more complicated models as inherently problematic due to the potential for user error, or as opportunities to enhance training and interdisciplinary partnerships. We will return to divergent views of model complexity as we conclude our discussion of analytic implications of model-based and design-based perspectives.
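    To make the idea of residual correlation within clusters concrete, one common formalization is a two-level random-intercept model. The notation below is an illustrative assumption and is not taken from the sources cited above: observation i in cluster j has outcome y_ij and covariate x_ij, with a cluster-level deviation u_j assumed independent of the individual-level residual e_ij.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \\
u_j &\sim N(0, \tau^2), \qquad e_{ij} \sim N(0, \sigma^2), \\
\operatorname{Corr}\!\left(u_j + e_{ij},\; u_j + e_{i'j}\right)
  &= \frac{\tau^2}{\tau^2 + \sigma^2} \qquad (i \neq i').
\end{align}
```

    Under these assumptions, two observations in the same cluster share u_j, so their residuals are correlated whenever \tau^2 > 0. Setting \tau^2 = 0 recovers the usual independence assumption.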
    Selecting a cluster definition using different perspectives

    Differences between the perspectives can be revealed when there is more than one defensible way to identify how individuals are clustered (Fig. 1). For example, consider an investigation of childhood obesity in relation to neighborhood characteristics conducted as part of a study in which subjects were sampled through schools (Gordon-Larsen, Nelson, Page, & Popkin, 2006). A design-based perspective would emphasize the sampling strategy and any resulting investigator-induced structure in the sampling probabilities. Inference about the finite population of students attending the schools from which the sample was drawn must account for the sampling of students within schools. A model-based perspective would instead emphasize accounting for clustering that helps to approximate the probability model generating the observed data, whether by school, by neighborhood, or by both.
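    The "school, neighborhood, or both" choice can be written as a cross-classified random-effects model. This is a sketch under assumed notation, not the model used in the cited study: student i attends school j and lives in neighborhood k, x_k is a measured neighborhood characteristic, and the three random terms are assumed mutually independent.

```latex
\begin{equation}
\begin{aligned}
y_{i(jk)} &= \beta_0 + \beta_1 x_{k} + s_j + n_k + e_{i(jk)}, \\
s_j &\sim N(0, \tau_S^2), \qquad
n_k \sim N(0, \tau_N^2), \qquad
e_{i(jk)} \sim N(0, \sigma^2).
\end{aligned}
\end{equation}
```

    Fixing \tau_N^2 = 0 gives a school-only clustering structure, fixing \tau_S^2 = 0 gives a neighborhood-only structure, and leaving both free accounts for both definitions at once.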
    When is it necessary to account for clustering?

    Most analysts would agree that we will gain nothing from efforts to account for clustering in datasets that are extremely sparse with respect to a given potential cluster definition (i.e., most clusters have only one observation) (Clarke, 2008; Rasbash et al., 2010). However, when multiple observations per cluster exist, the decision becomes more complicated. Even when there are several observations per cluster (Fig. 2c), a given outcome may not be strongly patterned by cluster. Moreover, the decision is neither purely theoretical nor purely empirical, but must balance both substantive and statistical considerations (Snijders & Bosker, 2012d). Mixed models or GEE approaches may be attractive if such divergence is detected, particularly because of their robustness to unbalanced designs with observations missing at random conditional on cluster (Ghisletta & Spini, 2004). However, from a design-based inference perspective (Lee & Forthofer, 2005), some emphasize that independent sampling probabilities justify analytic approaches that do not explicitly account for potentially health-relevant clustering in social contexts.
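    One empirical input to this decision is how strongly an outcome is patterned by cluster, often summarized by the intraclass correlation (ICC). The sketch below is a minimal illustration using the one-way ANOVA estimator, which allows unequal cluster sizes. The function name `anova_icc` and the toy data are hypothetical and are not drawn from the cited sources. It returns `None` when every cluster is a singleton, the sparse case described above, because within-cluster variance cannot then be estimated.

```python
from collections import defaultdict


def anova_icc(values, cluster_ids):
    """One-way ANOVA estimate of the intraclass correlation.

    values      : list of numeric outcomes
    cluster_ids : list of hashable cluster labels, same length as values
    Returns a float, or None when the ICC cannot be estimated.
    """
    groups = defaultdict(list)
    for y, c in zip(values, cluster_ids):
        groups[c].append(y)

    k = len(groups)                                   # number of clusters
    n_total = sum(len(g) for g in groups.values())    # number of observations

    # With a single cluster, or with only singleton clusters, there is
    # no information to separate between- from within-cluster variance.
    if k < 2 or n_total == k:
        return None

    grand_mean = sum(y for g in groups.values() for y in g) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        mean_g = sum(g) / len(g)
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        ss_within += sum((y - mean_g) ** 2 for y in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Adjusted average cluster size for unbalanced designs.
    n0 = (n_total - sum(len(g) ** 2 for g in groups.values()) / n_total) / (k - 1)

    denom = ms_between + (n0 - 1) * ms_within
    if denom == 0:
        return None  # outcome has no variation at all
    return (ms_between - ms_within) / denom


if __name__ == "__main__":
    # Toy data: outcomes identical within cluster, different between clusters.
    print(anova_icc([1, 1, 5, 5], ["a", "a", "b", "b"]))
    # Toy data: every cluster has one observation.
    print(anova_icc([1, 2, 3], ["a", "b", "c"]))
```

    An estimate near zero (or negative) suggests the outcome is not strongly patterned by that cluster definition. As the text notes, such an estimate informs the decision but does not settle it, since substantive considerations also apply.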