Technical Brief on the 1999 Statistical Model

Correlation Does Not Imply Causation

Even if two variables are genuinely correlated, there is not necessarily a causal relationship between them. In other words, changes in one variable may not be directly caused by the independent operation of the other. The two may move together solely by chance (coincidence) or, as is often the case, because each is strongly affected by one or more other (confounding) variables that the researcher did not consider. Other possible explanations for an observed relationship include both variables changing over time, the response variable causing a change in the explanatory variable (the reverse of the assumed direction), and one variable being a contributor to, but not the sole cause of, the other. One variable may of course be the direct cause of the other, but correlation alone cannot distinguish that case from the rest. Statisticians summarize this understanding of the legitimate use of statistical relationships in the well-known expression "correlation does not imply causation." In the absence of any other evidence, data from an observational study cannot be used to establish causation.
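The confounding case described above can be illustrated with a small simulation. The sketch below is only an illustration on made-up data and is not part of the RI model: the variables x, y, and z, the function names, and the sample size are all assumptions chosen for the example. A hidden variable z drives both x and y, while x and y have no effect on each other. The two are nonetheless clearly correlated, and the correlation largely disappears once z is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """Residuals of ys after a least-squares straight-line fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [(y - mean_y) - slope * (z - mean_z) for y, z in zip(ys, zs)]


def simulate_confounding(n=5000, seed=0):
    """Return (raw correlation of x and y, correlation after removing z)."""
    rng = random.Random(seed)
    # Hypothetical confounder: it influences both x and y.
    z = [rng.gauss(0, 1) for _ in range(n)]
    # x and y each depend on z plus independent noise, never on each other.
    x = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]
    raw = pearson(x, y)
    # Correlate what is left of x and y once the part explained by z is removed.
    adjusted = pearson(residuals(x, z), residuals(y, z))
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = simulate_confounding()
    print(f"correlation of x and y: {raw:.2f}")
    print(f"after adjusting for z:  {adjusted:.2f}")
```

In real observational data the confounder is often unmeasured or unknown, so the adjustment step shown here may not be available. That is why the correlation by itself cannot settle the question of cause.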

However, a causal connection probably does exist if we can establish that: 1) there is a reasonable explanation of cause and effect, 2) the connection holds under varying conditions, and 3) potential confounding variables are ruled out. The best way to establish these conditions is through a designed experiment in which groups that closely resemble one another on certain important variables are exposed to different approaches (treatments) and then analyzed to see whether the variable of interest differs among the treated groups. One or more groups are also left untreated to serve as "control" groups.
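The value of a designed experiment can also be sketched with simulated data. In the hypothetical example below, the treatment has no effect at all on the outcome; the baseline variable, the function names, and the numbers are assumptions made for illustration only. When units with a higher baseline are more likely to end up treated, as can happen in an observational setting, a simple comparison of group means suggests a large effect. When treatment is assigned by a coin flip, the treated and control groups are similar on the baseline and the spurious difference shrinks toward zero.

```python
import random


def mean(values):
    return sum(values) / len(values)


def difference_in_means(outcomes, treated):
    """Mean outcome of the treated group minus mean outcome of the control group."""
    treated_group = [o for o, t in zip(outcomes, treated) if t]
    control_group = [o for o, t in zip(outcomes, treated) if not t]
    return mean(treated_group) - mean(control_group)


def simulate_designs(n=10000, seed=1):
    """Return (observational difference, randomized difference) in group means."""
    rng = random.Random(seed)
    # Hypothetical baseline characteristic that influences the outcome.
    baseline = [rng.gauss(0, 1) for _ in range(n)]
    # The outcome depends on baseline and noise only: the true treatment effect is zero.
    outcomes = [b + rng.gauss(0, 1) for b in baseline]
    # Observational setting: higher-baseline units are more likely to be treated.
    self_selected = [b + rng.gauss(0, 1) > 0 for b in baseline]
    # Designed experiment: treatment assigned at random, independent of baseline.
    randomized = [rng.random() < 0.5 for _ in range(n)]
    return (difference_in_means(outcomes, self_selected),
            difference_in_means(outcomes, randomized))


if __name__ == "__main__":
    observational, experimental = simulate_designs()
    print(f"observational difference in means: {observational:.2f}")
    print(f"randomized difference in means:    {experimental:.2f}")
```

Random assignment is one way of forming groups that are similar on important variables; matching groups on known characteristics is another, though it can only balance the variables the researcher has thought to measure.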

In the RI model, the relationships between achievement on selected state tests, socioeconomic status (SES), limited English proficiency (LEP), and special needs do not lend themselves to a controlled experiment. Therefore, we should use the results of the model solely to look at aggregate differences among schools educating similar types of students, rather than to predict actual individual student achievement or even aggregate student achievement for a single school. If careful local study of the conditions surrounding student achievement can rule out some of the other factors that might explain the results, we can increase our confidence that results seen across multiple years may be attributable to the factors that make up the RI model. Taken by itself, this model, like each of the other data fields in this year's Information Works!, does not provide sufficient information and should not be the sole means for understanding and judging the complexity of a school, its students, and its results. When the fields are taken together and coupled with other local sources of information about a school, we begin to move out of the realm of speculation about a school and into a data-informed conversation about school contexts and school improvement efforts.



