Technical Brief on the 1999 Statistical Model


Statistical Significance

Researchers want to be able to say, with some degree of confidence, whether the relationships they find among various types of data differ from relationships that would arise solely by chance. Statistical significance is a measure of that confidence. Most researchers are willing to declare a relationship statistically significant if the probability of observing a relationship at least that strong in the sample by chance alone is less than 5%, assuming no other factors are affecting the data set. (Statistical modeling is based only on the factors included in the model and, by its artificial nature, automatically excludes all other factors.) In other words, a relationship is considered statistically significant if it is stronger than 95% of the relationships among the selected variables that we would expect to see just by chance.
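The 5% threshold can be illustrated with a small simulation. The sketch below is not part of the 1999 model; it assumes a simple case in which samples are drawn from a population where no real effect exists, and the function names, sample size, and seed are chosen only for illustration. It shows that, with a 5% cutoff, roughly one sample in twenty is still flagged as "significant" purely by chance.

```python
import random
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a standard-normal value at least as extreme as z, in either direction."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rate(trials=4000, n=30, alpha=0.05, seed=1999):
    """Share of no-effect samples that are nonetheless flagged as significant.

    Assumption: each sample holds n values from a normal population with
    mean 0 and known standard deviation 1, so any apparent effect is chance.
    """
    rng = random.Random(seed)
    flagged = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        # With a known standard deviation of 1, the standard error is 1/sqrt(n).
        z = sample_mean * n ** 0.5
        if two_sided_p_value(z) < alpha:
            flagged += 1
    return flagged / trials


if __name__ == "__main__":
    print(f"Share flagged by chance alone: {false_positive_rate():.3f}")
```

The printed share should land close to 0.05, which is what "less than 5% by chance" means in practice: the cutoff limits false alarms but does not eliminate them.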

Thus, on the second field of the school report charts, the band (range) illustrating the statistical model's projection represents this 95% confidence level; that is, there is no more than a 5% probability that a school's actual scores would lie outside the band due solely to chance. The band reflects not only the calculated scores but also the statistical error that is part of the model. (Please note that all models have statistical error and take it into account in their calculations.) However, because this model is based on statistical probability, a school could lie outside the band this year, or even in subsequent years, solely by chance.

One goal of the SALT initiative is to shift the fundamental paradigm of school improvement in Rhode Island toward a blend of a rich variety of data sources for measuring, improving, and judging school performance. This means that the statistical model alone should not be used to assess a school; it must be coupled with other data from independent sources that confirm that the results are not due to chance. For example, by combining SALT survey data, the observations of independent observers, analysis of selected samples of actual student work, and other forms of local student assessment results, an observer could confirm that these "adjusted" assessment results are not due to chance. The RI Skills Commission is working on a similar paradigm shift at the level of the individual high school student in its efforts to design a Certificate of Initial Mastery (CIM) that credentials a student's achievement on the basis of a rich array of data demonstrating convincingly that the student has the requisite competencies for life, living, and employment in the 21st century.
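To make the idea of a 95% band concrete, here is a minimal sketch. It does not reproduce how the report charts were actually computed: it assumes normally distributed error around the projected score, independence from one year to the next, and a purely hypothetical projected score of 50 with a standard error of 4.

```python
from statistics import NormalDist


def confidence_band(predicted, std_error, level=0.95):
    """Return (low, high) bounds around a projected score.

    Assumption: errors around the projection are normally distributed.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% level
    return predicted - z * std_error, predicted + z * std_error


def chance_outside_at_least_once(years, level=0.95):
    """Probability of falling outside the band in at least one of several years
    by chance alone, assuming the years are independent of one another."""
    return 1.0 - level ** years


if __name__ == "__main__":
    low, high = confidence_band(predicted=50.0, std_error=4.0)  # hypothetical values
    print(f"Band: {low:.1f} to {high:.1f}")  # roughly 42.2 to 57.8
    print(f"Outside at least once in 3 years: {chance_outside_at_least_once(3):.2f}")  # about 0.14
```

Under these assumptions, a school whose true performance matches the projection still has about a 14% chance of landing outside the band at least once in three years, which is one reason the band should be read alongside other sources of evidence.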

The meaning of statistical significance also holds true for any data results for individual students, groups of students, or groups of schools. As a rule of thumb, for example, statisticians routinely conduct their studies accounting for the fact that 5% of any data set is likely flawed due to entry or other data errors. Statistical probability underscores why it is always inadvisable to judge schools solely on, for example, the results of state testing programs, which by their very nature are limited in time, scope, and complexity and, above all else, are themselves subject to the laws of probability and statistics.

Please note that statistical significance does not necessarily mean that a relationship between two variables is important in any sense other than the statistical one. For example, a school may sit just outside the top of a band several years in a row. Quite possibly the school's staff and its parents might attach very different values to the three-percentage-point difference between being within or just outside the band. The school's staff may interpret the percentage points as evidence that they are doing better than other schools and thus do not need significant improvement. The parents might see the same difference as evidence of an even larger gap between current proficiency and proficiency on the part of ALL students in the school. Conversely, a very interesting relationship may be missed if it fails to achieve statistical significance and there are no complementary observations to flesh out this subtle relationship. Without a very strong relationship, a sufficiently large sample, or complementary observations, chance is hard to rule out.

In terms of Rhode Island achievement results, for example, certain schools show differences in disaggregated student achievement results among groups of students (whites versus other groups, LEP versus non-LEP) that are not statistically significant solely because the sample does not include a sufficient number of individuals to achieve statistical significance. Conversely, other schools show "gaps" that are statistically significant, but the 2-3 point difference may be educationally unimportant given the possible variation in achievement test scores from one day to the next among the same (or a similar) group of students. In other words, there is a natural fluctuation in individual (and sometimes whole-class) performance due to other factors.
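The interplay between the size of a gap and the size of the sample can be sketched with a standard two-proportion z-test. This is an illustration only, not the test used in the 1999 model, and every count below is hypothetical: a 10-point gap between two groups of 20 students each, and a 2-point gap between two groups of 10,000 students each.

```python
from statistics import NormalDist


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for the difference between two proportions,
    using the pooled standard error (normal approximation)."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_error = (pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b)) ** 0.5
    z = (p_a - p_b) / std_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical: 60% versus 50% proficient, 20 students per group.
    large_gap_small_sample = two_proportion_p_value(12, 20, 10, 20)
    # Hypothetical: 52% versus 50% proficient, 10,000 students per group.
    small_gap_large_sample = two_proportion_p_value(5200, 10000, 5000, 10000)
    print(f"10-point gap, n=20 each:     p = {large_gap_small_sample:.3f}")  # well above 0.05
    print(f"2-point gap,  n=10,000 each: p = {small_gap_large_sample:.4f}")  # below 0.05
```

Under these assumptions the large gap fails to reach significance because the groups are small, while the small gap is significant because the groups are very large. Neither result says anything about whether the gap matters educationally.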
