Technical Brief
Statistical Significance
Researchers want to know and to be able to say with some degree
of confidence whether any relationships they have found among various types of data are
different from relationships they would find solely due to chance. A measure for the
degree of confidence we have in a relationship is statistical significance. Most
researchers are willing to declare that a relationship is statistically significant if the
chances of observing a relationship at least that strong in the sample are less than 5% when
chance alone is at work, assuming no other factors are affecting the data set. (Statistical
modeling is based only on the factors included in the model and by its artificial nature
automatically excludes all other factors.) In other words, a relationship is considered to be
statistically significant if it is stronger than 95% of the relationships among the selected
variables we would expect to see just by chance.
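One way to make the 5% rule concrete is a shuffle (permutation) test. The sketch below is only an illustration, not the model used for the school reports: the data are hypothetical, and the function names are invented for this example. It shuffles one variable many times to see how often chance alone produces a relationship at least as strong as the one observed.

```python
import random

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Share of random shufflings whose correlation is at least as
    strong (in either direction) as the observed correlation."""
    rng = random.Random(seed)
    observed = abs(correlation(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(correlation(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials

# Hypothetical data: mother's education level (1-5) and a test score
# for ten students. These numbers are made up for illustration.
education = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
scores = [52, 55, 58, 54, 63, 61, 66, 70, 72, 75]

p_value = permutation_p_value(education, scores)
print("statistically significant at the 5% level:", p_value < 0.05)
```

If fewer than 5% of the shufflings match or beat the observed relationship, the relationship would conventionally be called statistically significant.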
Thus, on the second field of the school report charts, the band
(range) illustrating the statistical model's projection represents this 95% confidence
level -- i.e., that there is no more than a 5% possibility that a school's actual scores
would lie outside the band due solely to chance. This confidence level for the model
(represented by the range of the band) includes not only the actual numerical calculations for
scores but also the statistical errors that are part of the model. (Please note that
all models have statistical errors and take them into account during calculations.)
However, because this model is based on statistical probability, there is the possibility
that a school could lie above or below the band this year or even in subsequent years
solely by chance. One goal of the SALT initiative is to shift the fundamental paradigm of
school improvement in Rhode Island toward a blend of a rich variety of data sources for
measuring, improving, and judging school performance. This means that the statistical
model alone should not be used to assess a school but must be coupled with other data from
independent sources that confirm that the results are not due to chance. For example, by
using a combination of SALT survey data, observations of independent observers in the
course of SALT visits, analysis of selected samples of actual student work and other forms
of local student assessment results, an observer could confirm that these
"adjusted" assessment results are not due to chance. The RI Skills Commission is
working on a similar paradigm shift at the level of the individual high school student in
their efforts to design a Certificate of Initial Mastery (CIM) that credentials a
student's achievement on the basis of having observed a rich array of data that
demonstrates convincingly that the student has the requisite competencies for life, living
and employment in the 21st century.
The meaning of statistical significance also holds true for any
data results for individual students, groups of students or groups of schools. As a rule
of thumb, for example, statisticians routinely conduct their studies accounting for the
fact that 5% of any data set is likely flawed due to entry or other data errors.
Statistical probability underscores why it is always inadvisable to judge schools solely,
for example, on the results of state testing programs which by their very nature are
limited in time, scope, complexity, and above all else, are themselves subject to the laws
of probability and statistics.
Please note that statistical significance does not mean that
a relationship between two variables is necessarily important in anything more than a statistical sense.
For example, a school may sit just outside the top of a band several years in a row. Quite
possibly the school's staff and its parents might attach very different values to the
three-percentage-point difference between being within or just outside the band. The
school's staff may interpret the percentage points as evidence that they are doing better
than other schools and thus do not need significant improvement. The parents might see
these percentage differences as evidence of an even larger gap between current proficiency
and proficiency on the part of ALL students in the school. Sometimes a very interesting
relationship may be missed if it fails to achieve statistical significance and there are no
complementary observations to flesh out this subtle relationship. Without a very
strong relationship, a sufficiently large sample, or complementary observations, chance is
hard to rule out.
In terms of Rhode Island achievement results, for example,
certain schools show disaggregations of student achievement results among groups of
students (whites versus other groups, LEP versus non-LEP) which are not statistically
significant due solely to the fact that the sample does not include a sufficient number
of individuals to achieve statistical significance. Conversely, other schools show
"gaps" which are statistically significant, but the 2-3 point difference may be
educationally unimportant given the possibility of variation in achievement test scores
from one day to the next among the same (or similar) group of students. In other words,
there is a natural fluctuation in individual (and sometimes whole class) performance due
to other factors.
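The effect of sample size can be sketched with a simple calculation. The example below is an assumption-laden illustration, not the method used for the Rhode Island results: it uses a basic z-test for the difference between two group averages, assumes both groups are the same size with the same spread of scores, and uses made-up numbers (a 3-point gap and a standard deviation of 10).

```python
from statistics import NormalDist

def two_sided_p_value(gap, sd, n_per_group):
    """Approximate chance of seeing a gap at least this large between
    two group averages if there were no real difference (z-test,
    equal group sizes and equal standard deviations assumed)."""
    standard_error = sd * (2 / n_per_group) ** 0.5
    z = abs(gap) / standard_error
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical numbers: the same 3-point gap, measured in a small
# school group and in a large one.
small_sample = two_sided_p_value(gap=3, sd=10, n_per_group=15)
large_sample = two_sided_p_value(gap=3, sd=10, n_per_group=400)

print("15 students per group: significant?", small_sample < 0.05)
print("400 students per group: significant?", large_sample < 0.05)
```

Under these assumptions the 3-point gap has a p-value of about 0.41 with 15 students per group and a p-value far below 0.05 with 400 per group. The gap itself is identical in both cases, which is why statistical significance alone says nothing about whether the gap is educationally important.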

Diagram "A"
The simplest kind of visual description of a relationship between two variables is a
straight line. Imagine, if you will, making a scatter plot of a whole set of per-student
spending data against a second variable and then drawing a straight line that comes as close as
possible to all the points in the scatter plot. (See Diagram A.) [4] We call this procedure "regression," the resulting line
the "regression line" and the formula that describes the line the
"regression equation." The word "regression" originated from Francis
Galton's work in the late 1800s when he realized that for many relationships there was
"regression" (reversion) toward what he termed "mediocrity." We now
express this frequently seen statistical phenomenon as "regression toward the
mean." Human height data, for example, demonstrate that if two parents both have
above average heights, their children tend to be taller than average as well, but closer to
the average than their parents.
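Regression toward the mean can be seen in a small simulation. The sketch below is a toy model, not Galton's data: it assumes that a parent's and a child's height each consist of a shared family component plus an independent individual component, and every number in it (the 170 cm average, the spreads, the cutoff for "tall") is hypothetical.

```python
import random

rng = random.Random(1)
POPULATION_MEAN = 170.0  # hypothetical average height in cm

pairs = []
for _ in range(20000):
    shared = rng.gauss(0, 5)  # component the family has in common
    parent = POPULATION_MEAN + shared + rng.gauss(0, 5)
    child = POPULATION_MEAN + shared + rng.gauss(0, 5)
    pairs.append((parent, child))

# Keep only the families whose parent is well above average height.
tall = [(p, c) for p, c in pairs if p > POPULATION_MEAN + 7]
tall_parent_mean = sum(p for p, _ in tall) / len(tall)
tall_child_mean = sum(c for _, c in tall) / len(tall)

print("average height of tall parents:", round(tall_parent_mean, 1))
print("average height of their children:", round(tall_child_mean, 1))
```

In this toy model the children of tall parents come out taller than the population average, on average, but noticeably closer to that average than their parents are.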

Diagram "B"
Imagine, if you will, plotting achievement scores for grade eight students on a particular
achievement test. The vertical axis can be achievement scores recorded as a number. The
horizontal axis can be the education level of the child's mother as reported by the child,
also expressed as numbers assigned to each level. (Of course, this axis could be any other
variable for which you have consistent data that you believe to be reliable). The question
is, then, where is the best straight line that relates these two variables (achievement
score and mother's education level) to each other? You could take a ruler and try to fit a
line through the scatter plot. However, different people would draw different lines, based
on their best visual guess as to which line is closest to most of the points. To find the
one line out of the infinite possibilities that is as close as mathematically possible to
all of the points, statisticians commonly use a procedure called "least squares," which
produces the "least squares line." (See Diagram "B".) [5] To determine the least squares line,
closeness is measured along the vertical axis (in this case achievement scores): for each
point, we calculate the vertical distance between the point and the line. Those distances
are then squared and added up for all of the points in the
sample. For the least squares line, that sum is smaller than it would be for any other
line. The vertical distances are chosen because the equation is often used to predict that
variable when the one on the horizontal axis (mother's education level) is known.
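The least squares procedure described above can be sketched in a few lines. The data below are hypothetical and the function names are invented for this example. The slope and intercept come from the standard textbook formulas for a least squares line with one horizontal-axis variable.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line that minimizes the sum
    of squared vertical distances from the points to the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_of_squares(xs, ys, intercept, slope):
    """Square each point's vertical distance to the line and add them up."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: mother's education level (1-5) on the horizontal
# axis and an achievement score on the vertical axis.
education = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
scores = [52, 55, 58, 54, 63, 61, 66, 70, 72, 75]

intercept, slope = least_squares_line(education, scores)
best = sum_of_squares(education, scores, intercept, slope)

# Any other line, such as one drawn by eye, gives a larger sum.
by_eye = sum_of_squares(education, scores, 45.0, 6.0)
print("least squares sum:", round(best, 1), "by-eye sum:", round(by_eye, 1))
```

For these made-up points the least squares line has an intercept of about 47.0 and a slope of about 5.2, and its sum of squared vertical distances is smaller than that of the line drawn by eye.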
Like all straight lines, the least squares line can be expressed by a simple
formula. The standard mathematical convention is to write an equation for the
line relating the two variables as: y = a + bx, where y represents the vertical axis
(achievement scores in our example); x represents the horizontal axis (education level of
the mother in this example), and a and b are replaced by numbers, i.e., two unique
constants derived from this particular regression line. The number represented by a is
called the intercept and the number represented by b is called the slope. The intercept
describes one particular point on the line that falls where the line crosses the vertical
axis, when the horizontal axis is at zero. A positive slope describes how much of an
increase there is for the variable on the vertical axis (here achievement scores) when the
other variable, on the horizontal axis (education level of the mother), increases by one
unit. A negative slope indicates a decrease in one variable as the other one increases.
Thus, for example, as a school's population becomes poorer in the overall data set of all
RI schools (e.g., an increase in the numbers of students eligible for free and reduced
lunch), achievement tends to decline (decrease in scores).
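The roles of the intercept and the slope can be checked with a short calculation. The constants below are hypothetical values chosen only for illustration; they are not taken from the Rhode Island data.

```python
def predict(intercept, slope, x):
    """Value on the vertical axis given by the line y = a + bx."""
    return intercept + slope * x

# Hypothetical line with a positive slope: achievement score (y)
# against mother's education level (x).
a = 47.0  # intercept: the value of y where x is zero
b = 5.2   # slope: change in y for a one-unit increase in x

at_zero = predict(a, b, 0)
one_unit_change = predict(a, b, 4) - predict(a, b, 3)

# Hypothetical line with a negative slope: achievement score (y)
# against percent of students eligible for free and reduced lunch (x).
ten_point_change = predict(80.0, -0.3, 50) - predict(80.0, -0.3, 40)

print("prediction at x = 0:", at_zero)
print("change for one more unit of x:", round(one_unit_change, 2))
print("change for ten more points of x:", round(ten_point_change, 2))
```

The prediction at zero equals the intercept, a one-unit step in x changes the prediction by exactly the slope, and with a negative slope the prediction falls as x rises.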