AIMSweb National Norms Technical Information

The National Norms that AIMSweb is introducing in the fall of 2011 were developed in order to provide benchmark-measure norms that reflect the performance of the national student population at grades K through 8. Simultaneously, AIMSweb is implementing the midinterval method of calculating percentiles for all norms, including local norms as well as the National Norms. The midinterval method is a superior technique for describing where a raw score falls in a norm-sample distribution.

National Norms are offered for English-language measures in reading, math, and language arts at the standard grade levels and seasons at which these measures are used for benchmarking, as summarized in Table 1. Currently, National Norms are not available for Spanish-language measures or at preschool or high-school levels because nationally representative samples have not yet been collected for those measures or grade levels.

Table 1. Measures and grade levels/seasons for National Norms

image111.gif

Method

Data for the National Norms were selected from the AIMSweb database, which includes all of the measure scores entered into the system by AIMSweb users. The sampling unit was the grade level of a school.

In order for a norm sample to accurately represent the student population, it must not only be appropriately stratified along demographic variables, but it must also represent the full range of student performance. Although many schools administer AIMSweb benchmark measures as universal screeners, some schools assess only a subset of their student population, such as those considered to be at risk of not achieving a favorable academic outcome. Thus, drawing a sample without regard to whether a school had screened universally would likely overrepresent lower-achieving students. For this reason, the data for each National Norm sample came only from schools that had conducted universal screening.

In order to identify schools that had done universal screening with a measure at a grade, the AIMSweb team compared the number of benchmark administrations (at each season) with the total number of students enrolled in that grade, as reported by the National Center for Education Statistics (NCES). If at least 95% of enrolled students had taken the AIMSweb benchmark measure, then that grade at that school was considered eligible for inclusion in the National Norms sample for that measure. If fewer than 95% of the students at that grade had scores on the measure, then data from that grade at that school were not eligible for inclusion. That is, the scores from a grade at a school were treated as eligible or ineligible as a complete set.
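The 95% eligibility rule can be sketched in a few lines of Python. This is only an illustration, not the AIMSweb team's actual procedure: the function name, the dictionary layout, and the choice to require the threshold at every season are assumptions made for the example.

```python
def is_eligible(n_tested_by_season, enrollment, threshold=0.95):
    """Decide whether one grade at one school counts as universally screened.

    n_tested_by_season: dict mapping a season name to the number of
        benchmark administrations of the measure (hypothetical layout).
    enrollment: number of students enrolled in that grade (e.g., from NCES).
    threshold: minimum share of enrolled students who must have been tested.

    Assumption: the threshold must be met at every season supplied.
    """
    if enrollment <= 0:
        return False
    return all(n / enrollment >= threshold for n in n_tested_by_season.values())


# Hypothetical counts for a grade with 100 enrolled students.
screened = is_eligible({"fall": 98, "winter": 97, "spring": 96}, 100)
partial = is_eligible({"fall": 98, "winter": 60, "spring": 96}, 100)
```

Because the function returns a single yes/no answer for the whole grade, it mirrors the rule that scores from a grade at a school are eligible or ineligible as a complete set.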

As a further requirement, only the scores from students who had taken the AIMSweb measure at all three benchmarking periods during the year were retained. (Written Expression was an exception at Grades 1, 6, 7, and 8.) This simplified the sampling process without sacrificing a significant number of cases, and it strengthened some analyses and interpretations of the data by reducing variations across benchmark periods due to sampling differences.
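As a rough sketch, the three-period requirement amounts to grouping scores by student and keeping only students with a score at every period. The record layout, the season labels, and the function name below are hypothetical and chosen only for illustration.

```python
SEASONS = ("fall", "winter", "spring")  # assumed labels for the three periods


def students_with_all_periods(records):
    """Keep only students who have a score at every benchmark period.

    records: iterable of (student_id, season, score) tuples (hypothetical layout).
    Returns a dict mapping student_id to {season: score} for complete cases.
    """
    by_student = {}
    for student_id, season, score in records:
        by_student.setdefault(student_id, {})[season] = score
    return {
        student_id: scores
        for student_id, scores in by_student.items()
        if all(season in scores for season in SEASONS)
    }


# Made-up example: student "s2" is missing a spring score and is dropped.
example_records = [
    ("s1", "fall", 40), ("s1", "winter", 52), ("s1", "spring", 61),
    ("s2", "fall", 35), ("s2", "winter", 44),
]
complete = students_with_all_periods(example_records)
```

Since the same students then appear at every period, differences between fall, winter, and spring distributions cannot be caused by changes in who was tested.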

From the eligible data at a grade, the AIMSweb team selected the final norm sample to match the national student population by gender, ethnicity, and socioeconomic status (free/reduced lunch), according to demographic targets based on NCES data. Region was not included as a stratification variable because a preponderance of AIMSweb data meeting the inclusion criteria came from the Midwest and South regions, and the AIMSweb team judged that it was better to have larger samples stratified by gender, ethnicity, and SES than smaller samples that were also stratified by region. At each grade, the NCES-provided percentage of a school’s students in a demographic category (e.g., African American) was multiplied by the number of AIMSweb scores to yield an estimated number of students in that demographic category. The norm sample for each measure at each grade was constructed to be as large as possible while maintaining a close correspondence to national demographic percentages.
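The demographic estimate described above is simple arithmetic: a school's percentage in a category, multiplied by the number of AIMSweb scores from that school and grade. The sketch below shows that step and then pools the estimates across units so they can be compared with national targets. All names and numbers are invented for the example, and the pooling step is an assumption about how such estimates might be combined, not a description of the actual selection procedure.

```python
def estimated_category_counts(n_scores, category_percentages):
    """Estimate students per demographic category for one school grade.

    n_scores: number of AIMSweb scores from that grade at that school.
    category_percentages: dict of category -> school-level percentage (0-100).
    """
    return {cat: n_scores * pct / 100.0 for cat, pct in category_percentages.items()}


def pooled_percentages(units):
    """Pool estimated counts over several school-grade units.

    units: list of (n_scores, category_percentages) pairs (hypothetical layout).
    Returns category -> percentage of the pooled sample.
    """
    total = sum(n for n, _ in units)
    pooled = {}
    for n, pcts in units:
        for cat, count in estimated_category_counts(n, pcts).items():
            pooled[cat] = pooled.get(cat, 0.0) + count
    return {cat: 100.0 * count / total for cat, count in pooled.items()}


# Made-up units: 80 scores from one school, 20 from another.
units = [
    (80, {"group_a": 25, "group_b": 75}),
    (20, {"group_a": 75, "group_b": 25}),
]
sample_mix = pooled_percentages(units)
```

Comparing a pooled mix like this against the NCES-based targets is one way to check how closely a candidate sample tracks national percentages as units are added or removed.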

For most measures, data from the 2009–2010 school year were sufficient to provide adequate sample sizes. However, for Spelling, Written Expression, and (at some grades) MAZE it was necessary to use data from 2007–2010 in order to obtain sufficiently large samples with good demographic representation.

Table 2 reports the number of students in the norm sample for each measure at each benchmark period, per grade. Table 3 reports the demographic characteristics of each norm sample.

Table 2. Number of cases (per benchmark period) in each norm sample

image112.gif

Table 3. Demographic representation of the norm samples (percentages)

image113.gif

image118.gif

image115.gif

Midinterval percentile norms were calculated separately for each norm sample. Only minor smoothing was applied to the growth curves across benchmark periods (mostly at the top or the bottom of the percentile range) because the samples are the same at all periods during the year and because the probes within a year are similar but not identical in difficulty.
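For readers unfamiliar with the term, a midinterval percentile is commonly defined as the percentage of the sample scoring below a given raw score plus half the percentage scoring exactly at it. The sketch below implements that common textbook definition. It is an assumption that this matches the AIMSweb computation in detail, and it omits the smoothing step mentioned above.

```python
def midinterval_percentile(score, sample):
    """Midinterval percentile rank of `score` within `sample`.

    Counts every case below the score, plus half of the cases tied with it,
    and expresses the result as a percentage of the sample size.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    below = sum(1 for s in sample if s < score)
    equal = sum(1 for s in sample if s == score)
    return 100.0 * (below + 0.5 * equal) / n


# Made-up norm sample of five raw scores.
norm_sample = [10, 20, 20, 30, 40]
rank_20 = midinterval_percentile(20, norm_sample)  # 1 below, 2 tied
rank_30 = midinterval_percentile(30, norm_sample)  # 3 below, 1 tied
```

Splitting tied cases in half places a score at the middle of its own interval instead of at its lower or upper edge, which matters most for measures where many students earn the same raw score.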

Results

The raw-score distributions in the National Norms samples tend to be about the same as, or slightly higher than, the Aggregate score distributions. This is consistent with expectation because the Aggregate sample, which includes all AIMSweb users, has a higher proportion of low-achieving students than the National Norms sample, which includes only schools doing universal screening. Table 4 compares raw-score means and standard deviations in the National Norm and Aggregate samples for each measure and grade.

Table 4. Means and standard deviations of raw scores in the Aggregate and National Norm samples, by measure and grade

image116.gif

image117.gif

Because of these differences in score distributions, AIMSweb users will find that the percentile corresponding to a given raw score tends to be slightly lower with the National Norms than with the previous Aggregate Norms.