National Profile on Alternate Assessments Based on Alternate Achievement Standards

NCSER 2009-3014
August 2009

Table 1.  Standard-setting methodologies, summary descriptions, and test formats that work with each methodology


Each entry below gives the methodology, a summary description, and the test formats that work with that methodology.
1. Modified Angoff. Panelists estimate the percentage of minimally proficient students at each performance level who are expected to answer each test item correctly (or be able to do it); these individual estimates are summed to produce an overall percentage of items correct that corresponds to the cut score for that level. Test formats: assessments with multiple items that are scored right/wrong.
2. Extended Angoff. Intended for open-ended items scored with a multiple-point rubric. Panelists determine the mean score that 100 minimally proficient students at each performance level would receive on each item; summing the estimates across items produces the cut score. Test formats: assessments with open-ended items.
3. Yes/No Method. Rather than estimating a percentage, panelists simply determine whether or not a borderline student would be likely to answer each test item correctly (or be able to do it). Summing the number of "yesses" across items produces the cut score. Test formats: assessments that include items scored right/wrong, or checklists.
4. Bookmark or Item Mapping. Panelists mark the spot in a specially constructed test booklet (arranged in order of item difficulty) where minimally proficient (or advanced) students would be able to answer the items occurring before that spot correctly with a certain probability. Test formats: assessments with multiple items that are scored right/wrong or with short rubrics.
5. Performance Profile Method. Panelists mark the spot in a specially constructed booklet of score profiles (arranged from lowest to highest total points) that designates performance sufficient to be classified as proficient. Each score profile uses a pictorial bar graph to display the student's performance on each task of the assessment, and two to five profiles are shown for each raw score point. Test formats: assessments containing open-ended items, usually performance tasks, where it is difficult to provide samples of student work.
6. Reasoned Judgment. Panelists divide a score scale (e.g., 32 points) into a desired number of categories (e.g., four) in some way (equally, with larger categories in the middle, etc.) based on expert judgment. Test formats: assessments that result in one overall score.
7. Judgmental Policy Capturing. Panelists determine which of the various components of an overall assessment are more important than others, so that components or types of evidence are weighted accordingly. Test formats: assessments that contain multiple components.
8. Body of Work. Panelists examine all of the data for a student and use this information to place the student in one of the overall performance levels. Standard setters are given a set of papers that demonstrate the complete range of possible scores, from low to high. Test formats: assessments that consist primarily of performance tasks or one general body of evidence, such as a portfolio.
9. Contrasting Groups. Teachers separate students into groups based on their observations of the students in the classroom; the scores of these students are then analyzed to determine where future scores will be categorized. Test formats: because this method is not tied to the test, it works with almost any test that results in an overall score.
10. Item-Description Matching. Panelists determine what a student must know and be able to do to answer an item correctly, then match these item-response requirements to a performance level descriptor. As panelists match items to the descriptors, sequences of items emerge in which some items match more closely, and cut scores are determined based on these patterns. Test formats: assessments that include dichotomously and polytomously scored items.
11. Dominant Profile Method. This method creates a set of decision rules for tests scored on several dimensions, such as performance, progress, generalization, and complexity: the rules for the cut score describe whether a minimum score is needed on each dimension, on the total test, or on some combination. Panelists must state exactly whether a high score on one dimension can compensate for a low score on another. The panelists' task is to become familiar with the meaning of each dimension and to specify rules for which combinations of scores on these dimensions represent acceptable performance and which do not. Test formats: tests that are scored on several dimensions, such as performance, progress, generalization, and complexity.
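The cut-score arithmetic behind the Angoff-family methods (entries 1 and 3 above) can be sketched in a few lines. This is an illustrative example, not code from the report; the function names and the sample ratings are hypothetical, and it assumes the common convention of averaging per-panelist cut scores across the panel.

```python
def modified_angoff_cut(ratings):
    """Modified Angoff: ratings[p][i] is panelist p's estimated probability
    that a minimally proficient student answers item i correctly. Each
    panelist's estimates are summed; the cut score is the panel mean."""
    per_panelist = [sum(r) for r in ratings]
    return sum(per_panelist) / len(per_panelist)

def yes_no_cut(judgments):
    """Yes/No method: judgments[p][i] is True if panelist p judges that a
    borderline student would answer item i correctly. Each panelist's cut
    is the count of 'yes' judgments; the final cut is the panel mean."""
    per_panelist = [sum(1 for j in row if j) for row in judgments]
    return sum(per_panelist) / len(per_panelist)

# Hypothetical panel of two raters on a four-item test
ratings = [
    [0.8, 0.6, 0.4, 0.9],  # panelist 1: sums to 2.7
    [0.7, 0.5, 0.5, 0.8],  # panelist 2: sums to 2.5
]
print(modified_angoff_cut(ratings))  # panel mean, about 2.6 items correct

judgments = [
    [True, True, False, True],   # panelist 1: 3 "yesses"
    [True, False, False, True],  # panelist 2: 2 "yesses"
]
print(yes_no_cut(judgments))  # panel mean of 2.5 items
```

The Extended Angoff method (entry 2) follows the same summing logic, with estimated rubric scores in place of probabilities.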
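The Contrasting Groups method (entry 9) locates a cut score empirically rather than through item judgments. A minimal sketch, assuming a simple decision rule not specified in the report: pick the candidate cut that misclassifies the fewest students relative to the teachers' classifications. All names and data here are hypothetical.

```python
def contrasting_groups_cut(not_proficient, proficient):
    """Given test scores for two teacher-classified groups, return the
    candidate cut score that misclassifies the fewest students, where
    students scoring at or above the cut are called proficient."""
    candidates = sorted(set(not_proficient + proficient))
    best_cut, best_errors = None, None
    for cut in candidates:
        errors = (sum(1 for s in not_proficient if s >= cut)
                  + sum(1 for s in proficient if s < cut))
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut

# Hypothetical classroom data: one low scorer was judged proficient,
# and one high scorer was not, so the groups overlap.
cut = contrasting_groups_cut([10, 12, 15, 18], [20, 22, 25, 17])
print(cut)  # 17: only one student on each side is misclassified in total
```

In practice, standard-setting studies often smooth the two score distributions or weight the two error types differently before choosing the cut; the exhaustive search above is only the simplest version of the idea.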