Scoring the Exams
Scoring the exams that lead to the CHRP designation, the National Knowledge Exam® (NKE) and the National Professional Practice Assessment® (NPPA), has two aspects: (1) setting the standard, and (2) removing non-performing items.
Setting the standard
The October 2009 administration of the exams introduced the use of Angoff panels to set the cut-score on each exam. The Angoff method uses the combined judgment of panel members to estimate, for each question, the probability that a candidate at the threshold of competence would answer it correctly.
The table below gives an example of what Angoff panel data look like and how the cut-score is arrived at. The data in the table are fictitious and, for this example, represent the items on an NKE exam only. The process is the same for the NPPA; however, 60 items would be reviewed as opposed to the 150 in the example below. After all panel members have reviewed all questions, the probabilities are averaged across panel members to arrive at an average probability for each item. Summing the average probabilities across all questions gives the proposed cut-score for the whole test.
| | Judge 1 | Judge 2 | Judge 3 | Judge 4 | Judge 5 | Average (across judges) | Standard deviation (across judges) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Question 1 | .75 | .75 | .80 | .65 | .70 | 0.73 | .057 |
| Question 2 | .65 | .70 | .75 | .65 | .80 | 0.71 | .065 |
| Question 3 | .70 | .65 | .60 | .65 | .65 | 0.65 | .035 |
| Question 4 | .65 | .75 | .65 | .70 | .60 | 0.67 | .057 |
| Question 5 | .55 | .50 | .45 | .65 | .55 | 0.54 | .074 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| Question 146 | .80 | .80 | .80 | .70 | .60 | 0.74 | .089 |
| Question 147 | .80 | .75 | .70 | .55 | .65 | 0.69 | .096 |
| Question 148 | .55 | .60 | .65 | .65 | .45 | 0.58 | .084 |
| Question 149 | .65 | .65 | .70 | .75 | .65 | 0.68 | .045 |
| Question 150 | .65 | .70 | .65 | .65 | .55 | 0.64 | .055 |
| Passing score | 101.25 | 102.75 | 101.25 | 99 | 93 | 99.45 | 3.846 |
In the above example, the cut-score for the NKE would be set at 99. Expressed as a percent, the cut-score would be 66%.
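The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration using the fictitious figures above, not the scoring software actually used; the function names are hypothetical, and rounding the summed average to the nearest whole number is an assumption inferred from the example (99.45 becomes 99). Because only ten of the 150 rows are shown, the item-level sum covers those ten rows only, and the full-exam figure is taken from the per-judge totals in the "Passing score" row (averaging the judges' totals gives the same result as summing the item averages).

```python
from statistics import mean

# Fictitious ratings for the ten questions visible in the example table.
# Each list holds the five judges' probability estimates for one question.
ratings = {
    "Question 1": [0.75, 0.75, 0.80, 0.65, 0.70],
    "Question 2": [0.65, 0.70, 0.75, 0.65, 0.80],
    "Question 3": [0.70, 0.65, 0.60, 0.65, 0.65],
    "Question 4": [0.65, 0.75, 0.65, 0.70, 0.60],
    "Question 5": [0.55, 0.50, 0.45, 0.65, 0.55],
    "Question 146": [0.80, 0.80, 0.80, 0.70, 0.60],
    "Question 147": [0.80, 0.75, 0.70, 0.55, 0.65],
    "Question 148": [0.55, 0.60, 0.65, 0.65, 0.45],
    "Question 149": [0.65, 0.65, 0.70, 0.75, 0.65],
    "Question 150": [0.65, 0.70, 0.65, 0.65, 0.55],
}


def item_averages(ratings):
    """Average the judges' probabilities for each question."""
    return {question: mean(values) for question, values in ratings.items()}


def proposed_cut_score(ratings):
    """Sum the per-question averages to get the (unrounded) cut-score."""
    return sum(item_averages(ratings).values())


# Sum over the ten visible rows only (the other 140 rows are not shown).
partial_sum = proposed_cut_score(ratings)

# Full-exam figure, taken from the "Passing score" row of the table.
judge_totals = [101.25, 102.75, 101.25, 99.0, 93.0]
unrounded_cut = mean(judge_totals)           # 99.45 in the example
cut_score = round(unrounded_cut)             # assumption: round to nearest
cut_percent = round(100 * cut_score / 150)   # NKE example has 150 items

print(cut_score, cut_percent)
```

For the NPPA the same functions would apply with 60 items in place of 150.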
Because the cut-score for the test is derived by adding the probabilities for each question, the cut-score will vary depending on the particular set of items that make up the test. Each version of each exam will have its own cut-score. An exam that is made up of somewhat more difficult questions will have a somewhat lower cut-score; an exam that is made up of somewhat easier questions will have a somewhat higher cut-score. Angoff panels are convened before each exam.
Post-exam review
The exams are scored using a two-pass process. Although all test items are carefully written and selected, it happens that some test items do not perform as expected. In a first pass, potentially flawed items are identified based on statistical criteria. These items are reviewed and some may be discarded. In a second pass, the final scores and cut-score are recalculated on the basis of the retained items.
The statistics calculated for each item include difficulty and discrimination indices for each option. In addition, difficulty indices are calculated for candidates at varying levels of overall exam performance, as well as for each linguistic version of the exam.
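As a rough illustration of what such item statistics involve, the sketch below computes a difficulty index (the proportion of candidates answering correctly) and one common discrimination index, the point-biserial correlation between getting the item right and the total exam score. The text does not say which discrimination index or which flagging criteria are actually used, so the choice of point-biserial, the function names, and the thresholds in `is_flagged` are all assumptions made for the example. The same calculation could be run per option by replacing the correct/incorrect indicator with a chose-this-option indicator.

```python
from math import sqrt


def difficulty(item_correct):
    """Proportion of candidates who answered the item correctly (0/1 list)."""
    return sum(item_correct) / len(item_correct)


def discrimination(item_correct, total_scores):
    """Point-biserial correlation between item correctness and total score."""
    n = len(item_correct)
    p = difficulty(item_correct)
    if p == 0.0 or p == 1.0:
        return 0.0  # everyone right or everyone wrong: no discrimination
    mean_total = sum(total_scores) / n
    sd_total = sqrt(sum((t - mean_total) ** 2 for t in total_scores) / n)
    if sd_total == 0.0:
        return 0.0
    right = [t for c, t in zip(item_correct, total_scores) if c]
    wrong = [t for c, t in zip(item_correct, total_scores) if not c]
    mean_right = sum(right) / len(right)
    mean_wrong = sum(wrong) / len(wrong)
    return (mean_right - mean_wrong) / sd_total * sqrt(p * (1 - p))


def is_flagged(item_correct, total_scores, min_p=0.20, max_p=0.95, min_disc=0.10):
    """Flag an item for review. Thresholds are illustrative assumptions only."""
    p = difficulty(item_correct)
    d = discrimination(item_correct, total_scores)
    return p < min_p or p > max_p or d < min_disc


# Made-up data: six candidates, their total scores, and two items.
totals = [90, 80, 70, 60, 50, 40]
good_item = [1, 1, 1, 0, 0, 0]     # stronger candidates get it right
suspect_item = [0, 0, 0, 1, 1, 1]  # weaker candidates get it right

print(difficulty(good_item), discrimination(good_item, totals))
print(is_flagged(good_item, totals), is_flagged(suspect_item, totals))
```

An item on which weaker candidates outperform stronger ones produces a negative discrimination value, which is the kind of pattern that can point to a mis-keyed or ambiguous item.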
All statistically flagged items are re-reviewed by the Exam Board, which makes the final decision on whether to include each item. There are various reasons why items may fail to perform as expected: items may inadvertently have no correct answer or more than one correct answer, may not be at the appropriate level, or may be ambiguous in some other way. Sometimes items are found to be mis-keyed; in such cases the item is re-keyed and the item statistics are recalculated.
Final scoring
All items that the Exam Board identifies as non-performing are deleted from the final scoring. The Angoff panel information for each deleted item is also discarded. In the second pass, each candidate's score is recalculated omitting the discarded items. This score is compared to the Angoff panel cut-score recommendation, which also omits the discarded items. Scores at or above the threshold are given a pass.
For example, suppose that, for the data in the NKE example table above, Questions 2 and 149 had been deemed flawed and consequently discarded. That exam would now be scored out of 148 and the cut-score would be 98. Expressed as a percent, the cut-score would still be 66%.
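The adjustment in this example can be reproduced with a short sketch. It assumes, as in the example table, an unrounded cut-score of 99.45 and Angoff averages of 0.71 and 0.68 for Questions 2 and 149; the function names are hypothetical, and rounding to the nearest whole number is an assumption inferred from the figures given rather than a documented rule.

```python
def adjusted_cut_score(unrounded_cut, discarded_item_averages, total_items):
    """Remove discarded items' Angoff averages and recompute the cut-score.

    Returns (cut_score, retained_items, cut_percent).
    """
    retained_items = total_items - len(discarded_item_averages)
    raw_cut = unrounded_cut - sum(discarded_item_averages)
    cut_score = round(raw_cut)  # assumption: round to nearest whole number
    cut_percent = round(100 * cut_score / retained_items)
    return cut_score, retained_items, cut_percent


def rescore(answers_correct, discarded):
    """Count a candidate's correct answers on retained items only.

    answers_correct maps question number -> True/False.
    """
    return sum(
        1 for question, correct in answers_correct.items()
        if correct and question not in discarded
    )


def passes(score, cut_score):
    """Scores at or above the cut-score are given a pass."""
    return score >= cut_score


# Questions 2 and 149 discarded, with averages 0.71 and 0.68 from the table.
cut, retained, percent = adjusted_cut_score(99.45, [0.71, 0.68], 150)
print(cut, retained, percent)

# Made-up candidate who answered questions 1 to 100 correctly.
candidate = {q: q <= 100 for q in range(1, 151)}
score = rescore(candidate, discarded={2, 149})
print(score, passes(score, cut))
```

Note that the made-up candidate loses one point (Question 2) when the discarded items are removed, while the cut-score drops from 99 to 98, so the comparison is always made on the same set of retained items.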
Exam Time Limit
The time limit allowed for the exams assumes all items are performing. Therefore, writers are not disadvantaged by having items removed from the exam after it has been written. For example, when the CCHRA added five pre-test items to the NPPA, additional time was allotted to account for the time it would take for exam writers to complete the additional questions. As with non-performing items, pre-test questions are not counted towards an exam writer’s final score.