A typical exam

I will use Excel and Lertap to discuss a typical university exam developed and used in Australia.


It had 88 multiple-choice items. Coefficient alpha reliability was only 0.74. This is a low value: if the test were used to make pass/fail decisions about students, it would not do a good job.
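To make the reliability figure concrete, here is a minimal sketch of how coefficient alpha can be computed from a matrix of item scores, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the tiny data set are hypothetical and for illustration only; they are not the exam's data, and Lertap's own calculation may differ in detail.

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """Coefficient alpha for a score matrix.

    scores: list of rows, one row per student, each row a list of
    item scores (1 = right, 0 = wrong for multiple-choice items).
    """
    k = len(scores[0])  # number of items
    # Variance of each item across students
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the students' total test scores
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 students, 4 items (NOT the exam discussed here)
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(scores)
print(round(alpha, 2))
```

The same kind of variance (population here) is used for the items and for the totals, so the ratio comes out the same as it would with sample variances.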


An item analysis indicated that a number of items had distractors that did not work as intended.




The desired response pattern for a good item is shown above.


The top students (Grp1) got the item right, but most of the weaker students did not.


The top students (Grp1) did not choose the distractors.


In the bottom group (Grp3), about 40% chose distractor 1, and 20% chose distractor 2.


Only about 20% of the weakest students got this item correct.
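The group percentages described above can be tabulated with a short script. This is a rough sketch only: it assumes students are split into three equal-sized groups by total test score, which may not match the grouping rule Lertap uses. The function name, the option letters A to D (with C as the keyed answer), and the data are all hypothetical.

```python
from collections import Counter


def option_proportions_by_group(responses, totals, n_groups=3):
    """Proportion of each score group choosing each option on one item.

    responses: option chosen by each student on the item
    totals:    total test score of each student (same order)
    """
    # Student indices ordered from highest to lowest total score
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    size = len(order) // n_groups
    options = sorted(set(responses))
    table = {}
    for g in range(n_groups):
        start = g * size
        # The last group takes any leftover students
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        members = order[start:end]
        counts = Counter(responses[i] for i in members)
        table[f"Grp{g + 1}"] = {opt: counts[opt] / len(members) for opt in options}
    return table


# Hypothetical data: 15 students, one item, keyed answer "C"
totals = [80, 78, 75, 74, 72, 60, 58, 57, 55, 53, 40, 38, 35, 33, 30]
responses = ["C", "C", "C", "C", "C",
             "C", "A", "C", "B", "C",
             "A", "A", "B", "C", "D"]

table = option_proportions_by_group(responses, totals)
for group, props in table.items():
    print(group, props)
```

In this made-up example every Grp1 student picks the keyed answer, while in Grp3 the distractors A and B draw 40% and 20% of the students and only 20% answer correctly, which is the kind of pattern a well-functioning item shows.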


In the Physical Therapy test I will show, many items had poor response plots.


It takes skill to write good multiple-choice items!