The longitudinal stability of rating characteristics in an EFL examination: methodological and substantive considerations
Peer reviewed, Journal article
Accepted version
Date
2020-08-17
Original version
https://doi.org/10.1177/0265532220940960
Abstract
This longitudinal study (2002-2014) investigates the stability of rating characteristics of a large group of raters over time in the context of the writing paper of a national high-stakes examination. The study uses one measure of rater severity and two measures of rater consistency. The results suggest that the rating characteristics of individual raters are not stable, so predictions from one administration to the next are difficult, although not impossible. In fact, as the membership of the group of raters changes from year to year, past data on rating characteristics become less useful. When the membership of the group of raters is retained, the community of raters develops more stable characteristics. However, ‘cultural shocks’ (low retention of raters and large numbers of newcomers) destabilize the rating characteristics of the community and make predictions more difficult. We propose practical measures to increase the stability of rating across time and offer methodological suggestions for more efficient research designs and analyses of rater effects.