An Item Response Theory (IRT) approach to check correspondence between cut-off scores and maximal test information in French pilot selection

Nadine Matton (IHM Aero - ENAC - Ecole Nationale de l'Aviation Civile), Michel Veldhuis (UT2 - PRES Université de Toulouse), Stéphane Vautier (UT2 - PRES Université de Toulouse)

Conference paper, HAL CCSD, 2010. https://hal-enac.archives-ouvertes.fr/hal-01022484

Cognitive ability test scores are widely used in selection procedures in aviation for hiring pilot trainees or ATC trainees (e.g., Damos, 1996; Burke, Hobson & Linsky, 1997; Martinussen & Torjussen, 1998; Sommer, Olbrich & Arendasy, 2004; Matton, Vautier & Raufaste, 2009). In Europe, the underlying theory implicitly used in this context is Classical Test Theory (CTT; Gulliksen, 1950; Lord & Novick, 1968). Following this theory, the observed score variable (Y), usually the sum of elementary scores on each item, is construed as the sum of a true score variable (T) and an error variable (E): Y = T + E. In CTT, measurement precision is generally assessed through reliability indexes. Considering the scores of a group of participants, the reliability of a test is defined as the proportion of true variance (var(T)) to observed variance (var(Y)). Reliability cannot be computed directly (as T is a latent variable) and can only be estimated under certain hypotheses (e.g., when two tests are assumed to be parallel, reliability can be estimated as the correlation between the two score variables). Moreover, in CTT, reliability is assumed to be constant whatever the score level. A more modern psychometric theory, item response theory (IRT; Rasch, 1960; Birnbaum, 1968), focuses on the response to each item instead of on the test as a whole.
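The CTT decomposition above can be illustrated with a small simulation. The sketch below (hypothetical values; not from the paper) draws a common true score T for each simulated examinee, builds two parallel forms Y1 and Y2 by adding independent errors of equal variance, and checks that their correlation approximates the theoretical reliability var(T) / var(Y):

```python
import random

random.seed(0)

N = 5000
# True scores T, and two parallel observed scores Y1 = T + E1, Y2 = T + E2
# (same true score, independent errors with equal variance), per CTT's Y = T + E.
T = [random.gauss(0, 1) for _ in range(N)]
Y1 = [t + random.gauss(0, 0.5) for t in T]
Y2 = [t + random.gauss(0, 0.5) for t in T]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sxy / (sx * sy)

# Theoretical reliability: var(T) / var(Y) = 1 / (1 + 0.5**2) = 0.8.
# The correlation between the two parallel forms estimates it.
print(round(pearson(Y1, Y2), 2))
```

With these (arbitrary) variances the correlation should land near 0.8, illustrating why parallel-form correlations serve as reliability estimates when T itself is unobservable.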
Furthermore, measurement precision is assessed by the level of information provided by each item, with the idea that the degree of information depends on the level of the respondent's ability, defined as the latent psychological dimension assessed by the test. The key idea in IRT is that the probability of success on an item depends on the respondent's level of ability. Generally, IRT models assume an S-shaped relationship (see Figure 1, left panel) governed by one, two, or three parameters that characterize the item (e.g., difficulty and discrimination parameters). Classically, the difficulty corresponds to the location of the inflexion point of the curve (the further this point lies to the left on the theta axis, the easier the item) and the discrimination corresponds to the steepness of the curve at the inflexion point (the steeper the curve, the more discriminant the item). The information given by an item is defined as the precision of measurement of the estimated ability and depends on the item parameters as well as on the level of ability (see Figure 1, right panel). It is also inversely related to the standard error of the ability estimate.
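These ideas can be sketched numerically with a two-parameter logistic (2PL) model, one common instance of the S-shaped curves described above (the paper itself does not fix a specific model here; the parameter values below are illustrative). The item characteristic curve gives the success probability, and the item information I(theta) = a² · P · (1 − P) peaks where ability matches the item's difficulty:

```python
import math

def icc_2pl(theta, a, b):
    """Success probability under a 2PL model:
    a = discrimination (slope at the inflexion point),
    b = difficulty (location of the inflexion point on the theta axis)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: I(theta) = a**2 * P * (1 - P).
    Maximal at theta = b, shrinking as ability moves away from difficulty."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Illustrative item: discrimination 1.5, difficulty 0.
a, b = 1.5, 0.0
for theta in (-2, -1, 0, 1, 2):
    info = item_information(theta, a, b)
    # Standard error of the ability estimate is inversely related
    # to information: SE = 1 / sqrt(I(theta)).
    se = 1.0 / math.sqrt(info)
    print(theta, round(info, 3), round(se, 3))
```

Printing information and standard error across ability levels makes the inverse relation concrete: the item measures most precisely (smallest SE) for respondents whose ability sits near its difficulty.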