Measuring cognitive variables: simple answers for thorny questions
Suppose a person learns to associate a particular sound with an arousing event. To quantify this learning, we measure their physiological arousal when they hear the sound – say, whether their hands become sweaty. We don’t want to mistake random fluctuations in sweat gland activity for evidence of learning, and so we only look at a specific time window after sound onset. But which time window should we use: 3.0 s, 3.5 s, or 4.0 s? All of these values seem reasonable – yet only one of them can be the best.
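As a toy illustration of the choice (not the analysis pipeline from the paper), the sketch below peak-scores a simulated arousal trace in three different post-stimulus windows. The sampling rate, response shape, noise level and scoring rule are all invented for the example.

```python
import numpy as np

def peak_score(trace, onset_s, window_s, fs):
    """Peak amplitude relative to the value at stimulus onset,
    within window_s seconds after onset; fs is the sampling rate in Hz."""
    start = int(onset_s * fs)
    stop = int((onset_s + window_s) * fs)
    return trace[start:stop].max() - trace[start]

rng = np.random.default_rng(0)
fs = 10.0                     # samples per second (assumed)
t = np.arange(0, 15, 1 / fs)  # 15 s of simulated recording
onset = 5.0                   # sound onset at 5 s

# Simulated response peaking ~3.5 s after the sound, plus measurement noise
response = 0.6 * np.exp(-((t - (onset + 3.5)) ** 2) / 2.0) * (t > onset)
trace = response + 0.05 * rng.standard_normal(t.size)

for window in (3.0, 3.5, 4.0):
    score = peak_score(trace, onset, window, fs)
    print(f"window {window:.1f} s -> score {score:.3f}")
```

Each window length yields a slightly different score; the question is which scoring rule best tracks the learning we set out to measure.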
Similar thorny issues pervade many fields of experimental psychology. How do we analyse reaction times? How do we measure declarative memory, attention, confidence, or physical attraction? For any of these constructs, multiple distinct measurement methods are in use at the same time – by different labs, or even within the same lab.
The idea we came up with builds on the psychometric concept of criterion validity. Going beyond psychometric tradition, we induce the criterion experimentally: we ran experiments in which we were fairly sure what the average person would learn, and tested how well a particular measurement method could recover this. We termed the resulting criterion “retrodictive validity”.
In our new paper in Nature Human Behaviour, we formally derive the conditions under which retrodictive validity is informative and can be generalised. If these conditions are met, then a measurement method can be calibrated in a simple experiment, and widely applied in completely different and more sophisticated experimental manipulations.
https://www.nature.com/articles/s41562-020-00976-8
Behavioural researchers often seek to experimentally manipulate, measure and analyse latent psychological attributes, such as memory, confidence or attention. The best measurement strategy is often difficult to intuit. Classical psychometric theory, mostly focused on individual differences in stable attributes, offers little guidance. Hence, measurement methods in experimental research are often based on tradition and differ between communities.
Here we propose a criterion, which we term ‘retrodictive validity’, that provides a relative numerical estimate of the accuracy of any given measurement approach. It is determined by performing calibration experiments to manipulate a latent attribute and assessing the correlation between intended and measured attribute values.
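The criterion itself is simple to compute. Below is a minimal simulation sketch, not code from the paper: the “intended” values are fixed by the calibration design, each hypothetical method is assumed to return a readout that is linear in the intended value plus independent noise, and retrodictive validity is the Pearson correlation between the two. Method names and noise levels are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
# Intended attribute values, fixed by the calibration design
# (e.g. half the trials are designed to induce learning, half are not)
intended = np.tile([0.0, 1.0], n // 2)

def noisy_readout(noise_sd):
    """A hypothetical measurement method: intended value plus independent noise."""
    return intended + noise_sd * rng.standard_normal(intended.size)

for name, noise_sd in [("method A", 0.5), ("method B", 1.5)]:
    measured = noisy_readout(noise_sd)
    r = np.corrcoef(intended, measured)[0, 1]
    print(f"{name}: retrodictive validity r = {r:.2f}")
```

Under these assumptions, the method with less measurement noise shows the higher correlation, and the two methods can be ranked on a common numerical scale.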
Our approach facilitates optimising measurement strategies and quantifying uncertainty in the measurement. Thus, it allows power analyses to define minimally required sample sizes. Taken together, our approach provides a metrological perspective on measurement practice in experimental research that complements classical psychometrics.
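As a rough illustration of the last point, the sketch below plugs an assumed retrodictive validity into a textbook normal-approximation power formula for a two-sample t-test. The “true” effect size, the attenuation model (observed effect = validity × true effect, which holds only for independent, additive measurement noise) and all numbers are assumptions for the example, not values from the paper.

```python
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided two-sample t-test."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2

true_d = 0.8                        # hypothetical effect on the latent attribute
for validity in (0.9, 0.6, 0.3):    # assumed retrodictive validity of the measure
    observed_d = validity * true_d  # attenuation under the simple noise-only model
    print(f"validity {validity:.1f}: about {n_per_group(observed_d):.0f} participants per group")
```

The point of the sketch is only that a noisier measurement method (lower retrodictive validity) attenuates the observable effect and therefore demands a larger sample for the same statistical power.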