Psychometrics is a field of study within psychology concerned with the theory and technique of measurement. Psychometrics generally covers specialized fields within psychology and education devoted to testing, measurement, assessment, and related activities.[1] Psychometrics is concerned with the objective measurement of latent constructs that cannot be directly observed. Examples of latent constructs include intelligence, introversion, mental disorders, and educational achievement.[2] The levels of individuals on nonobservable latent variables are inferred through mathematical modeling based on what is observed from individuals' responses to items on tests and scales.[2]
Practitioners are described as psychometricians, although not all who engage in psychometric research go by this title. Psychometricians usually possess specific qualifications, such as degrees or certifications, and most are psychologists with advanced graduate training in psychometrics and measurement theory. In addition to traditional academic institutions, practitioners also work for organizations such as the Educational Testing Service and Psychological Corporation. Some psychometric researchers focus on the construction and validation of assessment instruments, including surveys, scales, and open- or close-ended questionnaires. Others focus on research relating to measurement theory (e.g., item response theory, intraclass correlation) or specialize as learning and development professionals.
Historical foundation
Psychological testing has come from two streams of thought: the first, from Darwin, Galton, and Cattell, on the measurement of individual differences, and the second, from Herbart, Weber, Fechner, and Wundt and their psychophysical measurements of a similar construct. The second set of individuals and their research is what has led to the development of experimental psychology and standardized testing.[3]
Victorian stream
Charles Darwin was the inspiration behind Francis Galton, a scientist who advanced the development of psychometrics. In 1859, Darwin published his book On the Origin of Species. Darwin described the role of natural selection in the emergence, over time, of different populations of species of plants and animals. The book showed how individual members of a species differ among themselves and how they possess characteristics that are more or less adaptive to their environment. Those with more adaptive characteristics are more likely to survive to procreate and give rise to another generation. Those with less adaptive characteristics are less likely. These ideas stimulated Galton's interest in the study of human beings, how they differ from one another, and how to measure those differences.
Galton wrote a book entitled Hereditary Genius, which was first published in 1869. The book described different characteristics that people possess and how those characteristics make some more "fit" than others. Today these differences, such as sensory and motor functioning (reaction time, visual acuity, and physical strength), are important domains of scientific psychology. Much of the early theoretical and applied work in psychometrics was undertaken in an attempt to measure intelligence. Galton, often referred to as "the father of psychometrics," devised and included mental tests among his anthropometric measures. James McKeen Cattell, a pioneer in the field of psychometrics, went on to extend Galton's work. Cattell coined the term mental test, and is responsible for research and knowledge that ultimately led to the development of modern tests.[4]
German stream
The origin of psychometrics also has connections to the related field of psychophysics. Around the same time that Darwin, Galton, and Cattell were making their discoveries, Herbart was also interested in "unlocking the mysteries of human consciousness" through the scientific method.[4] Herbart was responsible for creating mathematical models of the mind, which were influential in educational practices for years to come.
E.H. Weber built upon Herbart's work and tried to prove the existence of a psychological threshold, saying that a minimum stimulus was necessary to activate a sensory system. After Weber, G.T. Fechner expanded upon the knowledge he gleaned from Herbart and Weber, to devise the law that the strength of a sensation grows as the logarithm of the stimulus intensity. A follower of Weber and Fechner, Wilhelm Wundt is credited with founding the science of psychology. It is Wundt's influence that paved the way for others to develop psychological testing.[4]
20th century
The psychometrician L. L. Thurstone, founder and first president of the Psychometric Society in 1936, developed and applied a theoretical approach to measurement referred to as the law of comparative judgment, an approach that has close connections to the psychophysical theory of Ernst Heinrich Weber and Gustav Fechner. In addition, Spearman and Thurstone both made important contributions to the theory and application of factor analysis, a statistical method developed and used extensively in psychometrics.[5] In the late 1950s, Leopold Szondi made a historical and epistemological assessment of the impact of statistical thinking on psychology during the previous few decades: "in the last decades, the specifically psychological thinking has been almost completely suppressed and removed, and replaced by a statistical thinking. Precisely here we see the cancer of testology and testomania of today."[6]
More recently, psychometric theory has been applied in the measurement of personality, attitudes, beliefs, and academic achievement. These latent constructs cannot be measured directly, and much of the research and science in this discipline has been developed in an attempt to estimate them as close to the true score as possible.
Figures who made significant contributions to psychometrics include Karl Pearson, Henry F. Kaiser, Carl Brigham, L. L. Thurstone, E. L. Thorndike, Georg Rasch, Eugene Galanter, Johnson O'Connor, Frederic M. Lord, Ledyard R Tucker, Louis Guttman, and Jane Loevinger.
Definition of measurement in the social sciences
The definition of measurement in the social sciences has a long history. A current widespread definition, proposed by Stanley Smith Stevens, is that measurement is "the assignment of numerals to objects or events according to some rule." This definition was introduced in a 1946 Science article in which Stevens proposed four levels of measurement.[7] Although widely adopted, this definition differs in important respects from the more classical definition of measurement adopted in the physical sciences, namely that scientific measurement entails "the estimation or discovery of the ratio of some magnitude of a quantitative attribute to a unit of the same attribute" (p. 358).[8]
Indeed, Stevens's definition of measurement was put forward in response to the British Ferguson Committee, whose chair, A. Ferguson, was a physicist. The committee was appointed in 1932 by the British Association for the Advancement of Science to investigate the possibility of quantitatively estimating sensory events. Although its chair and other members were physicists, the committee also included several psychologists. The committee's report highlighted the importance of the definition of measurement. While Stevens's response was to propose a new definition, which has had considerable influence in the field, this was by no means the only response to the report. Another, notably different, response was to accept the classical definition, as reflected in the following statement:
- Measurement in psychology and physics are in no sense different. Physicists can measure when they can find the operations by which they may meet the necessary criteria; psychologists have to do the same. They need not worry about the mysterious differences between the meaning of measurement in the two sciences (Reese, 1943, p. 49).[9]
These divergent responses are reflected in alternative approaches to measurement. For example, methods based on covariance matrices are typically employed on the premise that numbers, such as raw scores derived from assessments, are measurements. Such approaches implicitly entail Stevens's definition of measurement, which requires only that numbers are assigned according to some rule. The main research task, then, is generally considered to be the discovery of associations between scores, and of factors posited to underlie such associations.[10]
On the other hand, when measurement models such as the Rasch model are employed, numbers are not assigned based on a rule. Instead, in keeping with Reese's statement above, specific criteria for measurement are stated, and the goal is to construct procedures or operations that provide data that meet the relevant criteria. Measurements are estimated based on the models, and tests are conducted to ascertain whether the relevant criteria have been met.
Instruments and procedures
The first psychometric instruments were designed to measure intelligence.[11] One early approach to measuring intelligence was the test developed in France by Alfred Binet and Theodore Simon. That test was known as the Test Binet-Simon. The French test was adapted for use in the U.S. by Lewis Terman of Stanford University, and named the Stanford-Binet IQ test.
Another major focus in psychometrics has been on personality testing. There has been a range of theoretical approaches to conceptualizing and measuring personality, though there is no widely agreed upon theory. Some of the better-known instruments include the Minnesota Multiphasic Personality Inventory, the Five-Factor Model (or "Big 5") and tools such as Personality and Preference Inventory and the Myers–Briggs Type Indicator. Attitudes have also been studied extensively using psychometric approaches.[12] An alternative method involves the application of unfolding measurement models, the most general being the Hyperbolic Cosine Model (Andrich & Luo, 1993).[13]
Theoretical approaches
Psychometricians have developed a number of different measurement theories. These include classical test theory (CTT) and item response theory (IRT).[14][15] An approach that seems mathematically to be similar to IRT but also quite distinctive, in terms of its origins and features, is represented by the Rasch model for measurement. The development of the Rasch model, and the broader class of models to which it belongs, was explicitly founded on requirements of measurement in the physical sciences.[16]
Psychometricians have also developed methods for working with large matrices of correlations and covariances. Techniques in this general tradition include factor analysis,[17] a method of determining the underlying dimensions of data. One of the main challenges faced by users of factor analysis is a lack of consensus on appropriate procedures for determining the number of latent factors.[18] A usual procedure is to stop factoring when eigenvalues drop below one, because a factor with an eigenvalue below one accounts for less variance than a single standardized variable. The lack of agreed cutoff points also affects other multivariate methods.[19]
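The eigenvalue cutoff mentioned above can be illustrated with a minimal sketch. It assumes a small, made-up correlation matrix for four items and uses only the Python standard library; the helper names `symmetric_eigenvalues` and `kaiser_retained` are hypothetical, and in practice a numerical library routine would normally compute the eigenvalues. The sketch shows only the "eigenvalue greater than one" rule of thumb, not a full factor analysis.

```python
import math

def symmetric_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [list(row) for row in matrix]  # work on a copy
    for _ in range(max_sweeps):
        off_diagonal = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def kaiser_retained(correlation_matrix, cutoff=1.0):
    """Count the eigenvalues above the cutoff (the eigenvalue-greater-than-one rule)."""
    eigenvalues = symmetric_eigenvalues(correlation_matrix)
    return sum(1 for value in eigenvalues if value > cutoff), eigenvalues

# Hypothetical correlations: items 1-2 and items 3-4 form two clusters.
R = [
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]
n_factors, eigenvalues = kaiser_retained(R)
print(n_factors, [round(value, 3) for value in eigenvalues])
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue of one corresponds to the variance of one standardized item, which is why it serves as the conventional cutoff.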
Multidimensional scaling[20] is a method for finding a simple representation for data with a large number of latent dimensions. Cluster analysis is an approach to finding objects that are like each other. Factor analysis, multidimensional scaling, and cluster analysis are all multivariate descriptive methods used to distill from large amounts of data simpler structures.
More recently, structural equation modeling[21] and path analysis represent more sophisticated approaches to working with large covariance matrices. These methods allow statistically sophisticated models to be fitted to data and tested to determine if they are adequate fits. Because at a granular level psychometric research is concerned with the extent and nature of multidimensionality in each of the items of interest, a relatively new procedure known as bi-factor analysis[22][23][24] can be helpful. Bi-factor analysis can decompose "an item's systematic variance in terms of, ideally, two sources, a general factor and one source of additional systematic variance."[25]
Key concepts
Key concepts in classical test theory are reliability and validity. A reliable measure is one that measures a construct consistently across time, individuals, and situations. A valid measure is one that measures what it is intended to measure. Reliability is necessary, but not sufficient, for validity.
Both reliability and validity can be assessed statistically. Consistency over repeated measures of the same test can be assessed with the Pearson correlation coefficient, and is often called test-retest reliability.[26] Similarly, the equivalence of different versions of the same measure can be indexed by a Pearson correlation, and is called equivalent forms reliability or a similar term.[26]
Internal consistency, which addresses the homogeneity of a single test form, may be assessed by correlating performance on two halves of a test, which is termed split-half reliability; the value of this Pearson product-moment correlation coefficient for two half-tests is adjusted with the Spearman–Brown prediction formula to correspond to the correlation between two full-length tests.[26] Perhaps the most commonly used index of reliability is Cronbach's α, which is equivalent to the mean of all possible split-half coefficients. Other approaches include the intra-class correlation, which is the ratio of variance of measurements of a given target to the variance of all targets.
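The reliability indices described above can be sketched in a few lines. The example below assumes a small, invented score matrix (six respondents by four items) and an odd/even item split; the function names are illustrative rather than taken from any particular package. It computes the Pearson correlation between half-test scores, applies the Spearman–Brown adjustment for a doubled test length, and computes Cronbach's α from item and total-score variances.

```python
from statistics import mean, variance

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_brown(r_half, length_factor=2.0):
    """Predicted reliability when the test is lengthened by length_factor."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)

def split_half_reliability(scores):
    """Correlate odd-item and even-item half scores, then step up to full length."""
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    return spearman_brown(pearson(odd_half, even_half))

def cronbach_alpha(scores):
    """Cronbach's alpha from sample variances of the items and of the total score."""
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)

# Hypothetical data: each row is one respondent, each column one item (1-5 scale).
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [2, 1, 2, 1],
]
split_half = split_half_reliability(scores)
alpha = cronbach_alpha(scores)
print(round(split_half, 3), round(alpha, 3))
```

The same `pearson` function applied to scores from two administrations of one test, or from two parallel forms, would give the test-retest and equivalent-forms coefficients respectively.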
There are a number of different forms of validity. Criterion-related validity refers to the extent to which a test or scale predicts a sample of behavior, i.e., the criterion, that is "external to the measuring instrument itself."[27] That external sample of behavior can be many things including another test; college grade point average, as when the high school SAT is used to predict performance in college; and even behavior that occurred in the past, for example, when a test of current psychological symptoms is used to predict the occurrence of past victimization (which would accurately represent postdiction). When the criterion measure is collected at the same time as the measure being validated the goal is to establish concurrent validity; when the criterion is collected later the goal is to establish predictive validity. A measure has construct validity if it is related to measures of other constructs as required by theory. Content validity is a demonstration that the items of a test do an adequate job of covering the domain being measured. In a personnel selection example, test content is based on a defined statement or set of statements of knowledge, skill, ability, or other characteristics obtained from a job analysis.
Item response theory models the relationship between latent traits and responses to test items. Among other advantages, IRT provides a basis for obtaining an estimate of the location of a test-taker on a given latent trait as well as the standard error of measurement of that location. For example, a university student's knowledge of history can be deduced from his or her score on a university test and then be compared reliably with a high school student's knowledge deduced from a less difficult test. Scores derived by classical test theory do not have this characteristic, and actual ability (rather than ability relative to other test-takers) must be assessed by comparing scores to those of a "norm group" randomly selected from the population. In fact, all measures derived from classical test theory are dependent on the sample tested, while, in principle, those derived from item response theory are not.
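As a rough illustration of how an IRT model yields both a location estimate and its standard error, the sketch below uses the Rasch model, the simplest case, in which the probability of a correct response depends only on the difference between the person's ability and the item's difficulty. It assumes the item difficulties are already known from a prior calibration (the values here are invented) and that the response pattern contains at least one correct and one incorrect answer; `estimate_ability` is a hypothetical helper that applies Newton–Raphson maximum likelihood.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def estimate_ability(responses, difficulties, iterations=25):
    """Maximum-likelihood ability estimate and its standard error.

    responses: 0/1 item scores; difficulties: assumed known item locations.
    """
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(theta, b) for b in difficulties]
        gradient = sum(x - p for x, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        theta += gradient / information  # Newton-Raphson step
    probs = [rasch_probability(theta, b) for b in difficulties]
    information = sum(p * (1.0 - p) for p in probs)
    return theta, 1.0 / math.sqrt(information)

# Hypothetical calibrated difficulties (in logits) and one response pattern.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [1, 1, 1, 0, 0]
theta_hat, standard_error = estimate_ability(responses, difficulties)
print(round(theta_hat, 3), round(standard_error, 3))
```

Because ability and difficulty sit on the same scale, two test-takers who answered different sets of calibrated items can still be compared on `theta_hat`, which is the property the paragraph above contrasts with classical test theory scores.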
Standards of quality
The considerations of validity and reliability typically are viewed as essential elements for determining the quality of any test. However, professional and practitioner associations frequently have placed these concerns within broader contexts when developing standards and making overall judgments about the quality of any test as a whole within a given context. A consideration of concern in many applied research settings is whether or not the metric of a given psychological inventory is meaningful or arbitrary.[28]
Testing standards
In 2014, the American Educational Research Association (AERA), American Psychological Association (APA), and National Council on Measurement in Education (NCME) published a revision of the Standards for Educational and Psychological Testing,[29] which describes standards for test development, evaluation, and use. The Standards cover essential topics in testing including validity, reliability/errors of measurement, and fairness in testing. The book also establishes standards related to testing operations including test design and development, scores, scales, norms, score linking, cut scores, test administration, scoring, reporting, score interpretation, test documentation, and rights and responsibilities of test takers and test users. Finally, the Standards cover topics related to testing applications, including psychological testing and assessment, workplace testing and credentialing, educational testing and assessment, and testing in program evaluation and public policy.
Evaluation standards
In the field of evaluation, and in particular educational evaluation, the Joint Committee on Standards for Educational Evaluation[30] has published three sets of standards for evaluations. The Personnel Evaluation Standards[31] was published in 1988, The Program Evaluation Standards (2nd edition)[32] was published in 1994, and The Student Evaluation Standards[33] was published in 2003.
Each publication presents and elaborates a set of standards for use in a variety of educational settings. The standards provide guidelines for designing, implementing, assessing, and improving the identified form of evaluation.[34] Each of the standards has been placed in one of four fundamental categories to promote educational evaluations that are proper, useful, feasible, and accurate. In these sets of standards, validity and reliability considerations are covered under the accuracy topic. For example, the student accuracy standards help ensure that student evaluations will provide sound, accurate, and credible information about student learning and performance.
Controversy and criticism
Because psychometrics is based on latent psychological processes measured through correlations, there has been controversy about some psychometric measures.[35] Critics, including practitioners in the physical sciences, have argued that such definition and quantification is difficult, and that such measurements are often misused by laymen, such as with personality tests used in employment procedures. The Standards for Educational and Psychological Testing gives the following statement on test validity: "validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests".[36] Simply put, a test is not valid unless it is used and interpreted in the way it is intended.[37]
Two types of tools used to measure personality traits are objective tests and projective measures. Examples of such tests are the Big Five Inventory (BFI), Minnesota Multiphasic Personality Inventory (MMPI-2), Rorschach Inkblot test, Neurotic Personality Questionnaire KON-2006,[38] or Eysenck Personality Questionnaire. Some of these tests are helpful because they have adequate reliability and validity, two factors that make tests consistent and accurate reflections of the underlying construct. The Myers–Briggs Type Indicator (MBTI), however, has questionable validity and has been the subject of much criticism. Psychometric specialist Robert Hogan wrote of the measure: "Most personality psychologists regard the MBTI as little more than an elaborate Chinese fortune cookie."[39]
Lee Cronbach noted in American Psychologist (1957) that, "correlational psychology, though fully as old as experimentation, was slower to mature. It qualifies equally as a discipline, however, because it asks a distinctive type of question and has technical methods of examining whether the question has been properly put and the data properly interpreted." He would go on to say, "The correlation method, for its part, can study what man has not learned to control or can never hope to control... A true federation of the disciplines is required. Kept independent, they can give only wrong answers or no answers at all regarding certain important problems."[40]
Non-human: animals and machines
Psychometrics addresses human abilities, attitudes, traits, and educational evolution. Notably, the study of behavior, mental processes, and abilities of non-human animals is usually addressed by comparative psychology, or with a continuum between non-human animals and the rest of animals by evolutionary psychology. Nonetheless, there are some advocates for a more gradual transition between the approach taken for humans and the approach taken for (non-human) animals.[41][42][43][44]
The evaluation of abilities, traits and learning evolution of machines has been mostly unrelated to the case of humans and non-human animals, with specific approaches in the area of artificial intelligence. A more integrated approach, under the name of universal psychometrics, has also been proposed.[45][46]
See also
- Cattell–Horn–Carroll theory
- Classical test theory
- Computational psychometrics
- Concept inventory
- Cronbach's alpha
- Data mining
- Educational assessment
- Educational psychology
- Factor analysis
- Item response theory
- List of international databases on individual student achievement tests
- List of psychometric software
- List of schools for psychometrics
- Operationalisation
- Quantitative psychology
- Psychometric Society
- Psychological testing
- Rasch model
- Scale (social sciences)
- School counselor
- School psychology
- Standardized test
References
- "Glossary1". 22 July 2017. Archived from the original on 2017-07-22. Retrieved 28 June 2022.
- Tabachnick, B.G.; Fidell, L.S. (2001). Using Multivariate Analysis. Boston: Allyn and Bacon. ISBN 978-0-321-05677-1.
- Kaplan, R.M., & Saccuzzo, D.P. (2010). Psychological Testing: Principles, Applications, and Issues (8th ed.). Belmont, CA: Wadsworth, Cengage Learning.
- Kaplan, R.M., & Saccuzzo, D.P. (2010). Psychological Testing: Principles, Applications, and Issues (8th ed.). Belmont, CA: Wadsworth, Cengage Learning.
- Nunnally, J., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
- Leopold Szondi (1960). Das zweite Buch: Lehrbuch der Experimentellen Triebdiagnostik. Huber, Bern und Stuttgart, 2nd edition. Ch. 27, from the Spanish translation, B) II Las condiciones estadisticas, p. 396. Quotation:
  el pensamiento psicologico especifico, en las ultima decadas, fue suprimido y eliminado casi totalmente, siendo sustituido por un pensamiento estadistico. Precisamente aqui vemos el cáncer de la testología y testomania de hoy.
- Stevens, S. S. (7 June 1946). "On the Theory of Scales of Measurement". Science. 103 (2684): 677–680. Bibcode:1946Sci...103..677S. doi:10.1126/science.103.2684.677. PMID 17750512. S2CID 4667599.
- Michell, Joel (August 1997). "Quantitative science and the definition of measurement in psychology". British Journal of Psychology. 88 (3): 355–383. doi:10.1111/j.2044-8295.1997.tb02641.x.
- Reese, T.W. (1943). The application of the theory of physical measurement to the measurement of psychological magnitudes, with three experimental examples. Psychological Monographs, 55, 1–89. doi:10.1037/h0061367
- "Psychometrics". Assessmentpsychology.com. Retrieved 28 June 2022.
- Stern, Theodore A.; Fava, Maurizio; Wilens, Timothy E.; Rosenbaum, Jerrold F. (2016). Massachusetts General Hospital comprehensive clinical psychiatry (Second ed.). London. p. 73. ISBN 978-0323295079. Retrieved 31 October 2021.
- Longe, Jacqueline L., ed. (2022). The Gale Encyclopedia of Psychology. Vol. 2 (4th ed.). Farmington Hills, Michigan: Gale. p. 1000. ISBN 9780028683867.
- Andrich, D. & Luo, G. (1993). A hyperbolic cosine latent trait model for unfolding dichotomous single-stimulus responses. Applied Psychological Measurement, 17, 253–276.
- Embretson, S.E., & Reise, S.P. (2000). Item Response Theory for Psychologists. Mahwah, NJ: Erlbaum.
- Hambleton, R.K., & Swaminathan, H. (1985). Item Response Theory: Principles and Applications. Boston: Kluwer-Nijhoff.
- Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen, Danish Institute for Educational Research, expanded edition (1980) with foreword and afterword by B.D. Wright. Chicago: The University of Chicago Press.
- Thompson, B.R. (2004). Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications. American Psychological Association.
- Zwick, William R.; Velicer, Wayne F. (1986). "Comparison of five rules for determining the number of components to retain". Psychological Bulletin. 99 (3): 432–442. doi:10.1037/0033-2909.99.3.432.
- Singh, Manoj Kumar (2021-09-11). Introduction to Social Psychology. K.K. Publications.
- Davison, M.L. (1992). Multidimensional Scaling. Krieger.
- Kaplan, D. (2008). Structural Equation Modeling: Foundations and Extensions, 2nd ed. Sage.
- DeMars, C. E. (2013). A tutorial on interpreting bi-factor model scores. International Journal of Testing, 13, 354–378. http://dx.doi.org/10.1080/15305058.2013.799067
- Reise, S. P. (2012). The rediscovery of bi-factor modeling. Multivariate Behavioral Research, 47, 667–696. http://dx.doi.org/10.1080/00273171.2012.715555
- Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Evaluating bifactor models: Calculating and interpreting statistical indices. Psychological Methods, 21, 137–150. http://dx.doi.org/10.1037/met0000045
- Schonfeld, I.S., Verkuilen, J. & Bianchi, R. (2019). An exploratory structural equation modeling bi-factor analytic approach to uncovering what burnout, depression, and anxiety scales measure. Psychological Assessment, 31, 1073–1079. http://dx.doi.org/10.1037/pas0000721 p. 1075
- "Home – Educational Research Basics by Del Siegle". www.gifted.uconn.edu. 17 February 2015.
- Nunnally, J.C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
- Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. Archived 2006-05-10 at the Wayback Machine. American Psychologist, 61(1), 27–41.
- "The Standards for Educational and Psychological Testing". apa.org.
- "Joint Committee on Standards for Educational Evaluation". Archived from the original on 15 October 2009. Retrieved 28 June 2022.
- Joint Committee on Standards for Educational Evaluation. (1988). The Personnel Evaluation Standards: How to Assess Systems for Evaluating Educators. Archived 2005-12-12 at the Wayback Machine. Newbury Park, CA: Sage Publications.
- Joint Committee on Standards for Educational Evaluation. (1994). The Program Evaluation Standards, 2nd Edition. Archived 2006-02-22 at the Wayback Machine. Newbury Park, CA: Sage Publications.
- Committee on Standards for Educational Evaluation. (2003). The Student Evaluation Standards: How to Improve Evaluations of Students. Archived 2006-05-24 at the Wayback Machine. Newbury Park, CA: Corwin Press.
- E. Cabrera-Nguyen (2010). "Author guidelines for reporting scale development and validation results in the Journal of the Society for Social Work and Research". Academia.edu. 1 (2): 99–103.
- Tabachnick, B.G.; Fidell, L.S. (2001). Using Multivariate Analysis. Boston: Allyn and Bacon. ISBN 978-0-321-05677-1.
- American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
- Bandalos, Deborah L. (2018). Measurement theory and applications for the social sciences. New York. p. 261. ISBN 978-1-4625-3215-5. OCLC 1015955756.
- Aleksandrowicz JW, Klasa K, Sobański JA, Stolarska D (2009). "KON-2006 Neurotic Personality Questionnaire" (PDF). Archives of Psychiatry and Psychotherapy. 1: 21–22.
- Hogan, Robert (2007). Personality and the fate of organizations. Mahwah, NJ: Lawrence Erlbaum Associates. p. 28. ISBN 978-0-8058-4142-8. OCLC 65400436.
- Cronbach, L. J. (1957). "The two disciplines of scientific psychology". American Psychologist. 12 (11): 671–684. doi:10.1037/h0043943 – via EBSCO.
- Humphreys, L.G. (1987). "Psychometrics considerations in the evaluation of intraspecies differences in intelligence". Behav Brain Sci. 10 (4): 668–669. doi:10.1017/s0140525x0005514x.
- Eysenck, H.J. (1987). "The several meanings of intelligence". Behav Brain Sci. 10 (4): 663. doi:10.1017/s0140525x00055060.
- Locurto, C. & Scanlon, C (1987). "Individual differences and spatial learning factor in two strains of mice". Behav Brain Sci. 112: 344–352.
- King, James E & Figueredo, Aurelio Jose (1997). "The five-factor model plus dominance in chimpanzee personality". Journal of Research in Personality. 31 (2): 257–271. doi:10.1006/jrpe.1997.2179.
- J. Hernández-Orallo; D.L. Dowe; M.V. Hernández-Lloreda (2013). "Universal Psychometrics: Measuring Cognitive Abilities in the Machine Kingdom" (PDF). Cognitive Systems Research. 27: 50–74. doi:10.1016/j.cogsys.2013.06.001. hdl:10251/50244. S2CID 26440282.
- Hernández-Orallo, José (2017). The Measure of All Minds: Evaluating Natural and Artificial Intelligence. Cambridge: Cambridge University Press. ISBN 978-1-107-15301-1.
Bibliography
- Andrich, D. & Luo, G. (1993). "A hyperbolic cosine model for unfolding dichotomous single-stimulus responses" (PDF). Applied Psychological Measurement. 17 (3): 253–276. CiteSeerX 10.1.1.1003.8107. doi:10.1177/014662169301700307. S2CID 120745971.
- Michell, J. (1999). Measurement in Psychology. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511490040
- Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen, Danish Institute for Educational Research, expanded edition (1980) with foreword and afterword by B.D. Wright. Chicago: The University of Chicago Press.
- Reese, T.W. (1943). The application of the theory of physical measurement to the measurement of psychological magnitudes, with three experimental examples. Psychological Monographs, 55, 1–89. doi:10.1037/h0061367
- Stevens, S. S. (1946). "On the theory of scales of measurement". Science. 103 (2684): 677–80. Bibcode:1946Sci...103..677S. doi:10.1126/science.103.2684.677. PMID 17750512.
- Thurstone, L.L. (1927). "A law of comparative judgement". Psychological Review. 34 (4): 278–286. doi:10.1037/h0070288.
- Thurstone, L.L. (1929). The Measurement of Psychological Value. In T.V. Smith and W.K. Wright (Eds.), Essays in Philosophy by Seventeen Doctors of Philosophy of the University of Chicago. Chicago: Open Court.
- Thurstone, L.L. (1959). The Measurement of Values. Chicago: The University of Chicago Press.
- S.F. Blinkhorn (1997). "Past imperfect, future conditional: fifty years of test theory". British Journal of Mathematical and Statistical Psychology. 50 (2): 175–185. doi:10.1111/j.2044-8317.1997.tb01139.x.
- Sanford, David (18 November 2017). "Cambridge just told me Big Data doesn't work yet". LinkedIn.
Further reading
- Robert F. DeVellis (2016). Scale Development: Theory and Applications. SAGE Publications. ISBN 978-1-5063-4158-3.
- Borsboom, Denny (2005). Measuring the Mind: Conceptual Issues in Contemporary Psychometrics. Cambridge: Cambridge University Press. ISBN 978-0-521-84463-5.
- Leslie A. Miller; Robert L. Lovler (2015). Foundations of Psychological Testing: A Practical Approach. SAGE Publications. ISBN 978-1-4833-6927-3.
- Roderick P. McDonald (2013). Test Theory: A Unified Treatment. Psychology Press. ISBN 978-1-135-67530-1.
- Paul Kline (2000). The Handbook of Psychological Testing. Psychology Press. ISBN 978-0-415-21158-1.
- Rush AJ Jr; First MB; Blacker D (2008). Handbook of Psychiatric Measures. American Psychiatric Publishing. ISBN 978-1-58562-218-4. OCLC 85885343.
- Ann C Silverlake (2016). Comprehending Test Manuals: A Guide and Workbook. Taylor & Francis. ISBN 978-1-351-97086-0.