Showing posts with label norms. Show all posts
Monday, June 15, 2015
WAIS-IV Canadian/US norm controversy--all articles for readers to review
I previously provided an FYI post on a hot topic in Canada...claims that the new WAIS-IV Canadian norms were flawed. There are now three articles outlining the different arguments. The three articles, published in JPA, can be found here, here, and here.
I continue to not comment on this controversy given my obvious conflict of interest as a coauthor of the competing WJ-IV.
Kevin McGrew
Labels:
Canada,
norms,
WAIS-IV,
Wechsler batteries
Wednesday, June 25, 2014
Wednesday, July 11, 2012
AP101 Brief #14: Inappropriate use of demographically-adjusted (Heaton) norms in MR/ID Dx
The following AP 101 brief was just posted at the ICDP blog.
Applied Psychometrics 101 Brief # 14: Demographically adjusted neuropsychological (Heaton) norm-based scores are inappropriate for the diagnosis of MR/ID
Kevin S. McGrew, PhD.
Director
Dale G. Watson, PhD.
Berkeley, CA
Sunday, March 27, 2011
IAP Applied Psychometrics 101 Report #10: "Just say no" to averaging IQ subtest scores
Should psychologists engage in the practice of calculating simple arithmetic averages of two or more scaled or standard scores from different subtests (pseudo-composites) within or across different IQ batteries? Dr. Joel Schneider and I (Dr. Kevin McGrew) say "no."
Do psychologists who include simple pseudo-composite scores in their reports, or make interpretations and recommendations based on such scores, have a professional responsibility to alert recipients of psychological reports (e.g., lawyers, the courts, parents, special education staff, other mental health practitioners, etc.) of the potential amount of error in their statements when simple pseudo-composite scores are the foundation of some of their statements? We believe "yes."
Simple pseudo-composite scores, in contrast to norm-based scores (i.e., composite scores with norms provided by test publishers/authors--e.g., Wechsler Verbal Comprehension Index), contain significant sources of error. Although they have intuitive appeal, this appeal cloaks hidden sources of error in the scores---with the amount of error being a function of a combination of psychometric variables.
IAP Applied Psychometrics 101 Report #10 addresses the psychometric issues involved in pseudo-composite scores.
In the report we offer recommendations and resources that allow users to calculate psychometrically sound pseudo-composites when they are deemed important and relevant to the interpretation of a person's assessment results.
Finally, understanding the sources of error in simple pseudo-composite scores provides an opportunity for practitioners to understand the paradoxical phenomenon frequently observed in practice where norm-based or psychometrically sound pseudo-composite scores are often higher (or lower) than the subtest scores that comprise the composite. The "total does not equal the average of the parts" phenomenon is explained conceptually, statistically, and via an interesting visual explanation based on trigonometry.
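To make the "total does not equal the average of the parts" point concrete, here is a minimal sketch. It assumes all subtest scores are on the same standard-score metric (mean 100, SD 15) and that the subtest intercorrelations are known. The function name, the two scores of 70, and the correlation of .60 are hypothetical values chosen for illustration; this is the general formula for the composite of correlated standardized scores, not necessarily the exact procedure described in the report.

```python
import math

def composite_score(scores, corr, mean=100.0, sd=15.0):
    """Composite standard score from subtest standard scores.

    scores: subtest scores, all on the same metric (mean, sd).
    corr:   full correlation matrix among the subtests (hypothetical here).
    """
    # Convert each subtest score to a z-score.
    z = [(s - mean) / sd for s in scores]
    # The variance of a sum of standardized scores equals the sum of
    # every element in their correlation matrix.
    var_of_sum = sum(sum(row) for row in corr)
    # Re-standardize the summed z-scores, then return to the score metric.
    z_composite = sum(z) / math.sqrt(var_of_sum)
    return mean + sd * z_composite

# Hypothetical example: two subtests, both at 70, correlating .60.
scores = [70.0, 70.0]
corr = [[1.0, 0.6],
        [0.6, 1.0]]

simple_average = sum(scores) / len(scores)   # the pseudo-composite
composite = composite_score(scores, corr)    # the properly scaled composite

print(f"simple average: {simple_average:.1f}")
print(f"composite:      {composite:.1f}")
```

Under these assumptions the composite falls further from the mean than the simple average of the two subtests. The gap grows as the subtest correlation drops, because the sum of imperfectly correlated scores varies less than the scores themselves.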

Abstract
The publishers and authors of intelligence test batteries provide norm-based composite scores based on two or more individual subtests. In practice, clinicians frequently form hypotheses based on combinations of tests for which norm-based composite scores are not available. In addition, with the emergence of Cattell-Horn-Carroll (CHC) theory as the consensus psychometric theory of intelligence, clinicians are now more frequently “crossing batteries” to form composites intended to represent broad or narrow CHC abilities. Beyond simple “eye-balling” of groups of subtests, clinicians at times compute the arithmetic average of subtest scaled or standard scores (pseudo-composites). This practice suffers from serious psychometric flaws and can lead to incorrect diagnoses and decisions. The problems with pseudo-composite scores are explained and recommendations made for the proper calculation of special composite scores.
- iPost using BlogPress from my Kevin McGrew's iPad
intelligence IQ tests IQ testing IQ scores CHC intelligence theory CHC theory Cattell-Horn-Carroll human cognitive abilities psychology school psychology individual differences cognitive psychology neuropsychology psychology special education educational psychology psychometrics psychological assessment psychological measurement IQs Corner general intelligence standard scores IQ subtests Wechsler IQ subtests IQ part scores IQ composite scores cross-battery assessment applied Psychometrics psychological measurement
Thursday, December 02, 2010
IQ test battery publication timeline: Atkins MR/ID Flynn Effect cheat sheet
As I've become involved in consulting on Atkins MR/ID death penalty cases, a frequent topic raised is that of norm obsolescence (aka, the Flynn Effect). When talking with others I often have trouble spitting out the exact date of publication of the various revisions of tests, as I keep track of more than just the Wechsler batteries (which are the primary IQ tests in Atkins reports). I often wonder if others question my expertise...but most don't realize that there are more IQ batteries out there than just the Wechsler adult battery....and, in particular, a large number of child normed batteries and other batteries spanning childhood and adulthood. Thus, I decided to put together a cheat sheet for myself..one that I could print and have in my files. I put it together in the form of a simple IQ battery publication timeline. Below is an image of the figure. Double click on it to enlarge.
An important point to understand is that when serious discussions start focusing on the Flynn effect in trials, most often the test publication date is NOT used in the calculation of how obsolete a set of test norms is. Instead, the best estimate of the year the test was normed/standardized is used, which is not included in this figure (you will need to locate this information). For example, the WAIS-R was published in 1981...but the manual states that the norming occurred from May 1976 to May 1980. Thus, in most Flynn effect discussions in court cases, the date of 1978 (middle of the norming period) is typically used. This makes recall of this information difficult for experts who track all the major individually administered IQ batteries.
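The midpoint arithmetic can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the manual gives months rather than days (the first of the month is assumed here), and the 1990 administration date is made up. The sketch computes only how old the norms are; it applies no score adjustment, since whether and how to adjust is the debated question.

```python
from datetime import date, timedelta

def norming_midpoint(start, end):
    """Return the date halfway through a norming period."""
    half_span = timedelta(days=(end - start).days // 2)
    return start + half_span

def norm_age_years(midpoint, test_date):
    """Years elapsed between the norming midpoint and a test administration."""
    return (test_date - midpoint).days / 365.25

# WAIS-R norming period from the manual: May 1976 to May 1980.
# Day of month is an assumption; the source gives only the months.
wais_r_midpoint = norming_midpoint(date(1976, 5, 1), date(1980, 5, 1))

# Hypothetical administration date, for illustration only.
administered = date(1990, 5, 1)
age = norm_age_years(wais_r_midpoint, administered)

print(f"norming midpoint year: {wais_r_midpoint.year}")
print(f"norm age at testing:   {age:.1f} years")
```

Using the midpoint rather than the 1981 publication date adds about three years to the estimated age of the WAIS-R norms.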
Hope this is helpful...if nothing else...you must admit that it is pretty :) Click on image to view

intelligence intelligence testing Atkins cases ICDP blog psychology school psychology neuropsychology Forensic psychology criminal psychology criminal justice death penalty capital punishment ABA IQ tests IQ scores adaptive behavior AAIDD mental retardation intellectual disability Flynn effect
Wednesday, October 20, 2010
Two new Atkins MR/ID death penalty Flynn Effect articles
Two new articles have been published on the issue of adjusting IQ scores for the Flynn Effect in Atkins MR/ID death penalty cases. These will be included in an update of the ICDP Flynn Effect archive project, which I hope to complete by the end of the week.
Looking to science rather than convention in adjusting IQ scores when death is at issue. 2010 Volume 41, Issue 5 (Oct), p. 413-419. Professional Psychology: Research and Practice. Cunningham, Mark D.; Tassé, Marc J.
Abstract
The progressive obsolescence of IQ test norms and associated score inflation (i.e., the Flynn effect) may have literal life and death significance in capital mental retardation determinations (i.e., Atkins hearings). Hagan, Drogin, and Guilmette (2008) asserted that IQ score corrections for the Flynn effect were inconsistent with a “standard of practice” they deduced from custom, convention, and authority. More accurately, this reflected a proposed practice guideline or recommendation for practice, rather than a standard of practice. Whether a proposed guideline or recommendation for practice, these are better informed by an analysis of the available science than accepted convention. The authors reviewed research findings regarding the occurrence of the Flynn effect in the “zone of ambiguity” (IQ = 71–80), and proposed a best practice recommendation for discussing and reporting Flynn effect correction of IQ scores in capital mental retardation determinations.
Science rather than advocacy when reporting IQ scores, p. 420-423. Hagan, Leigh D.; Drogin, Eric Y.; Guilmette, Thomas J.
Abstract
The existence of shifts in mean IQ scores over time is well established. However, on a case-by-case basis, such shifts vary unreliably, rendering specific adjustments to a given individual's IQ score incalculable. Based upon data presented previously (Hagan, Drogin, & Guilmette, 2008) as well as a review of more recent studies that have further detailed the wide variability of mean score shifts, any proposal to “correct” IQ scores in forensic evaluations due to the “Flynn effect” (FE) is unjustifiable. To offer the court an unreliable new IQ score in place of an allegedly unreliable old one—and to do so specifically in capital murder cases as opposed to any other context—appears far more reflective of result-focused advocacy than objective scientific practice. Forensic psychologists are explicitly encouraged to address likely ranges of IQ score variability and to discuss in relevant detail the strengths and weaknesses of the specific studies—however much at odds these may be—that attempt to define and quantify mean score shifts.
Flynn effect IQ scores Atkins cases Death penalty capital punishment norm obsolescence ICDP blog Psychologists forensic psychology neuropsychology school psychology Intelligence IQ tests mental retardation intellectual disability
Saturday, October 16, 2010
iPost: JPA special issue on the Flynn Effect

The special JPA issue on the Flynn Effect is now available. Below are the titles of the articles. Within a week I plan to add them to the Flynn Effect archive project. Stay tuned.
Ceci, S. J., & Kanaya, T. (2010). ''Apples and Oranges Are Both Round'': Furthering the Discussion on the Flynn Effect. Journal of Psychoeducational Assessment, 28(5), 441-447.
Fletcher, J. M., Stuebing, K. K., & Hughes, L. C. (2010). IQ Scores Should Be Corrected for the Flynn Effect in High-Stakes Decisions. Journal of Psychoeducational Assessment, 28(5), 469-473.
Flynn, J. R. (2010). Problems With IQ Gains: The Huge Vocabulary Gap. Journal of Psychoeducational Assessment, 28(5), 412-433.
Hagan, L. D., Drogin, E. Y., & Guilmette, T. J. (2010). IQ Scores Should Not Be Adjusted for the Flynn Effect in Capital Punishment Cases. Journal of Psychoeducational Assessment, 28(5), 474-476.
Kaufman, A. S. (2010). ''In What Way Are Apples and Oranges Alike?'' A Critique of Flynn's Interpretation of the Flynn Effect. Journal of Psychoeducational Assessment, 28(5), 382-398.
Kaufman, A. S. (2010). Looking Through Flynn's Rose-Colored Scientific Spectacles. Journal of Psychoeducational Assessment, 28(5), 494-505.
Kaufman, A. S., & Weiss, L. G. (2010). Guest Editors' Introduction to the Special Issue of JPA on the Flynn Effect. Journal of Psychoeducational Assessment, 28(5), 379-381.
McGrew, K. S. (2010). The Flynn Effect and Its Critics: Rusty Linchpins and ''Lookin' for g and Gf in Some of the Wrong Places''. Journal of Psychoeducational Assessment, 28(5), 448-468.
Reynolds, C. R., Niland, J., Wright, J. E., & Rosenn, M. (2010). Failure to Apply the Flynn Correction in Death Penalty Litigation: Standard Practice of Today Maybe, but Certainly Malpractice of Tomorrow. Journal of Psychoeducational Assessment, 28(5), 477-481.
Sternberg, R. J. (2010). The Flynn Effect: So What? Journal of Psychoeducational Assessment, 28(5), 434-440.
Weiss, L. G. (2010). Considerations on the Flynn Effect. Journal of Psychoeducational Assessment, 28(5), 482-493.
Zhou, X. B., Zhu, J. J., & Weiss, L. G. (2010). Peeking Inside the ''Black Box'' of the Flynn Effect: Evidence From Three Wechsler Instruments. Journal of Psychoeducational Assessment, 28(5), 399-411.
Flynn effect test norms IQ tests IQ scores Wechsler tests Norm obsolescence
Tuesday, June 29, 2010
The Flynn Effect report series: What is the Flynn Effect: IAP AP101 Report #6
A new IAP Applied Psychometrics 101 report (#6) is now available. The report is the first in the Flynn Effect series, a series of brief reports that will define, explain and discuss the validity of the Flynn Effect (click here to access all prior FE related posts at the ICDP blog) and the issues surrounding the application of an FE "adjustment" for scores based on tests with dated norms (norm obsolescence), particularly in the context of Atkins MR/ID capital punishment cases. The abstract for the brief report is presented below. The report can be accessed by clicking here.
Technorati Tags: psychology, forensic psychology, forensic psychiatry, neuropsychology, intelligence, school psychology, psychometrics, educational psychology, IQ, IQ tests, IQ scores, adaptive behavior, adaptive functioning, intellectual disability, mental retardation, MR, ID, criminal psychology, criminal defense, criminal justice, ABA, American Bar Association, Atkins cases, death penalty, capital punishment, AAIDD, Atkins MR/ID listserv, ICDP blog, Flynn Effect, norm obsolescence, Flynn Effect Series, IAP Applied Psychometric reports
Norm obsolescence is recognized in the intelligence testing literature as a potential source of error in global IQ scores. Psychological standards and assessment books recommend that assessment professionals use tests with the most current norms to minimize the possibility of norm obsolescence spuriously raising an individual’s measured IQ. This phenomenon is typically referred to as the Flynn Effect. This report is the first in a series of brief reports that will define, explain, and summarize the scholarly consensus regarding the validity of the Flynn Effect. The series will conclude with an evaluation of the question of whether a professional consensus has emerged regarding the practice of adjusting dated IQ test scores for the Flynn Effect, an issue of increasing debate in Atkins MR/ID capital punishment hearings.
Saturday, April 03, 2010
Research Bytes: Russell (2010) on test validity across different versions/updates of tests
Russell, W. E. (2010). The 'Obsolescence' of Assessment Procedures. Applied Neuropsychology, 17(1), 60-67.
Abstract
The concept that obsolescence or being “out of date” makes a test or procedure invalid (“inaccurate,” “inappropriate,” “not useful,” “creating wrong interpretations,” etc.) has been widely accepted in psychology and neuropsychology. Such obsolescence, produced by publishing a new version of a test, has produced an extensive nullification of research effort (probably 10,000 Wechsler studies). The arguments, attempting to justify obsolescence, include the Flynn Effect, the creation of a new version of a test or simply time. However, the Flynn Effect appears to have plateaued. In psychometric theory, validated tests do not lose their validity due to the creation of newer versions. Time does not invalidate tests due to the improvement of neurological methodology, such as magnetic resonance imaging. This assumption is unscientific, unproven, and if true, would discredit all older neuropsychological and neurological knowledge. In science, no method, theory, or information, once validated, loses that validation merely due to time or the creation of another test or procedure. Once validated, a procedure is only disproved or replaced by means of new research.
Keywords: assessment; Flynn Effect; obsolescence; validation; Wechsler tests
Technorati Tags: psychology, forensic psychology, forensic psychiatry, neuropsychology, intelligence, school psychology, psychometrics, educational psychology, IQ, IQ tests, IQ scores, intellectual disability, mental retardation, MR, ID, criminal psychology, criminal defense, criminal justice, ABA, American Bar Association, Atkins cases, death penalty, capital punishment, AAIDD, scientific evidence, Flynn effect, validity, norms