IQ's Corner: AP101 Reports


Tuesday, February 07, 2012

IAP Applied Psychometrics 101 Brief Report # 11: What is the typical IQ and adaptive behavior correlation?

What is the typical relation (correlation) between standardized measures of adaptive behavior (AB) and measures of intelligence (IQ)? This is an important question given the role both play in the definition and diagnosis of mental retardation (MR) / intellectual disability (ID).

During the late 1970s and 1980s this was an active area of research. Numerous studies were published that reported correlations between a wide variety of adaptive behavior scales and intelligence tests. Probably the best synthesis of this research was provided by Harrison (1987). Harrison's review included a table of more than 40 correlations. This is Table 2 in the above referenced and linked article. Harrison concluded, as have most others who have reviewed the literature, that "the majority of correlations fall in the moderate range" (p.39). When the correlations with maladaptive measures are excluded from Harrison's table, the correlations range from .03 to .91. This is a wide range. Harrison could not identify a specific explanation for the variability or range of correlations. Harrison speculated that the variables that might affect the magnitude of the correlations were the specific adaptive behavior scale or measure of intelligence used and differences in sample variability.

Subsequently the Committee on Disability Determination for Mental Retardation published a National Research Council report (Mental Retardation: Determining Eligibility for Social Security Benefits; Reschly, Meyers & Hartel, 2001) that also addressed the AB/IQ relation. The report concluded that AB/IQ studies report correlations "ranging from 0 (indicating no relationship) to almost +1 (indicating a perfect relationship). Data also suggest that the relationship between IQ and adaptive behavior varies significantly by age and levels of retardation, being strongest in the severe and moderate ranges and weakest in the mild range. There is a dearth of data on the relationship of IQ and adaptive behavior functioning at the mild level of retardation" (p. 8). Factors identified as moderating the AB/IQ correlation were scale content, measurement of competences versus perceptions, sample variability, ceiling and floor problems of the scales, and level of mental retardation.

Given the above, it is hard to render an objective statement on the approximate typical AB/IQ correlation. With this in mind, an informal research synthesis was completed and is reported here.

First, only the AB/IQ correlations from Harrison's 1987 table were extracted (n = 43 correlations; IQ/maladaptive correlations were excluded). Then, the technical manuals for the current editions of the three most frequently used contemporary adaptive behavior scales were reviewed for additional correlations. These included the Vineland Adaptive Behavior Scales (Sparrow, Cicchetti & Balla, 2005; n = 2 correlations of .12 and .20) and the Adaptive Behavior Assessment System-II (Harrison & Oakland, 2008; n = 10 correlations ranging from .39 to .67; median = .51).

Although six different correlations were reported in the Scales of Independent Behavior-Revised manual (SIB-R; Bruininks, Woodcock, Weatherman & Hill, 1996), the values were not used as they are inflated estimates when compared to the type of correlations typically reported. For example, very high correlations of .79, .82 and .91 are reported for certain groups. A close reading of the tables reveals that the SIB-R correlations with either the WJ or WJ-R intelligence test were calculated on the basis of the W-score growth metric. By definition, a growth metric includes age variance. If correlations are reported across wide age groups, they reflect not only the variance shared by the AB and IQ constructs but also shared variance due to the influence of general age-based development (age). Thus, the SIB and SIB-R correlations with IQ, although not wrong and providing different information, are not comparable to all other reported correlations where age variance has been removed (typically by correlating age-based standard scores). Clear evidence for this point comes from McGrew and Bruininks (1990), who used the same SIB/WJ subject data reported in the SIB and SIB-R manuals but removed the W-score confounded age variance prior to the calculation of latent factor correlations (via confirmatory factor analysis) between latent practical intelligence (SIB adaptive behavior) and conceptual intelligence (WJ IQ) factors. The resulting AB/IQ correlations for three different age groups were .38, .56 and .58--far below the values in the .70 to .92 range. Thus, the values from McGrew and Bruininks (1990) were included as estimates of the SIB/SIB-R IQ correlations in the current synthesis.
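The age-variance argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the age range (5 to 20), the growth slope (0.3 per year), and the age-free correlation of .50 are hypothetical values chosen for illustration and are not taken from the SIB, SIB-R, or WJ data. The function names are likewise invented for this example.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, age_free_r=0.5, growth_per_year=0.3, seed=0):
    """Return (r for age-based scores, r for growth-type scores).

    All parameter values are hypothetical and for illustration only.
    """
    rng = random.Random(seed)
    ab_std, iq_std, ab_growth, iq_growth = [], [], [], []
    for _ in range(n):
        age = rng.uniform(5, 20)  # assumed wide age range
        z_ab = rng.gauss(0, 1)
        # Construct an IQ score correlating age_free_r with AB within age.
        z_iq = age_free_r * z_ab + math.sqrt(1 - age_free_r ** 2) * rng.gauss(0, 1)
        ab_std.append(z_ab)
        iq_std.append(z_iq)
        # A growth metric adds the same developmental (age) trend to both scores.
        ab_growth.append(z_ab + growth_per_year * age)
        iq_growth.append(z_iq + growth_per_year * age)
    return pearson(ab_std, iq_std), pearson(ab_growth, iq_growth)


r_age_based, r_growth = simulate()
print(f"age-based standard scores: r = {r_age_based:.2f}")
print(f"growth-type scores across ages: r = {r_growth:.2f}")
```

Under these assumptions the growth-type correlation comes out well above the age-free correlation, even though the underlying AB/IQ relation is the same in both cases. The shared age trend is what raises it.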

Finally, latent AB/IQ correlations (as estimated from confirmatory factor analysis models) of .27 and .39 were included from Ittenbach, Spiegel, McGrew and Bruininks (1992) and Keith, Fehrmann, Harrison and Pottebaum (1987), respectively. This process resulted in the addition of 17 AB/IQ correlations to the 43 from Harrison, for a total of 60 correlations.

Descriptive statistics for this collection of 60 AB/IQ correlations are as follows: range of correlations from .12 to .90, a mean of .51, a median of .48, and a standard deviation of .20. Below is a figure that includes a frequency polygon (and smoothed normal curve overlay) and a box-whisker plot of the data set. A review of the box and whisker plot (at the bottom) shows the median correlation (.48) as a vertical line within the rectangle. The rectangle includes the middle 50% of the distribution of correlations and shows an approximate range of just below .40 to just above .65. Of particular note is the shape of the frequency polygon and smoothed normal curve. The shape of the frequency polygon is consistent with a normal curve. In quantitative research synthesis this type of normal distribution suggests that the total data set included in the review is not biased--studies that are likely under- or overestimates of the "true" population correlation (due to method or sampling factors) are both included. More importantly, the "bunching" up of the majority of the correlations in the middle provides confidence that the median of this distribution is a reasonable unbiased estimate of the population correlation. This type of relatively normal distribution suggests that the current collection of 60 AB/IQ correlations is likely a reasonable approximation of the complete set of population AB/IQ correlations.

Based on this informal (and admittedly incomplete) review of all possible AB/IQ correlation research, one can conclude that a reasonable estimate of the typical AB/IQ correlation is approximately .50 (mean = .51; median = .48), with most ranging from approximately .40 to .65. This finding is consistent with Harrison's 1987 conclusion of a "moderate" correlation. The current analysis continues to reinforce Harrison's (and others') conclusions that adaptive behavior and intelligence are statistically related constructs, but they are still independent. An average correlation of .50 indicates that AB and IQ share approximately 25% common variance (approximately 15% to 40% common variance if one looks at the range of the middle 50% of the distribution of values). In practical terms this means that for any individual, standard scores from AB and IQ tests will frequently diverge and not always be consistent.
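The shared-variance figures above follow from squaring the correlation. The sketch below works through that arithmetic. It also computes the standard deviation of the difference between two scores on the same scale, which gives a rough sense of how far AB and IQ standard scores can drift apart. The function names are made up for this illustration, and the SD of 15 assumes both scores are on the usual standard score metric (M = 100, SD = 15).

```python
import math


def shared_variance_pct(r):
    """Percent of variance two measures share, given their correlation r."""
    return 100.0 * r ** 2


def sd_of_difference(r, sd=15.0):
    """SD of the difference between two scores that have the same SD and correlate r."""
    return sd * math.sqrt(2.0 * (1.0 - r))


# .40 and .65 are the approximate bounds of the middle 50% of correlations.
for r in (0.40, 0.50, 0.65):
    print(
        f"r = {r:.2f}: shared variance = {shared_variance_pct(r):.0f}%, "
        f"SD of AB-IQ difference = {sd_of_difference(r):.1f} points"
    )
```

Squaring .40 and .65 gives about 16% and 42%, in line with the approximate 15% to 40% range cited above. At r = .50 the difference between the two standard scores has an SD of about 15 points, so sizable AB/IQ discrepancies for an individual are to be expected.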

Harrison (1987) provides a nice explanation for the primary reasons for the moderate correlation between AB and IQ. Her quote is reproduced below

Numerous caveats need to be applied to this analysis and report. The most important are:

A comprehensive review of all possible published and unpublished AB/IQ research studies was not completed. Clearly there are more studies "out there" that could be added to the synthesis.
The analysis makes no attempt to determine if there are moderator effects. That is, is the typical correlation likely to systematically vary as a function of AB measures, IQ measures, variability in the sample's level of functioning, manifest/measured versus latent variable correlations, level of ability, etc.?
This has not been peer reviewed.

It is hoped that this ad hoc update of Harrison's (1987) review, augmented by quantitative organizational methods, will serve to stimulate a formal meta-analysis by others (hint---a nice study or thesis for someone?).

Tuesday, September 14, 2010

CHC IQ test "Periodic Table of Cognitive Elements" is BACK!!!! WAIS-IV example

Back by popular demand....the McGrew Table of CHC Cognitive Elements....now revised and improved.

Below is a set of slides that includes the new periodic table of cognitive elements and its use in a visual-graphic CHC summary of the WAIS-IV. Stay tuned...more of these are in the works.

The IAP AP 101 WAIS-IV report link included in one of the slides can also be accessed by clicking here.

Images can be enlarged by double clicking on them. Enjoy.

Tuesday, June 29, 2010

The Flynn Effect report series: What is the Flynn Effect: IAP AP101 Report #6

A new IAP Applied Psychometrics 101 report (#6) is now available. The report is the first in the Flynn Effect series, a series of brief reports that will define, explain and discuss the validity of the Flynn Effect (click here to access all prior FE related posts at the ICDP blog) and the issues surrounding the application of a FE "adjustment" for scores based on tests with dated norms (norm obsolescence), particularly in the context of Atkins MR/ID capital punishment cases. The abstract for the brief report is presented below. The report can be accessed by clicking here.

Norm obsolescence is recognized in the intelligence testing literature as a potential source of error in global IQ scores. Psychological standards and assessment books recommend that assessment professionals use tests with the most current norms to minimize the possibility of norm obsolescence spuriously raising an individual’s measured IQ. This phenomenon is typically referred to as the Flynn Effect. This report is the first in a series of brief reports that will define, explain, and summarize the scholarly consensus regarding the validity of the Flynn Effect. The series will conclude with an evaluation of the question whether a professional consensus has emerged regarding the practice of adjusting dated IQ test scores for the Flynn Effect, an issue of increasing debate in Atkins MR/ID capital punishment hearings.


Friday, February 05, 2010

AP101 Brief #6: Understanding Wechsler IQ score differences--the CHC evolution of the Wechsler FS IQ score

[Note. A typo in the original tables used to construct the WAIS figure below has been fixed. Visual Puzzles on the WAIS-IV had been incorrectly designated as a measure of Gf----it should have been classified Gv. This has now been changed and the corresponding text also modified. Sorry for this error. Changes in the text are so designated below via the ~~strikeover~~]

Why do the IQ scores for the same individual often differ?

This question often perplexes both users and recipients of psychological reports. In a previous IAP Applied Psychometrics 101 report (AP101 #1: Understanding IQ score differences) I discussed general statistical information related to the magnitude and frequency of expected IQ score differences for different tests (as a function of the correlation between tests). In that report I mentioned the following general categories of possible reasons for IQ score differences/discrepancies.

Factors contributing to significant IQ differences are many, and include: (a) procedural or test administration issues (e.g., scoring errors; improper test administration; malingering; age vs grade norms), (b) test norm or standardization differences (e.g., possible errors in the norms; sampling plan for selecting subjects for developing the test norms; publication date of test), (c) content differences, and/or, (d) in the case of group research, research methodology issues (e.g., sample pre-selection effects on reported mean IQs) (McGrew, 1994).

At this time I return to one of these factors--content differences. This brief report does not focus on content differences between different IQ tests but, instead, focuses on the changing content across the various editions of the two primary Wechsler intelligence batteries (WISC/WAIS). This information should be useful when individuals are comparing IQ scores (for the same person) based on different versions of the Wechslers.

Of course, content differences will not be the only reason for possible IQ score differences across editions of the Wechslers for an individual. Other possible reasons may include real changes in intelligence, serious scoring errors present in either one of the two test administrations, the Flynn effect, and other possible factors. This post focuses only on the changing CHC content of the WISC and WAIS series of intelligence batteries.

As discussed previously in numerous posts, contemporary CHC theory is currently considered the consensus psychometric taxonomy of human cognitive abilities (click here for prior posts and information regarding the theory). For this current brief report, I reviewed the extant CHC-organized factor analysis literature of the various Wechsler intelligence batteries. I then used this information as per the following steps:

1. I identified the individual subtests in all editions of the WISC and WAIS batteries that contributed to the respective Full Scale (FS) IQ score for each battery.

2. Using the accepted authoritative sources re: the CHC analysis of the Wechsler intelligence batteries (Flanagan, McGrew and Ortiz, 2000; Flanagan, Ortiz, and Alfonso, 2007; McGrew and Flanagan, 1998; Woodcock, 1990), I classified each of the above identified subtests as per the broad CHC ability (or abilities) measured by each subtest. For readers who want a very brief CHC overview (and ability definition cheat-sheet), click here.

3. I calculated the percentage of each broad CHC ability represented in each battery's respective FS IQ. For example, for the 1974 WISC-R, the FS IQ is calculated by summing the WISC-R scaled scores from 10 of the individual subtests. Four of these 10 subtests (Information, Comprehension, Similarities, and Vocabulary) have all been consistently classified as indicators of broad Gc. Since each of the individual subtests contributes equally to the FS IQ score, Gc represents at least 40% (4 of 10) of the WISC-R FS IQ.

However, the extant CHC Wechsler research has consistently identified a few tests with dual CHC factor loadings. In particular, both Picture Completion and Picture Arrangement have been consistently reported to load on both the Gv (performance scale) and Gc (verbal scale) on the WISC-R. For tests that demonstrated consistent dual CHC factor loadings, I assigned each broad CHC ability measured as representing 1/2 (0.5) of the test. More precise proportional calculation might have been possible (via the calculation of the average factor loadings across all studies), but for the current purpose I used this simple and (IMHO) reasonably approximate method.

As a result, the Picture Completion and Picture Arrangement subtests were each assigned 1/2 (0.5) Gc and 1/2 (0.5) Gv ability classifications. When added together these two 0.5 Gc test classifications sum to 1.0. When combined with the other four clear Gc tests mentioned above, the final Gc test indicator total is 5. As a result, the total Gc proportional percentage of the WISC-R FS IQ was calculated as 50%.

4. Although the Wechsler CHC classifications were based on the primary sources noted above, I did revise some commonly accepted classifications based upon my professional opinion (when supported by empirical research). For example, the Arithmetic subtest has frequently been classified as a measure of Gf, Gsm, and sometimes Gs. However, when valid factor indicators of Quantitative Knowledge (Gq) have been included in analyses, the Arithmetic subtest consistently displays a robust loading on the Gq factor and only minor loadings on other CHC abilities. I placed greater stock in these studies (e.g., Phelps et al., 2005; Woodcock, 1990) as I deem these to be better designed CHC studies (they included a broader array of CHC ability indicators). My final determination for Arithmetic was that it is a test that measures both Gq and Gsm.

In addition, where appropriate and consistent with published research, I modified a few other commonly accepted CHC Wechsler test classifications to reflect recent research (e.g., Kaufman et al., 2001; Keith et al., 2006; Keith & Reynolds, in press [CHC abilities and cognitive tests: What we've learned from 20 years of research; Psychology in the Schools]; Lichtenberger & Kaufman, 2001; McGrew, 2009; Tulsky & Price, 2003; plus the factor studies reported in the respective technical manuals of each battery). Referring to the mixed measures of Picture Completion and Picture Arrangement mentioned above, research with the WISC-IV has suggested that Picture Completion is primarily a measure of Gv (Gc factor loading minimal or nonexistent) while Picture Arrangement continues to show significant loadings on both Gv and Gc. Thus, Picture Arrangement was classified as a mixed measure of Gc and Gv for all editions of the WISC. In contrast, in the case of the WISC-IV, Picture Completion was classified as a measure of Gv.

It is not possible to describe in detail all of the minor "fine tunings" I did for select Wechsler CHC test classifications. The basis for each is included in the various reference sources cited above. In the final analysis the Wechsler CHC test classifications used in this brief report are those made by myself (Kevin McGrew) based on my integration and understanding of the extant empirical research regarding the CHC abilities measured by individual tests in both the WISC and WAIS series of intelligence batteries.

5. Finally, I calculated the proportion of CHC abilities represented in the FS IQ scores for all editions of the WISC and WAIS. These values were tabulated and plotted on graphs. The summary graphs are presented below. [Double click on images to enlarge]
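The proportional weighting described in steps 3 and 4 above can be sketched in a few lines, using the WISC-R as the example. This is an illustrative sketch, not the actual worksheet used for the graphs. The classifications for Information, Comprehension, Similarities, Vocabulary, Picture Completion, Picture Arrangement, and Arithmetic follow the text above. The classifications for the three remaining FS IQ subtests (Block Design and Object Assembly as Gv, Coding as Gs) are my assumption for illustration, chosen because they reproduce the WISC-R percentages reported in this post. The dictionary and function names are hypothetical.

```python
from collections import defaultdict

# Each FS IQ subtest maps to the broad CHC abilities it is treated as measuring.
# A subtest with two abilities contributes 1/2 (0.5) to each.
WISC_R_CLASSIFICATION = {
    "Information": ["Gc"],
    "Comprehension": ["Gc"],
    "Similarities": ["Gc"],
    "Vocabulary": ["Gc"],
    "Picture Completion": ["Gc", "Gv"],   # dual loading (see step 3)
    "Picture Arrangement": ["Gc", "Gv"],  # dual loading (see step 3)
    "Arithmetic": ["Gq", "Gsm"],          # see step 4
    "Block Design": ["Gv"],               # assumption for illustration
    "Object Assembly": ["Gv"],            # assumption for illustration
    "Coding": ["Gs"],                     # assumption for illustration
}


def chc_percentages(classification):
    """Percent of the FS IQ attributable to each broad CHC ability.

    Every subtest is weighted equally; a subtest's weight is split evenly
    across the abilities it is classified as measuring.
    """
    totals = defaultdict(float)
    for abilities in classification.values():
        share = 1.0 / len(abilities)
        for ability in abilities:
            totals[ability] += share
    n_subtests = len(classification)
    return {ability: 100.0 * total / n_subtests for ability, total in totals.items()}


wisc_r = chc_percentages(WISC_R_CLASSIFICATION)
for ability in sorted(wisc_r, key=wisc_r.get, reverse=True):
    print(f"{ability}: {wisc_r[ability]:.0f}%")
```

With these inputs Gc comes out at 50% of the WISC-R FS IQ, matching the calculation in step 3. The same function could be applied to any other edition by swapping in that edition's subtest classifications.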

Conclusions/observations: A review of all information presented (in and across both graphs) produces a number of interesting conclusions and hypotheses. I only present a few at this time. I encourage others to review the documents and provide additional insights or commentary via the comment feature of the blog or on the various listservs where I have posted an FYI message regarding this set of analyses.

1. Historically, the FS IQ score from the Wechsler batteries, which is typically interpreted as a measure of general intelligence (g), has been heavily weighted towards the measurement of Gc and Gv abilities. This should not be surprising given the original design blueprint specified by David Wechsler (the measurement of intelligence vis-a-vis two different modes of expression).

2. The WISC series remained constant in the CHC FS IQ composition from 1949 to 1991. Although tests may have been revised or replaced, the differential CHC proportional contribution to the FS IQ was relatively equal across all three editions. Following the 80% combined contribution of Gc and Gv, much smaller contributions to the FS IQ came from measures of Gs (10%) and Gq and Gsm (5% each).

3. The WISC-IV represents a significant change in the general intelligence FS IQ score provided. Gc representation has decreased approximately 20%, Gv representation was cut in half (30% to 15%), Gs abilities increased slightly (5%), and Gq was eliminated. More importantly, there was a fourfold increase in the contribution of Gsm (from 5% to 20%) and a 20 percentage point increase in Gf representation (from 0% to 20%)! Clearly, different FS IQ scores may be obtained by the same individual when comparing WISC-IV FS IQ to either WISC-R or WISC-III scores. More importantly, the difference may be a function of the different mixture of CHC abilities represented in the different editions of the WISC series.

4. The first two editions of the WAIS (WAIS and WAIS-R) were identical in differential CHC ability contribution to the FS IQ score. However, starting with the WAIS-III significant changes in the adult Wechsler battery commenced and were later amplified in the WAIS-IV. Both the WAIS-III and WAIS-IV FS IQs reduced the amount of Gc representation by approximately 14% to 15%. The contribution of Gv decreased only slightly (27.3% to 22.7%) from the WAIS-R to WAIS-III, ~~but there was a dramatic reduction (by one half)~~ and then another 2% from the WAIS-III to the WAIS-IV (22.7% to ~~10%~~ 20%). Offsetting reductions in Gc and Gv over these two editions was a trend towards greater measurement of Gs (which has doubled from around 9% in the first two editions to approximately 18% to 20% in the last two editions). Gq FS IQ contribution has remained relatively similar throughout all editions. The most dramatic change, which is also consistent with the WISC series, is an approximate tenfold increase (0% to 9.1%) in Gf from the WAIS-R to the WAIS-III~~, which was again doubled in magnitude with the publication of the~~ and WAIS-IV (10% ~~20%~~). In general, similar to the WISC series, the adult WAIS series FS IQ has slowly evolved in the CHC abilities represented by the FS IQ. Both Gc and Gv abilities have been systematically reduced concurrently with significant increases in the contribution of Gs and Gf.

Implications of the CHC evolution of the WISC and WAIS FS IQ scores are many if one attempts to compare a current IQ score from one battery to an older score from an earlier edition of the same battery (or compare an older score from the children's version to the latest edition of the adult version). Before one can assume that significant changes from a childhood WISC-based IQ to a WAIS-III or WAIS-IV are due to certain factors (neurological insult, malingering, the Flynn effect, etc.), one should review the above graphs and consider the possibility that the different FS IQ scores may both be valid indicators of functioning but may represent different CHC mixes (flavors) of general intelligence.

The potential implications and hypotheses that can be generated with the aid of the above graphs are numerous. For example, Flynn (2006) has suggested that there are problems with the WAIS-III standardization norms given that studies comparing the WAIS-R/WAIS-III scores are not consistent with Flynn effect expectations. According to Weiss (2007), Flynn is ignoring data that does not fit his theory and instead is using theory to question data (and the integrity of a test's norms). According to Weiss (2007), "the only evidence Flynn provides for this statement is that WAIS-III scores do not fit expectations made based on the Flynn effect. However, the progress of science demands that theories be modified based on new data. Adjusting data to fit theory is an inappropriate scientific method, regardless of how well supported the theory may have been in previous studies." (p.1 from abstract).

I tend to concur with Weiss's arguments that the mere finding that the WAIS-III results were inconsistent with Flynn effect expectations is insufficient evidence to claim that a test's norms are wrong. If the data don't fit--one may need to retrofit (your theory or hypothesis). By inspecting the second graph above, one can see that a viable explanation for the apparent lack of the WAIS-R-to-WAIS-III Flynn effect is that the WAIS-III FS IQ score represents a different proportional composite of CHC abilities. More specifically, the WAIS-III reduced the proportional representation of Gc from 45.5% to 31.8%, decreased the Gv representation by approximately 5%, doubled the impact of Gs, and for the first time ever introduced close to 10% Gf representation. CHC content changes of the FS IQ scores between batteries may be at play. Can anyone say "comparing apples to apples+oranges?"

And so on.................more comments may be forthcoming.

PS - additional information not included in this original post has now been posted. Click here.


Friday, January 15, 2010

Weiss & Daniel respond to "Wechsler-like IQ scaled score metric..." post

Below is a response to my prior post regarding Wechsler-like scaled score issues. The response was on the NASP listserv and the authors gave me permission to reproduce it "as is" below. I'm pleased that they concur with the recommendations at the end of the paper post.

Kevin McGrew's argument can be turned around to show that using subtest score metrics with larger SDs also may lead to misinterpretation if a change of 1 raw score point leads to a change of many standard score points. So, the issue is not as simple as which subtest metric is better (e.g, the Wechsler / Kaufman metric or the WJ metric). The issue is better framed in terms of making the right choice of metric based on how it fits with the underlying RS distribution. Appropriate fit between the RS and SS distributions is necessary to avoid misinterpretation due to SS metrics that are either too large or small.

We agree with his suggested guidelines at the end of the full paper.

Larry Weiss
Mark Daniel
Pearson


Tuesday, January 12, 2010

IAP Applied Psychometric 101 Brief reports section added to blog

A new section has been added to the IQs Corner blog. This section is IAP Applied Psychometrics 101 Briefs. It can be found on the blog sidebar. These are brief reports that are posted at IQs Corner's sister blog, ICDP. Clicking on the link will take you to the ICDP blog page that contains the link to the brief report.


Tuesday, January 05, 2010

The Wechsler-like IQ subtest scaled score metric: The potential for misuse, misinterpretation and impact on critical life decisions---draft report in search of feedback

The following are the first three paragraphs (and a critical figure) of a draft of an IAP Applied Psychometrics 101 Brief Report (#5). The complete report can be downloaded in PDF format by clicking here. A web-page version of the complete report can be found by clicking here (note - the web page version may NOT display two embedded figures....viewing the PDF copy may be necessary).

I'm providing this initial draft report with the express intent of soliciting feedback and comments regarding the accuracy and soundness of my analyses and logic. I'm looking for critical feedback to improve the report. This is a draft report that will be revised if comments suggest important changes. Please read it in the spirit of "tossing out some critical ideas" for reflective analysis and feedback. Feedback can be sent directly to me (iap@earthlink.net) or could be provided in the form of listserv thread discussions at the NASP and/or CHC listservs.

I've recently been skimming James Flynn's new book (What is Intelligence: Beyond the Flynn Effect) to better understand the methodology and interpretation of the Flynn effect. Of particular interest to me (as an applied measurement person) is his analysis of the individual subtest scores from the various Wechsler scales across time. As most psychologists know, Wechsler subtest scaled scores (ss) are on a scale with a mean (M) = 10 and a standard deviation (SD) = 3. The subtest ss range from 1 to 19. In Appendix 1 of his book, Flynn states "it is customary to score subtests on a scale in which the SD is 3, as opposed to IQ scores which are scaled with SD set at 15. To convert to IQ, just multiply subtest gains by five, as was done to get the IQ gains in the last column." At first glance, this statement makes it sound as if the transformation of subtest ss to IQ SS is an easy (“just multiply….”; emphasis added by me) and mathematically acceptable procedure without problems. However, on close inspection this transformation has the potential to introduce unknown sources of error into the precision of the transformed SS scores. It is the goal of this brief technical post to explain the issues involved when making this ss-to- IQ SS conversion.

The ss 1-19 scale has a long history in the Wechsler batteries. For example, in Appendix 1 of Measurement of Adult Intelligence (Wechsler, 1944), Wechsler described the steps used to translate subtest raw scores to the new ss metric. The Wechsler batteries have continued this tradition in each new revision, although the methodology and procedures to calculate the ss 1-19 values have become more sophisticated over time. Although the methods used to develop the Wechsler ss 1-19 scale may have become more sophisticated, the resultant underlying scale for each subtest has not…scores still range from 1-19 (M=10; SD=3). Also, the most recent Stanford-Binet, 5th Edition (SB5; Roid, 2003) and Kaufman Assessment Battery for Children-2nd Edition (KABC-II) have both adopted the same ss 1-19 scale for their respective individual subtests.

Why is this relatively crude (to be defined below) scale metric still used in some intelligence batteries when other contemporary intelligence batteries provide subtest scale metrics with finer measurement resolution? For example, the DAS-II (Elliott, 2007) places individual test scores on the T-scale (M=50; SD=10), with scores that range from 10-90. The WJ III (McGrew & Woodcock, 2001) places all test and composite scores on the standard score (SS) metric associated with full scale and composite scores (M=100; SD=15). The critical question to be asked is “are there advantages or disadvantages to retaining the historical ss 1-19 scale, or are there real advantages to having individual test scales with finer measurement resolution (DAS-II; WJ III)?”
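To make the resolution issue concrete, here is a minimal sketch of the linear rescaling between the three metrics named above, using only their stated means and SDs. The function names are hypothetical. This shows the plain arithmetic of the "just multiply by five" conversion and does not reproduce any publisher's norming procedure.

```python
# (mean, SD) for each metric, as given in the text above.
SCALED = (10.0, 3.0)      # Wechsler-type subtest scaled score (ss), range 1-19
T_SCALE = (50.0, 10.0)    # T-score, e.g., DAS-II subtests
STANDARD = (100.0, 15.0)  # standard score (SS), e.g., IQ and WJ III tests


def rescale(score, from_metric, to_metric):
    """Linearly re-express a score from one (mean, SD) metric on another."""
    from_mean, from_sd = from_metric
    to_mean, to_sd = to_metric
    return to_mean + (score - from_mean) * to_sd / from_sd


def points_per_unit(from_metric, to_metric=STANDARD):
    """How many to_metric points a one-point change on from_metric equals."""
    return to_metric[1] / from_metric[1]


# Every possible ss value (1-19) re-expressed on the SS (M=100; SD=15) metric.
ss_as_standard = [rescale(ss, SCALED, STANDARD) for ss in range(1, 20)]
print(ss_as_standard)

print("one ss point      =", points_per_unit(SCALED), "SS points")
print("one T-score point =", points_per_unit(T_SCALE), "SS points")
```

The list printed for the ss scale contains only 19 values spaced five SS points apart, so a one-point change in a subtest scaled score always appears as a five-point jump on the IQ-type metric. A one-point change on the T-scale corresponds to 1.5 SS points.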

......continued............
(complete report available at links in first paragraph of this post)



Monday, December 14, 2009

New IAP Applied Psychometrics 101 Report: IQ scores and SEM

A new IAP Applied Psychometrics 101 report (#5) is now available. The title and abstract of the report are below. The report can be downloaded by clicking here.

Applied Psychometrics 101 #5: The Standard Error of Measurement (SEM): An Explanation and Facts for "Fact Finders" in Atkins MR/ID death penalty proceedings.

Abstract

The standard error of measurement (SEM) is a professionally accepted and scientifically based measurement concept that allows users of psychological test scores to account for the known degree of imprecision in the scores. Atkins MR/ID cases almost always involve standardized psychological testing in the domains of intelligence (IQ tests) and adaptive behavior (AB). Scores from IQ and AB measures are fallible—not perfectly reliable. This report provides an easy to understand explanation of the psychometric concept of SEM augmented by an example based on real-world data. The report concludes with 8 SEM facts that “fact finders” should understand and internalize when evaluating psychological test data during legal proceedings--Atkins MR/ID death penalty proceedings in particular.
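For readers who want to see the arithmetic, the classical test theory formula is SEM = SD × sqrt(1 − reliability). The sketch below applies it and builds a symmetric band around an obtained score. It is only an illustration: the function names are hypothetical, the reliability of .96 and the obtained score of 70 are made-up inputs rather than values from the report, and centering the band on the obtained score is a simplification.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def confidence_band(score, sd, reliability, z=1.96):
    """Symmetric band around an obtained score (z=1.96 is roughly 95%)."""
    half_width = z * sem(sd, reliability)
    return score - half_width, score + half_width


if __name__ == "__main__":
    # Illustrative inputs only: an IQ-metric score (SD=15) and an
    # assumed reliability of .96. Neither value is taken from the report.
    print(round(sem(15.0, 0.96), 2))
    low, high = confidence_band(70.0, 15.0, 0.96)
    print(round(low, 2), round(high, 2))
```

With these assumed inputs the SEM is about 3 points, so an obtained score of 70 is better read as a range of roughly 64 to 76 than as a single fixed value.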

Here is a visual treat/tease from the report:

All prior IAP AP101 reports can be accessed via the Applied Psychometrics 101 (AP101) Reports section of the blog--on the blog sidebar.

Technorati Tags: psychology, forensic psychology, criminal psychology, school psychology, educational psychology, neuropsychology, developmental disabilities, MR, ID, mental retardation, intellectual disability, AAIDD, Joint Test Standards, psychometrics, psychological measures, applied psychometrics, SEM, standard error of measurement, reliability, measurement error, Atkins cases, death penalty, capital punishment, adaptive behavior, intelligence, IQ, IQ tests, IQ scores, ICDP

Tuesday, November 17, 2009

Cluster analysis of the WJ III: Implications for test interpretation and CHC model extensions

IAP AP101 # 4 report is now available (click here for all AP101 reports and briefs). "IAP AP101 Report #4: Cluster analysis of the WJ III Battery: Implications for CHC test interpretation and possible CHC model extensions" can be downloaded or viewed by clicking here.

The PPT files are also viewable and downloadable via SlideShare.

Abstract

The WJ III Battery comprises both cognitive (intelligence) and achievement components. As reported in the technical manual, the organizational structure of the WJ III, which is based on the Cattell-Horn-Carroll (CHC) theory of cognitive abilities, has been validated. The current investigation analyzed the cognitive and achievement tests for all WJ III norm subjects aged 6-18 years. Cluster analysis of the 50 WJ III tests provides additional validity evidence for the CHC structure of the WJ III. More importantly, the analyses provide support for a significant number of narrow ability classifications for many WJ III tests, classifications that (to date) have largely been based on expert consensus task analysis. The results also suggest possible new interpretative clusters and intermediate CHC dimensions warranting future research regarding the CHC taxonomy of human cognitive abilities.

Wednesday, November 11, 2009

MDS analysis of the WJ III: Implications for CHC theory refinement and extension

IAP AP101 # 3 report is now available (click here for all AP101 reports and briefs). "IAP AP101 Report #3: MDS Analysis of the CHC-based WJ III Battery: Implications for possible refinements and extensions of the CHC model of human intelligence" can be viewed or downloaded by clicking here.

The PPT files are also viewable and downloadable via SlideShare.

Abstract

The WJ III Battery comprises both cognitive (intelligence) and achievement components. As reported in the technical manual, the organizational structure of the WJ III, which is based on the Cattell-Horn-Carroll (CHC) theory of cognitive abilities, has been validated. The current investigation analyzed the cognitive and achievement tests for all WJ III norm subjects aged 6-18 years. Multidimensional scaling (MDS; Guttman Radex model) of the 50 WJ III tests suggested new facets from which to interpret the WJ III. The results suggested three to four higher-order intermediate CHC model stratum abilities that varied along the dimensions of (a) controlled vs automatic cognitive processing and (b) product- vs process-dominant abilities. The results, together with recent similar analysis of the WAIS-IV, support Woodcock’s Cognitive Performance Model (CPM). Possible minor changes to the CPM are suggested. More importantly, the WJ III and WAIS-IV results collectively suggest hypothesized refinements and extensions of the CHC intelligence framework. Research focused on exploring the compatibility of a combined CHC and Berlin Model of Intelligence Structure (BIS) theory is recommended.

Technorati Tags: psychology, school psychology, educational psychology, forensic psychology, psychological testing, intelligence, IQ, IQ tests, IQ scores, WJ III, WJ III NU, Woodcock Johnson, MDS, CHC theory, Cattell-Horn-Carroll theory, controlled cognitive processing, automatic cognitive processing, process-dominant abilities, product-dominant abilities, factor analysis

Sunday, November 08, 2009

What does the WAIS-IV measure? Applied Psychometrics 101 Report 2

What does the WAIS-IV measure?: CHC analysis and beyond.

IAP AP101 # 2 report is now available (click here for other AP101 reports and briefs). "IAP AP101 IQ TEST SCORE DIFFERENCE SERIES #2: What does the WAIS-IV measure? CHC analysis and beyond" can be viewed or downloaded by clicking here.

The PPT files are also viewable and downloadable via SlideShare.

Abstract

The WAIS-IV (2008) is the latest revision of the adult Wechsler battery. The addition of new tests and the deletion of old ones, plus a more factor-based foundation for the composite indexes, require psychologists to be familiar with the best possible interpretative structure of the venerable battery. In this PowerPoint-based report, the available published and unpublished confirmatory factor studies of the WAIS-IV subtests are summarized. They are then augmented via a series of new exploratory data analyses of the WAIS-IV. It is concluded that the currently available structural research argues for a CHC-based organization of WAIS-IV subtest scores that differs from the structure suggested by the test publisher. In addition, exploratory methods, when combined with similar analysis of the WJ III battery, provide support for possible intermediate level CHC dimensions (between g and the Gf-Gc broad abilities) in the contemporary CHC theory of intelligence.

Technorati Tags: psychology, educational psychology, school psychology, neuropsychology, forensic psychology, clinical psychology, psychometrics, IAP, AP101, IQs Corner, intelligence, IQ, IQ tests, Wechsler, WAIS-IV, CHC, Cattell-Horn-Carroll theory, MDS, cognitive abilities, intellectual disabilities, ID, MR, mental retardation, special education, LD

Saturday, September 12, 2009

Applied Psychometrics 101: Why IQ scores can differ #1 9-12-09 revision

If you downloaded the report AP101 #1 yesterday, you should return to the post and download a revised version. Some confusion in the discussion and estimation of the range of expected IQ difference scores (between different IQ tests that correlate at different levels) has been clarified.

I want to thank Dr. Joel Schneider for pointing out the confusion in the first draft. I plan to post future reports in a similar "draft" form--with the goal to receive comments and feedback that will result in better revised reports.
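As a rough illustration of why the correlation between two IQ tests matters for expected difference scores, the sketch below uses the standard formula for the standard deviation of a difference between two scores on the same metric: SD_diff = SD × sqrt(2 × (1 − r)). This is a textbook simplification and not necessarily the estimation approach used in the report; it assumes both tests share SD=15 and that scores are normally distributed, and the function names are hypothetical.

```python
import math


def sd_of_difference(r, sd=15.0):
    """SD of the difference between two equally scaled scores correlated at r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must be between -1 and 1")
    return sd * math.sqrt(2.0 * (1.0 - r))


def mean_absolute_difference(r, sd=15.0):
    """Expected absolute difference, assuming normally distributed scores."""
    return sd_of_difference(r, sd) * math.sqrt(2.0 / math.pi)


if __name__ == "__main__":
    # The correlations below are illustrative, not values from the report.
    for r in (0.5, 0.7, 0.8, 0.9):
        print(r, round(sd_of_difference(r), 1), round(mean_absolute_difference(r), 1))
```

The pattern to notice is that the spread of difference scores shrinks as the correlation rises but does not vanish until the correlation reaches 1.0, so sizable differences between scores from two well-correlated IQ tests are expected rather than anomalous.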