Breaking Down the Independent Review of Mary Willingham's Data


The University of North Carolina has released the findings from a review of Mary Willingham's data and claims regarding student-athlete reading levels at the school. The review, conducted independently by three experts in the field, concluded that the data did not support the claims Willingham has made and the media has parroted since January.

The executive summary, with links to the individual reports, can be found at the Carolina Commitment website. The review was done by three individuals who worked independently of each other: Nathan Kuncel from the University of Minnesota, Lee Alan Branum-Martin from Georgia State and Dennis Kramer from the University of Virginia. It should be pointed out that the review does not mention Willingham by name, mainly because the three reviewers are dealing only with the data provided to them. UNC is also in a precarious position regarding Willingham and cannot be seen trying to bully her: she has obtained "whistleblower" status and has pending grievances lodged against UNC, which makes her close to untouchable. Willingham's data, on the other hand, is fair game.

Here are some highlights from the executive summary (emphasis mine):

The screening tests were administered when the student-athletes first arrived at UNC, either during summer session II or fall semester of their freshman year. The student-athletes who took these tests, like other students tested for learning differences and/or disabilities, did so voluntarily with the expectation that the results would be treated confidentially and would be used solely to strengthen their educational experience.

This is a big deal. The student-athletes in question took these tests on the assumption the data would be confidential and used for the express purpose of evaluating their educational needs at UNC. While UNC and other schools might make use of such data to analyze certain issues, it would be done with the data "de-identified" to ensure privacy. That wasn't the case with this data. The internal review shows Willingham had access to all sorts of identifiable information.

There were no SATA subtest scores for six of the students listed. The data set contained identifiable information about students, including names, the year they entered the University and their sport, as well as SAT scores and GPA information that was added later, after the original screening.

The confidentiality issue is not related to the review's overall purpose, which makes this a dig of sorts at Willingham. It also highlights a huge ethical issue for her. According to the IRB at UNC, Willingham was only supposed to be working with "de-identified" data. She was never supposed to have identifiable information, and the fact she did is why the IRB moved to yank her approval three days after this data was turned over to the Provost's Office.

And if that wasn't enough, the investigation of the submitted data revealed even more intriguing aspects.

An internal team from the Provost’s Office and ASPSA compared the January 13 data set to the outside neuropsychologist’s records that had been retained in ASPSA since 2004. They found that the January 13 data set did not include a complete list of student-athletes from revenue sports who were screened between 2004 and 2012. Rather, the January 13 data set included some (but not all) members of the football, men’s and women’s basketball, baseball and volleyball teams. The data set did not include information about student-athletes from non-revenue sports who were tested during the same period. Also, the data set listed some members of the baseball team as members of the men’s basketball team.

To sum up, the data Mary Willingham provided didn't even match the records of the neuropsychologist who actually administered the tests.

(Note: According to Dan Kane, UNC terminated the contract with the neuropsychologist who administered these tests. And yes, he dropped that a few minutes after UNC released these findings, because that's what Dan Kane does.)

Based on the data set the Provost's Office received, not all of the screened student-athletes were included. The review also notes that the actual number of tests included was 176, not the 182 or 183 that were both reported by the media. Does that impact the findings? Why is the data different? Did Willingham take certain student-athletes out of the data set? The fact that the data UNC received, on which Willingham's findings were based, differed from what the test administrator had is problematic, to say the least.

As for the findings, here is what the reviewers concluded.

1. The SATA RV subtest, a 25-question multiple choice vocabulary test, is not a true reading test and should not be used to draw conclusions about student reading ability.

All three experts said that the test in question is used to ascertain vocabulary knowledge and cannot be used to determine reading level. The SATA has a reading comprehension test but, as Branum-Martin notes, "that was not reported on."

2. The data do not support the public claims about the students’ reading ability

Kramer provides the key finding saying, "This report could not find any analytical approach that produced the 60% reported from the data provided."

In addition, Branum-Martin points out that even using the SATA RV for grade equivalency, 109 of the 176 students "with valid scores in the sample had Reading Vocabulary grade equivalents above 12th grade." Branum-Martin notes that this is simply a vocabulary test, not a reading test with "connected text." In short, even by Willingham's own standard, 109 of 176 athletes in this group (roughly 62 percent) were college-level readers.

3. Reading ability should not be reported as grade equivalents

Kramer basically says that scores on this test are no more grade equivalents than the SAT's 200-800 scale is.

4. The difference in demographics between the SATA test norm and the demographics of the UNC student-athletes is important to understanding conclusions that can be drawn from the data.

This might be the most interesting conclusion. The report notes that the norm group used for the SATA RV has a drastically different demographic makeup than that of the athletes tested. The demographics the SATA test was "normed" against were 54% female and 86% white, 8% black and 6% other; the athletes in question, however, were 86% male and 59% black.

UNC asked if this impacted how the conclusions could be drawn given it is so vastly different from the norm which was based on the U.S. population in general.

"Results indicate that performance on the SATA assessment was significantly predicted by race and gender, but not sport participation or age of entry. In particular, it appears that males performed two points lower than their female counterparts. Additionally, African-Americans performed 2.3 points lower than their White counterparts regardless of their age or sport participation. The common discourse around the [public claims] is football and men’s basketball players were admitted with significantly lower reading levels as compared to non-revenue generating sports. However, it appears that the SATA assessment is biased downward for males and African-Americans rather than football and men’s basketball participants. Given that African-American males are highly represented within these two sports, it stands to reason that the potential gender and racial biases of the SATA assessment are leading to lower scores for that particular population. While further data is needed to validate these claims, Table 3.3 provides a basis for future inquiry into the potential bias." (Kramer)

Kramer expresses concern that the assessment is "biased downward" for African-American males, who happen to make up the majority of the football and men's basketball teams. Kramer does say more data is needed to validate the claim that this bias leads to lower scores for that particular demographic.

5. The SATA subtests were administered in low-stakes settings, meaning that the result of the test had relatively unimportant consequences to the test taker. Low stakes settings are thought to influence test results.

Succinctly put, the athletes probably didn't give the test much thought and may have lacked motivation, and there is evidence to suggest such low-stakes settings produce lower scores.

6. While SATA RV (the 25-question multiple choice vocabulary subtest) results can be informative as part of screening for learning differences and/or disabilities, they are not accepted by the psychological community as an appropriate measure of reading grade level and literacy.

In short, almost no one in the field of psychology uses this test to determine literacy. According to Kramer, a 1999 survey of 728 leading neuropsychologists found that less than 1% used the SATA at all, and even fewer used it for "assessment of reading abilities." Kramer notes, "The lack of acceptance of the SATA assessment by the psychological community produces yet another concern with the use of the SATA assessment as the only measure of student ability."

Within an hour of UNC posting the findings, Mary Willingham issued a statement via her mouthpiece at CNN, Sara Ganim.

In case you are wondering: yes, that looks (in form) an awful lot like the infamous "AFAM 41" paper from two weeks ago (hat tip @ChrisFord_UNC).

Willingham asserts that UNC didn't use all of the data, which is interesting given that Willingham told the News and Observer on January 16 she had turned the data over to Provost James Dean. Now that that data has been exposed as hot garbage, Willingham is claiming there is other data out there which UNC neglected to use. The implication is that UNC only used data that proved its case and ignored data that validated her findings. Willingham is also upset that no one talked to her about the data and her findings while doing the review. I am just going to leave that one alone.

It is also comical to see Willingham screaming about integrity and whatnot when she herself violated IRB conditions regarding what data she was supposed to have and how it should be used. Not to mention, the three professors who did these reviews are probably not going to put their respective reputations on the line to cover UNC's rear end. This is a clear case of Willingham running afoul of academics who take things like research very, very seriously. It's never a good idea to release findings based on highly questionable methodologies, and to do so while extending a middle finger in the direction of the school's IRB. For the reviewers this isn't about covering for UNC; it's about slapping down an individual who is acting without credibility in their field.

And finally, most of this doesn't really matter in the overall narrative. The stories have already been written and people like Sara Ganim and Dan Kane are going to stand by Willingham even if it means setting fire to their own credibility in the process. As Doc likes to say, headlines are for the front page and corrections for B11. No media outlet is going to give this any attention other than to print whatever garbage Willingham decides to spew in response.

It also should be noted that Willingham has done great damage to a cause she claims to care so much about. UNC's response is about smacking down bogus research, because that's what it is. None of what UNC has posted today changes the fact that there are still underprepared athletes who need academic help. The problem with Willingham's claims is exactly what Bradley Bethel has said: an athlete who might not be prepared for UNC is not, at the same time, a drooling vegetable who can't read "Go Dog Go." The fact that these athletes produced an NCAA-qualifying score on the SAT or ACT indicates their literacy must be higher than Willingham claims. There are legitimate issues in intercollegiate athletics regarding players being propped up to stay eligible, so much so that it wasn't necessary to conjure up false research findings to make the point. It certainly wasn't necessary to impugn the majority of student-athletes at UNC who are performing as they should in the classroom.

Willingham, Jay Smith and others have claimed this is about reforming the NCAA and UNC. That is a laudable goal, but the proof is in the pudding. What Willingham and Smith have chosen to do is peddle questionable data, cast aspersions on UNC athletes and offer only token reforms. As Bethel noted last night, UNC is engaging in actual reform, which almost never gets discussed by the media, nor has it been acknowledged by the Paper Class Inc. crowd. Both of those entities are too busy chasing banners, which isn't about reform; it's about someone's bottom line.
