There is a new AERA publication on the edTPA that questions the validity of using the portfolio-based assessment as a teacher certification exam.  Since we operate in Texas, these recurring concerns suggest that Texas needs to stop and take a hard look at the direction we are heading.

The new study by Drew Gitomer of Rutgers (a very solid researcher) and colleagues, entitled “Assessing the Assessment: Evidence of Reliability and Validity in the edTPA,” is pretty clear.  Their conclusion:

“We argue that, in light of the evidence available, the current proposed and actual uses of edTPA in evaluating PSTs and programs are not sufficiently supported on technical and empirical grounds. We recommend that serious consideration be given to a moratorium on using edTPA scores for consequential decisions at the individual level, pending provision of appropriate evidence of the reliability, precision, and validity of the scores produced by the assessments and, given the stakes involved, an independent technical review of this evidence by an expert panel.”

That’s harsh. The basic gist of the study is that because edTPA is a portfolio-based assessment that relies entirely on human raters for its scores, you have to have solid interrater reliability. This means that if I grade a submission and you grade the same submission using the rubric provided, we should give the portfolio the same score.
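To make “interrater reliability” concrete, here is a minimal sketch in Python. The scores are made up for illustration; nothing here comes from the study or from Pearson. It compares simple percent agreement with Cohen’s kappa, a standard measure that corrects for the agreement two raters would reach by chance alone.

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of portfolios where the two raters give exactly the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_obs = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Chance agreement: the probability both raters land on the same score
    # if each simply followed their own overall score distribution.
    p_chance = sum((ca[s] / n) * (cb[s] / n) for s in set(a) | set(b))
    return (p_obs - p_chance) / (1 - p_chance)

# Made-up rubric scores (1-5) from two raters on the same ten portfolios.
rater_1 = [3, 3, 4, 2, 5, 3, 4, 3, 2, 4]
rater_2 = [3, 4, 4, 3, 4, 3, 5, 3, 2, 3]

print(percent_agreement(rater_1, rater_2))  # 0.5 -- exact match on half
print(cohens_kappa(rater_1, rater_2))       # ~0.26 once chance is stripped out
```

Half the scores match exactly, but once chance agreement is removed the kappa drops to roughly 0.26, which is conventionally read as weak agreement.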

Historically, portfolios have not done well as certification exams because it is extremely difficult to establish interrater reliability at scale. For the Texas edTPA pilot, Pearson is using local teachers who complete an online training and are paid $50 per portfolio. So essentially they are expected to grade “holistically” rather than spend hours on each portfolio.

But that score will ultimately decide whether someone gets a teaching license, so it has to be reliable enough to stake a career on.

So according to the study, edTPA devised its own interrater reliability statistic that sidesteps these issues and makes the scoring look more consistent than it actually is.  Candidates can fail because one rater gives a good score and another rater gives a poor score on the same submission, yet edTPA’s statistic still counts those raters as agreeing.  It’s very technical, but it is not good.
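The study gets deep into the psychometrics, but here is one hypothetical way a reliability number can be inflated (my own illustration, not edTPA’s published method): count raters as “agreeing” whenever their scores fall within one rubric point of each other. On the same made-up scores as above, that looser definition reports perfect agreement.

```python
# Same made-up scores as in the sketch above.
rater_1 = [3, 3, 4, 2, 5, 3, 4, 3, 2, 4]
rater_2 = [3, 4, 4, 3, 4, 3, 5, 3, 2, 3]

def adjacent_agreement(a, b, tolerance=1):
    """Count a pair as 'agreeing' when the scores differ by at most `tolerance`."""
    return sum(abs(x - y) <= tolerance for x, y in zip(a, b)) / len(a)

print(adjacent_agreement(rater_1, rater_2))  # 1.0 -- "perfect" agreement
```

Every pair lands within one point, so this statistic reports 100% agreement, even though the raters matched exactly on only half the portfolios, and a one-point gap on a few rubrics can be the difference between passing and failing.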

I look forward to hearing what the response will be from those who support edTPA.