In the previous post, I discussed some broad themes with regard to summative and formative assessment. Today, let’s dig into the weeds of why we have grades in the context of summative assessment, at least according to Daisy Christodoulou in Making Good Progress, the book I’ve been reading. I will begin by using a question-and-quote approach similar to the previous post.
(I recommend reading the previous post for definitions I’ll be using.)
Why are summative assessments needed?
To support large and broad inferences about how pupils will perform beyond the school and in comparison to their peers nationally.
Why are there ‘grades’?*
The purpose of grades and other similar labelling systems is to provide an easy way of communicating this shared meaning… An employer or a teacher at a further education [university or college] is able to infer from it how well the pupil has done in that subject.
Are there other assessment alternatives to letter grades?
There are other methods of communicating a shared meaning which don’t look like a grade but in practice fulfil the same function… [for example, reporting] performance in terms of labels such as ‘emerging, expected and exceeding’ [a particular standard].
If being able to communicate the shared meaning is key, what restrictions does this impose on summative assessments?
1. They need to be taken in standard conditions with strict restrictions on the type of external help that is, and that is not, available.
2. They have to include questions that allow us to distinguish between different pupils.
3. They have to sample from [a larger] body of content [because] it will not be possible to cover all of it in a test that is a couple of hours long.
While I’ve provided soundbites from Christodoulou, she explains each of these points more fully and provides clear examples. I recommend reading her book if you find any of this interesting. With the main purpose of summative assessment laid out, we can see why formative assessment might look different. In particular, the teacher does not need to worry about the three restrictions above when designing such activities and assessments, particularly if the purpose of formative assessment is figuring out what to do next given where the students are at a certain point. Christodoulou argues that the ‘responsive’ aspect of formative assessment is important: feedback runs both ways, from students to teacher and vice versa.
Let’s take a closer look at exams, since they are one of the most common methods of assessment in science courses. (I certainly use them regularly.) Exams clearly provide summative information, but can they also provide useful formative assessment? One way they might do so is via detailed question-level analysis, if the questions are sufficiently detailed and can be broken down into discrete parts. If a student answered a question incorrectly, why did the student err? Christodoulou provides a nice example of an exam question on electrolysis with commentary from a science teacher analyzing the possible errors. While this isn’t a multiple-choice question (MCQ), I’ve seen a number of cleverly paired MCQs where the right and wrong answers in the first question are paired with suitable “I chose my answer because…” MCQ explanations. Such paired MCQs are difficult to design if you’re a novice teacher, but an experienced instructor who knows the common pitfalls can zero in on them. I sincerely hope there is a good test bank of such questions; or rather, I should start collecting all the examples I find now so I can use them!
One of the challenges in summative assessments where the grades function as a discriminator of student performance is that, at the college level, the exam questions are more complex and require synthesizing different conceptual knowledge and skills. These sorts of questions work well to distinguish the ‘A’ student performance from the ‘C’ student performance. However, as the complexity of the question increases, the reason a student may err becomes increasingly difficult to pinpoint. We certainly want to get our students up to the point of being able to tackle these more ‘authentic’ problems, but throwing such problems at them early on is likely to be more confusing than enlightening.
It makes you wonder if grading these earlier efforts is helpful from a learning point of view. I use low-stakes quizzes throughout the semester, but mainly as a motivating factor for students to keep up with the material and do the reading and homework; they have very little impact on the final grade. In my general chemistry class this semester, I’m experimenting (again) with low-stakes take-home exams but a higher-stakes final exam. I worry that this approach favors the stronger students and disadvantages the weaker students, but I don’t have enough data to conclude if this is truly the case. Christodoulou provides a suggestion I haven’t yet tried out: that formative ‘grades’ be assigned based on improvement measures. But there are caveats to this approach (which she also tackles). I’ll have to think a little more about how to avoid the pitfalls should I try this. There are also other approaches to qualitative formative assessment.
Chapter 5 (“Exam Based Assessment”) of Christodoulou’s book closes by asking whether exams and grades provide valid summative information. We discussed the challenges of reliability in such assessments in the previous post. I’ve also previously considered the aversive control of grades. And there’s the elephant in the room of ‘teaching to the test’, or students employing short-term strategies at the expense of long-term learning. Gosh, teaching is complicated!
*Did you know that college grades influenced the meat-packing industry?