Friday, December 3

Assessment

We are just beginning an assessment period at my school, so assessment tasks have been rather on my mind, and I would like to share some of my observations about them.

They can be classified in various ways, including the following:
In-class or prepared
Constructive or diagnostic
Knowledge-based or skills-based
Short answer or extended response
Topic-focused or overall
Creative, analytical or research (for knowledge-based tasks only)

Already a bewildering number of options present themselves, but at least a few may be eliminated. It is unlikely, for example, that you would be required to do a short-answer, knowledge-based, in-class creative task, because it is almost impossible to write the questions, let alone answer them. Some tasks are irrelevant to some subjects: creative tasks are not possible in maths/science subjects. Even so, there is no possibility of a task which suits all subjects, or even all topics within a subject.

It occurs to me that, with this in mind, a standardised national test, set in a similar format for every subject, is unlikely to represent truthfully the skill-set of even the average student. Why, then, are such tests so prevalent?

I believe such final exams came into use before much study had gone into teaching and learning styles, at a time when rote answers sufficed in a great many subject areas, and that they have since been reinforced by decades of existence and by the difficulty of making major changes to education systems. They are now regarded as a more or less fair way to distinguish those students capable of extensive higher-level education from those who are not. Are they?

The two important factors in judging a test are its reliability and validity. Is it reliable? If the same person undertook the same test three times with the same knowledge, would they get the same marks? If the same paper was marked three times, would the same marks be given? The design of these tests is highly concerned with this question of reliability, which leads to a greater proportion of short answer or even multiple choice, as well as cross-marking in most subjects. I think I can say with a reasonable degree of certainty that most final exams are reliable. Are they also valid? Do they test what we think they are testing? To answer that, we must first define what we wish final examinations to do.

In my mind, final exams are designed to discriminate between levels of ability and predicted success at university, to tell us which students will do well there and which will do worse. With that in mind, we should look at what makes a successful university student. Good university students are motivated, independent, dedicated, disciplined and enjoy learning. In essence, good university students are those who can learn well, quickly and on their own. So, final examinations should test the ability of a student to learn, and to learn independently.

The HSC is a year and a half of continuous learning, with a small break in the middle and continuous assessments throughout, leading to a final exam which focuses on factual information and skill-sets. Such a process tests endurance. It tests the ability to withstand stress. It tests memory. It tests support systems and the ability to find resources. All of those things are useful at university, but they are not, I believe, predictors of success. So we reach the conclusion that the HSC is an invalid test and, by extension, that similar exams in other states and countries are also invalid.

Having reached the conclusion that the current exam format, though reliable, is invalid as a discriminator for university purposes, the immediate question is: what else could we do?

We wish to design an assessment of learning ability, particularly independent learning. It is important to note that this is not a test of intelligence - though it may contribute to learning ability, a high IQ will not guarantee high marks, in the same way that it does not in current systems. Nor would it aim to test the information in the syllabus. At first it seems an impossible task. To throw an idea in the air:

Topics could be assigned at random, with students asked to use set materials to create either a presentation (with script and visuals submitted) or a report (similar to an academic essay), and given a set time (a fortnight, a month) to do so. So that students could work roughly within their comfort zone, they would choose a broad subject area (maths/sciences, social sciences, philosophy/literature). If they used additional sources, they would gain marks, because the ability to research is useful; however, the core marks would be given for successful integration of the set material, and the set material would be sufficient for an acceptable presentation or report. Students would be marked on originality and individuality, to emphasise the importance of their own work. Answers would be cross-checked (using the now-prevalent anti-plagiarism software) against responses from other years, to ensure students had done the learning themselves. After submitting their work, each student would sit a viva voce - talking to an examiner about their topic area, without notes - as a demonstration of how well they had absorbed the information.

At first sight, this model seems to require far too many markers; in fact, it would probably need half the markers that current systems use. Presently, every student sits final exams in at least five or six subjects. This model replaces those five or six exam papers (each of which is marked twice, taking over an hour) with one paper and one presentation. Interviewing each student seems as though it would take an enormous amount of time, with 70 000 students in my state alone, but once again, the HSC already makes provisions for large numbers of students. If each school had a set of markers, the students could be heard in ten days, by my approximate calculations, and that assumes each school hears only one student at a time. There are logistical problems, but they are not insurmountable, and they are certainly no larger than those presently handled by the Board of Studies.
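The "ten days" estimate can be checked with a quick back-of-envelope sketch. The student count comes from the text; the number of participating schools, the length of one viva, and the examining hours per day are my own assumed figures, purely for illustration:

```python
# Back-of-envelope check of the viva voce scheduling estimate.
# Only the student count is from the text; the rest are assumed figures.

students = 70_000      # candidates in the state (figure from the text)
schools = 750          # ASSUMED number of schools hosting vivas
viva_minutes = 30      # ASSUMED length of one viva voce
hours_per_day = 5      # ASSUMED daily examining time per school

students_per_school = students / schools                # ~93 students each
vivas_per_day = (hours_per_day * 60) // viva_minutes    # 10 vivas per day
days_needed = students_per_school / vivas_per_day       # ~9.3 days

print(f"{students_per_school:.0f} students per school, "
      f"about {days_needed:.1f} days with one student heard at a time")
```

Under these assumptions each school hears roughly ninety students and finishes in nine or ten days, consistent with the estimate above; doubling the number of concurrent examiners per school would halve the time.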

I think that raising these questions about the nature of assessments and trying to come up with new ideas is important. A large contributor to the unshakability of systems currently in place is that other options are rarely considered. Such drastic changes would be very difficult to implement, but evaluating them paves the way for smaller movements towards valid assessment.