Testing Objectives and the Curriculum
Testing objectives for English language teaching are often problematic, particularly when there is no explicit curriculum. Without a clear description of the aims you are trying to achieve, it is not possible to know whether or not you have got there.
In the absence of clear curriculum objectives, testing becomes a hit-and-miss affair. Most often, tests are constructed from an unsystematic selection of some of the structures, words and topics which have been used in teaching throughout the previous weeks, months, or over the whole academic year. These items are then presented in the test as objects in themselves, with, possibly, little thought as to how and why they were dealt with in the classroom in the first place. Then there will be a reading passage and perhaps a listening test… a quick search through course books that you didn’t use for suitable material… and there is your exam.
Exam content is selected according to a feeling that, “Yes, they should be able to do this. It’s about their level” or “They did this in class. Let’s see if they can still do it” rather than any commonly accepted information gathering principles.
Effective testing objectives have to be based on the curriculum for a number of reasons. The main one, assuming that the main aim of testing is to gain a variety of information about how well the language training or education in your institution is going, is that tests need to be based on what you are trying to achieve. To do that, of course, you need a clear, public, agreed statement of what you are trying to achieve, why, and how you propose going about it.
In writing the Lise preparatory year exemption examination last year, we were in the position of not having a curriculum suitable for generating examination aims. As an interim measure, we decided to create an exam specification which could later be used as the basis for part of our English language curriculum. The specification would be based on what we had been doing in years 6, 7 and 8, but expressed in terms of objectives. We felt that this would not only give us a usable basis for an exam, but would also allow us to assess whether or not our implicit objectives as realised in the test were really what we wanted to aim at. In other words, we were working backwards: not going from aim to method, but instead writing down our methods and seeing what aim was implicit in them.
We divided our test into five sections, being the now traditional Use of English, Reading, Writing, Listening and Speaking. Most weighting was given to Use of English, as we felt this represented the reality of the teaching. We first created a trial exam, which was held in April and then a real exam, which was held towards the end of June. I will take each part in turn.
Use of English
These are all classic Cambridge First Certificate style question types. The skills and knowledge they test are formal and abstract. They involve recognition of formal categories in the language.
2. Example question….
In the trial exam, this was found to be a difficult question, and one which did not discriminate well. We felt that this might be due to the fact that the texts selected used lexical items that were too specialised (economics) and so in the real exam we used more general texts. The facility index did increase to a satisfactory level, but discrimination power was still somewhat unsatisfactory despite a small increase.
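The "facility index" and "discrimination power" referred to throughout are standard classical item statistics: the facility index is simply the proportion of candidates who answered an item correctly, and a common discrimination measure compares the item's facility in the top- and bottom-scoring groups. The sketch below is only an illustration of the general technique; the function names, the upper-lower 27% split and the example data are my own assumptions, not the analysis procedure actually used for this exam.

```python
# Illustrative classical item analysis: facility index and
# upper-lower group discrimination (assumed 27% split).

def facility_index(item_scores):
    """Proportion of candidates who answered the item correctly (0..1)."""
    return sum(item_scores) / len(item_scores)

def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Difference in item facility between the top and bottom scoring groups.

    item_scores: 1/0 per candidate for this one item.
    total_scores: each candidate's whole-test score, in the same order.
    """
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    n = max(1, int(len(ranked) * fraction))
    upper = [item_scores[i] for i in ranked[:n]]   # strongest candidates
    lower = [item_scores[i] for i in ranked[-n:]]  # weakest candidates
    return sum(upper) / n - sum(lower) / n

# Hypothetical data: 10 candidates, one item
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
totals = [95, 88, 82, 76, 70, 65, 60, 55, 40, 30]
print(facility_index(item))               # 0.5
print(discrimination_index(item, totals)) # 1.0 (top group all right, bottom all wrong)
```

An "easy but poorly discriminating" question of the kind described above would show a high facility index but a discrimination value near zero.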
3. Example question…
You are too young to drive.
This proved very difficult. In both the trial and the real exam, the analysis showed that the questions were on the whole sound, but that the skill had just not been acquired. After our analysis of the trial exam, we had decided to keep the test, but make the questions more achievable. Increases in item facility were, however, minimal, and were accompanied by a small decrease in discrimination power.
It is apparent in scanning the data that the questions in which the students did best were those where the keyword was a recognised ‘trigger’ for a particular structure type (such as “enough”, “unless”, “been”). In these cases, these words are habitually associated with the structures required. The most difficult questions used more general items as keywords (e.g. “can’t” instead of the trigger phrase “can’t have”). Students recognise the ‘trigger’ phrase, which calls to mind the structure to be used, rather than understanding the full meaning of the original sentence and then trying to re-express it in a different way. This suggests to me that the successfully completed keyword transformations in the exam were more the result of memorised knowledge about the structures rather than an ability to actually manipulate the structures of English.
I believe that this question type, and question 1 as well, requires a degree of abstracting ability that is not yet fully developed in 12 and 13 year olds; it does not reflect their level of cognitive development.
4. Example question…
We have learned lots of things in English last year.
This was an error correction question. Here, the skill tested was primarily one of recognition. The students had to look at a text in which each sentence possibly contained an extra word. They had to decide firstly whether or not a sentence was wrong, then isolate the extra word and report it.
In the trial exam, this question proved easy enough, but interestingly, did not discriminate very well. In the real exam, we replaced the text with individual sentences and also made sure that each sentence had an extra word instead of giving the students the extra burden of first deciding whether a sentence was right or wrong before correcting it. There was a small increase in facility, but most interestingly a very large increase in the discrimination power of the test. The better students seemed more certain about the task.
These observations in general suggest to me that in grade 8 students are still thrown by uncertainty, and while they are aware of correct and incorrect usage in English, they are not confident producers of it. It is also probably the case that these kinds of highly abstract questions are not an ideal vehicle for 12 and 13 year olds. It should be borne in mind that FCE and its question types are intended for older students. We are currently discussing whether to change our Use of English aims in grade 8, or to keep them but alter the way we try to achieve them. It is almost certain that the question types themselves will go through some serious changes.
Reading
Three definite and recognised reading subskills were tested here. Exam specifications were in terms of the kind of reading subskill that we felt should be in the syllabus. The test analyses show that we will have to somehow integrate a text-level element into the specifications. The text in the trial exam was carefully selected, and the results showed a reasonable discrimination power, despite the fact that the tasks were quite easy. The text in the real exam was much too easy, to the extent that we gained little worthwhile information regarding students’ reading ability.
Instead of just referring to subskills, I think it is necessary to refer also to the type of text to be read, and the type of situation the information is to be applied to. In other words, the range of text types and levels of difficulty together with a range of ‘real world’ task types needs to be specified.
In curriculum terms, we also need to think about the place of more complex reading skills, particularly inferential skills, amongst our aims.
Writing
We are very happy with our writing test. The analyses showed that both the trial and the real exams had an almost equal level of difficulty (which, in the context of our marking scheme, is more indicative of consistency in evaluation), and the discrimination power of this part of the test was very high.
We used the following band and scale descriptors, and standardisation procedures to increase the reliability of what is inevitably a subjective process; scores rely on marker judgements rather than objectively measurable features of the text. Standardisation procedures involved all markers marking a representative sample of the answers according to their interpretation of the descriptors, and then discussing and agreeing on variations in marking for each one. In this way, a common interpretation was strengthened.
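One mechanical piece of such a standardisation procedure can be sketched in code: scripts where markers' band scores diverge beyond a tolerance are flagged, and it is exactly those scripts that are worth discussing until a common interpretation is agreed. This is a hypothetical illustration only; the band scale, threshold and function name are my assumptions, not the scheme described here.

```python
# Sketch: flag sample scripts where markers' band scores diverge,
# so they can be discussed during standardisation.
# Assumed: integer band scores, tolerance of 1 band.

def scripts_to_discuss(marks, max_spread=1):
    """marks: {script_id: [band score given by each marker]}.
    Returns the ids whose max-min spread exceeds max_spread."""
    return [sid for sid, scores in marks.items()
            if max(scores) - min(scores) > max_spread]

sample = {
    "A": [4, 4, 5],   # close agreement, within tolerance
    "B": [2, 4, 3],   # spread of 2 bands -> needs discussion
    "C": [5, 5, 5],   # perfect agreement
}
print(scripts_to_discuss(sample))  # ['B']
```

After discussion, the flagged scripts would be re-marked; repeating the cycle narrows the spread and strengthens the common interpretation of the descriptors.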
Our curriculum will have to specify a wider range of text types. There is more to write in this world than just letters!
Listening
As with the reading test, the real exam was much too easy. The analyses showed a reasonable item facility and acceptable discrimination on the trial exam, but a greatly increased facility and no discrimination on the real exam. This is almost certainly due to hasty selection of material; there is no doubt that we were more careful in the selection and discussion of the listening material in the trial exam. In the trial exam, material was chosen after lengthy discussion in the preparation commission. With pressures of other work, and of time, this intense discussion was absent when it came to the real exam.
It also suggests that the specifications for listening material above are not sufficient to give a clear picture of the kind of task required. Exam writers were not entirely clear as to how these specifications were supposed to translate into reality. Again, as with reading, I feel that we need to mention situation type and text level, as well. There was a certain amount of ‘Oh, I think they will be able to do this’ in the selection of the task.
Speaking
Our speaking tasks and methods of evaluation are heavily based on the Cambridge PET speaking exams. The exam analyses showed that item facility and discrimination power were consistent over the two exams. The discrimination power, indeed, was very high. All in all, we feel that our oral tests gave an accurate and trustworthy result.
One reason for this is, I feel, the care that was taken in setting up the aims of the test and in standardising evaluation. The Lise administration had to be convinced that our testing was going to work, as the approach we took was different from the traditional one-by-one interview. We therefore had to specify very clearly the categories of evaluation:
These categories were expanded into precise speaking descriptors, which as you will see are firmly based on the PET criteria, an examination the students are familiar with.
Part of the consistency also arises from the careful preparation for evaluation. As with the writing section of the exam, markers went through a standardisation process, which involved watching a video of students performing tasks similar to the exam tasks, and then evaluating them according to the criteria. Standardisation revealed differences of opinion (as it should!) and also allowed these differences to be reconciled.
The objectives were very clear in the markers’ minds, and make an admirable basis for curriculum objectives, too.
Another contributor to this consistency was standardisation of task. Students went into the speaking test in pairs, and discussed with an interlocutor. The first task, after some general introductory “nerve settling” exchanges, was a fairly controlled pair work activity, where students might be asked to discuss which of a limited set of options would be the most suitable for a particular purpose, or to order lists according to importance. The issue here was to keep the task achievable, and not to rely too much on student inspiration and creativity… we are, after all, testing oral ability, not creativity! The second task was to discuss a picture with the interlocutor on a one-to-one basis.
The interlocutor plays no part in the evaluation. The two judges for each pair sit behind the students taking the test, but are able to make eye contact with the interlocutor when necessary (as when, for example, they feel they need more from one of the students in order to make a judgement).
Before discussing the relationship with the curriculum in more depth, two immediate points spring to mind. First, the only area of the exam which actually improved between the trial and the real exam was the grammar section. This section was subject to careful and systematic development as a result of the analysis of the trial exam. This shows me how useful a careful process of studying and interpreting the simple statistical data derived from an item analysis can be.
Secondly, testing objectives seem to be most effective when they create a clear image of what is to be achieved in the teacher’s mind. This is, in my opinion, what lies behind the success of the productive skills tests. The testing aim and evaluation methods were very clear in the examiners’ minds, and they were therefore confident in the execution of these tests.
I believe that the relatively disappointing results of the receptive skills tests stem from a lack of co-ordination and mutual agreement on the nature of receptive skills, and the absence of a clear group perception of the aims of their receptive skills work on the part of the English language departments in the school. I do not mean by this that our teachers are ignorant of what receptive skills teaching is all about (they know it very well!), but simply that we do not have a clear, practical and agreed set of aims for the school as a whole.
Currently, at METU College, we are in the middle of an extremely large curriculum renewal and development programme. This programme covers every level from grade 6 and every department, not just English. There is much being discussed and debated, and this is all very much ‘work in progress’. The basic model we are following is hierarchical…
…with the inspiration for each level being the level above. The school mission statement outlines the reason for the school’s existence, what it believes about education and how it proposes educating its students. Similarly, subject philosophies are statements of what the various departments believe, what they propose to do and why. General aims are wide-ranging, general and inspirational, while the specific aims are more worldly and measurable. The ‘Agreed essential learning activities’ are those activities which are considered essential to the achievement of the curriculum. And unit plans are, of course, descriptions of what is to be done in each unit of work.
The higher levels of this hierarchy are inspirational rather than concrete and measurable. They are a statement on our part of what we want our students to be like, to know and to be able to do by the end of their education with us. They guide us in setting up and describing our more concrete, mundane and measurable objectives. Our testing objectives are firmly related to these aims and objectives at all levels: specified in unit plans; present in the specifications of benchmark examinations such as the Lise prep exemption exam; and embedded in procedures established to monitor and evaluate the curriculum as a whole.
During this process, many questions have been raised as to why we took this approach. In particular, the following have been asked:
Here are some answers:
A curriculum is primarily a statement of purpose: a statement of aims and how you propose achieving them. Everybody, at every level in any activity, needs their own aims. This is why programmes such as that produced by the Ministry of Education are not enough, and why our teachers need to be deeply involved in the process of development. A teacher implementing a programme which is not HER programme is not a teacher. She is a robot. A successful programme requires all involved to make the aims their own by contributing to their formulation. Only in this way will all teachers involved in a programme fully understand it, fully agree with it and be fully able to implement it. This is something the Ministry accepts, by the way. In conversations with members of the Board of Education and other senior members of the Ministry, I have realised that the Ministry is now looking on its programmes as guiding frameworks rather than prescriptive toolkits.
The most obvious example of a ‘ready made’ curriculum is a coursebook. And interestingly, the difference between a successful and an unsuccessful coursebook in a school depends not so much on the quality of the book as on the degree to which the teachers and the students like the book. In other words, the degree to which they feel it is helping them achieve their aims. Coursebooks which are really quite dreadful (in my opinion, of course!) such as the old ‘Access to English’ series, or coursebooks which are methodologically suspect, such as the original ‘Streamline’ series, have all been used with success simply because they were liked. Where coursebooks fail, it is normally because they don’t fit in with the teacher’s own aims. And, in the same way that coursebooks cannot provide a teacher’s own aims, neither can any other imported ready made curriculum.
In a similar way, I feel it is a mistake to look for aims within ELT or applied linguistics. Aims, goals and objectives are nothing more or less than a characterisation and specification of what people involved in an activity want to achieve. So, people and schools can have aims. School subjects do not have aims. ‘ELT’, as a subject, does not have aims. It just has content, categories and parameters. In a very real sense, to say ‘The aim of ELT is x, y and z’ is a meaningless statement. Better to say ‘We are teaching English to our students in order to help them ……”.
Applied linguistics is a field which applies discoveries and theories from linguistics to a number of practical fields, including language teaching. It has much to say and should be taken very seriously, because we have learned an immense amount about what we know and don’t know about language teaching over the past century. But Applied Linguistics can only give us useful advice about the journey. It cannot be a source of aims and objectives, though it can help us characterise and specify them. Applied Linguistics can inform the process of curriculum development, but it cannot guide it. It can NOT tell us where to go and why we are doing it. We, as teachers, have to make that decision for ourselves.
The testing objectives we drew up during the process of developing the exemption examination have done a great deal to help us understand where we are going with our teaching. They have helped us to realise that good teaching objectives, like good testing objectives, have to be concrete and real enough to produce a rich, vivid picture in our minds of what we are trying to achieve. Not too abstract. We are currently producing a profile of the ‘ideal’ 8th grade student, from which we will establish some of our curriculum objectives. We feel that such a profile will help us establish curriculum and testing objectives that are real, applicable, laden with meaning and practical.