This chapter begins by briefly laying down a set of general procedures for test construction. These are then illustrated by two examples: an achievement test and a placement test. Finally there is a short section on validation.
5.1   STATEMENT OF THE PROBLEM
It cannot be said too many times that the essential first step in testing is to make oneself perfectly clear about what it is one wants to know and for what purpose. The following questions have to be answered.
– What kind of test is it to be? Achievement (final or progress), proficiency, diagnostic, or placement?
– What is its precise purpose?
– What abilities are to be tested?
– How detailed must the results be?
– How accurate must the results be?
– How important is backwash?
– What constraints are set by unavailability of expertise, facilities, time (for construction, administration and scoring)?
5.2   WRITING SPECIFICATIONS FOR THE TEST
The first form that the solution takes is a set of specifications for the test. These will include information on: content, format and timing, criterial levels of performance, and scoring procedures.
CONTENT
This refers not to the content of a single, particular version of a test, but to the entire content of any number of versions. Samples of this content will appear in individual versions of the test.
The way in which content is described will vary with its nature. The content of a grammar test, for example, may simply list all the relevant structures. The content of a test of a language skill, on the other hand, may be specified along a number of dimensions. The following provides a possible framework for doing this. It is not meant to be prescriptive; readers may wish to describe test content differently. The important thing is that content should be as fully specified as possible.
Operations (the tasks that candidates have to be able to carry out). For a reading test these might include, for example: scan text to locate specific information; guess meaning of unknown words from context.
Types of text   For a writing test these might include: letters, forms, academic essays up to three pages in length.
Addressees   This refers to the kinds of people that the candidate is expected to be able to write or speak to (for example native speakers of the same age and status); or the people for whom reading and listening materials are primarily intended (for example native-speaker university students).
Topics   Topics are selected according to suitability for the candidate and the type of test.
FORMAT AND TIMING
This should specify test structure (including time allocated to components) and item types/elicitation procedures, with examples. It should state what weighting is to be assigned to each component. It should also say how many passages will normally be presented (in the case of reading or listening) or required (in the case of writing), and how many items there will be in each component.
5.3   WRITING THE TEST
SAMPLING
It is most unlikely that everything found under the heading of ‘content’ in the specifications can be included in any one version of the test. Choices have to be made. For content validity and for beneficial backwash, the important thing is to choose widely from the whole area of content. One should not concentrate on those elements known to be easy to test. Succeeding versions of the test should also sample widely and unpredictably.
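One way to picture sampling that is both wide and unpredictable is sketched below. It is only an illustration: the content pool, the item labels and the function name sample_version are hypothetical, and the two operations are borrowed from the reading-test examples given under CONTENT. The sketch draws the same number of items at random from every operation, so that no area of content is left out and successive versions need not repeat one another.

```python
import random

# Hypothetical content pool: operation -> candidate items (labels are illustrative only).
CONTENT_POOL = {
    "scan text to locate specific information": ["scan-1", "scan-2", "scan-3", "scan-4"],
    "guess meaning of unknown words from context": ["guess-1", "guess-2", "guess-3"],
}


def sample_version(pool, per_operation, rng):
    """Draw items at random from every operation so that no area of content is skipped."""
    version = {}
    for operation, items in pool.items():
        if per_operation > len(items):
            raise ValueError(f"not enough items for: {operation}")
        # random.Random.sample picks distinct items without replacement.
        version[operation] = rng.sample(items, per_operation)
    return version


rng = random.Random()  # unseeded, so succeeding versions differ unpredictably
version = sample_version(CONTENT_POOL, 2, rng)
```

Drawing from every operation guards against concentrating on what is easy to test; leaving the generator unseeded is what makes each version hard to predict.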
Example: an achievement test
Statement of the problem
There is a need for an achievement test to be administered at the end of a presessional course of training in the reading of academic texts in the social sciences and business studies (the students are graduates who are about to follow postgraduate courses in English-medium universities). The teaching institution concerned (as well as the sponsors of the students) wants to know just what progress is being made during the three-month course. The test must therefore be sufficiently sensitive to measure gain over that relatively short period. While there is no call for diagnostic information on individuals, it would be useful to know, for groups, where the greatest difficulties remain at the end of the course, so that future courses may give more attention to these areas. Backwash is considered important; the test should encourage the practice of the reading skills that students will need in their university studies. This is in fact intended to be only one of a battery of tests, and a maximum of two hours can be allowed for it. It will not be possible at the outset to write separate tests for different subject areas.
Specifications
CONTENT
Types of text   The texts should be academic (taken from textbooks and journal papers).
Addressees   Academics at postgraduate level and beyond.
Topics   The subject areas will have to be as ‘neutral’ as possible, since the students are from a variety of social science and business disciplines (economics, sociology, management, etc.).
Operations   These are based on the stated objectives of the course, and include broad and underlying skills:
Broad skills:
1.  Scan extensive written texts for pieces of information.
2.  Construe the meaning of complex, closely argued passages.
Underlying skills:
(Those which are regarded as of particular importance for the development of the broad skills, and which are given particular attention on the course.)
3.  Guessing the meaning of unfamiliar words from context.
4.  Identifying referents of pronouns etc., often some distance removed in the text.
FORMAT AND TIMING
Scanning     2 passages each c. 3,000 words in length.
15 short-answer items on each, the items in the order in which relevant information appears in the texts. ‘Controlled’ responses where appropriate.
Example:
How does the middleman make a large profit in traditional rural industry?
He …………………………………………………………………………….
and then ……………………………………………………………………..
Time: 1 hour
Detailed reading   2 passages each c. 1,500 words in length.
7 short-answer questions on each, with guidance as to relevant part of text. ‘Controlled’ responses where appropriate.
Example:
Cross-sectional studies have indicated that intelligence declines after the age of thirty. Part of the explanation for this may be that certain intelligence test tasks require ……………, something of which we are less capable as we grow older.
5 meaning-from-context items from detailed reading passages
Example:
For each of the following, find a single word in the text with an equivalent meaning. Note: the words in the text may have an ending such as -ing, -s, etc.
highest point (lines 20 – 35)
5 referent-identification items from one of the detailed reading passages
Example:
What does each of the following refer to in the text? Be very precise.
the former (line 43)
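The size of the test implied by this format can be checked with a short tally. The sketch below is illustrative only: the names COMPONENTS and total_items are hypothetical, and it assumes that the five meaning-from-context items and the five referent-identification items are totals rather than per-passage figures. The specification gives a time only for the scanning component, so the second figure printed is simply what is left of the two-hour maximum, not a stated allocation.

```python
# Components as listed in the specification above: (name, passages, items per passage).
# Components that reuse the detailed reading passages are entered as one "passage"
# purely for counting purposes.
COMPONENTS = [
    ("scanning", 2, 15),
    ("detailed reading", 2, 7),
    ("meaning from context", 1, 5),
    ("referent identification", 1, 5),
]

MAX_MINUTES = 120      # "a maximum of two hours can be allowed"
SCANNING_MINUTES = 60  # "Time: 1 hour" for the scanning component


def total_items(components):
    """Sum the number of items across all components."""
    return sum(passages * per_passage for _, passages, per_passage in components)


items = total_items(COMPONENTS)
remaining_minutes = MAX_MINUTES - SCANNING_MINUTES
print(items, remaining_minutes)  # prints: 54 60
```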
SCORING PROCEDURES
There will be a detailed key, making scoring almost entirely objective. There will nevertheless be independent double scoring. Scorers will be trained to ignore irrelevant (for example grammatical) inaccuracy in responses.
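Independent double scoring is only useful if the two sets of marks are then compared. A minimal sketch of such a comparison follows; the marks, the item labels and the function name disagreements are invented for illustration, and the specification itself does not say how discrepancies are to be resolved.

```python
def disagreements(first, second):
    """Return the items on which two independent scorers gave different marks."""
    if first.keys() != second.keys():
        raise ValueError("both scorers must mark the same items")
    return sorted(item for item in first if first[item] != second[item])


# Hypothetical marks (1 = correct, 0 = incorrect) from two scorers for one script.
scorer_a = {"Q1": 1, "Q2": 0, "Q3": 1, "Q4": 1}
scorer_b = {"Q1": 1, "Q2": 1, "Q3": 1, "Q4": 1}

to_review = disagreements(scorer_a, scorer_b)       # items needing a second look
agreement = 1 - len(to_review) / len(scorer_a)      # proportion of items agreed on
```

With a detailed key the list of items to review should normally be short; a long list would suggest that the key or the scorer training needs attention.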
SAMPLING
Texts will be chosen from as wide a range of topics and types of writing as is compatible with the specifications. Draft items will only be written after the suitability of the texts has been agreed.