Kinds of Test
The teacher is in the best position to know which tests are appropriate for his class. Teachers may give classroom tests with the intention of motivating students’ efforts to learn or of assessing the outcomes of those efforts, and the appropriateness of a test is largely determined by its purpose.[1] Tests can be categorized according to the types of information they provide. This categorization will prove useful both in deciding whether an existing test is suitable for a particular purpose and in writing appropriate new tests where these are necessary.[2]
a. Based on Its Purpose
1) Aptitude test
The aptitude test is conceived as a prognostic measure that indicates whether a student is likely to learn a second language readily. It is generally given before the student begins language study, and may be used to select students for a language course or to place students in sections appropriate to their ability.[3]
Aptitude tests are designed to predict future performance in some activity. They can provide information that is useful in determining learning readiness, individualizing instruction, organizing classroom groups, identifying underachievers, diagnosing learning problems, and helping students with their educational and vocational plans. In these respects they make a special contribution.[4]
Aptitude tests are designed to predict potential. They attempt to indicate what a person could learn if opportunity and motivation are present. They try to assess what the student “could do” more than what the student “will do”.[5]
Aptitude tests do not measure a fixed capacity. Rather, they provide an indication of present level of learned abilities and can be useful in predicting future performance. Performance on aptitude tests is influenced by previous learning experiences, but it is less directly dependent on specific courses of instruction than is performance on achievement tests.[6]
Aptitude tests serve several useful functions.[7] First, from the results of an aptitude test, a tester can determine the test taker’s readiness for an instructional program. Second, the tester can classify individuals or place them in an appropriate class. Third, the tester can diagnose an individual’s specific strengths and weaknesses. Finally, the tester can measure aptitude for learning.
2) Achievement test
An achievement test (also called an attainment or summative test) looks back over a longer period of learning than the diagnostic test, for example a year’s work, or a whole course, or even a variety of different courses. It is intended to show the standard which the students have now reached in relation to other students at the same stage.[8]
Achievement tests are directly related to language courses, their purpose being to establish how successful individual students, groups of students, or the courses themselves have been in achieving objectives.[9] The achievement test is similar to the progress test in that it measures how much the student has learned in the course of second-language instruction.[10]
An achievement test is designed to indicate the degree of success in some past learning activity. This purpose clearly differs from that of an aptitude test, which is designed to predict success in some future learning activity.[11]
There are two kinds of achievement tests: final achievement tests and progress achievement tests. Final achievement tests are those administered at the end of a course of study. They may be written and administered by ministries of education, official examining boards, or by members of teaching institutions. Clearly the content of these tests must be related to the courses with which they are concerned, but the nature of this relationship is a matter of disagreement amongst language testers. In the view of some testers, the content of a final achievement test should be based directly on a detailed course syllabus or on the books and other materials used. Progress achievement tests, as their name suggests, are intended to measure the progress that students are making. They contribute to formative assessment.[12]
An achievement test is conducted at the end of an instructional segment to determine whether learning is sufficiently complete to warrant moving the learner to the next segment of instruction. Its aim is to determine the status of achievement at the end of an instructional segment, that is, how well things went.[13]
John E. Horrocks listed a number of functions of achievement tests. In his book he stated that achievement tests may be used as a means:
a) To gain a picture of the range and nature of individual differences in a group where some specified aspect of achievement is concerned.
b) To equate groups for research and sectioning purposes.
c) To determine an examinee’s level of achievement in relation to his age and ability.
d) To provide a basis for selection, promotion, and termination.
e) To group students into relatively homogeneous groups for instructional purposes.
f) To determine rate of progress by comparing present and past achievement.
g) To diagnose learning difficulties.
h) To evaluate the results of a method of instruction.
i) To evaluate teachers’ success in teaching students.
j) To provide a basis for counseling with parents as well as with students.
k) To provide a basis for grading.
l) To compare the status of instructional units (schools, classrooms, cities, countries, states, etc.).
m) To diagnose a given school’s strengths and weaknesses.
n) To diagnose a given school’s entrants.
o) To determine, in part, the efficiency of certain administrative policies.
p) To predict future success as well as present readiness.
q) To act as an adjunct to instruction and as a teaching tool.
r) To act as a motivating device.[14]
Fundamentally, achievement tests differ from aptitude tests in purpose. An aptitude test is primarily designed to predict success in some future learning activity, whereas an achievement test is designed to indicate the degree of success in some past learning activity.[15] From this comparison, it can be seen that the distinction between the two kinds of test rests on the use made of the results rather than on the qualities of the tests themselves.
A good achievement test must be constructed carefully. The following basic principles provide a firm base for constructing and using classroom tests as a positive force in the teaching-learning process:
§ Achievement tests should measure clearly defined learning outcomes that are in harmony with the instructional objectives.
§ Achievement tests should measure a representative sample of the learning tasks included in the instruction.
§ Achievement tests should include the types of test items that are most appropriate for measuring the desired learning outcomes.
§ Achievement tests should fit the particular uses that will be made of the results.
§ Achievement tests should be made as reliable as possible and should then be interpreted with caution.
§ Achievement tests should improve student learning. [16]
3) General Proficiency Test
Language proficiency tests are designed to measure control of language or cultural items and communication skills already present at the time of testing, irrespective of formal training.[17]
The proficiency test also measures what students have learned, but the aim of the proficiency test is to determine whether this language ability corresponds to specific language requirements. For example, is the student able to read professional literature in another language with a specific level (such as 90 percent) of accuracy?[18]
Proficiency tests are designed to measure people’s ability in a language, regardless of any training they may have had in that language. The content of a proficiency test, therefore, is not based on the content or objectives of language courses that people taking the test may have followed. Rather, it is based on a specification of what candidates have to be able to do in the language in order to be considered proficient. In the case of some proficiency tests, ‘proficient’ means having sufficient command of the language for a particular purpose.[19]
The aim of a proficiency test is to assess the student’s ability to apply in actual situations what he has learnt. This type of test is not usually related to any particular course because it is concerned with the student’s current standing in relation to his future needs. A proficiency test is the most suitable vehicle for assessing English for specific purposes (ESP).[20]
Proficiency tests normally measure a broad range of language skills and competence, including structure, phonology, vocabulary, integrated communication skills, and cultural insight. Some proficiency tests also cover the appropriateness of language usage in a specified social context, in other words, communicative competence.[21]
b. Based on Test Maker
1) Standardized Test
A standardized test is one that presupposes certain standard objectives, or criteria, that are held constant from one form of the test to another. The criteria in large-scale standardized tests are designed to apply to a broad band of competencies that are usually not exclusive to one particular curriculum.[22] A good standardized test should be the product of a thorough process of empirical research and development.
Standardized tests focus on general skills and content that are included among the educational objectives of virtually all school districts. A standardized test must span a much wider range of content than most teacher-constructed tests.[23]
A standardized test has certain distinctive features. These include a fixed set of test items designed to measure a clearly defined sample of behavior, specific directions for administering and scoring the test, and norms based on representative groups of individuals like those for whom the test was designed.[24]
Standardized tests usually consist of a set of materials including (1) a test booklet with the test items and instructions to the test taker, (2) an answer sheet, (3) an administration manual containing instructions on how to administer the test, and (4) a technical manual with information on the uses of the test, how it was developed, how it is to be scored, and how the scores can be interpreted.[25]
The best standardized tests are carefully developed and refined by means of editorial writing and item analysis from field testing so that every item functions well. Intrinsic ambiguity is removed, and implausible distracters are modified or replaced.[26]
A standardized test should be the product of a carefully conducted program of research and development. Such a program involves the work of many persons and includes the following steps.
a) Considering preliminary planning and marketing.
b) Developing test blueprint and item drafts.
c) Designing and professionally producing test items, materials, answer documents, and directions.
d) Pretesting items; collecting and analyzing data on them.
e) Selecting items for the final forms and professionally producing the standardization edition.
f) Locating schools willing to participate in standardization and conducting standardization testing.
g) Collecting and analyzing standardization data and preparing norms tables, collecting and analyzing data for reliability and validity studies.
h) Professionally producing the final forms of the test and writing test manuals.
i) Marketing and selling the final edition.
j) Conducting post-publication special studies and developing special technical publications.[27]
2) Teacher Made Test
Teacher made tests focus on a much more restricted range of content than standardized tests; they usually reflect a particular unit of study or a semester of study.[28] Teachers often feel that standardized tests do not adequately measure their own or the local objectives of instruction.[29] Consequently, teachers must construct their own tests based on the instructional objectives or the subject matter that students have learnt.
Teacher made tests can be constructed to measure how well a specific set of objectives has been met, something that standardized tests are not expected to do.[30]
Teacher-made tests give teachers several kinds of useful information. From them, teachers can see how well students have mastered a limited unit of instruction and determine the extent to which distinctive local objectives have been achieved; the tests also provide a basis for assigning course marks.
It should be clear that teacher-made and standardized tests complement each other. They serve related but somewhat different purposes. Both kinds of test are needed for an adequate evaluation of educational achievement by individual students, schools, and school districts.[31]
c. Based on the Way of Scoring
1) Objective Test
The objective test is a highly structured test that requires pupils to supply a word or two, or to select the correct answer from among a limited number of alternatives.[32]
An objective item is one for which there is a specific correct response; therefore, whether the item is scored by one teacher or another, whether it is scored today or next week, it is always scored the same way.[33]
Because objective items usually have only one correct answer, they can be scored mechanically. However, objective tests require far more careful preparation than subjective tests. Objective tests are frequently criticized on the grounds that they are simpler to answer than subjective tests. Items in an objective test, however, can be made just as easy or as difficult as the test constructor wishes.
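The idea that an objective item is "always scored the same way" can be illustrated with a minimal Python sketch. The function name, the item numbers, and the answer key below are hypothetical and exist only for illustration; the point is simply that scoring reduces to comparing each response with a fixed key, so no scorer judgment is involved.

```python
def score_objective_test(answer_key, responses):
    """Return the number of responses that exactly match the answer key.

    answer_key and responses map item numbers to answers.
    An item the student left unanswered simply earns no point.
    """
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item) == correct
    )


# Hypothetical key: two true false items and two multiple choice items.
answer_key = {1: "T", 2: "F", 3: "b", 4: "d"}
# Hypothetical student sheet: item 2 is wrong, item 4 is left blank.
student = {1: "T", 2: "T", 3: "b"}

print(score_objective_test(answer_key, student))  # prints 2
```

Whoever runs the function, and whenever it is run, the same answer sheet yields the same score, which is the sense in which the test is "objective".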
The objective test includes a variety of item types. Objective test items can be classified into supply type and selection type.
a) Supply type.
The supply type test requires the pupil to supply the answer. This type is also known as ‘short answer’ or ‘completion’. The short answer and completion items are essentially the same. They differ only in the method of presenting the problem: in the short answer item the problem is posed as a direct question, while in the completion item it is an incomplete statement.[34]
The short answer item and the completion item both are supply-type test items that can be answered by a word, phrase, number, or symbol. They are essentially the same, differing only in the method of presenting the problem. The short answer item uses a direct question, whereas the completion item consists of an incomplete statement.[35]
Example of the short answer item:[36]
§ What is the name of the first President of the Republic of Indonesia?
(Ir. Soekarno).
Example of the completion item:
§ The name of the first President of the Republic of Indonesia is ………
(Ir. Soekarno).
The short answer item is subject to a variety of defects, even though it is considered one of the easiest to construct. The following suggestions will help to avoid possible pitfalls and provide greater assurance that the items will function as intended:
§ Word the item so that the required answer is both brief and specific.
§ Do not take statements directly from textbooks to use as a basis for short answer items.
§ A direct question is generally more desirable than an incomplete statement.
§ If the answer is to be expressed in numerical units, indicate the type of answer wanted.
§ Blanks for answers should be equal in length and in a column to the right of the question.
§ When completion items are used, do not include too many blanks.[37]
b) Selection type.
The selection type test requires the pupil to select the answer from a given number of alternatives. This type can be further subdivided into true false, matching, and multiple choice.[38]
§ True false
A true false item consists of a declarative statement to which the student responds ‘true’ if it conforms to accepted truth, or ‘false’ if it is essentially incorrect. True false items are also referred to as alternative-response items.[39]
The alternative-response test item consists of a declarative statement that the pupil is asked to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, agree or disagree and the like. In each case there are only two possible answers.[40]
The true false item does not directly test writing or speaking abilities, only listening or reading. It may be used to test aspects of language such as vocabulary, grammar, or the content of a reading or listening passage. It is fairly easy to design; it is also easy to administer, whether orally or in writing, and to mark.[41]
The most common uses of the true false item are:
§ To measure the ability to identify the correctness of statements of fact, definitions of terms, statements of principles, and the like.[42]
Example:
Directions:
Read the following statement. If the statement is true, circle the T; if the statement is false, circle the F.
(T) (F) 1. The green coloring material in a plant leaf is called chlorophyll.
§ To measure the pupil’s ability to distinguish fact from opinion.[43]
Example:
Directions:
Read the following statement. If the statement is a fact, circle the F. If the statement is an opinion, circle the O.
(F) (O) 1. Other countries should adopt a constitution like that of the United States.
§ To measure aspects of understanding, that is, the ability to recognize cause-and-effect relationships. This type of item usually contains two true propositions in one statement, and the pupil is to judge whether the relationship between them is true or false.[44]
Example:
Directions:
In the following statement, both parts of the statement are true. You are to decide whether the second part explains why the first part is true. If it does, circle Y; if it does not, circle N.
(Y) (N) 1. Some plants do not need sunlight because they get their food from other plants.
§ To measure simple aspects of logic.[45]
Example:
Directions:
Read the following statement. If the statement is true, circle the T; if it is false, circle the F. Also, if the converse of the statement is true, circle the CT; if the converse is false, circle the CF. Be sure to give two answers for each statement.
(T) (F) (CT) (CF) 1. All trees are plants.
§ Matching
The matching exercise consists of two parallel columns with each word, number, or symbol in one column being matched to a word, sentence, or phrase in the other column. The items in the column for which a match is sought are called premises and the items in the column from which the selection is made are called responses.[46] Matching items are useful in measuring students’ ability to make associations, discern relationships, and make interpretations, or in measuring knowledge of a series of facts.
For example:
Directions:
On the line to the left of each province listed in column I, write the letter of its capital city from column II. Each capital city may be used once or not at all.
Column I: Provinces | Column II: Capital cities
( ) 1. Central Java | A. Bandung
( ) 2. Central Kalimantan | B. Banjarmasin
( ) 3. East Java | C. Jayapura
( ) 4. Irian Jaya | D. Medan
( ) 5. North Sumatra | E. Palangkaraya
( ) 6. South Kalimantan | F. Samarinda
( ) 7. South Sulawesi | G. Semarang
 | H. Surabaya
 | I. Ujung Pandang
The advantage of matching items is that a large quantity of associated factual material can be measured in a small amount of space, while the time students need to respond is relatively short.[47]
The following suggestions are intended to help in constructing matching exercises:
§ Use only homogeneous material in a single matching exercise.
§ Include an unequal number of responses and premises, and instruct the student that responses may be used once, more than once, or not at all.
§ Keep the list of items to be matched brief, and place the shorter responses on the right.
§ Arrange the list of responses in logical order: place words in alphabetical order and numbers in sequence.
§ Indicate in the directions the basis for matching the responses and premises.
§ Place all of the items for one matching exercise on the same page.[48]
§ Multiple choice
The multiple choice item consists of a premise and a set of alternatives. The premise, known as the “stem”, is presented as a question or incomplete statement which the student answers or completes by selecting one of several alternatives. Usually either four or five alternatives (also called options or choices) are available.[49]
The pupil is typically requested to read the stem and the list of alternatives and to select the one correct, or best, alternative. The correct alternative in each item is called merely the answer, while the remaining alternatives are called distracters.[50] These incorrect alternatives receive their name from their intended function: to distract those students who are in doubt about the correct answer.[51]
Some of the more typical uses of the multiple choice form in measuring knowledge outcomes common to most school subjects are as follows:
§ Knowledge of terminology
For this purpose, the pupil can be requested to show his knowledge of a particular term by selecting a word which has the same meaning as the given term or by selecting a definition of the term.[52]
For example:
1. Which one of the following words has the same meaning as the word “plush”?
a. Smart and confident.
b. Wet and dirty.
c. Expensive and comfortable.
d. Poor and sad.
§ Knowledge of specific facts
Knowledge of specific facts provides a necessary basis for developing understanding, thinking skills, and other complex learning outcomes. Multiple choice items designed to measure specific facts can take many different forms, but questions of the who, what, when, and where variety are most common.[53]
For example:
1. Who is the last prophet of Islam?
a. Isa AS.
b. Muhammad SAW.
c. Yahya AS.
d. Ilyasa AS.
§ Knowledge of principles
Multiple choice items can be constructed to measure knowledge of principles as easily as those designed to measure knowledge of specific facts. The items appear a bit more difficult but this is because principles are more complex than isolated facts.[54]
For example:
1. Which one of the following principles of taxation is characteristic of the federal income tax?
a. The benefits received by an individual should determine the amount of the tax.
b. A tax should be based on an individual’s ability to pay.
c. All citizens should be required to pay the same amount of tax.
d. The amount of tax an individual pays should be determined by the size of the federal budget.
§ Knowledge of methods and procedures
The multiple choice form can also measure knowledge of laboratory procedures; knowledge of methods underlying communication, computational, and performance skills; knowledge of methods used in problem solving; knowledge of governmental procedures; and knowledge of common social practices.[55]
For example:
1. If you were making a scientific study of a problem, your first step should be to
a. Collect information about the problem.
b. Develop hypotheses to be tested.
c. Design the experiment to be conducted.
d. Select scientific equipment.
The multiple choice item is generally recognized as the most widely applicable and useful type of objective test item. It can more effectively measure many of the simple learning outcomes measured by the short answer or completion item, the true false item, and the matching item.[56]
The following list gives some reasons why teachers, schools, and assessment organizations use multiple choice items so often:
§ Multiple choice tests are fast, easy, and economical to score. In fact, they are machine scorable.
§ They can be scored objectively and thus may give the test the appearance of being fairer and/or more reliable than subjectively scored tests.
§ They “look like” tests and may thus seem to be acceptable by convention.
§ They reduce the chances of learners guessing the correct answer, in comparison to true false.[57]
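The last point in the list above can be made concrete with a little arithmetic. The following Python sketch assumes purely blind guessing, with every alternative equally likely to be chosen; the function names and the 20-item test length are hypothetical and used only for illustration.

```python
from fractions import Fraction


def chance_of_guessing(num_options):
    """Probability of hitting the answer by blind guessing on one item."""
    return Fraction(1, num_options)


def expected_guess_score(num_items, num_options):
    """Expected number of items answered correctly by blind guessing alone."""
    return num_items * chance_of_guessing(num_options)


# A true false item has two alternatives; a typical multiple choice item has four.
print(chance_of_guessing(2))        # prints 1/2
print(chance_of_guessing(4))        # prints 1/4

# On a hypothetical 20-item test answered entirely by guessing:
print(expected_guess_score(20, 2))  # prints 10
print(expected_guess_score(20, 4))  # prints 5
```

Under this assumption, a pupil who knows nothing would be expected to get about half of a true false test right, but only about a quarter of a four-option multiple choice test.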
Even though the types of objective test items are various, they have one feature in common which distinguishes them from the essay test. That is, they present the pupil with a highly structured task which limits the type of response the pupil can make. The pupil is not free to redefine the problem or to organize and present the answer in his own words.[58]
2) Subjective Test
A subjective test is one that does not have a single right answer. A short composition or an impromptu interview may be scored in different ways by different teachers, and even by the same teacher scoring the answer twice under different circumstances. These are test questions to which students may give a variety of responses, each somewhat different from the others.[59]
The best-known item type for the subjective test is the essay test. It requires the examinee to read the question, formulate a response, and express that response in his own words.[60] Essay items permit the testing of a student’s ability to organize ideas and thoughts and allow for creative verbal expression.[61]
Typical key words in the questions set in examinations of this kind are ‘discuss’, ‘compare’, ‘contrast’, and ‘describe’; the answers they elicit may range from a single sentence to a dozen or more paragraphs. These answers are commonly called ‘essays’, the questions ‘essay questions’, and the whole examination is of the ‘essay type’.[62]
The essay test, based on the amount of freedom of response, is subdivided into two types:
a) Extended response type.
The extended response type permits the pupil to decide which facts he thinks are most pertinent, to select his own method of organization, and to write as much as he deems necessary to provide a comprehensive answer.[63] Students are given almost complete freedom in making their response.
For example:
1. Compare the strengths and weaknesses of the multiple choice test and the essay question.
b) Restricted response type.
The restricted response question usually limits both the content and the response. The content is usually restricted by the scope of the topic to be discussed. Limitations on the form of response are generally indicated in the question.[64]
In the restricted response type, the pupil is not given complete freedom in making his response.[65]
For example:
1. State three advantages of saving money in the bank.
[2] Arthur Hughes, Testing for Language Teachers, (New York: Cambridge University Press, 2003), p. 11.
[3] Rebecca M. Valette, Modern Language Testing, (New York: Harcourt Brace Jovanovich, Inc, 1977), p. 5.
[4] Robert L. Linn and Norman E. Gronlund, Measurement and Assessment in Teaching, (New Jersey: Prentice Hall, Inc, 1995), p. 391.
[5] Kenneth D. Hopkins, Educational and Psychological Measurement and Evaluation, (Boston: Walsh & Associates, Inc, 1998), p. 369.
[9] Arthur Hughes, Testing for Language Teachers, (New York: Cambridge University Press, 2003), p. 13.
[10] Rebecca M. Valette, Modern Language Testing, (New York: Harcourt Brace Jovanovich, Inc, 1977), p. 5.
[13] Robert L. Ebel and David A. Frisbie, Essentials of Educational Measurement, (New Jersey: Prentice Hall, 1991), p. 24.
[14] John E. Horrocks, Assessment of Behavior, (Ohio: Charles E. Merrill Publishing Company, 1964), pp. 484-485.
[16] Norman E. Gronlund, Constructing Achievement Tests, (New Jersey: Prentice Hall, Inc., 1968), p. 8.
[18] Rebecca M. Valette, Modern Language Testing, (New York: Harcourt Brace Jovanovich, Inc, 1977), p. 6.
[19] Arthur Hughes, Testing for Language Teachers, (New York: Cambridge University Press, 2003), p. 11.
[22] H. Douglas Brown, Language Assessment: Principles and Classroom Practices, (New York: Pearson Education Inc., 2004), p. 67.
[23] Kenneth D. Hopkins, Educational and Psychological Measurement and Evaluation, (Boston: Walsh & Associates, Inc, 1998), p. 368.
[24] Norman E. Gronlund, Measurement and Evaluation in Teaching, (New York: Macmillan Publishing Co., Inc., 1976), p. 287.
[25] Fred Genesee and John A. Upshur, Classroom-Based Evaluation in Second Language Education, (New York: Cambridge University Press, 1996), p. 233.
[26] Kenneth D. Hopkins, Educational and Psychological Measurement and Evaluation, (Boston: Walsh & Associates, Inc, 1998), p. 368.
[27] Anthony J. Nitko, Educational Tests and Measurement: An Introduction, (New York: Harcourt Brace Jovanovich, Inc, 1983), p. 468.
[29] Victor H. Noll, Introduction to Educational Measurement, (Boston: Houghton Mifflin Company, 1965), p. 125.
[30] Charles D. Hopkins and Richard L. Antes, Classroom Testing: Construction, (Illinois: F.E. Peacock Publishers, Inc, 1979), p. 9.
[33] Rebecca M. Valette, Modern Language Testing, (New York: Harcourt Brace Jovanovich, Inc, 1977), p. 10.
[35] Robert L. Linn and Norman E. Gronlund, Measurement and Assessment in Teaching, (New Jersey: Prentice Hall, Inc, 1995), p. 148.
[40] Norman E. Gronlund, Measurement and Evaluation in Teaching, (New York: Macmillan Publishing Co., Inc., 1981), p. 162.
[48] Robert L. Linn and Norman E. Gronlund, Measurement and Assessment in Teaching, (New Jersey: Prentice Hall, Inc, 1995), pp. 168-170.
[56] Wilmar Tinambunan, Evaluation of Student Achievement, (Jakarta: Dept. P&K Dirjen. Pendidikan Tinggi Proyek Pengembangan Lembaga Pendidikan Tenaga Kependidikan, 1998), p. 75.
[57] Kathleen M. Bailey, Learning about Language Assessment: Dilemmas, Decisions, and Directions, (New York: Heinle & Heinle Publishers, 1998), pp. 130-131.
[61] Anthony J. Nitko, Educational Tests and Measurement: An Introduction, (New York: Harcourt Brace Jovanovich, Inc, 1983), p. 21.
[62] A.E.G. Pilliner, Language Testing Symposium, (Great Britain: Headley Brothers Ltd., 1976), p. 19.