SAT 9 doesn’t reflect PSS performance
SAT 9 and other tests like it cannot tell us about the performance of PSS or whether there has been an improvement in learning. They are not designed for that purpose. They can tell us how we compare to others who took the test, and little else. But let us remember: just because you are better than me doesn’t mean you are good enough.
Too often lately there have been reports about test scores and various incorrect interpretations of what they mean. The people who discuss test results are well-intentioned but misinformed. As someone who worked for PSS and had access to test scores for over 15 years, I would like to explain some facts about these tests and some of the problems with them.
Here is an example: I get on a scale to weigh myself. Then you get on the same scale to weigh yourself. I weigh more than you do; you weigh less than I do. Can we figure out who weighs the right amount? No, we cannot. We may both be thin. We may both be fat. It is a simple scale that only gives a number, and that number is meaningless without other information about us both.
SAT 9 (and the CAT before it) are tests given to students to try to weigh them academically. These tests are mostly multiple-choice, which means that the answer is selected from several possible choices that are given.
It is very important to understand how these tests are constructed and how the results should be used. They belong to a type called NORM-REFERENCED tests (NRTs for short) and were first used by the army as intelligence tests to measure the IQ of soldiers.
Those tests are based on the bell curve concept and designed so that half of the students taking the test score above a certain level and the other half score below it. Not unlike a baseball game, there are always winners and losers. And just like a baseball game, it doesn’t mean the winning team is the best team or even a good team. It only means they were better than the other team on the day they played that game.
Yes, the PSS scores are low, just like those of Guam, Hawaii, Puerto Rico and many other districts and states. And yes, there are districts and states with very high scores. It has to be that way with a test like the SAT 9. When reports mention that the US average is at the 50th percentile, that doesn’t mean that everybody in the US scores at this level, only that half do. When PSS scores at the 20th percentile in math, it means that the 495 10th graders who took the test scored higher than 20% of all the students who took the test. That means that many mainland students scored lower than ours did!
In fact, from California alone, a total of 29 schools had the same or lower SAT 9 scores than the CNMI PSS 10th graders (L.A. Times, July 23, 1998, pg. 86). The California statewide average for 10th grade math was the 36th percentile, and 16 districts had average percentile scores of 25 and below for the same grade level and subject area.
Another example from the newspaper: California had 227 elementary schools with average reading scores for 4th graders below the 25th percentile. The CNMI did not test 4th graders, but its 3rd graders averaged at a similarly low level.
Furthermore, most of these schools exclude students who speak English as a second language from testing, whereas the PSS tests all students.
It is untrue that scores are getting lower, as has been reported. In fact, the scores have generally not changed since PSS began using these kinds of tests. However, the number (percentage) of students with high scores has increased somewhat.
True, the scores are low, but is this proof that PSS is not performing well? Actually, we cannot answer this question with test scores from the SAT 9. The answer could be yes, no, or both. We don’t know! Most educators today regard reading comprehension, writing and thinking skills as priorities. The PSS curriculum, performance standards, and accreditation plans all value these skills highly, yet they are not well measured by the SAT 9.
For many years, experts have agreed that these tests are biased in terms of language, experience and culture. This means that if you are a white mainland student from a middle-income family, you have a better chance of getting a higher score. To use the same analogy, your scale comes pre-set to the “correct” weight.
The ignorance about these tests and how to use them will no doubt continue for many years to come. But when good teachers are blamed for poor scores, and students are constantly reminded how poorly they do on these tests, it lowers their self-esteem and does nothing to improve their learning. In fact, many suggest it only exacerbates the situation.
Perhaps even worse is the suggestion that we teach to the test. In fact, some states do just that. We could try it for one year: give each teacher a copy of the test, have teachers ignore writing skills, complex thinking skills and the application of learning, and instead teach only what will be tested. I can guarantee that the scores would not improve very much, if at all. For PSS to reach an average score at the 50th percentile, another group of students would have to lower their scores! In fact, one school in Hawaii tried something similar a few years ago and its scores did not improve.
And then we would be ignoring CNMI history, complex problem solving, science experiments, extended writing and other important skills that are Board of Education approved.
So why do people use these tests at all? People (mostly politicians) want quick ways to weigh students, teachers and school systems.
The tests take three days to complete and then are sent to a scoring machine. Many people also don’t realize there are alternatives.
When you get a driver’s license, you actually have to drive a car. To find out whether someone can grow vegetables, mow the lawn, catch a fish, write a good story, or memorize a prayer or a poem, they actually have to PERFORM the task. There are tests that do this. They are called Criterion-Referenced Tests (CRTs), and the results tell us exactly how much people weigh and what they can do, rather than comparing them to others. They take into account that learning is a complex process and that evaluating learning is not simple and quick.
PSS has used the Degrees of Reading Power test (a CRT) since 1992, and the scores show that most students can read various content-area materials fairly well and continue to improve until they graduate.
PSS has other tests, called Performance-Based Tests, for math, science and reading/writing, but they have never been used. Even though these tests take the same amount of time to administer to students, they can’t be sent away to be scored. And since the budget has been reduced, there are no qualified personnel to administer and score these tests, the tests that would tell us whether our students are learning what we are teaching them and how we can improve learning.
If PSS did begin to use Performance-Based tests, we might be as surprised at the results as some high-scoring districts in California were when they began using the same kind of tests.
The scores were much lower than on the traditional tests they had used in the past. The conclusion was, “we have been fooling ourselves for a long time.” Students were able to score well on bubble tests, but when they were asked to provide answers to questions and explain their answers in writing, they performed poorly.
The reality is that tests like the SAT 9 provide very little, if any, information for teachers, parents or educators, especially considering their high price. And the more we use them, the less we will really know about what our students know and can do.
Until PSS has the confidence and funding it needs to begin system-wide Performance-Based testing that will tell us exactly how much our students weigh (as many states and other countries do), I suggest that we all stop talking about the SAT 9. Instead, why not focus our money, energy and time on improving learning?
Dominique Buckley