Students despised the SAT not just because of the intense anxiety it caused — it was one of the biggest barriers to entry to the colleges they dreamed of attending — but also because they didn’t know what to expect from the exam and felt that it played clever tricks, asking the kinds of questions they rarely encountered in their high-school courses. Students were docked one-quarter point for every multiple-choice question they got wrong, which required a time-consuming risk analysis to determine which questions to answer and which to leave blank. Teachers, too, felt the test wasn’t based on what they were doing in class, and yet the mean SAT scores of many high schools were published by state education departments, which meant that blame for poor performances was often directed at them.
An even more serious charge leveled at the test was that it put students whose families had money at a distinct advantage, because their parents could afford expensive test-prep classes and tutors. Several years ago, an exasperated Mitch Kapor, a founder of Lotus Software, co-wrote an op-ed in The San Francisco Chronicle suggesting colleges should require mandatory disclosure by students and parents of “each and every form of purchased help,” as a way to level the playing field.
When the Scholastic Aptitude Test was created in 1926, it was promoted as a tool to create a classless, Jeffersonian-style meritocracy. The exam, which purported to measure innate intelligence, was originally adapted from the World War I Army I.Q. test and served as a scholarship screening device for about a dozen selective colleges throughout the 1930s. It was assumed that there was no way to effectively prep for a test geared to inborn intelligence, but as early as 1938, Stanley Kaplan began offering classes that promised higher scores. Today the company Kaplan founded and its main competitor, the Princeton Review, are joined by innumerable boutique firms (not to mention high-priced private tutors), all part of a $4.5-billion-a-year industry that caters largely to the worried wealthy in America who feel that the test can be gamed and that their children need to pay to learn the strategies.
Coleman conducted a “listening thing” with his organization’s various frustrated constituencies. For the College Board to be a great institution, he thought at the time, it had to own up to its vulnerabilities. “Unequal test-prep access is a problem,” he said. “It is a problem that it’s opaque to students what’s on the exam. It is a problem that the scoring is too complex. I knew some of the science behind the SAT and actually admired a lot of it. On the other hand, I felt that something really had to happen, because what had grown up around it” — the way in which the test evolved from a vehicle to encourage meritocracy to a reinforcement of privilege in American education — “threatened everything.”
It was clear, Coleman said, that no parents, whatever their socioeconomic status, were satisfied. The achievements of children from affluent families were tainted because they “bought” a score; those in the middle class cried foul because they couldn’t get the “good stuff” or were overextended trying to; and the poor, often minority students, were shut out completely. A paper prepared in 2009 by Derek Briggs, the chairman of the Research and Evaluation Methodology program at the University of Colorado, Boulder, emphasized another cost to test prep beyond the $1,000-plus classes and the personal tutors: He called it an opportunity cost, meaning that time spent in the narrow pursuit of beating the test meant time away from schoolwork and extracurricular activities that are actually designed to prepare students to succeed in college.
In addition to these educational (and moral) quandaries, Coleman had to grapple with what it meant for the College Board as a business to have the credibility of the SAT called into question. A growing number of colleges and universities, frustrated by the minimal change to the SAT when it was revised in 2005 and motivated by a report issued in 2008 by the National Association for College Admission Counseling (Nacac), began to eliminate the SAT and its competitor, the A.C.T., as admission requirements, following the lead of several small, liberal-arts colleges that did so years before. The authors of the Nacac report cited a University of California study, which characterized the SAT as a “relatively poor predictor of student performance” and questioned the tendency of colleges to rely on the SAT as “one of the most important admission tools.” (Many of the schools that dropped test requirements saw spikes in their applications, at least in the first year.)
Around the time the report came out — and following the publication of “The Power of Privilege,” by the Wake Forest University sociology professor Joseph A. Soares, an account of the way standardized tests contributed to discriminatory admissions policies at Yale — Wake Forest became the first Top 30 national university in the U.S. News & World Report college rankings to announce a test-optional admissions policy. Follow-up studies at Wake Forest showed that the average high-school G.P.A. of incoming freshmen increased after the school stopped using standardized-test scores as a factor. Seventy-nine percent of its 2012 incoming class was in the top 10 percent of their high-school classes. Before the school went test-optional, that figure was in the low 60s. In addition, the school became less homogeneous. “The test highly correlates with family income,” says Soares, who also edited a book that, in part, examines the weak predictive validity of the SAT at the University of Georgia, Johns Hopkins University and Wake Forest. “High-school grades do not.” He continued, “We have a lot more social, racial and lifestyle diversity. You see it on campus. Wake Forest was a little too much like a J. Crew catalog before we went test-optional.”
A report released last month by William C. Hiss, a former dean of admissions at Bates College, and Valerie W. Franks, a former Bates assistant dean of admissions, supports Wake Forest’s experience. They reviewed 33 colleges and universities that did not require SAT or A.C.T. scores and found no significant difference in college G.P.A. or graduation rates between those who had submitted tests and those who had not. Specifically, they saw that students with good high-school grades did well in college, even if they had weak SAT scores. But students with weaker high-school grades — even with strong SATs — did less well in college. Those who didn’t submit SATs were more likely to be minority students, women, Pell grant recipients or the first in their families to go to college.
While more colleges are choosing to opt out of standardized testing, an estimated 80 percent of four-year colleges still require either SAT or A.C.T. scores, according to David Hawkins at Nacac, and admissions officers report feeling bound to the tests as a way to filter the overwhelming numbers of applicants. Robert Sternberg, a celebrated author and Cornell professor, told “Frontline” that when he was at Yale and reviewed admissions applications, the scores were hard to ignore. “I know that when I’m reading applications and as the night goes on and I’m reading more and more, it gets more and more tempting to count the SATs,” he said. “It’s easier than reading these long essays and teacher recommendations. It’s human nature.” On top of the pressures to winnow the applicants, the Nacac report cited the problems resulting from the use of SAT and A.C.T. scores by U.S. News & World Report to create its rankings, stating that the scores “were not a valid measure of institutional quality.” In addition, it criticized the use of the SAT and A.C.T. by bond-rating companies to help assess the financial health of a school as creating “undue pressure on admission offices to pursue increasingly high test scores.”
Coleman said that many of the admissions officers he spoke with made it clear that they were uncomfortable being beholden to the test, at least to this test, but there was no consensus about what an exam that was fair and acceptable to all would look like.
Hard questions have always interested Coleman. In 1994, after earning a bachelor’s degree in philosophy at Yale, a bachelor’s in English literature at Oxford (where he was a Rhodes scholar) and a master’s in ancient philosophy at Cambridge (“three degrees that entitled you to zero jobs”), Coleman intended to come home to New York City and work as a public-school teacher. But when he realized he wouldn’t find a job teaching high-school English, he ended up instead as a consultant at McKinsey & Company, where for five years he became increasingly obsessed with evidence-based solutions. During that time, he did pro bono work for school districts trying to improve performance, and in 1999, he left McKinsey and helped create a company called the Grow Network, which focused on assisting students and parents, including non-English-speaking families, in navigating an educational system that was increasingly dictated by standardized tests. His immersion in the world of standardized testing — talking to educators as well as students — convinced him that the standards those tests were supposedly measuring had to change. They were too vast and vague, and they produced textbooks that suffered from the same lack of purpose.
“When you cover too many topics,” Coleman said, “the assessments designed to measure those standards are inevitably superficial.” He pointed to research showing that more students entering college weren’t prepared and were forced into “remediation programs from which they never escape.” In math, for example, if you examined data from top-performing countries, you found an approach that emphasized “far fewer topics, far deeper,” the opposite of the curriculums he found in the United States, which he described as “a mile wide and an inch deep.”
In 2008, Coleman helped start a nonprofit organization called Student Achievement Partners, which was dedicated to “acting on evidence” when making decisions about education policy. While at Partners, Coleman was integral to shaping the Common Core, a set of academic standards that has subsequently been implemented in more than 40 states. While it is not without its critics — many parents and educators believe it deeply roots schools and teachers in a problematic “teach to the test” mind-set — Coleman talks about it not just as a bulwark against the declining standards of American public education but as a rare bipartisan success. At the Strategic Data Project conference last May in Boston, he challenged those in the audience to cite a “significant domestic policy area where Republicans and Democrats have gotten together and gotten something done.” The Common Core, he said, was a galvanizing idea that “swept the country during a period when all ideas seemed to stop.”
When Coleman attended Stuyvesant High in Manhattan, he was a member of the championship debate team, and the urge to overpower with evidence — and his unwillingness to suffer fools — is right there on the surface when you talk with him. (Debate, he said, is one of the few activities in which you can be “needlessly argumentative and it advances you.”) He offended an audience of teachers and administrators while promoting the Common Core at a conference organized by the New York State Education Department in April 2011: Bemoaning the emphasis on personal-narrative writing in high school, he said about the reality of adulthood, “People really don’t give a [expletive] about what you feel or what you think.” After the video of that moment went viral, he apologized and explained that he was trying to advocate on behalf of analytical, evidence-based writing, an indisputably useful skill in college and career. His words, though, cemented his reputation among some as both insensitive and radical, the sort of self-righteous know-it-all who claimed to see something no one else did.
Coleman obliquely referenced the episode — and his habit of candor and colorful language — at the annual meeting of the College Board in October 2012 in Miami, joking that there were people in the crowd from the board who “are terrified.”
The lessons he brought with him from thinking about the Common Core were evident — that American education needed to be more focused and less superficial, and that it should be possible to test the success of the newly defined standards through an exam that reflected the material being taught in the classroom. This was exactly how the College Board’s Advanced Placement program worked (80 percent of teachers surveyed in a study by the Fordham Institute said that the A.P. exam was a good indication of their and their students’ work). It was also one of the main suggestions in Nacac’s 2008 report, that a college admission exam should be redesigned as an achievement-style test — like the A.P. exams — that would send a “message to students that studying their course material in high school, not taking extracurricular test-prep courses that tend to focus on test-taking skills, is the way to do well on admission tests and succeed in a rigorous college curriculum.”
The question for Coleman was how to create an exam that served as an accurate measure of student achievement and college preparedness and that moved in the direction of the meritocratic goals it was originally intended to accomplish, rather than thwarting them.
More than a year ago, Coleman and a team of College Board staff members and consultants began to try to do just that. Cyndie Schmeiser, the board’s chief of assessments, told me that their first order of business was to determine what the test should measure. Starting in late 2012 and continuing through the spring of 2013, she and her team had extensive conversations with students, teachers, parents, counselors, admissions officers and college instructors, asking each group to tell them in detail what they wanted from the test. What they arrived at above all was that a test should reflect the most important skills that were imparted by the best teachers. Schmeiser explained that, for example, a good instructor would teach Martin Luther King Jr.’s “I Have a Dream” speech by encouraging a conversation that involved analyzing the text and identifying the evidence, both factual and rhetorical, that makes it persuasive. “The opposite of what we’d want is a classroom where a teacher might ask only: ‘What was the year the speech was given? Where was it given?’ ”
The team then set about trying to create test questions that lent themselves to this more meaningful engagement. Schmeiser said that in the past, assembling the SAT focused on making sure the questions performed on technical grounds, meaning: Were they appropriately easy or difficult among a wide range of students, and were they free of bias when tested across ethnic, racial and religious subgroups? The goal was “maximizing differentiation” among kids, which meant finding items that were answered correctly by those students who were expected to get them right and incorrectly by the weaker students. A simple way of achieving this, Coleman said, was to test the kind of obscure vocabulary words for which the SAT was famous (or infamous). The answer pattern was statistically strong, he said — a small percentage of the kids knew them, most did not — but it didn’t adequately reflect the educational values Coleman believed in. In redesigning the test, the College Board shifted its emphasis. It prioritized content, measuring each question against a set of specifications that reflect the kind of reading and math that students would encounter in college and their work lives. Schmeiser and others then spent much of early last year watching students as they answered a set of 20 or so problems, discussing the questions with the students afterward. “The predictive validity is going to come out the same,” she said of the redesigned test. “But in the new test, we have much more control over the content and skills that are being measured.”
When I met with Coleman in his office last month to talk about the remaking of the SAT, he periodically leapt from his chair when he became excited about an idea. At one point he jumped up and drew a dividing line down the middle of his whiteboard (he’s a very enthusiastic user of the whiteboard), then scrawled, “Evidence-based reading and writing” on one side and “Math” on the other. He was unveiling, at least in broad strokes, the results of those many months of rethinking and testing.
Starting in spring 2016, students will take a new SAT — a three-hour exam scored on the old 1,600-point system, with an optional essay scored separately. Evidence-based reading and writing, he said, will replace the current sections on reading and writing. It will use as its source materials pieces of writing — from science articles to historical documents to literature excerpts — which research suggests are important for educated Americans to know and understand deeply. “The Declaration of Independence, the Constitution, the Bill of Rights and the Federalist Papers,” Coleman said, “have managed to inspire an enduring great conversation about freedom, justice, human dignity in this country and the world” — therefore every SAT will contain a passage from either a founding document or from a text (like Lincoln’s Gettysburg Address) that is part of the “great global conversation” the founding documents inspired.
Coleman gave me what he said was a simplistic example of the kind of question that might be on this part of the exam. Students would read an excerpt from a 1974 speech by Representative Barbara Jordan of Texas, in which she said the impeachment of Nixon would divide people into two parties. Students would then answer a question like: “What does Jordan mean by the word ‘party’?” and would select from several possible choices. This sort of vocabulary question would replace the more esoteric version on the current SAT. The idea is that the test will emphasize words students should be encountering, like “synthesis,” which can have several meanings depending on their context. Instead of encouraging students to memorize flashcards, the test should promote the idea that they must read widely throughout their high-school years.
The Barbara Jordan vocabulary question would have a follow-up — “How do you know your answer is correct?” — to which students would respond by identifying lines in the passage that supported their answer. (By 2016, there will be a computerized version of the SAT, and students may someday search the text and highlight the lines on the screen.) Students will also be asked to examine both text and data, including identifying and correcting inconsistencies between the two.
“Whenever a question really matters in college or career, it is not enough just to give an answer,” Coleman said. “The crucial next step is to support your answer with evidence,” which allows insight into what the student actually knows. “And this change means a lot for the work students do to prepare for the exam. No longer will it be good enough to focus on tricks and trying to eliminate answer choices. We are not interested in students just picking an answer, but justifying their answers.”
To that end, the question for the essay portion of the test will also be reformulated so that it will always be the same, some version of: “As you read the passage in front of you, consider how the author uses evidence such as facts or examples; reasoning to develop ideas and to connect claims and evidence; and stylistic or persuasive elements to add power to the ideas expressed. Write an essay in which you explain how the author builds an argument to persuade an audience.” The passage will change from test to test, but the analytical and evidentiary skills tested will always be the same. “Students will be asked to do something we do in work and in college every day,” Coleman said, “analyze source materials and understand the claims and supporting evidence.”
The math section, too, will be predicated on research that shows that there are “a few areas of math that are a prerequisite for a wide range of college courses” and careers. Coleman conceded that some might treat the news that they were shifting away from more obscure math problems to these fewer fundamental skills as a dumbing-down of the test, but he was adamant that this was not the case. He explained that there will be three areas of focus: problem solving and data analysis, which will include ratios and percentages and other mathematical reasoning used to solve problems in the real world; the “heart of algebra,” which will test how well students can work with linear equations (“a powerful set of tools that echo throughout many fields of study”); and what will be called the “passport to advanced math,” which will focus on the student’s familiarity with complex equations and their applications in science and social science.
Last June, Coleman spoke at the Harvard Summer Institute’s multiday seminar for college-admissions and counseling professionals. Before the talk, he met with William Fitzsimmons, the longtime dean of admissions and financial aid at Harvard and the primary author of the 2008 Nacac commission report. Coleman brought along an outline of the SAT redesign to get Fitzsimmons’s impressions.
Fitzsimmons told me he was stunned by what he saw, the ways in which the exam read like a direct response to his commission’s most serious recommendations. “Like any other truly significant change, there will be debate,” he added. But then he went on: “Sometimes in the past, there’s been a feeling that tests were measuring some sort of ineffable entity such as intelligence, whatever that might mean. Or ability, whatever that might mean. What this is is a clear message that good hard work is going to pay off and achievement is going to pay off. This is one of the most significant developments that I have seen in the 40-plus years that I’ve been working in admissions in higher education.”
But changing the test didn’t solve all the problems that preoccupied Coleman. He was still troubled by the inequalities in educational opportunity and believed that the College Board should play a role in ameliorating them. For some time, the College Board had been aware of the work of Caroline Hoxby, a professor of economics at Stanford, and Christopher Avery, a professor of public policy and management at Harvard’s John F. Kennedy School of Government, who had been studying what is sometimes called undermatching — the tendency of poor students to pick a school that is closer to home and less rigorous, in spite of evidence that they could succeed elsewhere. Hoxby first became aware of the problem in 2004, when she was on the faculty at Harvard and the university announced to great fanfare that it would recruit top-performing, financially challenged kids by offering free tuition if their parents made less than $40,000. Yet despite the offer, enrollment numbers for those students remained stubbornly low. When Hoxby began to research the issue, she hypothesized that there was a large population of high-achieving, low-income students yet to be identified. She and Avery began working with the College Board and the A.C.T. to develop new techniques to find out how many students were in this low-income, top-performing pool and where they lived. By piecing together data from census reports, I.R.S. income data broken down by ZIP code, real estate valuations and other sources, they pinpointed some 35,000 students whose grades were in the top 10 percent nationally and whose family income was in the bottom quarter of families with a 12th-grader.
When they tracked where those kids applied to school, they found a number that would later shock Coleman. Fifty-six percent didn’t apply to a single selective college or university.
The researchers surmised this was a problem of communication, more than anything else; the information wasn’t reaching the students and families it needed to reach, and in the cases when it did, it wasn’t as clear and useful as it could be. Hoxby and Sarah Turner, an economics professor at the University of Virginia, tested whether they could change enrollment patterns. From 2010 to 2012, Hoxby’s team sent out personalized, detailed packets encouraging the high-achieving, low-income students to apply to several schools and providing application-fee waivers and financial-aid information about scholarships. In many cases these students would be able to see that they could get a better deal financially at more highly selective schools that wanted to attract them. The intervention resulted in those students’ applying to more colleges and closing “part of the college behavior ‘gap’ between low-income and high-income students with the same level of achievement,” Hoxby and Turner wrote.
When Coleman became College Board president, he was briefed on the supporting role the board had played to date. He agreed with those who saw an opportunity and decided, he says, to “take it from a small experiment to implement it nationwide.” He called for additional research to find low-income students whom the board refers to as “college-ready,” meaning they scored 1,550 or above on the SAT (the top 43 percent of U.S. test-takers). Ultimately the board mailed out nearly 100,000 packets to top-performing and college-ready students. The packets included four or more application waivers to allow students to apply immediately to any of the more than 2,000 schools that agreed to participate in the program. One requirement of those colleges, Coleman said, was that they agreed to rely on the financial determination made by the College Board and didn’t make the students requalify for aid or special tuition dispensation. Instead, the waiver was designed to look like a ticket — “Here’s your ticket, go!” Coleman said — simplifying the process and encouraging the students to jump at the opportunity. Research on the initial effects of the program won’t be released until next month, but the speed with which the deployment happened fulfilled Coleman’s promise to accelerate the agenda.
In January 2013, at a College Board-sponsored conference in Florida, Coleman met Daniel Porterfield, the president of Franklin & Marshall College, whose surprisingly effective work in bringing high-achieving, low-income students to his small liberal-arts school in Lancaster, Pa., gained national attention. Porterfield agreed with the conclusion of Hoxby’s team that the problem wasn’t that the students weren’t out there; the problem was that colleges weren’t looking hard enough to find them. A commitment to finding them, he said, was a big part of Franklin & Marshall’s success. Porterfield, who is now a board member of the College Board, told me he saw Coleman as uniquely “using the College Board to serve society as opposed to the College Board serving its own position.” He also said that when the two of them first talked, Coleman promised that the College Board could help F&M find talented, high-striving, high-achieving students and that he has “exactly delivered on that promise.”
What Coleman found exciting about the intervention was its use of the standardized tests as a way to reach students who would otherwise not apply to the kinds of colleges that they might assume were out of reach. It transformed an exam that most thought of as a burden — and many low-income students opted not to take at all — into an opportunity. Coleman explained that the moment when students get their test results is a rare instance in which you have their full attention — that’s the moment you have to seize on, connecting the score they’re holding in their hand to the future that they could possibly attain. “When have you ever gotten anything for taking the SAT?” he said, imagining the reaction of students opening up their test results and holding the application waivers in their hands.
For all the good intentions and all the evidence-based ideas brought to bear by Coleman and his colleagues over the past year and a half, there is still a chasm between the educational experiences of children at good schools in wealthier districts and those in lower-income areas. The fact that you could never fully level the playing field — that good, focused instruction and meaningful preparation would still be unavailable to the students they were most focused on lifting up — nagged at Coleman and his staff as they continued redesigning the test.
They began to consider how they might provide teachers in sixth through 12th grade, particularly in low-performing schools, with broader access to content and resources to help prepare students for the test. Then last July, on a bus to dinner at a staff retreat in upstate New York, two of Coleman’s senior team members, Cyndie Schmeiser and Jeff Olson, threw out an idea: What if the College Board worked with Khan Academy, the free online tutoring service visited by 10 million students each month, to offer SAT prep classes to anyone who wanted them? At Khan Academy, students logged on to the site and then worked over weeks or months at their own pace answering questions in different subject areas and following their progress. If they needed help, they could watch one of the thousands of casual but engaging videos created by the founder, Sal Khan. Khan holds multiple degrees from Harvard and M.I.T. and serves as the site’s ubiquitous guide, his voice explaining how to do various problems while text and numbers appear on a digital chalkboard.
Khan started the site in modest fashion, tutoring his young niece over the Internet. When her relatives and friends wanted to be taught by him, too, he began posting videos to YouTube. As the site grew, he worked through all sorts of problems, including some that he took from past SAT exams, which he says now he probably didn’t have permission to use.
Coleman and his team were aware of what was happening at Khan Academy and were intrigued by the idea of a partnership, but they were also wary. “You kind of say, ‘O.K.,’ ” Coleman said. “ ‘But is it good enough?’ ” The College Board had never entered into this kind of partnership, and he worried about what it might mean for the brand.
Throughout last fall his staff spent many hours on the Khan Academy website. The idea of creating a transparent test and then providing a free website that any student could use — not to learn gimmicks but to get a better grounding and additional practice in the core knowledge that would be tested — was appealing to Coleman.
He thought about athletics as the analogue for what they were trying to do. In sports, you practiced all the time to prepare for games. But while the stakes were high on game days, they didn’t result in the “suffering” and counterproductive anxiety that was a common reaction to the SAT. The difference, he said, was that in sports there was no mystery as to what would be required of you in a game. The rules were clear and didn’t change. On the SAT, the rules were unclear. “Half the anxiety is about what’s going to happen on ‘game day,’ ” Coleman said. “It’s not really fair. High stakes should not be placed on something that didn’t matter before that suddenly matters now. The stakes should emerge because the work is important and your demonstration of that is significant.”
The long, deliberate practice required for that type of performance was consistent with the Khan Academy method. In theory, anyway, the partnership made perfect sense.
In January, Coleman met with Wade Henderson, the president and C.E.O. of the Leadership Conference on Civil and Human Rights, who spoke with him about the ill will that had been built up in the minority community over the SAT, how the test has long been viewed not as a launching pad to something better but as an obstacle to hard-working, conscientious students who couldn’t prepare for it in the way more affluent students could. Coleman acknowledged “the extent to which the exam recapitulates income inequality.” Henderson also expressed concern, Coleman said, that poor SAT scores could block access to jobs. After the hourlong conversation, which Coleman characterized as deeply moving, he decided to add one more element to the redesign. Test information sent to an institution would include a “safe use” warning in red ink: “This data should only be used in combination with other relevant information to make responsible decisions about students.”
A couple of weeks after his talk with Henderson, Coleman flew to Silicon Valley to discuss a partnership with Sal Khan. There was no discussion of financial terms, just an agreement in principle that they would join forces. (The College Board won’t pay Khan Academy.) They talked about a hypothetical test-prep experience in which students would log on to a personal dashboard, indicate that they wanted to prepare for the SAT and then work through a series of preliminary questions to demonstrate their initial skill level and identify the gaps in their knowledge. Khan said he could foresee a way to estimate the amount of time it would take to achieve certain benchmarks. “It might go something like, ‘O.K., we think you’ll be able to get to this level within the next month and this level within the next two months if you put in 30 minutes a day,’ ” he said. And he saw no reason the site couldn’t predict for anyone, anywhere the score he or she might hope to achieve with a commitment to a prescribed amount of work.
Coleman told Khan that the College Board would invest in an outreach campaign through organizations like Boys and Girls Clubs and Big Brothers Big Sisters groups to reach as many students as possible, especially low-income students who aren’t the website’s primary users now. He also gave Khan access to actual test questions, and Khan is in the process of creating material for students who will be taking the old exam. (He says that it will be available early this month.) Coleman told me that his confidence in the partnership crystallized when Khan told him they were constantly revising their material based on what was most effectively helping students on the site. Coleman was particularly inspired by Khan’s belief that it was possible for any student to achieve better skills with the proper instruction. Khan asked if Coleman was aware that five centuries ago, there was an analogous misperception. Coleman explained, “He said, ‘David, do you realize it used to be believed that most human beings couldn’t read?’ ”
At various times in our discussions, Coleman referred to some test-prep providers as predators who prey on the anxieties of parents and children and provide no real educational benefit. (Though there’s a debate about how helpful test prep is, much research shows increases of an average of only 30 points.) “This is a bad day for them,” he said about the new test and his Khan Academy partnership.
Still, Coleman concedes that the redesigned SAT won’t quiet everyone’s complaints, and he doesn’t expect there to be a universal celebration of what they’ve done. You can imagine there will be substantial questions, for instance, about whether any standardized test can be fair across all groups, and whether the College Board is not ultimately creating a new test that somehow, some way, will be gamed as much as the old one.
Coleman’s response to those concerns is to say that the new, more transparent test will be tied to what’s being taught in high school and will be evidence-based. But his previous work on the Common Core has raised some educators’ concerns. “Dave Coleman is not an educator by training,” says Lucy Calkins, the founding director of the Reading and Writing Project at Columbia University’s Teachers College and an author of “Pathways to the Common Core.” Calkins has been a strong defender of the Common Core but thinks Coleman has been too insistent on his own particular method for implementing its standards. She cites a video that Coleman helped create of a “model lesson” for teaching the Gettysburg Address, where he would have students spending several classes “parsing the meaning of each word in each paragraph,” she said. She doesn’t feel there’s evidence that this method works.
With a redesigned SAT, Calkins thinks that too much of the nation’s education curriculum and assessment may rest in one person’s hands. “The issue is: Are we in a place to let Dave Coleman control the entire K-to-12 curriculum?”
William Fitzsimmons, the Nacac chairman and head of admissions at Harvard, for his part, was impressed with the quickness with which Coleman has been able to make these changes. “In the world of education,” Fitzsimmons told me, “this is lightninglike speed.” And Coleman rejects the worries that he might be making changes that are too radical without waiting to see what works. He says that he believes that if you’ve been diligent in gathering the supporting facts, which he has been, then that is your defense against hubris and wrong thinking. In reality, he said, the decisions he has made aren’t all that bold, because they’re all completely supported by research. This is where he and critics like Lucy Calkins disagree, of course, but like any good debater, Coleman seems to know when to marshal hard evidence and when to wrap it in persuasive rhetoric. And what’s at stake, he often makes clear, is not just the fairness and usefulness of an exam but our nation’s ability to deliver opportunity for all, which, really, is about the soul of the country. The rest of us will have to wait for the proof that he has found the answer.
An article on March 9 about changes in the SAT referred incorrectly to two universities’ policies on the SAT. The test is not optional at the University of Georgia or at Johns Hopkins. The same article erroneously attributed a distinction to Wake Forest University. It was the first Top 30 national university in the U.S. News & World Report college rankings to announce a test-optional admissions policy; it was not the first educational institution to do so. (Several institutions adopted a test-optional policy before Wake Forest.)
|Type||Paper-based standardized test|
|Developer / administrator||College Board, Educational Testing Service.|
|Knowledge / skills tested||Writing, critical reading, mathematics.|
|Purpose||Admission to undergraduate programs of universities or colleges.|
|Year started||1926|
|Duration||3 to 4 hours|
|Score / grade range||200–800 (in 10-point increments) on each of two sections (total 400–1600). Essay scored on scale of 2–8, in 1-point increments.|
|Offered||Seven times annually|
|Countries / regions||Worldwide|
|Annual number of test takers||Over 1.71 million high school graduates in the class of 2017|
|Prerequisites / eligibility criteria||No official prerequisite. Intended for high school students. Fluency in English assumed.|
|Fee||US$52.50 to US$101.50, depending on country.|
|Scores / grades used by||Most universities and colleges offering undergraduate programs in the U.S.|
The SAT (es-ay-TEE) is a standardized test widely used for college admissions in the United States. Introduced in 1926, its name and scoring have changed several times; originally called the Scholastic Aptitude Test, it was later called the Scholastic Assessment Test, then the SAT I: Reasoning Test, then the SAT Reasoning Test, and now, simply the SAT.
The SAT is owned, developed, and published by the College Board, a private, non-profit organization in the United States. It is administered on behalf of the College Board by the Educational Testing Service, which until recently developed the SAT as well. The test is intended to assess students' readiness for college. The SAT was originally designed not to be aligned with high school curricula, but several adjustments were made for the version of the SAT introduced in 2016, and College Board president David Coleman has said that he also wanted to make the test reflect more closely what students learned in high school.
On March 5, 2014, the College Board announced that a redesigned version of the SAT would be administered for the first time in 2016. The current SAT, introduced in 2016, takes three hours to finish, plus 50 minutes for the SAT with essay, and as of 2017 costs US$45 (US$57 with the optional essay), excluding late fees, with additional processing fees if the SAT is taken outside the United States. Scores on the SAT range from 400 to 1600, combining test results from two 800-point sections: mathematics, and critical reading and writing. Taking the SAT, or its competitor, the ACT, is required for freshman entry to many, but not all, colleges and universities in the United States. The College Board also announced that, starting with the 2015–16 school year, it would team up with Khan Academy, a free online education site, to provide SAT prep free of charge.
The SAT is typically taken by high school juniors and seniors. The College Board states that the SAT measures literacy, numeracy and writing skills that are needed for academic success in college. They state that the SAT assesses how well the test takers analyze and solve problems, skills they learned in school that they will need in college. However, the test is administered under a tight time limit (speeded) to help produce a range of scores.
The College Board also states that use of the SAT in combination with high school grade point average (GPA) provides a better indicator of success in college than high school grades alone, as measured by college freshman GPA. Various studies conducted over the lifetime of the SAT show a statistically significant increase in correlation of high school grades and college freshman grades when the SAT is factored in. A large independent validity study on the SAT's ability to predict college freshman GPA was performed by the University of California. The results of this study found how well various predictor variables could explain the variance in college freshman GPA. It found that independently high school GPA could explain 15.4% of the variance in college freshman GPA, SAT I (the SAT Math and Verbal sections) could explain 13.3% of the variance in college freshman GPA, and SAT II (also known as the SAT subject tests; in the UC's case specifically Writing, Mathematics IC or IIC, plus a third subject test of the student's choice) could explain 16% of the variance in college freshman GPA. When high school GPA and the SAT I were combined, they explained 20.8% of the variance in college freshman GPA. When high school GPA and the SAT II were combined, they explained 22.2% of the variance in college freshman GPA. When SAT I was added to the combination of high school GPA and SAT II, it added a .1 percentage point increase in explaining the variance in college freshman GPA for a total of 22.3%.
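The incremental contributions in the University of California figures can be checked with a few subtractions. The sketch below is only an illustration: the percentages are the ones quoted above, while the dictionary layout and the helper name `incremental_gain` are hypothetical choices made for this example.

```python
# Percent of variance in college freshman GPA explained by each
# predictor set, as quoted from the University of California study.
explained = {
    ("HSGPA",): 15.4,
    ("SAT I",): 13.3,
    ("SAT II",): 16.0,
    ("HSGPA", "SAT I"): 20.8,
    ("HSGPA", "SAT II"): 22.2,
    ("HSGPA", "SAT II", "SAT I"): 22.3,
}


def incremental_gain(base, added):
    """Percentage points gained by adding one predictor to a base set."""
    combined = base + (added,)
    # Round to one decimal to hide floating-point noise in the subtraction.
    return round(explained[combined] - explained[base], 1)


gain_sat1_over_gpa = incremental_gain(("HSGPA",), "SAT I")
gain_sat2_over_gpa = incremental_gain(("HSGPA",), "SAT II")
gain_sat1_over_gpa_and_sat2 = incremental_gain(("HSGPA", "SAT II"), "SAT I")

print(gain_sat1_over_gpa, gain_sat2_over_gpa, gain_sat1_over_gpa_and_sat2)
```

Read this way, the SAT I adds several percentage points when paired with high school GPA alone, but almost nothing once the SAT II is already in the model.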
There are substantial differences in funding, curricula, grading, and difficulty among U.S. secondary schools due to U.S. federalism, local control, and the prevalence of private, distance, and home schooled students. SAT (and ACT) scores are intended to supplement the secondary school record and help admission officers put local data—such as course work, grades, and class rank—in a national perspective. However, independent research has shown that high school GPA is better than the SAT at predicting college grades regardless of high school type or quality.
Historically, the SAT was more widely used by students living in coastal states and the ACT was more widely used by students in the Midwest and South; in recent years, however, an increasing number of students on the East and West coasts have been taking the ACT. Since 2007, all four-year colleges and universities in the United States that require a test as part of an application for admission will accept either the SAT or ACT, and over 950 four-year colleges and universities do not require any standardized test scores at all for admission.
The SAT has four sections: Reading, Writing and Language, Math (no calculator), and Math (calculator allowed). The test taker may optionally write an essay which, in that case, is the fifth test section. The total time for the scored portion of the SAT is three hours (or three hours and fifty minutes if the optional essay section is taken). Some test takers who are not taking the essay may also have a fifth section which is used, at least in part, for the pretesting of questions that may appear on future administrations of the SAT. (These questions are not included in the computation of the SAT score.) Two section scores result from taking the SAT: Evidence-Based Reading and Writing, and Math. Section scores are reported on a scale of 200 to 800, and each section score is a multiple of ten. A total score for the SAT is calculated by adding the two section scores, resulting in total scores that range from 400 to 1600. There is no penalty for guessing on the SAT: scores are based on the number of questions answered correctly. In addition to the two section scores, three "test" scores on a scale of 10 to 40 are reported, one for each of Reading, Writing and Language, and Math. The essay, if taken, is scored separately from the two section scores.
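As a rough illustration of how the total is assembled from the two section scores, the following sketch adds them after checking the range and ten-point step described above. The function name `total_score` and its error messages are hypothetical and are not taken from any College Board material.

```python
def total_score(reading_writing, math):
    """Add the two section scores after validating each one.

    Each section score must lie between 200 and 800 and be a multiple
    of ten, as described in the text.
    """
    sections = (("Evidence-Based Reading and Writing", reading_writing),
                ("Math", math))
    for name, score in sections:
        if not 200 <= score <= 800:
            raise ValueError(f"{name} score {score} is outside 200-800")
        if score % 10 != 0:
            raise ValueError(f"{name} score {score} is not a multiple of ten")
    return reading_writing + math


print(total_score(650, 700))
```

The lowest and highest possible totals follow directly: 200 + 200 = 400 and 800 + 800 = 1600.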
The Reading Test of the SAT is made up of one section with 52 questions and a time limit of 65 minutes. All questions are multiple-choice and based on reading passages. Tables, graphs, and charts may accompany some passages, but no math is required to correctly answer the corresponding questions. There are five passages (up to two of which may be a pair of smaller passages) on the Reading Test and 10-11 questions per passage or passage pair. SAT Reading passages draw from three main fields: history, social studies, and science. Each SAT Reading Test always includes: one passage from U.S. or world literature; one passage from either a U.S. founding document or a related text; one passage about economics, psychology, sociology, or another social science; and, two science passages. Answers to all of the questions are based only on the content stated in or implied by the passage or passage pair.
Writing and Language Test
The Writing and Language Test of the SAT is made up of one section with 44 multiple-choice questions and a time limit of 35 minutes. As with the Reading Test, all questions are based on reading passages which may be accompanied by tables, graphs, and charts. The test taker will be asked to read the passages, find mistakes or weaknesses in writing, and to provide corrections or improvements. Reading passages on this test range in content from topic arguments to nonfiction narratives in a variety of subjects. The skills being evaluated include: increasing the clarity of argument; improving word choice; improving analysis of topics in social studies and science; changing sentence or word structure to increase organizational quality and impact of writing; and, fixing or improving sentence structure, word usage, and punctuation.
The mathematics portion of the SAT is divided into two sections: Math Test – Calculator and Math Test – No Calculator. In total, the SAT math test is 80 minutes long and includes 58 questions: 45 multiple choice questions and 13 grid-in questions. The multiple choice questions have four possible answers; the grid-in questions are free response and require the test taker to provide an answer.
- The Math Test – No Calculator section has 20 questions (15 multiple choice and 5 grid-in) and lasts 25 minutes.
- The Math Test – Calculator section has 38 questions (30 multiple choice and 8 grid-in) and lasts 55 minutes.
Several scores are provided to the test taker for the math test. A subscore (on a scale of 1 to 15) is reported for each of three categories of math content: "Heart of Algebra" (linear equations, systems of linear equations, and linear functions), "Problem Solving and Data Analysis" (statistics, modeling, and problem-solving skills), and "Passport to Advanced Math" (non-linear expressions, radicals, exponentials and other topics that form the basis of more advanced math). A test score for the math test is reported on a scale of 10 to 40, and a section score (equal to the test score multiplied by 20) is reported on a scale of 200 to 800. 
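The relationship between the math test score and the math section score is a single multiplication, sketched below. The function name `math_section_score` is hypothetical, and the handling of half-point test scores is an assumption inferred from the ten-point steps of the section score rather than something stated in the text.

```python
def math_section_score(test_score):
    """Convert a math test score (10 to 40) to a section score (200 to 800)."""
    if not 10 <= test_score <= 40:
        raise ValueError(f"test score {test_score} is outside 10-40")
    section = test_score * 20
    # Assumption: a test score such as 27.5 is valid because it maps to 550,
    # which is a multiple of ten. Anything else is rejected.
    if section % 10 != 0:
        raise ValueError(f"test score {test_score} gives a non-reportable score")
    return int(section)


print(math_section_score(36))
```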
All scientific and most graphing calculators, including Computer Algebra System (CAS) calculators, are permitted on the SAT Math – Calculator section only. All four-function calculators are allowed as well; however, these devices are not recommended. All mobile phone and smartphone calculators, calculators with typewriter-like (QWERTY) keyboards, laptops and other portable computers, and calculators capable of accessing the Internet are not permitted.
Research was conducted by the College Board to study the effect of calculator use on SAT I: Reasoning Test math scores. The study found that performance on the math section was associated with the extent of calculator use: those using calculators on about one third to one half of the items averaged higher scores than those using calculators more or less frequently. However, the effect was "more likely to have been the result of able students using calculators differently than less able students rather than calculator use per se." There is some evidence that the frequent use of a calculator in school outside of the testing situation has a positive effect on test performance compared to those who do not use calculators in school.
Style of questions
Most of the questions on the SAT, except for the optional essay and the grid-in math responses, are multiple choice; all multiple-choice questions have four answer choices, one of which is correct. Thirteen of the questions on the math portion of the SAT (about 22% of all the math questions) are not multiple choice. They instead require the test taker to bubble in a number in a four-column grid.
All questions on each section of the SAT are weighted equally. For each correct answer, one raw point is added. No points are deducted for incorrect answers. The final score is derived from the raw score; the precise conversion chart varies between test administrations.
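A minimal sketch of the raw-score rule just described: one point per correct answer and nothing subtracted for wrong or blank answers. The function name `raw_score`, the use of `None` for a blank response, and the sample answers are all hypothetical. The conversion from raw score to scaled score is not shown, because the chart varies between administrations.

```python
def raw_score(responses, answer_key):
    """Count correct answers; wrong and blank answers contribute zero."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key differ in length")
    return sum(1 for given, correct in zip(responses, answer_key)
               if given == correct)


# Hypothetical four-question section: one wrong answer and one blank.
key = ["A", "C", "B", "D"]
answers = ["A", "B", None, "D"]
print(raw_score(answers, key))
```

Because a wrong answer and a blank both score zero, guessing can never lower the raw score under this rule.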
The SAT is offered seven times a year in the United States: in August, October, November, December, March, May, and June. The test is typically offered on the first Saturday of the month for the October, November, December, May, and June administrations. In other countries, the SAT is offered four times a year: in October, December, March, and May. The test was taken by 1,715,481 high school graduates in the class of 2017.
Candidates wishing to take the test may register online at the College Board's website, by mail, or by telephone, at least three weeks before the test date.
The SAT costs $45 ($57 with the optional essay), plus additional fees if testing outside the United States, as of 2017. The College Board makes fee waivers available for low-income students. Additional fees apply for late registration, standby testing, registration changes, scores by telephone, and extra score reports (beyond the four provided for free).
Candidates whose religious beliefs prevent them from taking the test on a Saturday may request to take the test on the following day, except for the October test date in which the Sunday test date is eight days after the main test offering. Such requests must be made at the time of registration and are subject to denial.
Students with verifiable disabilities, including physical and learning disabilities, are eligible to take the SAT with accommodations. The standard time increase for students requiring additional time due to learning disabilities or physical handicaps is time + 50%; time + 100% is also offered.
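To make the "time + 50%" and "time + 100%" figures concrete, the sketch below applies them to the standard section times given earlier in this article. Treating the extension as a uniform multiplier on every section is an assumption made for illustration, and the names `SECTION_MINUTES` and `extended_minutes` are hypothetical.

```python
# Standard section times in minutes, taken from the section descriptions above.
SECTION_MINUTES = {
    "Reading": 65,
    "Writing and Language": 35,
    "Math (no calculator)": 25,
    "Math (calculator)": 55,
}


def extended_minutes(standard, extra_fraction):
    """Return the time allowed when extra_fraction of the standard time is added."""
    return standard * (1 + extra_fraction)


standard_total = sum(SECTION_MINUTES.values())
total_plus_50 = sum(extended_minutes(m, 0.5) for m in SECTION_MINUTES.values())
total_plus_100 = sum(extended_minutes(m, 1.0) for m in SECTION_MINUTES.values())

print(standard_total, total_plus_50, total_plus_100)
```

Under this assumption the three-hour scored portion becomes four and a half hours with 50% extra time and six hours with 100% extra time, not counting breaks or the optional essay.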
Raw scores, scaled scores, and percentiles
Students receive their online score reports approximately three weeks after test administration (six weeks for mailed, paper scores), with each section graded on a scale of 200–800 and two sub scores for the writing section: the essay score and the multiple choice sub score. In addition to their score, students receive their percentile (the percentage of other test takers with lower scores). The raw score, or the number of points gained from correct answers and lost from incorrect answers, is also included. Students may also receive, for an additional fee, the Question and Answer Service, which provides the student's answer, the correct answer to each question, and online resources explaining each question.
The corresponding percentile of each scaled score varies from test to test—for example, in 2003, a scaled score of 800 in both sections of the SAT Reasoning Test corresponded to a percentile of 99.9, while a scaled score of 800 in the SAT Physics Test corresponded to the 94th percentile. The differences in what scores mean with regard to percentiles are due to the content of the exam and the caliber of students choosing to take each exam. Subject Tests are subject to intensive study (often in the form of an AP, which is relatively more difficult), and only those who know they will perform well tend to take these tests, creating a skewed distribution of scores.
The percentiles that various SAT scores for college-bound seniors correspond to are summarized in the following chart:
|Percentile||Score, 1600 Scale||Score, 2400 Scale|
|* The percentile of the perfect score was 99.98 on the 2400 scale and 99.93 on the 1600 scale.|
|** 99+ means better than 99.5 percent of test takers.|
The older SAT (before 1995) had a very high ceiling. In any given year, only seven of the million test-takers scored above 1580. A score above 1580 was equivalent to the 99.9995 percentile.
In 2015 the average score for the Class of 2015 was 1490 out of a maximum 2400. That was down 7 points from the previous class’s mark and was the lowest composite score of the past decade.
SAT-ACT score comparisons
The College Board and ACT, Inc. conducted a joint study of students who took both the SAT and the ACT between September 2004 (for the ACT) or March 2005 (for the SAT) and June 2006. Tables were provided to concord scores for students taking the SAT after January 2005 and before March 2016. 
In May 2016, the College Board released concordance tables to concord scores on the SAT used from March 2005 through January 2016 to the SAT used since March 2016, as well as tables to concord scores on the SAT used since March 2016 to the ACT.
Many college entrance exams in the early 1900s were specific to each school and required candidates to travel to the school to take the tests. The College Board, a consortium of colleges in the northeastern United States, was formed in 1900 to establish a nationally administered, uniform set of essay tests based on the curricula of the boarding schools that typically provided graduates to the colleges of the Ivy League and Seven Sisters, among others.
In the same time period, Lewis Terman and others began to promote the use of tests such as Alfred Binet's in American schools. Terman in particular thought that such tests could identify an innate "intelligence quotient" (IQ) in a person. The results of an IQ test could then be used to find an elite group of students who would be given the chance to finish high school and go on to college. By the mid-1920s, the increasing use of IQ tests, such as the Army Alpha test administered to recruits in World War I, led the College Board to commission the development of the SAT. The commission, headed by Carl Brigham, argued that the test predicted success in higher education by identifying candidates primarily on the basis of intellectual promise rather than on specific accomplishment in high school subjects. In 1934, James Conant and Henry Chauncey used the SAT as a means to identify recipients for scholarships to Harvard University. Specifically, Conant wanted to find students, other than those from the traditional northeastern private schools, that could do well at Harvard. The success of the scholarship program and the advent of World War II led to the end of the College Board essay exams and to the SAT being used as the only admissions test for College Board member colleges.
The SAT rose in prominence after World War II due to several factors. Machine-based scoring of multiple-choice tests taken by pencil had made it possible to rapidly process the exams. The G.I. Bill produced an influx of millions of veterans into higher education. The formation of the Educational Testing Service (ETS) also played a significant role in the expansion of the SAT beyond the roughly fifty colleges that made up the College Board at the time. The ETS was formed in 1947 by the College Board, Carnegie Foundation for the Advancement of Teaching, and the American Council on Education, to consolidate respectively the operations of the SAT, the GRE, and the achievement tests developed by Ben Wood for use with Conant's scholarship exams. The new organization was to be philosophically grounded in the concepts of open-minded, scientific research in testing with no doctrine to sell and with an eye toward public service. The ETS was chartered after the death of Brigham, who had opposed the creation of such an entity. Brigham felt that the interests of a consolidated testing agency would be more aligned with sales or marketing than with research into the science of testing. It has been argued that the interest of the ETS in expanding the SAT in order to support its operations aligned with the desire of public college and university faculties to have smaller, diversified, and more academic student bodies as a means to increase research activities. In 1951, about 80,000 SATs were taken; in 1961, about 800,000; and by 1971, about 1.5 million SATs were being taken each year.
A timeline of notable events in the history of the SAT follows.
1901 essay exams
On June 17, 1901, the first exams of the College Board were administered to 973 students across 67 locations in the United States, and two in Europe. Although those taking the test came from a variety of backgrounds, approximately one third were from New York, New Jersey, or Pennsylvania. The majority of those taking the test were from private schools, academies, or endowed schools. About 60% of those taking the test applied to Columbia University. The test contained sections on English, French, German, Latin, Greek, history, mathematics, chemistry, and physics. The test was not multiple choice, but instead was evaluated based on essay responses as "excellent", "good", "doubtful", "poor" or "very poor".
The first administration of the SAT occurred on June 23, 1926, when it was known as the Scholastic Aptitude Test. This test, prepared by a committee headed by Princeton psychologist Carl Campbell Brigham, had sections of definitions, arithmetic, classification, artificial language, antonyms, number series, analogies, logical inference, and paragraph reading. It was administered to over 8,000 students at over 300 test centers. Men composed 60% of the test-takers. Slightly over a quarter of males and females applied to Yale University and Smith College. The test was paced rather quickly, test-takers being given only a little over 90 minutes to answer 315 questions. The raw score of each participating student was converted to a score scale with a mean of 500 and a standard deviation of 100. This scale was effectively equivalent to a 200 to 800 scale, although students could score more than 800 and less than 200.
1928 and 1929 tests
In 1928, the number of sections on the SAT was reduced to seven, and the time limit was increased to slightly under two hours. In 1929, the number of sections was again reduced, this time to six. These changes were designed in part to give test-takers more time per question. For these two years, all of the sections tested verbal ability: math was eliminated entirely from the SAT.
1930 test and 1936 changes
In 1930 the SAT was first split into the verbal and math sections, a structure that would continue through 2004. The verbal section of the 1930 test covered a narrower range of content than its predecessors, examining only antonyms, double definitions (somewhat similar to sentence completions), and paragraph reading. In 1936, analogies were re-added. Between 1936 and 1946, students had between 80 and 115 minutes to answer 250 verbal questions (over a third of which were on antonyms). The mathematics test introduced in 1930 contained 100 free response questions to be answered in 80 minutes, and focused primarily on speed. From 1936 to 1941, as in 1928 and 1929, the mathematics section was eliminated entirely. When the mathematics portion of the test was re-added in 1942, it consisted of multiple choice questions.
1941 and 1942 score scales
Until 1941, the scores on all SATs had been scaled to a mean of 500 with a standard deviation of 100. Although one test-taker could be compared to another for a given test date, comparisons from one year to another could not be made. For example, a score of 500 achieved on an SAT taken in one year could reflect a different ability level than a score of 500 achieved in another year. By 1940, it had become clear that setting the mean SAT score to 500 every year was unfair to those students who happened to take the SAT with a group of higher average ability.
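The pre-1941 scaling can be illustrated with a short sketch that rescales each administration separately to a mean of 500 and a standard deviation of 100. The two small cohorts are invented for illustration only, the function name `scale_scores` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def scale_scores(raw_scores, target_mean=500, target_sd=100):
    """Rescale one administration's raw scores to the target mean and SD."""
    m = mean(raw_scores)
    s = pstdev(raw_scores)  # assumption: population standard deviation
    if s == 0:
        raise ValueError("all raw scores are identical; cannot rescale")
    return [target_mean + target_sd * (x - m) / s for x in raw_scores]


# Invented raw scores: cohort B is stronger than cohort A on the same scale.
cohort_a = [40, 50, 60]
cohort_b = [60, 70, 80]

scaled_a = scale_scores(cohort_a)
scaled_b = scale_scores(cohort_b)

# The same raw score of 60 tops cohort A but sits at the bottom of cohort B.
print(round(scaled_a[2]), round(scaled_b[0]))
```

Both cohorts end up with a mean of 500, so a scaled score only locates a student within one administration. That is why a 500 earned in one year could not be compared with a 500 earned in another.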
In order to make cross-year score comparisons possible, in April 1941 the SAT verbal section was scaled to a mean of 500, and a standard deviation of 100, and the June 1941 SAT verbal section was equated (linked) to the April 1941 test. All SAT verbal sections after 1941 were equated to previous tests so that the same scores on different SATs would be comparable. Similarly, in June 1942 the SAT math section was equated to the April 1942 math section, which itself was linked to the 1942 SAT verbal section, and all SAT math sections after 1942 would be equated to previous tests. From this point forward, SAT mean scores could change over time, depending on the average ability of the group taking the test compared to the roughly 10,600 students taking the SAT in April 1941. The 1941 and 1942 score scales would remain in use until 1995. 
1946 test and associated changes
Paragraph reading was eliminated from the verbal portion of the SAT in 1946, and replaced with reading comprehension, and "double definition" questions were replaced with sentence completions. Between 1946 and 1957 students were given 90 to 100 minutes to complete 107 to 170 verbal questions. Starting in 1958 time limits became more stable, and for 17 years, until 1975, students had 75 minutes to answer 90 questions. In 1959 questions on data sufficiency were introduced to the mathematics section, and then replaced with quantitative comparisons in 1974. In 1974 both verbal and math sections were reduced from 75 minutes to 60 minutes each, with changes in test composition compensating for the decreased time.
1960s and 1970s score declines
From 1926 to 1941, scores on the SAT were scaled to make 500 the mean score on each section. In 1941 and 1942, SAT scores were standardized via test equating, and as a consequence, average verbal and math scores could vary from that time forward. In 1952, mean verbal and math scores were 476 and 494, respectively, and scores were generally stable in the 1950s and early 1960s. However, starting in the mid-1960s and continuing until the early 1980s, SAT scores declined: the average verbal score dropped by about 50 points, and the average math score fell by about 30 points. By the late 1970s, only the upper third of test takers were doing as well as the upper half of those taking the SAT in 1963. From 1961 to 1977, the number of SATs taken per year doubled, suggesting that the decline could be explained by demographic changes in the group of students taking the SAT. Commissioned by the College Board, an independent study of the decline found that most (up to about 75%) of the test decline in the 1960s could be explained by compositional changes in the group of students taking the test; however, only about 25 percent of the 1970s decrease in test scores could similarly be explained. Later analyses suggested that up to 40 percent of the 1970s decline in scores could be explained by demographic changes, leaving unknown at least some of the reasons for the decline.
In early 1994, substantial changes were made to the SAT. Antonyms were removed from the verbal section in order to make rote memorization of vocabulary less useful. Also, the fraction of verbal questions devoted to passage-based reading material was increased from about 30% to about 50%, and the passages were chosen to be more like typical college-level reading material, compared to previous SAT reading passages. The changes for increased emphasis on analytical reading were made in response to a 1990 report issued by a commission established by the College Board. The commission recommended that the SAT should, among other things, "approximate more closely the skills used in college and high school work". A mandatory essay had been considered as well for the new version of the SAT; however, criticism from minority groups as well as a concomitant increase in the cost of the test necessary to grade the essay led the College Board to drop it from the planned changes.
Major changes were also made to the SAT mathematics section at this time, due in part to the influence of suggestions made by the National Council of Teachers of Mathematics. Test-takers were now permitted to use calculators on the math sections of the SAT. Also, for the first time since 1935, the SAT would now include some math questions that were not multiple choice, instead requiring students to supply the answers. Additionally, some of these "student-produced response" questions could have more than one correct answer. The tested mathematics content on the SAT was expanded to include concepts of slope of a line, probability, elementary statistics including median and mode, and counting problems.
1995 recentering (raising mean score back to 500)
By the early 1990s, average total SAT scores were around 900 (typically, 425 on the verbal and 475 on the math). The average scores on the 1994 modification of the SAT I were similar: 428 on the verbal and 482 on the math. SAT scores for admitted applicants to highly selective colleges in the United States were typically much higher. For example, the score ranges of the middle 50% of admitted applicants to Princeton University in 1985 were 600 to 720 (verbal) and 660 to 750 (math). Similarly, median scores on the modified 1994 SAT for freshmen entering Yale University in the fall of 1995 were 670 (verbal) and 720 (math). For the majority of SAT takers, however, verbal and math scores were below 500: In 1992, half of the college-bound seniors taking the SAT were scoring between 340 and 500 on the verbal section and between 380 and 560 on the math section, with corresponding median scores of 420 and 470, respectively.
The drop in SAT verbal scores, in particular, meant that the usefulness of the SAT score scale (200 to 800) had become degraded. At the top end of the verbal scale, significant gaps were occurring between raw scores and uncorrected scaled scores: a perfect raw score no longer corresponded to an 800, and a single omission out of 85 questions could lead to a drop of 30 or 40 points in the scaled score. Corrections to scores above 700 had been necessary to reduce the size of the gaps and to make a perfect raw score result in an 800. At the other end of the scale, about 1.5 percent of test takers would have scored below 200 on the verbal section if that had not been the reported minimum score. Although the math score averages were closer to the center of the scale (500) than the verbal scores, the distribution of math scores was no longer well approximated by a normal distribution. These problems, among others, suggested that the original score scale and its reference group of about 10,000 students taking the SAT in 1941 needed to be replaced.
Beginning with the test administered in April 1995, the SAT score scale was recentered to return the average math and verbal scores close to 500. Although only 25 students had received perfect scores of 1600 in all of 1994, 137 students taking the April test scored a 1600. The new scale used a reference group of about one million seniors in the class of 1990: the scale was designed so that the SAT scores of this cohort would have a mean of 500 and a standard deviation of 110. Because the new scale would not be directly comparable to the old scale, scores awarded in April 1995 and later were officially reported with an "R" (for example, "560R") to reflect the change in scale, a practice that was continued until 2001. Scores awarded before April 1995 may be compared to those on the recentered scale by using official College Board tables. For example, verbal and math scores of 500 received before 1995 correspond to scores of 580 and 520, respectively, on the 1995 scale.
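The idea of anchoring a scale to a reference group can be illustrated with a minimal sketch. It assumes a simple linear transformation, which is a simplification: the actual recentering relied on the College Board's own equating procedures and conversion tables, which are not reproduced here. The function name, the sample reference scores, and the rounding to multiples of 10 are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def make_recentering_scale(reference_scores, target_mean=500.0,
                           target_sd=110.0, floor=200, ceiling=800):
    """Return a function mapping a raw score onto a recentered scale.

    Assumption: a plain linear (z-score) transformation, chosen so that the
    reference group ends up with the target mean and standard deviation.
    This is NOT the College Board's actual equating method.
    """
    ref_mean = mean(reference_scores)
    ref_sd = pstdev(reference_scores)  # population SD of the reference group
    if ref_sd == 0:
        raise ValueError("reference group must contain varying scores")

    def scale(raw_score):
        z = (raw_score - ref_mean) / ref_sd       # distance from reference mean
        scaled = target_mean + target_sd * z      # place on the 500/110 scale
        rounded = int(round(scaled / 10.0)) * 10  # report in steps of 10
        return max(floor, min(ceiling, rounded))  # clamp to the 200-800 range

    return scale


# Hypothetical reference group of raw scores (illustrative values only).
scale = make_recentering_scale([30, 40, 50, 60, 70])
print(scale(50))  # a score at the reference mean maps to 500
```

Under this sketch, a score equal to the reference group's mean always lands on 500, and the clamp reflects the reported minimum of 200 and maximum of 800.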
1995 re-centering controversy
Certain educational organizations viewed the SAT re-centering initiative as an attempt to stave off international embarrassment with regard to continuously declining test scores, even among top students. As evidence, they pointed out that the number of pupils who scored above 600 on the verbal portion of the test had fallen from a peak of 112,530 in 1972 to 73,080 in 1993, a decline of about 35%, despite the fact that the total number of test-takers had risen by over 500,000. Other authors have argued that the evidence for a decline in student quality is mixed, noting that top scorers on the ACT have shown little change in the same period, and that the proportion of 17-year-olds scoring at the highest performance level on the NAEP long-term trend assessment has been roughly stable for decades.
2002 changes – Score Choice
Since 1993, using a policy referred to as "Score Choice", students taking the SAT-II subject exams were able to choose whether or not to report the resulting scores to a college to which the student was applying. In October 2002, the College Board dropped the Score Choice option for SAT-II exams, matching the score policy for the traditional SAT tests that required students to release all scores to colleges. The College Board said that, under the old score policy, many students who waited to release scores would forget to do so and miss admissions deadlines. It was also suggested that the old policy of allowing students the option of which scores to report favored students who could afford to retake the tests.
2005 changes, including a new 2400-point score
In 2005, the test was changed again, largely in response to criticism by the University of California system. In order to have the SAT more closely reflect high school curricula, certain types of questions were eliminated: analogies from the verbal section and quantitative comparison items from the math section. A new writing section, with an essay, based on the former SAT II Writing Subject Test, was added, in part to increase the chances of closing the opening gap between the highest and midrange scores. Other factors included the desire to test the writing ability of each student; hence the essay. The essay section added up to 800 points to the score, which raised the maximum score to 2400. The "New SAT" was first offered on March 12, 2005, after the last administration of the "old" SAT in January 2005. The mathematics section was expanded to cover three years of high school mathematics. To emphasize the importance of reading, the verbal section's name was changed to the Critical Reading section.
Scoring problems of October 2005 tests
In March 2006, it was announced that a small percentage of the SATs taken in October 2005 had been scored incorrectly due to the test papers' being moist and not scanning properly, and that some students had received erroneous scores. The College Board announced they would change the scores for the students who were given a lower score than they earned, but at this point many of those students had already applied to colleges using their original scores. The College Board decided not to change the scores for the students who were given a higher score than they earned. A lawsuit was filed in 2006 on behalf of the 4,411 students who received an incorrect score on the SAT. The class-action suit was settled in August 2007 when the College Board and Pearson Educational Measurement, the company that scored the SATs, announced they would pay $2.85 million into a settlement fund. Under the agreement each student could either elect to receive $275 or submit a claim for more money if he or she felt the damage was greater. A similar scoring error occurred on a secondary school admission test in 2010–2011 when the ERB (Educational Records Bureau) announced after the admission process was over that an error had been made in the scoring of the tests of 2,010 students (17%) who had taken the Independent School Entrance Examination for admission to private secondary schools for 2011. Commenting on the effect of the error on students' school applications in The New York Times, David Clune, President of the ERB, stated, "It is a lesson we all learn at some point—that life isn't fair."
As part of an effort to "reduce student stress and improve the test-day experience", in late 2008 the College Board announced that the Score Choice option, recently dropped for SAT subject exams, would be available for both the SAT subject tests and the SAT starting in March 2009. At the time, some college admissions officials agreed that the new policy would help to alleviate student test anxiety, while others questioned whether the change was primarily an attempt to make the SAT more competitive with the ACT, which had long had a comparable score choice policy. Recognizing that some colleges would want to see the scores from all tests taken by a student, under this new policy, the College Board would encourage but not force students to follow the requirements of each college to which scores would be sent. A number of highly selective colleges and universities, including Yale, the University of Pennsylvania, Cornell, and Stanford, rejected the Score Choice option at the time and continue to require applicants to submit all scores. Others, such as MIT and Harvard, allow students to choose which scores they submit, and use only the highest score from each section when making admission decisions. Still others, such as Oregon State University and University of Iowa, allow students to choose which scores they submit, considering only the test date with the highest combined score when making admission decisions.
Beginning in 2012, test takers were required to submit a current, recognizable photo during registration. Students are required to present their photo admission ticket – or another acceptable form of photo ID – for admittance to their designated test center. Student scores and registration information, including the photo provided, are made available to the student’s high school. In the event of an investigation involving the validity of a student’s test scores, their photo may be made available to institutions to which they have sent scores. Any college that is granted access to a student’s photo is first required to certify that the student in question has been admitted.
2016 changes, including the return to a 1600-point score
On March 5, 2014, the College Board announced its plan to redesign the SAT in order to link the exam more closely to the work high school students encounter in the classroom. The new exam was administered for the first time in March 2016. Some of the major changes are: an emphasis on the use of evidence to support answers, a shift away from obscure vocabulary to words that students are more likely to encounter in college and career, a math section that is focused on fewer areas, a return to the 1600-point score scale, an optional essay, and the removal of the penalty for wrong answers (rights-only scoring). To combat the perceived advantage of costly test preparation courses, the College Board announced a new partnership with Khan Academy to offer free online practice problems and instructional videos.
The SAT has been renamed several times since its introduction in 1926. It was originally known as the Scholastic Aptitude Test. In 1990, a commission set up by the College Board to review the proposed changes to the SAT program recommended that the meaning of the initialism SAT be changed to "Scholastic Assessment Test" because a "test that integrates measures of achievement as well as developed ability can no longer be accurately described as a test of aptitude". In 1993, the College Board changed the name of the test to SAT I: Reasoning Test; at the same time, the name of the Achievement Tests was changed to SAT II: Subject Tests. The Reasoning Test and Subject Tests were to be collectively known as the Scholastic Assessment Tests. According to the president of the College Board at the time, the name change was meant "to correct the impression among some people that the SAT measures something that is innate and impervious to change regardless of effort or instruction." The new SAT debuted in March 1994, and was referred to as the Scholastic Assessment Test by major news organizations. However, in 1997, the College Board announced that the SAT could not properly be called the Scholastic Assessment Test, and that the letters SAT did not stand for anything. In 2004, the Roman numeral in SAT I: Reasoning Test was dropped, making SAT Reasoning Test the new name of the SAT.
Math–verbal achievement gap
In 2002, Richard Rothstein (education scholar and columnist) wrote in The New York Times that the U.S. math averages on the SAT and ACT continued their decade-long rise over national verbal averages on the tests.
Reuse of old SAT exams
The College Board has been accused of completely reusing old SAT papers previously given in the United States. The recycling of questions from previous exams has been exploited to allow for cheating on exams and impugned the validity of some students' test scores, according to college officials. Test preparation companies in Asia have been found to provide test questions to students within hours of a new SAT exam's administration.
Association with culture
For decades many critics have accused designers of the verbal SAT of cultural bias as an explanation for the disparity in scores between poorer and wealthier test-takers. A famous (and long past) example of this bias in the SAT I was the oarsman–regatta analogy question. The object of the question was to find the pair of terms that had the relationship most similar to the relationship between "runner" and "marathon". The correct answer was "oarsman" and "regatta". The choice of the correct answer was thought to have presupposed students' familiarity with rowing, a sport popular with the wealthy. However, according to Murray and Herrnstein, the black-white gap is smaller in culture-loaded questions like this one than in questions that appear to be culturally neutral. Analogy questions have since been replaced by short reading passages.
Association with family income
A report from The New York Times stated that family income can explain about 95% of the variance in SAT scores. In response, Lisa Wade, contributor at the website The Society Pages, commented that those with higher family income “tend to have better teachers, more resource-rich educational environments, more educated parents who can help them with school and, sometimes, expensive SAT tutoring.” However, University of California system research found that after controlling for family income and parental education, the already low ability of the SAT to measure aptitude and college readiness fell sharply while the more substantial aptitude and college readiness measuring abilities of high school GPA and the SAT II each remained undiminished (and even slightly increased). The University of California system required both the SAT and the SAT II from applicants to the UC system during the four years included in the study. They further found that, after controlling for family income and parental education, the so-called achievement tests known as the SAT II measure aptitude and college readiness 10 times higher than the SAT. As with racial bias, correlation with income could also be due to the social class of the makers of the test, although according to the authors of The Bell Curve, empirical research suggests that poorer students actually perform worse on questions the authors believed to be "neutral" compared to the ones they termed as "privileged."
Association with gender
The largest association with gender on the SAT is found in the math section, where male students, on average, score higher than female students by approximately 30 points. In 2013, the American College Testing Board released a report stating that boys outperformed girls on the mathematics section of the test.
Association with race and ethnicity
African American, Hispanic, and Native American students, on average, score on the order of one standard deviation lower on the SAT than white and Asian students.