
STAR Glossary

This glossary provides brief descriptions of terms you may encounter in your work with STAR. We welcome your feedback. If you have suggestions for definitions or if you know of additional terms that should be added, please contact us.



Academic Performance Index (API)
A number on a scale of 200 to 1000 that indicates how well a school performed academically in the previous year. The API measures a school's change in test achievement and progress toward a target API of 800. The API ranks school performance, sets growth targets, and provides similar-school comparisons. It is the cornerstone of the Public Schools Accountability Act. The STAR tests are part of what constitutes the API.


accommodation
Any variation in the assessment environment or process that does not fundamentally alter what the exam measures or affect the comparability of exam scores. Accommodations are specified in the eligible student's IEP or Section 504 plan.

Examples: Braille or large-print forms for students who are visually impaired; additional time for students with certain learning disabilities; test questions read aloud to students for the California Standards Test for Mathematics

accountability
The extent to which an individual, group, or institution is held responsible for actions or performance, evidence of student learning and achievement, and school improvement.

achievement test
A test designed to measure students' learning in such areas as reading, writing, mathematics, history–social science, and science.

adequate yearly progress (AYP)
An annual measurement of improvement in student achievement based on state academic standards. School districts and schools must meet this minimum standard as a part of the Elementary and Secondary Education Act (ESEA).

alignment
The process of linking curriculum, assessment, classroom instruction, and learning to a set of standards that describes what students should know and be able to do. The goal of alignment is to ensure that classroom instruction and learning activities support adopted standards and assessments. Professional development and instructional materials must be linked to what is needed to achieve the standards.

Alternative Schools Accountability Model (ASAM)
A set of indicators that is considered along with the STAR Program as a measure of school and student progress.

Examples: A reduction of inappropriate behavior in the classroom; improved attendance; increased graduation rate

analytic scoring
The process of evaluating student work by individual elements rather than by overall quality (holistic scoring). The crucial elements of response are identified and scored separately. An overall impression of quality may be included as an element.

Examples: Historical essay analytic scores: use of prior knowledge, application of principles, use of original source material to support a point of view, accuracy, and composition

answer document
Pages with circles for students to record answers to test questions. These pages may be for multiple-choice or writing tests.

API
See Academic Performance Index (API).

ARPs
See Assessment Review Panels (ARPs).

ASAM
See Alternative Schools Accountability Model (ASAM).

assessment
The processes used to collect information about student progress toward educational goals. The particular form of an assessment depends on what is being assessed and on the uses to which results of the assessment will be applied. Assessments can be small in scale (assessments that teachers use in the classroom to obtain day-to-day information about student progress), medium in scale (assessments that school districts use to evaluate the effectiveness of schools or educational programs), or large in scale (assessments that state or national bodies use to assess the degree to which students have met large educational goals).

Assessment Review Panels (ARPs)
Groups of California educators and administrators appointed by the SBE to evaluate blueprints and test items.

assessment system
The combination of assessments in a system that produces comprehensive, credible, and dependable information. Educational and political institutions use assessment systems to make decisions about education for students, schools, school districts, or states.

AYP
See adequate yearly progress (AYP).



bias
A characteristic of a test that could reduce the chances for identifiable subpopulations to receive scores that accurately reflect their abilities to respond to the skill being measured. Common sources of bias may be related to language, cultural, or gender differences.

Example: A mathematics word problem that contains difficult language may be biased against English learners. (Inadequate performance may not be due to a lack of mathematical ability but, rather, a lack of English-language skills.)

Statements of the goals that California's SBE wants California students to reach.



California Alternate Performance Assessment (CAPA)
A criteria-response test for students with significant cognitive disabilities in grades two through eleven whose disabilities prevent them from taking the CSTs and CMA. Students are tested in ELA, mathematics, and science.

California Basic Educational Data System (CBEDS)
An annual collection of basic student and staff data from K–12 schools; includes data on student enrollment, graduates, dropouts, course enrollment, enrollment in alternative education, gifted and talented education, and more.

California Department of Education (CDE)
The state agency that administers the STAR Program and implements SBE policies.

California Modified Assessment (CMA)
A grade-level assessment for students who have an IEP, are receiving grade-level instruction, and, even with interventions, will not achieve grade-level proficiency within the year covered by the student's IEP. The purpose of the CMA tests is to allow students with disabilities greater access to demonstrate their achievement of the content standards for ELA, mathematics, and science.

California Reading List (CRL)
A list of books geared to reading level, based on how students performed on the STAR test.

California Standards Tests (CSTs)
Tests based on the educational goals or objectives that California's SBE and other educators have agreed that California students should reach. Students are tested in ELA, mathematics, science, and history–social science.

CAPA
See California Alternate Performance Assessment (CAPA).

CBEDS
See California Basic Educational Data System (CBEDS).

CDE
See California Department of Education (CDE).

Hollow circles or ovals provided on an answer document. Students use a No. 2 pencil to fill in the circles, or bubbles, to record their answers to test items.

classroom assessment
An assessment that teachers or groups of teachers develop, administer, and score to evaluate individual student or classroom performance on a topic. Ideally, the results of classroom assessments inform teachers and improve instruction to help students reach identified standards.

CMA
See California Modified Assessment (CMA).

competency test
An assessment to ensure that students have met minimal content and skill standards. Generally, students are required to pass such tests as a condition of promotion or graduation.

constructed-response items
Test questions that require a behavior or action on the part of the student.

Examples: Oral, pictorial, and written responses

content standards
Stated expectations of what students in California should know and be able to do in particular subjects and grade levels. Content standards define for teachers, schools, students, parents, and communities not only what is expected of students but also what schools should teach.

criteria
Guidelines, rules, characteristics, or dimensions that are used to judge the quality of student performance. Criteria indicate what is valued in student responses, products, or performance. Criteria may be holistic, analytic, general, or specific. Scoring rubrics are based on criteria and define what the criteria mean and how they are used.

criterion-referenced test (CRT)
An assessment designed to reveal what a student knows, understands, or can do in relation to specific objectives or standards. Individual items are designed to assess specific educational objectives. In a CRT, it is possible that none, some, or all of the examinees will reach a particular goal or performance standard.

CRL
See California Reading List (CRL).

CRT
See criterion-referenced test (CRT).

CSTs
See California Standards Tests (CSTs).

curriculum alignment
The process of matching curriculum to the content standards assessed in a testing program to ensure that teachers will cover the material assessed. *



DFA
See Directions for Administration (DFA).

DIF
See differential item functioning (DIF).

differential item functioning (DIF)
A statistical procedure used to investigate whether students of similar ability in different groups, such as gender or ethnicity, perform differently on individual questions. Investigations of DIF are a typical part of efforts to ensure that test items are fair for all groups to which they are administered.

Desired knowledge or skills measured in an assessment, usually represented in a scoring rubric or criteria.

Directions for Administration (DFA)
Instructions to teachers or test examiners for setting up testing and telling students information they need to complete the tests.



Educational Testing Service (ETS)
A nonprofit testing and measurement organization that designs, develops, administers, and scores customized assessment systems for numerous testing programs under the authority of the CDE and the SBE.

EL
See English learner (EL).

ELA
See English–language arts (ELA).

ELD standards and assessments
See English-language development (ELD) standards and assessments.

Elementary and Secondary Education Act (ESEA), Title I
A federal law, reauthorized as the No Child Left Behind (NCLB) Act and signed into law on January 8, 2002, that strives to meet the needs of disadvantaged and minority students in K–12 to equalize educational opportunities for all students.

embedded items
Questions in a test instrument that are used as a field test or that serve a function other than the main purpose of the test.

end-of-course (EOC) examinations
Examinations administered at or near the end of a course to determine whether students have met specified course content and/or standards.

English–language arts (ELA)
One of the subjects that STAR measures; includes reading, writing, grammar, and vocabulary.

English-language development (ELD) standards and assessments
ELD standards provide criteria for documenting the progress of English learners and serve as a guide for development of the ELD assessments, which measure the progress of English learners toward proficiency in English. Under AB 748 (Escutia) and SB 638 (Alpert), school districts were required to administer the ELD assessments to their English learners beginning in spring 2001.

English learner (EL)
A student whose native language is other than English and who is not yet proficient in English.

EOC
See end-of-course (EOC) examinations.

equating
A statistical procedure used to adjust for minor differences in difficulty across different forms or versions of a test so that student scores on different tests can be directly compared.
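
The adjustment described above can be illustrated with linear equating, one simple method among several. This is a minimal sketch, not the procedure used for any particular STAR test: the function name and the sample scores are hypothetical, and it assumes the two forms were taken by comparable groups of students.

```python
from statistics import mean, pstdev

def linear_equate(score_x, form_x_scores, form_y_scores):
    """Place a Form X score on the Form Y scale by matching means and SDs.

    Assumes (hypothetically) that the groups taking each form are
    equivalent in ability, so score differences reflect form difficulty.
    """
    sd_x = pstdev(form_x_scores)
    if sd_x == 0:
        raise ValueError("Form X scores have no spread")
    slope = pstdev(form_y_scores) / sd_x
    return mean(form_y_scores) + slope * (score_x - mean(form_x_scores))

# Hypothetical data: Form Y is slightly easier, so its scores run higher.
form_x = [10, 20, 30]
form_y = [15, 25, 35]
equated = linear_equate(20, form_x, form_y)
```

In this sketch an average score on the harder form maps to the average score on the easier form, which is the sense in which equated scores can be compared directly.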

equity
The concern that assessments be fair and free from bias or favoritism. An assessment that is fair enables all examinees to show what they can do. At a minimum, all assessments should be reviewed for (1) stereotypes; (2) situations that may favor one culture over another; (3) excessive language demands that prevent some students from showing their knowledge; and (4) the assessment's potential to include students with disabilities or limited English proficiency.

ESEA
See Elementary and Secondary Education Act (ESEA), Title I.

ETS
See Educational Testing Service (ETS).

evaluation
The measuring, comparing, and judging of the quality of student work, schools, or a specific educational program.



FEP
See fluent-English proficient (FEP).

field test
A trial in which test items (questions) are given to students who would normally take the test to determine the quality of the items. Test items may be provided in a separate test or embedded in a regular test.

fluent-English proficient (FEP)
A student whose native language is not English, but who has become fluent in English.



high-stakes assessment
Testing that has strong consequences for the participants. A student's performance on a high-stakes exam might affect entry into a special class, college admission, or the awarding of a diploma or degree.

holistic scoring
Evaluation of student work in which the score is based on the overall quality of the response or performance rather than the individual elements of performance (analytic scoring).



ICC
See item characteristic curve (ICC).

IEP
See individualized education program (IEP).

individualized education program (IEP)
A program that coordinates services for students with disabilities and special needs. Such services include special education, transportation, and clinical services.

Test format.

IRT
See item response theory (IRT).

item
An individual question in an assessment or test instrument.

item characteristic curve (ICC)
A mathematical function that relates the probability that a student will answer a question correctly to his or her underlying ability or skill. See also item response theory (IRT).
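
As an illustration of such a function, here is a minimal sketch using the two-parameter logistic model, one common IRT model. The glossary does not say which model the STAR tests use, so the model choice, the function name, and the parameter names `a` (discrimination) and `b` (difficulty) are assumptions made for the example.

```python
from math import exp

def icc_2pl(theta, a=1.0, b=0.0):
    """Probability of a correct answer for a student of ability theta.

    a: hypothetical discrimination parameter (steepness of the curve)
    b: hypothetical difficulty parameter (ability at which P = 0.5)
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))

# Trace the curve across a range of abilities for one illustrative item.
curve = [(theta, icc_2pl(theta, a=1.2, b=0.5)) for theta in (-2, -1, 0, 1, 2)]
```

The probability rises with ability, and a student whose ability equals the item's difficulty has an even chance of answering correctly.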

item response theory (IRT)
A statistical theory that models how a student's response to a test question relates to ability or skill. A variety of IRT models are used in practice to construct and score such tests as the CSTs.



low-stakes assessment
Testing that has few direct consequences for the participants. Such testing is generally used for diagnosis of individual students or to provide information for such purposes as instructional improvement or curriculum redesign. *



Mantel-Haenszel (MH)
A methodology for measuring DIF.
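
To make the methodology concrete, the sketch below computes the Mantel-Haenszel common odds ratio from counts of correct and incorrect answers in a reference group and a focal group, matched by ability level. The function names and the sample counts are hypothetical; the conversion to the delta scale (multiplying the natural log by -2.35) is a widely used convention and is not described in this glossary.

```python
from math import log

def mantel_haenszel_odds_ratio(strata):
    """strata: one tuple per ability level (for example, per total-score band):
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect)."""
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty ability levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator

def mh_d_dif(odds_ratio):
    """Express the odds ratio on the delta scale; 0 means no DIF detected."""
    return -2.35 * log(odds_ratio)

# Hypothetical counts in which matched students perform alike in both groups.
strata = [(10, 10, 5, 5), (20, 5, 8, 2)]
alpha = mantel_haenszel_odds_ratio(strata)
```

An odds ratio near 1 (a delta value near 0) suggests that students of similar ability in the two groups perform similarly on the question.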

MH
See Mantel-Haenszel (MH).

modification
Any variation in the assessment environment or process that fundamentally alters what the exam measures or affects the comparability of exam scores. Students who use a modification on any STAR examination shall not be included in the participation calculation for AYP and shall receive a score of 200 and a ranking of far below basic for the purposes of calculating the API.

Examples: Having ELA test questions read aloud; using a calculator on a mathematics test

multiple choice
A response format in which students select from two or more predetermined choices. Enhanced multiple-choice formats may involve questions that are linked and sequenced in a manner that provides more insight into such features as the student's prior knowledge or the particulars of the solution process used by the student.

multiple measures
The use of a variety of measures, such as standardized test results, classroom assessments, tasks and projects, grades, and teacher evaluation, to provide a complete picture of a student's academic achievement.



NAEP
See National Assessment of Educational Progress (NAEP).

National Assessment of Educational Progress (NAEP)
An ongoing, national assessment of samples of what America's students in grades four, eight, and twelve know and can do in various academic subject areas. NAEP is administered by the National Center for Education Statistics of the U.S. Department of Education. California has participated in NAEP for nearly 30 years. One NAEP component provides states with a measure of their students' academic performance over time and a comparison with results of other states and students nationwide.

National School Lunch Program (NSLP)
A federally funded program to provide lunches to designated students.

No Child Left Behind (NCLB) Act
See Elementary and Secondary Education Act (ESEA), Title I.

norm-referenced test (NRT)
An assessment in which individual or group performance is compared to that of a larger group. Usually, the larger group, or "norm group," is a national sample representing a wide and diverse cross section of students. Students, schools, or school districts are then compared or rank-ordered in relation to the norm group.

NRT
See norm-referenced test (NRT).

NSLP
See National School Lunch Program (NSLP).



OPSC
See Office of Professional Standards Compliance (OPSC).

Office of Professional Standards Compliance (OPSC)
A division of ETS that promotes high quality based on professional standards across ETS.

on-demand assessment
An assessment that takes place at a predetermined time and place, usually under standardized conditions for all students being assessed.

Examples: The STAR tests; school district tests; some in-class unit tests and final exams



The data collection company responsible for shipping, processing, scoring, and reporting STAR results.

performance assessment
A testing method that requires students to write an answer or develop a product that demonstrates their knowledge or skills. Performance assessment can take many different forms, including writing short answers, doing mathematical computations, writing an extended essay, conducting an experiment, presenting an oral argument, or assembling a portfolio of representative work.

performance level
A standard of performance on STAR tests based on the student's scale score. There are five performance levels: advanced; proficient; basic; below basic; and far below basic. The goal in California is to have all students perform at the proficient or advanced level.

point-biserial correlation
A correlation that provides an index of the relationship between students' scores on a question and their total scores on a test. Point-biserial correlations are used to evaluate how well questions discriminate between high- and low-scoring students. Point-biserial correlations can range from -1.0 to +1.0, although most point-biserial correlations are between +0.10 (a question with poor discrimination) and +0.70 (a question with good discrimination).
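
The calculation can be sketched as follows, using the standard point-biserial formula: the difference between the mean total scores of students who answered the question correctly and those who did not, divided by the standard deviation of all total scores and scaled by the proportions answering correctly and incorrectly. The function name and the sample data are hypothetical.

```python
from math import sqrt

def point_biserial(item_correct, total_scores):
    """Correlation between a 0/1 item score and students' total test scores."""
    n = len(item_correct)
    right = [t for c, t in zip(item_correct, total_scores) if c == 1]
    wrong = [t for c, t in zip(item_correct, total_scores) if c == 0]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect response")
    mean_all = sum(total_scores) / n
    # Population standard deviation of the total scores.
    sd_all = sqrt(sum((t - mean_all) ** 2 for t in total_scores) / n)
    if sd_all == 0:
        raise ValueError("total scores have no spread")
    p = len(right) / n  # proportion answering the question correctly
    mean_right = sum(right) / len(right)
    mean_wrong = sum(wrong) / len(wrong)
    return (mean_right - mean_wrong) / sd_all * sqrt(p * (1 - p))

# Hypothetical data: the three highest scorers answered the question correctly.
item = [1, 1, 1, 0, 0]
totals = [9, 8, 7, 4, 2]
r = point_biserial(item, totals)
```

Because the students who answered correctly here are also the highest scorers overall, the correlation is strongly positive, indicating a question that discriminates well.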

Pre-ID
Demographic data that are preprinted on the answer document or on labels for each student in a given district. Pre-ID saves schools time by minimizing the need to fill in demographic information manually on answer documents.

p-value
The difficulty of a test question is often expressed by the p-value, which is the proportion of students answering the question correctly. P-values can range from 0.00 to 1.00, although most p-values are between 0.20 (a relatively difficult question) and 0.80 (a relatively easy question).
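
A minimal sketch of this calculation, assuming each student's answers have already been scored as 1 (correct) or 0 (incorrect). The function name and the sample responses are hypothetical.

```python
def item_p_values(responses):
    """responses: one list per student of 0/1 item scores (1 = correct).

    Returns the proportion of students answering each question correctly.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]

# Hypothetical scored responses: four students, three questions.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
p_values = item_p_values(responses)
```

A higher p-value means an easier question, so in this data the first question is the easiest and the third is the most difficult.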



reliability
The degree to which the results of an assessment are dependable and consistent in measuring particular student knowledge and skills. Reliability is an indication of the consistency of scores over time, between scorers, or across different tasks or items that measure the same thing. If scores are unreliable, interpretations based on those scores (as well as subsequent decisions) will not be valid.



SBE
See State Board of Education (SBE).

scoring rubric
A listing of specific criteria used to score written-response questions in an assessment. A typical rubric contains a scoring scale; states all the different major traits or elements to be examined; and provides criteria for deciding the score to assign to student responses or performance. Scales may be quantitative (e.g., a score from 0 to 6), qualitative (e.g., "adequate performance" or "minimal competency"), or a combination of the two.

scoring scale
The range of scores possible for a test or assessment. Scale scores occur when examinees' responses to any number of items are combined and used to establish and place students on a single scale of achievement.

SDAIE
See specially designed academic instruction in English (SDAIE).

Section 504 plan
A plan by which students with disabilities who are in regular classes are provided special attention or special situations, such as being allowed to take a test alone rather than with other students.

SELPA
See special education local plan area (SELPA).

special education local plan area (SELPA)
A school district or group of school districts in a given geographical area that coordinate the administration and delivery of special education services.

SPAR
See Statewide Pupil Assessment Review (SPAR).

specially designed academic instruction in English (SDAIE)
A teaching style that uses special strategies to assist students who are English learners in learning subject-area content at the appropriate grade level while becoming proficient in English.

standardization
A consistent set of procedures for designing, administering, and scoring an assessment. The purpose of standardization is to ensure that all students are assessed under the same conditions so their scores have the same meaning and are not influenced by differing conditions. Standardized procedures are particularly important when scores are to be used to compare individuals or groups.

Standardized Testing and Reporting (STAR)
California's Standardized Testing and Reporting (STAR) Program, authorized by law in 1997, consists of achievement tests administered annually to California students in public schools in grades two through eleven. The program has two major objectives:
  • To test progress toward achievement of the content standards
  • To measure the achievement of California students in comparison with students nationwide

The STAR program has two components:

standardized tests
Tests with the same content that are administered and scored under conditions uniform for all students. Standardization is needed to make test scores comparable and to ensure, as much as possible, that all examinees have equal and fair chances to demonstrate what they know.

standards
Statements of expectations for student learning that commonly include content standards and performance standards.

Standards-based Tests in Spanish (STS)
A criterion-referenced designated primary language test administered to Spanish-speaking English learners in grades two through eleven. The STS is based on content standards. Students are tested in reading/language arts and mathematics.

stanine
A score on a normalized standardized test that indicates a student's rank in comparison with other students who took the same test. Stanine scores range from 1 to 9 and indicate a student's performance level. A score of 1, 2, or 3 is below average; a score of 4, 5, or 6 is average; and a score of 7, 8, or 9 is above average.
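
The sketch below shows how a stanine might be derived from a percentile rank and then described using the three bands given above. The cumulative cut points (4, 11, 23, 40, 60, 77, 89, and 96 percent) are the conventional stanine boundaries rather than values taken from this glossary, and the function names and the handling of scores that fall exactly on a boundary are assumptions.

```python
from bisect import bisect_right

# Conventional cumulative percentages that close stanines 1 through 8
# (an assumption; this glossary does not list the cut points).
STANINE_UPPER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine_from_percentile(percentile_rank):
    """Convert a percentile rank (0 to 100) to a stanine from 1 to 9."""
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile rank must be between 0 and 100")
    return bisect_right(STANINE_UPPER_BOUNDS, percentile_rank) + 1

def describe_stanine(stanine):
    """Apply the bands from the definition above."""
    if not 1 <= stanine <= 9:
        raise ValueError("stanine must be between 1 and 9")
    if stanine <= 3:
        return "below average"
    if stanine <= 6:
        return "average"
    return "above average"

label = describe_stanine(stanine_from_percentile(50))
```

A student at the 50th percentile falls in the middle stanine, which the definition classifies as average.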

STAR
See Standardized Testing and Reporting (STAR).

STAR standards-based tests
Tests developed for California that test students in English–language arts and mathematics in grades two through eleven. STAR standards-based tests also cover history–social science in grades eight and eleven, with an end-of-course test for world history for students in grades nine through eleven; science for students in grades five, eight, and ten; and end-of-course science tests for students in grades nine through eleven. They also include a writing prompt in grade seven. These tests are aligned to the content standards.

State Board of Education (SBE)
The governor-appointed body that sets educational policies and designates the STAR contractor.

Statewide Pupil Assessment Review (SPAR)
A panel responsible for reviewing and approving a single achievement test to be used statewide for the testing of students in California public schools, grades two through eleven.

Groupings of content standards that weave through the curriculum at each grade level, becoming progressively complex.

STS
See Standards-based Tests in Spanish (STS).



test
A measuring instrument for assessing and documenting student learning. The traditional test is a single-occasion, timed exercise.

test administration
A single testing period (or window) for a school district that has a variety of beginning and ending dates of instruction.
  • MC01: Apple High School is a single-track school and gives all the multiple-choice tests in one 21-day administration.
  • W01: Bay Elementary School includes students in grade four, so it has an administration for the writing test.
  • Multiple administrations: Canton Middle School is a multi-track and year-round school with different beginning and ending dates of instruction, and one track is not in session in March. (There would be W01 and W02 writing administrations and MC01 and MC02 multiple-choice administrations.)

test booklets
Printed material that includes directions and questions.



validity
The degree to which evidence and theory support the interpretation of test scores entailed by proposed uses of tests. The process of validation involves accumulating evidence to provide a sound scientific basis for the proposed score interpretations. It is the interpretations of test scores for proposed uses that are evaluated, not the test itself.

Example: If a student performs poorly on a reading test, how confident are we that this score indicates poor reading ability? How confident are we that a low reading score requires special educational interventions? +

vertical equating/scoring
A statistical procedure used to adjust for differences when students of different levels take the same test.


* Hart, Diane (ed.). Authentic Assessment: A Handbook for Educators. Reading, Mass.: Addison-Wesley, 1993.

+ American Psychological Association, American Educational Research Association, and National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, D.C.: American Educational Research Association, 1999.
