The postsecondary researcher data files contain commonly requested longitudinal student-level data from the postsecondary educational environment. The data are organized into five datasets: base, enrollment, program, coursework, and awards. These datasets reflect information about postsecondary student demographics, enrollments, programs, courses, cumulative progress, and degrees/certifications (referred to as awards). Data are available for school years from 2010 to the present.

The primary data sources for the postsecondary researcher data are the Student Transcript and Academic Record Repository (STARR) and the National Student Clearinghouse (NSC). STARR is a collection of student data from all Michigan public institutions and some independent institutions. NSC is a national database of postsecondary student information from most colleges nationwide. The information collected from STARR is more detailed than that obtained from NSC, but we use the NSC data to track the progress of Michigan's students who attend college out of state, and as a backup to ensure we obtain the most complete data possible for Michigan's postsecondary students. The STARR and NSC data are merged together and standardized, and these merged datasets serve as the source for the researcher files. More specifically, CEPI submits a student list to NSC for the past 8 cohorts of Michigan high school graduates and for all Michigan postsecondary students. NSC matches the student list to their Student Tracker database and CEPI uses the returned results.

When merging these data sources, student x year x institution enrollment and award records are first identified from STARR and NSC. Where only a STARR record exists, or where both STARR and NSC hold a matching record, the STARR data are incorporated into the panel. Any remaining unique NSC records are then added to the panel. Most commonly, these NSC records come from out-of-state and independent/private institutions, although community colleges and, to a lesser degree, four-year institutions also contribute records.

Figure: Flowchart of STARR record categorization
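
The precedence rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CEPI's actual process: the matching key (RIC, SCHOOL_YEAR, IPEDS_CODE) and the SOURCE column are hypothetical stand-ins for "student x year x institution" and for the record's origin. Because enrollment and award records are matched separately, the function would be applied to each record type on its own.

```python
def merge_starr_nsc(starr_records, nsc_records):
    """Combine two record lists, preferring STARR when both sources match."""

    def key(rec):
        # Assumed matching key: student x school year x institution.
        return (rec["RIC"], rec["SCHOOL_YEAR"], rec["IPEDS_CODE"])

    # Every STARR record enters the panel.
    panel = [dict(rec, SOURCE="STARR") for rec in starr_records]
    starr_keys = {key(rec) for rec in starr_records}

    # NSC records are added only when no STARR record shares their key.
    for rec in nsc_records:
        if key(rec) not in starr_keys:
            panel.append(dict(rec, SOURCE="NSC"))
    return panel


# Illustrative (made-up) records.
starr = [{"RIC": "A1", "SCHOOL_YEAR": "2015-2016", "IPEDS_CODE": "100"}]
nsc = [
    {"RIC": "A1", "SCHOOL_YEAR": "2015-2016", "IPEDS_CODE": "100"},  # also in STARR
    {"RIC": "A1", "SCHOOL_YEAR": "2015-2016", "IPEDS_CODE": "999"},  # NSC only
]
panel = merge_starr_nsc(starr, nsc)
print([(rec["IPEDS_CODE"], rec["SOURCE"]) for rec in panel])
```

In this toy input, the NSC record for institution 100 is dropped in favor of the STARR record, while the NSC-only record for institution 999 is kept.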

While this approach greatly reduces duplicate records, there are potential shortcomings. Academic Year is based on the Start and End dates received from STARR and NSC, and there are known data quality concerns around the End date, which the two sources report differently. When those dates fall into different Academic Years, outcomes from those data may be overreported. A second artifact is the mixing of STARR and NSC data for the same student at the same college in the same school year if, for example, a STARR enrollment record and an NSC award record are received. NSC enrollment records at that college/school year are dropped, but the NSC award record is kept, even if it was not reported in STARR.

This strategy of combining the two data sources, plus the method of data collection used, results in something of a convenience sample for the overall postsecondary dataset. Included is the universe of students enrolled at a Michigan public institution, regardless of whether they attended a Michigan public high school. Additionally, K12 Michigan public high school attendees who attended an out-of-state postsecondary institution or an in-state independent/private postsecondary institution are included, meaning a comprehensive picture of postsecondary outcomes for Michigan public K12 students is achieved. Beyond this, K12 students who dual-enroll will also show records.

Overall Data Table Notes

Table Hierarchy

This is the general hierarchy of the tables:

Student Base
  Enrollments: A student may have multiple associated enrollments.
    Coursework: An enrollment session may have multiple associated courses.
    Programs: An enrollment session may have multiple associated academic programs.
  Awards: A student may have multiple associated awards, but awards do not link directly to enrollment session records.
Independent IHE Reporting Options: IHEs and their reporting option designation. This table is mostly informational, and exists outside of the normal 5 table structure.

Standard Table Joins

The grids below show how the tables link together, but other joins may be appropriate depending on the requirements of the research.

Student Base joins to Enrollments, Coursework, Programs, Awards:
  RIC=RIC

Enrollments joins to Coursework, Programs:
  RIC=RIC
  IPEDS_CODE=IPEDS_CODE
  YEAR_SESSION=YEAR_SESSION

Programs joins to Coursework:
  RIC=RIC
  IPEDS_CODE=IPEDS_CODE
  YEAR_SESSION=YEAR_SESSION
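
As a rough illustration of the Enrollments-to-Coursework join listed above, the following Python sketch performs an inner join on RIC, IPEDS_CODE, and YEAR_SESSION using plain dictionaries. The non-key columns (FULL_TIME, COURSE_TITLE) and the sample rows are invented for the example; the real files may name these fields differently.

```python
from collections import defaultdict

# Join keys taken from the grid above.
JOIN_KEYS = ("RIC", "IPEDS_CODE", "YEAR_SESSION")


def join_on(left_rows, right_rows, keys):
    """Inner-join two lists of dicts on the given key columns."""
    index = defaultdict(list)
    for row in right_rows:
        index[tuple(row[k] for k in keys)].append(row)

    joined = []
    for left in left_rows:
        for right in index.get(tuple(left[k] for k in keys), []):
            # Left-hand values win if a non-key column name collides.
            joined.append({**right, **left})
    return joined


# Hypothetical rows: one enrollment session, three course records.
enrollments = [
    {"RIC": "A1", "IPEDS_CODE": "100", "YEAR_SESSION": "2011-2012 Fall", "FULL_TIME": True},
]
coursework = [
    {"RIC": "A1", "IPEDS_CODE": "100", "YEAR_SESSION": "2011-2012 Fall", "COURSE_TITLE": "Calculus I"},
    {"RIC": "A1", "IPEDS_CODE": "100", "YEAR_SESSION": "2011-2012 Fall", "COURSE_TITLE": "Composition"},
    {"RIC": "A1", "IPEDS_CODE": "100", "YEAR_SESSION": "2011-2012 Spring", "COURSE_TITLE": "Statistics"},
]

joined = join_on(enrollments, coursework, JOIN_KEYS)
print([row["COURSE_TITLE"] for row in joined])
```

Only the two Fall courses match the Fall enrollment session; the Spring course has no enrollment row in this toy input and is left out of the inner join.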

Though awards do not have a direct one-to-one link to enrollment records, most student awards can be linked to a corresponding set of enrollments as shown below. 

Awards may join to Enrollments:
  RIC=RIC
  IPEDS_CODE=IPEDS_CODE

Independent IHE Reporting Options may join to any dataset with an IPEDS Code:
  IPEDS_CODE=IPEDS_CODE
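
Because the Awards-to-Enrollments link uses only RIC and IPEDS_CODE, one award typically matches several enrollment sessions, and some awards match none. The sketch below, with an invented AWARD_NAME column and made-up rows, attaches the list of matching sessions to each award rather than producing one row per pair; that output shape is a design choice for the example, not something the data files prescribe.

```python
from collections import defaultdict


def enrollments_for_awards(awards, enrollments):
    """Attach to each award the sessions enrolled at the awarding institution."""
    sessions = defaultdict(list)
    for enr in enrollments:
        sessions[(enr["RIC"], enr["IPEDS_CODE"])].append(enr["YEAR_SESSION"])

    result = []
    for award in awards:
        matched = sessions.get((award["RIC"], award["IPEDS_CODE"]), [])
        # An empty list means the award has no enrollment at that institution.
        result.append({**award, "ENROLLED_SESSIONS": sorted(matched)})
    return result


# Hypothetical rows.
enrollments = [
    {"RIC": "A1", "IPEDS_CODE": "100", "YEAR_SESSION": "2010-2011 Spring"},
    {"RIC": "A1", "IPEDS_CODE": "100", "YEAR_SESSION": "2010-2011 Fall"},
]
awards = [
    {"RIC": "A1", "IPEDS_CODE": "100", "AWARD_NAME": "Associate in Arts"},
    {"RIC": "A1", "IPEDS_CODE": "200", "AWARD_NAME": "Certificate"},
]

linked = enrollments_for_awards(awards, enrollments)
for row in linked:
    print(row["AWARD_NAME"], row["ENROLLED_SESSIONS"])
```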

Post-Secondary Student Base

Highlights

Record Level: Student-level 
Record Count: ~2.6 million observations 
Years Covered: 2010-Present* 
Population Coverage: Students enrolled in or receiving an award from a Michigan public institution

The base dataset contains information on student demographics: race, gender, and date of birth. The base population includes students who have enrolled in or received an award from a Michigan public institution (and some independent institutions) during the 2010 academic year or later. The data will include the entire academic history of any such student, which may include enrollments or awards prior to the 2010 academic year.

Students are identified with a research identification code (RIC), which is unique in nearly all cases in the dataset. However, due to linking and unlinking over time as data was moved from the source datasets to the final datasets, some students may have duplicated or conflicting data. In these cases, both records are retained within the final dataset, allowing researchers to decide how to eliminate duplicate RICs. In addition, in some cases there are discrepancies between STARR and NSC data.

In addition, the data file includes a stable date of birth (latest reported date of birth) for each student, and indicators for where conflicts in demographic data exist (i.e., if conflicting dates of birth, race, or gender have been reported for any given student over time).
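
Since duplicate RICs are retained for the researcher to resolve, a first step is usually to find them. The sketch below is one possible approach, not a recommended rule: it flags every RIC that appears more than once in the base file and separates those rows for manual review. The DATE_OF_BIRTH column name and the sample rows are assumptions for illustration.

```python
from collections import Counter


def duplicated_rics(base_rows):
    """Return the set of RICs that appear on more than one base record."""
    counts = Counter(row["RIC"] for row in base_rows)
    return {ric for ric, n in counts.items() if n > 1}


# Hypothetical base rows; B2 appears twice with conflicting birth dates.
base = [
    {"RIC": "A1", "DATE_OF_BIRTH": "1995-03-02"},
    {"RIC": "B2", "DATE_OF_BIRTH": "1996-07-19"},
    {"RIC": "B2", "DATE_OF_BIRTH": "1996-07-20"},
]

dupes = duplicated_rics(base)
unique_rows = [row for row in base if row["RIC"] not in dupes]
review_rows = [row for row in base if row["RIC"] in dupes]
print(sorted(dupes), len(unique_rows), len(review_rows))
```

How the rows set aside for review are then resolved (for example, keeping one record per RIC or excluding the student) depends on the research question.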

* Note that, because the academic histories of students might extend before the 2010 academic year, the years covered for other postsecondary files may differ. For example, there are student enrollments dating back as early as 1900 in the enrollment datafile.

Post-Secondary Student Enrollment

Highlights

Record Level: Student by school year session by institution 
Record Count: 1,792,406 observations/year beginning in 2010 
Years Covered: 2010-Present* 
Population Coverage: Students with an enrollment record at a Michigan public institution

The enrollment dataset includes one record per student for each institution attended and school year session (for example, fall 2010-2011). For each record, session beginning and end dates are provided. Additionally, the dataset includes variables that reflect academic progress, such as session GPA and credits attempted/earned. The dataset also includes flags for full-time enrollment and whether any courses were taken for remedial purposes. As with the base file, students are identified using a RIC, which is unique within a student-year-IHE-session for nearly all cases.

In this dataset, CEPI treats Spring and Winter sessions as the same. Researchers should note that in the 2016-17 collection, CEPI began renaming all Winter sessions to Spring to reflect this, so no Winter sessions appear from the 2016-17 enrollments onward. The figure below illustrates how different sessions were categorized. For example, sessions categorized as “Late Summer” have a start date between 4/15 and 7/31 and an end date between 7/16 and 8/31. Some sessions span more than one session range; a session falling entirely within an overlap range is placed in the later (second) session of the range (i.e., the end date determines the session for these cases). Any session that starts between 4/15-EndYear and 8/1-EndYear and ends on or after 7/16-EndYear is moved to the subsequent school year.

Figure: Timeline of session coverage in data
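
Two of the rules above lend themselves to a short Python sketch: treating Winter as Spring when comparing sessions across collection years, and checking whether a session would be moved to the subsequent school year. The function names are hypothetical, EndYear is assumed to mean the calendar year in which the school year ends, and the date bounds are assumed to be inclusive; none of these details are confirmed by the text.

```python
from datetime import date


def normalize_session_name(name):
    """Treat Winter as Spring, mirroring the renaming CEPI began in 2016-17."""
    return "Spring" if name == "Winter" else name


def moves_to_next_school_year(start, end, end_year):
    """Check the 'moved to the subsequent school year' rule.

    end_year is assumed to be the calendar year in which the school year
    ends; all bounds are assumed to be inclusive.
    """
    starts_in_window = date(end_year, 4, 15) <= start <= date(end_year, 8, 1)
    ends_late_enough = end >= date(end_year, 7, 16)
    return starts_in_window and ends_late_enough


# A session from late June to mid August of the 2015-2016 school year.
print(moves_to_next_school_year(date(2016, 6, 20), date(2016, 8, 12), 2016))
# A session from early May to late June ends too early to be moved.
print(moves_to_next_school_year(date(2016, 5, 2), date(2016, 6, 24), 2016))
print(normalize_session_name("Winter"))
```

Applying the name normalization to records from before 2016-17 gives session labels that are comparable with later years.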

In addition, researchers should note the following definitions of student transfers, which apply to those with ENROLLMENT_TYPE = “TransferIn”. CEPI has two principal definitions of a transfer:

  1. If an institution tells us that a student is a transfer, that student is a transfer, full stop. This stems from an Enrollment Type of “TransferIn”. CEPI tries to find the student’s previous college or university, but if they cannot find one then their previous enrollment is labeled as ‘Unknown’.
  2. CEPI looks at all enrollments within the academic year of the report and finds the most recent enrollments for the students before that point. They look for the following to be true. If these are true, then the enrollment pattern is considered a transfer by CEPI’s definition:
     - The previous enrollment is at a different institution
     - The previous enrollment is at the same Enrollment Level (Undergraduate vs Graduate)
     - The student has stopped enrollment at that previous institution
     - The student has not previously been enrolled at their current institution (that is, readmissions are not eligible to be transfers)

CEPI then takes all transfers discovered by methods 1 and/or 2, and compiles them into the data file.
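
The two definitions can be expressed as a single predicate. The sketch below is an interpretation only: ENROLLMENT_TYPE and the value “TransferIn” come from the text, while ENROLLMENT_LEVEL, the STOPPED flag, and the earlier_ipeds_codes argument are hypothetical ways of representing the four conditions of the second definition.

```python
def is_transfer(enrollment, previous, earlier_ipeds_codes):
    """Return True if either transfer definition applies.

    enrollment: the current enrollment record.
    previous: the student's most recent prior enrollment, or None.
    earlier_ipeds_codes: institutions the student attended before now.
    """
    # Definition 1: the institution reported the student as a transfer.
    if enrollment.get("ENROLLMENT_TYPE") == "TransferIn":
        return True

    # Definition 2 needs a previous enrollment to compare against.
    if previous is None:
        return False
    return (
        previous["IPEDS_CODE"] != enrollment["IPEDS_CODE"]  # different institution
        and previous["ENROLLMENT_LEVEL"] == enrollment["ENROLLMENT_LEVEL"]  # same level
        and previous["STOPPED"]  # no longer enrolled at the previous institution
        and enrollment["IPEDS_CODE"] not in earlier_ipeds_codes  # not a readmission
    )


# Hypothetical records.
current = {"IPEDS_CODE": "200", "ENROLLMENT_LEVEL": "Undergraduate"}
previous = {"IPEDS_CODE": "100", "ENROLLMENT_LEVEL": "Undergraduate", "STOPPED": True}

print(is_transfer(current, previous, {"100"}))         # new institution
print(is_transfer(current, previous, {"100", "200"}))  # readmission
```

In the second call the student has attended institution 200 before, so the pattern counts as a readmission rather than a transfer under the second definition.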

View the enrollment data: record counts by institution for 2010-2019

* Years covered includes any year for which there is data available in this file. However, researchers should note that these data encompass the academic histories of any student receiving an award beginning with the 2010 school year. As such, the base file “years covered” encompasses 2010, but all other files may include earlier records as well.

Post-Secondary Student Programs

Highlights

Record Level: Student by program by school year session by institution 
Record Count: 1,994,786 observations/year beginning in 2010 
Years Covered: 2010-Present* 
Population Coverage: Students enrolled in or receiving an award from a Michigan public institution, as well as some out-of-state students from NSC.

The programs dataset includes one record per student for each program the student is working toward, at each institution and school year session (for example, fall 2010-2011). For each record, session beginning and end dates are provided. Additionally, the dataset includes variables that reflect program type and description (such as “Major in Computer Science”). Program CIP codes and descriptions are also provided, but due to reporting issues, many records reflect a null value.

CEPI conducts some deduplication of program records, as follows:

  • Drop records with a null program name if there is a corresponding record where all other key fields match and the program name field contains a value.
  • Drop records with a null program CIP code if there is a corresponding record where all other key fields match and the program CIP code field contains a value.
  • If both a STARR and an NSC program record exist where all key fields are identical, keep only the STARR record.
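
The three rules above can be sketched as successive filters. This is a minimal illustration, not CEPI's implementation: the set of key fields, the PROGRAM_NAME, PROGRAM_CIP_CODE, and SOURCE column names, and the order in which the rules run are all assumptions.

```python
# Assumed key fields; the actual key used by CEPI is not stated.
KEY_FIELDS = ("RIC", "IPEDS_CODE", "YEAR_SESSION", "PROGRAM_NAME", "PROGRAM_CIP_CODE")


def _key(row, skip=None):
    """Key tuple for a row, optionally leaving one field out."""
    return tuple(row[f] for f in KEY_FIELDS if f != skip)


def drop_null_when_filled(rows, field):
    """Drop rows with a null `field` when a row that matches on all other
    key fields has a value in that field."""
    filled = {_key(r, skip=field) for r in rows if r[field] is not None}
    return [
        r for r in rows
        if r[field] is not None or _key(r, skip=field) not in filled
    ]


def prefer_starr(rows):
    """When STARR and NSC rows share identical key fields, keep only STARR."""
    starr_keys = {_key(r) for r in rows if r["SOURCE"] == "STARR"}
    return [r for r in rows if r["SOURCE"] == "STARR" or _key(r) not in starr_keys]


def dedupe_programs(rows):
    rows = drop_null_when_filled(rows, "PROGRAM_NAME")
    rows = drop_null_when_filled(rows, "PROGRAM_CIP_CODE")
    return prefer_starr(rows)


# Hypothetical program rows for one student, institution, and session.
common = {"RIC": "A1", "IPEDS_CODE": "100", "YEAR_SESSION": "2010-2011 Fall"}
name = "Major in Computer Science"
rows = [
    {**common, "PROGRAM_NAME": None, "PROGRAM_CIP_CODE": "11.0701", "SOURCE": "STARR"},
    {**common, "PROGRAM_NAME": name, "PROGRAM_CIP_CODE": "11.0701", "SOURCE": "STARR"},
    {**common, "PROGRAM_NAME": name, "PROGRAM_CIP_CODE": "11.0701", "SOURCE": "NSC"},
    {**common, "PROGRAM_NAME": name, "PROGRAM_CIP_CODE": None, "SOURCE": "NSC"},
]

deduped = dedupe_programs(rows)
print(len(deduped), deduped[0]["SOURCE"])
```

In this toy input, each rule removes one row, leaving only the complete STARR record.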

* Years covered includes any year for which there is data available in this file. However, researchers should note that these data encompass the academic histories of any student enrolled in or receiving an award beginning with the 2010 school year. As such, the base file “years covered” encompasses 2010, but all other files may include earlier records as well.

Codebook

Category Variable Description
Category: Advanced Placement Exams Variable: PHYSEMGR Description: Physics C: Electricity and Magnetism: Exam Score
Category: Advanced Placement Exams Variable: COMSCAGR Description: Computer Science A: Exam Score
Category: Advanced Placement Exams Variable: ART3DGR Description: Studio Art: 3-D Design Portfolio: Exam Score
Category: Advanced Placement Exams Variable: SPANLTGR Description: Spanish Literature and Culture: Exam Score
Category: Advanced Placement Exams Variable: PSYCHGR Description: Psychology: Exam Score
Category: Advanced Placement Exams Variable: ENVSCIGR Description: Environmental Science: Exam Score
Category: Advanced Placement Exams Variable: CALCBCGR Description: Calculus BC: Exam Score
Category: Advanced Placement Exams Variable: ARTSTDGR Description: Studio Art: Drawing Portfolio: Exam Score
Category: Advanced Placement Exams Variable: USHISTGR Description: United States History: Exam Score
Category: Advanced Placement Exams Variable: EURHISGR Description: European History: Exam Score
Category: Advanced Placement Exams Variable: GOVUSGR Description: United States Government and Politics: Exam Score
Category: Advanced Placement Exams Variable: GOVCOMGR Description: Comparative Government and Politics: Exam Score
Category: Advanced Placement Exams Variable: LATINVGR Description: Latin Vergil: Exam Score
Category: Advanced Placement Exams Variable: SPANLAGR Description: Spanish Language: Exam Score
Category: Advanced Placement Exams Variable: COMSCPGR Description: Computer Science Principles: Exam Score
Category: Advanced Placement Exams Variable: ECONMAGR Description: Macroeconomics: Exam Score
Category: Advanced Placement Exams Variable: CALCABGR Description: Calculus AB: Exam Score
Category: Advanced Placement Exams Variable: ARTST2GR Description: Studio Art: 2-D Design Portfolio: Exam Score
Category: Advanced Placement Exams Variable: FRNLANGR Description: French Language and Culture: Exam Score
Category: Advanced Placement Exams Variable: CHINESGR Description: Chinese Language and Culture: Exam Score
Category: Advanced Placement Exams Variable: LATINCGR Description: Latin: Exam Score
Category: Advanced Placement Exams Variable: PHYS1GR Description: Physics 1: Exam Score
Category: Advanced Placement Exams Variable: COMSCBGR Description: Computer Science B: Exam Score
Category: Advanced Placement Exams Variable: JAPANGR Description: Japanese Language and Culture: Exam Score
Category: Advanced Placement Exams Variable: FRENLTGR Description: French Literature: Exam Score
Category: Advanced Placement Exams Variable: PHYSMGR Description: Physics C: Mechanics: Exam Score
Category: Advanced Placement Exams Variable: BIOLGR Description: Biology: Exam Score
Category: Advanced Placement Exams Variable: PHYSBGR Description: Physics B: Exam Score
Category: Advanced Placement Exams Variable: CPSTNSGR Description: AP Capstone Seminar: Exam Score
Category: Advanced Placement Exams Variable: MUSICTGR Description: Music Theory: Exam Score
Category: Advanced Placement Exams Variable: ITALGR Description: Italian Language and Culture: Exam Score
Category: Advanced Placement Exams Variable: PHYS2GR Description: Physics 2: Exam Score
Category: Advanced Placement Exams Variable: CPSTNRGR Description: AP Capstone Research: Exam Score
Category: Advanced Placement Exams Variable: WDHISTGR Description: World History: Exam Score
Category: Advanced Placement Exams Variable: GERLAGR Description: German Language and Culture: Exam Score
Category: Advanced Placement Exams Variable: ENGLITGR Description: English Literature and Composition: Exam Score
Category: Advanced Placement Exams Variable: ARTHISGR Description: Art History: Exam Score
Category: Advanced Placement Exams Variable: ECONMIGR Description: Microeconomics: Exam Score
Category: Advanced Placement Exams Variable: ENGLANGR Description: English Language and Composition: Exam Score
Category: Advanced Placement Exams Variable: CHEMGR Description: Chemistry: Exam Score
Category: Advanced Placement Exams Variable: STATGR Description: Statistics: Exam Score
Category: Advanced Placement Exams Variable: HUMGEOGR Description: Human Geography: Exam Score

Category: All Variable: ISD_NAME Description: Intermediate school district name (EEM).
Category: All Variable: SUBMITTING_SCHOOL_CODE Description:
Category: All Variable: START_YEAR Description: The calendar year in which the school year began.
Category: All Variable: MOECS_EDUCATOR_ID Description: A unique identification code assigned to educators when they enter their information in MOECS for the first time, to preserve the anonymity of the staff.
Category: All Variable: SCHOOL_OPEN_DATE Description: The actual opening date of the school or when it begins to do business.
Category: All Variable: ENTITY_TYPE Description: Description of Entity Type, which indicates whether the school/entity that the staff is assigned to is private, charter, public, state-run, etc.
Category: All Variable: OPERATIONAL_DISTRICT_CODE Description: The official state-assigned five-digit code denoting the district of the entity held accountable for the student's graduation status.
Category: All Variable: SCHOOL_NAME Description: The name as assigned to the building in the official Educational Entity Master (EEM).
Category: All Variable: EMPLOYED_DISTRICT_CODE Description: The official state-assigned five-digit code denoting the district that the educator is employed by.
Category: All Variable: EMPLOYED_ISD_CODE Description: The state-assigned five-digit number, as recorded in the Educational Entity Master (EEM), that identifies the intermediate school district (ISD) or educational service agency (ESA) in which the district or program is located. This is the ISD that the educator is employed by.
Category: All Variable: ASSIGNED_ISD_NAME Description: The name, as recorded in the Educational Entity Master (EEM), that identifies the intermediate school district (ISD) or educational service agency (ESA) in which the district or program is located. This is the ISD that the educator is assigned to.
Category: All Variable: SCHOOL_CLOSE_DATE Description: The actual closing date of the entity or the date the entity ceased to do business.
Category: All Variable: EMPLOYED_DISTRICT_NAME Description: The official district name denoting the district that the educator is employed by.
Category: All Variable: ASSIGNED_DISTRICT_CODE Description: The official state-assigned five-digit code denoting the district that the educator is assigned to.
Category: All Variable: BUILDING_CODE Description: The five-digit code as assigned to the building in the official Educational Entity Master (EEM). Note: This field has a value ONLY for data from General collection or SRM, but may be NULL for pre-K special ed.
Category: All Variable: DISTRICT_CODE Description: The state-assigned five-digit number, as recorded in the Educational Entity Master (EEM), which identifies the public school district responsible for providing education to the reported student. Note: This field has a value ONLY for data from General collection or SRM.
Category: All Variable: LAST_COLLECTION_NAME Description: The collection in which the student was last reported by any district.
Category: All Variable: FIRST_COLLECTION_NAME Description: First MSDS Collection in which the student was assigned to a graduation cohort.
Category: All Variable: LAST_COLLECTION_ID Description: Code associated with the collection in which the student was last reported by any district.
Category: All Variable: FIRST_COLLECTION_ID Description: Code associated with the first MSDS Collection in which the student was assigned to a graduation cohort.
Category: All Variable: COLLECTION_SCHOOL_YEAR Description: The academic school year in which the course was completed by the student.
Category: All Variable: SCHOOL_YEAR Description: The start and end years of the school year that the student attended / received services during the collection used for this record.
Category: All Variable: OPERATIONAL_ISD_NAME Description: The name, as recorded in the Educational Entity Master (EEM), that identifies the intermediate school district (ISD) or educational service agency (ESA) in which the district or program is located.
Category: All Variable: SCHOOL_CODE Description: The entity code assigned to the School or Facility (Building) held accountable for the student's graduation status.
Category: All Variable: OPERATIONAL_DISTRICT_NAME Description: The official district name denoting the district of the entity held accountable for the student's graduation status.
Category: All Variable: SCHOOL_NAME Description: The official name assigned to the School or Facility (Building) type of entity held accountable for the student's graduation status.
Category: All Variable: REPORTING_PERIOD Description: Description of the reporting period (school year, fall, spring) that the data represents.
Category: All Variable: RIC Description: The unique student identifier--Research Identification Code.
Category: All Variable: IHE_SUBMITTED_SESSION_END_DATE Description: The date identifying the end of the academic session as submitted by the college.
Category: All Variable: OPERATIONAL_DISTRICT_CODE Description: The state-assigned five-digit number, as recorded in the Educational Entity Master (EEM).
Category: All Variable: SCHOOL_YEAR Description: The school year that the data represents.
Category: All Variable: SESSION_END_DATE Description: The date identifying the end of the academic session as determined by CEPI, based on the IHE's submitted Session Begin and End dates. Prior to the availability of the session begin and end dates in 2016-17, NSC records used the enrollment end date as reported to NSC.
Category: All Variable: SESSION_NAME Description: The academic term for which the data are being reported, as determined by CEPI, based on the IHE's submitted Session Begin and End dates. Example: Fall
Category: All Variable: IPEDS_CODE Description: IPEDS (Integrated Postsecondary Education Data System) IHE identification code.
Category: All Variable: SESSION_BEGIN_DATE Description: The date identifying the start of the academic session as determined by CEPI, based on the IHE's submitted Session Begin and End dates. Prior to the availability of session begin and end dates in 2016-17, this field was derived from the submitted Session Designator, which was the session start month and year.
Category: All Variable: SCHOOL_YEAR_ENROLLED Description: The school year during which the student is enrolled. Example: 2016-2017
Category: All Variable: IHE_NAME Description: Institution of Higher Education (IHE) Official Name.
Category: All Variable: IHE_SUBMITTED_SESSION_BEGIN_DATE Description: The date identifying the start of the academic session as submitted by the college.
Category: All Variable: YEAR_SESSION Description: A combination of the school year enrolled and the session name. Example: 2011-2012 Fall
Category: All Variable: RIC Description: The unique student identifier--Research Identification Code.
Category: All Variable: SCHOOL_START_YEAR_ENROLLED Description: The numeric start year of the school year during which the student is enrolled. Example: 2016
Category: All Variable: OPERATIONAL_ISD_NAME Description: The name, as recorded in the Educational Entity Master (EEM), that identifies the Intermediate School District (ISD) or Educational Service Agency (ESA) in which the district or program is located.
Category: All Variable: OPERATIONAL_DISTRICT_NAME Description: The name, as recorded in the Educational Entity Master (EEM).
Category: All Variable: OPERATIONAL_ISD_CODE Description: The state-assigned two-digit number, as recorded in the Educational Entity Master (EEM), that identifies the Intermediate School District (ISD) or Educational Service Agency (ESA) in which the district or program is located.
Category: All Variable: EMPLOYED_ISD_NAME Description: The name, as recorded in the Educational Entity Master (EEM), that identifies the intermediate school district (ISD) or educational service agency (ESA) in which the district or program is located. This is the ISD that the educator is employed by.
Category: All Variable: OPEN_DATE Description: The actual opening date of the entity or when it begins to do business.
Category: All Variable: LONGITUDE Description: The longitude coordinate of the physical location of the entity.
Category: All Variable: LATITUDE Description: The latitude coordinate of the physical location of the entity.
Category: All Variable: COLLECTION Description: Collection Period of the observation. Example: EOY 2017
Category: All Variable: CLOSE_DATE Description: The actual closing date of the entity. Blank if Entity is open or data is unavailable.
Category: All Variable: START_YEAR Description: The numeric start year of the school year during which the data was collected. Example: 2016
Category: All Variable: DISTRICT_NAME Description: Operational District Name
Category: All Variable: DISTRICT_CODE Description: Operational District Code (i.e., the Employing District)
Category: All Variable: START_YEAR Description: The calendar year in which the school year began.
Category: All Variable: ISD_CODE Description: Operational Intermediate School District Code
Category: All Variable: ISD_NAME Description: Operational Intermediate School District Name
Category: All Variable: SCHOOL_CODE Description: The five-digit code as assigned to the building in the official Educational Entity Master (EEM). Otherwise known as Building Code. Building codes 00000 identify the administrative units.
Category: All Variable: ASSIGNED_DISTRICT_NAME Description: The official district name denoting the district that the educator is assigned to.