OpenSDP Faketucky College-Going Dataset (Stata and R)

OpenSDP/faketucky

Description

The Faketucky synthetic college-going analysis file contains high school and college outcome data for two graduating cohorts of approximately 40,000 students. There are no real children in the dataset, but it mirrors the relationships between variables present in real data.

Faketucky is a demonstration of using machine learning routines to develop synthetic data based on real datasets. It was developed as an offshoot of the Strategic Data Project's college-going diagnostic for Kentucky, using the R synthpop package. Synthetic datasets like Faketucky can be shared freely for teaching and collaboration, and they can be used to test hypotheses before applying for permission to use confidential data.

Contents

This repository contains the following files:

  • faketucky.dta is a college-going analysis data file in Stata format
  • faketucky.rda is the same data in R format
  • faketucky_codebook.txt contains variable names and descriptions
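As a minimal sketch of getting started with the R file listed above: the helper below (`read_rda` is a hypothetical name, not part of this repository) loads an `.rda` file into a private environment and returns the first object it finds. The README does not state what the object inside `faketucky.rda` is called, so the sketch avoids assuming a name, and it assumes the file sits in the working directory.

```r
# Hypothetical helper: load an .rda file without assuming the name of the
# object stored inside it, and return the first object found.
read_rda <- function(path) {
  env <- new.env()
  # load() returns the names of the objects it restored into `env`
  object_names <- load(path, envir = env)
  get(object_names[1], envir = env)
}

# Assumes faketucky.rda has been downloaded to the working directory.
if (file.exists("faketucky.rda")) {
  faketucky <- read_rda("faketucky.rda")
  # Preview the structure of the first few variables
  str(faketucky, list.len = 10)
}
```

Variable names and meanings can then be checked against faketucky_codebook.txt. The Stata file could likely be read in R as well, for example with `haven::read_dta("faketucky.dta")` if the haven package is installed.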

About

These materials were originally authored by the Strategic Data Project.

OpenSDP is an online, public repository of analytic code, tools, and training intended to foster collaboration among education analysts and researchers in order to accelerate the improvement of our school systems. The community is hosted by the Strategic Data Project, an initiative of the Center for Education Policy Research at Harvard University. We welcome contributions and feedback.
