dataset_demographics {ouladFormat} | R Documentation |
Loads and formats the student demographic data set from the OULAD for data analysis.
dataset_demographics(
module = c("All", "AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG"),
presentation = c("2013B", "2014B", "2013J", "2014J", "All", "Summer", "Winter"),
repeat_students = c("remove", "keep")
)
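The usage signature above gives each argument's allowed values as a character vector. In R, such defaults are commonly resolved with `match.arg()`, which picks the first listed value when the argument is omitted. Whether `dataset_demographics()` does this internally is an assumption. The sketch below uses a hypothetical stand-in, `resolve_args()`, that only mirrors the documented signature.

```r
# Hypothetical stand-in that mirrors the documented signature.
# It is NOT the ouladFormat implementation.
resolve_args <- function(
    module = c("All", "AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG"),
    presentation = c("2013B", "2014B", "2013J", "2014J", "All", "Summer", "Winter"),
    repeat_students = c("remove", "keep")) {
  # match.arg() returns the first choice when the argument is missing
  # and raises an error for a value outside the listed choices.
  list(
    module = match.arg(module),
    presentation = match.arg(presentation),
    repeat_students = match.arg(repeat_students)
  )
}

# No arguments supplied: each one falls back to the first listed value.
defaults <- resolve_args()

# Explicit values are validated against the listed choices.
chosen <- resolve_args(module = "BBB", presentation = "2013J")
```

If the package follows this convention, passing each argument explicitly (as in the example call at the end of this page) avoids relying on the order of the listed choices.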
module |
name of the module to be included, either |
presentation |
name of the semester of the module to be included, either |
repeat_students |
indicator of whether students who had previous attempts at the module should be removed, either |
Returns one tibble (object of class tbl_df), called 'studentInfo', based on the OULAD studentInfo.csv file and the specified inputs (module, presentation, and repeat_students). The tibble consists of 12 columns (Kuzilek et al., 2017):
code_module - module identification code.
code_presentation - module presentation identification code.
id_student - the unique student identification number.
gender - student’s gender, either Male or Female.
region - the geographic region where the student lived while taking the module-presentation.
highest_education - the highest student education level on entry to the module presentation.
imd_band - the index of multiple deprivation band of the place where the student lived during the module-presentation.
age_band - the student’s age band.
num_of_prev_attempts - the number of times the student has attempted this module previously.
studied_credits - the total number of credits for the modules the student is currently studying.
disability - indicates whether the student has declared a disability.
final_result - student’s final result in the module-presentation.
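As a rough illustration of how the columns above relate to the function's inputs, the following base R sketch filters a toy data frame that uses a subset of the documented column names. The rows are invented, and the interpretation of repeat_students = "remove" as dropping rows with num_of_prev_attempts greater than zero is an assumption, not the package's confirmed behaviour.

```r
# Toy rows using documented column names; all values are invented for illustration.
student_info <- data.frame(
  code_module          = c("BBB", "BBB", "BBB", "AAA", "BBB"),
  code_presentation    = c("2013J", "2013J", "2014B", "2013J", "2013J"),
  id_student           = c(1001L, 1002L, 1003L, 1004L, 1005L),
  num_of_prev_attempts = c(0L, 1L, 0L, 0L, 0L),
  final_result         = c("Pass", "Fail", "Withdrawn", "Pass", "Distinction"),
  stringsAsFactors     = FALSE
)

# Keep one module-presentation and (assumption) drop students with prior attempts.
keep <- student_info$code_module == "BBB" &
  student_info$code_presentation == "2013J" &
  student_info$num_of_prev_attempts == 0

subset_bbb <- student_info[keep, ]

# Tabulate final results for the retained students.
result_counts <- table(subset_bbb$final_result)
```

The same column names can be used on the returned tibble, for example to tabulate final_result by any of the demographic columns.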
Kuzilek, J., Hlosta, M., & Zdrahal, Z. (2017). Open university learning analytics dataset. Scientific Data, 4, 1–8. https://doi.org/10.1038/sdata.2017.171
dataset_demographics(module = "BBB", presentation = "2013J", repeat_students = "remove")