Problem Description
============
- [About the data](#about-the-data)
- [Target Variable](#target-variable)
- [Submission Format](#submission-format)
- [Performance Metric](#performance-metric)
- [Features](#features)
- [Categories](#categories)
- [Example Row](#example-row)
## About the data
Your goal is to predict a student's earnings a set number of years after they have enrolled in United States institutions of higher education. The data is compiled from a wide range of sources and [made publicly available by the United States Department of Education](https://collegescorecard.ed.gov/).
## Target Variable
We're trying to predict the variable `income`, which represents earnings **in thousands of US dollars** a set interval
from when the student first enrolled.
## Submission Format
The format for the submission file is two columns with the `row_id` and the `income`. The data type of `income` is a float, **so make sure there is a decimal point in your submission**. For example `0.0` is a valid float. `0` is _not_.
For example, if you predicted...
| row_id | income |
|--------|--------|
| 2 | 0.0 |
| 8 | 0.0 |
| 9 | 0.0 |
| 10 | 0.0 |
| 11 | 0.0 |
The first few lines of the `.csv` file that you submit would look like:

    row_id,income
    2,0.0
    8,0.0
    9,0.0
    10,0.0
    11,0.0
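A minimal sketch of producing a file in this shape with Python's standard `csv` module. The helper name `write_submission`, its `decimals` parameter, and the sample predictions are hypothetical; only the two column names and the decimal-point requirement come from the description above.

```python
import csv
import io


def write_submission(predictions, out, decimals=1):
    """Write (row_id, income) pairs as CSV with a decimal point in every income.

    `decimals` is an illustrative choice, not something the competition specifies.
    """
    if decimals < 1:
        raise ValueError("decimals must be at least 1 so a decimal point is written")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["row_id", "income"])
    for row_id, income in predictions:
        # Fixed-point formatting guarantees "0.0" rather than "0".
        writer.writerow([int(row_id), f"{float(income):.{decimals}f}"])


# Hypothetical predictions, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_submission([(2, 0), (8, 0.0), (9, 31.25)], buffer, decimals=2)
print(buffer.getvalue())
```

Passing an open file object (for example `open("submission.csv", "w", newline="")`) in place of the buffer would write the same content to disk.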
## Performance Metric
We're predicting a numeric quantity, so this is a regression problem. To measure regression performance, we'll use a metric called root-mean-squared error (RMSE). It is an error metric, so a lower value is better (as opposed to an accuracy metric, where a higher value is better).
$$RMSE = \sqrt{\frac{1}{N} \sum_{n=1}^{N} (\hat{y}_n - y_n)^2}$$
Where $\hat{y}_n$ is the predicted earnings for row $n$, $y_n$ is the actual earnings, and $N$ is the number of rows being scored. The best possible score is 0, but the worst possible score can be infinite.
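The metric can be computed directly from its definition. The following is a minimal sketch in plain Python; the function name `rmse` and the sample values are illustrative only and are not part of the competition materials.

```python
import math


def rmse(predicted, actual):
    """Root-mean-squared error between two equal-length sequences of numbers."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Errors of -3, -4, 0 and 0 give a mean squared error of 25 / 4 = 6.25.
print(rmse([30.0, 40.0, 50.0, 60.0], [33.0, 44.0, 50.0, 60.0]))  # 2.5
```

Because the errors are squared before averaging, a single large miss raises the score more than several small ones of the same total size.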
## Features
There are 297 variables in this dataset. Each row in the dataset represents a United States institution of higher education in a specific year. The dataset we are working with covers four particular years, denoted `year_a`, `year_f`, `year_w`, and `year_z` in our dataset. An institution may have a row for all, some, or just one of the years. We don't provide a unique identifier for an individual institution, just a `row_id` for each row.
The variables in the dataset have names of the form `category__variable`, where `category` is the high-level category of the variable (e.g. `academics` or `students`) and `variable` is what the specific column contains.
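As a rough illustration of this naming scheme, the sketch below splits column names on the double underscore and groups them by category. The helper `group_by_category` is hypothetical, and the full column names are assumed by combining the `academics` category with variable names listed in this document.

```python
from collections import defaultdict


def group_by_category(columns):
    """Split `category__variable` names; return (groups, names without a category)."""
    groups = defaultdict(list)
    ungrouped = []
    for name in columns:
        category, separator, variable = name.partition("__")
        if separator:
            groups[category].append(variable)
        else:
            # Columns such as row_id carry no category prefix.
            ungrouped.append(name)
    return dict(groups), ungrouped


# Assumed example column names, built from the scheme described above.
columns = [
    "row_id",
    "academics__program_assoc_agriculture",
    "academics__program_bachelors_history",
    "income",
]
groups, ungrouped = group_by_category(columns)
print(groups)
print(ungrouped)
```

`str.partition` splits only at the first `__`, so single underscores inside a variable name are left intact.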
### Categories
- **`academics`**
- `program_assoc_agriculture`: Associate degree in Agriculture, Agriculture Operations, And Related Sciences.
- `program_assoc_architecture`: Associate degree in Architecture And Related Services.
- `program_assoc_biological`: Associate degree in Biological And Biomedical Sciences.
- `program_assoc_business_marketing`: Associate degree in Business, Management, Marketing, And Related Support Services.
- `program_assoc_communication`: Associate degree in Communication, Journalism, And Related Programs.
- `program_assoc_communications_technology`: Associate degree in Communications Technologies/Technicians And Support Services.
- `program_assoc_computer`: Associate degree in Computer And Information Sciences And Support Services.
- `program_assoc_construction`: Associate degree in Construction Trades.
- `program_assoc_education`: Associate degree in Education.
- `program_assoc_engineering`: Associate degree in Engineering.
- `program_assoc_engineering_technology`: Associate degree in Engineering Technologies And Engineering-Related Fields.
- `program_assoc_english`: Associate degree in English Language And Literature/Letters.
- `program_assoc_ethnic_cultural_gender`: Associate degree in Area, Ethnic, Cultural, Gender, And Group Studies.
- `program_assoc_family_consumer_science`: Associate degree in Family And Consumer Sciences/Human Sciences.
- `program_assoc_health`: Associate degree in Health Professions And Related Programs.
- `program_assoc_history`: Associate degree in History.
- `program_assoc_humanities`: Associate degree in Liberal Arts And Sciences, General Studies And Humanities.
- `program_assoc_language`: Associate degree in Foreign Languages, Literatures, And Linguistics.
- `program_assoc_legal`: Associate degree in Legal Professions And Studies.
- `program_assoc_library`: Associate degree in Library Science.
- `program_assoc_mathematics`: Associate degree in Mathematics And Statistics.
- `program_assoc_mechanic_repair_technology`: Associate degree in Mechanic And Repair Technologies/Technicians.
- `program_assoc_military`: Associate degree in Military Technologies And Applied Sciences.
- `program_assoc_multidiscipline`: Associate degree in Multi/Interdisciplinary Studies.
- `program_assoc_parks_recreation_fitness`: Associate degree in Parks, Recreation, Leisure, And Fitness Studies.
- `program_assoc_personal_culinary`: Associate degree in Personal And Culinary Services.
- `program_assoc_philosophy_religious`: Associate degree in Philosophy And Religious Studies.
- `program_assoc_physical_science`: Associate degree in Physical Sciences.
- `program_assoc_precision_production`: Associate degree in Precision Production.
- `program_assoc_psychology`: Associate degree in Psychology.
- `program_assoc_public_administration_social_service`: Associate degree in Public Administration And Social Service Professions.
- `program_assoc_resources`: Associate degree in Natural Resources And Conservation.
- `program_assoc_science_technology`: Associate degree in Science Technologies/Technicians.
- `program_assoc_security_law_enforcement`: Associate degree in Homeland Security, Law Enforcement, Firefighting And Related Protective Services.
- `program_assoc_social_science`: Associate degree in Social Sciences.
- `program_assoc_theology_religious_vocation`: Associate degree in Theology And Religious Vocations.
- `program_assoc_transportation`: Associate degree in Transportation And Materials Moving.
- `program_assoc_visual_performing`: Associate degree in Visual And Performing Arts.
- `program_bachelors_agriculture`: Bachelor's degree in Agriculture, Agriculture Operations, And Related Sciences.
- `program_bachelors_architecture`: Bachelor's degree in Architecture And Related Services.
- `program_bachelors_biological`: Bachelor's degree in Biological And Biomedical Sciences.
- `program_bachelors_business_marketing`: Bachelor's degree in Business, Management, Marketing, And Related Support Services.
- `program_bachelors_communication`: Bachelor's degree in Communication, Journalism, And Related Programs.
- `program_bachelors_communications_technology`: Bachelor's degree in Communications Technologies/Technicians And Support Services.
- `program_bachelors_computer`: Bachelor's degree in Computer And Information Sciences And Support Services.
- `program_bachelors_construction`: Bachelor's degree in Construction Trades.
- `program_bachelors_education`: Bachelor's degree in Education.
- `program_bachelors_engineering`: Bachelor's degree in Engineering.
- `program_bachelors_engineering_technology`: Bachelor's degree in Engineering Technologies And Engineering-Related Fields.
- `program_bachelors_english`: Bachelor's degree in English Language And Literature/Letters.
- `program_bachelors_ethnic_cultural_gender`: Bachelor's degree in Area, Ethnic, Cultural, Gender, And Group Studies.
- `program_bachelors_family_consumer_science`: Bachelor's degree in Family And Consumer Sciences/Human Sciences.
- `program_bachelors_health`: Bachelor's degree in Health Professions And Related Programs.
- `program_bachelors_history`: Bachelor's degree in History.
- `program_bachelors_humanities`: Bachelor's degree in Liberal Arts And Sciences, General Studies And Humanities.
- `program_bachelors_language`: Bachelor's degree in Foreign Langu