Prof. D. Petkovic (SFSU) Petkovic '@' sfsu.edu; Prof. Rainer Todtenhoefer (Fulda University, Germany); Prof. Shihong Huang (FAU)
Data Set Information:
The data can be used to predict student learning of software engineering (SE) teamwork from observations of team activity.
**** README FILE from the submitted data ZIP ****
# San Francisco State University
# Software Engineering Team Assessment and Prediction (SETAP) Project
# Machine Learning Training Data File Version 0.7
# ====================================================================
#
# Copyright 2000-2017 by San Francisco State University, Dragutin
# Petkovic, and Marc Sosnick-Perez.
#
# CONTACT
# -------
# Professor Dragutin Petkovic: petkovic '@' sfsu.edu
#
# LICENSE
# -------
# This data is released under the Creative Commons Attribution-
# NonCommercial 4.0 International license. For more information,
# please see
# [Web link].
#
# The research that has made this data possible has been funded in
# part by NSF grant NSF-TUES1140172.
#
# YOUR FEEDBACK IS WELCOME
# ------------------------
# We are interested in how this data is being used. If you use it in
# a research project, we would like to know how you are using the
# data. Please contact us at petkovic '@' sfsu.edu.
#
#
# FILES INCLUDED IN DISTRIBUTION PACKAGE
# ======================================
# This archive contains the data collected by the SETAP Project.
#
#
# More data about the SETAP project, data collection, and description
# and use of machine learning to analyze the data can be found in the
# following paper:
#
# D. Petkovic, M. Sosnick-Perez, K. Okada, R. Todtenhoefer, S. Huang,
# N. Miglani, A. Vigil: 'Using the Random Forest Classifier to Assess
# and Predict Student Learning of Software Engineering Teamwork'.
# Frontiers in Education FIE 2016, Erie, PA, 2016
#
#
#
# See DATA DESCRIPTION below for more information about the data. The
# README file (which you are reading) contains project information
# such as data collection techniques, data organization and field
# naming convention. In addition to the README file, the archive
# contains a number of .csv files. Each of these CSV files contains
# data aggregated by team from the project (see below), paired with
# that team's outcome for either the process or product component of
# the team's evaluation. The files are named using the following
# convention:
#
# setap[Process|Product]T[1-11].csv
#
# For example, the file setapProcessT5.csv contains the data for all
# teams for time interval 5, paired with the outcome data for the
# Process component of the team's evaluation.
#
# Detailed information about the exact format of the .csv file may be
# found in the csv files themselves.
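#
# The naming convention above can be sketched in Python as follows.
# This is only an illustration: the helper names setap_filenames and
# parse_setap_filename are hypothetical and are not part of the
# distributed archive.

```python
import re

# Naming convention from the README: setap[Process|Product]T[1-11].csv
OUTCOMES = ("Process", "Product")
INTERVALS = range(1, 12)  # T1 through T11, inclusive

_NAME_RE = re.compile(r"^setap(Process|Product)T(\d+)\.csv$")


def setap_filenames():
    """List every file name implied by the naming convention."""
    return [f"setap{outcome}T{t}.csv" for outcome in OUTCOMES for t in INTERVALS]


def parse_setap_filename(name):
    """Split a file name into (outcome, time_interval)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a SETAP data file name: {name!r}")
    outcome, interval = match.group(1), int(match.group(2))
    if interval not in INTERVALS:
        raise ValueError(f"time interval out of range: {interval}")
    return outcome, interval
```

# Under this reading of the convention the archive would hold 22 CSV
# files, and setapProcessT5.csv would parse to the Process outcome for
# time interval 5.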
#
#
# DATA DESCRIPTION
# ====================================================================
# The following is a detailed description of the data contained in the
# accompanying files.
#
# INTRODUCTION
# ------------
#
# The data contained in these files were collected over a period of
# several semesters from students engaged in software engineering
# classes at San Francisco State University (class sections of CSC
# 640, CSC 648 and CSC 848). All students consented to this data
# being shared for research purposes provided no uniquely identifiable
# information was contained in the distributed files. The information
# was collected through various means, with emphasis being placed on
# the collection of objective, quantifiable information. For more
# information on the data collection procedures, please see the paper
# referenced above.
#
#
# PRIVACY
# -------
# The data contained in this file do not include any information that
# can be traced to an individual student who participated in the
# study.
#
#
# BRIEF DESCRIPTION OF DATA SOURCES AND DERIVATIONS
# -------------------------------------------------
# SAMs (Student Activity Measures) are collected for each student team
# member during their participation in a software engineering class.
# Student teams work together on a final class project, and comprise
# 5-6 students. Teams that are made up of students from only one
# school are labeled local teams. Teams made up of students from more
# than one school are labeled global teams. SAMs are collected from:
# weekly timecards, instructor observations, and software engineering
# tool usage logs. SAMs are then aggregated by team and time interval
# (see next section) into TAMs (Team Activity Measures). Outcomes are
# determined at the end of the semester through evaluation of student
# team work in two categories: software engineering process (how well
# the team applied best software engineering practices), and software
# engineering product (the quality of the finished product the team
# produced). Thus for each team, two outcomes are determined, process
# and product, respectively. Outcomes are classified into two class
# grades, A or F. A represents teams that are at or above
# expectations, F represents teams that are below expectations or need
# attention. For more information, please see the paper referenced
# above.
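#
# As a rough illustration of the SAM-to-TAM aggregation described
# above, the sketch below groups per-student records by team and time
# interval. The record layout, the "hours" measure, and the summary
# statistics are all assumptions made for this example; the actual
# measures and aggregation functions are defined in the CSV files and
# the referenced paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical SAM records: (team_id, student_id, time_interval, hours).
# These values are invented for illustration and are not SETAP data.
sams = [
    ("team1", "s1", 1, 4.0),
    ("team1", "s2", 1, 6.0),
    ("team1", "s1", 2, 3.0),
    ("team2", "s3", 1, 5.0),
]


def aggregate_tams(records):
    """Aggregate per-student measures into one summary per (team, interval)."""
    grouped = defaultdict(list)
    for team, _student, interval, hours in records:
        grouped[(team, interval)].append(hours)
    return {
        key: {
            "total_hours": sum(values),
            "average_hours": mean(values),
            "members_reporting": len(values),
        }
        for key, values in grouped.items()
    }


tams = aggregate_tams(sams)
```

# Each resulting team-level record would then be paired with that
# team's process outcome or product outcome (A or F) to form one
# training example.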
#
# The SE process and SE product outcomes represent ML training classes
# and are to be considered separately, i.e. one should train ML for SE
# process separately from training for SE product.
#
# TIME INTERVALS FOR WHICH DATA IS COLLECTED
# ------------------------------------------
# Data collected continuously throughout the semester are aggregated
# into different time intervals for the semester's project reflecting
# different dynamics of teamwork during the class. Time intervals
# represent time periods in which a milestone was developed by each
# team. A milestone represents a major deliverable point in the class
# for all student teams. The milestones are roughly divided into the
# following topics:
#
# M1 - high level requirements and specs
# M2 - more detailed requirements and specs
# M3 - first prototype
# M4 - beta release
# M5 - final delivery
#
# Time intervals are combinations of the time in which milestones are
# being produced. Time intervals are used in research only.
#
# In addition to time intervals corresponding to milestones, a number
# of time intervals combining multiple T1-T5 time intervals have been
# calculated. This was done to group student activities into design
# vs. implementation phases which have different dynamics.
#
# These time intervals are defined as follows:
#
# Time Interval      Corresponding Milestone Periods in Class
# -----------------  --------------------------------------------
# 0                  Milestone 0
# 1                  Milestone 1
# 2                  Milestone 2
# 3                  Milestone 3
# 4                  Milestone 4
# 5                  Milestone 5
# 6                  Milestone 1 - Milestone 2 inclusive
# 7                  Milestone 1 - Milestone 3 inclusive
# 8                  Milestone 1 - Milestone 4 inclusive
# 9                  Milestone 1 - Milestone 5 inclusive
# 10                 Milestone 4 - Milestone 5 inclusive
# 11                 Milestone 3 - Milestone 5 inclusive
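#
# The table above can be encoded as a simple lookup. The sketch below
# is one possible representation; the names TIME_INTERVALS,
# milestones_for_interval and intervals_covering are hypothetical.

```python
# Time interval -> (first milestone, last milestone), both inclusive,
# transcribed from the table in this README.
TIME_INTERVALS = {
    0: (0, 0),
    1: (1, 1),
    2: (2, 2),
    3: (3, 3),
    4: (4, 4),
    5: (5, 5),
    6: (1, 2),
    7: (1, 3),
    8: (1, 4),
    9: (1, 5),
    10: (4, 5),
    11: (3, 5),
}


def milestones_for_interval(interval):
    """Return the milestone numbers spanned by a time interval."""
    first, last = TIME_INTERVALS[interval]
    return list(range(first, last + 1))


def intervals_covering(milestone):
    """Return every time interval whose span includes the given milestone."""
    return sorted(
        t for t, (first, last) in TIME_INTERVALS.items() if first <= milestone <= last
    )
```

# For example, milestones_for_interval(11) yields milestones 3, 4 and
# 5, and intervals_covering(4) yields time intervals 4, 8, 9, 10 and 11.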
#
#
#
# SETAP PROJECT OVERALL DATA STATISTICS
# ===========================================================