Problem Statement
A car's noise level can be an indicator of both its condition and its manufacturing quality. Drivers can use noise level to judge whether a prospective car suits their needs, or whether their current car is in a healthy state. Manufacturers, in turn, can use noise level to assess the quality of their cars relative to the market. Luxury cars compete on low noise levels, while sports cars usually neglect this factor. In this project we compile data from different sources to arrive at a dataset that maps each car's manufacturing specifications to its noise level at different speeds. The compiled dataset can be used to evaluate cars' noise levels or to analyze which technical specifications have the largest effect on them.
For more information about automobile noise level:
- [An overview of automobile noise and vibration control](https://www.researchgate.net/publication/270775858_An_overview_of_automobile_noise_and_vibration_control)
- [Noise, vibration, and harshness From Wikipedia](https://en.wikipedia.org/wiki/Noise,_vibration,_and_harshness)
---
Executive Summary
Initially we scrape data from https://www.auto-decibel-db.com (hereafter referred to as AD). This website has nearly 2000 data entries about cars' cabin noise levels. Each car on the website has its cabin noise (measured in decibels) recorded at different speeds. The website doesn't provide further information about the source or methodology of its data, yet it's the most comprehensive data on the subject we could find. Another source, which might be used for verification, is https://www.edmunds.com. While edmunds.com states its methodology for measuring noise level, its data is embedded in PDF files and is less comprehensive than AD's.
After scraping the cars' noise levels, we use the information available about each car to find its specifications. The dataset scraped from AD has 4 features that can be used to identify the same car in other datasets: brand, model, year, and spec. After searching the Web for websites and APIs with detailed and comprehensive car data, we decided on the http://www.carqueryapi.com API (hereafter referred to as CQA). Though it is not accurate for some cars, and its spelling differs from AD's, it's the most accessible data we could find. In this section we map each car in AD to its equivalent in CQA using 3 features: brand, model, and year. We first determine the car's model_id in CQA, and then use that model_id to retrieve the car's full specification. Due to the limit imposed by caranddriver.com on the number of requests (60 requests), we used a Tor bridge to alternate IP addresses.
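The matching step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the CQA records are already downloaded into a list of dictionaries, and the field names `model_make_id`, `model_name`, and `model_year`, as well as the helper names `normalize`, `build_index`, and `find_model_id`, are assumptions made for the example. Normalizing names before comparing them is one simple way to absorb the spelling differences between AD and CQA.

```python
import re
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents, and drop everything except letters and digits."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]", "", ascii_name.lower())


def build_index(cqa_records):
    """Map (brand, model, year) to the first model_id seen for that key.

    The record field names are assumed for illustration.
    """
    index = {}
    for rec in cqa_records:
        key = (
            normalize(rec["model_make_id"]),
            normalize(rec["model_name"]),
            int(rec["model_year"]),
        )
        index.setdefault(key, rec["model_id"])
    return index


def find_model_id(ad_row, index):
    """Return the CQA model_id for an AD row, or None when there is no match."""
    key = (
        normalize(ad_row["brand"]),
        normalize(ad_row["model"]),
        int(ad_row["year"]),
    )
    return index.get(key)
```

An exact match on normalized names will still miss cars whose names differ by more than punctuation, accents, or letter case, which is consistent with only part of AD being matched.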
Finally, we look up the full specification of each car in CQA using its model_id. In this section we added 60 specification features for nearly 1,000 of the cars in AD. Each feature pulled from CQA is marked by the suffix '_cqa' appended to its column name. In the end we succeeded in getting specifications for 1067 cars out of the 1895 in AD. We couldn't find specifications for all cars in AD because either the car is named differently in AD and CQA, or the car doesn't exist in CQA.
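The suffixing and joining described above can be sketched as below. It is only an illustration under stated assumptions: AD rows are plain dictionaries that already carry a `model_id` key (or `None` when no match was found), and the helper names `add_suffix` and `attach_specs` are hypothetical, not taken from the project.

```python
def add_suffix(spec: dict, suffix: str = "_cqa") -> dict:
    """Append the suffix to every key so CQA columns stay distinguishable."""
    return {f"{key}{suffix}": value for key, value in spec.items()}


def attach_specs(ad_rows, specs_by_model_id):
    """Left-join CQA specs onto AD rows; unmatched rows are kept unchanged.

    Returns the merged rows and the number of rows that received a spec.
    """
    merged = []
    matched = 0
    for row in ad_rows:
        out = dict(row)  # copy so the input rows are not mutated
        spec = specs_by_model_id.get(row.get("model_id"))
        if spec is not None:
            out.update(add_suffix(spec))
            matched += 1
        merged.append(out)
    return merged, matched
```

With the figures reported above, the match rate is 1067 / 1895, or roughly 56% of the cars in AD.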
---
Web scraping sources and API
[auto-decibel-db.com](https://www.auto-decibel-db.com): This website has nearly 2000 data entries about cars' cabin noise levels. Each car on the website has its cabin noise (measured in decibels) recorded at different speeds.
[carqueryapi.com](http://www.carqueryapi.com): A JSON-based API for retrieving detailed car and truck information, including year, make, model, trim, and specifications. It has 73419 vehicles in its database.
---
Open questions to explore
Which technical specifications have the largest effect on car noise level?
What patterns can be observed in car noise level across years, brands, and specifications?
Can we use a car's noise level to infer its brand, year, or specifications?