TED Ultimate Dataset


Size: 544.94M
Tags: NLP, Classification, Text Data, Recommender Systems


    README.md

## Context

TED is devoted to spreading powerful ideas on just about any topic. These datasets contain over 4,000 TED talks, including transcripts in many languages.

If you would like a dataset for a language that is not listed below, or in a different file format (JSON, SQL, etc.), please check out my Python module, [TEDscraper](https://github.com/corralm/TEDscraper).

## Languages

TED talks have been subtitled in over 100 languages. I've included datasets for these 12 languages:

| Code  | Language               |
|-------|------------------------|
| en    | English                |
| es    | Spanish                |
| pt-br | Portuguese (Brazilian) |
| fr    | French                 |
| it    | Italian                |
| zh-cn | Chinese (simplified)   |
| zh-tw | Chinese (traditional)  |
| ko    | Korean                 |
| ja    | Japanese               |
| tr    | Turkish                |
| ru    | Russian                |
| he    | Hebrew                 |

## Attributes

| Attribute      | Description                                    | Data Type  |
|----------------|------------------------------------------------|------------|
| talk_id        | Talk identification number provided by TED     | int        |
| title          | Title of the talk                              | string     |
| speaker_1      | First speaker in TED's speaker list            | string     |
| speakers       | Speakers in the talk                           | dictionary |
| occupations    | \*Occupations of the speakers                  | dictionary |
| about_speakers | \*Blurb about each speaker                     | dictionary |
| views          | Count of views                                 | int        |
| recorded_date  | Date the talk was recorded                     | string     |
| published_date | Date the talk was published to TED.com         | string     |
| event          | Event or medium in which the talk was given    | string     |
| native_lang    | Language the talk was given in                 | string     |
| available_lang | All available languages (lang_code) for a talk | list       |
| comments       | Count of comments                              | int        |
| duration       | Duration in seconds                            | int        |
| topics         | Related tags or topics for the talk            | list       |
| related_talks  | Related talks (key='talk_id', value='title')   | dictionary |
| url            | URL of the talk                                | string     |
| description    | Description of the talk                        | string     |
| transcript     | Full transcript of the talk                    | string     |

\*The dictionary key maps to the speaker in `speakers`.

## Meta

* Author: Miguel Corral Jr.
* Email: corraljrmiguel@gmail.com
* LinkedIn: https://www.linkedin.com/in/miguelcorraljr/
* GitHub: https://github.com/corralm

Distributed under the Creative Commons Attribution-NonCommercial 4.0 International license ([CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)).

## Inspiration

* Natural Language Processing
* Topic modeling
* Clustering
* Recommender system
* Classification
* Regression
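
The note in the attribute table, that each dictionary key in `occupations` and `about_speakers` maps to the same key in `speakers`, can be illustrated with a minimal sketch. It rests on several assumptions that the README does not confirm: that the dictionary and list columns arrive as Python-literal strings (as they often do after a CSV round trip), that the keys are integers, and that each `occupations` value is a list. The row below is invented for illustration and is not taken from the dataset; `parse_literal` and `speaker_profiles` are hypothetical helper names.

```python
import ast

# Hypothetical row shaped like the attribute table above.
# All values are invented for illustration, not real dataset content.
row = {
    "talk_id": 1,
    "speaker_1": "Speaker A",
    "speakers": "{0: 'Speaker A', 1: 'Speaker B'}",
    "occupations": "{0: ['writer'], 1: ['engineer', 'educator']}",
    "topics": "['technology', 'education']",
    "duration": 725,
}


def parse_literal(value):
    """Turn a stringified dict or list into a Python object.

    Assumption: such columns are stored as Python-literal strings.
    Non-string values are returned unchanged.
    """
    if isinstance(value, str):
        return ast.literal_eval(value)
    return value


def speaker_profiles(row):
    """Pair each speaker name with their occupations via the shared key."""
    speakers = parse_literal(row["speakers"])
    occupations = parse_literal(row["occupations"])
    return {name: occupations.get(key, []) for key, name in speakers.items()}


profiles = speaker_profiles(row)
topics = parse_literal(row["topics"])

# `duration` is given in seconds; split it into minutes and seconds.
minutes, seconds = divmod(row["duration"], 60)

print(profiles)
print(topics)
print(f"{minutes} min {seconds} s")
```

`ast.literal_eval` is used rather than `eval` because it only accepts Python literals and does not execute arbitrary code. If the files are loaded with pandas, the same helper could be passed through the `converters` argument of `read_csv`, assuming the columns really are stored as literal strings.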