Twitter Friends and hashtags
This dataset is an extract of a wider database that collects Twitter users' *friends* (the other accounts a user follows). The overall goal is to study users' interests through who they follow and how that connects to the hashtags they have used.
It is a list of Twitter user profiles.
In the JSON format, each Twitter user is stored as one object in a list of more than 40,000 objects. Each object holds:
- avatar : URL to the profile picture
- followerCount : the number of followers of this user
- friendsCount : the number of accounts this user follows.
- friendName : the *@name* (without the '@') of the user (beware: the user can change this name)
- id : user ID; this number cannot change (you can retrieve the screen name with this service : [https://tweeterid.com/][1])
- friends : **the list of IDs of the accounts this user follows**
- lang : the language declared by the user (in this dataset there is only "en" (English))
- lastSeen : the timestamp of the date when this user posted their last tweet.
- tags : the hashtags (with or without #) used by the user. These are the "trending topics" the user tweeted about.
- tweetID : ID of the last tweet posted by this user.
The data is also available in CSV format, which uses the same naming convention.
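As a rough illustration of the record layout, here is a minimal Python sketch that parses a JSON list of user objects and indexes it by `id`. The sample record is invented for the example, and the value types (for instance IDs as strings) are assumptions, not a statement about the real files. The helper `normalize_tag` is hypothetical; it is only there because tags may come with or without a leading `#`.

```python
import json

# Hypothetical sample using the field names from the list above.
# All values are invented, and the value types are assumptions.
sample = """[
  {"avatar": "http://example.com/a.png",
   "followerCount": 250,
   "friendsCount": 180,
   "friendName": "alice",
   "id": "1001",
   "friends": ["1002", "1003"],
   "lang": "en",
   "lastSeen": 1472271687519,
   "tags": ["#python", "data"],
   "tweetID": "769310701580083200"}
]"""


def normalize_tag(tag):
    """Lower-case a hashtag and drop any leading '#'."""
    return tag.lstrip("#").lower()


users = json.loads(sample)

# Index users by ID (cast to str so lookups work whatever the stored type).
by_id = {str(u["id"]): u for u in users}

# Collect the distinct normalized hashtags across all users.
tags = sorted({normalize_tag(t) for u in users for t in u["tags"]})
```

Indexing by `id` rather than `friendName` follows from the note above: the ID cannot change, while the screen name can.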
These users were selected because they tweeted on Twitter *trending topics*. I kept only users that have at least 100 followers and follow at least 100 other accounts (in order to filter out spam and non-informative/empty accounts).
This dataset was built by Hubert Wassner (me) using the Twitter public API. More data can be obtained on request (*hubert.wassner AT gmail.com*); at this time I have collected over 5 million profiles in different languages. More information can be found here (in French only): [http://wassner.blogspot.fr/2016/06/recuperer-des-profils-twitter-par.html][2]
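The selection rule can be sketched as a simple predicate. This is not the code used to build the dataset, only an illustration. It assumes `followerCount` is the number of followers and `friendsCount` is the number of accounts followed. The function name `keep_user` and the sample records are hypothetical.

```python
def keep_user(user, min_followers=100, min_friends=100):
    """Return True if the user passes both thresholds.

    int() is used so the check works whether the counts are stored
    as numbers or as numeric strings (an assumption about the data).
    """
    return (int(user["followerCount"]) >= min_followers
            and int(user["friendsCount"]) >= min_friends)


# Invented records for illustration only.
candidates = [
    {"friendName": "alice", "followerCount": 250, "friendsCount": 180},
    {"friendName": "spam_bot", "followerCount": 3, "friendsCount": 4000},
    {"friendName": "lurker", "followerCount": 120, "friendsCount": "12"},
]

kept = [u["friendName"] for u in candidates if keep_user(u)]
```

Raising either threshold on the published extract gives a stricter subset; lowering them has no effect, since users below 100 were already removed.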
Past Research
No public research has been done on this dataset so far.
I built a private application, described here (in French): [http://wassner.blogspot.fr/2016/09/twitter-profiling.html][3], which uses the full dataset (millions of full profiles).
One can analyse a lot of things with this dataset:
- stats about followers & followings
- manifold learning or unsupervised learning from friend list
- hashtag prediction from friend list
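As one possible starting point for the last two ideas, the sketch below compares users by the Jaccard similarity of their friend lists and guesses a user's hashtags from the most similar other user (a 1-nearest-neighbour baseline). This is an illustrative technique chosen for the example, not the method used in the private application. The records and the names `jaccard` and `predict_tags` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two friend lists: |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty friend lists: define similarity as 0
    return len(set_a & set_b) / len(union)


def predict_tags(target, others):
    """Return the tags of the user whose friend list is most similar."""
    best = max(others, key=lambda u: jaccard(target["friends"], u["friends"]))
    return best["tags"]


# Invented records for illustration only.
users = [
    {"id": "1", "friends": ["10", "11", "12"], "tags": ["python"]},
    {"id": "2", "friends": ["11", "12", "13"], "tags": ["python"]},
    {"id": "3", "friends": ["20", "21"], "tags": ["football"]},
]

# Users 1 and 2 share 2 friends out of 4 distinct ones.
sim_12 = jaccard(users[0]["friends"], users[1]["friends"])

# Predict tags for user 1 from the other two users.
predicted = predict_tags(users[0], users[1:])
```

The same pairwise similarity could feed a clustering or embedding step for the unsupervised ideas, though comparing all pairs across tens of thousands of users would call for something more scalable than this brute-force version.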
Feel free to ask any question (or request help) via Twitter: [@hwassner][4]
Enjoy! ;)
[1]: https://tweeterid.com/
[2]: http://wassner.blogspot.fr/2016/06/recuperer-des-profils-twitter-par.html
[3]: http://wassner.blogspot.fr/2016/09/twitter-profiling.html
[4]: http://twitter.com/hwassner
