Dataset for "Cloud-assisted Crowdsourced Livecast"

NOTE: The datasets presented below are for academic use only.

Introduction

The data were collected from Twitch every five minutes over a one-month period (Feb. 1-28, 2015). Using the official APIs, our multithreaded crawler obtained information about each broadcaster and from the official system dashboard. The crawler does not need a Twitch API client ID and avoids the limit on the maximum number of objects returned per request.
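As a rough aid for checking coverage, the sketch below enumerates the snapshot times implied by a five-minute interval over Feb. 1-28, 2015. The exact start time, the time zone, and the absence of gaps are assumptions, not facts stated about this dataset; the name expected_snapshots is hypothetical.

```python
from datetime import datetime, timedelta

# Assumed crawl window: starts at midnight on Feb. 1, 2015 and ends
# just before midnight on Mar. 1, 2015 (time zone unknown, so naive times).
START = datetime(2015, 2, 1, 0, 0)
END = datetime(2015, 3, 1, 0, 0)   # exclusive upper bound
STEP = timedelta(minutes=5)        # sampling interval stated in the introduction


def expected_snapshots(start=START, end=END, step=STEP):
    """Return the list of snapshot times an uninterrupted crawl would produce."""
    times = []
    current = start
    while current < end:
        times.append(current)
        current += step
    return times


snapshots = expected_snapshots()
print(len(snapshots))  # 28 days * 288 snapshots per day = 8064
```

Comparing this list against the timestamps actually present in the data is one way to spot missing snapshots.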

Broadcasters list

all_broadcaster_dict.txt the ID list of all broadcasters
ps_broadcaster_dict.txt the ID list of PS4 broadcasters
xbox_broadcaster_dict.txt the ID list of Xbox broadcasters

Twitch dataset (Sample)

stream ID a string of stream ID, which is unique
current views an integer number of current viewers
stream created time the time at which this stream started
game name a string of game name
broadcaster ID a string of broadcaster ID, which is unique
broadcaster name a string of broadcaster's name
delay setting an integer number of the broadcaster's delay setting
follower number an integer number of the followers
partner status an integer number of the comments
broadcaster language a string of broadcaster's language
total views of this broadcaster an integer number of the broadcaster's total views
language a string of the language of the broadcaster's website
broadcaster's created time the time at which the broadcaster signed up
playback bitrate a float number of the playback bitrate
source resolution a string of source resolution
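The field list above can be read as a record schema. The following is a minimal sketch of parsing one record into typed fields, assuming each record is a single line with the fields in the listed order, separated by tabs. The delimiter, the names StreamRecord and parse_record, and the sample values are all hypothetical; time fields are kept as strings because their format is not specified here.

```python
from dataclasses import dataclass, fields


@dataclass
class StreamRecord:
    # Field order and types follow the list above.
    stream_id: str
    current_views: int
    stream_created_time: str        # kept as text; time format not specified
    game_name: str
    broadcaster_id: str
    broadcaster_name: str
    delay_setting: int
    follower_number: int
    partner_status: int
    broadcaster_language: str
    total_views: int
    language: str
    broadcaster_created_time: str   # kept as text; time format not specified
    playback_bitrate: float
    source_resolution: str


def parse_record(line: str, sep: str = "\t") -> StreamRecord:
    """Split one line on `sep` (assumed delimiter) and convert each field."""
    parts = line.rstrip("\n").split(sep)
    schema = fields(StreamRecord)
    if len(parts) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(parts)}")
    # Each dataclass field's annotation (str, int, float) doubles as its converter.
    values = {f.name: f.type(raw) for f, raw in zip(schema, parts)}
    return StreamRecord(**values)


# Made-up sample line for illustration only; not taken from the dataset.
sample = "\t".join([
    "123", "42", "2015-02-01T00:00:00Z", "Dota 2", "b1", "alice", "0",
    "100", "1", "en", "5000", "en", "2014-01-01T00:00:00Z", "1500.5", "1280x720",
])
record = parse_record(sample)
print(record.current_views, record.playback_bitrate)
```

If the files use a different delimiter, pass it as sep; a line with the wrong number of fields raises ValueError rather than being silently misaligned.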

Official dashboard information (Sample)

categorydata.txt the information on the system dashboard
channeldata.txt the information on the broadcaster dashboard
gamesdata.txt the information on the game dashboard