Crowdsourcing Learning Resources

Important Information

  • This page was created for Chapter 6 of my book Crowdsourcing Learning.
  • Anyone who uses the datasets downloaded here should cite the following articles and book:
  • Jing Zhang, Xindong Wu, & Victor S. Sheng. (2016). Learning from Crowdsourced Labeled Data: a Survey. Artificial Intelligence Review, 46(4): 543-576.
  • Victor S. Sheng & Jing Zhang*. (2019). Machine Learning with Crowdsourcing: A Brief Summary of the Past Research and Future Directions. Proceedings of AAAI-2019, Honolulu, Hawaii, USA, pp. 9837-9843.
  • Jing Zhang & Xindong Wu. (2021). Crowdsourcing Learning (《众包学习》, in Chinese). Science Press.

Real-World Datasets

  • Sentiment Judgment
  • Sentiment Popularity (SP) 266KB
  • Weather Sentiment (WS) 183KB
  • Face Sentiment (FS) 12.5MB
  • iPhone Sentiment (iPhS) 24.1MB
  • Affective Text (Affective) 43KB
  • CrowdFlower Sentiment Analysis (SAJ2013) 20.5MB
  • Sentiment_Polarity (Polarity) 308MB
  • Company Sentiment (CS) 65KB
  • Relevance Evaluation
  • Text Retrieval Conference 2010 (Trec2010) 462KB
  • Adult Content 2 (Adult) 542KB
  • Web Search (Web) 102KB
  • Product Same (Product) 102KB
  • Image Classification
  • Dog 49KB
  • Duck 47KB
  • Ten Dog Breeds (Dog10) 55.4MB
  • Duchenne Smiles (WVSCM) 62KB
  • Fashion 10000 (Fashion) (download from the original URL) 9.4GB
  • Natural Language Processing
  • Recognizing Textual Entailment (RTE) 503KB
  • Word Similarity (WordSim) 23KB
  • Temporal Ordering (TempOrder) 296KB
  • Word Sense Disambiguation (WSD) 161KB
  • Fact Evaluation
  • Fact Evaluation Judgment (FEJ2013) 10.4MB
  • Music Genre (Music) 1.4MB
  • Gender Hobby (Hobby) 20KB
  • Media Evaluation 2014 - Music Drop (Drop) 650KB

Tools

  • SQUARE
  • Statistical QUality Assurance Robustness Evaluation (code)
  • BATC
  • Benchmark for Aggregate Techniques in Crowdsourcing (since the code is hosted by Google, you can download it here: win32.win32.x86, win32.win32.x86_64, linux.gtk.x86, linux.gtk.x86_64, macosx.cocoa.x86, macosx.cocoa.x86_64, others (config files, snapshots, datasets, etc.))
  • CEKA
  • Crowd Environment and its Knowledge Analysis (code)

  • Copyright © Jing Zhang 2020-2020. All rights reserved.