Facebook RFP Asks Academia for Harmful Language Datasets and Benchmarks

Few online platforms need to deal with as much user-generated content as Facebook. The situation has escalated to the point where, during Facebook’s recent F8 developer conference, the social media giant spoke at length about its use of cutting-edge machine learning technology to help combat harmful content, from spam to hate speech to terrorist propaganda.

Speaking at F8, Chief Technology Officer Mike Schroepfer contextualized the scale of the challenge: In the third quarter of 2018, Facebook removed 1.2 billion pieces of spam content and over 700 million fake accounts.

Even though Facebook is among the busiest and most prominent corporate entities in the research scene, the social media leader has called on academia for help in its fight against harmful content.

On May 6, 2019, Facebook Research published an RFP “to challenge the community to address these challenges together, and to solicit new tasks.”

Under the title “The Online Safety Benchmark request for proposals,” Facebook is asking researchers to propose methods to improve its AI systems and make them more effective at reducing “fake and misleading content” on its platform.

USD 10,000–50,000 Grants

According to the RFP brief, Facebook’s frontline defense against harmful content is a combination of text and visual recognition systems that span automatic character and speech recognition, machine translation, and automated image and text categorization. What the company currently lacks, however, is “a well-defined set of tasks around online safety, together with appropriate benchmarks to quantify performance.”

Through this Online Safety Benchmark RFP, Facebook wants to provide grants ranging from USD 10,000 to USD 50,000 to universities that can help build “open-source datasets on which the community can measure the progress of existing techniques to reduce misleading behavior online.”

London, UK – August 7, 2018: Facebook anti-fake news advert.

The grant is exclusively for universities: applicants must be “current full-time faculty at an accredited [non-profit or non-governmental] academic institution that awards research degrees to PhD students.”

Essentially, the grants are meant for projects that build infrastructure (e.g., evaluation platforms and datasets) that can accelerate research aimed at facilitating “safer online conversations,” a nebulous concept that spans problems from misinformation to fake profiles to hate speech. Facebook Research is particularly interested in the creation of the following:

  • A “publicly available benchmark and analysis platform” similar in function to General Language Understanding Evaluation (GLUE)
  • Publicly available harmful content datasets (think fake news, rumor propagation, and offensive content) similar to the Kaggle Rumor Tracker Dataset and the MS Offensive Language Dataset

The RFP notes that funding “should roughly match the cost of annotation, e.g. using a crowdsourcing annotation platform or paying expert annotators.”

The deadline for submitting proposals is June 20, 2019. Applicants whose projects are selected will be notified around the middle of July.