10 March 2023 – Bandits for Online Calibration: Application to Content Moderation at Meta

Presented by Deeksha Sinha, Research Scientist at Meta

What tech lies behind the social media giants’ attempts to keep content “within the rules”? At Meta, we use both hand-crafted and learned risk models to flag questionable content for human review. To operationalize these, we aggregate the different models into a single ranking score, calibrating them so that more reliable risk models are prioritized. But violation trends change over time, affecting which risk models are most reliable; existing risk models change; and novel models are introduced. To continuously update the system in response to such changes, we use a contextual bandit. Our approach increases Meta’s top-line metric for measuring the effectiveness of its content moderation strategy by 13%.
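
To make the idea concrete, here is a minimal Python sketch of how a bandit could recalibrate an aggregate ranking score online. It is an illustration under stated assumptions, not the system described in the talk: the abstract does not specify the aggregation rule, the bandit algorithm, the context features, or the reward. The sketch assumes the aggregate score is the maximum of the weighted risk scores, that each arm is a candidate weight vector, that the context is a single categorical label such as a content type, and that the reward is 1 when a human reviewer confirms a violation. All names (`EpsilonGreedyCalibrator`, `aggregate_score`) are hypothetical.

```python
import random
from collections import defaultdict


def aggregate_score(weights, risk_scores):
    """Combine per-model risk scores into one ranking score.

    Assumption: the aggregate is the max of the weighted scores, so a
    larger weight gives that risk model more influence on the ranking.
    """
    return max(w * s for w, s in zip(weights, risk_scores))


class EpsilonGreedyCalibrator:
    """Tabular contextual bandit using epsilon-greedy exploration.

    Each arm is a candidate weight vector (one weight per risk model).
    A separate running mean reward is kept per (context, arm) pair.
    """

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(lambda: [0] * len(arms))
        self.means = defaultdict(lambda: [0.0] * len(arms))

    def select(self, context):
        """Return the index of the arm to use for this context."""
        if self.rng.random() < self.epsilon:
            # Explore: try a random calibration.
            return self.rng.randrange(len(self.arms))
        # Exploit: use the calibration with the best observed mean reward.
        means = self.means[context]
        return max(range(len(self.arms)), key=lambda a: means[a])

    def update(self, context, arm, reward):
        """Fold one observed reward into the running mean for (context, arm)."""
        self.counts[context][arm] += 1
        n = self.counts[context][arm]
        self.means[context][arm] += (reward - self.means[context][arm]) / n


# Two risk models; arm 0 favours the first model, arm 1 favours the second.
bandit = EpsilonGreedyCalibrator(arms=[(1.0, 0.5), (0.5, 1.0)], epsilon=0.1)
arm = bandit.select("video")
score = aggregate_score(bandit.arms[arm], risk_scores=(0.4, 0.9))
# After a human review, feed the outcome back (1.0 = confirmed violation).
bandit.update("video", arm, reward=1.0)
```

Because the running means are updated after every review, a calibration that stops paying off (for example, when a violation trend shifts) gradually loses out to alternatives. A production system would likely use richer context features and a more sample-efficient algorithm than this tabular epsilon-greedy sketch.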