I was recently speaking to the leadership team at a Fortune 500 about our work at Bunches (obviously they loved it 😏), and the largest category of questions was about a singular topic: moderation.
From Twitter's – I refuse to call it X – struggles with brand friendliness to Farcaster's growing pains to the philosophical debates of free speech vs. cancel culture, content policing is the topic du jour of many in the user-generated content (UGC) space.
Navigating the thorny fields of moderation is without question the number one issue facing consumer platforms today. The question isn't really "if" (yes, you should moderate in some way) or "when" (start as early as possible), but "how".
Rather than describing a methodology in the abstract, I'll talk about how we've tackled moderation at Bunches thus far, and how we plan to evolve it in the future.
Bunches is the social network for sports fans. Alongside single-player experiences like the scoreboard, there are group chats for discussing leagues, sports, teams, and even players.
With nearly 250,000 users and growing rapidly, our UGC is one of our most valuable assets...and one of our biggest risks, which makes moderation one of our biggest challenges.
Go visit a random Instagram sports meme account's comment section and you'll get a small taste of the content issues we're facing. Or imagine your favorite pub, where people are a little heated and tipsy...but then make everyone pseudonymous. It's not civil. It's not pretty.
Welcome to sports in the digital realm.
Discussion about a rival user's family. Racist and homophobic insults and slurs. Constant innuendo not suitable for younger audiences. Off-topic rants. Spam comments and scam invitations.
The list goes on.
Currently, we do what most teams would do in our situation...and in some cases, we go a few steps further:
We have an allowlist/whitelist and a denylist/blacklist of words, insults, and links. We systematically check every message against these lists and moderate appropriately (a rough code sketch of this check follows below).
We have implemented automated AI image recognition to detect pornographic or otherwise inappropriate imagery.
We have user-centric tooling for blocking users or messages and managing what you see as an individual user. Users can also report individual messages, which are then reviewed manually by our team (via integration with a dedicated Slack channel).
Systematically, we can moderate messages (muting them for being "off-topic"), and Bunches team members can soft- or hard-delete messages altogether.
Bunch owners can kick/ban users from individual group chats (Bunches), and Bunches team members can remove users from the platform at large...including banning on device ID (which is far more reliable than banning on IP, which a simple VPN gets around).
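To make the list-based screening above concrete, here's a rough sketch of what checking a message against those lists can look like. The lists, names, and thresholds are illustrative, not our production code:

```typescript
// Illustrative word/link screening, roughly matching the approach described above.
// Lists and names are placeholders, not Bunches' actual data or implementation.
const DENYLIST = new Set(["badword1", "badword2", "scamsite.example"]);
const ALLOWLISTED_DOMAINS = new Set(["bunches.app", "espn.com"]);

type Verdict = "allow" | "flag_for_review" | "block";

function screenMessage(text: string): Verdict {
  const tokens = text.toLowerCase().split(/\s+/);

  // Hard block if any token is on the denylist.
  if (tokens.some((t) => DENYLIST.has(t))) return "block";

  // Flag messages that link to domains outside the allowlist.
  const links = text.match(/https?:\/\/[^\/\s]+/g) ?? [];
  const hasUnknownLink = links.some((link) => {
    const host = new URL(link).hostname.replace(/^www\./, "");
    return !ALLOWLISTED_DOMAINS.has(host);
  });
  if (hasUnknownLink) return "flag_for_review";

  return "allow";
}
```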
Larger companies like ByteDance and Meta handle moderation with scaled-up versions of the above: throwing more tech and more people at the problem. These teams and systems handle moderation platform-wide across users and content.
Other companies like Reddit and Discord distribute the problem of moderation through administration and isolation: each server or subreddit is isolated from the others, with each fiefdom having its own moderation team that reviews and decides on content.
Either way, platforms to date have centralized the power of moderation to either platform-wide teams & systems (such as ByteDance and Meta) or to moderators in isolated communities (such as Reddit and Discord).
The problem is that centralized moderation fails in one of two ways: abandonment or abuse.
Moderators under-use their power. They fall asleep on the job (sometimes literally), content sneaks through during "off-hours", moderators apply rules inconsistently, or automated moderation systems go down for a period of time. In any case, the responsibility of moderation is vacated by the centralized authorities in charge of it. This abandonment of responsibility leads to a noisy platform, distrust in moderation systems, and a user base that sees the platform as incompetent.
Moderators over-use their power. Content is removed even when it technically follows the platform's stated rules, moderators use their power to pursue personal vendettas against users, or entire communities have their voices silenced via malicious individuals, misaligned algorithms, or buggy code. This abuse of responsibility leads to a dying platform, distrust in moderation systems, and a contentious relationship between platform and user base.
Whereas centralized approaches to moderation fracture the relationship between users and platforms, decentralized moderation can align users and platforms.
Something that we're pursuing here at Bunches is what I believe will be a better way: decentralizing moderation to users themselves. I explain what this could actually look like below, but first the why.
I've said this many times, but web3 is not fundamentally a financial technology. web3 is fundamentally an economic technology. Tokenization is a phenomenal tool for aligning incentives between two or more parties who don't trust one another. After all, this was the primary problem that crypto originally solved (at least probabilistically).
In a world where both abandonment and abuse lead to a distrust in moderation systems and a fracture in the relationship between platform and user, aligning incentives around content moderation seems like a crucial problem to solve for consumer platforms.
STEP 1: ESTABLISH ONCHAIN REPUTATION
The first step is for the platform to determine and ideally quantify high-quality contributors. This could be done in a variety of ways, and can include both first-party and third-party data, but at its most basic you identify users who are creating and consuming content in a meaningful way. Questions to ask around this identification: Who sends the most messages? Who posts the most original content? Who reacts, likes, replies, or comments the most as a lurker or consumer of content? Whose social graph is high quality and growing? Establishing a rules engine for reputation is the zeroth step; implementing that rules engine via an onchain mechanism is the next.
This can be done via tokens of any kind (or even other onchain mechanisms like attestations), and again can include both on-platform and external data (this is up to the platform to define), but identifying and quantifying contributors is the goal here. An example of this in action is something akin to Yup's Yup Score (docs here), but perhaps more specific to the content platform.
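As a sketch only, with placeholder signals and weights rather than an actual rules engine, a first pass at quantifying contributors from first-party activity might look something like this:

```typescript
// Illustrative reputation scoring from first-party activity signals.
// Signal names and weights are assumptions for the sake of example.
interface ActivitySignals {
  messagesSent: number;      // raw participation
  originalPosts: number;     // content creation
  reactionsGiven: number;    // consumption: likes, replies, reactions
  connectionsAdded: number;  // growth of the user's social graph
  timesModerated: number;    // strikes against the user
}

function reputationScore(s: ActivitySignals): number {
  // Log-dampen raw counts so sheer volume doesn't dominate quality.
  const dampen = (n: number) => Math.log10(1 + n);

  const positive =
    1.0 * dampen(s.messagesSent) +
    2.5 * dampen(s.originalPosts) +
    1.5 * dampen(s.reactionsGiven) +
    2.0 * dampen(s.connectionsAdded);

  const penalty = 3.0 * dampen(s.timesModerated);

  return Math.max(0, positive - penalty);
}
```

The score itself is the off-chain half; publishing it via a token balance or periodic attestation is what would make it portable and verifiable onchain.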
STEP 2: PROPORTIONALLY ALLOCATE POWER BASED ON REPUTATION
Once reputation is established for your user base, build moderation tools that require consensus from X% of relevant users.
While the rules for consensus may differ from platform to platform, no single user should have the authority to moderate a message or user. Perhaps you'd want to base the threshold for consensus on the total reputation that has access to the content. Perhaps you'd want to base it on the reach of the content, or on users who have a social connection to the original author, etc.
There are probably many permutations of the mechanism that would work, and experimentation would be necessary to get it right on a per-platform basis.
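One possible shape for that consensus rule, sketched with illustrative names and an arbitrary 10% share of eligible reputation:

```typescript
// Illustrative consensus check: a moderation action passes only when the
// reputation supporting it crosses some share of the total reputation that
// has access to the content. The 10% default is arbitrary.
interface Voter {
  userId: string;
  reputation: number;
}

function consensusReached(
  supporting: Voter[],      // users who voted to moderate
  eligible: Voter[],        // everyone with access to the content
  requiredShare = 0.1
): boolean {
  const sumRep = (voters: Voter[]) =>
    voters.reduce((total, v) => total + v.reputation, 0);

  const supportingRep = sumRep(supporting);
  const eligibleRep = sumRep(eligible);

  // No single user should be able to carry the decision alone.
  if (supporting.length < 2 || eligibleRep === 0) return false;

  return supportingRep / eligibleRep >= requiredShare;
}
```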
STEP 3: AUTOMATICALLY ENFORCE MODERATION BY CONSENSUS
Once consensus is reached for a moderation action by relevant users, the platform itself has to enforce the collective action via code (and smart contracts where appropriate). There should be no additional human input required. If the threshold is met, the action is taken.
This immediacy accomplishes two things: it demonstrates that the consensus mechanism has immediate effect, and it makes clear that users (not the platform's moderation team or algorithm) control what content is seen and distributed.
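A rough sketch of what "no additional human input" can look like in practice; the action set and helper functions below are hypothetical stubs, not a real API:

```typescript
// Hypothetical enforcement dispatcher: once consensus is reached, the action
// fires automatically. Helpers are stubs standing in for storage/onchain calls.
type ModerationAction =
  | { kind: "mute_message"; messageId: string }
  | { kind: "delete_message"; messageId: string }
  | { kind: "ban_user"; userId: string; bunchId: string };

declare function muteMessage(messageId: string): Promise<void>;
declare function softDeleteMessage(messageId: string): Promise<void>;
declare function removeUserFromBunch(userId: string, bunchId: string): Promise<void>;
declare function recordModerationEvent(action: ModerationAction): Promise<void>;

async function enforce(action: ModerationAction): Promise<void> {
  switch (action.kind) {
    case "mute_message":
      await muteMessage(action.messageId);          // hidden as "off-topic"
      break;
    case "delete_message":
      await softDeleteMessage(action.messageId);    // reversible removal
      break;
    case "ban_user":
      await removeUserFromBunch(action.userId, action.bunchId);
      break;
  }
  // Record the outcome (onchain where appropriate) so every decision is auditable.
  await recordModerationEvent(action);
}
```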
A PRACTICAL EXAMPLE
On Bunches, I've been toying with some of these concepts, and we have enough data internally to proceed with what I believe could be a very interesting model for consumer web3 companies like Bunches, Farcaster, Lens, etc.
In practice, this could be as simple as an upvote/downvote mechanism, weighted by the reputation score of each voter. If a threshold is hit (either in absolute terms or as a ratio), a moderation action against the content or against the user is taken.
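Sketched in code, with every threshold illustrative rather than tuned:

```typescript
// Illustrative weighted voting: each vote counts for the voter's reputation,
// and moderation triggers on either an absolute weight or a ratio threshold.
interface WeightedVote {
  reputation: number;
  direction: "up" | "down";
}

function shouldModerate(
  votes: WeightedVote[],
  absoluteThreshold = 50,   // total downvote weight that always triggers
  ratioThreshold = 0.8      // or: 80% of weighted votes are downvotes
): boolean {
  const downWeight = votes
    .filter((v) => v.direction === "down")
    .reduce((total, v) => total + v.reputation, 0);
  const totalWeight = votes.reduce((total, v) => total + v.reputation, 0);

  if (downWeight >= absoluteThreshold) return true;
  return totalWeight > 0 && downWeight / totalWeight >= ratioThreshold;
}
```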
The key is making these actions clear, easy, and intentional. Users should not accidentally ban a user, delete a message, or time someone out. Nor should users have to read pages of documentation to figure out how to do so.
Well, we're building this at Bunches. It's a real problem, and I believe we have a real solution for it.
Want to learn more, or just chat about this? Ping me! I primarily hang out on Farcaster these days, so feel free to send me a reply (thinking in public is great!) or shoot me a direct cast there.
Otherwise, thanks for reading!
Feel free to collect, tip, or otherwise share this content with others you think would appreciate it.
DISCUSSION ON FARCASTER
We're also tackling this problem at /bunches. Here's our approach in a nutshell: https://derekbrown.xyz/decentralizing-moderation
i'm done using 'warn and hide' (not worth the drama). people don't actually care about channel norms, and enforcing them just makes them angry. the double-edged sword of channels is that an account with 0 followers can cast to 100k+ people. D.R.E.A.M. (distribution rules everything around me)
huh
Hot take: mods of popular channels should be paid warps instead of paying warps.
your lips to @dwr.eth's ears
did they remove norms on channels? i actually can't find them on any channel now
no, they're just hard to find. even if they were front-and-center tho, i still don't think it'd matter
yea the description is probably the only relevant real estate
“For maximum reach” 🙄 Anyone who thinks this is how channels work: Please go back to Twitter, there’s “more reach” there
The pay for mute thing was dumb though, except that was an individual user acting independently
Have you seen the automod.sh demo?
oof
I wonder how many people on farcaster were also part of the launch of communities on Google+. I see MANY of the same things happening. It’s the primary reason I have not set up a channel on here. I had a g+ community that went way over 100k users and it was a nightmare. Relying on people to admin is a challenge
I would not use this - primary incentive is for less activity in the channel, not necessarily higher quality activity. Generally not a fan of the gas model for contributing to a network (vs. ETH's gas model of *using* the network). I'm biased, but strongly prefer a reputation-based model. Curious what others think?
If you go the route of rep, how would you define good rep vs bad rep? A fee could potentially be useful, especially if you can redistribute said fee back to good actors within a channel. I don't have a bias either way, but just curious how you are thinking about it.
There are fairly established ways of calculating reputation on an existing platform based on contribution. Most models are based on peer-to-peer gradation. If you get moderated, you lose rep. Likes/warps/recasts increase rep. More thoughts here: https://derekbrown.xyz/decentralizing-moderation
Popular channels get a lot of spam, and manual moderation is hard to scale and decentralize. Proposal:
- Hosts can turn on a fee in settings
- Users must pay 100 warps before they can post
- If banned, user must pay the fee again before posting
- Fees go to hosts
If you are a channel lead, would you use this?
I honestly can't foresee this strategy improving quality at all for /cryptoart.
Maybe the option to turn it on
it's an interesting option! but maybe also give channel hosts some additional options if they don't want all/nothing on it being fee-gated? e.g. maybe allow-list gated with various FC/onchain criteria as options?
where are these images from 👀
my team created some mocks that anyone can use for potential Allow Lists, which could be used in messaging groups, Frames, Channels and more https://www.figma.com/community/file/1339262555703918569
Love this! But if we go down this path, wouldn't it be wiser to create a webhook infrastructure, instead of having Warpcast try to satisfy all this? Something like UNI v4 hooks? And create an ecosystem of standard or custom webhooks? @v?
i wouldn't want to restrict someone with a thoughtful comment from dropping in and posting because they don't want to pay the fee. then again, my channel is small and doesn't have spam, and the main focus is to increase engagement.
We're building this: https://derekbrown.xyz/decentralizing-moderation (for reference, we're roughly the size of FC and our "channels" are largely not owned by admins, so we have to figure something out) Happy to chat through our ideas at any point, Varun - DCs open.
another option could be to have users "stake" the warps rather than pay them. if they are good actors for, say, 30 days, they get it back
They can turn bad after 30 days.
hmm another thought, if a channel advertised that it has a Warn Mode, that might be enough of a deterrent
⚠️ Warning: If your cast is "warned" by the hosts you will have to pay 100 Warps in order to cast again into the channel
Yes. In addition to: Make user agree to norms before posting the first time.
Pretty interesting approach and getting closer to my proposal here: https://derekbrown.xyz/decentralizing-moderation Not sure how I feel about all votes being equal, though, and not taking into account reputation/contribution. Interesting experiment, will be following along. Well done, @depatchedmode!
Ayyyy! Thanks for the nod. Skimmed your article and will give it a deeper think. We basically agree! Not all votes will be equal in the counting for sure. I'll generate a reputation graph (daily?), where your judgement's impact will be proportional to your current rating. Actioning the graph is a gradient opt-in.
Primary precedent for the reputation graph is gonna be EigenTrust. But v0 will be my dumb-dumb "just a designer" interpretation of it.
Another thing I'm planning on is that whether or not your vote counts at all may be throttled by some threshold of your participation. And judgements of any individual cast shouldn't count until there's some kind of quorum. But to start, blind acceptance of all data points!