I was recently speaking to the leadership team at a Fortune 500 about our work at Bunches (obviously they loved it 😏), and the largest category of questions was about a singular topic: moderation.
From Twitter's – I refuse to call it X – struggles with brand friendliness to Farcaster's growing pains to the philosophical debates of free speech vs. cancel culture, content policing is the topic du jour of many in the user-generated content (UGC) space.
Navigating the thorny fields of moderation is without question the number one issue facing consumer platforms today. The question isn't really "if" (yes, you should moderate in some way) or "when" (start as early as possible), but "how".
Rather than describing a methodology in the abstract, I'll talk about how we've tackled moderation at Bunches thus far, and how we plan to evolve it in the future.
Bunches is the social network for sports fans. Alongside single-player experiences like the scoreboard, there are group chats for discussing leagues, sports, teams, and even players.
With nearly 250,000 users and growing rapidly, our UGC is one of our most valuable assets...and one of our biggest risks, which makes moderation one of our biggest challenges.
Go visit a random Instagram sports meme account's comment section and you'll get a small taste of the content issues we're facing. Or imagine your favorite pub, where people are a little heated and tipsy...but then make everyone pseudonymous. It's not civil. It's not pretty.
Welcome to sports in the digital realm.
Discussion about a rival user's family. Racist and homophobic insults and slurs. Constant innuendo not suitable for younger audiences. Off-topic rants. Spam comments and scam invitations.
The list goes on.
Currently, we do what most teams would do in our situation...and in some cases, we go a few steps further:
We have an allowlist/whitelist and a denylist/blacklist of words, insults, and links. We systematically check every message against these lists and moderate appropriately (a rough code sketch of this check follows below).
We have implemented automated AI image recognition to detect pornographic or otherwise inappropriate imagery.
We have user-centric tooling for blocking users or messages and managing what you see as an individual user. Users can also report individual messages, which are then reviewed manually by our team (via integration with a dedicated Slack channel).
Systematically, we can moderate messages (muting them for being "off-topic"), and Bunches team members can soft- or hard-delete messages altogether.
Bunch owners can kick/ban users from individual group chats (Bunches), and Bunches team members can remove users from the platform at large...including banning on device ID (which is far more reliable than banning on IP, which a simple VPN gets around).
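To make the list-based screening above concrete, here's a rough sketch of what checking a message against those lists can look like. The lists, names, and thresholds are illustrative, not our production code:

```typescript
// Illustrative word/link screening, roughly matching the approach described above.
// Lists and names are placeholders, not Bunches' actual data or implementation.
const DENYLIST = new Set(["badword1", "badword2", "scamsite.example"]);
const ALLOWLISTED_DOMAINS = new Set(["bunches.app", "espn.com"]);

type Verdict = "allow" | "flag_for_review" | "block";

function screenMessage(text: string): Verdict {
  const tokens = text.toLowerCase().split(/\s+/);

  // Hard block if any token is on the denylist.
  if (tokens.some((t) => DENYLIST.has(t))) return "block";

  // Flag messages that link to domains outside the allowlist.
  const links = text.match(/https?:\/\/[^\/\s]+/g) ?? [];
  const hasUnknownLink = links.some((link) => {
    const host = new URL(link).hostname.replace(/^www\./, "");
    return !ALLOWLISTED_DOMAINS.has(host);
  });
  if (hasUnknownLink) return "flag_for_review";

  return "allow";
}
```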
Larger companies like ByteDance and Meta handle moderation with scaled-up versions of the above: throwing more tech and more people at the problem. These teams and systems handle moderation platform-wide across users and content.
Other companies like Reddit and Discord distribute the problem of moderation through administration and isolation: each server or subreddit is isolated from the others, with each fiefdom having its own moderation team that reviews and decides on content.
Either way, platforms to date have centralized the power of moderation to either platform-wide teams & systems (such as ByteDance and Meta) or to moderators in isolated communities (such as Reddit and Discord).
The problem is that centralized moderation fails in one of two ways: abandonment or abuse.
Moderators under-use their power. They fall asleep on the job (sometimes literally), content sneaks through during "off-hours", moderators apply rules inconsistently, or automated moderation systems go down for a period of time. In any case, the responsibility of moderation is vacated by the centralized authorities in charge of it. This abandonment of responsibility leads to a noisy platform, distrust in moderation systems, and a user base that sees the platform as incompetent.
Moderators over-use their power. Content is removed even when it technically follows the platform's stated rules, moderators use their power to pursue personal vendettas against users, or entire communities have their voices silenced via malicious individuals, misaligned algorithms, or buggy code. This abuse of responsibility leads to a dying platform, distrust in moderation systems, and a contentious relationship between platform and user base.
Whereas centralized approaches to moderation fracture the relationship between users and platforms, decentralized moderation can align users and platforms.
Something that we're pursuing here at Bunches is what I believe will be a better way: decentralizing moderation to users themselves. I explain what this could actually look like below, but first the why.
I've said this many times, but web3 is not fundamentally a financial technology. web3 is fundamentally an economic technology. Tokenization is a phenomenal tool for aligning incentives between two or more parties who don't trust one another. After all, this was the primary problem that crypto originally solved (at least probabilistically).
In a world where both abandonment and abuse lead to a distrust in moderation systems and a fracture in the relationship between platform and user, aligning incentives around content moderation seems like a crucial problem to solve for consumer platforms.
STEP 1: ESTABLISH ONCHAIN REPUTATION
The first step is for the platform to determine and ideally quantify high-quality contributors. This could be done in a variety of ways, and can include both first-party and third-party data, but at its most basic you identify users who are creating and consuming content in a meaningful way. Questions to ask around this identification: Who sends the most messages? Who posts the most original content? Who reacts, likes, replies, or comments the most as a lurker or consumer of content? Whose social graph is high quality and growing? Establishing a rules engine for reputation is the zeroth step; implementing that rules engine via an onchain mechanism is the next.
This can be done via tokens of any kind (or even other onchain mechanisms like attestations), and again can include both on-platform and external data (this is up to the platform to define), but identifying and quantifying contributors is the goal here. An example of this in action is something akin to Yup's Yup Score (docs here), but perhaps more specific to the content platform.
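As a sketch only, with placeholder signals and weights rather than an actual rules engine, a first pass at quantifying contributors from first-party activity might look something like this:

```typescript
// Illustrative reputation scoring from first-party activity signals.
// Signal names and weights are assumptions for the sake of example.
interface ActivitySignals {
  messagesSent: number;      // raw participation
  originalPosts: number;     // content creation
  reactionsGiven: number;    // consumption: likes, replies, reactions
  connectionsAdded: number;  // growth of the user's social graph
  timesModerated: number;    // strikes against the user
}

function reputationScore(s: ActivitySignals): number {
  // Log-dampen raw counts so sheer volume doesn't dominate quality.
  const dampen = (n: number) => Math.log10(1 + n);

  const positive =
    1.0 * dampen(s.messagesSent) +
    2.5 * dampen(s.originalPosts) +
    1.5 * dampen(s.reactionsGiven) +
    2.0 * dampen(s.connectionsAdded);

  const penalty = 3.0 * dampen(s.timesModerated);

  return Math.max(0, positive - penalty);
}
```

The score itself is the off-chain half; publishing it via a token balance or periodic attestation is what would make it portable and verifiable onchain.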
STEP 2: PROPORTIONALLY ALLOCATE POWER BASED ON REPUTATION
Once reputation is established for your user base, build moderation tools that require consensus from X% of relevant users.
While the rules for consensus may differ from platform to platform, no single user should have the authority to moderate a message or user. Perhaps you'd want to base the threshold for consensus on the total reputation that has access to the content. Perhaps you'd want to base it on the reach of the content, or on users who have a social connection to the original author, etc.
There are probably many permutations of the mechanism that would work, and experimentation would be necessary to get it right on a per-platform basis.
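One possible shape for that consensus rule, sketched with illustrative names and an arbitrary 10% share of eligible reputation:

```typescript
// Illustrative consensus check: a moderation action passes only when the
// reputation supporting it crosses some share of the total reputation that
// has access to the content. The 10% default is arbitrary.
interface Voter {
  userId: string;
  reputation: number;
}

function consensusReached(
  supporting: Voter[],      // users who voted to moderate
  eligible: Voter[],        // everyone with access to the content
  requiredShare = 0.1
): boolean {
  const sumRep = (voters: Voter[]) =>
    voters.reduce((total, v) => total + v.reputation, 0);

  const supportingRep = sumRep(supporting);
  const eligibleRep = sumRep(eligible);

  // No single user should be able to carry the decision alone.
  if (supporting.length < 2 || eligibleRep === 0) return false;

  return supportingRep / eligibleRep >= requiredShare;
}
```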
STEP 3: AUTOMATICALLY ENFORCE MODERATION BY CONSENSUS
Once consensus is reached for a moderation action by relevant users, the platform itself has to enforce the collective action via code (and smart contracts where appropriate). There should be no additional human input required. If the threshold is met, the action is taken.
This immediacy accomplishes two things: it demonstrates that the consensus mechanism has immediate effect, and it makes clear that users (not the platform's moderation team or algorithm) control what content is seen and distributed.
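A rough sketch of what "no additional human input" can look like in practice; the action set and helper functions below are hypothetical stubs, not a real API:

```typescript
// Hypothetical enforcement dispatcher: once consensus is reached, the action
// fires automatically. Helpers are stubs standing in for storage/onchain calls.
type ModerationAction =
  | { kind: "mute_message"; messageId: string }
  | { kind: "delete_message"; messageId: string }
  | { kind: "ban_user"; userId: string; bunchId: string };

declare function muteMessage(messageId: string): Promise<void>;
declare function softDeleteMessage(messageId: string): Promise<void>;
declare function removeUserFromBunch(userId: string, bunchId: string): Promise<void>;
declare function recordModerationEvent(action: ModerationAction): Promise<void>;

async function enforce(action: ModerationAction): Promise<void> {
  switch (action.kind) {
    case "mute_message":
      await muteMessage(action.messageId);          // hidden as "off-topic"
      break;
    case "delete_message":
      await softDeleteMessage(action.messageId);    // reversible removal
      break;
    case "ban_user":
      await removeUserFromBunch(action.userId, action.bunchId);
      break;
  }
  // Record the outcome (onchain where appropriate) so every decision is auditable.
  await recordModerationEvent(action);
}
```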
A PRACTICAL EXAMPLE
On Bunches, I've been toying with some of these concepts, and we have enough data internally to proceed with what I believe could be a very interesting model for consumer web3 companies like Bunches, Farcaster, Lens, etc.
In practice, this could be as simple as an upvote/downvote mechanism, weighted by the reputation score of each voter. If a threshold is hit (either in absolute terms or as a ratio), a moderation action against the content or against the user is taken.
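Sketched in code, with every threshold illustrative rather than tuned:

```typescript
// Illustrative weighted voting: each vote counts for the voter's reputation,
// and moderation triggers on either an absolute weight or a ratio threshold.
interface WeightedVote {
  reputation: number;
  direction: "up" | "down";
}

function shouldModerate(
  votes: WeightedVote[],
  absoluteThreshold = 50,   // total downvote weight that always triggers
  ratioThreshold = 0.8      // or: 80% of weighted votes are downvotes
): boolean {
  const downWeight = votes
    .filter((v) => v.direction === "down")
    .reduce((total, v) => total + v.reputation, 0);
  const totalWeight = votes.reduce((total, v) => total + v.reputation, 0);

  if (downWeight >= absoluteThreshold) return true;
  return totalWeight > 0 && downWeight / totalWeight >= ratioThreshold;
}
```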
The key is making these actions clear, easy, and intentional. Users should not accidentally ban a user, delete a message, or time someone out. Nor should users have to read pages of documentation to figure out how to do so.
Well, we're building this at Bunches. It's a real problem, and I believe we have a real solution for it.
Want to learn more, or just chat about this? Ping me! I primarily hang out on Farcaster these days, so feel free to send me a reply (thinking in public is great!) or shoot me a direct cast there.
Otherwise, thanks for reading!
Feel free to collect, tip, or otherwise share this content with others you think would appreciate it.
DISCUSSION ON FARCASTER
We're also tackling this problem at /bunches. Here's our approach in a nutshell: https://derekbrown.xyz/decentralizing-moderation
i'm done using 'warn and hide' (not worth the drama). people don't actually care about channel norms, and enforcing them just makes them angry. the double-edged sword of channels is that an account with 0 followers can cast to 100k+ people. D.R.E.A.M. (distribution rules everything around me)
huh
Hot take: mods of popular channels should be paid warps instead of paying warps.
your lips to @dwr.eth's ears
did they remove norms on channels? i actually can't find them on any channel now
no, they're just hard to find. even if they were front-and-center tho, i still don't think it'd matter
yea the description is probably the only relevant real estate
“For maximum reach” 🙄 Anyone who thinks this is how channels work: Please go back to Twitter, there’s “more reach” there
The pay for mute thing was dumb though, except that was an individual user acting independently
Have you seen the automod.sh demo?
oof
I wonder how many people on farcaster were also part of the launch of communities on Google+. I see MANY of the same things happening. It’s the primary reason I have not set up a channel on here. I had a g+ community that went way over 100k users and it was a nightmare. Relying on people to admin is a challenge
I would not use this - primary incentive is for less activity in the channel, not necessarily higher quality activity. Generally not a fan of the gas model for contributing to a network (vs. ETH's gas model of *using* the network). I'm biased, but strongly prefer a reputation-based model. Curious what others think?
If you go the route of rep, how would you define good rep vs bad rep? A fee could potentially be useful, especially if you can redistribute said fee back to good actors within a channel. I don't have a bias either way, but just curious how you are thinking about it.
There are fairly established ways of calculating reputation on an existing platform based on contribution. Most models are based on peer-to-peer gradation. If you get moderated, you lose rep. Likes/warps/recasts increase rep. More thoughts here: https://derekbrown.xyz/decentralizing-moderation
Popular channels get a lot of spam, and manual moderation is hard to scale and decentralize. Proposal:
- Hosts can turn on a fee in settings
- Users must pay 100 warps before they can post
- If banned, user must pay the fee again before posting
- Fees go to hosts
If you are a channel lead, would you use this?
I honestly can't foresee this strategy improving quality at all for /cryptoart.
Maybe the option to turn it on
it's an interesting option! but maybe also give channel hosts some additional options if they don't want all/nothing on it being fee-gated? e.g. maybe allow-list gated with various FC/onchain criteria as options?
where are these images from 👀
my team created some mocks that anyone can use for potential Allow Lists, which could be used in messaging groups, Frames, Channels and more https://www.figma.com/community/file/1339262555703918569
Love this! But if we go down this path, wouldn't it be wiser to create a webhook infrastructure, instead of having Warpcast try to satisfy all this? Something like UNI v4 hooks? And create an ecosystem of standard or custom webhooks? @v?
i wouldn't want to restrict someone with a thoughtful comment from dropping in and posting because they don't want to pay the fee. then again, my channel is small and doesn't have spam, and the main focus is to increase engagement.
We're building this: https://derekbrown.xyz/decentralizing-moderation (for reference, we're roughly the size of FC and our "channels" are largely not owned by admins, so we have to figure something out) Happy to chat through our ideas at any point, Varun - DCs open.
another option could be to have users "stake" the warps rather than pay them. if they are good actors for, say, 30 days, they get it back
They can turn bad after 30 days.
hmm another thought, if a channel advertised that it has a Warn Mode, that might be enough of a deterrent
⚠️ Warning: If your cast is "warned" by the hosts you will have to pay 100 Warps in order to cast again into the channel
Yes. In addition to: Make user agree to norms before posting the first time.
Pretty interesting approach and getting closer to my proposal here: https://derekbrown.xyz/decentralizing-moderation Not sure how I feel about all votes being equal, though, and not taking into account reputation/contribution. Interesting experiment, will be following along. Well done, @depatchedmode!
Ayyyy! Thanks for the nod. Skimmed your article and will give it a deeper think. We basically agree! Not all votes will be equal in the counting for sure. I'll generate a reputation graph (daily?), where your judgement's impact will be proportional to your current rating. Actioning the graph is a gradient opt-in.
Primary precedent for the reputation graph is gonna be EigenTrust. But v0 will be my dumb-dumb "just a designer" interpretation of it.
Another thing I'm planning on is that whether or not your vote counts at all may be throttled by some threshold of your participation. And judgements of any individual cast shouldn't count until there's some kind of quorum. But to start, blind acceptance of all data points!