Andrew Wegner | Ponderings of an Andy - Stack Exchange

A Decade of Fighting Spam

2023-11-27T15:45:00-06:00

Introduction¶

Charcoal is nearing a decade of existence. In January of 2024, the Stack Exchange community will have spent ten years fighting the good fight of keeping spam off the platform. I've written about a machine being able to flag spam in the past. I've also posted the original and its follow-up on being able to spam flag even better on Stack Exchange itself.

Recently, I was asked to talk a bit about a hobby of mine. I put together this presentation.

What is Stack Exchange?¶

To set a bit of context for those who need it: Stack Exchange is a network of over 180 sites covering almost any topic you can think of. It's a question and answer network. The sites on the slide you are seeing here are just a handful of those with the more interesting logos - but you can see they cover a range of topics from professional workplace questions, to the intricacies of the English Language, to Data Science and gaming.

But by far the largest and most popular is Stack Overflow, with 24 million questions covering any programming language or framework you have used. It consistently ranks in the top 500 most visited sites on the internet - depending on what service is doing the measuring. Basically, it gets a lot of eyeballs looking at it daily.

Which means it's a target for spam.

What is spam?¶

The network and the community within it settled on a fairly standard definition of spam:

A post exists only to promote a product or service and doesn't disclose author's affiliation.

The images here show what the site looks like when the community systems aren't operational. This is the front page of two sites and if you look closely at the time stamps, you'll see that these posts occurred within about 10 minutes.

If users - new or experienced - come to the site and see this, they start to turn away.

Back in 2013/2014 this was common. Spam posts would stick around for hours and a group of users decided they could help out across the network by flagging these posts more quickly.

What is flagging?¶

The final bit of context that is needed is flagging. It's exactly what you'd think it is. The goal of a flag is to bring attention to the post by forcing it into the community review queues. This gets more people to look at it. Stack Exchange is built around community moderation. There is very little that elected "Diamond Moderators" need to handle that the community can't handle.

If enough people flag a post as spam, it's automatically deleted. The community and company decided that getting 6 people to agree a post is spam is an appropriate number.

Once a post is removed as spam, the post is locked, deleted and the author has 100 reputation points removed. These are the visible actions. The reputation hit is to prevent - or slow - a spammer from getting more privileges within the network.

Behind the scenes, a spam post also triggers company checks against future posts matching similar information to the user. These aren't publicly disclosed. But, the company is fairly conservative in terms of blocking users.

What is Charcoal?¶

The community hates spam. It's a bad user experience at best to have a page filled with spammy posts. It also makes the site, and community at large, look rather seedy. This isn't great for a community and a company that has built its reputation on accuracy and trust.

Charcoal was created to watch for spam across the 180+ sites. Actually, when we started it was less than half of that, but over the past decade the network has grown and the anti-spam systems have grown with it.

The community has a two-phase process for dealing with spam. The first is alerting the spam fighting community to potential spam, so users can go cast their flags across the network and deal with it. Second, for the truly egregious spam, the system can utilize the community's flags and automatically cast those flags.

How's this work?

SmokeDetector and Metasmoke¶

There are two systems behind this community effort - SmokeDetector and Metasmoke.

SmokeDetector, affectionately named "Smokey", is designed to be the early warning system. It quickly provides a yes/no decision on whether a post is spam and alerts users for manual action. It passes off the more intense confidence checks and automatic flags to Metasmoke.

Every post on the network goes through the process below. A user clicks the submit button and Stack Exchange does their few checks - remember these are black boxed - and if the post makes it through these it gets published to a real time web socket.

SmokeDetector does a quick "Is this spam?" check. If it is, it's posted to chat rooms around the network - to the network wide Charcoal room and usually to a site specific room if the room is utilized enough. Users then go and investigate and if they agree that it's spam, cast a flag. After 6 of these are cast, the post is removed. Hooray! Another victory against spam.

When Smokey posts that spam is found, it also sends a message to Metasmoke. This system checks how confident we are that this is spam. If there is high confidence, it will start utilizing community member flagging privileges to cast spam flags on the post as well. If there isn't high confidence, no automatic flags will be cast.

The goal is to remove the spammy posts as quickly as possible - and by utilizing automatic flags the number of people that have to go do this manually is reduced. Due to larger community and company discussions and outcomes, the system will not cast all 6 flags except in very very rare circumstances. Someone has to agree with the machines here.
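The flow above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `Post` class, the function names, and the `HIGH_CONFIDENCE` cut-off are all hypothetical, and the cap of three automatic flags is the day-to-day setting mentioned later in this talk.

```python
from dataclasses import dataclass, field

FLAGS_TO_DELETE = 6      # spam flags needed for automatic deletion (from the talk)
MAX_AUTO_FLAGS = 3       # day-to-day cap on system-cast flags (from the talk)
HIGH_CONFIDENCE = 0.99   # hypothetical cut-off, not Metasmoke's real value


@dataclass
class Post:
    title: str
    spam_flags: int = 0
    events: list = field(default_factory=list)

    @property
    def deleted(self) -> bool:
        # A post is removed once enough spam flags have accumulated.
        return self.spam_flags >= FLAGS_TO_DELETE


def handle_new_post(post: Post, looks_like_spam: bool, confidence: float) -> Post:
    """Quick yes/no check first, then the confidence check for automatic flags."""
    if not looks_like_spam:
        return post
    post.events.append("alert posted to chat")
    if confidence >= HIGH_CONFIDENCE:
        post.spam_flags += MAX_AUTO_FLAGS
        post.events.append(f"{MAX_AUTO_FLAGS} automatic flags cast")
    return post


def cast_manual_flag(post: Post) -> None:
    """A human agrees the post is spam and flags it."""
    if not post.deleted:
        post.spam_flags += 1
        post.events.append("manual flag cast")
```

Even in the high-confidence case the sketch leaves the post three flags short of deletion, which is the "someone has to agree with the machines" part.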

How to detect spam¶

What's spam detection look like? Over the last decade we've tried things like classification schemes, machine learning algorithms, and a handful of AI attempts. But, by far the most reliable has been...

Regular expressions.

(Take a deep breath fellow engineers)

Each post goes through thousands of regular expressions. Each expression is weighted based on how likely it is that a match means the post is spam. The posts with higher weights are posted into the chatroom, kicking off this entire process.
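As a rough sketch of weighted matching, the snippet below runs a post through a small rule list and totals the weights of the rules that hit. The rules, names, and weights are invented for illustration and are not SmokeDetector's real expressions, and adding the weights together is an assumption about how a post's overall weight is derived.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str            # hypothetical label, reported as the "reason"
    pattern: re.Pattern
    weight: float        # hypothetical weight


# Illustrative rules only; the real project maintains thousands of these.
RULES = [
    Rule("phone number", re.compile(r"\+?\d[\d\s-]{8,}\d"), 120.0),
    Rule("supplement keyword", re.compile(r"\b(?:keto|cbd)\s+gumm(?:y|ies)\b", re.I), 150.0),
    Rule("call to action", re.compile(r"\b(?:buy|order)\s+now\b", re.I), 60.0),
]


def score_post(text: str, rules=RULES):
    """Return (total_weight, matched_reasons) for a post body."""
    matched = [rule for rule in rules if rule.pattern.search(text)]
    return sum(rule.weight for rule in matched), [rule.name for rule in matched]
```

A post that trips all three rules scores far higher than one that trips a single weak rule, which is what lets the chat alert list both the reasons and a combined weight.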

The community has built watchlists and blacklists over the decade to help find these posts.

Watchlists are experimental checks. Spam evolves over time. It's actually pretty interesting to watch a dedicated spammer craft their posts to get it to last on the network more than a few minutes. These watchlists are designed to allow the team to test regular expressions without fear of automatically flagging something during testing.

Blacklists are finalized regular expressions that catch spam with a high number of true positives and very few false positives. These are the weighted spam checks.

Like Stack Exchange itself, the spam fighting community has built tooling that allows work to be done without needing a high-level user around. Users can add a watch for a new regular expression.

Users that aren't trusted just yet will have their request created as a pull request in GitHub that needs to be approved. Trusted users will get their watchlist entry automatically added to the system. The same holds true for blacklisted items.

But, watchlists and blacklists are only half the problem. The other half is validating that these are accurate. As posts are detected as spam, users provide a signal back to the system on whether a post is a true positive (tp, spam) or a false positive (fp, not spam). This feedback flows back to the watchlists and prevents inaccurate watchlists from being elevated to a full blacklist.
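One way to picture the feedback loop is a per-pattern tally of tp and fp reports that gates promotion. This is a sketch under stated assumptions: the `PatternStats` class, the `ready_for_blacklist` helper, and both cut-off values are hypothetical, and the real promotion decision is made by people reviewing the data.

```python
from dataclasses import dataclass


@dataclass
class PatternStats:
    pattern: str
    tp: int = 0  # true-positive feedback count
    fp: int = 0  # false-positive feedback count

    def record(self, feedback: str) -> None:
        """Record one piece of user feedback, either 'tp' or 'fp'."""
        if feedback == "tp":
            self.tp += 1
        elif feedback == "fp":
            self.fp += 1
        else:
            raise ValueError(f"unknown feedback: {feedback!r}")

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp
        return self.tp / total if total else 0.0


def ready_for_blacklist(stats: PatternStats,
                        min_reports: int = 20,
                        min_accuracy: float = 0.99) -> bool:
    # Both cut-offs are made-up numbers for illustration.
    enough_data = (stats.tp + stats.fp) >= min_reports
    return enough_data and stats.accuracy >= min_accuracy
```

With these example cut-offs, a single false positive in thirty-one reports is enough to keep a watched pattern off the blacklist.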

Sometimes, a post has features that the system doesn't detect as spam. In those cases, the community can manually report the post. This triggers the alerts throughout the chatrooms so that others can flag it and get it removed. It also allows the community to find potential patterns to watch for in the future.

Users that aren't trusted yet get pull requests created for their patterns. All of this can be handled and approved within the chatrooms. A lot of this system is built on top of, and keeps most users within, the Stack Exchange ecosystem.

I mentioned the weighted reasons on a detected post. When these are posted in chat, the reasons are posted along with the weight of the post. The one on the slide below is particularly bad. Generally anything over about 225-250 is spam, with higher numbers becoming more and more certain.

These weights shift over time and as a regular expression is utilized more. This keeps the system flexible.

For this particular post, the system determined it was spam and cast three automatic flags from our users. Each user that grants permissions for the system to utilize their flags - because they are responsible for the usage of the flags - can set their threshold for when to allow their name to be used.
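The per-user threshold can be sketched as a simple filter. The `Flagger` class and `pick_flaggers` function are hypothetical names, and taking the first eligible users in list order is an assumption made to keep the example deterministic; the cap of three matches the day-to-day configuration described later in this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flagger:
    name: str
    min_weight: float  # the user's own threshold for lending their flag


MAX_AUTO_FLAGS = 3  # day-to-day number of automatic flags (from the talk)


def pick_flaggers(post_weight: float, flaggers, limit: int = MAX_AUTO_FLAGS):
    """Return up to `limit` users whose threshold the post's weight meets."""
    eligible = [f for f in flaggers if post_weight >= f.min_weight]
    return eligible[:limit]
```

A user with a very high threshold only has their name used on the most obvious spam, while a post below everyone's threshold gets no automatic flags at all.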

The 4th flag here came in via a user script the community built, but was not automatically cast. The remaining two flags would have come from the users of the site or from someone that saw the SmokeDetector alert and manually flagged it. Metasmoke doesn't have a record of that because it didn't go through Metasmoke.

By the numbers¶

Let's look at some numbers.

SmokeDetector has been running since January of 2014. We didn't start recording stats until about 18 months later though, so the dates in the graph start in August 2015. Initially, the system didn't have watchlists, which is why you see the blue and orange lines are pretty close together.

Around mid-2018 to early 2019 we introduced watchlists. This was done because we started seeing persistent spammers. These were spammers that noticed their posts were being deleted quickly and worked to find ways to change the message to stick around longer.

The chatrooms are open and, based on some messages we have removed, it's obvious the room is watched by the bored spammers. The watchlists reduced the share of reports that were true positives, and because we never separated the data between blacklists and watchlists, the lines began to diverge.

In early 2017, autoflagging was introduced. With autoflagging the system can reduce the time on site for nearly half of the true positives.

You'll notice a major spike in the summer of 2022 and a dip in the summer of 2023. The spike was for a massive spam wave. This was the work of a spammer that had access to a lot of geographically distributed systems - which bypassed Stack Exchange's built in protections - and was a persistent spammer or team of spammers that watched the public chatrooms for changes the spam fighting community made to detect their posts. This went on for 2-3 weeks with thousands of posts being made, adjusted, and deleted. Ultimately, the spammer was blocked at the Stack Exchange level based on heuristics the Charcoal team presented.

This past summer, in 2023, the dip you see is because Stack Exchange experienced a crisis of confidence from the community at large. Moderation work stopped for the months of June and July in protest of the company's policies toward generative AI on the platform. Charcoal participated in that. While not fully resolved, some of the worst policies were reworked with input from the larger community and work resumed.

The goal of the Charcoal project is to remove spam quickly from the site. Flags that are cast by the system are tracked and we can clearly see that more automatic flags mean the post is active for less time.

When there are no system-cast flags, an average spam post lives for 21 minutes on the site. If the system casts all 6 - which is only utilized during a spam wave like in the summer of 2022 and with company permission - a post lives for 16 seconds. During day to day operations, the system is configured to cast 3 automatic flags. This was determined by a lot of conversations with individual sites around the network and what they felt comfortable with.

SmokeDetector has over 103 thousand commits to its repository over the last 10 years with 90 different code contributors. In the slide below, the top two graphs show that its rulesets are updated daily - except for this past summer.

Over the course of a 24 hour period, flags are automatically cast from nearly 420 different users around the network.

Finally, the entire goal: over 450,000 spam posts have been identified and deleted by the system and the community in the last decade.

Conclusion¶

You now have an idea of how one of the largest sites on the internet handles spam. I do want to point out that Stack Exchange operates very differently from sites like Reddit or YouTube or Facebook, which spent a lot of company time building their anti-spam systems. Stack Exchange built basic protections themselves and then saw the technical community members step up and take on the challenge.

Stack Exchange Strike - The strike is over

2023-08-07T12:00:00-05:00

Introduction¶

The Stack Exchange strike is over. It took two months and two days of inaction on the part of community members and moderators to resolve the strike. The results of the negotiations were posted last week. Over the weekend the community voted that the goals of the strike had been achieved and called the end of the strike.

So, what was achieved? What did more than two months of community upheaval get the users of Stack Exchange?

AI Generated Posts¶

Effective immediately, Stack Exchange has agreed to allow the removal of content based on a combination of strong and weak indicators of GPT usage. Additionally, the original policy that was the straw that started all of this has been both publicly released and declared invalid.

This release took two months and was the cause of the strike beginning. It drove a wedge between the moderators of Stack Exchange and the company and its employees. I do not understand why it took so long to release this. The public answer is that there was initial moderator pushback on releasing it but that was rescinded within days. Like much of the public communications during this, Stack Exchange clung to outdated information.

Something that I think is amazing is that this private policy was never leaked in full during the strike. Despite what the company was saying about moderators and the community, the community abided by its belief that agreements - private moderator space - should remain just that: private. I applaud the moderation community for this commitment to their ideals.

Data Dumps¶

The Data Dumps were turned off back in March. They were re-enabled several weeks into the strike as one of the first concessions. However, it's important to note that a former employee has presented that this was not an unwavering commitment. They had been contacted by the Stack Exchange CEO in March to disable the scheduled data dump.

The company officially states that it's committed to the long-term (foreseeable future) survival of the data dumps, the API, and SEDE [Stack Exchange Data Explorer].

This is good. It is concerning, though, that this action was taken without informing the community and was only discovered after the fact. This type of "ask for forgiveness" behavior from Stack Exchange is common and is a concern for me.

Moderation Agreement Changes¶

Stack Exchange agreed to a review period for binding policy changes and policies must be made public. This is another big one that caused the strike to move forward. I'm happy to see this has been resolved in a way that benefits the community and transparency. Time will tell if this holds true, as it's only something that we can say is effective until it is no longer effective.

Assuming that the agreement holds though, a review period will be good for the moderators that are expected to enforce changes. It will give them time to get clarifications and be prepared for the discussions on meta.

The company also agreed to update their press policy. New statements to the press must get at least one member of the community management team to sign off on the statement and statements must be as general as possible. This works in tandem with the existing policy put in place in 2019, where statements won't discuss an individual moderator without written permission.

This is another one whose effectiveness we can't judge until it's been broken. Unfortunately, the press policy from 2019 wasn't enough to prevent some statements to the press this time that presented moderators in a very unflattering light. Even though none were identified by name, the broad statements were taken out of context in a couple of instances.

That said, there was an apology for that from the Vice President of Community:

I would also like to take this opportunity to extend my most sincere personal apologies to mods who felt that in our previous text we were accusing them of racism. While that was not the intent of the text that I wrote (nor did that sentiment reflect the feelings of anyone involved in drafting the text), I can understand how it could be read that way, and I regret that we allowed it to be published like that. You have my sincere apologies, which I will also deliver in person at the upcoming mod/staff meetup.

Violations of the Moderator Agreement¶

The current moderator agreement does not lay out a process for determining if the company violates the agreement. The strike representatives and the company agreed to an outline for such a process. If Stack Exchange is found to have violated the agreement, actions taken and comments made during the violation must be retracted and nullified and a public apology must be made detailing the violation.

This is one that I hope we don't have to see utilized. Again, time will tell if it is something that will occur.

There is also a process where the moderation team can vote on if a violation occurred. Exact numbers are still being determined, but essentially a minimum number of moderators must vote on if a violation was committed and from that a minimum amount must vote that a violation occurred.

While I'm unhappy with the percentages used as placeholders, the discussions here continue. I am happy with the process and the proposed actions if a violation is determined.

Stack Exchange Processes¶

Several internal changes were negotiated as well. Each of these was around how the company communicates. These include being transparent that the strike occurred, collaboration with the community instead of fighting it, public policies, and clear communication.

All of these are positive changes and can only be evaluated over time. There are already signs that some are taking place with public releases of policies, acknowledgement of the agreement itself and discussions around the final details.

What do I think?¶

I've been pretty pessimistic over the last two months. Stack Exchange appeared to be following in Reddit's footsteps in some cases. I was afraid that the company would start replacing moderators in a few months - either when the automated systems started flagging inactivity or when the products announced at the developer conference last month started appearing on the site.

I'm still pessimistic about the future of the platform, if I'm being honest with myself. I do not think generative AI will be a benefit to the community, as it exists right now. By the time it's to a point where generative AI doesn't make things up, it'll be too late. The bad data and information will have already been on the site and trust will be gone.

The policies that can't be measured until a violation occurs also have me concerned. We went through this cycle back in 2019 with a strike being very narrowly averted then. Unfortunately, the same things that occurred then triggered the issues now: a private policy and talking to the press. Both were supposed to be resolved then. Both are supposed to be resolved now. Unfortunately, we won't know if that's the case until the policies cause another problem.

So...am I sticking around as a moderator? There is a virtual moderator meet up later this month with the CEO of Stack Exchange. I think that will be when I make my final determination. However, two months of not moderating was a welcome break.

Stack Exchange Strike - Community and AI Talk Reaction

2023-07-27T12:00:00-05:00

Introduction¶

I really was trying not to post anything about the strike again until it was resolved. There was a long period of no action in July so my assumption was that it was being worked on behind the scenes with the company and the designated representatives. Turns out it was just quiet for a couple of weeks with no movement. One important development was the release (finally) of the private policy on GPT Generators. This is significantly different from the one that was released publicly and was part of the cause of the current moderator strike. The community's response to this policy was as expected - surprise at how bad the policy was, even with moderators saying it was bad.

Today, Prashanth Chandrasekar - CEO of Stack Overflow - presented the next vision for Stack Exchange. The presentation is available on YouTube. It starts around the 9 minute mark.

Reaction¶

Welcoming¶

You will continue to be the focus. 100%. We are here to serve you. To fight on your behalf. To make sure you are recognized for the work you are doing and that you are able to responsibly harness the power of AI for your needs to be the best developers you can be. All of this is centered around the collective community and knowledge. That is irreplaceable.

This is a good start. It rings hollow to me right now though. These last two months in particular have shown that the community is easily tossed aside. 2019, and its literal legal settlement with a community member for tossing them aside, shows that the current situation isn't a one-time thing. We'll see what the rest of this talk covers, but right now this feels like pandering to the audience.

Guiding Principles¶

Find new ways to give technologists more time to create amazing things.

Good. I'm all about efficiency for myself and my teams. If we can make developers' and engineers' lives easier, that's a win for me.

Accuracy is fundamental. That comes from attributed, peer-reviewed sources that provide transparency.

Spoiler - I received a sneak peek of a portion of this presentation two days before the official one in Berlin. The information that will be shown later in the presentation will cover some of this. It doesn't address all of it though. I'm pleased with the attribution aspect of this. Stack Overflow content is licensed under CC BY-SA. Attribution is required and I approve of any effort to improve that.

I'm concerned about accuracy. Their own test of the GenAI powered formatting assistant last month showed this is a problem. Current GenAI tools - ChatGPT, Bard, etc - are well known for making information up instead of saying "I don't know".

The technology field should be accessible to all, including beginners to advanced users.

Weird ending phrase, but I agree. Technology fields are not limited to engineers, software developers and people with the desire to learn how to program. Technological improvements should come from anyone who wants to make life easier. AI can and will help with that.

Humans should always be included in the application of any new technology.

Again, agreed. My concern here is whether or not Stack Exchange has the capability to do that. Technically, they've had a human involved, but it's only been from a single perspective - that of the company. They have ignored the other side of this: the community. The people that provided the data for their business to thrive.

Overflow AI¶

insert applause

This has been in development for "the past three months". It also covers 6 items being talked about today and 6 additional items. That time frame is concerning. Three months is not a lot of time to build a lot of these larger items.

Search: Summarized version of multiple answers with citations from where the answers came from. It also allows you to continue the conversation with the results - including code. It also allows the user to post the question to Stack Overflow if the generative AI portion gets stuck.

My concerns lie with how this impacts the community. Hand waving away that the system is able to synthesize an answer and properly attribute it, what happens next? Stack Overflow has a reputation - deserved or not is up to the reader - for being harsh on new users that don't "try". If a new user comes and gets an answer without posting the question that's wonderful for the user. They move on and complete what they need to complete. For the community, though, we have a problem.

I find it unlikely that a new user is going to post a question to Stack Overflow when they received an answer. Why would they? They already got the answer they need. If this repeats constantly, the knowledge base that Stack Overflow represents becomes significantly less useful. Fewer questions mean fewer answers. With fewer answers, the platform becomes more dependent on the trained AI model which is receiving less data to feed it. The system spirals.

Does this only matter to prevent duplicates though? If the system is only summarizing other topics, wouldn't that mean the question being asked is a duplicate? I don't think that's the case. I think with the "continue discussion" portion of this, Stack Overflow will be losing out on better written and described problems. Problems with more detail that the community could benefit from seeing and answering.

The Visual Studio plugin looks interesting. I don't think it's adding much that other plugins can't do already, other than providing a first party integration to Stack Overflow. The same is true for the Slack / Stack Overflow integration.

The Enterprise knowledge ingestion is an interesting product. While we don't use Stack Overflow for Teams at my current company, this would be a great way to start using it. The ability to load initial data into a new system is important. I'd love to see this in practice and how well it does at building initial Q&A pairs from the data it ingests.

I haven't looked into Stack Overflow for Teams for my professional work. A couple of comments that were made do raise a few concerns about how Stack Exchange is using company data. Before making a recommendation to utilize it for my company, I'd need to verify that data from my company is not made available outside of my organization.

Finally, discussions. Stack Exchange is adding a forum to their Collectives product. The community has spent over a decade working not to be a forum. This change is not great in my opinion. Forums, like social media, are a different beast than Stack Overflow.

Final Thoughts¶

I am concerned about the direction Stack Exchange is taking and the employee resources they are using to go that route. Engagement, page views, and traffic have been a huge underlying concern at Stack Exchange for a while. It was particularly noticed when ChatGPT was released and with the comments about driving users away. However, this downward trend has been ongoing for years.

I don't think GenAI is going to solve this problem. I think it will improve new user experience, but the way this has been presented it will be to the detriment of the larger community. Fewer questions will exist to answer and that will slowly cause users to disengage from the site.

This is why the opening of the talk feels like lip service to the community to me. These changes are designed to put a barrier between users of the community. Even the discussions product has the barrier. Discussions will only be available in Collectives, not to the general community.

Stack Exchange Strike - Now AI is bad? Does Stack Exchange know what it is doing?

2023-07-04T23:30:00-05:00

Introduction¶

My previous posts about the ongoing moderator and curator strike on the Stack Exchange network can be found linked at the bottom of this post, or by visiting the Stack Exchange Strike category on this site. I'd post a summary about what's happened in the last ten days, but there is nothing to report. There are discussions, but no agreements. The appointed Stack Exchange employee empowered to talk with moderators stepped back and is not participating any longer.

Tomorrow marks the one month point. We are hours away from 10,000 pending moderator flags on Stack Overflow. This is up from 78 (yes, two digits) in mid-May. The way this has gone down, the lack of progress, and the continued mischaracterization of moderators to the press haven't motivated me to spend my free time volunteering in the least, though. I still have this feeling that Stack Exchange is looking at the recent Reddit protests, with their demand that moderators return to the community, and wondering if they can replicate that here.

New confusion¶

On July 3, 2023 Stack Overflow published a blog post entitled: "Do large language models know what they are talking about?". Spoiler: the conclusion of the article is "Nope."

But that's not the interesting thing. The interesting thing is how this answer is presented. The very last paragraph of the post cuts to the heart of the matter that moderators on Stack Overflow raised in December when we banned ChatGPT.

Treating AI-generated information as purely actionable might be the biggest danger of LLMs, especially as more and more web content gets generated by GPT and others: we’ll be awash in information that no one understands. The original knowledge will have been vacuumed up by deep learning models, processed into vectors, and spat out as statistically accurate answers. We’re already in a golden age of misinformation as anyone can use their sites to publish anything that they please, true or otherwise, and none of it gets vetted. Imagine when the material doesn’t even have to pass through a human editor.

We saw this in action with ChatGPT. We still see it in action with ChatGPT and it's still a problem users are becoming more aware of as the strike continues. We saw it when Stack Exchange tried their formatting assistant on Stack Overflow. What I see here is Stack Overflow admitting that the moderators are correct, in public.

The other interesting thing about that paragraph is that it links to an article from The Verge that quotes the Stack Overflow moderators and the decision to ban AI. It also has this dig at Stack Exchange executives:

The mods say AI output can’t be trusted, but execs say it’s worth the risk.

Their own post is explaining why it's not worth the risk.

What's this mean?¶

I see this as more communication failure on Stack Exchange's part. In an update I posted weeks ago, I linked to internal emails that were leaked.

How are we messaging this? Who is allowed to post and respond to questions and comments on Meta, chat, social media, etc?

The Community Leadership Team ([redacted]) are working together in close coordination with Marketing ([redacted]) on comms. They will post and respond to questions on-site. Unless you are specifically tapped to respond to something please do not engage. It is best to avoid commenting on anything related to this action on site, even if you think you have something helpful to add. Please get review and approval from Philippe prior to posting on site, or from [redacted] if you are approached off-site.

Someone, somewhere, didn't realize what this blog post was about or what it linked to.

But, nothing changes with this. The company has dug in hard on forcing GenAI onto the sites and is marching toward an announcement of some kind in late July 2023 about AI. In the meantime, I can only see blog posts like this one as an indication that Stack Exchange doesn't know what it is attempting to build toward and, at the same time, has come to the conclusion (or at least a team within Stack Exchange has) that GenAI isn't to be trusted.

Just like the community said back in December and continues to say now.

Stack Exchange Strike - How does the company regain my trust?

2023-06-23T23:00:00-05:00

Introduction¶

This post was originally posted on Meta Stack Exchange, the network wide "Meta" for the Stack Exchange network. This meta site and the "child meta" sites mentioned in the post are utilized by community members to discuss the network itself. This is where questions about how the individual site or the network as a whole works are posted, where policies are determined, and where questions about questions are discussed. This is the location that the community has to make their voice heard.

This post is my answer to the question "What is needed for users to trust the Stack Exchange company?" It's been edited slightly to fit the format of this blog.

How will my trust be regained?¶

Short version¶

TL;DR: I'm not sure and that's a bad thing for me and for the community.

Before I begin, I'm not going to segment the company into various groups. I've gotten the impression from moderator representatives that this is a bad thing and they are offended by this segmentation. I have no desire to further that, so "Stack Exchange" in this case refers to both the company as a whole and all employees.

Author note: This complaint was relayed by the moderator representatives from Stack Exchange during discussions. It seems that using phrases like "management" and "leadership" is being interpreted as "good staff" vs "bad staff" at Stack Exchange. While I disagree with this, to me it's not worth the argument thus it's just "Stack Exchange" for me from now on.

Who am I?¶

For context, I've been on the network for nearly 14 years. I'm a moderator on Stack Overflow, Hardware Recommendations and Community Building. I have built automated tooling to flag comments (at one point accounting for 15% of the comment flags raised on Stack Overflow in a year), and I am an admin on the community led Smoke Detector (spam detection) project. In short, I know this network, the tooling it does and does not have, and various communities across the network. My time here is voluntary. Time that I, until recently, was happy to provide without much of a thought. I've had very interesting discussions with fellow moderators and Stack Exchange employees throughout my time here.

Why don't I trust Stack Exchange?¶

What is needed for users to trust the Stack Exchange company

Stack Exchange has gone through this cycle before. I've written about it in those past cycles for anyone who wishes to go through my profile and find previous thoughts. Each time, less of my energy comes back as we - community and company - reconcile and bury the problem in the sand.

The last major cycle ended with a lot of lawyer language, including the new moderator agreement that every mod had to accept to retain their diamond. This cycle started with a violation of one of the provisions of that agreement by Stack Exchange. It received a, in my opinion, flippant "Oops, that was my fault" by the Vice President of Community at Stack Exchange.

This tells me that the legal agreement is completely one sided and Stack Exchange feels comfortable violating it without repercussion. If I, on the other hand, had violated a term in the agreement I'd be forced to hand in my diamond. This has eroded a ton of trust I have with the company.

In the announcement regarding how Generative AI can and should be moderated and in statements to the press, there has been disparagement against the moderators of the network. To me, the subtext of all of that reads as "we don't trust you to moderate correctly". If the company does not trust us to perform activities we've either been elected or appointed to do for our community, why are we still here?

Combining this with the incredible way this cycle all started and the fact that none of this mistrust was known by the moderation team, my trust of the company took a hit. This policy was announced at the end of May. Data was shared several weeks ago. In all of that, there are allusions to improper moderator activity and hints that moderators are banning so many people that engagement across the platform is down. It wasn't until yesterday (nearly a month) that moderators saw any discussion of these "improper bans". It was just...silent. This big, massive problem that could have been talked about back in February or March was just tossed into the public eye with the implication that moderators are doing the wrong thing. Then it took nearly a month for a conversation to begin.

The Stack Exchange network has lost at least four months of time where this "moderation problem" could have been discussed, policies adjusted, and the company educated, by moderators who deal with generative AI on their sites on a daily basis, on how it's actually being detected. Instead, an easily disproved lie about using ChatGPT detectors has been blamed and shared repeatedly with the press as the reason for their sudden policy change.

My trust level of the company takes several hits here too. I dislike being lied to and I really dislike being lied about.

Finally, the method of communication throughout the last month. The company has a team dedicated to managing the community. There have been many questions on this site and on child metas during the moderator strike. I have seen very little coming from the community management team to answer these questions. The community has questions and the company is not providing answers to them. Instead, we see announcements on topics that the community is against being announced. Long discussions, in public, are not occurring though. Which erodes my trust even further.

What's this all mean?¶

Where am I today then? How does the company rebuild my trust in them? My answer to that is that I don't know. This past month has eroded so much of my faith in the company to be the trusted repository of knowledge that it was in the past. It's also removed much faith that the company actually cares about the community. Much like the previous cycle we've seen details come out that reflect poorly on the company and employees attempt to respond to that only for more details to come out that make the response look like lies.

14 years is a lot of time to spend some place and not have strong feelings about it. It makes simply accepting negative changes impossible and it makes walking away difficult. That's part of why I'm still here. The other part is the communities I mentioned in my introduction. I have made friends and acquaintances across the network and the sense of community that used to exist gives me a strong desire to remain. But, this isn't something that will hold the community as a whole together. I am an outlier in terms of a user on the network. Honestly, everyone reading this on Meta is an outlier.

The company's goal is engagement and traffic. I guarantee the moderation team has not banned enough users to bring down the traffic the network has seen since December. But, we are the scapegoat at least right now. We're nearing a month and there are users with access to site analytics. The number of bans has been close to 0. Theoretically, traffic should be recovering if we were the problem.

Stack Exchange Strike - Personal Frustrations

2023-06-21T11:00:00-05:00

Recap¶

The Moderation and curator strike started on June 5.
Stack Exchange has downplayed the effect of this to the press, while at the same time straight lied to the press about causes of the strike.
Stack Exchange removed access to the data dump back in March but never told anyone until they were called out on it in early June.
The moderation team has elected three representatives to engage with Stack Exchange to solve these problems and end the strike.
Stack Exchange launched the GenAI powered formatting assistant. The community quickly showed that it's very bad at its job and it was shut down temporarily.

What's new?¶

Yesterday Stack Exchange announced the upcoming launch of a new Prompt Design site. To say the community disliked the idea would be an understatement. The community has pointed out that this will be incredibly niche, with very short term answers because models update constantly. There are also concerns that this will essentially become a "write my prompt" (instead of "write my code") site.

I agree with all of those concerns.

Also hidden in this announcement is another announcement. Stack Exchange is changing their method of launching new sites. This is something that should have its own announcement. This is a big deal and the community has been asking for improvements to the Area 51 incubator site for a decade. Unfortunately, the new method is to completely do away with all of the work that Area 51 does - building a community, setting initial questions, helping to set scope, ensuring there is an audience for the topic - and instead launch with a "Community Stakeholders" group that will do the work in either a private Stack Overflow for Teams instance or a read only chatroom.

Both of those options entirely exclude people that may want to participate but don't have access. It adds barriers that don't exist on Area 51.

My thoughts¶

The announcement sets a launch date of July 26 for this new site. This is the day before the CEO talks at WeAreDevelopers World Congress. A venue where the company has been promising a major AI announcement for months. This new site, combined with the formatting assistant failure from last week, is starting to show clearly what the company wants to do here.

I mentioned in my last post that I was becoming more pessimistic that this strike doesn't end with resignations. That continues to hold true. I also mentioned that Stack Exchange seems to be keeping an eye on the Reddit strike - first with the blackouts, then with the John Oliver protests and currently with the NSFW toggles to prevent ads. Last night, Reddit started removing moderators as a result.

With the lack of updates and communication from Stack Exchange to the moderators and curators in the last week, I can't help but think that something similar is being discussed at Stack Exchange. Time will tell, but my feeling is that Stack Exchange is going to plow ahead with GenAI content on their platform. They are going to burn 15 years of trust and quality content and they are going to do it regardless of what the community wants. If the community protests, they will be shown the door.

Obviously, I'm still pretty pessimistic about all of this.

Stack Exchange Strike - Does Stack Exchange Care?

2023-06-19T14:00:00-05:00

Recap¶

The Moderation and curator strike started on June 5.
Stack Exchange has downplayed the effect of this to the press, while at the same time straight lied to the press about causes of the strike.
Stack Exchange removed access to the data dump back in March but never told anyone until they were called out on it in early June.
The moderation team has elected three representatives to engage with Stack Exchange to solve these problems and end the strike.

Where are we now?¶

Last week, I ended with a note that the data dumps should be restored by June 16. Good news. That has been completed and the dump was uploaded by June 16. It's progress, but after two weeks we have only accomplished one out of four tasks on the list of conditions to end the strike.

A retraction of the prohibition of moderating GPT content.
The private policy on GPT content that was issued to moderators must be revealed publicly.
The data dumps must be re-enabled and SEDE and API access guaranteed.
Stack Exchange, Inc. must communicate, gather feedback, and act on that feedback before making major policy or software changes to the public platform.

I've heard rumors that the second bullet may occur, but nothing has been done publicly. Thus, there is nothing outside of feedback from representatives to go on here.

GPT Content¶

On Thursday, June 15, Stack Exchange enabled their "formatting assistant". To say it went poorly is an understatement. There are currently 52 answers to that question showing how it doesn't work. It's a thin wrapper around a version of ChatGPT or GPT4. It's also very, very bad at being a "formatting assistant". Instead, it's rewriting content, butchering code being asked about, making stuff up, answering questions and doing everything other than making formatting better. One user, Mithical, found the prompt the formatting assistant is using.

The one small positive that came out of this is that Stack Exchange did communicate with the community a few days before this was released. This isn't enough though.

Communication¶

Why isn't one post enough?

Behind the scenes it's become clearer to me that several staff members of Stack Exchange don't wish to engage with the community. I briefly touched on one of these people in my last update. The public statements to the press have sounded out of the loop, and the internal reactions on moderation channels have been complaints that moderators are too negative.

Of course moderators are negative right now. There is a strike going on because they are unhappy. The feedback from representatives continues to be filled with road blocks.

Where do I sit today?¶

I think it's becoming more clear that Stack Exchange is not interested in removing Generative AI content from its site. It's actively building and promoting a tool that utilizes ChatGPT under the hood. I am very surprised that they pulled the plug on the formatting assistant after two days. Previous negative feedback has been ignored and I fully expected this one to be as well. The problem with pulling the plug on this is that the CEO has committed to exciting announcements about AI this summer. If the community just showed that one of those AI projects was a flop, they are going to go even harder at getting the next one to succeed.

As this drags on into its third week, I've become more pessimistic that this strike doesn't end with resignations. Reddit had a two day strike during this time period, and they are already threatening to replace community moderators. The Stack Exchange CEO has expressed their fondness for Reddit on occasion, so I suspect that the action being taken over there is being considered here. Of course, the difference here is that Stack Exchange wasn't effectively shut down by the strike like Reddit was. Sites didn't go dark. Instead, curators and moderators stopped curating and moderating. Everything is still available, the sites are still working, it's just less tidy than usual.

Stack Exchange Strike Update 2

2023-06-13T23:00:00-05:00

Summary¶

Yesterday's update summarized the first week of the Stack Exchange Moderator strike. The strike began last Monday with an open letter to Stack Exchange.

Over the weekend, the moderators and community curators elected three representatives to talk with Stack Exchange. Those talks started this week. They went into these discussions to reiterate the four conditions to end the strike.

A retraction of the prohibition of moderating GPT content.
The private policy on GPT content that was issued to moderators must be revealed publicly.
The data dumps must be re-enabled and SEDE and API access guaranteed.
Stack Exchange, Inc. must communicate, gather feedback, and act on that feedback before making major policy or software changes to the public platform.

So, where are we now?

The update¶

Representatives¶

Discussions with Stack Exchange started off less than stellar, in my opinion. Quoting from the Vice President of Community at Stack Exchange, as relayed to those of us not in the discussions:

"So in summary: Cesar is my delegate for issues here, while I reserve final decision making to myself I"ve vested him with broad discretionary authority and we're meeting on a frequent (daily or multiple times daily) basis to clear any differences between us."

My take away here is that the Vice President of Community - a person whose job should involve dealing with the community - doesn't think this is important enough to attend. While I've worked with Cesar in my role as a moderator, this is just making Cesar the car salesman that has to "talk with the manager about your offer". It's a way for the company to drag this out and a way to make the real decisions without moderation input. I fully expect to hear from the community representatives that Cesar liked a proposal but the VP did not, and that the VP wasn't around to discuss why not. I'd love to be proven wrong on that though.

Data Dumps¶

Data dumps are the third bullet on our list of things that must be restored. Good news! Stack Exchange will have those restored by June 16, 2023.

This was posted (twice) by the VP of Community - the one not attending the talks above.

[...] Our intention was never to stop posting the data dump, only to begin to collect more information on how it was being used and by whom - especially in light of the rise of LLMs and questions around how genAI models are handling attribution. However, it’s clear that many individual users (academics, researchers, etc) have an immediate need to access updated versions of the dumps. So we are re-enabling the automatic data dumps (and uploading the one that’s about a week overdue). We believe that this can happen by end of the day Friday. We will continue to work toward the creation of certain guardrails (for large AI/LLM companies) for both the dumps and the API, but again - we have no intention of restricting/charging community members or other responsible users of the dumps or the API from accessing them. [...] In the meantime, the data dumps will be re-enabled by end of day Friday. We will communicate here when that has been completed or if there are any delays. We will also post here prior to making any future changes to the dumps or distribution of the dumps.

I suppose now we wait to see if there are any "delays" before Friday.

This message was confirmed by one of the Co-Founders that has since left Stack Exchange and originally committed to these data dumps back in June 2009.

I have confirmation via email from Prashanth that this is, indeed, the new official policy. I'm glad to see it. Creative Commons is part of our contract with the community, and it should never be broken -- however, CC does need to address the AI issue in an updated license, in my personal opinion. [...] - Jeff Atwood

I am happy with this concession and confirmation of the concession from our representatives, Stack Exchange and a Co-founder.

However, it's telling that once again it's Philippe making statements that are lies.

He's done it with posts to the press that I mentioned yesterday (moderators are depending on GPT detectors!). He's done it with the internal emails to his own coworkers. He's doing it again here.

Our intention was never to stop posting the data dump...

This directly contradicts the statement provided by the Stack Exchange Chief Technology Officer last week.

Stack Overflow senior leadership is working on a strategy to protect Stack Overflow data from being misused by companies building LLMs. While working on this strategy, we decided to stop the dump until we could put guardrails in place.

For being a VP of Community, the ability to communicate with the community is greatly lacking.

AI Policy¶

Not much more progress has been reported by the community representatives. From what I have seen, Stack Exchange is pushing to call an end to this with the promise of a new policy. But it's not done yet. They'll work on it with the moderators and once that's done, that will replace the current policy that started the strike. The representative mentioned they were pushing for a deadline on how quickly moderators would be able to commit to this change.

In my opinion, this is a way to end the community's moderation strike and agree to essentially nothing. It's another promise that something will happen. It gets the community back to moderation (which Stack Exchange employees have been doing for the week), and if they break that promise the effort to re-organize action has to start all over again.

Right now, I'm not agreeing to go back to utilizing my free time to perform moderation duties without knowing what the new policy is.

Where do I sit?¶

Much like yesterday, I continue to re-evaluate my relationship with Stack Exchange. I'm really happy that the data dump has been restored. The messaging around it though continues to erode my trust in the company's actions. This was also one of the easier items for Stack Exchange to agree to, even though it looks like a co-founder may have had a hand in resolving this as well. I don't know if that's true and I appreciate the work the representatives have done to resolve our first point of contention.

It's also very telling that the messaging doesn't mention the restoration in the context of the strike at all. If the company was attempting to build goodwill in this environment, I'd think they would point to the conditions in the open letter and tie the enablement of the data dumps directly to that. Instead, we got a statement that says they didn't intend to stop posting the data dump, directly contradicting a previous statement saying senior leadership decided to stop the dump.

Amazing.

Stack Exchange Strike - Week One

2023-06-12T09:00:00-05:00

Summary¶

A week ago, many Stack Exchange diamond moderators began a moderation strike. They have been joined by power users and curators. At the time of this post, there are over 1200 users that have signed the open letter to Stack Exchange.

This strike has been covered in a few news articles, but I suspect this week's actions against Reddit and their new API pricing changes will overshadow Stack Overflow for a little while. That's fine with me. Perhaps cooler heads will prevail when there is less public focus.

The Stack Exchange strike has been covered:

On Gizmodo
On Vice
On DevClass (with an interview from a fellow Charcoal power user and Stack Overflow moderator)

The important thing in this set of articles is the public statement that was released by Philippe Beaudette, Vice President of Community (taken from the Vice article above).

A small number of moderators (11%) across the Stack Overflow network have stopped engaging in several activities, including moderating content. The primary reason for this action is dissatisfaction with our position on detection tools regarding AI-generated content. Stack Overflow ran an analysis and the ChatGPT detection tools that moderators were previously using have an alarmingly high rate of false positives.

We stand by our decision to require that moderators stop using the tools previously used. We are confident that we will find a path forward. We regret that actions have progressed to this point, and the Community Management team is evaluating the current situation as we work hard to stabilize things in the short term.

They doubled down on this explanation in two meta posts. The initial statement and a post with "data". I encourage readers to spend a while reading through that second link and the answers. The community is skeptical of the conclusions drawn and has counter arguments and data scattered in the answers.

Finally, during the week it was discovered that Stack Exchange stopped their quarterly data dump of all content. This was announced after a former employee stated that the data dumps were turned off in March.

Stack Overflow senior leadership is working on a strategy to protect Stack Overflow data from being misused by companies building LLMs. While working on this strategy, we decided to stop the dump until we could put guardrails in place.

We are working on setting up the infrastructure to do this correctly in the age of LLMs --- where we continue to be open and share the data with our developer community but work to set up a formal framework for large AI companies that want to leverage the data.

We are looking for ways to gate access to the Dump, APIs, and SEDE, that will allow individuals access to the data while preventing misuse by organizations looking to profit from the work of our community. We are working to design and implement appropriate safeguards and still sorting out the details and timelines. We will provide regular updates on our progress to this group.

Where are we now?¶

With the summary out of the way, where do we sit now?

As of midnight today, the users that have stopped moderation activities have selected three representatives to be our voice in conversations with Stack Exchange and listed the conditions for ending the strike.

A retraction of the prohibition of moderating GPT content.
The private policy on GPT content that was issued to moderators must be revealed publicly.
The data dumps must be re-enabled and SEDE and API access guaranteed.
Stack Exchange, Inc. must communicate, gather feedback, and act on that feedback before making major policy or software changes to the public platform.

I want to point out that there is nothing here about GPT detection tools. That's because this isn't the reason for the strike. Despite what Stack Exchange has said in their messages to the press, this isn't about detection tools. (I also have an answer on that question about the policy's origination.)

The discovery that the data dumps were turned off has angered many people - those both already involved and others that learned of it this week. The fact that these were turned off over two months ago and nothing was said to the community has made the situation even worse.

What's Stack Saying?¶

This section was added after original publication

Shortly after publishing my thoughts with this article, internal Stack Exchange emails were published. These show how Stack Exchange is communicating this with their employees.

I highly encourage everyone to read those too. There is a lot of reading scattered around to get the full scope of how unhappy the users of Stack Exchange are.

I think it's telling that the company managed to copy and paste from the strike letter, but at the same time managed to completely ignore that this isn't about GPT detectors. Even worse, the company is spreading that falsehood to its employees and the press. On top of that, it has two teams - Community Leadership and Marketing - working on communications, yet no progress has been made.

Where do I sit?¶

I mentioned last week that I've been here for over 13 years. I've applied and interviewed at the company. I've made friends with the employees and gotten recommendations during those interviews. I've built tooling to help moderate comments and eliminate spam on the network. I've been here a long time.

In 2019, I reevaluated my role on the network during Stack Exchange's last screw up. In that one, they managed to libel a moderator, by name, to the press. This event still reverberates through the network today and serves as a brick that this current situation is built with. Stack Exchange destroyed 10 years' worth of trust and community relationships in that event. They've tried to rebuild it over these last three years and have been marginally successful. That's gone again.

Now I'm reevaluating how I utilize my free time again. It's not constructive to say I'll hand in my diamonds and walk away if the conditions above aren't hit. But, it's worth noting that I agree with all four of those conditions. This is my free time I'm donating and if the organization I'm donating that time to has changed their philosophy, I will take that into consideration as I reevaluate.

I think we are long past the point of "how it used to be" at Stack Exchange. The question I am asking myself is whether or not I agree with the new direction the platform is going.

Updated after the email publication, above

Reading the email and FAQ that Stack Exchange sent to their employees, I am struck by how out of touch members of the company are. Not the people that don't interact with the community. The people that should know the pulse of the moderation teams, the community opinions, and users in general - like the Vice President of Community. While I'm not surprised that they are downplaying the strike, I am surprised that they are flat out lying to their employees.

Stack Exchange should be called out on that and their employees should know that it's happening. Stack Exchange, Stack Overflow, Inc., is lying to their employees. The email presented was written after the strike started and contains information about all of the FAQ items mentioned.

Joining the Stack Exchange Moderator Strike

2023-06-05T01:00:00-05:00

Summary¶

I'm signing an open letter to Stack Exchange because those of us that volunteer our time and energy have been put in an impossible situation. On Stack Overflow we banned ChatGPT created content almost immediately after it was released. Theoretically, that is no longer in effect.

This isn't due to a change in community perception. Instead, it's due to an abrupt policy change on Stack Exchange's part that was posted on May 30. It's important to note that this public policy does not match the guidance that was provided privately.

The relevant portion of this public policy is:

In order to help mitigate the issue, we've asked moderators to apply a very strict standard of evidence to determining whether a post is AI-authored when deciding to suspend a user.

The private policy notes that "very strict" is essentially "don't moderate unless they explicitly say it was created by an AI".

Background¶

The Stack Exchange CEO posted a blog post at the end of May 2023. In this post, they stated:

Approximately 10% of our company is working on features and applications leveraging GenAI that have the potential to increase engagement within our public community and add value to customers of our SaaS product, Stack Overflow for Teams.

This goes against the community desire to not have GenAI content on the site. The CEO has not provided any feedback to the community, other than a note that there is a big summer project with GenAI. The community reaction hasn't been positive.

Combining this announcement with the policy announced the previous day, and an astonishing inability to articulate more details or reasoning, has produced a lot of unhappy moderators.

Additionally, I feel that the company has destroyed 3 years of rebuilding trust. Back in 2019, they destroyed this trust and nearly had a moderator strike then too. That involved providing information to a journalist that had very limited context and, due to that, presented a single moderator in an unflattering light. The feelings from that have taken years to rebuild and even today that incident is cited as a low point, and users can point to that singular incident as when trust of the company plummeted.

The public announcement to not moderate GenAI content contained this:

Through no fault of moderators' own, we also suspect that there have been biases for or against residents of specific countries as a potential result of the heuristics being applied to these posts.

That's just wrong. Absolutely wrong. It is 100% inaccurate and Stack Exchange has offered no data to back this up. User country of origin, region in the world, or any kind of physical location is not available to moderators. We are presented a flag and given information about the content that has been flagged.

Why am I participating?¶

So...my participation? Why am I joining in?

Stack Exchange, and Stack Overflow, thrive due to the human element. Its userbase has been around over a decade and answered millions of questions across over 180 sites. The week ChatGPT came out, the community saw the bad results it can provide. For the past 6 months, we've continued to see how bad that is. The moderation teams across the network worked to generate a policy and get it approved by the company. I've reproduced it in full below, but as of this post it is still on the site and contradicts the new policy.

In addition to completely ignoring the community's input on how they do not want GenAI on the site, the company ignored their own moderator agreement.

With the assumption that the above will change at some point, the relevant section is pasted here:

```
Stack Exchange, Inc. agrees that it will:

i. Respect your privacy per the terms of the Privacy Policy for the Public Network.
ii. Get your explicit written permission before commenting to any media (including media outlets controlled by Stack Exchange Inc.) or independent reporters about you or your moderator actions as per our Press Policy.
iii. Allow you to resign your position for any reason without penalty or repercussions. As a volunteer, Stack Exchange, Inc. respects your time and will release you from duty should you ask.
iv. Operate “Stack Gives Back”, an annual program giving to selected charities in honor of our moderators.
v. Post previews for review of all new official policies in the Moderators Teams instance with the policy tag, marked with links to their public version once published, and maintain a listing of all official network-wide policies with links to them in the Help Center.
vi. Announce changes to the moderator agreement no less than sixty days before the deadline to accept the new agreement with a period of at least thirty days for discussion and review.
vii. Provide support for your questions, requests and concerns on the Moderators Teams instance and/or the Teachers’ Lounge, direct email to CMs, and content on Meta escalated to staff by whatever formal documented process is in effect at the time.
viii. Respect your right to speak openly to question and challenge policy without reprisal so long as such speech does not break the Code of Conduct.
```

The relevant section is vi. There was no discussion period on this. No engagement with the moderators, or the community. Instead it was "effective immediately." The best we've gotten so far is an "oops, sorry."

That's not how something in a /legal link should operate. If I can't trust them to uphold an agreement they have in writing, why should I trust them to uphold anything else?

What does the community want out of this?¶

From the open letter:

Until Stack Overflow, Inc. retracts this policy change to a degree that addresses the concerns of the moderators, and allows moderators to effectively enforce established policies against AI-generated answers, we are calling for a general moderation strike, as a last-resort effort to protect the Stack Exchange platform and users from a total loss in value. We would also like to remind Stack Overflow, Inc. that a network that entirely relies on volunteers for its moderation model cannot then consistently ignore, mistreat, and malign those same volunteers.

My feelings on this¶

I've been on Stack Overflow and Stack Exchange for over 13 years. I've been a moderator on the network since 2014 and on Stack Overflow since 2017. I've written about Stack Overflow a bit over the years. I've participated in Charcoal, the spam fighting community, since 2015ish. Charcoal has automatically flagged more than 86,000 posts across the network since 2016.

I've applied to several jobs at Stack Exchange. I've interviewed for a couple positions. I have spent a lot of time with the network, the community, moderators, and employees making this a great place for internet users to find their answers.

I have helped to build the community/company trust that I mentioned above. I watched it crumble. I thought about leaving in 2019, but instead spent the next three years working to rebuild that trust. I'm at the point again, where I see the company not understanding their community. At all. It's sad that this cycle has repeated itself and it's worse that the company is, again, tossing their most engaged users under the bus.

This strike serves two purposes in my mind - the first is the officially stated one. Do not allow GenAI content on the network. It will erode the value of the network quickly. We've also demonstrated that ChatGPT and its peers are not great at answering complex questions. But, second, and unofficially, this strike will represent a chance for Stack Exchange to show whether or not they care about what the community has to say. This is, I believe, the last opportunity for them to retract this policy and reflect on why they are overruling so many communities that reject GenAI in their community. It's their last opportunity to show they support their moderators and the human aspect of moderation. Failure to do either means that Stack Exchange has given up on community building.

I am a volunteer for this community. I would love to continue that role, but this is the best way for me to show that GenAI is not the right path for the company to take.

What's Next¶

On June 5, 2023 at midnight, moderator local time, the network will start to see moderation activities cease (or slow down). I will be part of that.

GPT Policy¶

This policy was crafted with the input of moderators and Stack Exchange. An important thing to notice is that the moderators are empowered to issue suspensions. This is something the new policy prevents.

``` Why posting GPT and ChatGPT generated answers is not currently acceptable

This Help Center article provides insight and rationale on our policy regarding the usage of GPT and ChatGPT on Stack Overflow. While this is the position of Stack Overflow staff, it’s meant to support the prior work done by moderators (namely, the temporary policy issued to ban contributions by ChatGPT).

Stack Overflow is a community built upon trust. The community trusts that users are submitting answers that reflect what they actually know to be accurate and that they and their peers have the knowledge and skill set to verify and validate those answers. The system relies on users to verify and validate contributions by other users with the tools we offer, including responsible use of upvotes and downvotes. Currently, contributions generated by GPT most often do not meet these standards and therefore are not contributing to a trustworthy environment. This trust is broken when users copy and paste information into answers without validating that the answer provided by GPT is correct, ensuring that the sources used in the answer are properly cited (a service GPT does not provide), and verifying that the answer provided by GPT clearly and concisely answers the question asked.

The objective nature of the content on Stack Overflow means that if any part of an answer is wrong, then the answer is objectively wrong. In order for Stack Overflow to maintain a strong standard as a reliable source for correct and verified information, such answers must be edited or replaced. However, because GPT is good enough to convince users of the site that the answer holds merit, signals the community typically use to determine the legitimacy of their peers’ contributions frequently fail to detect severe issues with GPT-generated answers. As a result, information that is objectively wrong makes its way onto the site. In its current state, GPT risks breaking readers’ trust that our site provides answers written by subject-matter experts.

Moderators are empowered (at their discretion) to issue immediate suspensions of up to 30 days to users who are copying and pasting GPT content onto the site, with or without prior notice or warning. ```

The letter¶

The letter below was originally posted as an open letter to Stack Exchange. I've reposted it here for a record.

```
June 5, 2023

Stack Overflow, Inc. has decreed a near-total prohibition on moderating AI-generated content in the wake of a flood of such content being posted to and subsequently removed from the Stack Exchange network, tacitly allowing the proliferation of incorrect information ("hallucinations") and unfettered plagiarism on the Stack Exchange network. This poses a major threat to the integrity and trustworthiness of the platform and its content.

We, the undersigned, are volunteer moderators, contributors, and users of Stack Overflow and the Stack Exchange network. Effective immediately, we are enacting a general moderation strike on Stack Overflow and the Stack Exchange network, in protest of this and other recent and upcoming changes to policy and the platform that are being forced upon us by Stack Overflow, Inc.

Our efforts to effect change through proper channels have been ignored, and our concerns disregarded at every turn. Now, as a last resort, we are striking out of dedication to the platform that we have put over a decade of care and volunteer effort into. We deeply believe in the core mission of the Stack Exchange network: to provide a repository of high-quality information in the form of questions and answers, and the recent actions taken by Stack Overflow, Inc. are directly harmful to that goal.

Specifically, moderators are no longer allowed to remove AI-generated answers on the basis of being AI-generated, outside of exceedingly narrow circumstances. This results in effectively permitting nearly all AI-generated answers to be freely posted, regardless of established community consensus on such content.

In turn, this allows incorrect information (colloquially referred to as "hallucinations") and plagiarism to proliferate unchecked on the platform. This destroys trust in the platform, as Stack Overflow, Inc. has previously noted.

In addition, the details of the policies issued directly to moderators differ substantially from the guidelines outlined publicly, with moderators barred from publicly sharing the details.

These policies disregard the leeway historically granted to individual Stack Exchange communities to determine their policies, by making changes without the input of the community, overriding community consensus, and outright refusing to reconsider their position.

Until this matter is resolved satisfactorily, we will be pausing activities including, but not limited to:

Raising and handling flags.
Running SmokeDetector, the anti-spam bot.
Closing or voting to close posts.
Deleting or voting to delete posts.
Reviewing tasks in the various review queues.
Running various other bots designed to assist in moderation, such as detecting plagiarism, low-quality answers, and rude comments.

Until Stack Overflow, Inc. retracts this policy change to a degree that addresses the concerns of the moderators, and allows moderators to effectively enforce established policies against AI-generated answers, we are calling for a general moderation strike, as a last-resort effort to protect the Stack Exchange platform and users from a total loss in value. We would also like to remind Stack Overflow, Inc. that a network that entirely relies on volunteers for its moderation model cannot then consistently ignore, mistreat, and malign those same volunteers. ```

Stack Overflow bans ChatGPT temporarily

2022-12-05T14:45:00-06:00

Today, Stack Overflow moderators (myself included) have implemented a temporary ban on ChatGPT on the site.

Use of ChatGPT generated text for posts on Stack Overflow is temporarily banned.

This ban was picked up immediately by several technology news outlets, including ZDNet, The Verge, and Vice. It was also picked up by CNN, The New York Times, and the Washington Post.

That's a lot of news coverage.

The question that I want to answer is "Why?". Stories and marketing from OpenAI give reasons why this new chatbot is a "good thing". With their examples, it definitely looks that way. But, in practice, it's not working out so well.

The responses that users have been posting on Stack Overflow have a high rate of being incorrect. Normally, the community can handle this, but the responses aren't your usual small code snippet of an answer. Instead, each one is a long, detailed explanation that looks plausible. But, it's wrong. Combined with users posting multiple answers an hour, this is a lot of content that Stack Overflow reviewers (or worse, the handful of elected moderators) have to go through to determine if it's valid.

At the time the ban was implemented, we'd seen thousands of answers generated by ChatGPT. On the one hand, this is impressive work on the ChatGPT AI itself. It can be difficult to detect and is good at holding a conversation. On the other hand, and more important from Stack Overflow's perspective, this isn't helping the user base. Thousands of subtly wrong answers is awful. It doesn't help the user looking for help, and it will very quickly destroy the trust that millions of developers put in the site if this is allowed to continue.

I'll admit that I'm disappointed that AI hasn't reached the point where it can do what ChatGPT seems to do. But, this is a step forward. Unfortunately, this step seems to have left a bad taste in the mouth of developers looking for help beyond a toy example.

For the time being, ChatGPT is banned on Stack Overflow. The moderation team will continue to work with the company to ensure the community we have volunteered to moderate remains one of high quality. Additionally, as the larger Stack Exchange network of sites debates a similar ban, the Stack Overflow moderation team will be able to provide input on our experiences.

Stack Overflow still has issues and it's getting worse

2019-03-29T10:34:00-05:00

Last time on this blog¶

A little over a year and a half ago, I wrote an article about Stack Overflow's problems from my perspective as an experienced user. This was before I was elected as a moderator on Stack Overflow. I ended the previous article with this:

I continue to invest my time and effort into the community, but even as an active user who really wants the company and community to succeed, it's getting harder and harder to ignore that those of us that have been around for years are not being listened to any more. We're being treated as the grumpy old person that grumbles about the way things used to be. Our experiences on the site are brushed aside as being unhelpful to new users. That completely ignores that fact that we are still trying to reach the goal on which Stack Overflow was created: "With your help, we're working together to build a library of detailed answers to every question about programming." To do this, we need high quality questions and answers so that we can actually provide help to all users. I think this is the biggest challenge that Stack Overflow is going to face in the next 18 months.

So, what's happened in the last 18 months?

Documentation¶

After years of development (it was announced in 2015), Documentation was shuttered in August of 2017. Stack Overflow wasn't drawing users to the Documentation feature. Their own metrics and analysis showed that fixing Documentation to be useful to users - both new and experienced - would require a significantly larger team.

What went wrong?¶

In my opinion, and as I mentioned in 2017, Stack Overflow has ignored its user base. This is going to be a recurring theme in this post. For years, users provided feedback on meta, in dedicated user experience interviews and in chatrooms. This resulted in superficial changes and major rewrites. Yet, complaints still existed. These complaints turned off the experienced users that could produce the high quality documentation. Instead, Documentation became a reputation farming operation in all but name. This turned off even more users.

By Stack Overflow's own admission when sunsetting the feature, Documentation was built to solve a problem that wasn't really a problem.

Finally, our research showed that while a lot of developers were dissatisfied, the current state of programming documentation is not universally broken the way Q&A was when Stack Overflow started. In particular, we heard over and over that Stack Overflow has become de facto documentation for many technologies. As many of you pointed out, Stack Overflow is already good enough at providing documentation of obscure features. Even when considering just the company's mission of helping programmers “learn, share their knowledge and build their careers”, Documentation isn't the most efficient use of resources.

Two years of major development, focusing on a problem that the community had not been enthusiastic about, and intentionally ignoring other feature requests and other improvements angered a lot of users.

Teams¶

In my last post, I mentioned that Teams had been launched and shut down in less than a year. Teams is back! At least the name is. Initially launched as "Channels", and later re-branded to "Stack Overflow for Teams", this is a money generating route for Stack Overflow. It uses the old URL.

Now, generating money is good. It's good for both the community and the company. Without money, the company can't survive. Without the company there is no Stack Overflow or community. My problem isn't with money generation. My problem is that, once again, community feature requests for higher quality, and for moderation tooling to cultivate that higher quality, were ignored.

By all accounts, Teams seems to be doing well and bringing in revenue. I am hopeful that this translates into development time to build out the features the community still clamors for.

Meta hatred¶

Meta. It's murder. Until it's not. Meta is how Stack Overflow communicates with the community. It's how the community communicates with itself. It's where governing principles/thoughts/guidance/sticky notes come from. In short, meta is a large part of how Stack Overflow the company and Stack Overflow the community talk with one another. Decisions are questioned here, announcements are posted here, and little by little the site is made better.

That is, until nothing happens. Stack Overflow's response time has become a meme.

"6 to 8 weeks" is a joke. It's used to indicate that something isn't going to be built or changed. It's so prevalent that this comment crops up over and over on feature request posts. It's used by the community to say that nothing is going to happen.

When something does happen, it's a "big deal". There have been a few examples in the past year. Unfortunately, these changes happened due to feedback from Twitter, not Meta. For years we've been told to post on Meta. For years we've been told that Meta is where the company will engage with us. Then two massive changes happened.

The Welcoming¶

The first change was to make Stack Overflow more "welcoming". This isn't bad. As both an experienced user and as a moderator, I've seen my fair share of users not being welcomed. I've seen hostility to poorly asked questions.

Unfortunately, this whole blog post and resulting meta-drama appears to have cropped up because of a post on Twitter from someone who felt unwelcome. That's fair. I believe they felt that way. However, from my point of view, Stack Overflow ignored their own users (some of whom had been saying the exact same thing for years) because it was suddenly posted on Twitter where the entire world could comment on things that may have been out of context. Instead of listening to their own users and the experiences those users had, Stack Overflow went into damage control mode and rapidly updated its "Be Nice" policy.

Whether this is actually what happened or not is really beside the point. Many long time users had this perception. Meta was ignored. User feedback was ignored. Instead, the person that could shout the loudest and had made the most noise appeared to be the one that was listened to.

A few months after the welcoming blog was posted and a month after the update, another post was made about how the company was attempting to classify comments. The idea behind this was good, the execution of the blog post was not. In the initial version of the post, exact comments were posted to show "bad comments". I disagreed that a few of them were rude. I'd have removed them as no longer needed without a problem. Honestly, I'd probably have removed them as rude too, because comments don't need to stick around and it's easier to accept the rude flag than it is to decline and manually delete.

My problem was that the exact comment content was posted as a "wall of shame". Then, despite only employees being involved, none of these comments were removed or even flagged for moderators to remove. In short, it really was a "wall of shame".

I believe I covered my disappointment in both this failure and in the technical aspect in my comment on the blog.

I am a huge fan of automatically removing unwanted comments. I did so for several years. That said, I’m disappointed in how this is playing out here. I’m disappointed on both a community level and a technical level.

On the community level, I am very disappointed that 57 Stack Exchange employees were able to evaluate bad comments, determine they were bad enough to put in the hall of shame post here, and then do nothing about them. It took users less than 15 minutes to find those comments on Stack Overflow and identify the “rude” users. Users who are rude because they asked why a certain tag was on a question. Did none of your 57 users have a diamond where you could remove the comment from the site? Even if that’s the case, all of you have the option to flag a comment. Even that wasn't done.

On a technical level, you evaluated less than 4000 comments. That is a few hours worth of comments on a single week day. (source: http://data.stackexchange.com/stackoverflow/query/872382) Is that really representative? How did you determine which comments to use in your evaluation?

The good news is that the comment samples were edited to be "representative" of the problem later.

Welcoming users is great. Helping users is the purpose of the site. I fully support all of that. What I don't support is ignoring the feedback mechanism you've built and told everyone to use for years because someone else with a lot of Twitter followers put Stack Overflow in a bad light. Yes, it should be fixed and should have been fixed sooner, but the perception of "listen to the loudest shout" is not a good look.

Which brings me to...

Removal from Network Questions list¶

In October, the entire "Twitter shouted, Stack Overflow reacted" cycle repeated itself. This time, a user was offended (while on Stack Overflow) by two questions from another Stack Exchange site that appeared in the Hot Network Questions list. In under an hour, Stack Overflow (the company) removed that site from the Hot Network Questions list.

The community in question was shocked by the result. A community manager explained the decision on that site's meta:

It was the solution we chose - without consulting IPS - because it was effective and easy to implement since it would fix the perceived problem immediately and there was already a technical solution in place for doing it.

Notice a couple things here that stand out to me:

"perceived problem"
"without consulting IPS"

The company knee-capped an entire community and a large source of their traffic (the Hot Network Questions list) because of a single Twitter comment. Understandably, the community was upset.

Behind the scenes was even worse. On Twitter, the original user posting their complaint was engaged by community moderators. It didn't go well. Then they complained about that. A Stack Overflow employee jumped into the thread with the following:

If the DM trolls claimed to be moderators on any of the sites then I'd like to follow up with the community team and see about getting removed - they take this very seriously.

Turns out that Stack Overflow doesn't value their community moderators. One employee might be misguided, but this Twitter reply remained active and moderators across the network clamored for an official response. A moderator reached out to the Twitter user in good faith and was threatened with removal by a Stack Overflow employee.

One of Stack Exchange's most respected moderators posted their frustrations on Medium. I highly recommend you read it. One of the community managers posted a response on their own blog too.

The "super-official almost response" was posted even later. This was more than 10 days after the original incident. It took half a month for a first draft of a "moderator social media guidelines" post to be made in the private Stack Moderators Team. That post consisted of bullet points on how a moderator should behave on social media. I replied to that post with this:

I am underwhelmed by this response. The event that led to this post and recent discussions around Stack Exchange (and the broader internet) wasn't due to a moderator's bad behavior. Moderators engaged a user on Twitter following the bullets in this post, and yet stuff still exploded in everyone's face. From my point of view, this post is so far down the list of responses that I was hoping to see from Stack Exchange that I'm feeling insulted.

I was asked to hold my judgment until the final draft was posted. That took place in December - two months after the incident. It was changed from "Social media guidelines" to a "community emergency process". These four bullets were provided:

Introduce yourself and if necessary, your role as moderator of a SO/SE site.

Offer to help with the situation, and be very respectful if someone declines your assistance. Sometimes, people just want to vent, and the best thing we can do to help is to give them space.

Be aware of the volatile nature of online discussions; if the path to constructive discourse becomes blurred, it's often best to disengage.

Keep your interactions with others, concerning SO/SE, as clear and as kind as possible. If things begin to get out of hand, please disengage and let us know about it.

In short, do exactly what the moderator did initially on Twitter which resulted in the threat of being removed.

Communication¶

Stack Overflow is slowly isolating itself from the community. There have been multiple comments scattered around the network saying the employees don't want to engage on any meta. There are community managers that are feeling hated because of complaints users have made. Users are taking out their anger of being ignored on posts talking about new or unrelated features. In turn, the employees engage just a little bit less. Lines are being drawn. I see it as a moderator. I see it as a user. Very slowly the community is trusting the company less and less.

Everything is becoming "us" vs. "them". There is "the company" vs. "the users". Blog posts, comments, meta discussions also appear to be driving a wedge between "the users" and making it "new users" vs. "established users". In the blog post announcing the search for a new Stack Overflow CEO, this comment was made by the current CEO:

One thing I’m very concerned about, as we try to educate the next generation of developers, and, importantly, get more diversity and inclusiveness in that new generation, is what obstacles we’re putting up for people as they try to learn programming. In many ways Stack Overflow’s specific rules for what is permitted and what is not are obstacles, but an even bigger problem is rudeness, snark, or condescension that newcomers often see.

The underlying sentiment - improving inclusiveness and diversity - is great. I'm all for that. The rest of it, though, is a dig at the established community in the same way that the Welcoming blog post was. Stack Overflow's high quality standards are the problem. It makes the community seem rude and abusive. You should stop closing those questions, stop down voting new users, and just be nice. It doesn't say that directly, but that's how existing members are seeing it. Read under hairboat's answer to see some of the simmering feelings of high reputation users.

The idea of trust between users and the company is brought up in the comments. This is just another example, in a long list, where the community and the company are butting heads. Something happens that the community doesn't like - reacting to incidents off site, focusing on features no one asked for, not explaining why these new features need to be done, comments from one side that cast the other in an unflattering light - and another round of not trusting the company starts again.

The company has had a decade of experience with this community. It's grown, shrunk, and grown again. For most of that time, there has been fairly open communication and trust. I am afraid that trust has eroded over the last few years and can't be recovered.

What can be done?¶

The company wants to focus on areas that can bring in more money. In my previous post I quoted the President and Chief Technology Officer of the company.

I appreciate that there are a lot of issues on Stack Overflow that need to be addressed, and maybe we haven't been responding to them as quickly as we should. But Stack Overflow Q&A is a big, established product, most of the problems left are hard, and we can't let maintenance become the only thing we work on or we'll just slowly run out of money and go out of business. We are trying to both maintain Q&A and solve new problems for developers and reach new audiences. The latter is hard, and maybe we'll fail on a lot of our ideas, but we're not going to stop trying. – David Fullerton May 17 at 21:10

I bemoaned that this sounded like Q&A was feature frozen. It's been nearly two years since that time. I can't remember a new feature that was introduced into Q&A that helped the community maintain high quality posts. There was a new wizard introduced for new users that is supposed to help. A quick look at the review queue numbers on Stack Overflow shows that they are still stable at the same point they were two years ago.

My suggestion as a user, a moderator and someone interested in seeing Stack Overflow remain successful, is to focus on helping to manage the quality of your content. Users have been asking for years to be able to better handle poor content. They've asked for tools (both system tools and moderator tools). There have been projects started, stopped, restarted, and stopped again that are supposed to improve quality. Community tools have been built to help deal with quality problems. Use some of this!

Stack Overflow has a data science team. Work with the community directly to help figure out ways to prevent low quality content from ever getting posted. Force users - all users - to post higher quality content. Work with the communities that have developed automated tools. Run it with larger data sets. Even if Stack Overflow has to be more conservative than the community tool, if you can prevent some of the low quality content from making it to the site you have a victory.

Obviously the company can't ignore the areas that bring in revenue, but it's becoming increasingly clear that the community is much less forgiving than it used to be. Continued communication blunders will not help with anything.

Where to from here?¶

I ended my last post with this:

I want Stack Overflow to continue to grow. I also want Stack Overflow to have high quality content. I think my experience and the experience of others can help build the features to accomplish this. We just need Stack Overflow to refocus on the Q&A portion of their network again.

I think that holds true today, just as it did 18 months ago. The aspect of the site that draws users in is Q&A. Make it better. Make the content better. Give users tools to make it better. With all of this, I believe, the "welcoming" aspect will improve. Let the system handle the low quality stuff automatically. Eliminate the need for users to ask basic questions or remind users to post their code. Let the system be "the bad guy", and let users interact and help one another.

We'll see how everything looks in 18 months. In the meantime, I'll be here, cleaning up the low quality content and prodding the company to provide improvements to Q&A.

Collecting Diamonds on Stack Exchange

2017-08-18T10:06:00-05:00

Introduction¶

It's been over two years since I first ran for moderator on Stack Overflow. I've previously run for moderator three times on Stack Overflow. In each election I've done better, coming in fifth in the third election. Well, it's been a little over 8 months since the last one and new moderators are needed again. I decided to run once more with the knowledge that if I lost, I probably wouldn't run again in the next election.

Nomination Phase¶

The nomination phase started off as usual, with a handful of users posting their nomination. This time there were 12 candidates, meaning there would be a primary to narrow it down to 10 before the final election. My nomination was the following:

I'm Andy. I've answered the questions posted by the community here. I encourage you to take a look.

Why should you vote for me?¶

I've been a moderator on Community Building for several years. I was appointed to a moderator position on Hardware Recommendations. I know the moderator tools and have worked with the current moderators.

I'm active in the review queues (I am the top reviewer in the Low Quality Post reviewers, with over 26,500 reviews in this queue). I also enjoy the other moderation aspects of Stack Exchange.

I believe that moderation can be tool assisted. I've helped to flag a sizable percentage of comments on Stack Overflow. I've helped build the community spam detection bot. These types of tools help eliminate the obvious bad stuff so that moderation time can be spent on the less obvious stuff.

I have a history of good community moderation, am here consistently, and believe I can help the current team.

Nomination Reflections¶

An astute reader may have noticed this is pretty similar to previous nomination posts. There are a couple major differences though. The first thing is that I am the number one Low Quality Posts reviewer on the site. I am pretty proud of this particular statistic. It shows just how much work I've done during my tenure at Stack Overflow to improve the quality of the site. The unfortunate thing is that I'd probably lose this position as a moderator because they don't sit in the review queues.

The other major change was that I had picked up a moderator position on Hardware Recommendations. That happened at the end of June. Hardware Recommendations is about ten times the size of Community Building (a site I've moderated for several years). It's also a couple orders of magnitude smaller than Stack Overflow.

Primary Phase¶

The most exciting part of the election season is the primary phase. The community can see candidates' scores and has built tools to watch those numbers change over time. It turns out that this time, my numbers were really high.

There were plenty of good people in the election this time. One interesting thing that I found was that a lot of candidates, like me, were supportive of automation. Several users utilized bots that reported low quality content to various chat rooms. This is a big change from previous elections. It was a welcome change. I think that automated content quality checking can help a lot.

Election¶

The election ended on August 1st (a busy day for me, apparently). It was a close election. Most surprisingly, no one won in the first round with everyone else picking up carry over votes to get second. I think that speaks to the quality of the candidates. After 8 rounds in OpaVote, Cody Gray and I were elected as the two newest moderators on Stack Overflow!

Post Election¶

The election ended a few weeks ago. I handled more moderator flags in my first hour as a Stack Overflow moderator than I had at both Community Building and Hardware Recommendations combined. What I'm saying is that Stack Overflow has a ton of flags that need to be handled. The good news is that, since the election, we have gotten the moderator queue down from about 1,100 flags to about 75 at any given time. I doubt it will stay that low, but it's still nice to see that I was immediately helpful.

Finally, since the election I turned off the comment flagging bot. It had been used for just over 3 years. The community is currently debating whether or not it should run under a moderator account. The thing that I am finding more interesting about this discussion is that the community seems to agree it's helpful, respects the 99+% accuracy, would love for Stack Exchange themselves to run this tool, but doesn't want the bot to run with moderator privileges. There is, however, a very sizable portion of the community that does want this done under my account. We'll see how this plays out, but I'm hoping to be able to use the bot again soon.

Stack Overflow's Problem - Feedback from an experienced user

2017-05-22T23:45:00-05:00

Introduction¶

Stack Overflow launched in 2008. As it nears its 9th year of operation, I am afraid the resource that I depend on is losing its way. Stack Overflow launched after I graduated college. I can't imagine how helpful it would have been during that time period, but it's been invaluable in my professional career. I joined the site about a year after its public launch, in October 2009.

In that time, I've gone from lurker to participant to moderator candidate (several times). I know Stack Overflow and Meta Stack Overflow. I am a moderator on another Stack Exchange site and have a good understanding of how the network operates. I also am one of the most prolific reviewers in the Stack Overflow Low Quality Posts review queue and have built several applications that work with the Stack Exchange API. I am a power user and know the network and the community.

With those credentials out of the way, I want you to understand that I am active on the network. I am in good standing on Stack Overflow and am not a disgruntled user. I am a concerned user. I am getting more and more concerned that Stack Overflow - the company - is losing its way.

This post isn't another "Stack Overflow sucks" post (Google if you're curious). I'm going to present a few areas that I'm concerned about and hopefully either provide my suggestions for improvement or acknowledge that I don't know the solution but want the team to be aware of the problem in the future. I still believe Stack Overflow is an incredible resource. I'd just like it to fix some of the perceived missteps that have occurred over the past two years.

What's going wrong?¶

In the past two years, Stack Overflow has made several changes that the established community hasn't liked. Some of these changes still are not liked. These changes include the Teams feature, the new top bar, the Stack Overflow (versus existing Stack Exchange) mobile app, and Documentation. There have also been minor missteps that have caused a rift between portions of the community and the company. These areas include multiple political stances, and a number of post quality improvements that haven't been made.

Each of these, separately, is a minor problem that could be worked through and moved on from. The problem I'm seeing is that taken together, all of these are causing a rift between users, power users and the community.

Let's work through each of these items.

Teams¶

Teams was announced in October 2015 and clarified a week later. It was then shut down after nine months. The page it used to go to now has the following blurb (emphasis is mine):

Teams was in private beta for almost a year with 295 teams created and while we believe in its potential value, after a lot of consideration we’ve decided to un-ship the idea for the time being. We’ve realized that making a successful version of the Team page, as we originally proposed would ultimately take more time and resources than we want to devote to it. Our resources are currently allocated on projects to enhance and improve quality on Q&A, Documentation, and Jobs on Stack Overflow, as a result we don’t have the dedicated developers to get Teams to its fullest potential. The intention was to add more features to Teams, but we never expanded it to anything beyond a team description.

The emphasized section sounds good, except that the one section that is taking up a majority of time (Documentation) has its own major issues. The area that many power users want developers to focus on is Q&A.

The problem with Teams, and many of the projects mentioned in this post, is that this was a feature that took focus away from areas the community wanted improved. Meta Stack Overflow has been asking for improvements to reduce the number of low quality posts for years. Moderators have been asking for better tooling. The review queues are overflowing with tasks and the number of users performing reviews isn't high enough to keep up. Teams was built without a true end goal and users weren't entirely sure what to do with it. This was the first in a series of missteps that continue to plague community interactions when new features are announced.

Top Bar¶

The new top bar was announced in November 2016. It went through a handful of iterations before being released in mid-February 2017. During the iterations users provided feedback. When initially released, though, much of this feedback felt ignored. Things like notification overload, stickiness of the top bar, and hidden review counts were all mentioned during the three months of testing but not implemented until the change was live to millions of users.

After three months of usage, a larger problem was noticed. One of the review queues was constantly full. One of the changes that was made with this top bar was that the "Review" button no longer linked directly to the "Suggested Edits" review queue. Now it went to the page showing all review queues. Users that used to click once to get to a review queue were now presented with a list of queues to work in. Some of these queues are much more time consuming than others. It turns out the number of reviews being done has decreased significantly since the top bar was implemented.

The spike in reviews in February 2017 is when the new top bar was released. Since that release, the number of reviewers has plummeted. This has been attributed to notification fatigue and not linking users directly to the Suggested Edits queue.

Three months after implementation, it took the community asking for results (disclaimer: I asked the question), to find out how the top bar has been performing. It turns out that the top bar is performing decently well compared to what the developers were expecting, with the exception of fewer review tasks being performed.

The problem with this project is that it felt unneeded and has materially impacted one of the quality control features of the site. There is still a vocal group of users that don't like it because it doesn't match the rest of the network. Several are concerned about the review queue problem. Experienced users felt that they were ignored during the beta tests. Users provided feedback and examples of problems, and it was only after implementation, when millions of other users experienced the same thing, that these changes were made.

Mobile App¶

A recent announcement (as in last week, at the time of this post) introduced a new Stack Overflow mobile application. The community response was not positive. Users asked why a new application was being built when one already existed (the response was "branding"). Users asked why the new app was less functional than the existing one (it's limited to Stack Overflow versus the entire Stack Exchange network). Users asked why it took a year to develop and why the existing application hasn't received bug fixes in that year.

I think one of the most disappointing things about this is a response I received in the comments from the VP of Engineering:

@Andy You're right, it wasn't worth a year. There's a long, sad story here, but it was originally expected to only take a few months and... well, here we are a year later. We decided to go ahead and launch and see what we can learn, and we'll reassess from here. – David Fullerton♦ May 17 at 16:26

Another user expressed the dissatisfaction in a very pointed way. They provided a list of features that the community has asked for over the years that many feel have been ignored. The VP's response to this wasn't encouraging either:

I appreciate that there are a lot of issues on Stack Overflow that need to be addressed, and maybe we haven't been responding to them as quickly as we should. But Stack Overflow Q&A is a big, established product, most of the problems left are hard, and we can't let maintenance become the only thing we work on or we'll just slowly run out of money and go out of business. We are trying to both maintain Q&A and solve new problems for developers and reach new audiences. The latter is hard, and maybe we'll fail on a lot of our ideas, but we're not going to stop trying. – David Fullerton♦ May 17 at 21:10

This sounds like work on the Q&A side is feature frozen at this point. They are done innovating in this area and instead are focused on drawing in users via other features - like Jobs or Documentation. Multiple times in the comments the new app was promoted as being able to use the Dev Story or Jobs features in the future. Perhaps it's just me, but I don't apply for jobs via my phone. That doesn't seem like a good way to really put the effort needed into a cover letter or application.

Documentation¶

Now we've reached Documentation. This is the project that's sucked up development time over the past two years. This is the project that Stack Overflow developers are defending tooth and nail and the community has all but given up on.

Documentation was announced back in August 2015. It's had a ton of updates since then. It was met with initial enthusiasm but that quickly turned around. When the system launched for all users, one of the first complaints was that the reputation generated via documentation was doing bad things to the main Q&A site. This resulted in a massive recalculation of reputation and resulted in many users losing a lot of their internet points.

Another change that was announced was the introduction of a new review queue for documentation. Initially, it seems, developers didn't expect the low quality content to begin immediately. Long time users weren't surprised. Now we've reached the point where the company is realizing that the users knew what they were talking about. Documentation is undergoing a massive change, to the point that much of it is being completely redone - not fixed - scrapped and redone.

This project has years' worth of feedback from the community that has been ignored. It is the black sheep of Stack Overflow and many community users feel that the quality of the content is lacking so badly that they don't participate any longer. This feeling isn't helped by the fact that many users have been explaining why things aren't working for a while, and it's only after two years that the developers are starting to realize the private beta testers, public beta testers and experienced community users mentioned many of these problems. In this particular instance, the company took to heart the advice of Jeff Atwood (co-founder of Stack Overflow) to not let the community tell you what to do. To the company's surprise, a community of developers that live in programming documentation had decent thoughts on what does and does not work in programming documentation.

Politics¶

For many users, the lack of true social features on Stack Overflow and across the Stack Exchange network has been a good thing. You can't easily follow a single user, you can't send private messages to a user, and you can't really do anything on the site that isn't public to everyone. The focus is on content, not opinions or social interactions.

This breaks down once in a while, though, when a big political event occurs. The two most frequently mentioned instances are the response to the Obergefell v. Hodges Supreme Court decision and the response to President Trump's initial immigration executive order.

Both of these caused huge uproars within the community when the company took a stand. These stands caused problems due to users holding opposite political views, users not wanting politics on their programming site, users not wanting to deal with the drama caused by the vocal members of the other groups. This led to an apology. The community wasn't pleased with this apology. Users mentioned in multiple answers to this apology that they don't want the company to post such political agendas on the site. It's out of place for a programmer community. Both of these instances are still brought up on Meta when the community feels that the company is imposing on them.

I don't really have advice or suggestions on this problem other than "I don't want to see this on Stack Overflow, because these hot button issues cause so much drama that nothing gets accomplished". These posts grind Meta and chatrooms to a halt while everyone expresses their opinion on the post, on the post's existence, on one another and on related issues.

Quality Improvements¶

Finally, the community has been asking for years about ways to improve the quality of posts on the site. Stack Exchange started a project to improve the quality back in October 2016. This generated 80 different suggestions on how the community sees "quality improvement" taking place. Since then there haven't been any updates on the status of this project or even subprojects.

This was brought up during all of the projects listed above by long time users. The hope was that this quality project would help. Being ignored hasn't brought any good feelings. The lower quality has been measurable and has led to less participation from experienced users.

The Fix¶

Above I've pointed out several issues that I've seen over the past two years. These issues are part of a bigger problem though. It seems that Stack Overflow doesn't know how to handle its community size any longer. It's in the top 300 sites visited in the US and receives half a billion views globally per month. Couple this with the fact that they don't have a sustainable business model yet and have a sizable team with good benefits and they are getting concerned.

Q&A is what built Stack Overflow, but it isn't enough to sustain them. Thus, the other projects are being created. Unfortunately, in this process, it seems the company is expanding to new users at the expense of its existing user base. Existing users are getting frustrated with the lack of quality improvements, being ignored and not having changes that benefit their use cases.

Documentation has taken up a giant chunk of time and developer effort and it's all been wasted. The announcement that it is being redone has been met with "thanks" from the community, along with warnings to consider that "quality" problem. We'll see how it plays out, or if that quality issue is ignored like their own Quality Project.

Which brings us to the final point I want to make. I think the feeling of Q&A being "done" is the biggest problem I've had with Stack Overflow over the last year. New features aren't being built in that space. Instead of focusing on some of the "hard" problems, the company is throwing stuff at the wall and hoping something will stick. Unfortunately, the four biggest projects in the last year have either failed completely (Teams, Documentation, Mobile App...perhaps) or have significant unintended consequences that aren't helping the quality issue users have been reporting for years.

Power users, the underlying community that has put time and effort into growing Stack Overflow to what it is today, are feeling ignored. It is only after months- or years-long experiments fail that community opinions are finally validated or considered. Users have expressed concerns in each of the above projects repeatedly. Yet, those opinions were not addressed. The silos that the developers have built around themselves are causing the company to lose touch with its community. This is being done at the cost of alienating the users that care and of developer time.

Users want a high quality site with answers to their questions. Even new or potentially new users want this. Stack Overflow continues to avoid dealing with that problem because "it's hard". The unfortunate thing is, this is costing the site users that return to provide more than one answer.

This chart shows the number of answers provided per month by different types of users: users that have provided more than 100 answers, between 11 and 100 answers, between 2 and 10 answers, and only a single answer. The furthest data point on the right is an artifact of being an incomplete month. From this chart, we can see that the only group that has continued to rise over time is users that provide a single answer. The other groups took a steep drop in April 2014 and haven't recovered since then. The number of experienced users that are participating has dropped.

What happened in April 2014? That's been answered by a Stack Overflow community manager. The theory is that users aren't getting answers to their questions and due to being ignored they never return to participate further in the site. Another community manager also provided an answer:

Starting around 2013 and peaking around March, 2014, people began asking fewer interesting questions. That lead to a decrease in voting on questions and fewer answers being given. Since the feedback on these uninteresting questions was discouraging, people began asking fewer questions on the whole. Meanwhile, truly poor questions continued being asked with little regard to negative feedback.

Stack Overflow users began noticing increasing numbers of truly awful questions and decided, rightly, that downvoting and refusing to answer them is the best remedy. These questions fit broad categories of awful and users began withholding votes from questions that were not themselves awful, but bore some of the markers of awful. Fewer of these questions got answered and askers of mediocre questions did not see any point in trying to improve.

Thus began a slow spiral downward. Not all is lost though, because there are the upticks. I hope it's enough to break the cycle, but I really fear that something needs to be done about this quality issue. This is the issue that is brought up by the experienced community.

Where to from here?¶

I continue to invest my time and effort into the community, but even as an active user who really wants the company and community to succeed, it's getting harder and harder to ignore that those of us that have been around for years are not being listened to any more. We're being treated as the grumpy old person that grumbles about the way things used to be. Our experiences on the site are brushed aside as being unhelpful to new users. That completely ignores the fact that we are still trying to reach the goal on which Stack Overflow was created: "With your help, we're working together to build a library of detailed answers to every question about programming." To do this, we need high quality questions and answers so that we can actually provide help to all users. I think this is the biggest challenge that Stack Overflow is going to face in the next 18 months.

I want Stack Overflow to continue to grow. I also want Stack Overflow to have high quality content. I think my experience and the experience of others can help build the features to accomplish this. We just need Stack Overflow to refocus on the Q&A portion of their network again.

Can a machine be taught to flag spam automatically

2017-02-19T22:51:00-06:00

Introduction¶

This post was originally published on Meta Stack Exchange on February 20, 2017. I've republished it here so that I can easily update information related to recent developments. If you have questions or comments, I highly encourage you to visit the question on Meta Stack Exchange and post there.

The post was featured across the entire Stack Exchange network for a week, too. This drove a huge amount of traffic to the question and resulted in some valuable feedback.

TL;DR: We did it, so... yes.

What is this?¶

Charcoal is the organization behind the SmokeDetector bot and other nice things. This bot scans new posts across the entire network for spam posts and reports them to various chatrooms where people can act on them. If a post has been created or edited, anywhere on the network, we've probably seen it. The bot utilizes our knowledge of how spammers work and what they have previously posted to come up with common patterns and rules to detect spam in the new and updated posts. You've likely seen the SmokeDetector bot if you visit chatrooms such as Tavern on the Meta, Charcoal HQ, SO Close Vote Reviewers and others across the network. Over time, the bot has become very accurate.

Now we are leveraging the years of data and accuracy to automatically cast spam flags. With approximately 58,000 posts to draw from and over 46,000 true positives, we have a vast trove of data to utilize.

What problem does this address?¶

To put it simply, spam. Stack Exchange is one of the most popular networks of websites on the Internet, and all of it gets spammed at some point. Our statistics show that we see about 100 spam posts per day, on average over the last three months.

A decent chunk of this isn't the type you'd want to see at work (or at all). The faster we can get this off the home page, the better for all involved. Unfortunately, it's not unheard of for spam to last several hours, even on the larger sites such as Graphic Design.

Over the past three years, efforts with Smokey have significantly cut the time it takes for spam to be deleted. This project is an extension of that, and it's now well within reach to delete spam within seconds of it being posted.

What are we doing?¶

For over 3 years, SmokeDetector has reported potential spam across the Stack Exchange network so that users can flag the posts as appropriate. Users tell the bot whether each detection was correct or not (we refer to this as "feedback"). This feedback is stored in our web dashboard, metasmoke (code). Over time, we've used this feedback to evaluate our patterns ("reasons") and improve our accuracy. Several of our reasons are over 99.9% accurate.

Early last year, and after getting a baseline accuracy from jmac (thank you!), we realized we could use the system to automatically cast spam flags. On Stack Overflow the current accuracy of users flagging spam posts is 85.7%. Across the rest of the network users are 95.4% accurate. We determined we can beat those numbers and eliminate spam from Stack Overflow and the rest of the network even faster.

Without going into too much detail (if you really want it, it's available on our website), we leverage the accuracy of each existing reason to come up with a weight indicating how certain the system is that a post is spam. If this value exceeds a specific threshold, the system will cast up to three spam flags on the post. We cast multiple flags utilizing a number of different users' accounts and the Stack Exchange API. Via metasmoke, users are given the opportunity to enable their accounts to be used to flag spam (You can too, if you've made it this far). When a post is eligible for flagging because it exceeded the threshold set by each individual user, accounts are randomly selected from the pool of enabled users to cast a single flag each, up to a maximum of three per post so that we never unilaterally nuke something.
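The selection step described above can be sketched roughly as follows. This is only an illustration, not metasmoke's actual code: the `Flagger` class, the function names, the reason names and the idea of summing per-reason weights are all assumptions made for the example. The only details taken from the post are the per-user threshold, the random selection from the pool of enabled accounts and the cap of three flags per post.

```python
import random
from dataclasses import dataclass

# From the post: at most three automatic flags per post, so the bot
# never removes a post on its own.
MAX_AUTOFLAGS_PER_POST = 3


@dataclass(frozen=True)
class Flagger:
    """Hypothetical record for an account that has enabled autoflagging."""
    name: str
    min_weight: int  # the threshold this user chose in their preferences


def post_weight(reason_weights, matched_reasons):
    """Combine the weights of every reason that matched a post.

    Assumption: weights are combined by summing. The post only says the
    accuracy of each reason is used to come up with a weight.
    """
    return sum(reason_weights[reason] for reason in matched_reasons)


def select_flaggers(weight, flaggers, rng=None):
    """Randomly pick up to three accounts whose threshold the post exceeded."""
    rng = rng or random.Random()
    eligible = [f for f in flaggers if weight > f.min_weight]
    count = min(MAX_AUTOFLAGS_PER_POST, len(eligible))
    return rng.sample(eligible, count)  # each chosen account casts one flag
```

Calling `select_flaggers(240, pool)` with a pool of five enabled accounts, four of which have a threshold below 240, returns three of those four at random. A post that exceeds nobody's threshold gets an empty list and is only reported to chat.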

What are our safety checks?¶

We designed the entire system with accuracy and sanity checks in mind. Our design collaborations are available for your browsing pleasure (RFC 1, RFC 2, RFC 3 (no longer available)). The major things that make this system safe and sane are:

We give users a choice as to how accurate they want to be with their automatic flags. Before casting any flags, we check that the preferences the user has set result in a spam detection accuracy of over 99.5% over a sample of at least 1000 posts. Remember, the current accuracy of humans is 85.7% on SO and network wide it is 95.4%.
We do not unilaterally spam nuke a post, regardless of how sure we are it is spam. This means that a human must be involved to finish off a post, even on the few sites with lower spam thresholds.
We’ve designed the system to be tolerant of faults - if there’s a malfunction anywhere in the system, any user with access to SmokeDetector can immediately halt all automatic flagging - this includes all network moderators. If this happens, it needs a system administrator to step in to re-enable flags.
We've discussed this with a community manager and have their blessing on the project.
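The first and third checks in the list above could look something like this minimal sketch. The function names and the `autoflagging_halted` switch are hypothetical. Only the 99.5% accuracy floor, the 1000-post minimum sample and the existence of a halt mechanism come from the post.

```python
# Figures from the post: preferences must be over 99.5% accurate
# across a sample of at least 1000 posts.
MIN_SAMPLE_SIZE = 1000
ACCURACY_FLOOR_PER_THOUSAND = 995


def preferences_are_safe(true_positives, total_matched):
    """Return True if a user's preferences pass the accuracy check.

    `total_matched` is the number of historical posts the preferences
    would have flagged and `true_positives` is how many of those were
    confirmed spam. Integer arithmetic avoids floating point edge
    cases exactly at the 99.5% boundary.
    """
    if total_matched < MIN_SAMPLE_SIZE:
        return False  # not enough history to trust the accuracy figure
    return true_positives * 1000 > ACCURACY_FLOOR_PER_THOUSAND * total_matched


def may_autoflag(true_positives, total_matched, autoflagging_halted):
    """Hypothetical gate: the halt switch overrides everything else."""
    return not autoflagging_halted and preferences_are_safe(
        true_positives, total_matched
    )
```

For example, preferences that matched 1,000 past posts with 996 confirmed as spam would pass, while 995 out of 1,000 would not, because the accuracy must be over 99.5% rather than equal to it.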

Results¶

We have been casting an average of 60-70 automatic flags per day for over two months, for a total of just over 4000 flags network wide. These flags were cast by 22 different users. In that time, we've had four false positives. We would like to be able to automatically cancel these particular cases. This isn't possible though, so we've created a feature request to retract flags via the API. In the meantime, the flags are either manually retracted by the user or declined by a moderator.

The above graph plots the weight of the reasons against its overall volume of reports and accuracy. As minimum weight increases, accuracy (yellow line and rightmost Y-axis) and total reports (blue line) on the left-hand scale increase. The green line represents the number of true positives, which are verified by SmokeDetector user feedback.

This shows the number of posts we've automatically flagged per day over the last month. The jump on February 15th is due to increasing the number of automatic flags from 1 per post to 3 per post. You can see a live version of this graph on metasmoke's autoflagging page.

Spam arrives on Stack Exchange in waves. It is easy to see the time of day that many spam reports come in. The hours above are in UTC. The busiest time of day for spam is the 8 hour block between 4am and noon. We have affectionately named this "spam hour" in the chat room.
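An hourly breakdown like the one described above can be produced by bucketing report timestamps by UTC hour. The sketch below is an assumption about how one might do that, not how metasmoke builds its graph. The function names are made up, and the timestamps are expected to be timezone-aware `datetime` objects.

```python
from collections import Counter
from datetime import datetime, timezone

# From the post: "spam hour" is the block between 4am and noon UTC.
SPAM_HOUR_START = 4
SPAM_HOUR_END = 12


def is_spam_hour(timestamp):
    """True if a timezone-aware timestamp falls between 04:00 and 12:00 UTC."""
    hour = timestamp.astimezone(timezone.utc).hour
    return SPAM_HOUR_START <= hour < SPAM_HOUR_END


def reports_per_utc_hour(timestamps):
    """Count reports in each UTC hour of the day (0-23)."""
    return Counter(ts.astimezone(timezone.utc).hour for ts in timestamps)
```

Splitting deletion times with `is_spam_hour` is one way to compare how quickly spam is removed inside and outside that block.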

Our goal is to delete spam quickly and accurately. The graph shows the time it takes for a reported spam post to be removed from the network. This section has three trend lines that show these averages. The first, red section is when we were simply reporting the posts to chatrooms and all flags had to come from users. You can see we are pretty constant in the time it takes to remove spam during this period. It took, on average, just over five minutes to get a post removed.

The green trend line is when we were issuing a single automatic flag. At implementation, we eliminated a full minute from time to deletion and after a month we'd eliminated two full minutes compared to no automatic flags.

The last section, the orange, is when we implemented three automatic flags on most sites. This was rolled out last week, but it's already had a dramatic improvement on the time to deletion. We are seeing a time to deletion of between 1 and 2 minutes.

As mentioned above, spam arrives in waves. The dashed and dotted lines on the graph show the average deletion time during these two different time periods. The dashed lines show deletion time between 4am and noon UTC, the dotted lines show the rest of the 24 hour period. An interesting thing this graph shows is that time to deletion during spam hour was higher when we didn't cast any automatic flags. It was removed faster outside of spam hour. That reversed when we started issuing a single auto-flag. The spam hour time to deletion is slightly lower than the average. Comparing the two time periods though, time to deletion during non-spam hour at the end of the non-flagging time period and the end of the single flag period are roughly the same.

We'll update these in a few weeks too, to better show the trend we are seeing with three automatic flags.

Discussion¶

We are confident in SmokeDetector and the three years of history it has. We've had many talented developers assist us over the years and many more users have provided feedback to improve our detection rules. Let us know what you want us to elaborate on, features you're wondering about or would like to see added, or things we might have missed in the process or the tooling. Take a look at the feature we'd really like Stack Exchange to consider so that we can further improve this system (and some of the other community built systems). We'll have Charcoal members hanging around and answering your questions. Alternatively, feel free to drop into Charcoal HQ and have a chat.

Third time's the charm?

2016-11-06T22:54:00-06:00

Introduction¶

Last year, I ran for moderator (twice) on Stack Overflow and didn't make it through the primaries. I came close on that second run. Now, a year later, and a year more experienced, I'm going to try again. This post will document my progress through the election cycle.

Spoiler Alert: I didn't win. The rest of this post details my thoughts as the election occurred though.

Nomination Phase¶

The election this year took a slightly different route than last time. In previous years, the election was announced at the same time as the call for nominations began. Users had a week to nominate themselves, then we answered a series of community provided questions during the primaries, then the final election.

This year, the election was announced a week in advance of nominations. During the week, a call was put out for Community questions. When a nomination was posted, the answers would be posted as well. This change was made due to how much the community needed to read during the primaries. The primaries were only a few days long and the Q&As were usually ten questions for each user. When a primary has 20-30 nominees, that is a lot of reading that was expected in a short period of time. By bringing this phase forward, now the community has the entire election cycle to read and interact with the nominees.

I provided one question that was used in the final selection of questions. I mentioned last time that I thought it was a great question, so I suggested it again:

Do you have any Meta posts that you're particularly proud of, or that you feel best demonstrate your moderation style?

My nomination¶

My platform isn't all that different than the last two times.

Hi Everyone, I'm Andy and I'd like to be a moderator for you and Stack Overflow. I've answered the questions posted by the community here. I encourage you to take a look.

Why should you vote for me?¶

I've been a moderator on Community Building for over two years. I know the moderator tools and have worked with many of the current moderators. This interaction will continue as a new moderator here.

I have a lot of helpful flags. A decent percentage of these are on comments, but not all. I'd like to help keep the site clean without adding to the current moderators' work load.

I'm active in the review queues (currently holding 5th in Low Quality Post reviewers of all time), provide edits to posts, answers and enjoy the moderation aspect of Stack Exchange.

I have a history on Meta.SO that shows I'm involved in the meta aspect of the site as well.

I enjoy the moderation aspect on Stack Overflow (and Stack Exchange in general). I have a history of good community moderation, am here all the time and believe I can help the current team.

During the first full day, I've gotten positive responses to this post. My two favorites, so far, are:

Andy's work around comment flags has been very impressive. I'm definitely curious to see what his thoughts on the mod queue are and if we could incorporate some of his work permanently on the site. Better identification of flags is something that would be very nice to have permanently. - bluefeet Stack Overflow Community Manager

and

There are always some nominees for this position who are very active, some who have good judgment and cool heads, and some who innovate with their approach to community moderation. Andy is the rare candidate who very clearly checks all three boxes. As a user on SO for 3.5 years, a moderator pro tempore on Engineering SE for 1.7 years and an early participant in the Community Building SE beta, I strongly support this nomination. - Air Moderator on Engineering.SE

My candidate score this time is an impressive 39/40. This is up six from a year ago, and up ten from my first run. The one missing point is due to missing the Refiner badge. I believe the reason for this is because of my workflow. I, generally, don't edit and answer questions at the same time. If I'm answering, I'm not in "edit" mode. If I'm editing, I'm usually in "moderation" mode. It's something I'll work on. I'm 38 out of 50 questions there, so I'll get it soon enough.

Candidate questions¶

None of the questions were that surprising. With the added benefit of a week to prepare answers prior to nominating, I am very pleased with my answers. Two answers have generated a bit of discussion though.

A 10k+ user regularly has their comments flagged as "rude or offensive" or "not constructive", to the tune of 4-5 flags a day. No comment by itself is particularly offensive, but their general tone causes them to be flagged by multiple users. You've contacted them privately about this, but they believe that they aren't doing anything wrong and that people are being too sensitive. The flags keep coming in on their comments. What, if anything, do you do next?

My response is:

No one has an exemption from the Be Nice policy. I think the first step is to understand why nothing has already been done about the user. 4-5 a day seems like the user has moved beyond the "nuisance" stage. I think a temporary ban is appropriate, with another explanation as to what is expected when interacting with others. While some users are more sensitive than others, a stream of this many flags across an extended period of time doesn't lead me to believe the problem is with the community users.

The point raised in the comments was that I was rushing into banning the user without communicating first. I disagree with that, and explained that they've already been contacted privately and ignored those warnings. A ban is the next step in getting the user's attention. I was told this would be "humiliating" for a high rep user. Again, I disagree and believe it's not humiliating, but educating the user.

The second question that generated some discussion was:

You impose a temporary ban (say 1 week) on a user for what you judged as reasonable and valid reasons (the user gets notified by email of your action and the reason). The user replies to your email acknowledging the transgression, says they won't do it again and asks for the ban to be lifted. The user sounds genuine. Do you remove the ban? Do you even reply at all? Explain your reasoning. The context of this question applies to longer bans too. If it helps get the juices flowing, consider the situation of a second offence for the same behaviour, which has a default ban period of 1 month.

My response:

I have two answers for this question, based on the user's history. If this is a first offense, up to this point the user hasn't been pushing limits and attempting to disrupt others, and the ban isn't related to voting fraud, then I'd be willing to remove the ban. Sometimes a ban is put in place to get the user's attention. Once the situation has been resolved, the ban is no longer appropriate and should be removed.

On the other hand, if the user has a history of crossing the line and looking for a reaction, or if the ban is related to vote fraud, I'd simply not reply and the user will return in a week. Stack Overflow has enough "voting irregularity" bans that I imagine the responses to such bans are all similar (and invalid). I see no reason to change that policy.

The push back I received on this was that I was letting a user off the hook by unbanning them. I argued that unbanning has been done in the past. Sometimes the ban is needed simply to get the user's attention and start the conversation and explain that what they are doing is wrong. If the user abuses the trust at that point and repeats the behavior, then the longer ban is completely justified. A bit of compassion isn't a bad thing.

Primary Phase¶

There are 12 nominees, so a primary will occur. Once again, the primary phase will reduce the number of candidates in the final phase to 10. With so few being eliminated this time around, it feels a little unneeded. The primary will last for a few days and during that time users can vote candidates up or down depending on whether they believe the nominee should be a moderator. I'll return in a few days...

Primary Results¶

The Primary phase has ended and the final election has begun. I ended the primary in 5th place, securing a position in the final election. I have a sizable margin between my position and sixth place as well. One other stat that I'm rather proud of: I received the fewest number of down votes of any candidate.

On to the election!

Election Phase¶

The election lasts for several days and covers a weekend. We'll see how it turns out in a few days.

Election Results¶

Well, the election has concluded. I didn't secure one of the three positions for moderator. I finished in 5th place, with my elimination propelling second and third place to a victory. I was eliminated in the 10th round of the Meek STV process.

Good luck to the new moderators!

Post Election thoughts¶

This election started differently than the previous two I've run in. This election was announced a week in advance and solicited community input for questions for the candidates. I think this was a good change. The element of surprise in the previous two made it much more stressful. Additionally, by having the questions available at the start of the election - instead of at the start of the primary phase - I was able to better answer the questions. Previously, the questions would be available at the start of the primary phase. With the amount of reading needed to get through one candidate's answers, let alone all of them, I imagine that many people didn't read all of the responses.

The other nice thing about this lead time is that I had time to get my answers ready for when I posted my nomination. By posting the questions and answers at the same time, I was able to have my responses available the entire time. Score-wise, on the questionnaire, I did much better than my opponents. I think a big reason for this is that I had my responses posted as soon as my nomination was posted.

One question this time, though, seemed to split the candidates. I mentioned it previously, but it was regarding potentially removing a temporary ban.

You impose a temporary ban (say 1 week) on a user for what you judged as reasonable and valid reasons (the user gets notified by email of your action and the reason). The user replies to your email acknowledging the transgression, says they won't do it again and asks for the ban to be lifted. The user sounds genuine. Do you remove the ban? Do you even reply at all? Explain your reasoning. The context of this question applies to longer bans too. If it helps get the juices flowing, consider the situation of a second offence for the same behaviour, which has a default ban period of 1 month.

I was one of two candidates who explicitly stated we'd consider removing the ban. A third user didn't state it explicitly, but did say they'd consider it. I was surprised by the harsh tone the others took, especially since there are a lot of previous discussions on Meta where the outcome is the moderators or community managers removing the ban. I was happy to see that the other candidate who said they'd consider removing the ban got elected, though.

I still believe that removing the ban is a valid option. Especially because their next ban would be much longer if they broke my trust.

We'll see when the next election on Stack Overflow is, but with three new moderators and no resignations, I suspect it'll be a while. I'll consider running again then.

I'm running for moderator on Stack Overflow again

2015-11-18T09:38:00-06:00

Introduction¶

In April, I ran for moderator on Stack Overflow and didn't make it through the primaries. That's ok though; there were several very good users who did get elected. In a surprise announcement, though, Stack Overflow is running a second election this year. This is the first time this has happened since 2011. I'm still interested in a position and I'm still active in the community, so I'm going to run again. This post will follow the process.

Nomination Phase¶

Like last time, the nomination phase began with users throwing their hat into the ring. Nominations were slower and fewer this time. Only 19 nominees, so no one was eliminated due to low reputation. Several users from the last election are rerunning too.

My Platform¶

My platform hasn't changed much since the previous run. Below is my nomination post. This time, I tried to pull emphasis off the automated script by putting it lower on the list of things I've done and instead focused on the moderation tasks I do on Stack Overflow and the work I've done on Community Building. We'll see if it works.

Hi Everyone, I'm Andy and I'd make a great moderator for Stack Overflow.

Why vote for me?¶

I'm active in the review queues (currently holding 10th in Low Quality Post reviewers of all time), provide edits to posts, answers and enjoy the moderation aspect of Stack Exchange

I've been a moderator on CommunityBuilding for nearly a year and a half. I know the moderator tools and have worked with several of the current moderators. This interaction will continue as a new moderator here.

I've built an automated script that continues to handle noisy comments very accurately.

I have a history on Meta.SO that shows I'm involved in the meta aspect of the site as well.

I have a history of good community moderation already. I enjoy the moderation aspect on Stack Overflow (and Stack Exchange in general). I deal with users with respect, even if our opinions on an issue differ.

With this, I received my "candidate score". It was 33/40. Not the highest, but better than last time. The score wasn't mentioned in April. I am not expecting it to be an issue this time either.

Primary Phase¶

Updated November 21, 2015

The primary phase is in the third day. In day 1, I was hovering around 9th/10th place. Overnight, between days 1 and 2 though, I dropped down to 11th. I've been sitting here consistently for a full day now and, while still gaining votes, I'm not gaining as fast as 10th position. It appears I may not make the cutoff by Friday's deadline. While disappointing, there are a few things that I came away with that I'm very happy about.

In the last primary, I received 1,492 positive votes. I've surpassed that already. I have over 2,100 currently. I'm pleased with that upswing. I was also more prepared for the questionnaire portion of the primaries this time. I've gotten the second highest number of upvotes on my responses. Several of the questions were similar to last time, but there are a few that I think should be included in the future elections.

Questionnaire¶

This first question is a great post for candidates. It allows them to show off their involvement in Meta and show their best work. For users, it gives them a sense of how a candidate interacts with the community. I am very surprised that several candidates list only one or two posts. This seems to be doing a disservice to themselves.

Do you have any Meta posts that you're particularly proud of, or that you feel best demonstrate your moderation style?

My response to this question:

I'm proud of several of my posts both here on Meta.SO and on other network sites I participate in. Here on MSO, I have two questions that I am proud of:

Can a machine be taught to flag comments automatically?

I estimate 10% of the links posted here are dead. How do we deal with them?

In both of these, you can see that I care about quality on Stack Overflow. I've spent time analyzing the problem, as I see it, and present my findings to the community. I participated in the discussions that both posts generated and continue to run the bot to this day.

Elsewhere on the network, my participation in meta has helped to shape communities. For example, on Hardware Recommendations, my meta post about "What type of hardware is allowed" helped to set the scope of what the community accepts as on topic hardware. I've also helped to set up the high quality guidelines for questions and argued against certain types of tags and hardware.

With all of these, I've presented my arguments and logic and strived to remain professional. I believe the community on HardwareRecs has seen that as well.

As a moderator on Community Building, I've been involved in many discussions. I was involved in the discussions to rename the community from Moderators.SE to CommunityBuilding.SE. I've been involved in discussions about slow growth of the community. I've also presented arguments that go against other moderators, and walked away still feeling like a moderation team. (Go communication!)

Finally, on OpenSource, I made a post about how moderators had implemented a policy to watch the reviewers. It was similar to the long removed "flag weight" option that used to exist. I believe the post was presented in a way that questioned the decisions of the moderators, yet remained professional.

With all of these meta posts, across the network, I think you can pick up on my moderation style and personality. I like data and I try to present my thoughts in a way that is understandable to all. I'm also willing to speak my mind.

This second question I struggled with for a bit. I've had ideas on how Stack Overflow/Stack Exchange could improve, but what did I want to present in this response?

If you could add/revise one Stack Overflow policy/guideline, what would you change? Why would you change it, and what would it mean for the community?

My response to this question:

At the risk of talking myself out of a position, I think more community moderation would help the problem that Stack Overflow has with scaling moderators. There are a couple areas that I think would work well in opening this to the higher reputation users

Comment flagging: Comments can be removed if enough users flag a comment. If not, a moderator needs to handle the flag. Instead, opening this as a review queue can remove a lot of this burden from the moderators. Users could handle all but the "Other..." flag. There may be guidance needed on the "Obsolete" one due to the difference between "obsolete comment" and "obsolete code block" differences.

Audit Review reviews: On Stack Overflow, we get a decent number of disputed audit review posts on meta. There may be a way to get users with a history of passing both audits and good reviews involved in dealing with these disputed audits. The idea would be to say whether an audit is good or not.

These changes, and other areas where the community could be leveraged for moderation tasks, helps to remove the burden on moderators. Handling 2,000 (and growing) flags a day means that something needs to change. Moderators are exception handlers. They should be handling the cases that are exceptional - not comments that are no longer relevant.

For the community, this would be more involvement with the moderation aspect. Users would be able to more quickly clean up a comment thread. Flag it and it appears in the review queue. From here, the moderators don't need to be involved. The downside of this is that it adds another queue for users to be involved with.

Primary Results¶

With the primaries over, I ended with 2483 positive votes. This put me in 11th place. Sadly, this was not enough to get into the election. I was 185 votes shy of overtaking 10th. Good luck to the candidates that made it.

One of the tools that came out of this election was a way to visualize various data points to compare candidates. I provided a couple of notes about the outliers various candidates showed in different aspects of the site. I found it interesting to see what each user had "specialized" in.

Election¶

Updated December 8, 2015

The election is over and the new moderators have settled in. We've had our first bout of public drama over one of these moderators actually moderating a chat room too. gasp

Final thoughts¶

I was closer to the top 10 this time, but still missed it. Even more surprising was that the user that ended up in 3rd in the primaries didn't even come close to getting elected. He was eliminated in the 5th round of final STV votes. I still think I'd make a great moderator for Stack Overflow, but I need to figure out the best way to promote myself in the next election.

Link Analysis - Technical Explanation

2015-08-10T23:41:00-05:00

Introduction¶

In my last two posts, I've discussed the number of rotten links on Stack Overflow and a proposal to fix said links. In this post, I'm going to discuss how I performed this analysis.

Set up¶

The database¶

The process began by downloading the March 2015 data dump. I loaded the posts into a MariaDB instance on my local machine. This was accomplished with a very simple script and patience, as the script took a while to run.

load xml local infile '/path/to/posts.xml'
into table posts
rows identified by '<row>';

The data¶

Once this was done, the next step was selecting my random sample of data. I did this by randomly selecting 25% of the days in a year and then pulling all posts for those days across all years of Stack Overflow's existence. The Python script I used to do this was fairly simple:

from datetime import timedelta, datetime
from random import randint
from math import ceil

def random_date(start, end):
    return start + timedelta(
        seconds=randint(0, int((end - start).total_seconds())))

percentage = 0.25
days = 366

# Pick 25% of the days in a (leap) year at random; duplicates are possible
dayslist = []
for d in range(int(ceil(days*percentage))):
    dayslist.append(random_date(datetime(2008,1,1), datetime(2008,12,31)))

At the end of this run, the days that I cared about are in the dayslist variable. I used that to pull questions and answers from the database that were created on that month/day combination. In the end, this resulted in just over 25% of the total posts being selected. To ensure that I could replicate the results, I also saved the dates that were selected.
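The month/day selection described above can be sketched roughly as follows. This is only an illustration, not the original query: the `posts` list, the `creation_date` key, and the helper names are assumptions made for the example, and the real selection ran against the database.

```python
from datetime import datetime

def month_day_keys(dayslist):
    """Collapse the sampled datetimes to a set of (month, day) pairs."""
    return {(d.month, d.day) for d in dayslist}

def in_sample(creation_date, keys):
    """True when a post was created on one of the sampled month/day pairs."""
    return (creation_date.month, creation_date.day) in keys

# Hypothetical sampled days and posts, for illustration only
sampled = [
    datetime(2008, 3, 14, 9, 30),
    datetime(2008, 7, 4, 22, 5),
    datetime(2008, 3, 14, 1, 0),  # the same day drawn twice collapses to one key
]
posts = [
    {"id": 1, "creation_date": datetime(2010, 3, 14, 12, 0)},
    {"id": 2, "creation_date": datetime(2012, 7, 5, 8, 0)},
    {"id": 3, "creation_date": datetime(2014, 7, 4, 23, 59)},
]

keys = month_day_keys(sampled)
selected = [p["id"] for p in posts if in_sample(p["creation_date"], keys)]
```

Reducing the dates to month/day pairs ignores the year, which is why posts from every year of the site's existence land in the sample.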

Parsing the data¶

The next step was to parse out links from the data. I used the following script to extract anchor text and links from a post.

def links_in_post(post):
    """
    Returns a list of all unique links and images found in a post
    :param post: An HTML string containing the body of a post, e.g.
        "<b>This is HTML</b>"
    :return: A list of tuples containing anchor text and URL
        [
            ('Display Text', 'http://example.com')
        ]
    """
    logging.debug("Extracting links...")
    links = []
    images = []
    regexp = "&.+?;"
    list_of_html = re.findall(regexp, post)
    for e in list_of_html:
        if e in invalid_entities:
            h = HTMLParser.HTMLParser()
            unescaped = h.unescape(e) 
            post = post.replace(e, unescaped)

    doc = html.fromstring(post)
    for link in doc.xpath('//a'):
        links.append(Link(text=link.text_content(), link=link.get('href')))
    for image in doc.xpath('//img'):
        images.append(Link(text=image.get('alt'), link=image.get('src')))
    all_items = links + images
    seen = set()
    unique_items = [item for item in all_items if item[1] not in seen and not seen.add(item[1])]
    return unique_items

The regular expression finds HTML entities in the post so that the problematic ones can be unescaped before parsing. This was needed due to weird parsing issues with non-ASCII characters. Fortunately, I wasn't the first to encounter oddities like this. The list comprehension at the end of the function returns only the first anchor text/link tuple seen for each URL. I was surprised how often I'd end up with tuples such as ('this', 'http://google.com') repeated in the same post. This uniqueness saved a lot of processing time later.
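The de-duplication idiom in that final list comprehension can be tried in isolation. Below is a minimal sketch; the `Link` namedtuple is an assumption about how the tuple was defined, since its definition isn't shown in this post.

```python
from collections import namedtuple

# Assumed shape of the tuple used above: (anchor text, URL)
Link = namedtuple("Link", ["text", "link"])

def unique_by_url(items):
    """Keep the first item seen for each URL, preserving the original order."""
    seen = set()
    # set.add() returns None, so "not seen.add(...)" is always True;
    # it is only there to record the URL as a side effect.
    return [item for item in items
            if item.link not in seen and not seen.add(item.link)]

items = [
    Link("this", "http://google.com"),
    Link("this", "http://google.com"),
    Link("docs", "http://example.com"),
    Link("search", "http://google.com"),
]
result = unique_by_url(items)
```

Because the key is the URL alone, a second anchor text pointing at an already seen URL ('search' in this example) is dropped as well.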

After all links in a post had been extracted, this information, along with information about the post itself, was saved to the database. If a post had no links, it was not saved. The database consisted of three tables.

Links - This table contains the base URLs seen in all posts. URLs are distinct. It also contains an ID that will be utilized for linking to the other tables.
Post Links - This table contains information about links in a post. This includes the specific anchor text/link combinations
Link Results - This table contains the results of link status checks
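To make the relationship between the three tables concrete, here is a rough sketch. It uses SQLite through Python's standard library purely so the example is self-contained (the analysis itself used MariaDB), and every table and column name is a guess made for illustration.

```python
import sqlite3

# Hypothetical schema; column names are assumptions, not the original ones
SCHEMA = """
CREATE TABLE links (
    id  INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE          -- each URL is stored once
);
CREATE TABLE post_links (
    post_id     INTEGER NOT NULL,     -- post the link appeared in
    link_id     INTEGER NOT NULL REFERENCES links(id),
    anchor_text TEXT
);
CREATE TABLE link_results (
    link_id     INTEGER NOT NULL REFERENCES links(id),
    checked_at  TEXT NOT NULL,        -- when the status check ran
    status_code INTEGER               -- NULL when no response arrived
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

def link_id_for(conn, url):
    """Insert the URL if it is new and return its ID either way."""
    conn.execute("INSERT OR IGNORE INTO links (url) VALUES (?)", (url,))
    return conn.execute("SELECT id FROM links WHERE url = ?", (url,)).fetchone()[0]

first = link_id_for(conn, "http://example.com")
second = link_id_for(conn, "http://example.com")
conn.execute("INSERT INTO post_links VALUES (?, ?, ?)", (42, first, "Display Text"))
```

Keeping URLs distinct in one table means each URL only needs to be checked once, no matter how many posts reference it.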

Processing the posts was fairly time consuming, but was able to be parallelized easily. That significantly cut down on processing time.

Checking the links¶

The most time consuming portion of this entire project was actually checking link status. Each link that appeared in the Links table was checked. As I mentioned in my first post, the original idea was to simply send a HEAD request to each URL. The idea was to save myself and the end point a tiny amount of bandwidth. I had over a million links to process. I figured a little saved bandwidth wouldn't hurt.

It turns out this isn't a good idea. When I started seeing larger sites as not being accessible, I got suspicious that something was wrong. These sites were returning status 405 errors. This indicates that the method is not allowed. So, I switched to GET for every link. The next problem I ran into was that many sites didn't like the default user agent of the spider. They rejected requests with 404 and 401 errors. In the end, I got around this by mimicking Firefox on every request.

With those kinks worked out, every link was sent a GET request that looked to be from a Firefox browser. The process would allow 20 seconds per link. If the link didn't respond in that time limit, it was declared inaccessible.
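A check along those lines might look something like the sketch below. It is not the spider that was actually used: it relies on Python 3's standard `urllib`, the user agent string is only an example of a Firefox-style value, and the `opener` parameter is a hypothetical hook added so the function can be exercised without touching the network.

```python
import urllib.error
import urllib.request

# Example Firefox-style user agent; the exact string used is an assumption
FIREFOX_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0"

def build_request(url):
    """Build a GET request that presents itself as a Firefox browser."""
    return urllib.request.Request(url, headers={"User-Agent": FIREFOX_UA}, method="GET")

def check_link(url, timeout=20, opener=urllib.request.urlopen):
    """Return the HTTP status code, or None if the link was inaccessible."""
    try:
        with opener(build_request(url), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 405
        return err.code
    except (OSError, ValueError):
        # Timeouts, DNS failures, refused connections, malformed URLs
        return None
```

Returning None for anything that never produced a status code keeps "no answer within 20 seconds" distinct from "answered with an error".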

A week later, I repeated the process with anything that hadn't returned a status code less than 400. Once more, on the third week, I repeated this with the failed links. At the end of three weeks, I had a list of sites that were inaccessible to me - on a residential connection - three times over a period of three weeks.
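The three-pass approach can be summarized in a few lines. This is a simplified sketch: `checker` stands in for whatever function performs the request and returns a status code (or None), and the week-long pause between passes is left out.

```python
def is_accessible(status):
    """A link counts as accessible when it returned a status code below 400."""
    return status is not None and status < 400

def recheck_failures(urls, checker, rounds=3):
    """Run up to `rounds` passes, re-testing only links that failed the previous pass."""
    pending = list(urls)
    for _ in range(rounds):
        pending = [url for url in pending if not is_accessible(checker(url))]
        if not pending:
            break
        # In the process described above, a week passed before the next pass.
    return pending
```

Whatever is still in the returned list after the final pass failed every time it was tested.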

Results¶

The SVG image that I created for the write up was generated with Pygal. The tables were the result of various breakdowns of the data via queries to the status results.

Wrap up¶

I am rather proud of how the results turned out for this project. I went into it expecting about 15% of links to be broken, but I didn't really realize what that meant. Fifteen percent of 21 million total posts is over 3 million. That's a large number. BUT, it also ignored that a large percentage of posts don't contain links. I failed to consider that in my original hypothesis.

Less than half of my sample had links (2.3M out of 5.6M). Of the 2.3M with links, only 1.5M were unique links. The final result of 10% failed links makes much more sense in this context. Ten percent of 1.5M links means that there are 150K links that are bad.
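For anyone who wants to check the arithmetic, the figures above work out as follows. The numbers are the rounded ones quoted in this post, so the results are approximate.

```python
# Original hypothesis: 15% of all posts would contain a broken link
total_posts = 21_000_000
naive_estimate = round(total_posts * 0.15)   # over 3 million

# What the sample actually looked like
sample_posts = 5_600_000
posts_with_links = 2_300_000
share_with_links = posts_with_links / sample_posts   # a little over 40%

# Final result: 10% of the unique links failed
unique_links = 1_500_000
dead_links = round(unique_links * 0.10)      # 150K
```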

A proposal to fix broken links on Stack Overflow

2015-08-07T07:34:00-05:00

This post was published by me on Meta Stack Overflow on August 7th, 2015. I've republished it here so that I can easily update information related to recent developments. If you have questions or comments, I highly encourage you to visit the question on Meta Stack Overflow and post there.

This is a follow up to yesterday's post about how many links on Stack Overflow are starting to rot.

The proposal¶

I propose another hybrid of the previous broken link queue (as was mentioned above in comments and other answers) and an automated process to fix broken links with an archived version (which has also been suggested).

The broken link queue should focus on editing and fixing the links in a post (as opposed to closing it). It'd be similar to the suggested edits queue, but with the focus on correcting links, not spelling and grammar. This could be done by only allowing a user to edit the links.

One possibility I envision is presenting the user with the links in the post and a status showing whether or not each link is available. If it's not available, give the user a way to change that specific link. Utilizing this post, I have a quick mock up of what I propose such a review task would look like:

The Queue¶

All the links that appear in the post are on the right hand side of the screen. The links that are accessible have a green check mark. The ones that are broken (and the reason for being in this queue) have a red X. When a user elects to fix a post, they are presented with a modal showing only the broken URLs.

The Automation¶

With this queue, though, I think an automated process would be helpful as well. The idea is that this would operate similarly to the Low Quality queue, where the system can automatically add a post to the queue if certain criteria are met or a user can flag a post as having broken links. I've based my idea on what Tim Post outlined in the comments to a previous post.

Automated process performs a "Today in History" type check. This keeps the fixes limited to a small subset of posts per day. It also focuses on older posts, which were more likely to have a broken link than something posted recently. Example: On July 31, 2015, the only posts being checked for bad links would be anything posted on July 31 in any year 2008 through current year - 1.
Utilizing the Wayback Machine API, or similar service, the system attempts to change broken links into an archived version of the URL. This archived version should probably be from "close" to the time the post was originally made. If the automated process isn't able to find an archived version of the link, the post should be tossed into the Broken Link queue.
When the Community edits a post to fix a link, a new Post History event is utilized to show that a link was changed. This would allow anyone looking at revision history to easily see that a specific change was only to fix links.
Actions performed in the previous bullets are exposed to 10K users in the moderator tools. Much like recent close/delete posts show up, these do as well. This allows higher rep users to spot check (if they so desire). I think this portion is important when the automated process fixes a link. For community edits in the queue, the history tab in /review seems sufficient.
If a post consists of a large percentage of a link (or links) and these links were changed by Community, the post should have further action taken on it in some queue.

Example: A post where X+% of the text is hyperlinks is very dependent on the links being active. If one or more of the links are broken, the post may no longer be relevant (or may be a link only post). One example I found while doing this was this answer.

I don't think that this type of edit from the Community user should bump a post to the front page. Edits done in the broken link queue, though, should bump the post just like a suggested edit does today. By preventing the automated Community posts from being bumped, we prevent the front page from being flooded, daily, with old posts and these edits. I think that the exposure in the 10K tools and the broken link queue will provide the visibility needed to check the process is working correctly.
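The "Today in History" selection and the archive lookup from the list above could be sketched as follows. This is not how Stack Exchange would necessarily build it. The query URL and response fields follow my understanding of the Wayback Machine availability API and should be verified against the Internet Archive's documentation; the function names are made up for the example, and no network request is made here.

```python
from datetime import date
from urllib.parse import urlencode

FIRST_YEAR = 2008  # first year of posts, per the example above

def history_dates(today):
    """Same month/day as `today` for every year from FIRST_YEAR through last year."""
    dates = []
    for year in range(FIRST_YEAR, today.year):
        try:
            dates.append(date(year, today.month, today.day))
        except ValueError:
            # February 29 does not exist in non-leap years; skip those years
            continue
    return dates

def wayback_query_url(url, posted_on):
    """Build an availability query for the snapshot closest to the post date."""
    params = urlencode({"url": url, "timestamp": posted_on.strftime("%Y%m%d")})
    return "https://archive.org/wayback/available?" + params

def closest_snapshot(payload):
    """Return the archived URL from a decoded JSON response, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

A post whose link yields None from `closest_snapshot` would be the kind that falls through to the Broken Link queue.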

Process flows¶

Queue Flow:

Automated process flow:

Potential pitfalls¶

The automated link checking will likely run into several of the problems I did. Mainly:

Sites respond to a HEAD request with a 404 instead of a 405. My solution to this was to issue GET requests for everything.
Sites don't like certain user agents. My solution to this was to mimic the Firefox user agent. To be a good internet citizen, Stack Exchange probably shouldn't go that far, but providing a unique user agent that is easily identifiable as "StackExchangeBot" (think "GoogleBot"), should be helpful in identifying where traffic is coming from.
Sites that are down one week and up another. I solved this by spreading my tests over a period of 3 weeks. With the queue and automatic linking to an archived version of the site, this may not be necessary. However, immediately converting a link to an archived copy should be discussed by the community. Do we convert the broken link immediately? Or do we try again in X days and convert it only if it's still down? It was suggested in another answer that we first offer the poster the chance to make changes before an automatic process takes place.
The need to throttle requests so that you don't flood a site with requests. I solved this by only querying unique URLs. This still issues a lot of requests to certain, popular, domains. This could be solved by staggering the checks over a period of minutes/hours versus spewing 100s - 1000s of GET requests at midnight daily.

With the broken link queue, I feel the first two would be acceptable. Much like posts in the Low Quality queue appear because of a heuristic, despite not being low quality, links will be the same way. The system will flag them as broken and the queue will determine if that is true (if an archived version of the site can't be found by the automated process). The bullet about throttling requests is an implementation detail that I'm sure the developers would be able to figure out.
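
The staggering idea from the throttling bullet could look roughly like the sketch below. It assumes a hypothetical `schedule_checks` helper and an arbitrary per-domain delay (neither comes from my original scripts), and it only plans when each unique URL would be checked; it sends no requests.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def schedule_checks(urls, per_domain_delay=60):
    """Assign each unique URL a start offset in seconds.

    URLs on the same domain are spaced per_domain_delay seconds apart,
    so a popular domain is not hit with all of its requests at once.
    The helper name and the default delay are assumptions.
    """
    next_slot = defaultdict(int)  # domain -> next free offset
    schedule = []
    for url in dict.fromkeys(urls):  # drop duplicates, keep first-seen order
        domain = urlsplit(url).netloc.lower()
        schedule.append((next_slot[domain], url))
        next_slot[domain] += per_domain_delay
    # Earliest offsets first, so a worker can walk the plan in order.
    return sorted(schedule)
```

A worker would then sleep until each offset before issuing the request. Whether the spacing should be seconds, minutes, or hours is exactly the implementation detail left to the developers.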

Analysis of links posted to Stack Overflow

2015-08-06T07:35:00-05:00

This post was published by me on Meta Stack Overflow on August 6th, 2015. I've republished it here so that I can easily update information related to recent developments. If you have questions or comments, I highly encourage you to visit the question on Meta Stack Overflow and post there.

TL;DR: Approximately 10% of 1.5M randomly selected unique links in the March 2015 data dump are unavailable. To be more precise, that is approximately 150K dead links.

Motivation¶

I've been running into more and more links that are dead on Stack Overflow and it's bothering me. In some cases, I've spent the time hunting down a replacement, in others I've notified the owner of the post that a link is dead, and (more shamefully), in others I've simply ignored it and left just a downvote. Obviously that's not good.

Before making sweeping generalizations that there are dead links everywhere, though, I wanted to make sure I wasn't just finding bad posts because I was wandering through the review queues. Utilizing the March 2015 data dump, I randomly selected about 25% of the posts (both questions and answers) and then parsed out the links. This works out to 5.6M posts out of 21.7M total.

Of these 5.6M posts, 2.3M contained links and 1.5M of these were unique links. I sent each unique URL a GET request, with a user agent mimicking Firefox¹. I then retested everything that didn't return a successful response a week later. Finally, anything that failed from that batch got one final test a week after that. If a site was down in all three tests, I considered it down for this analysis.
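
The three-pass retest can be sketched as below. This is a minimal illustration, not the script I ran: `find_dead_links` and the `fetch` callable are hypothetical names, and the week of waiting between passes is left to the caller.

```python
def find_dead_links(urls, fetch, passes=3):
    """Return the URLs that failed in every pass.

    `fetch` is a caller-supplied function (an assumption of this
    sketch) that takes a URL and returns True when the link answered
    successfully. Only links that failed the previous pass are retested.
    """
    pending = list(dict.fromkeys(urls))  # unique URLs, original order
    for _ in range(passes):
        pending = [url for url in pending if not fetch(url)]
        if not pending:
            break  # nothing left to retest
    return pending
```

Passing the fetcher in keeps the retry logic separate from the HTTP details (method, user agent, timeout), which changed several times over the course of the analysis.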

Results²¶

By status code¶

Good news/Bad News: A majority of the links returned a valid response, but there are still roughly 10% that failed.

(This image is showing the top status codes returned)

The three largest slices of the pie are status 200 (site working!), status 404 (the server responded, but said the page isn't found) and Connection Errors. Connection errors are sites that gave no proper server response: the request to access the page timed out. I was generous with the timeout and allowed a request to live for 20 seconds before failing a link with this status. The 4xx and 5xx errors are status codes that fall in the 400 and 500 ranges of HTTP responses. These are the client and server error ranges, and thus counted as failures. The 2xx slice covers pages that responded with a success code in the 200 range other than 200 itself. Finally, there were just over a hundred sites that hit a redirect loop that didn't seem to end. These are the 3xx errors. I failed a site with this status if it redirected more than 30 times. A negligible number of sites returned status codes in the 600 and 700 range⁴.
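
The bucketing described above can be written down as a small function. This is a sketch under assumptions: `classify` and its parameters are hypothetical names, and the labels simply mirror the slices named in the text.

```python
def classify(status_code=None, timed_out=False, redirects=0, max_redirects=30):
    """Map the outcome of one request to a chart bucket.

    Hypothetical helper: a timeout (20 seconds in the analysis) or a
    missing response is a connection error, and more than max_redirects
    redirects counts as a 3xx failure.
    """
    if timed_out or status_code is None:
        return "Connection Error"
    if redirects > max_redirects or 300 <= status_code < 400:
        return "3xx"
    if status_code in (200, 404):
        return str(status_code)  # the two big slices keep their own label
    if 200 <= status_code < 300:
        return "2xx"
    if 400 <= status_code < 500:
        return "4xx"
    if 500 <= status_code < 600:
        return "5xx"
    return "other"  # e.g. the handful of 6xx and 7xx responses
```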

By most common¶

As expected, many of the URLs that failed appear frequently in the sample set. Below is a list of the top 50³ URLs that appear in posts most often, but failed three times over the course of three weeks.

http://docs.jquery.com/Plugins/validation
http://www.eclipse.org/eclipselink/moxy.php
http://jackson.codehaus.org/
http://xstream.codehaus.org/
http://opencv.willowgarage.com/wiki/
http://developer.android.com/resources/articles/painless-threading.html
http://valums.com/ajax-upload/
http://sqlite.phxsoftware.com/
http://qt.nokia.com/
http://www.oracle.com/technetwork/java/codeconv-138413.html
http://download.java.net/jdk8/docs/api/java/time/package-summary.html
http://docs.oracle.com/javase/1.4.2/docs/api/java/text/SimpleDateFormat.html
http://watin.sourceforge.net/
http://leandrovieira.com/projects/jquery/lightbox/
https://graph.facebook.com/
https://ccrma.stanford.edu/courses/422/projects/WaveFormat/
http://www.postsharp.org/
http://www.erichynds.com/jquery/jquery-ui-multiselect-widget/
http://ha.ckers.org/xss.html
http://jetty.codehaus.org/jetty/
http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
http://codespeak.net/lxml/
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
http://jquery.com/demo/thickbox/
http://book.git-scm.com/5_submodules.html
http://monotouch.net/
http://developer.android.com/resources/articles/timed-ui-updates.html
http://jquery.bassistance.de/validate/demo/
http://codeigniter.com/user_guide/database/active_record.html
http://www.phantomjs.org/
http://watin.org/
http://www.db4o.com/
http://qt.nokia.com/products/
http://referencesource.microsoft.com/netframework.aspx
https://github.com/facebook/php-sdk/
http://java.decompiler.free.fr/
http://pivotal.github.com/jasmine/
http://api.jquery.com/category/plugins/templates/
http://code.google.com/closure/library
http://www.w3schools.com/tags/ref_entities.asp
http://xstream.codehaus.org/tutorial.html
https://github.com/facebook/php-sdk
http://download.java.net/maven/1/jstl/jars/jstl-1.2.jar
https://developers.facebook.com/docs/offline-access-deprecation/
http://www.parashift.com/c++-faq-lite/pointers-to-members.html
https://developers.facebook.com/docs/mobile/ios/build/
http://downloads.php.net/pierre/
http://fluentnhibernate.org/
http://net.tutsplus.com/tutorials/javascript-ajax/5-ways-to-make-ajax-calls-with-jquery/
http://dev.iceburg.net/jquery/jqModal/

By post score¶

Count of posts by score (top 10) (Covers 94% of all broken links):

| Score | Percentage of Total Broken |
|-------|----------------------------|
| 0     | 36.4087%                   |
| 1     | 25.1674%                   |
| 2     | 13.4089%                   |
| 3     | 7.2806%                    |
| 4     | 4.2971%                    |
| 5     | 2.7065%                    |
| 6     | 1.8068%                    |
| 7     | 1.2854%                    |
| -1    | 1.1935%                    |
| 8     | 0.9415%                    |

By number of views¶

Note: This is the number of views at the time the data dump was created, not as of today.

Count of posts by number of views (top 10):

| Views        | Percentage of Total Broken |
|--------------|----------------------------|
| (0, 200]     | 24.4709%    |
| (200, 400]   | 14.2186%    |
| (400, 600]   | 9.5045%     |
| (600, 800]   | 6.9793%     | 
| (800, 1000]  | 5.2574%     |
| (1000, 1200] | 4.1864%     |
| (1200, 1400] | 3.3699%     |
| (1400, 1600] | 2.7766%     |
| (1600, 1800] | 2.3477%     |
| (1800, 2000] | 1.9550%     |

By days since post created¶

Note: This is the number of days since creation at the time the data dump was created, not from today.

Count of posts by days since creation (top 10) (Covers 64% of broken links):

| Days since Creation | Percentage of Total Broken |
|---------------------|----------------------------|
| (1110, 1140]        | 7.2938%                    |
| (1140, 1170]        | 6.7648%                    |
| (1470, 1500]        | 6.6579%                    |
| (1080, 1110]        | 6.6535%                    | 
| (750, 780]          | 6.5535%                    |
| (720, 750]          | 6.5516%                    |
| (1500, 1530]        | 6.3978%                    |
| (390, 420]          | 5.8508%                    |
| (360, 390]          | 5.8258%                    |
| (780, 810]          | 5.5175%                    |

By Ratio of Views:Days¶

Ratio Views:Days (top 20) (Covers 90% of broken links):

| Views:Days Ratio | Percentage of Total Broken |
|------------------|-------------|
| (0, 0.25]        | 27.2369%    |
| (0.25, 0.5]      | 18.8496%    |
| (0.5, 0.75]      | 11.4321%    |
| (0.75, 1]        | 7.2481%     | 
| (1, 1.25]        | 5.1668%     |
| (1.25, 1.5]      | 3.7907%     |
| (1.5, 1.75]      | 2.9310%     |
| (1.75, 2]        | 2.4033%     |
| (2, 2.25]        | 1.9788%     |
| (2.25, 2.5]      | 1.6850%     |
| (2.5, 2.75]      | 1.4080%     |
| (2.75, 3]        | 1.1879%     |
| (3, 3.25]        | 1.0654%     |
| (3.25, 3.5]      | 0.9391%     |
| (3.5, 3.75]      | 0.8334%     |
| (3.75, 4]        | 0.7165%     |
| (4, 4.25]        | 0.6634%     |
| (4.25, 4.5]      | 0.5789%     |
| (4.5, 4.75]      | 0.5508%     |
| (4.75, 5]        | 0.4833%     |
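
For anyone who wants to reproduce tables like the ones above, the binning can be sketched with the standard library alone. The function names are hypothetical and the tooling I actually used may differ; the sketch only shows how values fall into half-open `(lo, hi]` intervals and how each bin's share is computed.

```python
import math
from collections import Counter


def bin_label(value, width):
    """Label the half-open interval (lo, hi] of the given width holding value."""
    hi = math.ceil(value / width) * width
    lo = hi - width
    return f"({lo:g}, {hi:g}]"


def share_by_bin(values, width):
    """Return {interval label: percentage of values}, largest bin first.

    Values of zero or below fall outside (0, width] and are skipped,
    which is an assumption of this sketch.
    """
    counts = Counter(bin_label(v, width) for v in values if v > 0)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.most_common()}
```

With `width=0.25` this gives the Views:Days style of bins, and with `width=200` the view-count style.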

Discussion¶

What can we do with all of this? How do we, as a community, solve the issue of 10% of our outbound links pointing to places on the internet that no longer exist? Assuming that my sample was indicative of the entire data dump, there are close to 600K (150K broken unique links x 4, because I took 1/4 of the data dump as a sample) broken links posted in questions and answers on Stack Overflow. I assume a large number of links posted in comments would be broken as well, but that's an activity for another month.

We encourage posters to provide snippets from their links just in case a link dies. That definitely helps, but the resources behind the links and the (presumably) expanded explanation behind the links are still gone. How can we properly deal with this?

It looks like there have been a few previous discussions:

Utilize the Wayback API to automatically fix broken links. Development appeared to stall on this due to the large number of edits the Community user would be making. It would also keep posts that depend on the replaced link from being surfaced for the community to fix.
Link review queue. It was in alpha, but disappeared in early 2014.
Badge proposal for fixing broken links
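
To make the first idea concrete, here is a rough sketch of asking the Internet Archive whether it has a copy of a dead link. It assumes the public Wayback availability endpoint and its JSON response shape as I understand them; the helper names are hypothetical, and the network call itself is left out so the parsing can be shown on its own.

```python
import json
from urllib.parse import urlencode


def availability_url(link, timestamp=None):
    """Build the availability query for a link.

    `timestamp` (YYYYMMDD) is optional and asks for the snapshot
    closest to that date.
    """
    params = {"url": link}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived URL from an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None  # no usable snapshot: leave the post for human review
```

When no snapshot exists, the link can't be fixed automatically, which is the case where a review queue would still be needed.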

Footnotes¶

¹ This is how it ultimately played out. Originally I sent HEAD requests, in an effort to save bandwidth. This turned out to waste a lot of time because many sites around the internet return a 405 Method Not Allowed when sent a HEAD request. The next step was to send GET requests, but with the default Python requests user agent. A lot of sites were returning 401 or 404 responses to this user agent.
² Links to Stack Exchange sites were not counted in the above results. The failures seen are almost 100% due to a question/answer/comment being deleted. The process ran as an anonymous user, so it didn't have any reputation and was served a 404. A user with appropriate permissions can still visit the link. I verified a number of 404'd links to Stack Overflow posts and this was the case.
³ The 4th most common failure was to localhost. The 16th and 17th most common were localhost on ports other than 80. I removed these from the result table with the knowledge that these shouldn't be accessible from the internet.
⁴ There were 7 total URLs that returned status codes in the 600 and 700 range. One such site was code.org with a status code of 752. Sadly, this is not even defined in the joke RFC.

Follow up¶

I posted a proposal on how I think this could be fixed.