<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Andrew Wegner | Ponderings of an Andy - Stack Exchange</title><link href="https://andrewwegner.com/" rel="alternate"/><link href="https://andrewwegner.com/feeds/tag/stack-exchange.atom.xml" rel="self"/><id>https://andrewwegner.com/</id><updated>2023-11-27T15:45:00-06:00</updated><subtitle>Can that be automated?</subtitle><entry><title>A Decade of Fighting Spam</title><link href="https://andrewwegner.com/decade-fighting-spam-charcoal.html" rel="alternate"/><published>2023-11-27T15:45:00-06:00</published><updated>2023-11-27T15:45:00-06:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-11-27:/decade-fighting-spam-charcoal.html</id><summary type="html">&lt;p&gt;A run down of what a decade of spam fighting looks like on the Stack Exchange network.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Charcoal is nearing a decade of existence. In January of 2024, the Stack Exchange community will have been fighting the good fight of keeping spam off the platform for ten years. I've written about a &lt;a href="https://andrewwegner.com/can-a-machine-be-taught-to-flag-spam-automatically.html"&gt;machine being able to flag spam&lt;/a&gt; in the past. I've also posted &lt;a href="https://meta.stackexchange.com/q/291301/186281"&gt;the original&lt;/a&gt; and its follow-up on being able to &lt;a href="https://meta.stackexchange.com/q/307585/186281"&gt;spam flag even better&lt;/a&gt; on Stack Exchange itself.&lt;/p&gt;
&lt;p&gt;Recently, I was asked to talk a bit about a hobby of mine. I put together this presentation. &lt;/p&gt;
&lt;p&gt;&lt;img alt="A Decade of Fighting Spam by the Stack Exchange community" src="https://andrewwegner.com/images/charcoal-10years/slide1.png"/&gt;&lt;/p&gt;
&lt;h2 id="what-is-stack-exchange"&gt;What is Stack Exchange?&lt;a class="headerlink" href="#what-is-stack-exchange" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;To set a bit of context for those who need it: Stack Exchange is a question and answer network of over 180 sites covering almost any topic you can think of. The slide here shows just a handful of the sites with the more interesting logos - but you can see they cover a range of topics from &lt;a href="https://workplace.stackexchange.com/"&gt;professional work place questions&lt;/a&gt;, to the &lt;a href="https://english.stackexchange.com/"&gt;intricacies of the English Language&lt;/a&gt;, to &lt;a href="https://datascience.stackexchange.com/"&gt;Data Science&lt;/a&gt; and &lt;a href="https://gaming.stackexchange.com/"&gt;gaming&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="A handful of Stack Exchange site logos - Workplace, English Language, Data Science and Arquade included" src="https://andrewwegner.com/images/charcoal-10years/slide2.png"/&gt;&lt;/p&gt;
&lt;p&gt;But by far the largest and most popular is &lt;a href="https://stackoverflow.com/"&gt;Stack Overflow&lt;/a&gt;, with 24 million questions covering any programming language or framework you have used. It consistently ranks in the top 500 most visited sites on the internet - depending on what service is doing the measuring. Basically, it gets a lot of eyeballs looking at it daily.&lt;/p&gt;
&lt;p&gt;Which means it's a target for spam.&lt;/p&gt;
&lt;p&gt;&lt;img alt="The largest Stack Exchange site is Stack Overflow" src="https://andrewwegner.com/images/charcoal-10years/slide3.png"/&gt;&lt;/p&gt;
&lt;h2 id="what-is-spam"&gt;What is spam?&lt;a class="headerlink" href="#what-is-spam" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The network and the community within it settled on a fairly standard definition of spam: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A post exists only to promote a product or service and doesn't disclose author's affiliation. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The images here show what the site looks like when the community systems aren't operational. This is the front page of two sites and if you look closely at the time stamps, you'll see that these posts occurred within about 10 minutes.&lt;/p&gt;
&lt;p&gt;If users - new or experienced - come to the site and see this, they start to turn away. &lt;/p&gt;
&lt;p&gt;Back in 2013/2014 this was common. Spam posts would stick around for hours and a group of users decided they could help out across the network by flagging these posts more quickly.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Two examples of spam filled front pages on Stack Exchange sites" src="https://andrewwegner.com/images/charcoal-10years/slide4.png"/&gt;&lt;/p&gt;
&lt;h2 id="what-is-flagging"&gt;What is flagging?&lt;a class="headerlink" href="#what-is-flagging" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The final bit of context that is needed is flagging. It's exactly what you'd think it is. The goal of a flag is to bring attention to the post by forcing it into the community review queues. This gets more people to look at it. Stack Exchange is built around community moderation. There is very little that elected "Diamond Moderators" need to handle that the community can't handle.&lt;/p&gt;
&lt;p&gt;If enough people flag a post as spam, it's automatically deleted. The community and company decided that getting 6 people to agree a post is spam is an appropriate number.&lt;/p&gt;
&lt;p&gt;Once a post is removed as spam, the post is locked, deleted and the author has 100 reputation points removed. These are the visible actions. The reputation hit is to prevent - or slow - a spammer from getting more privileges within the network. &lt;/p&gt;
&lt;p&gt;Behind the scenes, a spam post also triggers company checks against future posts matching similar information to the user. These aren't publicly disclosed. But, the company is fairly conservative in terms of blocking users. &lt;/p&gt;
&lt;p&gt;&lt;img alt="Flagging brings attention to posts for others in the community to see, act upon and eventually remove spam" src="https://andrewwegner.com/images/charcoal-10years/slide5.png"/&gt;&lt;/p&gt;
&lt;h2 id="what-is-charcoal"&gt;What is Charcoal?&lt;a class="headerlink" href="#what-is-charcoal" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The community hates spam. It's a bad user experience at best to have a page filled with spammy posts. It also makes the site, and community at large, look rather seedy. This isn't great for a community and a company that has built its reputation on accuracy and trust.&lt;/p&gt;
&lt;p&gt;Charcoal was created to watch for spam across the 180+ sites. Actually, when we started it was less than half of that, but over the past decade the network has grown and the anti-spam systems have grown with it.&lt;/p&gt;
&lt;p&gt;The community has a two-phase process for dealing with spam. First is alerting the spam fighting community of potential spam. Users can go cast their flags across the network and deal with it. Second, for the truly egregious spam, the system can utilize the community's flags and automatically cast those flags.&lt;/p&gt;
&lt;p&gt;How's this work?&lt;/p&gt;
&lt;p&gt;&lt;img alt="Charcoal is the community organization that runs two systems to deal with spam. The first alerts users about potential spam and the second casts flags against detected spam." src="https://andrewwegner.com/images/charcoal-10years/slide6.png"/&gt;&lt;/p&gt;
&lt;h2 id="smokedetector-and-metasmoke"&gt;SmokeDetector and Metasmoke&lt;a class="headerlink" href="#smokedetector-and-metasmoke" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;There are two systems behind this community effort - SmokeDetector and Metasmoke.&lt;/p&gt;
&lt;p&gt;SmokeDetector, affectionately named "Smokey", is designed to be the early warning system. It quickly provides a yes/no decision on whether a post is spam and alerts users for manual action. It passes off the more intense confidence checks and automatic flags to Metasmoke.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Two systems create the anti-spam system. SmokeDetector for a quick spam/not-spam decision and Metasmoke to handle confidence checks and automatic flags" src="https://andrewwegner.com/images/charcoal-10years/slide7.png"/&gt;&lt;/p&gt;
&lt;p&gt;Every post on the network goes through the process below. A user clicks the submit button and Stack Exchange does their few checks - remember these are black boxed - and if the post makes it through these it gets published to a real time web socket.&lt;/p&gt;
&lt;p&gt;SmokeDetector does a quick "Is this spam?" check. If it is, it's posted to chat rooms around the network - to the network wide Charcoal room and usually to a site specific room if the room is utilized enough. Users then go and investigate and if they agree that it's spam, cast a flag. After 6 of these are cast, the post is removed. Hooray! Another victory against spam.&lt;/p&gt;
&lt;p&gt;When Smokey posts that spam is found, it also sends a message to Metasmoke. This system checks how confident we are that this is spam. If there is high confidence, it will start utilizing community members' flagging privileges to cast spam flags on the post as well. If there isn't high confidence, no automatic flags will be cast. &lt;/p&gt;
&lt;p&gt;The goal is to remove the spammy posts as quickly as possible - and by utilizing automatic flags the number of people that have to go do this manually is reduced. Due to larger community and company discussions and outcomes, the system will not cast all 6 flags except in very very rare circumstances. Someone has to agree with the machines here.&lt;/p&gt;
&lt;p&gt;&lt;img alt="A process flow diagram of the Charcoal systems" src="https://andrewwegner.com/images/charcoal-10years/slide8.png"/&gt;&lt;/p&gt;
&lt;h2 id="how-to-detect-spam"&gt;How to detect spam&lt;a class="headerlink" href="#how-to-detect-spam" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;What's spam detection look like? Over the last decade we've tried things like classification schemes, machine learning algorithms, and a handful of AI attempts. But, by far the most reliable has been...&lt;/p&gt;
&lt;p&gt;Regular expressions.&lt;/p&gt;
&lt;p&gt;(Take a deep breath fellow engineers)&lt;/p&gt;
&lt;p&gt;Each post goes through thousands of regular expressions. Each expression is weighted based on how likely it is that matching that particular expression means the post is spam. Posts with higher weights are posted into the chatroom, kicking off this entire process.&lt;/p&gt;
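&lt;p&gt;To make the weighting idea concrete, here is a minimal sketch in Python. It is not SmokeDetector's actual code: the patterns, the weights, the rule of summing the weights of every matching expression, and the names &lt;code&gt;RULES&lt;/code&gt;, &lt;code&gt;score_post&lt;/code&gt; and &lt;code&gt;looks_like_spam&lt;/code&gt; are all assumptions made for illustration.&lt;/p&gt;

```python
import re

# Hypothetical rules for illustration only: (compiled pattern, weight).
# These are not SmokeDetector's real patterns or weights.
RULES = [
    (re.compile(r"buy\s+cheap", re.IGNORECASE), 150),
    (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), 100),
    (re.compile(r"best\s+price", re.IGNORECASE), 60),
]

# Assumed cutoff, loosely based on the rough 225-250 range mentioned in the post.
SPAM_THRESHOLD = 225


def score_post(text):
    """Return (total weight, list of matched pattern strings) for a post.

    Assumption: a post's weight is the sum of the weights of every rule
    that matches it.
    """
    matched = [(rule.pattern, weight) for rule, weight in RULES if rule.search(text)]
    total = sum(weight for _, weight in matched)
    return total, [pattern for pattern, _ in matched]


def looks_like_spam(text):
    """True when the post's total weight reaches the assumed threshold."""
    total, _ = score_post(text)
    return total >= SPAM_THRESHOLD
```

&lt;p&gt;In this sketch, calling &lt;code&gt;score_post&lt;/code&gt; on a post returns both the weight and the reasons that matched, which mirrors how a detection is reported to chat with its reasons attached.&lt;/p&gt;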
&lt;p&gt;&lt;img alt="Regular Expressions run the world - or at least anti-spam systems on Stack Exchange" src="https://andrewwegner.com/images/charcoal-10years/slide9.png"/&gt;&lt;/p&gt;
&lt;p&gt;The community has built watchlists and blacklists over the decade to help find these posts. &lt;/p&gt;
&lt;p&gt;Watchlists are experimental checks. Spam evolves over time. It's actually pretty interesting to watch a dedicated spammer craft their posts to get them to last on the network more than a few minutes. These watchlists are designed to allow the team to test regular expressions without fear of automatically flagging something during testing.&lt;/p&gt;
&lt;p&gt;Blacklists are finalized regular expressions that catch spam with a high number of true positives and very few false positives. These are the weighted spam checks.&lt;/p&gt;
&lt;p&gt;Like Stack Exchange itself, the spam fighting community has built tooling that allows work to be done without needing a high-level user to be around. Users can watch for a new regular expression.&lt;/p&gt;
&lt;p&gt;Users that aren't trusted just yet will have their request created as a pull request in GitHub that needs to be approved. Trusted users will get their watchlist entry automatically added to the system. The same holds true for blacklisted items.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Watchlists - Experimental regex detections and Blacklists - Tried and tested regex detection are added by users to fight spam" src="https://andrewwegner.com/images/charcoal-10years/slide10.png"/&gt;&lt;/p&gt;
&lt;p&gt;But, watchlists and blacklists are only half the problem. The other half is validating that these are accurate. As posts are detected as spam, users provide a signal back to the system on whether a post is a &lt;code&gt;tp&lt;/code&gt; - true positive - spam, or an &lt;code&gt;fp&lt;/code&gt; - false positive - not spam. This feedback flows back to the watchlists and prevents inaccurate watchlist entries from being elevated to a full blacklist.&lt;/p&gt;
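&lt;p&gt;As a rough illustration of how &lt;code&gt;tp&lt;/code&gt; and &lt;code&gt;fp&lt;/code&gt; feedback could gate a watchlist entry, here is a small Python sketch. The minimum report count, the accuracy bar, and the function names are hypothetical. The real promotion decision is made by people reviewing the data, not by a formula like this.&lt;/p&gt;

```python
from collections import Counter

# Assumed values for illustration; not Charcoal's actual criteria.
MIN_REPORTS = 20       # minimum amount of feedback before judging a pattern
MIN_PRECISION = 0.9    # share of feedback that must be "tp"


def precision(feedback):
    """Share of tp among all tp/fp feedback; 0.0 when there is none."""
    counts = Counter(feedback)
    total = counts["tp"] + counts["fp"]
    if total == 0:
        return 0.0
    return counts["tp"] / total


def ready_for_blacklist(feedback):
    """True when a watched pattern has enough feedback and is accurate enough."""
    counts = Counter(feedback)
    total = counts["tp"] + counts["fp"]
    return total >= MIN_REPORTS and precision(feedback) >= MIN_PRECISION
```

&lt;p&gt;Here &lt;code&gt;feedback&lt;/code&gt; is simply a list of &lt;code&gt;"tp"&lt;/code&gt; and &lt;code&gt;"fp"&lt;/code&gt; strings collected for one pattern. A pattern with many false positives never clears the bar and stays on the watchlist.&lt;/p&gt;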
&lt;p&gt;Sometimes, a post has features that the system doesn't detect as spam. In those cases, the community can manually report the post. This triggers the alerts throughout the chatrooms so that others can flag it and get it removed. It also allows the community to find potential patterns to watch for in the future.&lt;/p&gt;
&lt;p&gt;Users that aren't trusted yet get pull requests created for their patterns. All of this can be handled and approved within the chatrooms. A lot of this system is built on top of, and keeps most users within, the Stack Exchange ecosystem.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Community Feedback on the detection reasons is critical to ensuring the system has reliable capabilities" src="https://andrewwegner.com/images/charcoal-10years/slide11.png"/&gt;&lt;/p&gt;
&lt;p&gt;I mentioned the weighted reasons on a detected post. When a detection is posted in chat, the reasons and the weight of the post are posted along with it. The one on the slide below is particularly bad. Generally anything over about 225-250 is spam, with higher numbers becoming more and more certain. &lt;/p&gt;
&lt;p&gt;These weights shift over time and as a regular expression is utilized more. This keeps the system flexible.&lt;/p&gt;
&lt;p&gt;For this particular post, the system determined it was spam and cast three automatic flags from our users. Each user that grants permissions for the system to utilize their flags - because they are responsible for the usage of the flags - can set their threshold for when to allow their name to be used. &lt;/p&gt;
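&lt;p&gt;The per-user threshold idea can be sketched as follows. This is an illustrative Python example, not Metasmoke's implementation: the &lt;code&gt;Flagger&lt;/code&gt; class, the &lt;code&gt;pick_autoflaggers&lt;/code&gt; function, and the choice to take eligible users in list order are assumptions. The cap of three matches the three automatic flags cast on the example post.&lt;/p&gt;

```python
from dataclasses import dataclass

# Cap on automatic flags; three matches the example post described above.
MAX_AUTOFLAGS = 3


@dataclass
class Flagger:
    """A hypothetical user who lets the system cast flags in their name."""
    name: str
    threshold: int  # minimum post weight at which their flag may be used


def pick_autoflaggers(post_weight, flaggers, max_flags=MAX_AUTOFLAGS):
    """Return the names of users whose flags would be cast on this post.

    Assumption: eligible users are taken in list order until the cap is hit.
    """
    eligible = [f for f in flaggers if post_weight >= f.threshold]
    return [f.name for f in eligible[:max_flags]]
```

&lt;p&gt;Because each user sets their own threshold, a borderline post may get fewer than three automatic flags, or none at all, while an obviously bad one reaches the cap immediately.&lt;/p&gt;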
&lt;p&gt;The 4th flag here came in via a user script the community built, but was not automatically cast. The remaining two flags would have come from the users of the site or from someone that saw the SmokeDetector alert and manually flagged it. Metasmoke doesn't have a record of those because they didn't go through Metasmoke.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Automatic flags are cast when a post exceeds the threshold set by our users. The system can cast multiple flags from multiple users." src="https://andrewwegner.com/images/charcoal-10years/slide12.png"/&gt;&lt;/p&gt;
&lt;h2 id="by-the-numbers"&gt;By the numbers&lt;a class="headerlink" href="#by-the-numbers" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Let's look at some numbers.&lt;/p&gt;
&lt;p&gt;SmokeDetector has been running since January of 2014. We didn't start recording stats until about 18 months later though, so the dates in the graph start in August 2015. Initially, the system didn't have watchlists, which is why the blue and orange lines are pretty close together.&lt;/p&gt;
&lt;p&gt;Around mid 2018/early 2019 we introduced watchlists. This was done because we started seeing persistent spammers. These were spammers that noticed their posts were being deleted quickly and worked to find ways to change the message to stick around longer. &lt;/p&gt;
&lt;p&gt;The chatrooms are open and, based on some messages we have removed, it's obvious the room is watched by the bored spammers. Watchlist detections are true positives less often and, because we never separated the data between blacklists and watchlists, the lines began to separate.&lt;/p&gt;
&lt;p&gt;In early 2017, autoflagging was introduced. With autoflagging the system can reduce the time on site for nearly half of the true positives.&lt;/p&gt;
&lt;p&gt;You'll notice a major spike in the summer of 2022 and a dip in the summer of 2023. The spike was for a massive spam wave. This was the work of a spammer that had access to a lot of geographically distributed systems - which bypassed Stack Exchange's built in protections - and was a persistent spammer or team of spammers that watched the public chatrooms for changes the spam fighting community made to detect their posts. This went on for 2-3 weeks with thousands of posts being made, adjusted, and deleted. Ultimately, the spammer was blocked at the Stack Exchange level based on heuristics the Charcoal team presented.&lt;/p&gt;
&lt;p&gt;This past summer, in 2023, the dip you see is because Stack Exchange experienced a crisis of confidence from the community at large. &lt;a href="https://andrewwegner.com/category/stack-exchange-strike.html"&gt;Moderation work stopped for the months of June and July in protest of the company's policies&lt;/a&gt; toward generative AI on the platform. Charcoal participated in that. While not fully resolved, some of the worst policies were reworked with input from the larger community and work resumed. &lt;/p&gt;
&lt;p&gt;&lt;img alt="A graph showing total detections, true positives, and automatic flags over time." src="https://andrewwegner.com/images/charcoal-10years/slide13.png"/&gt;&lt;/p&gt;
&lt;p&gt;The goal of the Charcoal project is to remove spam quickly from the site. Flags that are cast by the system are tracked and we can clearly see that more automatic flags mean the post is active for less time.&lt;/p&gt;
&lt;p&gt;When there are no system-cast flags, an average spam post lives for 21 minutes on the site. If the system casts all 6 - which is only utilized during a spam wave like in the summer of 2022 and with company permission - a post lives for 16 seconds. During day to day operations, the system is configured to cast 3 automatic flags. This was determined by a lot of conversations with individual sites around the network and what they felt comfortable with.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Timing of how long a post is visible based on the number of automatic flags. At the default 3 flags, a post will be visible for less than 5 minutes" src="https://andrewwegner.com/images/charcoal-10years/slide14.png"/&gt;&lt;/p&gt;
&lt;p&gt;SmokeDetector has over 103 thousand commits to its repository over the last 10 years with 90 different code contributors. In the slide below, the top two graphs show that its rulesets are updated daily - except for this past summer.&lt;/p&gt;
&lt;p&gt;Over the course of a 24 hour period, flags are automatically cast from nearly 420 different users around the network.&lt;/p&gt;
&lt;p&gt;Finally, the entire goal: over 450,000 spam posts have been identified and deleted by the system and the community in the last decade.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Daily ruleset updates, 100k+ code commits, 90 contributors, 420 users with automatic flags daily all results in over 450,000 spam posts removed in a decade" src="https://andrewwegner.com/images/charcoal-10years/slide15.png"/&gt;&lt;/p&gt;
&lt;h2 id="conclusion"&gt;Conclusion&lt;a class="headerlink" href="#conclusion" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;You now have an idea of how one of the largest sites on the internet handles spam. I do want to point out that Stack Exchange operates very differently from sites like reddit or YouTube or Facebook, which have spent a lot of company time building their anti-spam systems. Stack Exchange built basic protections themselves and then saw the technical community members step up and take on the challenge. &lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike - The strike is over</title><link href="https://andrewwegner.com/stackexchange-moderator-ends.html" rel="alternate"/><published>2023-08-07T12:00:00-05:00</published><updated>2023-08-07T12:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-08-07:/stackexchange-moderator-ends.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are done striking. What did two months of inaction on the part of curators achieve?&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://meta.stackexchange.com/q/392032/186281"&gt;The Stack Exchange strike is over&lt;/a&gt;. It took two months and two days of inaction on the part of community members and moderators to resolve
the strike. The &lt;a href="https://meta.stackexchange.com/q/391847/186281"&gt;results of the negotiations&lt;/a&gt; were posted last week. Over the weekend the community voted that the goals of the strike had
been achieved and called the end of the strike.&lt;/p&gt;
&lt;p&gt;So, what was achieved? What did more than two months of community upheaval get the users of Stack Exchange? &lt;/p&gt;
&lt;h2 id="ai-generated-posts"&gt;AI Generated Posts&lt;a class="headerlink" href="#ai-generated-posts" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Effective immediately, Stack Exchange has agreed to allow the removal of content based on a combination of strong and weak indicators of GPT 
usage. Additionally, the original policy that was the straw that started all of this has been both &lt;a href="https://meta.stackexchange.com/q/391626/186281"&gt;publicly released&lt;/a&gt; and declared invalid.&lt;/p&gt;
&lt;p&gt;This release took two months and was the cause of the strike beginning. It drove a wedge between the moderators of Stack Exchange and the 
company and its employees. I do not understand why it took so long to release this. The public answer is that there was initial moderator 
pushback on releasing it but that was rescinded within days. Like much of the public communications during this, Stack Exchange clung to 
outdated information.&lt;/p&gt;
&lt;p&gt;Something that I think is amazing is that this private policy was never leaked in full during the strike. Despite what the company was saying about
moderators and the community, the community abided by its belief that agreements - private moderator space - should remain just that: private. I 
applaud the moderation community for this commitment to their ideals.&lt;/p&gt;
&lt;h2 id="data-dumps"&gt;Data Dumps&lt;a class="headerlink" href="#data-dumps" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The Data Dumps were turned off back in March. They were re-enabled several weeks into the strike as one of the first concessions. However, it's 
important to note that a &lt;a href="https://meta.stackexchange.com/a/391640/186281"&gt;former employee has presented that this was not an unwavering commitment&lt;/a&gt;. They had been contacted by the Stack
Exchange CEO in March to disable the scheduled data dump.&lt;/p&gt;
&lt;p&gt;The company officially states that it's committed to the long-term (foreseeable future) survival of the data dumps, the API, and SEDE [Stack Exchange Data Explorer]. &lt;/p&gt;
&lt;p&gt;This is good. It is concerning, though, that this action was taken without informing the community and was only discovered after the fact. This type of 
"ask for forgiveness" behavior from Stack Exchange is common and is a concern for me. &lt;/p&gt;
&lt;h2 id="moderation-agreement-changes"&gt;Moderation Agreement Changes&lt;a class="headerlink" href="#moderation-agreement-changes" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Stack Exchange agreed to a review period for binding policy changes and policies must be made public. This is another big one that caused the strike
to move forward. I'm happy to see this has been resolved in a way that benefits the community and transparency. Time will tell if this holds true, as 
it's only something that we can say is effective until it is no longer effective.&lt;/p&gt;
&lt;p&gt;Assuming that the agreement holds though, a review period will be good for the moderators that are expected to enforce changes. It will give them
time to get clarifications and be prepared for the discussions on meta. &lt;/p&gt;
&lt;p&gt;The company also agreed to update their press policy. New statements to the press must get at least one member of the community management team to 
sign off on the statement and statements must be as general as possible. This works in tandem with the existing policy put in place in 2019, where
statements won't discuss an individual moderator without written permission.&lt;/p&gt;
&lt;p&gt;This is another one that we can't judge the effectiveness of until it's been broken. Unfortunately, the press policy from 2019 wasn't enough to prevent
some statements to the press this time that presented moderators in a very unflattering light. Even though none were identified by name, the broad 
statements were taken out of context in a couple of instances.&lt;/p&gt;
&lt;p&gt;That said, there was an &lt;a href="https://meta.stackexchange.com/q/391990/186281"&gt;apology&lt;/a&gt; for that from the Vice President of Community:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I would also like to take this opportunity to extend my most sincere personal apologies to mods who felt that in our previous text we were accusing them of racism. While that was not the intent of the text that I wrote (nor did that sentiment reflect the feelings of anyone involved in drafting the text), I can understand how it could be read that way, and I regret that we allowed it to be published like that. You have my sincere apologies, which I will also deliver in person at the upcoming mod/staff meetup.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="violations-of-the-moderator-agreement"&gt;Violations of the Moderator Agreement&lt;a class="headerlink" href="#violations-of-the-moderator-agreement" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The current moderator agreement does not lay out a process for determining if the company violates the agreement. The strike representatives and the 
company agreed to an outline for such a process. If Stack Exchange is found to have violated the agreement, actions taken and comments made during the 
violation must be retracted and nullified and a public apology must be made detailing the violation.&lt;/p&gt;
&lt;p&gt;This is one that I hope we don't have to see utilized. Again, time will tell if it is something that will occur. &lt;/p&gt;
&lt;p&gt;There is also a process where the moderation team can vote on whether a violation occurred. Exact numbers are still being determined, but essentially a 
minimum number of moderators must vote on whether a violation was committed and, of those, a minimum number must vote that a violation occurred.&lt;/p&gt;
&lt;p&gt;While I'm unhappy with the percentages used as placeholders, the discussions here continue. I am happy with the process and the proposed actions if 
a violation is determined.&lt;/p&gt;
&lt;h2 id="stack-exchange-processes"&gt;Stack Exchange Processes&lt;a class="headerlink" href="#stack-exchange-processes" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Several internal changes were negotiated as well. Each of these was about how the company communicates. These include being transparent that the 
strike occurred, collaboration with the community instead of fighting it, public policies, and clear communication. &lt;/p&gt;
&lt;p&gt;All of these are positive changes and can only be evaluated over time. There are already signs that some are taking place with public releases of 
policies, acknowledgement of the agreement itself and discussions around the final details.&lt;/p&gt;
&lt;h2 id="what-do-i-think"&gt;What do I think?&lt;a class="headerlink" href="#what-do-i-think" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I've been pretty pessimistic over the last two months. Stack Exchange appeared to be following in reddit's footsteps in some cases. I was afraid that
the company would start replacing moderators in a few months - either when the automated systems started flagging inactivity or when the products 
announced at the &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-overflowai-reaction.html"&gt;developer conference last month&lt;/a&gt; started appearing on the site.&lt;/p&gt;
&lt;p&gt;I'm still pessimistic about the future of the platform, if I'm being honest with myself. I do not think generative AI will be a benefit to the community,
as it exists right now. By the time it's to a point where generative AI doesn't make things up, it'll be too late. The bad data and information will 
have already been on the site and trust will be gone.&lt;/p&gt;
&lt;p&gt;The policies that can't be measured until a violation occurs also have me concerned. We went through this cycle back in 2019 with a strike being 
very narrowly averted then. Unfortunately, the same things that occurred then triggered the issues now: a private policy and talking to the press. Both
were supposed to be resolved then. Both are supposed to be resolved now. Unfortunately, we won't know if that's the case until the policies cause another
problem.&lt;/p&gt;
&lt;p&gt;So...am I sticking around as a moderator? There is a virtual moderator meet up later this month with the CEO of Stack Exchange. I think that will be
when I make my final determination. However, two months of not moderating was a welcome break. &lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike - Community and AI Talk Reaction</title><link href="https://andrewwegner.com/stackexchange-moderator-strike-overflowai-reaction.html" rel="alternate"/><published>2023-07-27T12:00:00-05:00</published><updated>2023-07-27T12:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-07-27:/stackexchange-moderator-strike-overflowai-reaction.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. This is my reaction to Stack Exchange's products and updates announced at the WeAreDevelopers conference.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I really was trying not to post anything about the strike again until it was resolved. There was a long period of no action in July so my 
assumption was that it was being worked on behind the scenes with the company and the designated representatives. Turns out that it was just
quiet for a couple of weeks with no movement. One important development was the release (finally) of the &lt;a href="https://meta.stackexchange.com/q/391626/186281"&gt;private policy on GPT Generators&lt;/a&gt;. This
is significantly different from the one that was released publicly and was part of the cause of the current moderator strike. The community's 
response to this policy was as expected - surprise at how bad the policy was, even with moderators saying it was bad.&lt;/p&gt;
&lt;p&gt;Today, Prashanth Chandrasekar - CEO of Stack Overflow - presented the next vision for Stack Exchange. The presentation is available on 
&lt;a href="https://www.youtube.com/watch?v=g5F5t205pYA"&gt;YouTube&lt;/a&gt;. It starts around the 9 minute mark.&lt;/p&gt;
&lt;h2 id="reaction"&gt;Reaction&lt;a class="headerlink" href="#reaction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="welcoming"&gt;Welcoming&lt;a class="headerlink" href="#welcoming" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;You will continue to be the focus. 100%. We are here to serve you. To fight on your behalf. To make sure you are recognized for the work you are doing and that you are able to responsibly harness the power of AI for your needs to be the best developers you can be. All of this is centered around the collective community and knowledge. That is irreplaceable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a good start. It rings hollow to me right now though. These last two months in particular have shown that the community is easily tossed aside. 
2019, and its literal legal settlement with a community member for tossing them aside, shows that the current situation isn't a one-time thing. We'll see what the rest of this talk covers, but right now this feels like pandering to the audience.&lt;/p&gt;
&lt;h3 id="guiding-principles"&gt;Guiding Principles&lt;a class="headerlink" href="#guiding-principles" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;Find new ways to give technologists more time to create amazing things.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Good. I'm all about efficiency for myself and my teams. If we can make developers' and engineers' lives easier, that's a win for me.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Accuracy is fundamental. That comes from attributed, peer-reviewed sources that provide transparency.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Spoiler - I received a sneak peek of a portion of this presentation two days before the official one in Berlin. The information that will be shown 
later in the presentation will cover some of this. It doesn't address all of it though. I'm pleased with the attribution aspect of this. 
Stack Overflow content is &lt;a href="https://stackoverflow.com/help/licensing"&gt;licensed under CC BY-SA&lt;/a&gt;. Attribution is required and I approve of any effort to improve that. &lt;/p&gt;
&lt;p&gt;I'm concerned about accuracy. Their own test of the &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-does-stackexchange-care.html"&gt;GenAI powered formatting assistant&lt;/a&gt; last month showed this is a problem. 
Current GenAI tools - ChatGPT, Bard, etc. - are well known for making information up instead of saying "I don't know". &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The technology field should be accessible to all, including beginners to advanced users.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Weird ending phrase, but I agree. Technology fields are not limited to engineers, software developers and people with the desire to learn how to 
program. Technological improvements should come from anyone who wants to make life easier. AI can and will help with that.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Humans should always be included in the application of any new technology.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Again, agreed. My concern here is whether or not Stack Exchange has the capability to do that. Technically, they've had a human involved, but it's 
only been from a single perspective - that of the company. They have ignored the other side of this: the community. The people that provided the 
data for their business to thrive.&lt;/p&gt;
&lt;h3 id="overflow-ai"&gt;Overflow AI&lt;a class="headerlink" href="#overflow-ai" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;insert applause&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This has been in development for "the past three months". It also covers 6 items being talked about today and 6 additional items. That time frame is 
concerning. Three months is not a lot of time to build a lot of these larger items. &lt;/p&gt;
&lt;p&gt;Search: Summarized version of multiple answers with citations from where the answers came from. It also allows you to continue the conversation with the 
results - including code. It also allows the user to post the question to Stack Overflow if the generative AI portion gets stuck.&lt;/p&gt;
&lt;p&gt;My concerns lie with how this impacts the community. Setting aside whether the system is able to synthesize an answer and properly attribute it, what
happens next? Stack Overflow has a reputation - deserved or not is up to the reader - for being harsh on new users that don't "try". If a new user 
comes and gets an answer without posting the question that's wonderful for the user. They move on and complete what they need to complete. For the 
community, though, we have a problem. &lt;/p&gt;
&lt;p&gt;I find it unlikely that a new user is going to post a question to Stack Overflow when they received an answer. Why would they? They already got the 
answer they need. If this repeats constantly, the knowledge base that Stack Overflow represents becomes significantly less useful. Fewer questions
mean fewer answers. With fewer answers, the platform becomes more dependent on the trained AI model which is receiving less data to feed it. The system
spirals.&lt;/p&gt;
&lt;p&gt;Does this only matter to prevent duplicates though? If the system is only summarizing other topics, wouldn't that mean the question being asked is a 
duplicate? I don't think that's the case. I think with the "continue discussion" portion of this, Stack Overflow will be losing out on better written
and described problems. Problems with more detail that the community could benefit from seeing and answering.&lt;/p&gt;
&lt;p&gt;The Visual Studio plugin looks interesting. I don't think it's adding much that other plugins can't do already, other than providing a first party 
integration to Stack Overflow. The same is true for the Slack / Stack Overflow integration. &lt;/p&gt;
&lt;p&gt;The Enterprise knowledge ingestion is an interesting product. While we don't use Stack Overflow for Teams at my current company, this would be 
a great way to start using it. The ability to load initial data into a new system is important. I'd love to see this in practice and how well it does 
at building initial Q&amp;amp;A pairs from the data it ingests. &lt;/p&gt;
&lt;p&gt;I haven't looked into Stack Overflow for Teams for my professional work. A couple comments that were made do raise a few concerns about how Stack Exchange
is using company data. Before making a recommendation to utilize it for my company, I'd need to verify that data from my company is not made 
available outside of my organization.&lt;/p&gt;
&lt;p&gt;Finally, discussions. Stack Exchange is adding a forum to their Collectives product. The community has spent over a decade working to not be a forum. This change
is not great in my opinion. Forums and social media are a different beast than Stack Overflow. &lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final Thoughts&lt;a class="headerlink" href="#final-thoughts" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I am concerned about the direction Stack Exchange is taking and the employee resources they are using to go that route. Engagement, page views, and 
traffic have been a huge underlying concern at Stack Exchange for a while. It was particularly noticed when ChatGPT was released and with the comments 
about driving users away. However, this downward trend has been ongoing for years.&lt;/p&gt;
&lt;p&gt;I don't think GenAI is going to solve this problem. I think it will improve new user experience but, the way this has been presented, it will be to the
detriment of the larger community. Fewer questions will exist to answer and that will slowly cause users to disengage from the site. &lt;/p&gt;
&lt;p&gt;This is why the opening of the talk feels like lip service to the community to me. These changes are designed to put a barrier between users of the 
community. Even the discussions product has the barrier. Discussions will only be available in Collectives, not to the general community. &lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike - Now AI is bad? Does Stack Exchange know what it is doing?</title><link href="https://andrewwegner.com/stackexchange-moderator-strike-what-are-they-doing.html" rel="alternate"/><published>2023-07-04T23:30:00-05:00</published><updated>2023-07-04T23:30:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-07-04:/stackexchange-moderator-strike-what-are-they-doing.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. A recent blog post by Stack Exchange shows they don't trust GenAI either so why are they pushing it so hard?&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;My previous posts about the ongoing &lt;a href="https://openletter.mousetail.nl/"&gt;moderator and curator strike on the Stack Exchange network&lt;/a&gt; can be found linked at the bottom
of this post, or by visiting the &lt;a href="https://andrewwegner.com/category/stack-exchange-strike.html"&gt;Stack Exchange Strike&lt;/a&gt; category on this site. I'd post a summary about what's happened in the last 
ten days, but there is nothing to report. There are discussions, but no agreements. The appointed Stack Exchange employee empowered to 
talk with moderators stepped back and is not participating any longer. &lt;/p&gt;
&lt;p&gt;Tomorrow marks the one month point. We are hours away from 10,000 pending moderator flags on Stack Overflow. This is up from 78 (yes, 
two digits) in mid-May. The way this has gone down, the lack of progress, and the continued mischaracterization of moderators to the press
haven't motivated me to spend my free time volunteering lately, though. I still have this feeling that Stack Exchange is looking 
at the recent Reddit protests, with Reddit's demand that moderators return to the community, and wondering if they can replicate that here.&lt;/p&gt;
&lt;h2 id="new-confusion"&gt;New confusion&lt;a class="headerlink" href="#new-confusion" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;On July 3, 2023 &lt;a href="https://web.archive.org/web/20230703161433/https://stackoverflow.blog/2023/07/03/do-large-language-models-know-what-they-are-talking-about/"&gt;Stack Overflow published a blog post&lt;/a&gt; entitled: "Do large language models know what they are talking about?". Spoiler:
the conclusion of the article is "Nope."&lt;/p&gt;
&lt;p&gt;But that's not the interesting thing. The interesting thing is how this answer is presented. The very last paragraph of the post cuts to
the heart of the matter that &lt;a href="https://andrewwegner.com/stackoverflow-bans-chatgpt.html"&gt;moderators on Stack Overflow raised in December when we banned ChatGPT&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Treating AI-generated information as purely actionable might be the biggest danger of LLMs, especially as more and more web content gets generated by GPT and others: we’ll be awash in information that no one understands. The original knowledge will have been vacuumed up by deep learning models, processed into vectors, and spat out as statistically accurate answers.  We’re already in a golden age of misinformation as anyone can use their sites to publish anything that they please, true or otherwise, and none of it gets vetted. Imagine when the material doesn’t even have to pass through a human editor. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We saw this in action with ChatGPT. We still see it in action with ChatGPT and it's still a &lt;a href="https://meta.stackoverflow.com/a/425409/189134"&gt;problem users are becoming more aware of&lt;/a&gt; as the 
strike continues. We saw it when Stack Exchange tried their &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-does-stackexchange-care.html"&gt;formatting assistant&lt;/a&gt; on Stack Overflow. What I see here is Stack Overflow
admitting that the moderators are correct, in public. &lt;/p&gt;
&lt;p&gt;The other interesting thing about that paragraph is that it links to an &lt;a href="https://www.theverge.com/2023/6/26/23773914/ai-large-language-models-data-scraping-generation-remaking-web"&gt;article from The Verge&lt;/a&gt; that quotes the Stack Overflow moderators
and the decision to ban AI. It also has this dig at Stack Exchange executives:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The mods say AI output can’t be trusted, but execs say it’s worth the risk.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Their own post is explaining why it's not worth the risk. &lt;/p&gt;
&lt;h2 id="whats-this-mean"&gt;What's this mean?&lt;a class="headerlink" href="#whats-this-mean" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I see this as more communication failure on Stack Exchange's part. In an &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-week-one.html"&gt;update I posted weeks ago&lt;/a&gt;, I linked to internal emails that
were leaked.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;How are we messaging this? Who is allowed to post and respond to questions and comments on Meta, chat, social media, etc?&lt;/p&gt;
&lt;p&gt;The Community Leadership Team ([redacted]) are working together in close coordination with Marketing ([redacted]) on comms. They will post and respond to questions on-site. Unless you are specifically tapped to respond to something please do not engage. It is best to avoid commenting on anything related to this action on site, even if you think you have something helpful to add. Please get review and approval from Philippe prior to posting on site, or from [redacted] if you are approached off-site.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Someone, somewhere, didn't realize what this blog post was about or what it linked to. &lt;/p&gt;
&lt;p&gt;But, nothing changes with this. The company has dug in so hard on forcing GenAI to be on the sites and is marching toward an announcement
of some kind in late July 2023 about AI. In the meantime, I can only see blog posts like this one as an indication that Stack Exchange
doesn't know what they are attempting to build toward and at the same time have come to the conclusion (or at least a team within Stack 
Exchange has) that GenAI isn't to be trusted. &lt;/p&gt;
&lt;p&gt;Just like the community said back in December and continues to say now. &lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike - How does the company regain my trust?</title><link href="https://andrewwegner.com/stackexchange-moderator-strike-regaining-trust.html" rel="alternate"/><published>2023-06-23T23:00:00-05:00</published><updated>2023-06-23T23:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-06-23:/stackexchange-moderator-strike-regaining-trust.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. Today I talk about what it'll take to regain my trust. It's not an easy answer.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;My previous posts about the ongoing &lt;a href="https://openletter.mousetail.nl/"&gt;moderator and curator strike on the Stack Exchange network&lt;/a&gt; can be found linked at the bottom
of this post, or by visiting the &lt;a href="https://andrewwegner.com/category/stack-exchange-strike.html"&gt;Stack Exchange Strike&lt;/a&gt; category on this site. &lt;/p&gt;
&lt;p&gt;This post was &lt;a href="https://meta.stackexchange.com/a/390626/186281"&gt;originally posted on Meta Stack Exchange&lt;/a&gt;, the network-wide "Meta" for the Stack Exchange network. This meta site and the 
"child meta" sites mentioned in the post are utilized by community members to discuss the network itself. This is where questions about 
how the individual site or the network as a whole works are posted, where policies are determined, where questions about questions are discussed. 
This is the location that the community has to make their voice heard.&lt;/p&gt;
&lt;p&gt;This post is my answer to the question "What is needed for users to trust the Stack Exchange company?" It's been edited slightly to fit
the format of this blog.&lt;/p&gt;
&lt;h2 id="how-will-my-trust-be-regained"&gt;How will my trust be regained?&lt;a class="headerlink" href="#how-will-my-trust-be-regained" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="short-version"&gt;Short version&lt;a class="headerlink" href="#short-version" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;TL;DR: I'm not sure and that's a bad thing for me and for the community.&lt;/p&gt;
&lt;p&gt;Before I begin, I'm not going to segment the company into various groups. I've gotten the impression from moderator representatives that this is a bad thing and they are offended by this segmentation. I have no desire to further that, so "Stack Exchange" in this case refers to both the company as a whole and all employees.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Author note&lt;/em&gt;: This complaint was relayed by the moderator representatives from Stack Exchange during discussions. It seems that using phrases 
like "management" and "leadership" is being interpreted as "good staff" vs "bad staff" at Stack Exchange. While I disagree with this, to me it's 
not worth the argument thus it's just "Stack Exchange" for me from now on. &lt;/p&gt;
&lt;h3 id="who-am-i"&gt;Who am I?&lt;a class="headerlink" href="#who-am-i" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For context, I've been on the network for nearly 14 years. I'm a moderator on Stack Overflow, Hardware Recommendations and Community Building. I have built automated tooling to flag comments (at one point accounting for 15% of the comment flags raised on Stack Overflow in a year), and I am an admin on the community led Smoke Detector (spam detection) project. In short, I know this network, the tooling it does and does not have, and various communities across the network. My time here is voluntary. Time that I, until recently, was happy to provide without much of a thought. I've had very interesting discussions with fellow moderators and Stack Exchange employees throughout my time here. &lt;/p&gt;
&lt;h3 id="why-dont-i-trust-stack-exchange"&gt;Why don't I trust Stack Exchange?&lt;a class="headerlink" href="#why-dont-i-trust-stack-exchange" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;What is needed for users to trust the Stack Exchange company&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Stack Exchange has gone through this cycle before. I've written about it in those past cycles for anyone who wishes to go through my profile and find previous thoughts. Each time, less of my energy comes back as we - community and company - reconcile and bury the problem in the sand. &lt;/p&gt;
&lt;p&gt;The last major cycle ended with a lot of lawyer language, including the new &lt;a href="https://stackoverflow.com/legal/moderator-agreement"&gt;moderator agreement&lt;/a&gt; that every mod had to accept to retain their diamond. This cycle started with a violation of one of the provisions of that agreement by Stack Exchange. It received a, in my opinion, flippant "Oops, that was my fault" by the Vice President of Community at Stack Exchange. &lt;/p&gt;
&lt;p&gt;This tells me that the legal agreement is completely one sided and Stack Exchange feels comfortable violating it without repercussion. If I, on the other hand, had violated a term in the agreement I'd be forced to hand in my diamond. This has eroded a ton of trust I have with the company.&lt;/p&gt;
&lt;p&gt;In the announcement regarding how Generative AI can and should be moderated and in statements to the press, there has been disparagement against the moderators of the network. To me, the subtext of all of that reads as "we don't trust you to moderate correctly". If the company does not trust us to perform activities we've either been elected or appointed to do for our community, why are we still here? &lt;/p&gt;
&lt;p&gt;Combining this with the incredible way this cycle all started and the fact that none of this mistrust was known by the moderation team, my trust of the company took a hit. This policy was announced at the end of May. Data was shared several weeks ago. In all of that, there are allusions to improper moderator activity and hints that moderators are banning so many people that engagement across the platform is down. It wasn't until yesterday (nearly a month later) that moderators saw any discussion of these "improper bans". It was just...silent. This big, massive problem that could have been talked about back in February or March was just tossed into the public eye with the implication that moderators are doing the wrong thing. Then it took nearly a month for a conversation to begin. &lt;/p&gt;
&lt;p&gt;The Stack Exchange network has lost &lt;em&gt;at least&lt;/em&gt; four months of time where this "moderation problem" could have been discussed, policies adjusted, and the company educated on how generative AI is actually being detected by the moderators who deal with it on their sites on a daily basis. Instead, an easily disproved lie about using ChatGPT detectors has been blamed and shared repeatedly with the press as the reason for their sudden policy change.&lt;/p&gt;
&lt;p&gt;My trust level of the company takes several hits here too. I dislike being lied to and I really dislike being lied about. &lt;/p&gt;
&lt;p&gt;Finally, the method of communication throughout the last month. The company has a team dedicated to managing the community. There have been many questions on this site and on child metas during the moderator strike. I have seen very little coming from the community management team to answer these questions. The community has questions and the company is not providing answers to them. Instead, we see announcements on topics that the community is &lt;em&gt;against&lt;/em&gt; being announced. Long discussions, in public, are not occurring though. Which erodes my trust even further.&lt;/p&gt;
&lt;h3 id="whats-this-all-mean"&gt;What's this all mean?&lt;a class="headerlink" href="#whats-this-all-mean" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Where am I today then? How does the company rebuild my trust in them? My answer to that is that I don't know. This past month has eroded so much of my faith in the company to be the trusted repository of knowledge that it was in the past. It's also removed much faith that the company actually cares about the community. Much like the previous cycle we've seen details come out that reflect poorly on the company and employees attempt to respond to that only for &lt;em&gt;more&lt;/em&gt; details to come out that make the response look like lies.&lt;/p&gt;
&lt;p&gt;14 years is a lot of time to spend some place and not have strong feelings about it. It makes simply accepting negative changes impossible and it makes walking away difficult. That's part of why I'm still here. The other part is the communities I mentioned in my introduction. I have made friends and acquaintances across the network and the sense of community that used to exist is a strong reason to remain. But, this isn't something that will hold the community as a whole together. I am an outlier in terms of a user on the network. Honestly, everyone reading this on Meta is an outlier. &lt;/p&gt;
&lt;p&gt;The company's goal is engagement and traffic. I guarantee the moderation team has not banned enough users to bring down the traffic the network has seen since December. But, we are the scapegoat at least right now. We're nearing a month and there are users with access to site analytics. The number of bans has been close to 0. Theoretically, traffic should be recovering if we were the problem.&lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike - Personal Frustrations</title><link href="https://andrewwegner.com/stackexchange-moderator-strike-personal-frustrations.html" rel="alternate"/><published>2023-06-21T11:00:00-05:00</published><updated>2023-06-21T11:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-06-21:/stackexchange-moderator-strike-personal-frustrations.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. The strike continues and my frustration grows.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="recap"&gt;Recap&lt;a class="headerlink" href="#recap" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://andrewwegner.com/stackexchange-moderator-strike.html"&gt;The Moderation and curator strike started on June 5.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Stack Exchange has downplayed the effect of this to the press, while at the same time straight lied to the press about &lt;a href="https://openletter.mousetail.nl/"&gt;causes of the strike&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Stack Exchange &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-week-one.html"&gt;removed access to the data dump back&lt;/a&gt; in March but never told anyone until they were called out on it in early June.&lt;/li&gt;
&lt;li&gt;The &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-update.html"&gt;moderation team has elected three representatives&lt;/a&gt; to engage with Stack Exchange to solve these problems and end the strike.&lt;/li&gt;
&lt;li&gt;Stack Exchange launches the &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-does-stackexchange-care.html"&gt;GenAI powered formatting assistant&lt;/a&gt;. The community quickly shows that it's very bad at its job and it is shut down temporarily.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="whats-new"&gt;What's new?&lt;a class="headerlink" href="#whats-new" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Yesterday Stack Exchange announced the upcoming launch of a new &lt;a href="https://meta.stackexchange.com/q/390463/186281"&gt;Prompt Design site&lt;/a&gt;. To say the community disliked the idea
would be an understatement. The community has pointed out that this will be incredibly niche, with very short term answers because
models update constantly. There are also concerns that this will essentially become a "write my prompt" (instead of "write my code")
site.&lt;/p&gt;
&lt;p&gt;I agree with all of those concerns.&lt;/p&gt;
&lt;p&gt;Also hidden in this announcement is another announcement. Stack Exchange is changing their method of launching new sites. This is
something that should have its own announcement. This is a big deal and the community has been asking for improvements to the 
Area 51 incubator site for a decade. Unfortunately, the new method is to completely do away with all of the work that Area 51
does - building a community, setting initial questions, helping to set scope, ensuring there is an audience for the topic - and
instead launching with a "Community Stakeholders" group that will do the work in either a private Stack Overflow for Teams instance
or a read only chatroom. &lt;/p&gt;
&lt;p&gt;Both of those options entirely exclude people that may want to participate but don't have access. It adds barriers that don't exist 
on Area 51. &lt;/p&gt;
&lt;h2 id="my-thoughts"&gt;My thoughts&lt;a class="headerlink" href="#my-thoughts" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The announcement sets a launch date of July 26 for this new site. This is the day before the CEO talks at 
WeAreDevelopers World Congress. A venue where the company has been promising a major AI announcement for months. This new site,
combined with the formatting assistant failure from last week, is starting to show clearly what the company wants to do here.&lt;/p&gt;
&lt;p&gt;I mentioned in my &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-does-stackexchange-care.html"&gt;last post&lt;/a&gt; that I was becoming more pessimistic that this strike doesn't end with resignations.
That continues to hold true. I also mentioned that Stack Exchange seems to be keeping an eye on the Reddit strike - 
first with the blackouts, then with the John Oliver protests and currently with the NSFW toggles to prevent ads. Last night, 
&lt;a href="https://www.theverge.com/2023/6/20/23767848/reddit-blackout-api-protest-moderators-suspended-nsfw"&gt;Reddit started removing moderators&lt;/a&gt; as a result. &lt;/p&gt;
&lt;p&gt;With the lack of updates and communication from Stack Exchange to the moderators and curators in the last week, I can't help but 
think that something similar is being discussed at Stack Exchange. Time will tell, but my feeling is that Stack Exchange is going
to plow ahead with GenAI content on their platform. They are going to burn 15 years of trust and quality content and they are going
to do it regardless of what the community wants. If the community protests, they will be shown the door.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-regaining-trust.html"&gt;Obviously, I'm still pretty pessimistic about all of this.&lt;/a&gt;&lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike - Does Stack Exchange Care?</title><link href="https://andrewwegner.com/stackexchange-moderator-strike-does-stackexchange-care.html" rel="alternate"/><published>2023-06-19T14:00:00-05:00</published><updated>2023-06-19T14:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-06-19:/stackexchange-moderator-strike-does-stackexchange-care.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. It's been two full weeks and the progress toward resolution has been minimal at best.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="recap"&gt;Recap&lt;a class="headerlink" href="#recap" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://andrewwegner.com/stackexchange-moderator-strike.html"&gt;The Moderation and curator strike started on June 5.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Stack Exchange has downplayed the effect of this to the press, while at the same time straight lied to the press about &lt;a href="https://openletter.mousetail.nl/"&gt;causes of the strike&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Stack Exchange &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-week-one.html"&gt;removed access to the data dump back&lt;/a&gt; in March but never told anyone until they were called out on it in early June.&lt;/li&gt;
&lt;li&gt;The &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-update.html"&gt;moderation team has elected three representatives&lt;/a&gt; to engage with Stack Exchange to solve these problems and end the strike.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="where-are-we-now"&gt;Where are we now?&lt;a class="headerlink" href="#where-are-we-now" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Last week, I ended with a note that the data dumps should be restored by June 16. Good news. That has been completed and the &lt;a href="https://meta.stackexchange.com/q/389922/186281"&gt;dump was uploaded&lt;/a&gt; by June 16. It's 
progress, but after two weeks we have only accomplished one out of four tasks on the list of conditions to end the strike.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A retraction of the prohibition of moderating GPT content.&lt;/li&gt;
&lt;li&gt;The private policy on GPT content that was issued to moderators must be revealed publicly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The data dumps must be re-enabled and SEDE and API access guaranteed.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Stack Exchange, Inc. must communicate, gather feedback, and act on that feedback before making major policy or software changes to the public platform.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I've heard rumors that the second bullet may occur, but nothing has been done publicly. Thus, there is nothing outside of feedback from representatives to go on here.&lt;/p&gt;
&lt;h3 id="gpt-content"&gt;GPT Content&lt;a class="headerlink" href="#gpt-content" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;On Thursday, June 15, &lt;a href="https://meta.stackoverflow.com/q/425162/189134"&gt;Stack Exchange enabled their "formatting assistant".&lt;/a&gt; To say it went poorly is an understatement. There are currently 52 answers to that
question showing how it doesn't work. It's a thin wrapper around a version of ChatGPT or GPT-4. It's also very, very bad at being a "formatting assistant". Instead,
it's &lt;a href="https://meta.stackoverflow.com/a/425176/189134"&gt;rewriting content&lt;/a&gt;, &lt;a href="https://meta.stackoverflow.com/a/425165/189134"&gt;butchering code being asked about&lt;/a&gt;, &lt;a href="https://meta.stackoverflow.com/a/425169/189134"&gt;making stuff up&lt;/a&gt;, &lt;a href="https://meta.stackoverflow.com/a/425190/189134"&gt;answering questions&lt;/a&gt; and everything other than making formatting 
better. One user, Mithical, found the &lt;a href="https://meta.stackoverflow.com/a/425208/189134"&gt;prompt the formatting assistant&lt;/a&gt; is using. &lt;/p&gt;
&lt;p&gt;The one small positive that came out of this is that Stack Exchange did &lt;a href="https://meta.stackoverflow.com/q/425081/189134"&gt;communicate with the community&lt;/a&gt; a few days before this was released. This isn't enough 
though.&lt;/p&gt;
&lt;h3 id="communication"&gt;Communication&lt;a class="headerlink" href="#communication" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Why isn't one post enough? &lt;/p&gt;
&lt;p&gt;Behind the scenes it's become clearer to me that several staff members of Stack Exchange don't wish to engage with the community. I briefly touched on one of these
people in my last update. The public statements to the press have sounded out of the loop, and the internal reactions on moderation channels have been 
complaints that moderators are too negative. &lt;/p&gt;
&lt;p&gt;Of course moderators are negative right now. &lt;em&gt;There is a strike going on because they are unhappy&lt;/em&gt;. The feedback from representatives continues to be filled with
roadblocks. &lt;/p&gt;
&lt;h2 id="where-do-i-sit-today"&gt;Where do I sit today?&lt;a class="headerlink" href="#where-do-i-sit-today" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I think it's becoming more clear that Stack Exchange is not interested in removing GenerativeAI content from its site. It's actively building and promoting a tool
that utilizes ChatGPT under the hood. I am very surprised that they pulled the plug on the formatting assistant after two days. Previous negative feedback has been 
ignored and I fully expected this one to be as well. The problem with pulling the plug on this is that the CEO has committed to exciting announcements about AI this 
summer. If the community just showed that one of those AI projects was a flop, they are going to go even harder at getting the next one to succeed. &lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-personal-frustrations.html"&gt;As this drags on into it's third week, I've become more pessimistic that this strike doesn't end with resignations&lt;/a&gt;. Reddit had a two day strike during this 
time period, and they are already threatening to replace community moderators. The Stack Exchange CEO has expressed their fondness for Reddit on occasion,
so I suspect that the action being taken over there is being considered here. Of course, the difference here is that Stack Exchange wasn't effectively shut down
by the strike like Reddit was. Sites didn't go dark. Instead, curators and moderators stopped curating and moderating. Everything is still available,
the sites are still working, it's just less tidy than usual. &lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike Update 2</title><link href="https://andrewwegner.com/stackexchange-moderator-strike-update.html" rel="alternate"/><published>2023-06-13T23:00:00-05:00</published><updated>2023-06-13T23:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-06-13:/stackexchange-moderator-strike-update.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. My update and impression of the last day.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="summary"&gt;Summary&lt;a class="headerlink" href="#summary" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Yesterday's update summarized the &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike-week-one.html"&gt;first week of the Stack Exchange Moderator strike&lt;/a&gt;. The &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike.html"&gt;strike began last Monday&lt;/a&gt;
with an &lt;a href="https://openletter.mousetail.nl/"&gt;open letter to Stack Exchange&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Over the weekend, the moderators and community curators elected three representatives to talk with Stack Exchange. Those talks started this week.
They went into these discussions to reiterate the four conditions to end the strike.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A retraction of the prohibition of moderating GPT content.&lt;/li&gt;
&lt;li&gt;The private policy on GPT content that was issued to moderators must be revealed publicly.&lt;/li&gt;
&lt;li&gt;The data dumps must be re-enabled and SEDE and API access guaranteed.&lt;/li&gt;
&lt;li&gt;Stack Exchange, Inc. must communicate, gather feedback, and act on that feedback before making major policy or software changes to the public platform.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, where are we now?&lt;/p&gt;
&lt;h2 id="the-update"&gt;The update&lt;a class="headerlink" href="#the-update" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="representatives"&gt;Representatives&lt;a class="headerlink" href="#representatives" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Discussions with Stack Exchange started off less than stellar, in my opinion. Quoting from the Vice President of Community at Stack Exchange, as 
relayed to those of us not in the discussions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"So in summary:  Cesar is my delegate for issues here, while I reserve final decision making to myself I"ve vested him with broad discretionary authority and we're meeting on a frequent (daily or multiple times daily) basis to clear any differences between us."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;My takeaway here is that the Vice President of Community - a person whose job should involve &lt;em&gt;dealing with the community&lt;/em&gt; - doesn't think this is 
important enough to attend. While I've worked with Cesar in my role as a moderator, this is just making Cesar the car salesman that has to 
"talk with the manager about your offer". It's a way for the company to drag this out and a way to make the real decisions without moderation
input. I fully expect to hear from the community representatives that Cesar liked a proposal but the VP did not but that the VP wasn't around to 
discuss why not. I'd love to be proven wrong on that though.&lt;/p&gt;
&lt;h3 id="data-dumps"&gt;Data Dumps&lt;a class="headerlink" href="#data-dumps" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Data dumps are the third bullet on our list of things that must be restored. Good news! Stack Exchange will have those restored by June 16, 2023.&lt;/p&gt;
&lt;p&gt;This was &lt;a href="https://meta.stackexchange.com/a/390200/186281"&gt;posted&lt;/a&gt; (&lt;a href="https://meta.stackexchange.com/a/390202/186281"&gt;twice&lt;/a&gt;) by the VP of Community - the one not attending the talks above.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[...]
Our intention was never to stop posting the data dump, only to begin to collect more information on how it was being used and by whom - especially in light of the rise of LLMs and questions around how genAI models are handling attribution. However, it’s clear that many individual users (academics, researchers, etc) have an immediate need to access updated versions of the dumps. So we are re-enabling the automatic data dumps (and uploading the one that’s about a week overdue). We believe that this can happen by end of the day Friday. We will continue to work toward the creation of certain guardrails (for large AI/LLM companies) for both the dumps and the API, but again - we have no intention of restricting/charging community members or other responsible users of the dumps or the API from accessing them.
[...]
In the meantime, the data dumps will be re-enabled by end of day Friday. We will communicate here when that has been completed or if there are any delays. We will also post here prior to making any future changes to the dumps or distribution of the dumps.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I suppose now we wait to see if there are any "delays" before Friday.&lt;/p&gt;
&lt;p&gt;This message was confirmed by one of the Co-Founders that has since left Stack Exchange and 
&lt;a href="https://stackoverflow.blog/2009/06/04/stack-overflow-creative-commons-data-dump/"&gt;originally committed to these data dumps back in June 2009&lt;/a&gt;. &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I have confirmation via email from Prashanth that this is, indeed, &lt;strong&gt;the new official policy&lt;/strong&gt;. I'm glad to see it. Creative Commons is part of our contract with the community, and it should never be broken -- however, CC does need to address the AI issue in an updated license, in my personal opinion. [...] - &lt;a href="https://meta.stackexchange.com/questions/390201/what-is-the-current-june-2023-status-of-the-data-dumps-and-the-company-s-commi?noredirect=1&amp;amp;lq=1#comment1302765_390202"&gt;Jeff Atwood&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I am happy with this concession and confirmation of the concession from our representatives, Stack Exchange and a Co-founder.&lt;/p&gt;
&lt;p&gt;However, it's telling that once again it's Philippe making statements that are lies.&lt;/p&gt;
&lt;p&gt;He's done it with posts to the press that I mentioned yesterday (moderators are depending on GPT detectors!). He's done it with the internal 
emails to his own coworkers. He's doing it again here.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Our intention was never to stop posting the data dump...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This &lt;a href="https://meta.stackexchange.com/a/390040/186281"&gt;directly contradicts that statement provided by the Stack Exchange Chief Technology Officer&lt;/a&gt; last week. &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Stack Overflow senior leadership is working on a strategy to protect Stack Overflow data from being misused by companies building LLMs. While working on this strategy, we decided to stop the dump until we could put guardrails in place.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For a VP of Community, his ability to communicate with the community is greatly lacking. &lt;/p&gt;
&lt;h3 id="ai-policy"&gt;AI Policy&lt;a class="headerlink" href="#ai-policy" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Not much more progress has been reported by the community representatives. From what I have seen, Stack Exchange is pushing to call an end to this
with the promise of a new policy. But it's not done yet. They'll work on it with the moderators and once that's done, that will replace the current 
policy that started the strike. The representative mentioned they were pushing for a deadline on how quickly moderators would be able to commit to this 
change.&lt;/p&gt;
&lt;p&gt;In my opinion, this is a way to end the community's moderation strike and agree to essentially nothing. It's another promise that something will happen. 
It gets the community back to moderation (which Stack Exchange employees have been doing for the week), and if they break that promise the effort to 
re-organize action has to start all over again. &lt;/p&gt;
&lt;p&gt;Right now, I'm not agreeing to go back to utilizing my free time to perform moderation duties without knowing what the new policy is. &lt;/p&gt;
&lt;h2 id="where-do-i-sit"&gt;Where do I sit?&lt;a class="headerlink" href="#where-do-i-sit" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Much like yesterday, I continue to re-evaluate my relationship with Stack Exchange. I'm really happy that the data dump has been restored. The messaging
around it though continues to erode my trust in the company's actions. This was also one of the easier items for Stack Exchange to agree to, even 
though it looks like a co-founder may have had a hand in resolving this as well. I don't know if that's true and I appreciate the work the 
representatives have done to resolve our first point of contention.&lt;/p&gt;
&lt;p&gt;It's also very telling that the messaging doesn't mention the restoration in the context of the strike at all. If the company was attempting to 
build goodwill in this environment, I'd think they would point to the conditions in the open letter and tie the enablement of the data dumps directly
to that. Instead, we got a statement that says they didn't intend to stop posting the data dump, directly contradicting a previous statement saying
senior leadership decided to stop the dump.&lt;/p&gt;
&lt;p&gt;Amazing.&lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Exchange Strike - Week One</title><link href="https://andrewwegner.com/stackexchange-moderator-strike-week-one.html" rel="alternate"/><published>2023-06-12T09:00:00-05:00</published><updated>2023-06-12T00:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-06-12:/stackexchange-moderator-strike-week-one.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. This is my summary of the first week.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="summary"&gt;Summary&lt;a class="headerlink" href="#summary" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;A week ago, many &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike.html"&gt;Stack Exchange diamond moderators began a moderation strike&lt;/a&gt;. They have been joined by 
power users and curators. At the time of this post, there are over 1200 users that have signed the &lt;a href="https://openletter.mousetail.nl/"&gt;open letter to Stack Exchange&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This strike has been covered in a few news articles, but I suspect this week's actions against Reddit and their new API pricing
changes will overshadow Stack Overflow for a little while. That's fine with me. Perhaps cooler heads will prevail when there is 
less public focus.&lt;/p&gt;
&lt;p&gt;The Stack Exchange strike has been covered:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://gizmodo.com/ai-stack-overflow-content-moderation-chat-gpt-1850505609"&gt;On Gizmodo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.vice.com/en/article/4a33dj/stack-overflow-moderators-are-striking-to-stop-garbage-ai-content-from-flooding-the-site"&gt;On Vice&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://devclass.com/2023/06/05/stack-overflow-volunteer-moderators-down-tools-over-secret-new-policy-that-obstructs-removal-of-ai-generated-content/"&gt;On DevClass&lt;/a&gt; (with an interview from a fellow Charcoal power user and Stack Overflow moderator)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The important thing in this set of articles is the public statement that was released by Philippe Beaudette, Vice President of Community (taken from the Vice article above).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A small number of moderators (11%) across the Stack Overflow network have stopped engaging in several activities, including moderating content. The primary reason for this action is dissatisfaction with our position on detection tools regarding AI-generated content. Stack Overflow ran an analysis and the ChatGPT detection tools that moderators were previously using have an alarmingly high rate of false positives.&lt;/p&gt;
&lt;p&gt;We stand by our decision to require that moderators stop using the tools previously used. We are confident that we will find a path forward. We regret that actions have progressed to this point, and the Community Management team is evaluating the current situation as we work hard to stabilize things in the short term.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They doubled down on this explanation in two meta posts. The &lt;a href="https://meta.stackexchange.com/q/389834/186281"&gt;initial statement&lt;/a&gt; and a &lt;a href="https://meta.stackexchange.com/q/389928/186281"&gt;post with "data"&lt;/a&gt;. I encourage readers 
to spend a while reading through that second link and the answers. The community is skeptical of the conclusions drawn and 
has counterarguments and data scattered in the answers.&lt;/p&gt;
&lt;p&gt;Finally, during the week it was discovered that Stack Exchange stopped their quarterly data dump of all content. This was 
&lt;a href="https://meta.stackexchange.com/a/390040/186281"&gt;announced&lt;/a&gt; after a former employee stated that the data dumps were turned off in March. &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Stack Overflow senior leadership is working on a strategy to protect Stack Overflow data from being misused by companies building LLMs. While working on this strategy, we decided to stop the dump until we could put guardrails in place.&lt;/p&gt;
&lt;p&gt;We are working on setting up the infrastructure to do this correctly in the age of LLMs --- where we continue to be open and share the data with our developer community but work to set up a formal framework for large AI companies that want to leverage the data.&lt;/p&gt;
&lt;p&gt;We are looking for ways to gate access to the Dump, APIs, and SEDE, that will allow individuals access to the data while preventing misuse by organizations looking to profit from the work of our community. We are working to design and implement appropriate safeguards and still sorting out the details and timelines. We will provide regular updates on our progress to this group.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="where-are-we-now"&gt;Where are we now?&lt;a class="headerlink" href="#where-are-we-now" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;With the summary out of the way, where do we sit now?&lt;/p&gt;
&lt;p&gt;As of midnight today, the users that have stopped moderation activities have &lt;a href="https://meta.stackexchange.com/q/390106/186281"&gt;selected three representatives&lt;/a&gt; to be our voice
in conversations with Stack Exchange and listed the conditions for ending the strike.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A retraction of the prohibition of moderating GPT content.&lt;/li&gt;
&lt;li&gt;The private policy on GPT content that was issued to moderators must be revealed publicly.&lt;/li&gt;
&lt;li&gt;The data dumps must be re-enabled and SEDE and API access guaranteed.&lt;/li&gt;
&lt;li&gt;Stack Exchange, Inc. must communicate, gather feedback, and act on that feedback before making major policy or software changes to the public platform.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I want to point out that there is nothing here about GPT detection tools. That's because this isn't the reason for the strike. 
Despite what Stack Exchange has said in their messages to the press, &lt;a href="https://meta.stackexchange.com/a/389856/186281"&gt;this isn't about detection tools&lt;/a&gt;. (I also have 
&lt;a href="https://meta.stackexchange.com/a/389825/186281"&gt;an answer&lt;/a&gt; on that question about the policy's origination.)&lt;/p&gt;
&lt;p&gt;The discovery that the data dumps were turned off has angered many people - those both already involved and others that learned of 
it this week. The fact that these were turned off over two months ago and nothing was said to the community has made the situation
even worse.&lt;/p&gt;
&lt;h2 id="whats-stack-saying"&gt;What's Stack Saying?&lt;a class="headerlink" href="#whats-stack-saying" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;This section was added after original publication&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Shortly after publishing my thoughts with this article, internal Stack Exchange emails were published. These show how 
&lt;a href="https://web.archive.org/web/20230612155947/https://jlericson.com/2023/06/12/internal_messages.html"&gt;Stack Exchange is communicating this with their employees&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I highly encourage everyone to read &lt;em&gt;those&lt;/em&gt; too. There is a lot of reading scattered around to get the full scope
of how unhappy the users of Stack Exchange are.&lt;/p&gt;
&lt;p&gt;I think it's telling that the company managed to copy and paste from the &lt;a href="https://openletter.mousetail.nl/"&gt;strike letter&lt;/a&gt;, but at the same time managed 
to completely ignore that this isn't about GPT detectors. Even worse, the company is spreading that falsehood to its employees
and the press. On top of that, it has two teams - Community Leadership and Marketing - working on communications, yet no progress
has been made.&lt;/p&gt;
&lt;h2 id="where-do-i-sit"&gt;Where do I sit?&lt;a class="headerlink" href="#where-do-i-sit" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I mentioned &lt;a href="https://andrewwegner.com/stackexchange-moderator-strike.html"&gt;last week&lt;/a&gt; that I've been here for over 13 years. I've applied and interviewed at the company. I've 
made friends with the employees and gotten recommendations during those interviews. I've built tooling to help &lt;a href="https://andrewwegner.com/can-a-machine-be-taught-to-flag-comments-automatically.html"&gt;moderate comments&lt;/a&gt;
and &lt;a href="https://andrewwegner.com/can-a-machine-be-taught-to-flag-spam-automatically.html"&gt;eliminate spam&lt;/a&gt; on the network. I've been here a long time.&lt;/p&gt;
&lt;p&gt;In 2019, I reevaluated my role on the network during Stack Exchange's last screw up. In that one, they managed to libel a moderator,
by name, to the press. This event still reverberates through the network today and serves as a brick that this current situation is
built with. Stack Exchange destroyed 10 years' worth of trust and community relationships in that event. They've tried to rebuild it
over these last three years and have been marginally successful. That's gone again.&lt;/p&gt;
&lt;p&gt;Now I'm reevaluating how I utilize my free time again. It's not constructive to say I'll hand in my diamonds and walk away if the
conditions above aren't hit. But, it's worth noting that I agree with all four of those conditions. This is my free time I'm donating
and if the organization I'm donating that time to has changed their philosophy, I will take that into consideration as I reevaluate.&lt;/p&gt;
&lt;p&gt;I think we are long past the point of "how it used to be" at Stack Exchange. The question I am asking myself is whether or not I
agree with the new direction the platform is going.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Updated after the email publication, above&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Reading the email and FAQ that Stack Exchange sent to their employees, I am struck by how out of touch members of the company are.
Not the people that don't interact with the community. The people that should know the pulse of the moderation teams, the 
community opinions, and users in general - like the Vice President of Community. While I'm not surprised that they are 
downplaying the strike, I am surprised that they are flat out lying to their employees.&lt;/p&gt;
&lt;p&gt;Stack Exchange should be called out on that and their employees should know that it's happening. Stack Exchange, Stack Overflow, Inc.,
is lying to their employees. The email presented was written after the strike started and contains information about all of the 
FAQ items mentioned.&lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Joining the Stack Exchange Moderator Strike</title><link href="https://andrewwegner.com/stackexchange-moderator-strike.html" rel="alternate"/><published>2023-06-05T01:00:00-05:00</published><updated>2023-06-05T01:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2023-06-05:/stackexchange-moderator-strike.html</id><summary type="html">&lt;p&gt;Stack Exchange moderators are striking. This is my reasoning for why I'm joining.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="summary"&gt;Summary&lt;a class="headerlink" href="#summary" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I'm signing an open letter to Stack Exchange because those of us that volunteer our time and energy have 
been put in an impossible situation. On Stack Overflow we &lt;a href="https://andrewwegner.com/stackoverflow-bans-chatgpt.html"&gt;banned ChatGPT created content&lt;/a&gt; almost immediately
after it was released. Theoretically, that is no longer in effect.&lt;/p&gt;
&lt;p&gt;This isn't due to a change in community perception. Instead, it's due to an &lt;a href="https://meta.stackexchange.com/a/389583/186281"&gt;abrupt policy change&lt;/a&gt; on Stack
Exchange's part that was posted on May 30. It's important to note that this public policy does &lt;em&gt;not&lt;/em&gt; match
the guidance that was provided privately.&lt;/p&gt;
&lt;p&gt;The relevant portion of this public policy is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In order to help mitigate the issue, we've asked moderators to apply a very strict standard of evidence to determining whether a post is AI-authored when deciding to suspend a user.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The private policy notes that "very strict" is essentially "don't moderate unless they explicitly say it was created by an AI".&lt;/p&gt;
&lt;h2 id="background"&gt;Background&lt;a class="headerlink" href="#background" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The Stack Exchange CEO &lt;a href="https://stackoverflow.blog/2023/05/31/ceo-update-paving-the-road-forward-with-ai-and-community-at-the-center/"&gt;posted a blog post&lt;/a&gt; at the end of May 2023. In this post, they stated:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Approximately 10% of our company is working on features and applications leveraging GenAI that have the potential to increase engagement within our public community and add value to customers of our SaaS product, Stack Overflow for Teams.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This goes against the community desire to &lt;em&gt;not&lt;/em&gt; have GenAI content on the site. The CEO has not provided any feedback to the community,
other than a note that there is a big summer project with GenAI. The community reaction hasn't been positive.&lt;/p&gt;
&lt;p&gt;Combining this announcement with the newly announced policy the previous day, and an astonishing inability to articulate more details or reasoning,
has produced a lot of unhappy moderators.&lt;/p&gt;
&lt;p&gt;Additionally, I feel that the company has destroyed 3 years of rebuilding trust. Back in 2019, they destroyed this trust and nearly had a moderator
strike then too. That involved providing information with very limited context to a journalist and, due to that, presenting a single moderator in an 
unflattering light. The trust lost in that incident has taken years to rebuild. Even today it is cited as a low point, and users can point
out that singular incident when trust of the company plummeted. &lt;/p&gt;
&lt;p&gt;The public announcement to not moderate GenAI content contained this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Through no fault of moderators' own, we also suspect that there have been biases for or against residents of specific countries as a potential result of the heuristics being applied to these posts.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That's just wrong. Absolutely wrong. It is 100% inaccurate and Stack Exchange has offered no data to back this up. User country of origin, region in the world, or any kind of physical location is not available to moderators. We are presented a flag and given information about the content that has been flagged. &lt;/p&gt;
&lt;h2 id="why-am-i-participating"&gt;Why am I participating?&lt;a class="headerlink" href="#why-am-i-participating" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;So...my participation? Why am I joining in? &lt;/p&gt;
&lt;p&gt;Stack Exchange, and Stack Overflow, thrive due to the human element. Its userbase has been around over a decade and answered millions of questions across over 180 sites. The week ChatGPT came out, the community saw the bad results it can provide. For the past 6 months, we've continued to see how bad that is. The moderation teams across the network worked to generate a policy and get it approved by the company. I've reproduced it in full below, but as of this post it is still on the site and contradicts the new policy.&lt;/p&gt;
&lt;p&gt;In addition to completely ignoring the community's input on how they do not want GenAI on the site, the company ignored their own &lt;a href="https://stackoverflow.com/legal/moderator-agreement"&gt;moderator agreement&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;With the assumption that the above will change at some point, the relevant section is pasted here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Stack Exchange, Inc. agrees that it will:&lt;/p&gt;
&lt;p&gt;i. Respect your privacy per the terms of the Privacy Policy for the Public Network.&lt;/p&gt;
&lt;p&gt;ii. Get your explicit written permission before commenting to any media (including media outlets controlled by Stack Exchange Inc.) or independent reporters about you or your moderator actions as per our Press Policy.&lt;/p&gt;
&lt;p&gt;iii. Allow you to resign your position for any reason without penalty or repercussions. As a volunteer, Stack Exchange, Inc. respects your time and will release you from duty should you ask.&lt;/p&gt;
&lt;p&gt;iv. Operate “Stack Gives Back”, an annual program giving to selected charities in honor of our moderators.&lt;/p&gt;
&lt;p&gt;v. Post previews for review of all new official policies in the Moderators Teams instance with the policy tag, marked with links to their public version once published, and maintain a listing of all official network-wide policies with links to them in the Help Center.&lt;/p&gt;
&lt;p&gt;vi. Announce changes to the moderator agreement no less than sixty days before the deadline to accept the new agreement with a period of at least thirty days for discussion and review.&lt;/p&gt;
&lt;p&gt;vii. Provide support for your questions, requests and concerns on the Moderators Teams instance and/or the Teachers’ Lounge, direct email to CMs, and content on Meta escalated to staff by whatever formal documented process is in effect at the time.&lt;/p&gt;
&lt;p&gt;viii. Respect your right to speak openly to question and challenge policy without reprisal so long as such speech does not break the Code of Conduct.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The relevant section is &lt;code&gt;vi&lt;/code&gt;. There was no discussion period on this. No engagement with the moderators, or the community. Instead it was "effective
immediately." The best we've gotten so far is an "oops, sorry."&lt;/p&gt;
&lt;p&gt;That's not how something in a &lt;code&gt;/legal&lt;/code&gt; link should operate. If I can't trust  them to uphold an agreement they have in writing, why should I trust
them to uphold anything else? &lt;/p&gt;
&lt;h2 id="what-does-the-community-want-out-of-this"&gt;What does the community want out of this?&lt;a class="headerlink" href="#what-does-the-community-want-out-of-this" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;From the open letter:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Until Stack Overflow, Inc. retracts this policy change to a degree that addresses the concerns of the moderators, and allows moderators to effectively enforce established policies against AI-generated answers, we are calling for a general moderation strike, as a last-resort effort to protect the Stack Exchange platform and users from a total loss in value. We would also like to remind Stack Overflow, Inc. that a network that entirely relies on volunteers for its moderation model cannot then consistently ignore, mistreat, and malign those same volunteers. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="my-feelings-on-this"&gt;My feelings on this&lt;a class="headerlink" href="#my-feelings-on-this" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I've been on Stack Overflow and Stack Exchange for over 13 years. I've been a moderator on the network since 2014 and on Stack Overflow since 2017.
I've &lt;a href="https://andrewwegner.com/tag/stack-exchange.html"&gt;written about Stack Overflow a bit over the years&lt;/a&gt;. I've participated in &lt;a href="https://meta.stackexchange.com/questions/291301/can-a-machine-be-taught-to-flag-spam-automatically"&gt;Charcoal&lt;/a&gt;, the spam fighting community, since 2015ish. &lt;a href="https://andrewwegner.com/decade-fighting-spam-charcoal.html"&gt;Charcoal has automatically flagged more than 86,000 posts across the network since 2016&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've applied to several jobs at Stack Exchange. I've interviewed for a couple positions. I have spent &lt;em&gt;a lot&lt;/em&gt; of time with the network, the 
community, moderators, and employees making this a great place for internet users to find their answers.&lt;/p&gt;
&lt;p&gt;I have helped to build the community/company trust that I mentioned above. I watched it crumble. I thought about leaving in 2019, but instead
spent the next three years working to rebuild that trust. I'm at the point again, where I see the company not understanding their community. At all.
It's sad that this cycle has repeated itself and it's worse that the company is, again, tossing their most engaged users under the bus.&lt;/p&gt;
&lt;p&gt;This strike serves two purposes in my mind - the first is the officially stated one. Do not allow GenAI content on the network. It will erode the value of the network quickly. We've also demonstrated that ChatGPT and its peers are not great at answering complex questions. But, second, and unofficially,
this strike will represent a chance for Stack Exchange to show whether or not they care about what the community has to say. This is, I believe, the last
opportunity for them to retract this policy and reflect on why they are overruling so many communities that reject GenAI in their community. It's their
last opportunity to show they support their moderators and the human aspect of moderation. Failure to do either means that Stack Exchange has given up
on community building. &lt;/p&gt;
&lt;p&gt;I am a volunteer for this community. I would love to continue that role, but this is the best way for me to show that GenAI is not the right path for the company to take.&lt;/p&gt;
&lt;h2 id="whats-next"&gt;What's Next&lt;a class="headerlink" href="#whats-next" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;On June 5, 2023 at midnight, moderator local time, the network will start to see moderation activities cease (or slow down). I will be part of that.&lt;/p&gt;
&lt;h2 id="gpt-policy"&gt;GPT Policy&lt;a class="headerlink" href="#gpt-policy" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;This policy was crafted with the input of moderators and Stack Exchange. An important thing to notice is that the moderators are empowered to issue suspensions. This is something the new policy prevents.&lt;/p&gt;
&lt;p&gt;```
Why posting GPT and ChatGPT generated answers is not currently acceptable&lt;/p&gt;
&lt;p&gt;This Help Center article provides insight and rationale on our policy regarding the usage of GPT and ChatGPT on Stack Overflow. While this is the position of Stack Overflow staff, it’s meant to support the prior work done by moderators (namely, the temporary policy issued to ban contributions by ChatGPT).&lt;/p&gt;
&lt;p&gt;Stack Overflow is a community built upon trust. The community trusts that users are submitting answers that reflect what they actually know to be accurate and that they and their peers have the knowledge and skill set to verify and validate those answers. The system relies on users to verify and validate contributions by other users with the tools we offer, including responsible use of upvotes and downvotes. Currently, contributions generated by GPT most often do not meet these standards and therefore are not contributing to a trustworthy environment. This trust is broken when users copy and paste information into answers without validating that the answer provided by GPT is correct, ensuring that the sources used in the answer are properly cited (a service GPT does not provide), and verifying that the answer provided by GPT clearly and concisely answers the question asked.&lt;/p&gt;
&lt;p&gt;The objective nature of the content on Stack Overflow means that if any part of an answer is wrong, then the answer is objectively wrong. In order for Stack Overflow to maintain a strong standard as a reliable source for correct and verified information, such answers must be edited or replaced. However, because GPT is good enough to convince users of the site that the answer holds merit, signals the community typically use to determine the legitimacy of their peers’ contributions frequently fail to detect severe issues with GPT-generated answers. As a result, information that is objectively wrong makes its way onto the site. In its current state, GPT risks breaking readers’ trust that our site provides answers written by subject-matter experts.&lt;/p&gt;
&lt;p&gt;Moderators are empowered (at their discretion) to issue immediate suspensions of up to 30 days to users who are copying and pasting GPT content onto the site, with or without prior notice or warning.
```&lt;/p&gt;
&lt;h2 id="the-letter"&gt;The letter&lt;a class="headerlink" href="#the-letter" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The letter below was originally posted as an &lt;a href="https://openletter.mousetail.nl/"&gt;open letter to Stack Exchange&lt;/a&gt;. I've reposted it here
for the record.&lt;/p&gt;
&lt;p&gt;```
June 5, 2023
Stack Overflow, Inc. has decreed a near-total prohibition on moderating AI-generated content in the wake of a flood of such content being posted to and subsequently removed from the Stack Exchange network, tacitly allowing the proliferation of incorrect information ("hallucinations") and unfettered plagiarism on the Stack Exchange network. This poses a major threat to the integrity and trustworthiness of the platform and its content.&lt;/p&gt;
&lt;p&gt;We, the undersigned, are volunteer moderators, contributors, and users of Stack Overflow and the Stack Exchange network. Effective immediately, we are enacting a general moderation strike on Stack Overflow and the Stack Exchange network, in protest of this and other recent and upcoming changes to policy and the platform that are being forced upon us by Stack Overflow, Inc.&lt;/p&gt;
&lt;p&gt;Our efforts to effect change through proper channels have been ignored, and our concerns disregarded at every turn. Now, as a last resort, we are striking out of dedication to the platform that we have put over a decade of care and volunteer effort into. We deeply believe in the core mission of the Stack Exchange network: to provide a repository of high-quality information in the form of questions and answers, and the recent actions taken by Stack Overflow, Inc. are directly harmful to that goal.&lt;/p&gt;
&lt;p&gt;Specifically, moderators are no longer allowed to remove AI-generated answers on the basis of being AI-generated, outside of exceedingly narrow circumstances. This results in effectively permitting nearly all AI-generated answers to be freely posted, regardless of established community consensus on such content.&lt;/p&gt;
&lt;p&gt;In turn, this allows incorrect information (colloquially referred to as "hallucinations") and plagiarism to proliferate unchecked on the platform. This destroys trust in the platform, as Stack Overflow, Inc. has previously noted.&lt;/p&gt;
&lt;p&gt;In addition, the details of the policies issued directly to moderators differ substantially from the guidelines outlined publicly, with moderators barred from publicly sharing the details.&lt;/p&gt;
&lt;p&gt;These policies disregard the leeway historically granted to individual Stack Exchange communities to determine their policies, by making changes without the input of the community, overriding community consensus, and outright refusing to reconsider their position.&lt;/p&gt;
&lt;p&gt;Until this matter is resolved satisfactorily, we will be pausing activities including, but not limited to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Raising and handling flags.&lt;/li&gt;
&lt;li&gt;Running SmokeDetector, the anti-spam bot.&lt;/li&gt;
&lt;li&gt;Closing or voting to close posts.&lt;/li&gt;
&lt;li&gt;Deleting or voting to delete posts.&lt;/li&gt;
&lt;li&gt;Reviewing tasks in the various review queues.&lt;/li&gt;
&lt;li&gt;Running various other bots designed to assist in moderation, such as detecting plagiarism, low-quality answers, and rude comments.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Until Stack Overflow, Inc. retracts this policy change to a degree that addresses the concerns of the moderators, and allows moderators to effectively enforce established policies against AI-generated answers, we are calling for a general moderation strike, as a last-resort effort to protect the Stack Exchange platform and users from a total loss in value. We would also like to remind Stack Overflow, Inc. that a network that entirely relies on volunteers for its moderation model cannot then consistently ignore, mistreat, and malign those same volunteers.
```&lt;/p&gt;</content><category term="Stack Exchange Strike"/><category term="Stack Exchange"/><category term="moderation"/><category term="chatgpt"/></entry><entry><title>Stack Overflow bans ChatGPT temporarily</title><link href="https://andrewwegner.com/stackoverflow-bans-chatgpt.html" rel="alternate"/><published>2022-12-05T14:45:00-06:00</published><updated>2022-12-06T00:00:00-06:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2022-12-05:/stackoverflow-bans-chatgpt.html</id><summary type="html">&lt;p&gt;Today Stack Overflow moderators banned ChatGPT on the site. This is my reflection on why we went this route.&lt;/p&gt;</summary><content type="html">&lt;p&gt;Today Stack Overflow moderators (&lt;a href="https://andrewwegner.com/collecting-diamonds-on-stack-exchange.html"&gt;myself included&lt;/a&gt;), have implemented a &lt;a href="https://meta.stackoverflow.com/q/421831/189134"&gt;temporary ban on ChatGPT on the site&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Use of ChatGPT generated text for posts on Stack Overflow is temporarily banned.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This ban was picked up immediately by several &lt;a href="https://www.google.com/search?q=stack+overflow+chatgpt&amp;amp;biw=1506&amp;amp;bih=1308&amp;amp;tbs=cdr%3A1%2Ccd_min%3A12%2F5%2F2022%2Ccd_max%3A12%2F5%2F2022"&gt;technology news outlets&lt;/a&gt;, including &lt;a href="https://www.zdnet.com/article/stack-overflow-temporarily-bans-answers-from-openais-chatgpt-chatbot/"&gt;ZDNet&lt;/a&gt;, &lt;a href="https://www.theverge.com/2022/12/5/23493932/chatgpt-ai-generated-answers-temporarily-banned-stack-overflow-llms-dangers"&gt;The Verge&lt;/a&gt;, and &lt;a href="https://www.vice.com/en/article/wxnaem/stack-overflow-bans-chatgpt-for-constantly-giving-wrong-answers"&gt;Vice&lt;/a&gt;.
It was also picked up by &lt;a href="https://www.cnn.com/2022/12/05/tech/chatgpt-trnd/index.html"&gt;CNN&lt;/a&gt;, &lt;a href="https://www.nytimes.com/2022/12/05/technology/chatgpt-ai-twitter.html"&gt;The New York Times&lt;/a&gt;, and the &lt;a href="https://www.washingtonpost.com/business/chatgpt-could-makedemocracy-even-more-messy/2022/12/06/e613edf8-756a-11ed-a199-927b334b939f_story.html"&gt;Washington Post&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That's a lot of news coverage.&lt;/p&gt;
&lt;p&gt;The question that I want to answer is "Why?". Stories and marketing from &lt;a href="https://openai.com/blog/chatgpt/"&gt;OpenAI&lt;/a&gt; give reasons why this new chatbot is a
"good thing". With their examples, it definitely looks that way. But, in practice, it's not working out so well.&lt;/p&gt;
&lt;p&gt;The responses that users have been posting on Stack Overflow have a high rate of being incorrect. Normally, the community can handle this, but the responses aren't your usual small code snippet of an answer. Instead, each one is a long, detailed
explanation that &lt;em&gt;looks&lt;/em&gt; plausible. But, it's wrong. Combined with users posting multiple answers an hour, this is
a lot of content for Stack Overflow reviewers (or worse, the handful of elected moderators) to go through and determine if it's valid.&lt;/p&gt;
&lt;p&gt;At the time the ban was implemented, we'd seen thousands of answers generated by ChatGPT. On the one hand, this is impressive 
work on the ChatGPT AI itself. It can be difficult to detect and is good at holding a conversation. On the other hand, 
and more important from Stack Overflow's perspective, this isn't helping the user base. Thousands of subtly wrong answers is awful. 
It doesn't help the user looking for help, and it will very quickly destroy the trust that millions of developers put in the site if
this is allowed to continue.&lt;/p&gt;
&lt;p&gt;I'll admit that I'm disappointed that AI hasn't reached the point where it can do what ChatGPT &lt;em&gt;seems&lt;/em&gt; to do. But, this is a step
forward. Unfortunately, this step seems to have left a bad taste in the mouth of developers looking for help beyond 
a toy example. &lt;/p&gt;
&lt;p&gt;For the time being, ChatGPT is banned on Stack Overflow. The &lt;a href="https://stackoverflow.com/users?tab=moderators"&gt;moderation team&lt;/a&gt; will continue to work with the company to 
ensure the community we have volunteered to moderate remains one of high quality. Additionally, as the larger &lt;a href="https://meta.stackexchange.com/q/384396/186281"&gt;Stack Exchange 
network of sites debates a similar ban&lt;/a&gt;, the Stack Overflow moderation team will be able to provide input on our experiences.&lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="moderation"/><category term="chatgpt"/></entry><entry><title>Stack Overflow still has issues and it's getting worse</title><link href="https://andrewwegner.com/stack-overflows-still-has-issues-and-its-getting-worse.html" rel="alternate"/><published>2019-03-29T10:34:00-05:00</published><updated>2019-03-29T10:34:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2019-03-29:/stack-overflows-still-has-issues-and-its-getting-worse.html</id><summary type="html">&lt;p&gt;A follow up from a post I made a year and a half ago. How's Stack Overflow doing (from the perspective of a long time community member and moderator)?&lt;/p&gt;</summary><content type="html">
&lt;h2 id="last-time-on-this-blog"&gt;Last time on this blog&lt;a class="headerlink" href="#last-time-on-this-blog" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;A little over a year and a half ago, I wrote an article about &lt;a href="https://andrewwegner.com/stack-overflows-problem-feedback-from-an-experienced-user.html"&gt;Stack Overflow's problems&lt;/a&gt;
from my perspective as an experienced user. This was before I was &lt;a href="https://andrewwegner.com/collecting-diamonds-on-stack-exchange.html"&gt;elected as a moderator&lt;/a&gt; on
Stack Overflow. I ended the previous article with this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I continue to invest my time and effort into the community, but even as an active user who
really wants the company and community to succeed, it's getting harder and harder to ignore that
those of us that have been around for years are not being listened to any more. We're being treated as the
grumpy old person that grumbles about the way things used to be. Our experiences on the site are brushed aside as
being unhelpful to new users. That completely ignores that fact that we are still trying to reach
the goal on which Stack Overflow was created:
&lt;a href="https://stackoverflow.com/tour"&gt;"With your help, we're working together to build a library of detailed answers to every question about programming."&lt;/a&gt;
To do this, we need high quality questions and answers so that we can actually provide help to all users. I
think &lt;em&gt;this&lt;/em&gt; is the biggest challenge that Stack Overflow is going to face in the next 18 months.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So, what's happened in the last 18 months?&lt;/p&gt;
&lt;h2 id="documentation"&gt;Documentation&lt;a class="headerlink" href="#documentation" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;After years of development (it was announced in 2015), &lt;a href="https://meta.stackoverflow.com/q/354217/189134"&gt;Documentation was shuttered&lt;/a&gt; in August of 2017. Stack Overflow
wasn't drawing users to the Documentation feature. Their own metrics and analysis showed that fixing Documentation to
be useful to users - both new and experienced - would require a significantly larger team.&lt;/p&gt;
&lt;h3 id="what-went-wrong"&gt;What went wrong?&lt;a class="headerlink" href="#what-went-wrong" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;In my opinion, and as I &lt;a href="https://andrewwegner.com/stack-overflows-problem-feedback-from-an-experienced-user.html"&gt;mentioned&lt;/a&gt; in 2017, Stack Overflow has ignored its user base. This is going to be a recurring
theme in this post. For years, users provided feedback on meta, in dedicated user experience interviews and in chatrooms.
This resulted in superficial changes and major rewrites. Yet, complaints still existed. These complaints turned off the
experienced users that could produce the high quality documentation. Instead, Documentation became a reputation farming operation
in all but name. This turned off even more users.&lt;/p&gt;
&lt;p&gt;By Stack Overflow's own admission when sunsetting the feature, Documentation was built to solve a problem that wasn't really a
problem.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Finally, our research showed that while a lot of developers were dissatisfied, the current state of programming documentation is
not universally broken the way Q&amp;amp;A was when Stack Overflow started. In particular, we heard over and over that Stack Overflow
has become de facto documentation for many technologies. As many of you pointed out, Stack Overflow is already good enough
at providing documentation of obscure features. Even when considering just the company's mission of helping programmers
“learn, share their knowledge and build their careers”, Documentation isn't the most efficient use of resources.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two years of major development, focusing on a problem that the community had not been enthusiastic about, and intentionally ignoring
other feature requests and other improvements angered a lot of users.&lt;/p&gt;
&lt;h2 id="teams"&gt;Teams&lt;a class="headerlink" href="#teams" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;In my &lt;a href="https://andrewwegner.com/stack-overflows-problem-feedback-from-an-experienced-user.html"&gt;last post&lt;/a&gt;, I mentioned that Teams had been launched and shut down in less than a year. Teams is back! At least the name is.
Initially launched as &lt;a href="https://meta.stackoverflow.com/q/352065/189134"&gt;"Channels"&lt;/a&gt;, and later re-branded to "Stack Overflow for Teams", this is a money generating route for Stack Overflow.
It uses the &lt;a href="https://stackoverflow.com/teams"&gt;old URL&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Now, generating money is good. It's good for both the community and the company. Without money, the company can't survive. Without the
company there is no Stack Overflow or community. My problem isn't with money generation. My problem is that, once again, community feature
requests for higher quality and moderation tooling to cultivate that higher quality were ignored.&lt;/p&gt;
&lt;p&gt;By all accounts, Teams seems to be doing well and bringing in revenue. I am hopeful that this translates into development time to
build out the features the community still clamors for.&lt;/p&gt;
&lt;h2 id="meta-hatred"&gt;Meta hatred&lt;a class="headerlink" href="#meta-hatred" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Meta. &lt;a href="https://blog.codinghorror.com/meta-is-murder/"&gt;It's murder.&lt;/a&gt; Until it's not. Meta is how Stack Overflow communicates with the community. It's how the community
communicates with itself. It's where governing principles/thoughts/guidance/sticky notes come from. In short, meta is a
large part of how Stack Overflow the company and Stack Overflow the community talk with one another. Decisions are questioned here,
announcements are posted here, and little by little the site is made better.&lt;/p&gt;
&lt;p&gt;That is, until nothing happens. Stack Overflow's &lt;a href="https://meta.stackexchange.com/a/19514/186281"&gt;response time has become a meme&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;"6 to 8 weeks" is a joke. It's used to indicate that something isn't going to be built or changed. It's so prevalent that this
comment crops up over and over on feature request posts. It's used by the community to say that nothing is going to happen.&lt;/p&gt;
&lt;p&gt;When something does happen, it's a "big deal". There have been a few examples in the past year. Unfortunately, these changes happened
due to feedback from Twitter, not Meta. For years we've been told to post on Meta. For years we've been told that Meta is where
the company will engage with us. Then two massive changes happened.&lt;/p&gt;
&lt;h3 id="the-welcoming"&gt;The Welcoming&lt;a class="headerlink" href="#the-welcoming" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The first change was to make Stack Overflow more &lt;a href="https://stackoverflow.blog/2018/04/26/stack-overflow-isnt-very-welcoming-its-time-for-that-to-change/"&gt;"welcoming"&lt;/a&gt;. This isn't bad. As both an experienced user and as a
moderator, I've seen my fair share of users not being welcomed. I've seen hostility to poorly asked questions.&lt;/p&gt;
&lt;p&gt;Unfortunately, this whole blog post and resulting meta-drama &lt;em&gt;appear&lt;/em&gt; to have cropped up because of a post on Twitter from
someone who felt unwelcome. That's fair. I believe they felt that way. However, from my point of view, Stack Overflow ignored
their own users (some of whom had been saying the exact same thing for years) because it was suddenly posted on Twitter where
the entire world could comment on things that may have been out of context. Instead of listening to their own users and the
experiences those users had, Stack Overflow went into damage control mode and rapidly &lt;a href="https://stackoverflow.blog/2018/06/21/rolling-out-the-welcome-wagon-june-update/"&gt;updated its "Be Nice" policy.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Whether this is actually what happened or not is really beside the point. Many long time users had this perception. Meta was
ignored. User feedback was ignored. Instead, the person that could shout the loudest and had made the most noise appeared to
be the one that was listened to.&lt;/p&gt;
&lt;p&gt;A few months after the welcoming blog was posted and a month after the update, another post was made about how the company
was attempting to &lt;a href="https://stackoverflow.blog/2018/07/10/welcome-wagon-classifying-comments-on-stack-overflow"&gt;classify comments&lt;/a&gt;. The idea behind this was good; the execution of the blog post was not. In the initial
version of the post, exact comments were posted to show "bad comments". I disagreed that a few of them were rude. I'd have removed them
as no longer needed without a problem. Honestly, I'd probably have removed them as rude too, because comments don't need to stick around
and it's easier to accept the rude flag than it is to decline and manually delete.&lt;/p&gt;
&lt;p&gt;My problem was that the exact comment content was posted as a "wall of shame". Then, despite only employees being involved, none
of these comments were removed or even flagged for moderators to remove. In short, it really was a "wall of shame".&lt;/p&gt;
&lt;p&gt;I believe I covered my disappointment in both this failure and in the technical aspect in my &lt;a href="https://stackoverflow.blog/2018/07/10/welcome-wagon-classifying-comments-on-stack-overflow/#comment-29001"&gt;comment&lt;/a&gt; on the blog.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I am a huge fan of automatically removing unwanted comments. I did so for several years. That said, I’m disappointed in how
this is playing out here. I’m disappointed on both a community level and a technical level.&lt;/p&gt;
&lt;p&gt;On the community level, I am very disappointed that 57 Stack Exchange employees were able to evaluate bad comments, determine
they were bad enough to put in the hall of shame post here, and then do nothing about them. It took users less than 15 minutes
to find those comments on Stack Overflow and identify the “rude” users. Users who are rude because they asked why a certain
tag was on a question. Did none of your 57 users have a diamond where you could remove the comment from the site? Even if
that’s the case, all of you have the option to flag a comment. Even that wasn't done.&lt;/p&gt;
&lt;p&gt;On a technical level, you evaluated less than 4000 comments. That is a few hours worth of comments on a single week
day. (source: http://data.stackexchange.com/stackoverflow/query/872382) Is that really representative? How did you determine
which comments to use in your evaluation?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The good news is that the comment samples were edited to be "representative" of the problem later.&lt;/p&gt;
&lt;p&gt;Welcoming users is &lt;em&gt;great&lt;/em&gt;. Helping users is the purpose of the site. I fully support all of that. What I don't support is
ignoring the feedback mechanism you've built and told everyone to use for years because someone else with a lot of Twitter
followers put Stack Overflow in a bad light. Yes, it should be fixed and should have been fixed sooner, but the perception
of "listen to the loudest shout" is not a good look.&lt;/p&gt;
&lt;p&gt;Which brings me to...&lt;/p&gt;
&lt;h3 id="removal-from-network-questions-list"&gt;Removal from Network Questions list&lt;a class="headerlink" href="#removal-from-network-questions-list" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;In October, the entire "Twitter shouted, Stack Overflow reacted" cycle repeated itself. This time, a user was offended (while on
Stack Overflow) over the Hot Network Questions list for two questions on another Stack Exchange site. In under an hour,
Stack Overflow (the company) removed that question from the Hot Network Questions list.&lt;/p&gt;
&lt;p&gt;The community in question was shocked by the result. A community manager explained the decision on that &lt;a href="https://interpersonal.meta.stackexchange.com/a/3335/34"&gt;site's meta&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It was the solution we chose - without consulting IPS - because it was effective and easy to implement since it would
fix the perceived problem immediately and there was already a technical solution in place for doing it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Notice a couple things here that stand out to me:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;"perceived problem"&lt;/li&gt;
&lt;li&gt;"without consulting IPS"&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The company knee-capped an entire community and a large source of their traffic (the Hot Network Questions list) because
of a single Twitter comment. Understandably, the community was upset.&lt;/p&gt;
&lt;p&gt;Behind the scenes, it was even worse. On Twitter, the original user posting their complaint was engaged by community moderators.
It didn't go well. Then they complained about that. A Stack Overflow employee jumped into the thread with the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If the DM trolls claimed to be moderators on any of the sites then I'd like to follow up with the community team and see about
getting removed - they take this very seriously.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Turns out that Stack Overflow doesn't value their community moderators. One employee might be misguided, but this Twitter
reply remained active and moderators across the network clamored for an official response. A moderator reached out to the Twitter
user in good faith and was threatened with removal by a Stack Overflow employee.&lt;/p&gt;
&lt;p&gt;One of Stack Exchange's most respected moderators posted &lt;a href="https://medium.com/@cellio/dear-stack-overflow-we-need-to-talk-13bf3f90204f"&gt;their frustrations on Medium&lt;/a&gt;. I highly recommend you read it. One
of the community managers posted a &lt;a href="https://jericson.github.io/2018/10/24/lost_trust.html"&gt;response on their own blog&lt;/a&gt; too.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://meta.stackexchange.com/a/317264/186281"&gt;"super-official &lt;em&gt;almost&lt;/em&gt; response"&lt;/a&gt; was posted even later. This was more than 10 days after the &lt;a href="https://meta.stackexchange.com/q/316934/186281"&gt;original incident&lt;/a&gt;. It
took half a month for a first draft of a "moderator social media guidelines" post to be made in the private Stack Moderators Team.
That post consisted of bullet points on how a moderator should behave on social media. I replied to that post with this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I am underwhelmed by this response. The event that led to this post and recent discussions around Stack Exchange (and the broader
internet) wasn't due to a moderator's bad behavior. Moderators engaged a user on Twitter following the bullets in this post, and
yet stuff still exploded in everyone's face. From my point of view, &lt;em&gt;this&lt;/em&gt; post is so far down the list of responses that I was
hoping to see from Stack Exchange that I'm feeling insulted.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I was asked to hold my judgment until the final draft was posted. That took place in December - two months after the incident. It was
changed from "Social media guidelines" to a "community emergency process". These four bullets were provided:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Introduce yourself and if necessary, your role as moderator of a SO/SE site.&lt;/li&gt;
&lt;li&gt;Offer to help with the situation, and be very respectful if someone declines your assistance. Sometimes, people just want to vent, and the best thing we can do to help is to give them space.&lt;/li&gt;
&lt;li&gt;Be aware of the volatile nature of online discussions; if the path to constructive discourse becomes blurred, it's often best to disengage.&lt;/li&gt;
&lt;li&gt;Keep your interactions with others, concerning SO/SE, as clear and as kind as possible. If things begin to get out of hand, please disengage and let us know about it.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;In short, do exactly what the moderator did initially on Twitter, which resulted in the threat of being removed.&lt;/p&gt;
&lt;h2 id="communication"&gt;Communication&lt;a class="headerlink" href="#communication" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Stack Overflow is slowly isolating itself from the community. There have been multiple comments scattered around the network
saying the employees don't want to engage on any meta. There are community managers that are feeling hated because of complaints
users have made. Users are taking out their anger at being ignored on posts talking about new or unrelated features. In turn,
the employees engage just a little bit less. Lines are being drawn. I see it as a moderator. I see it as a user. Very slowly
the community is trusting the company less and less.&lt;/p&gt;
&lt;p&gt;Everything is becoming "us" vs. "them". There is "the company" vs. "the users". Blog posts, comments, and meta discussions also appear
to be driving a wedge between "the users" and making it "new users" vs. "established users". In the blog post announcing
&lt;a href="https://stackoverflow.blog/2019/03/28/the-next-ceo-of-stack-overflow/"&gt;the search for a new Stack Overflow CEO&lt;/a&gt;, this comment was made by the current CEO:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One thing I’m very concerned about, as we try to educate the next generation of developers, and, importantly, get more
diversity and inclusiveness in that new generation, is what obstacles we’re putting up for people as they try to learn programming.
In many ways Stack Overflow’s specific rules for what is permitted and what is not are obstacles, but an even bigger problem is
rudeness, snark, or condescension that newcomers often see.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The underlying sentiment - improving inclusiveness and diversity - is great. I'm all for that. The rest of it, though, is a dig
at the established community in the same way that the Welcoming blog post was. Stack Overflow's high quality standards are the
problem. It makes the community seem rude and abusive. You should stop closing those questions, stop downvoting new users, and
just be nice. It doesn't say that directly, but that's how &lt;a href="https://meta.stackoverflow.com/q/381927/189134"&gt;existing members&lt;/a&gt; are seeing it. Read under &lt;a href="https://meta.stackoverflow.com/a/381935/189134"&gt;hairboat's answer&lt;/a&gt;
to see some of the simmering feelings of &lt;em&gt;high&lt;/em&gt; reputation users.&lt;/p&gt;
&lt;p&gt;The idea of trust between users and the company is brought up in the comments. This is just another example, in a long list,
where the community and the company are butting heads. Something happens that the community doesn't like - reacting to incidents off
site, focusing on features no one asked for, not explaining why these new features &lt;em&gt;need&lt;/em&gt; to be done, comments made
by one side that paint the other in an unflattering light - and another round of not trusting the company starts.&lt;/p&gt;
&lt;p&gt;The company has had a decade of experience with this community. It's grown, shrunk, and grown again. For most of that time, there has
been fairly open communication and trust. I am afraid that trust has eroded over the last few years and can't be recovered.&lt;/p&gt;
&lt;h2 id="what-can-be-done"&gt;What can be done?&lt;a class="headerlink" href="#what-can-be-done" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The company wants to focus on areas that can bring in more money. In my &lt;a href="https://andrewwegner.com/stack-overflows-problem-feedback-from-an-experienced-user.html"&gt;previous post&lt;/a&gt; I quoted the President and Chief Technology Officer
of the company.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I appreciate that there are a lot of issues on Stack Overflow that need to be addressed, and maybe we haven't been
responding to them as quickly as we should. But Stack Overflow Q&amp;amp;A is a big, established product, most of the problems left are
hard, and we can't let maintenance become the only thing we work on or we'll just slowly run out of money and go out of
business. We are trying to both maintain Q&amp;amp;A and solve new problems for developers and reach new audiences. The latter is hard,
and maybe we'll fail on a lot of our ideas, but we're not going to stop trying. – David Fullerton May 17 at 21:10&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I bemoaned that this sounded like Q&amp;amp;A was feature frozen. It's been nearly two years since that time. I can't remember a new feature that
was introduced into Q&amp;amp;A that helped the community maintain high quality posts. There was a new wizard introduced for new users that
is supposed to help. A quick look at the review queue numbers on Stack Overflow shows that they are still stable at the same point they
were two years ago.&lt;/p&gt;
&lt;p&gt;My suggestion as a user, a moderator and someone interested in seeing Stack Overflow remain successful, is to focus on helping to manage
the quality of your content. Users have been asking &lt;em&gt;for years&lt;/em&gt; to be able to better handle poor content. They've asked for tools (both
system tools and moderator tools). There have been projects started, stopped, restarted, and stopped again that are supposed to
improve quality. Community tools have been built to help deal with quality problems. &lt;em&gt;Use some of this!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Stack Overflow has a data science team. Work with the community directly to help figure out ways to prevent low quality content from
ever getting posted. Force users - all users - to post higher quality content. Work with the communities that have developed automated
tools. Run it with larger data sets. Even if Stack Overflow has to be more conservative than the community tool, if you can prevent
&lt;em&gt;some&lt;/em&gt; of the low quality content from making it to the site you have a victory.&lt;/p&gt;
&lt;p&gt;Obviously the company can't ignore the areas that bring in revenue, but it's becoming increasingly clear that the community is
much less forgiving than they used to be. Continued communication blunders will not help with anything.&lt;/p&gt;
&lt;h2 id="where-to-from-here"&gt;Where to from here?&lt;a class="headerlink" href="#where-to-from-here" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I ended my last post with this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I want Stack Overflow to continue to grow. I also want Stack Overflow to have high quality content. I think my experience
and the experience of others can help build the features to accomplish this. We just need Stack Overflow to refocus on the
Q&amp;amp;A portion of their network again.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I think that holds true today, just as it did 18 months ago. The aspect of the site that draws users in is Q&amp;amp;A. Make it better. Make
the content better. Give users tools to make it better. With all of this, I believe, the "welcoming" aspect will improve. Let the system
handle the low quality stuff automatically. Eliminate the need for users to ask basic questions or remind users to post their code. Let
the system be "the bad guy", and let users interact and &lt;em&gt;help&lt;/em&gt; one another.&lt;/p&gt;
&lt;p&gt;We'll see how everything looks in 18 months. In the meantime, I'll be here, cleaning up the low quality content and prodding the
company to provide improvements to Q&amp;amp;A.&lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/></entry><entry><title>Collecting Diamonds on Stack Exchange</title><link href="https://andrewwegner.com/collecting-diamonds-on-stack-exchange.html" rel="alternate"/><published>2017-08-18T10:06:00-05:00</published><updated>2017-08-18T10:06:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2017-08-18:/collecting-diamonds-on-stack-exchange.html</id><summary type="html">&lt;p&gt;I've picked up a couple moderation diamonds recently. A reflection on the Stack Overflow election.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;It's been over two years since I first ran for moderator on Stack Overflow. &lt;a href="https://andrewwegner.com/i'm-running-to-be-a-moderator-of-stack-overflow.html"&gt;I've run for moderator&lt;/a&gt; &lt;a href="https://andrewwegner.com/i'm-running-for-moderator-on-stack-overflow-again.html"&gt;three times&lt;/a&gt; &lt;a href="https://andrewwegner.com/third-times-the-charm.html"&gt;previously&lt;/a&gt;.
In each election I've done better, coming in fifth in the third election. Well, it's been a little over 8 months since the last one and new moderators are needed
again. I decided to run once more with the knowledge that if I lost, I probably wouldn't run again in the next election. &lt;/p&gt;
&lt;h2 id="nomination-phase"&gt;Nomination Phase&lt;a class="headerlink" href="#nomination-phase" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The nomination phase started off as usual, with a handful of users posting their nomination. This time there were 12 candidates, meaning there would be a primary
to narrow it down to 10 before the final election. My nomination was the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I'm &lt;strong&gt;Andy&lt;/strong&gt;. I've answered the questions posted by the community &lt;a href="https://meta.stackoverflow.com/questions/352386/2017-moderator-election-qa-questionnaire/352388#352388"&gt;here&lt;/a&gt; I encourage you to take a look.&lt;/p&gt;
&lt;h2 id="why-should-you-vote-for-me"&gt;Why should you vote for me?&lt;a class="headerlink" href="#why-should-you-vote-for-me" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;I've been a moderator on &lt;a href="https://communitybuilding.stackexchange.com/"&gt;Community Building&lt;/a&gt; for several years. I was appointed to a moderator position on &lt;a href="https://hardwarerecs.stackexchange.com/"&gt;Hardware Recommendations&lt;/a&gt;. I know the moderator 
tools and have worked with the current moderators.&lt;/li&gt;
&lt;li&gt;I'm active in the review queues (I am the top reviewer in the Low Quality Post reviewers, with over 26,500 reviews in this queue). I also enjoy the other 
moderation aspects of Stack Exchange.&lt;/li&gt;
&lt;li&gt;I believe that moderation can be tool assisted. I've helped to flag a sizable percentage of &lt;a href="https://meta.stackoverflow.com/questions/280546/can-a-machine-be-taught-to-flag-comments-automatically"&gt;comments&lt;/a&gt; on Stack Overflow. I've helped build the community 
&lt;a href="https://meta.stackexchange.com/questions/291301/can-a-machine-be-taught-to-flag-spam-automatically"&gt;spam detection bot&lt;/a&gt;. These types of tools help eliminate the obvious bad stuff so that moderation time can be spent on the less obvious stuff.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I have a history of good community moderation, am here consistently, and believe I can help the current team.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img alt="Candidate Score" src="https://andrewwegner.com/images/2017_candidate_score.png"/&gt;&lt;/p&gt;
&lt;h3 id="nomination-reflections"&gt;Nomination Reflections&lt;a class="headerlink" href="#nomination-reflections" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;An astute reader may have noticed this is pretty similar to previous nomination posts. There are a couple major differences though. The first thing is that I am the 
number one Low Quality Posts reviewer on the site. I am pretty proud of this particular statistic. It shows just how much work I've done during my tenure at Stack 
Overflow to improve the quality of the site. The unfortunate thing is that, as a moderator, I'd probably lose this position because moderators don't sit in the review queues.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Low Quality Posts" src="https://andrewwegner.com/images/number_1_low_quality_reviewer.png"/&gt;&lt;/p&gt;
&lt;p&gt;The other major change was that I had picked up a moderator position on Hardware Recommendations. That happened at the end of June. Hardware Recommendations is about ten 
times the size of Community Building (a site I've moderated for several years). It's also a couple orders of magnitude &lt;em&gt;smaller&lt;/em&gt; than Stack Overflow. &lt;/p&gt;
&lt;h2 id="primary-phase"&gt;Primary Phase&lt;a class="headerlink" href="#primary-phase" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The most exciting part of the election season is the primary phase. The community can see candidates' scores and has built tools to watch those numbers change
over time. It turns out that, this time, my numbers were really high.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Primary Results Table" src="https://andrewwegner.com/images/2017-07-SO-Election-Primary-Results.png"/&gt;&lt;/p&gt;
&lt;p&gt;There were plenty of good people in the election this time. One interesting thing that I found was that a lot of candidates, like me, were supportive of automation. Several
users ran bots that reported low quality content to various chat rooms. This is a big, and welcome, change from previous elections. I think that automated
content quality checking can help a lot.&lt;/p&gt;
&lt;h2 id="election"&gt;Election&lt;a class="headerlink" href="#election" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The election ended on August 1st (&lt;a href="https://andrewwegner.com/a-decade-at-caterpillar.html"&gt;a busy day for me, apparently&lt;/a&gt;). It was a close election. Most surprisingly, no one won in the first round with everyone else
picking up carry-over votes to get second. I think that speaks to the quality of the candidates. After &lt;a href="https://www.opavote.com/results/5927932925050880"&gt;8 rounds in OpaVote&lt;/a&gt;, Cody Gray and I were elected as the two newest moderators on Stack Overflow!&lt;/p&gt;
&lt;p&gt;&lt;img alt="Election Results - OpaVote" src="https://andrewwegner.com/images/2017_opavote_results.png"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt="Election Results" src="https://andrewwegner.com/images/2017_election_results.png"/&gt;&lt;/p&gt;
&lt;h2 id="post-election"&gt;Post Election&lt;a class="headerlink" href="#post-election" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The election ended a few weeks ago. I handled more moderator flags in my first hour as a Stack Overflow moderator than I had at Community Building and
Hardware Recommendations combined. What I'm saying is that Stack Overflow has a &lt;em&gt;ton&lt;/em&gt; of flags that need to be handled. The good news is that, since the election, we have gotten the
moderator queue down from about 1,100 flags to about 75 at any given time. I doubt it will stay that low, but it's still nice to see that I was immediately helpful.&lt;/p&gt;
&lt;p&gt;Finally, since the election I turned off the comment flagging bot. It had been used for just over 3 years. The community is currently &lt;a href="https://meta.stackoverflow.com/q/354719/189134"&gt;debating&lt;/a&gt; whether or not it should
run under a moderator account. The thing that I am finding most interesting about this discussion is that the community seems to agree it's helpful, &lt;a href="https://meta.stackoverflow.com/a/354723/189134"&gt;respects&lt;/a&gt; the 99+%
accuracy, would love for Stack Exchange themselves to run this tool, but doesn't want the bot to run with moderator privileges. There is, however, a very sizable portion of
the community that &lt;em&gt;does&lt;/em&gt; want this done under my account. We'll see how this plays out, but I'm hoping to be able to use the bot again soon.&lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Stack Overflow's Problem - Feedback from an experienced user</title><link href="https://andrewwegner.com/stack-overflows-problem-feedback-from-an-experienced-user.html" rel="alternate"/><published>2017-05-22T23:45:00-05:00</published><updated>2017-05-22T23:45:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2017-05-22:/stack-overflows-problem-feedback-from-an-experienced-user.html</id><summary type="html">&lt;p&gt;Stack Overflow has made several poor decisions in the past few years. I have feedback as an experienced community member on how those decisions are perceived&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Stack Overflow launched in 2008. As it nears its 9th year of operation, I am afraid the resource that I depend on is losing its way. Stack Overflow launched after I graduated college. I can't imagine how helpful it would have been during that time period, but it's been invaluable in my professional career. I &lt;a href="https://stackoverflow.com/users/189134/andy?tab=profile"&gt;joined&lt;/a&gt; the site about a year after its public launch, in October 2009.&lt;/p&gt;
&lt;p&gt;In that time, I've gone from lurker to participant to moderator candidate (several times). I know Stack Overflow and Meta Stack Overflow. I am a moderator on another Stack Exchange site and have a good understanding of how the network operates. I also am one of the &lt;a href="https://andrewwegner.com/images/top_lqp_queue.png"&gt;most prolific reviewers&lt;/a&gt; in the Stack Overflow &lt;a href="https://stackoverflow.com/review/low-quality-posts/stats"&gt;Low Quality Posts review queue&lt;/a&gt; and have built several applications that work with the Stack Exchange API. I am a power user and know the network and the community.&lt;/p&gt;
&lt;p&gt;With those credentials out of the way, I want you to understand that I am active on the network. I am in good standing on Stack Overflow and am not a disgruntled user. I am a concerned user. I am getting more and more concerned that Stack Overflow - the company - is losing its way.&lt;/p&gt;
&lt;p&gt;This post isn't another "Stack Overflow sucks" post (Google if you're curious). I'm going to present a few areas that I'm concerned about and, hopefully, either provide my suggestions for improvement or acknowledge that I don't know the solution but want the team to be aware of the problem in the future. I still believe Stack Overflow is an incredible resource. I'd just like it to fix some of the perceived missteps that have occurred over the past two years.&lt;/p&gt;
&lt;h2 id="whats-going-wrong"&gt;What's going wrong?&lt;a class="headerlink" href="#whats-going-wrong" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;In the past two years, Stack Overflow has made several changes that the established community hasn't liked. Some of these changes still are not liked. These changes include the Teams feature, the new top bar, the Stack Overflow (versus existing Stack Exchange) mobile app, and Documentation. There have also been minor missteps that have caused a rift between portions of the community and the company. These areas include multiple political stances, and a number of post quality improvements that haven't been made.&lt;/p&gt;
&lt;p&gt;Each of these, separately, is a minor problem that could be worked through and moved on from. The problem I'm seeing is that taken together, all of these are causing a rift between users, power users and the company.&lt;/p&gt;
&lt;p&gt;Let's work through each of these items.&lt;/p&gt;
&lt;h3 id="teams"&gt;Teams&lt;a class="headerlink" href="#teams" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Teams was &lt;a href="https://meta.stackoverflow.com/questions/307513/189134"&gt;announced&lt;/a&gt; in October 2015 and &lt;a href="https://meta.stackoverflow.com/questions/308601/189134"&gt;clarified&lt;/a&gt; a week later. It was then &lt;a href="https://meta.stackoverflow.com/questions/330427/189134"&gt;shut down&lt;/a&gt; after nine months. The &lt;a href="https://stackoverflow.com/teams/"&gt;page&lt;/a&gt; it used to go to now has the following blurb (emphasis is mine):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Teams was in private beta for almost a year with 295 teams created and while we believe in its potential value, after a lot of consideration we’ve decided to un-ship the idea for the time being. We’ve realized that making a successful version of the Team page, as we originally proposed would ultimately take more time and resources than we want to devote to it. &lt;strong&gt;Our resources are currently allocated on projects to enhance and improve quality on Q&amp;amp;A, Documentation, and Jobs on Stack Overflow, as a result we don’t have the dedicated developers to get Teams to its fullest potential.&lt;/strong&gt; The intention was to add more features to Teams, but we never expanded it to anything beyond a team description.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The emphasized section sounds good, except that the one section that is taking up a majority of time (Documentation) has its own major issues. The area that many power users want developers to focus on is Q&amp;amp;A.&lt;/p&gt;
&lt;p&gt;The problem with Teams, and many of the projects mentioned in this post, is that it was a feature that pulled focus away from areas the community wanted improved. Meta Stack Overflow has been asking for improvements to reduce the number of low quality posts for years. Moderators have been asking for better tooling. The review queues are overflowing with tasks and the number of users performing reviews isn't high enough to keep up. Teams was built without a true end goal and users weren't entirely sure what to do with it. This was the first in a series of missteps that continue to plague community interactions when new features are announced.&lt;/p&gt;
&lt;h3 id="top-bar"&gt;Top Bar&lt;a class="headerlink" href="#top-bar" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The new top bar was &lt;a href="https://meta.stackoverflow.com/questions/337745/189134"&gt;announced&lt;/a&gt; in November 2016. It went through a &lt;a href="https://meta.stackoverflow.com/questions/341806/189134"&gt;handful&lt;/a&gt; of &lt;a href="https://meta.stackoverflow.com/questions/343103/189134"&gt;iterations&lt;/a&gt; before being &lt;a href="https://meta.stackoverflow.com/questions/343653/189134"&gt;released&lt;/a&gt; in mid-February 2017. During the iterations users provided feedback. When initially released, though, much of this feedback felt ignored. Things like &lt;a href="https://meta.stackoverflow.com/a/343216/189134"&gt;notification overload&lt;/a&gt;, &lt;a href="https://meta.stackoverflow.com/q/343483/189134"&gt;stickiness of the top bar&lt;/a&gt;, and &lt;a href="https://meta.stackoverflow.com/a/346653/189134"&gt;hidden review counts&lt;/a&gt; were all mentioned during the three months of testing but not implemented until the change was live to millions of users.&lt;/p&gt;
&lt;p&gt;After three months of usage, a larger problem was noticed. One of the review queues was &lt;a href="https://meta.stackoverflow.com/q/349118/189134"&gt;constantly full&lt;/a&gt;. One of the changes that was made with this top bar was that the "Review" button no longer linked directly to the "Suggested Edits" review queue. Now it went to the page showing all review queues. Users that used to click once to get to a review queue were now presented with a list of queues to work in. Some of these queues are much more time consuming than others. It turns out the number of &lt;a href="https://meta.stackoverflow.com/a/349125/189134"&gt;reviews being done has decreased significantly&lt;/a&gt; since the top bar was implemented.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/stackoverflow_active_reviewers_per_week.png"&gt;&lt;img alt="Active Reviewer per week" src="https://andrewwegner.com/images/stackoverflow_active_reviewers_per_week.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The spike in reviews in February 2017 is when the new top bar was released. Since that release, the number of reviewers has plummeted. This has been attributed to notification fatigue and not linking users directly to the Suggested Edits queue.&lt;/p&gt;
&lt;p&gt;Three months after implementation, it took the &lt;a href="https://meta.stackoverflow.com/q/349204/189134"&gt;community asking for results&lt;/a&gt; (&lt;em&gt;disclaimer: I asked the question&lt;/em&gt;) to find out how the top bar has been performing. It turns out that the top bar is &lt;a href="https://meta.stackoverflow.com/a/349386/189134"&gt;performing decently well&lt;/a&gt; compared to what the developers were expecting, with the exception of fewer review tasks being performed.&lt;/p&gt;
&lt;p&gt;The problem with this project is that it felt unneeded and has materially impacted one of the quality control features of the site. There is still a vocal group of users that don't like it because it doesn't match the rest of the network. Several are concerned about the review queue problem. Experienced users felt that they were ignored during the beta tests. Users provided feedback and examples of problems, and it was only after implementation, when millions of other users experienced the same thing, that these changes were made.&lt;/p&gt;
&lt;h3 id="mobile-app"&gt;Mobile App&lt;a class="headerlink" href="#mobile-app" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;A recent post (as in last week, at the time of this post) announced a new &lt;a href="https://meta.stackoverflow.com/q/349255/189134"&gt;Stack Overflow mobile application&lt;/a&gt;. The community response was not positive. Users asked why a new application was being built when one already existed (the response was "branding"). Users asked why the new app was less functional than the existing one (it's limited to Stack Overflow versus the entire Stack Exchange network). Users asked why it took a year to develop and why the existing application hasn't received bug fixes in that year.&lt;/p&gt;
&lt;p&gt;I think one of the most disappointing things about this is a &lt;a href="https://meta.stackoverflow.com/questions/349255/stack-overflow-now-has-its-own-app-on-ios-and-android#comment474055_349271"&gt;response&lt;/a&gt; I received in the comments from the VP of Engineering:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;@Andy You're right, it wasn't worth a year. There's a long, sad story here, but it was originally expected to only take a few months and... well, here we are a year later. We decided to go ahead and launch and see what we can learn, and we'll reassess from here. – David Fullerton May 17 at 16:26&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Another user expressed the dissatisfaction in a &lt;a href="https://meta.stackoverflow.com/a/349335/189134"&gt;very pointed way&lt;/a&gt;. They provided a list of features that the community has asked for over the years that many feel have been ignored. The VP's &lt;a href="https://meta.stackoverflow.com/questions/349255/stack-overflow-now-has-its-own-app-on-ios-and-android#comment474166_349335"&gt;response&lt;/a&gt; to this wasn't encouraging either:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I appreciate that there are a lot of issues on Stack Overflow that need to be addressed, and maybe we haven't been responding to them as quickly as we should. But Stack Overflow Q&amp;amp;A is a big, established product, most of the problems left are hard, and we can't let maintenance become the only thing we work on or we'll just slowly run out of money and go out of business. We are trying to both maintain Q&amp;amp;A and solve new problems for developers and reach new audiences. The latter is hard, and maybe we'll fail on a lot of our ideas, but we're not going to stop trying. – David Fullerton May 17 at 21:10&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This sounds like work on the Q&amp;amp;A side is feature frozen at this point. They are done innovating in this area and instead are focused on drawing in users via other features - like Jobs or Documentation. Multiple times in the comments the new app was promoted as being able to use the Dev Story or Jobs features in the future. Perhaps it's just me, but I don't apply for jobs via my phone. That doesn't seem like a good way to really put the effort needed into a cover letter or application.&lt;/p&gt;
&lt;h3 id="documentation"&gt;Documentation&lt;a class="headerlink" href="#documentation" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Now we've reached Documentation. This is the project that's sucked up development time over the past two years. This is the project that Stack Overflow developers are defending tooth and nail and the community has all but given up on.&lt;/p&gt;
&lt;p&gt;Documentation was &lt;a href="https://meta.stackoverflow.com/q/303865/189134"&gt;announced&lt;/a&gt; back in August 2015. It's had a ton of &lt;a href="https://meta.stackoverflow.com/a/340826/189134"&gt;updates&lt;/a&gt; since then. It was met with initial enthusiasm but that quickly turned around. When the system launched for all users, one of the first complaints was that the reputation generated via documentation was doing bad things to the main Q&amp;amp;A site. This led to a &lt;a href="https://meta.stackoverflow.com/q/328703/189134"&gt;massive recalculation of reputation&lt;/a&gt;, and many users lost &lt;em&gt;a lot&lt;/em&gt; of their internet points.&lt;/p&gt;
&lt;p&gt;Another change that was announced was the introduction of a new &lt;a href="https://meta.stackoverflow.com/q/331663/189134"&gt;review queue for documentation&lt;/a&gt;. Initially, developers didn't expect the low quality to begin immediately, it seems. Long time users weren't surprised. Now we've reached the point where the company is realizing that the users knew what they were talking about. &lt;a href="https://meta.stackoverflow.com/q/349410/189134"&gt;Documentation is undergoing a massive change&lt;/a&gt;, to the point that much of it is being completely redone - not fixed - scrapped and redone.&lt;/p&gt;
&lt;p&gt;This project has years' worth of feedback from the community that has been ignored. It is the black sheep of Stack Overflow and many community users feel that quality of the content is lacking so badly that they don't participate any longer. This feeling isn't helped by the fact that many users have been explaining &lt;em&gt;why&lt;/em&gt; things aren't working for a while, and it's only after two years that the developers are starting to realize the private beta testers, public beta testers and experienced community users mentioned many of these problems. In this particular instance, the company took to heart &lt;a href="https://blog.codinghorror.com/listen-to-your-community-but-dont-let-them-tell-you-what-to-do/"&gt;Jeff Atwood's advice&lt;/a&gt; (he is a co-founder of Stack Overflow) to not let the community tell you what to do. To the company's surprise, a community of developers that live in programming documentation had decent thoughts on what does and does not work in programming documentation.&lt;/p&gt;
&lt;h3 id="politics"&gt;Politics&lt;a class="headerlink" href="#politics" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For many users, the lack of true social features on Stack Overflow and across the Stack Exchange network has been a good thing. You can't easily follow a single user, you can't send private messages to a user, and you can't really do anything on the site that isn't public to everyone. The focus is on content, not opinions or social interactions.&lt;/p&gt;
&lt;p&gt;This breaks down once in a while though when a big political thing occurs. The two most frequently mentioned instances are the &lt;a href="https://meta.stackoverflow.com/a/297871/189134"&gt;response&lt;/a&gt; to the &lt;a href="http://s3.documentcloud.org/documents/2111821/obergefell-v-hodges.pdf"&gt;Obergefell v. Hodges&lt;/a&gt; Supreme Court decision and the response to President Trump's &lt;a href="https://meta.stackoverflow.com/q/342440/189134"&gt;initial immigration executive order&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Both of these caused huge uproars within the community when the company took a stand. These stands caused problems due to users holding opposite political views, users not wanting politics on their programming site, users not wanting to deal with the drama caused by the vocal members of the other groups. This led to an &lt;a href="https://meta.stackoverflow.com/q/342903/189134"&gt;apology&lt;/a&gt;. The community wasn't pleased with this apology. Users mentioned in multiple answers to this apology that they don't want the company to post such political agendas on the site. It's out of place for a programmer community. Both of these instances are still brought up on Meta when the community feels that the company is imposing on them.&lt;/p&gt;
&lt;p&gt;I don't really have advice or suggestions on this problem other than "I don't want to see this on Stack Overflow, because these hot button issues cause so much drama that nothing gets accomplished". These posts grind Meta and chatrooms to a halt while everyone expresses their opinion on the post, on the post's existence, on one another and on related issues.&lt;/p&gt;
&lt;h3 id="quality-improvements"&gt;Quality Improvements&lt;a class="headerlink" href="#quality-improvements" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Finally, the community has been asking for years about ways to improve the quality of posts on the site. Stack Exchange &lt;a href="https://meta.stackexchange.com/q/285889/186281"&gt;started a project to improve the quality&lt;/a&gt; back in October 2016. This generated 80 different suggestions on how the community sees "quality improvement" taking place. Since then there haven't been any updates on the status of this project or even subprojects.&lt;/p&gt;
&lt;p&gt;This was brought up during all of the projects listed above by long time users. The hope was that this quality project would help. Being ignored hasn't brought any good feelings. The lower quality has been measurable and has led to less participation from experienced users.&lt;/p&gt;
&lt;h2 id="the-fix"&gt;The Fix&lt;a class="headerlink" href="#the-fix" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Above I've pointed out several issues that I've seen over the past two years. These issues are part of a bigger problem though. It seems that Stack Overflow doesn't know how to handle its community size any longer. It's in the top 300 sites visited in the US and receives &lt;a href="https://www.quantcast.com/stackoverflow.com#trafficCard"&gt;half a billion views globally per month&lt;/a&gt;. Couple this with the fact that they &lt;a href="https://meta.stackoverflow.com/questions/349255/stack-overflow-now-has-its-own-app-on-ios-and-android#comment474175_349335"&gt;don't have a sustainable business model yet&lt;/a&gt; and have a sizable team with good benefits and they are getting concerned.&lt;/p&gt;
&lt;p&gt;Q&amp;amp;A is what built Stack Overflow, but it isn't enough to sustain them. Thus, the other projects are being created. Unfortunately, in this process, it seems the company is neglecting its existing user base in favor of expanding to new users. Existing users are getting frustrated with the lack of quality improvements, with being ignored and with not getting changes that benefit their use cases.&lt;/p&gt;
&lt;p&gt;Documentation has taken up a giant chunk of time and developer effort and it's all been wasted. The &lt;a href="https://meta.stackoverflow.com/q/349410/189134"&gt;announcement&lt;/a&gt; that it is being redone has been met with "thanks" from the community, along with warnings to consider that "quality" problem. We'll see how it plays out, or if that quality issue is ignored like their own Quality Project.&lt;/p&gt;
&lt;p&gt;Which brings us to the final point I want to make. I think the feeling of Q&amp;amp;A being "done" is the biggest problem I've had with Stack Overflow over the last year. New features aren't being built in that space. Instead of focusing on some of the "hard" problems, the company is throwing stuff at the wall and hoping something will stick. Unfortunately, the four biggest projects in the last year have either failed completely (Teams, Documentation, Mobile App...perhaps) or have significant unintended consequences that aren't helping the quality issue users have been reporting for years.&lt;/p&gt;
&lt;p&gt;Power users, the underlying community that has put time and effort into growing Stack Overflow to what it is today, are feeling ignored. It is only after months or years long experiments fail that community opinions are finally validated or considered. Users have expressed concerns in each of the above projects repeatedly. Yet, those opinions were not addressed. The silos that the developers have built around themselves are causing the company to lose touch with its community. This comes at the cost of alienating the users that care and at the cost of developer time.&lt;/p&gt;
&lt;p&gt;Users want a high quality site with answers to their questions. Even new or potentially new users want this. Stack Overflow continues to avoid dealing with that problem because "it's hard". The unfortunate thing is, this is costing the site &lt;a href="http://data.stackexchange.com/stackoverflow/query/674690#graph"&gt;users that return to provide more than one answer&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/stackoverflow_return_users.png"&gt;&lt;img alt="Stack Overflow Return Answerers" src="https://andrewwegner.com/images/stackoverflow_return_users.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This chart shows the number of answers provided per month by different types of users: users that have provided more than 100 answers, between 11 and 100 answers, between 2 and 10 answers and only a single answer. The furthest data point on the right is an artifact of being an incomplete month. From this chart, we can see that the only group that has continued to rise over time is users that provide a single answer. The other groups took a steep drop in April 2014 and haven't recovered since then. The number of experienced users that are participating has dropped.&lt;/p&gt;
&lt;p&gt;What happened in April 2014? That's been &lt;a href="https://meta.stackoverflow.com/a/320234/189134"&gt;answered&lt;/a&gt; by a Stack Overflow community manager. The theory is that users aren't getting answers to their questions and due to being ignored they never return to participate further in the site. Another community manager also provided an &lt;a href="https://meta.stackoverflow.com/a/320440/189134"&gt;answer&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Starting around 2013 and peaking around March, 2014, people began asking fewer interesting questions. That lead to a decrease in voting on questions and fewer answers being given. Since the feedback on these uninteresting questions was discouraging, people began asking fewer questions on the whole. Meanwhile, truly poor questions continued being asked with little regard to negative feedback.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Stack Overflow users began noticing increasing numbers of truly awful questions and decided, rightly, that downvoting and refusing to answer them is the best remedy. These questions fit broad categories of awful and users began withholding votes from questions that were not themselves awful, but bore some of the markers of awful. Fewer of these questions got answered and askers of mediocre questions did not see any point in trying to improve.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;Thus began a slow spiral downward. Not all is lost though, because there are the upticks. I hope it's enough to break the cycle, but I really fear that something needs to be done about this quality issue. This is the issue that is brought up by the experienced community.  &lt;/p&gt;
&lt;h2 id="where-to-from-here"&gt;Where to from here?&lt;a class="headerlink" href="#where-to-from-here" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I continue to invest my time and effort into the community, but even as an active user who really wants the company and community to succeed, it's getting harder and harder to ignore that those of us that have been around for years are not being listened to any more. We're being treated as the grumpy old person that grumbles about the way things used to be. Our experiences on the site are brushed aside as being unhelpful to new users. That completely ignores the fact that we are still trying to reach the goal on which Stack Overflow was created: &lt;a href="https://stackoverflow.com/tour"&gt;"With your help, we're working together to build a library of detailed answers to every question about programming."&lt;/a&gt; To do this, we need high quality questions and answers so that we can actually provide help to all users. &lt;a href="https://andrewwegner.com/stack-overflows-still-has-issues-and-its-getting-worse.html"&gt;I think &lt;em&gt;this&lt;/em&gt; is the biggest challenge that Stack Overflow is going to face in the next 18 months&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I want Stack Overflow to continue to grow. I also want Stack Overflow to have high quality content. I think my experience and the experience of others can help build the features to accomplish this. We just need Stack Overflow to refocus on the Q&amp;amp;A portion of their network again.&lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/></entry><entry><title>Can a machine be taught to flag spam automatically</title><link href="https://andrewwegner.com/can-a-machine-be-taught-to-flag-spam-automatically.html" rel="alternate"/><published>2017-02-19T22:51:00-06:00</published><updated>2017-02-20T00:00:00-06:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2017-02-19:/can-a-machine-be-taught-to-flag-spam-automatically.html</id><summary type="html">&lt;p&gt;Description of how a group of people helped completely eliminate spam on the Stack Exchange network&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;This post was originally &lt;a href="http://meta.stackexchange.com/q/291301/186281"&gt;published&lt;/a&gt; on Meta Stack Exchange on February 20, 2017. I've republished it here
so that I can easily update information related to recent developments. If you have questions or comments, I highly
encourage you to visit the &lt;a href="http://meta.stackexchange.com/q/291301/186281"&gt;question&lt;/a&gt; on Meta Stack Exchange and post there.&lt;/p&gt;
&lt;p&gt;The post was featured across the entire Stack Exchange network for a week, too. This drove a huge amount of traffic
to the question and resulted in some valuable feedback:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/spam-featured-announcement.png"&gt;&lt;img alt="Featured Announcement" src="https://andrewwegner.com/images/spam-featured-announcement.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;TL;DR: &lt;a href="http://charcoal-se.org/people.html"&gt;We&lt;/a&gt; did it, so... yes.&lt;/p&gt;
&lt;hr/&gt;
&lt;h2 id="what-is-this"&gt;What is this?&lt;a class="headerlink" href="#what-is-this" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Charcoal is the &lt;a href="http://charcoal-se.org/people.html"&gt;organization&lt;/a&gt; behind the &lt;a href="https://github.com/Charcoal-SE/SmokeDetector"&gt;SmokeDetector&lt;/a&gt; bot and other &lt;a href="https://github.com/Charcoal-SE"&gt;nice things&lt;/a&gt;. This bot scans new 
posts across the entire network for spam posts and reports them to &lt;a href="https://github.com/Charcoal-SE/SmokeDetector/wiki/Chat-Rooms"&gt;various chatrooms&lt;/a&gt; where people can act on them. 
If a post has been created or edited, anywhere on the network, we've probably seen it. The bot utilizes our knowledge 
of how spammers work and what they have previously posted to come up with common patterns and rules to detect spam in
the new and updated posts. You've likely seen the SmokeDetector bot if you visit chatrooms such as
&lt;a href="http://chat.meta.stackexchange.com/rooms/89/tavern-on-the-meta"&gt;Tavern on the Meta&lt;/a&gt;, &lt;a href="http://chat.stackexchange.com/rooms/11540/charcoal-hq"&gt;Charcoal HQ&lt;/a&gt;, &lt;a href="http://chat.stackoverflow.com/rooms/41570/so-close-vote-reviewers"&gt;SO Close Vote Reviewers&lt;/a&gt; and others across the network. Over time, the 
bot has become very accurate. &lt;/p&gt;
&lt;p&gt;Now we are leveraging the years of data and accuracy to automatically cast spam flags. With approximately 58,000 posts 
to draw from and over 46,000 true positives, we have a vast trove of data to utilize.&lt;/p&gt;
&lt;h2 id="what-problem-does-this-address"&gt;What problem does this address?&lt;a class="headerlink" href="#what-problem-does-this-address" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;To put it simply, &lt;strong&gt;spam&lt;/strong&gt;. Stack Exchange is one of the most popular networks of websites on the Internet, and &lt;em&gt;all&lt;/em&gt; 
of it gets spammed at some point. Our statistics show that we see about 100 spam posts per day, on average over the 
last three months. &lt;/p&gt;
&lt;p&gt;A decent chunk of this isn't the type you'd want to see at work (or at all). The faster we can get this off the home 
page, the better for all involved. Unfortunately, it's not unheard of for spam to last several hours, even on the 
larger sites such as Graphic Design.&lt;/p&gt;
&lt;p&gt;Over the past three years, efforts with Smokey have significantly cut the time it takes for spam to be deleted. This 
project is an extension of that, and it's now well within reach to delete spam within seconds of it being posted.&lt;/p&gt;
&lt;h2 id="what-are-we-doing"&gt;What are we doing?&lt;a class="headerlink" href="#what-are-we-doing" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;For over 3 years, SmokeDetector has reported potential spam across the Stack Exchange network so that users can flag 
the posts as appropriate. Users then tell the bot whether each detection was correct or not 
(referred to as "feedback"). This feedback is stored in our web dashboard, &lt;a href="https://metasmoke.erwaysoftware.com/"&gt;metasmoke&lt;/a&gt; (&lt;a href="https://github.com/Charcoal-SE/metasmoke"&gt;code&lt;/a&gt;). Over time, we've 
used this feedback to evaluate our patterns ("reasons") and improve our accuracy. &lt;a href="https://metasmoke.erwaysoftware.com/reason/106"&gt;Several&lt;/a&gt; of our &lt;a href="https://metasmoke.erwaysoftware.com/reason/21"&gt;reasons&lt;/a&gt; 
are over 99.9% &lt;a href="https://metasmoke.erwaysoftware.com/reason/61"&gt;accurate&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Early last year, and after getting a baseline accuracy from &lt;a href="http://stackoverflow.com/users/1933347/jmac"&gt;jmac&lt;/a&gt; (thank you!), we realized we could use the 
system to automatically cast spam flags. On Stack Overflow the current accuracy of users flagging spam posts is 85.7%. 
Across the rest of the network users are 95.4% accurate. We determined we can beat those numbers and eliminate spam 
from Stack Overflow and the rest of the network even faster. &lt;/p&gt;
&lt;p&gt;Without going into too much detail (if you really want it, it's available on our &lt;a href="https://charcoal-se.org/flagging"&gt;website&lt;/a&gt;), we leverage the 
accuracy of each existing reason to come up with a weight indicating how certain the system is that a post is spam. If 
this value exceeds a specific threshold, the system will cast up to three spam flags on the post. We cast multiple 
flags utilizing a number of different users' accounts and the Stack Exchange API. Via metasmoke, users are given the 
opportunity to &lt;a href="https://metasmoke.erwaysoftware.com/flagging/ocs"&gt;enable their accounts to be used to flag spam&lt;/a&gt; (You can too, if you've made it this far). When a 
post is eligible for flagging because it exceeded the threshold set by each individual user, accounts are randomly 
selected from the pool of enabled users to cast a single flag each, up to a maximum of three per post so that we never 
unilaterally nuke something.&lt;/p&gt;
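&lt;p&gt;The selection step described above can be sketched roughly as follows. This is only an illustration and not the actual metasmoke code: the reason names, the weight values, the user names and the assumption that a post's weight is the sum of the weights of its matched reasons are all hypothetical.&lt;/p&gt;

```python
import random

# Hypothetical per-reason weights. The real values are derived from the
# historical accuracy of each reason; these numbers are made up.
REASON_WEIGHTS = {
    "blacklisted website": 99,
    "bad keyword in title": 90,
    "pattern-matching website": 60,
}

# Never cast more than three automatic flags on one post.
MAX_FLAGS_PER_POST = 3


def post_weight(reasons):
    """Assumed scoring rule: add up the weights of every matched reason."""
    return sum(REASON_WEIGHTS.get(reason, 0) for reason in reasons)


def pick_flaggers(reasons, user_thresholds, rng=random):
    """Randomly pick up to three users whose own threshold the post exceeds."""
    weight = post_weight(reasons)
    eligible = [
        user for user, threshold in user_thresholds.items()
        if weight > threshold
    ]
    count = min(MAX_FLAGS_PER_POST, len(eligible))
    return rng.sample(eligible, count)
```

&lt;p&gt;Calling pick_flaggers with the reasons matched on a post and a mapping of user names to their chosen thresholds returns at most three distinct users, each of whom would cast a single flag. A post that no user's threshold allows simply gets no automatic flags.&lt;/p&gt;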
&lt;h2 id="what-are-our-safety-checks"&gt;What are our safety checks?&lt;a class="headerlink" href="#what-are-our-safety-checks" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We designed the entire system with accuracy and sanity checks in mind. Our design collaborations are available for 
your browsing pleasure (&lt;a href="https://docs.google.com/document/d/1Bg0u4oY9W_skp79wSnyQWttUIBH8WV46JELDGJ7Bixo/edit"&gt;RFC 1&lt;/a&gt;, &lt;a href="https://docs.google.com/document/d/1voGyl3BUA1JHJ0pR2Mf9E5-wmIDUFC1G8HcThiS7B1k/edit"&gt;RFC 2&lt;/a&gt;, RFC 3 (no longer available)). The major things that make this system safe and sane 
are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We give users a choice as to how accurate they want to be with their automatic flags. Before casting any flags, we 
 check that the preferences the user has set result in a spam detection accuracy of over 99.5% over a sample of at 
 least 1000 posts. Remember, the current accuracy of humans is 85.7% on SO and network wide it is 95.4%. &lt;/li&gt;
&lt;li&gt;We do not unilaterally spam nuke a post, regardless of how sure we are it is spam. This means that a human &lt;em&gt;must&lt;/em&gt; 
 be involved to finish off a post, even on the few sites with lower spam thresholds.&lt;/li&gt;
&lt;li&gt;We’ve designed the system to be tolerant of faults - if there’s a malfunction anywhere in the system, any user with 
 access to SmokeDetector can immediately halt all automatic flagging - this includes all network moderators. If this 
 happens, it needs a system administrator to step in to re-enable flags.&lt;/li&gt;
&lt;li&gt;We've discussed this with a community manager and have their &lt;a href="http://chat.stackexchange.com/transcript/message/35437121#35437121"&gt;blessing&lt;/a&gt; on the project.&lt;/li&gt;
&lt;/ul&gt;
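&lt;p&gt;As a rough illustration of the first check in the list, the accuracy gate could look something like the sketch below. The function and constant names are hypothetical; only the 99.5% accuracy floor and the 1000 post minimum sample come from the description above.&lt;/p&gt;

```python
MIN_ACCURACY = 0.995  # preferences must be "over 99.5%" accurate
MIN_SAMPLE = 1000     # measured over "at least 1000 posts"


def preferences_are_safe(true_positives, total_posts):
    """Return True only if the sample is large enough and accurate enough.

    true_positives: historical posts matching the user's preferences
                    that were confirmed as spam.
    total_posts:    all historical posts matching the user's preferences.
    """
    if MIN_SAMPLE > total_posts:
        return False
    return true_positives / total_posts > MIN_ACCURACY
```

&lt;p&gt;Under these assumptions, preferences that matched 996 spam posts out of 1000 would pass, while 995 out of 1000 (exactly 99.5%) or any sample smaller than 1000 posts would not.&lt;/p&gt;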
&lt;h2 id="results"&gt;Results&lt;a class="headerlink" href="#results" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We have been casting an average of 60-70 automatic flags per day for over two months, for a total of just over 4000 
flags network wide. These flags were cast by 22 different users. In that time, we've had &lt;a href="https://metasmoke.erwaysoftware.com/flagging/logs?filter=fps"&gt;four&lt;/a&gt; false positives. 
We would like to be able to automatically cancel these particular cases. This isn't possible though, so we've created 
a feature request to &lt;a href="http://meta.stackexchange.com/questions/288120/allow-retracting-flags-from-the-api"&gt;retract flags via the API&lt;/a&gt;. In the meantime, the flags are either manually retracted by the 
user or declined by a moderator.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/spam-weights-and-accuracies.png"&gt;&lt;img alt="Weights and Accuracy" src="https://andrewwegner.com/images/spam-weights-and-accuracies.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The above graph plots the weight of the reasons against their overall volume of reports and accuracy. As minimum weight 
increases, accuracy (yellow line and rightmost Y-axis) and total reports (blue line) on the left-hand scale increase. 
The green line represents the number of true positives, which are verified by SmokeDetector user feedback.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/spam-autoflags-per-day.png"&gt;&lt;img alt="Automatic Flags per day" src="https://andrewwegner.com/images/spam-autoflags-per-day.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This shows the number of posts we've automatically flagged per day over the last month. The jump on February 15th is 
due to increasing the number of automatic flags from 1 per post to 3 per post. You can see a live version of this graph 
on &lt;a href="https://metasmoke.erwaysoftware.com/flagging"&gt;metasmoke's autoflagging page&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/spam-spam-hours.png"&gt;&lt;img alt="Spam Hours" src="https://andrewwegner.com/images/spam-spam-hours.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Spam arrives on Stack Exchange in waves. It is easy to see the time of day that many spam reports come in. The hours, 
above, are UTC time. The busiest time of day for spam is the 8 hour block between 4am and Noon. We have affectionately 
named this "spam hour" in the chat room. &lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/spam-average-time-to-delete.png"&gt;&lt;img alt="Average Time to Deletion" src="https://andrewwegner.com/images/spam-average-time-to-delete.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Our goal is to delete spam quickly and accurately. The graph shows the time it takes for a reported spam post to be 
removed from the network. This section has three trend lines that show these averages. The first, red section is when 
we were simply reporting the posts to chatrooms and all flags had to come from users. You can see we are pretty constant 
in the time it takes to remove spam during this period. It took, on average, just over five minutes to get a post 
removed.&lt;/p&gt;
&lt;p&gt;The green trend line is when we were issuing a single automatic flag. At implementation, we eliminated a full minute 
from time to deletion and after a month we'd eliminated two full minutes compared to no automatic flags.&lt;/p&gt;
&lt;p&gt;The last section, the orange, is when we implemented three automatic flags on most sites. This was rolled out last 
week, but it has already dramatically improved the time to deletion. We are seeing a time to deletion of between 1 and 
2 minutes.&lt;/p&gt;
&lt;p&gt;As mentioned above, spam arrives in waves. The dashed and dotted lines on the graph show the average deletion time 
during these two different time periods. The dashed lines show deletion time between 4am and Noon UTC, the dotted lines 
show the rest of the 24 hour period. An interesting thing this graph shows is that time to deletion during spam hour 
was higher when we didn't cast any automatic flags. It was removed faster outside of spam hour. That reversed when we 
started issuing a single auto-flag. The spam hour time to deletion is slightly lower than the average. Comparing the 
two time periods though, time to deletion during non-spam hour at the end of the non-flagging time period and the end 
of the single flag period are roughly the same. &lt;/p&gt;
&lt;p&gt;We'll update these in a few weeks too, to better show the trend we are seeing with three automatic flags.  &lt;/p&gt;
&lt;h2 id="discussion"&gt;Discussion&lt;a class="headerlink" href="#discussion" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We are confident in SmokeDetector and the three years of history it has. We've had many talented developers assist us 
over the years and many more users have provided feedback to improve our detection rules. Let us know what you want us 
to elaborate on, features you're wondering about or would like to see added, or things we might have missed in the 
process or the tooling. Take a look at the &lt;a href="http://meta.stackexchange.com/questions/288120/allow-retracting-flags-from-the-api"&gt;feature&lt;/a&gt; we'd really like Stack Exchange to consider so that we can 
further improve this system (and some of the other community built systems). We'll have &lt;a href="http://charcoal-se.org/people.html"&gt;Charcoal members&lt;/a&gt; hanging 
around and answering your questions. Alternatively, feel free to drop into &lt;a href="http://chat.stackexchange.com/rooms/11540/charcoal-hq"&gt;Charcoal HQ&lt;/a&gt; and have a chat. &lt;/p&gt;</content><category term="Programming Projects"/><category term="Stack Exchange"/><category term="machine learning"/><category term="automation"/><category term="programming"/></entry><entry><title>Third time's the charm?</title><link href="https://andrewwegner.com/third-times-the-charm.html" rel="alternate"/><published>2016-11-06T22:54:00-06:00</published><updated>2015-11-28T00:00:00-06:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2016-11-06:/third-times-the-charm.html</id><summary type="html">&lt;p&gt;It's been a year since Stack Overflow's last election. I'm running again. Will the third time be the charm, or third strike and I'm out?&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Last year, &lt;a href="https://andrewwegner.com/i'm-running-to-be-a-moderator-of-stack-overflow.html"&gt;I ran for moderator&lt;/a&gt; (&lt;a href="https://andrewwegner.com/i'm-running-for-moderator-on-stack-overflow-again.html"&gt;twice&lt;/a&gt;) on Stack Overflow and didn't make it through the primaries. I came 
close on that second run. Now, a year later, and a year more experienced, I'm going to try again. This post will 
document my progress through the election cycle.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Spoiler Alert&lt;/em&gt;: I didn't win. The rest of this post details my thoughts as the election occurred though.&lt;/p&gt;
&lt;h2 id="nomination-phase"&gt;Nomination Phase&lt;a class="headerlink" href="#nomination-phase" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The election this year took a slightly different route than last time. In previous years, the election was announced
at the same time as the call for nominations began. Users had a week to nominate themselves, then we answered a series
of community provided questions during the primaries, then the final election. &lt;/p&gt;
&lt;p&gt;This year, the election was announced a week in advance of nominations. During the week, a call was put out for 
&lt;a href="http://meta.stackoverflow.com/q/337191/189134"&gt;Community questions&lt;/a&gt;. When a nomination was posted, the answers would be posted as well. This change was made due
to how much the community needed to read during the primaries. The primaries were only a few days long and the Q&amp;amp;As 
were usually ten questions for each user. When a primary has 20-30 nominees, that is &lt;em&gt;a lot&lt;/em&gt; of reading that was 
expected in a short period of time. By bringing this phase forward, now the community has the &lt;em&gt;entire&lt;/em&gt; election cycle
to read and interact with the nominees. &lt;/p&gt;
&lt;p&gt;I provided &lt;a href="http://meta.stackoverflow.com/a/337238/189134"&gt;one question&lt;/a&gt; that was used in the final selection of questions. I mentioned &lt;a href="https://andrewwegner.com/i'm-running-for-moderator-on-stack-overflow-again.html"&gt;last time&lt;/a&gt; that I 
thought it was a great question, so I suggested it again:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Do you have any Meta posts that you're particularly proud of, or that you feel best demonstrate your moderation style?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id="my-nomination"&gt;My nomination&lt;a class="headerlink" href="#my-nomination" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;My &lt;a href="http://stackoverflow.com/election/8?tab=nomination#post-40473869"&gt;platform&lt;/a&gt; isn't all that different from the last two times. &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Hi Everyone, I'm &lt;strong&gt;Andy&lt;/strong&gt; and I'd like to be a moderator for you and Stack Overflow. I've answered the questions posted by the community &lt;a href="http://meta.stackoverflow.com/a/337574/189134"&gt;here&lt;/a&gt;. I encourage you to take a look.&lt;/p&gt;
&lt;h3 id="why-should-you-vote-for-me"&gt;Why should you vote for me?&lt;a class="headerlink" href="#why-should-you-vote-for-me" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;I've been a moderator on &lt;a href="http://communitybuilding.stackexchange.com"&gt;Community Building&lt;/a&gt; for over two years. I know the moderator tools and have worked with many of the current moderators. This interaction will continue as a new moderator here. &lt;/li&gt;
&lt;li&gt;I have a lot of helpful flags. A decent percentage of these are on &lt;a href="http://meta.stackoverflow.com/questions/280546/"&gt;comments&lt;/a&gt;, but not all. I'd like to help keep the site clean without adding to the current moderators' work load. &lt;/li&gt;
&lt;li&gt;I'm active in the review queues (currently holding 5th in &lt;a href="http://stackoverflow.com/review/low-quality-posts/stats"&gt;Low Quality Post reviewers&lt;/a&gt; of all time), provide edits to posts, answers and enjoy the moderation aspect of Stack Exchange.&lt;/li&gt;
&lt;li&gt;I have a &lt;a href="http://meta.stackoverflow.com/users/189134/"&gt;history on Meta.SO&lt;/a&gt; that shows I'm involved in the meta aspect of the site as well.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I enjoy the moderation aspect on Stack Overflow (and Stack Exchange in general). I have a history of good community moderation, am here all the time and believe I can help the current team.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;During the first full day, I've gotten positive responses to this post. My two favorites, so far, are:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Andy's work around comment flags has been very impressive. I'm definitely curious to see what his thoughts on the mod 
queue are and if we could incorporate some of his work permanently on the site. Better identification of flags is 
something that would be very nice to have permanently.  - &lt;a href="http://stackoverflow.com/users/426671/bluefeet"&gt;bluefeet&lt;/a&gt; &lt;em&gt;Stack Overflow Community Manager&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;and&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There are always some nominees for this position who are very active, some who have good judgment and cool heads, and 
some who innovate with their approach to community moderation. Andy is the rare candidate who very clearly checks all 
three boxes. As a user on SO for 3.5 years, a moderator pro tempore on Engineering SE for 1.7 years and an early 
participant in the Community Building SE beta, I strongly support this nomination. - &lt;a href="http://stackoverflow.com/users/2359271/air"&gt;Air&lt;/a&gt; &lt;a href="http://engineering.stackexchange.com/"&gt;&lt;em&gt;Moderator on Engineering.SE&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;My candidate score this time is an impressive 39/40. This is up six from a year ago, and up ten from my first run. The
one missing point is due to missing the &lt;a href="http://stackoverflow.com/help/badges/4369/refiner"&gt;Refiner&lt;/a&gt; badge. I believe the reason for this is my workflow. I generally
don't edit and answer questions at the same time. If I'm answering, I'm not in "edit" mode. If I'm editing,
I'm usually in "moderation" mode. It's something I'll work on. I'm 38 out of 50 questions there, so I'll get it soon
enough.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Candidate Score" src="https://andrewwegner.com/images/november_2016_candidate_score.png"/&gt;&lt;/p&gt;
&lt;h3 id="candidate-questions"&gt;Candidate questions&lt;a class="headerlink" href="#candidate-questions" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;None of the questions were that surprising. With the added benefit of a week to prepare answers prior to nominating,
I am very pleased with my &lt;a href="http://meta.stackoverflow.com/a/337574/189134"&gt;answers&lt;/a&gt;. Two answers have generated a bit of discussion though.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;A 10k+ user regularly has their comments flagged as "rude or offensive" or "not constructive", to the tune of 
4-5 flags a day. No comment by itself is particularly offensive, but their general tone causes them to be flagged 
by multiple users. You've contacted them privately about this, but they believe that they aren't doing anything wrong 
and that people are being too sensitive. The flags keep coming in on their comments. What, if anything, do you do 
next?&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;My response is: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;No one has an exemption from the Be Nice policy. I think the first step is to understand why nothing has already been 
done about the user. 4-5 a day seems like the user has moved beyond the "nuisance" stage. I think a temporary ban is 
appropriate, with another explanation as to what is expected when interacting with others. While some users are more 
sensitive than others, a stream of this many flags across an extended period of time doesn't lead me to believe the 
problem is with the community users.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The point raised in the comments was that I was rushing into banning the user without communicating first. I disagree
with that, and explained that they've already been contacted privately and ignored those warnings. A ban is the next 
step in getting the user's attention. I was told this would be "humiliating" for a high rep user. Again, I disagree
and believe it's not humiliating, but &lt;a href="http://meta.stackoverflow.com/questions/337571/2016-stack-overflow-moderator-election-qa-questionnaire/337574#comment411104_337574"&gt;educating&lt;/a&gt; the user.&lt;/p&gt;
&lt;p&gt;The second question that generated some discussion was:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;You impose a temporary ban (say 1 week) on a user for what you judged as reasonable and valid reasons (the user 
gets notified by email of your action and the reason). The user replies to your email acknowledging the transgression,
says they won't do it again and asks for the ban to be lifted. The user sounds genuine. Do you remove the ban? Do you 
even reply at all? Explain your reasoning. The context of this question applies to longer bans too. If it helps get 
the juices flowing, consider the situation of a second offence for the same behaviour, which has a default ban 
period of 1 month.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;My response:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I have two answers for this question, based on the user's history. If this is a first offense, up to this point the 
user hasn't been pushing limits and attempting to disrupt others, and the ban isn't related to voting fraud, then I'd 
be willing to remove the ban. Sometimes a ban is put in place to get the &lt;a href="http://meta.stackoverflow.com/questions/288229/why-was-balusc-temporarily-suspended-from-so"&gt;user's attention&lt;/a&gt;. Once the situation 
has been resolved, the ban is no longer appropriate and should be removed.&lt;/p&gt;
&lt;p&gt;On the other hand, if the user has a history of crossing the line and looking for a reaction, or if the ban is 
related to vote fraud, I'd simply not reply and the user will return in a week. Stack Overflow has enough "voting 
irregularity" bans that I imagine the responses to such bans are all similar (and invalid). I see no reason to 
change that policy.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The pushback I received on this was that I was letting a user off the hook by unbanning them. I argued that unbanning
has been done in the past. Sometimes the ban is needed simply to get the user's attention and start the conversation
and explain that what they are doing is wrong. If the user abuses the trust at that point and repeats the behavior, then
the longer ban is completely justified. A bit of compassion isn't a bad thing. &lt;/p&gt;
&lt;h2 id="primary-phase"&gt;Primary Phase&lt;a class="headerlink" href="#primary-phase" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;There are 12 nominees, so a primary will occur. Once again, the primary phase will reduce the number of candidates in
the final phase to 10. With so few being eliminated this time around, it feels a little unneeded. The primary will last 
for a few days and during that time users can vote candidates up or down depending on whether they believe the nominee
should be a moderator. I'll return in a few days...&lt;/p&gt;
&lt;h3 id="primary-results"&gt;Primary Results&lt;a class="headerlink" href="#primary-results" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The Primary phase has ended and the final election has begun. I ended the primary in 5th place, securing a position 
in the final election. I have a sizable margin between my position and sixth place as well. One other stat that I'm
rather proud of: I received the fewest number of down votes of any candidate. &lt;/p&gt;
&lt;p&gt;&lt;img alt="Primary Results" src="https://andrewwegner.com/images/2016-Fall-SO-Primary-Results.png"/&gt;&lt;/p&gt;
&lt;p&gt;On to the election! &lt;/p&gt;
&lt;h2 id="election-phase"&gt;Election Phase&lt;a class="headerlink" href="#election-phase" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The election lasts for several days and covers a weekend. We'll see how it turns out in a few days. &lt;/p&gt;
&lt;h3 id="election-results"&gt;Election Results&lt;a class="headerlink" href="#election-results" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Well, the election has concluded. I didn't secure one of the three positions for moderator. I finished in &lt;a href="http://www.opavote.com/results/6488198410665984/0"&gt;5th place&lt;/a&gt;,
with my elimination propelling second and third place to a victory. I was eliminated in the 10th round of the Meek STV
process. &lt;/p&gt;
&lt;p&gt;Good luck to the new moderators!&lt;/p&gt;
&lt;h2 id="post-election-thoughts"&gt;Post Election thoughts&lt;a class="headerlink" href="#post-election-thoughts" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;This election started differently than the previous two I've run in. This election was announced a week in advance
and solicited community input for questions for the candidates. I think this was a good change. The element of 
surprise in the previous two made it much more stressful. Additionally, by having the questions available at the start
of the election - instead of at the start of the primary phase - I was able to better answer the questions. Previously,
the questions would be available at the start of the primary phase. With the amount of reading needed to get through
one candidate's answers, let alone all of them, I imagine that many people didn't read all of the responses. &lt;/p&gt;
&lt;p&gt;The other nice thing about this lead time is that I had time to get my answers ready for when I posted my nomination. 
By posting the questions and answers at the same time, I was able to have my responses available the entire time. 
Score-wise, on the questionnaire, I did much better than my opponents. I think a big reason for this is that I had my
responses posted as soon as my nomination was posted.&lt;/p&gt;
&lt;p&gt;One question this time, though, seemed to split the candidates. I mentioned it previously, but it was regarding 
potentially removing a temporary ban. &lt;/p&gt;
&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;You impose a temporary ban (say 1 week) on a user for what you judged as reasonable and valid reasons (the user 
gets notified by email of your action and the reason). The user replies to your email acknowledging the transgression,
says they won't do it again and asks for the ban to be lifted. The user sounds genuine. Do you remove the ban? Do you 
even reply at all? Explain your reasoning. The context of this question applies to longer bans too. If it helps get 
the juices flowing, consider the situation of a second offence for the same behaviour, which has a default ban 
period of 1 month.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;I was one of two candidates that explicitly stated we'd consider removing the ban. A third user didn't state it 
explicitly, but did say they'd consider it. I was surprised by the harsh tone the others took, especially since there
are many previous discussions on Meta where the outcome is the moderators or community managers removing the ban. I 
was happy to see that the other candidate who said they'd consider removing the ban got elected though.&lt;/p&gt;
&lt;p&gt;I still believe that removing the ban is a valid option. Especially because their &lt;em&gt;next&lt;/em&gt; ban would be much longer if 
they broke my trust. &lt;/p&gt;
&lt;p&gt;We'll see when the next election on Stack Overflow is, but with three new moderators and no resignations, I suspect
it'll be a while. I'll consider running again then. &lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>I'm running for moderator on Stack Overflow again</title><link href="https://andrewwegner.com/i'm-running-for-moderator-on-stack-overflow-again.html" rel="alternate"/><published>2015-11-18T09:38:00-06:00</published><updated>2015-12-08T00:00:00-06:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2015-11-18:/i'm-running-for-moderator-on-stack-overflow-again.html</id><summary type="html">&lt;p&gt;Stack Overflow is having a second election this year. I'm throwing my hat in the ring again. This entry follows the process.&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;In April, &lt;a href="https://andrewwegner.com/i'm-running-to-be-a-moderator-of-stack-overflow.html"&gt;I ran for moderator&lt;/a&gt; on Stack Overflow and didn't make it through the primaries. That's ok though, there were several very good users that did get &lt;a href="http://stackoverflow.com/election/6"&gt;elected&lt;/a&gt;. In a surprise announcement, though, Stack Overflow is running a second election this year. This is the first time this has happened since 2011. I'm still interested in a position and I'm still active in the community, so I'm going to run again. This post will follow the process.&lt;/p&gt;
&lt;h2 id="nomination-phase"&gt;Nomination Phase&lt;a class="headerlink" href="#nomination-phase" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Like last time, the nomination phase began with users throwing their hat into the ring. Nominations were slower and fewer this time. Only 19 nominees, so no one was eliminated due to low reputation. Several users from the last election are rerunning too. &lt;/p&gt;
&lt;h3 id="my-platform"&gt;My Platform&lt;a class="headerlink" href="#my-platform" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;My &lt;a href="http://stackoverflow.com/election/7#post-33617646"&gt;platform&lt;/a&gt; hasn't changed much since the previous run. Below is my nomination post. This time, I tried to pull emphasis off the automated script by putting it lower on the list of things I've done and instead focused on the moderation tasks I do on Stack Overflow and the work I've done on &lt;a href="http://communitybuilding.stackexchange.com"&gt;Community Building&lt;/a&gt;. We'll see if it works.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Hi Everyone, I'm &lt;strong&gt;Andy&lt;/strong&gt; and I'd make a great moderator for Stack Overflow.&lt;/p&gt;
&lt;h3 id="why-vote-for-me"&gt;Why vote for me?&lt;a class="headerlink" href="#why-vote-for-me" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;I'm active in the review queues (currently holding 10th in &lt;a href="http://stackoverflow.com/review/low-quality-posts/stats"&gt;Low Quality Post reviewers&lt;/a&gt; of all time), provide edits to posts, &lt;a href="http://stackoverflow.com/users/189134/andy?tab=answers"&gt;answers&lt;/a&gt; and enjoy the moderation aspect of Stack Exchange&lt;/li&gt;
&lt;li&gt;I've been a moderator on &lt;a href="http://communitybuilding.stackexchange.com"&gt;CommunityBuilding&lt;/a&gt; for nearly a year and a half. I know the moderator tools and have worked with several of the current moderators. This interaction will continue as a new moderator here. &lt;/li&gt;
&lt;li&gt;I've built an automated script that continues to handle &lt;a href="http://meta.stackoverflow.com/q/280546/189134"&gt;noisy comments&lt;/a&gt; very &lt;a href="http://i.stack.imgur.com/GK32p.png"&gt;accurately&lt;/a&gt;. &lt;/li&gt;
&lt;li&gt;I have a &lt;a href="http://meta.stackoverflow.com/users/189134/andy"&gt;history&lt;/a&gt; on Meta.SO that shows I'm involved in the meta aspect of the site as well.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I have a history of good community moderation already. I enjoy the moderation aspect on Stack Overflow (and Stack Exchange in general). I deal with users with respect, even if our opinions on an issue differ. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;With this, I received my "candidate score". It was 33/40. Not the highest, but better than last time. The score wasn't mentioned in April. I am not expecting it to be an issue this time either.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Candidate Score" src="https://andrewwegner.com/images/november_2015_candidate_score.png"/&gt;&lt;/p&gt;
&lt;h2 id="primary-phase"&gt;Primary Phase&lt;a class="headerlink" href="#primary-phase" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Updated November 21, 2015&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The primary phase is in its third day. On day 1, I was hovering around 9th/10th place. Overnight, between days 1 and 2, though, I dropped down to 11th. I've been sitting here consistently for a full day now and, while still gaining votes, I'm not gaining as fast as 10th position. It appears I may not make the cutoff by Friday's deadline. While disappointing, there are a few things that I came away with that I'm very happy about.&lt;/p&gt;
&lt;p&gt;In the last primary, I received 1,492 positive votes. I've surpassed that already. I have over 2,100 currently. I'm pleased with that upswing. I was also more prepared for the questionnaire portion of the primaries this time. I've gotten the second highest number of upvotes on my &lt;a href="http://meta.stackoverflow.com/a/310357/189134"&gt;responses&lt;/a&gt;. Several of the questions were similar to last time, but there are a few that I think should be included in the future elections.&lt;/p&gt;
&lt;h3 id="questionnaire"&gt;Questionnaire&lt;a class="headerlink" href="#questionnaire" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This first question is a &lt;em&gt;great&lt;/em&gt; post for candidates. It allows them to show off their involvement in Meta and show their best work. For users, it gives them a sense of how a candidate interacts with the community. I am very surprised that several candidates listed only one or two posts. They seem to be doing themselves a disservice.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Do you have any Meta posts that you're particularly proud of, or that you feel best demonstrate your moderation style?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;My response to this question:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I'm proud of several of my posts both here on Meta.SO and on other network sites I participate in. Here on MSO, I have two questions that I am proud of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://meta.stackoverflow.com/questions/280546/can-a-machine-be-taught-to-flag-comments-automatically"&gt;Can a machine be taught to flag comments automatically?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://meta.stackoverflow.com/questions/300916/i-estimate-10-of-the-links-posted-here-are-dead-how-do-we-deal-with-them"&gt;I estimate 10% of the links posted here are dead. How do we deal with them?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In both of these, you can see that I care about quality on Stack Overflow. I've spent time analyzing the problem, as I see it, and present my findings to the community. I participated in the discussions that both posts generated and &lt;a href="http://i.stack.imgur.com/XQoP5.png"&gt;continue to run&lt;/a&gt; the bot to this day. &lt;/p&gt;
&lt;p&gt;Elsewhere on the network, my participation in meta has helped to shape communities. For example, on &lt;a href="http://hardwarerecs.stackexchange.com"&gt;Hardware Recommendations&lt;/a&gt;, my meta post about &lt;a href="http://meta.hardwarerecs.stackexchange.com/a/81/57"&gt;"What type of hardware is allowed"&lt;/a&gt; helped to set the scope of what the community accepts as on topic hardware. I've also helped to set up the &lt;a href="http://meta.hardwarerecs.stackexchange.com/a/206/57"&gt;high quality guidelines for questions&lt;/a&gt; and &lt;a href="http://meta.hardwarerecs.stackexchange.com/a/274/57"&gt;argued against certain types of tags&lt;/a&gt; and &lt;a href="http://meta.hardwarerecs.stackexchange.com/a/257/57"&gt;hardware&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;With all of these, I've presented my arguments and logic and strived to remain professional. I believe the community on HardwareRecs has seen that as well.&lt;/p&gt;
&lt;p&gt;As a moderator on &lt;a href="http://communitybuilding.stackexchange.com"&gt;Community Building&lt;/a&gt;, I've been involved in many &lt;a href="http://meta.communitybuilding.stackexchange.com/users/78/andy?tab=summary"&gt;discussions&lt;/a&gt;. I was involved in the discussions to &lt;a href="http://meta.communitybuilding.stackexchange.com/q/175/78"&gt;rename&lt;/a&gt; the community from Moderators.SE to CommunityBuilding.SE. I've been involved in discussions about slow &lt;a href="http://meta.communitybuilding.stackexchange.com/q/151/78"&gt;growth of the community&lt;/a&gt;. I've also presented &lt;a href="http://meta.communitybuilding.stackexchange.com/a/1274/78"&gt;arguments&lt;/a&gt; that go against other moderators, and walked away still feeling like a moderation team. (Go communication!)&lt;/p&gt;
&lt;p&gt;Finally, on OpenSource, I made a &lt;a href="http://meta.opensource.stackexchange.com/q/642/22"&gt;post&lt;/a&gt; about how moderators had implemented a policy to watch the reviewers. It was similar to the long removed "flag weight" option that used to exist. I believe the post was presented in a way that questioned the decisions of the moderators, yet remained professional. &lt;/p&gt;
&lt;p&gt;With all of these meta posts, across the network, I think you can pick up on my moderation style and personality. I like data and I try to present my thoughts in a way that is understandable to all. I'm also willing to speak my mind.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr/&gt;
&lt;p&gt;I struggled with this second question for a bit. I've had ideas on how Stack Overflow/Stack Exchange could improve, but I had to decide which one I wanted to present in this response. &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you could add/revise one Stack Overflow policy/guideline, what would you change? Why would you change it, and what would it mean for the community?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;My response to this question: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;At the risk of talking myself out of a position, I think more community moderation would help the problem that Stack Overflow has with scaling moderators. There are a couple areas that I think would work well in opening this to the higher reputation users&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Comment flagging: Comments can be removed if enough users flag a comment. If not, a moderator needs to handle the flag. Instead, opening this as a review queue can remove a lot of this burden from the moderators. Users could handle all but the "Other..." flag. There may be guidance needed on the "Obsolete" one due to the difference between "obsolete comment" and "obsolete code block" differences. &lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Audit Review reviews: On Stack Overflow, we get a decent number of &lt;a href="http://meta.stackoverflow.com/questions/tagged/disputed-review-audits"&gt;disputed audit review&lt;/a&gt; posts on meta. There may be a way to get users with a history of passing both audits and good reviews involved in dealing with these disputed audits. The idea would be to say whether an audit is good or not. &lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These changes, and other areas where the community could be leveraged for moderation tasks, helps to remove the burden on moderators. Handling 2,000 (and growing) flags a day means that something needs to change. Moderators are exception handlers. They should be handling the cases that are exceptional - not comments that are no longer relevant. &lt;/p&gt;
&lt;p&gt;For the community, this would be more &lt;a href="http://meta.stackexchange.com/questions/252844/make-comment-flags-less-stupid"&gt;involvement&lt;/a&gt; with the moderation aspect. Users would be able to more quickly clean up a comment thread. Flag it and it appears in the review queue. From here, the moderators don't need to be involved. The downside of this is that it adds another queue for users to be involved with. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id="primary-results"&gt;Primary Results&lt;a class="headerlink" href="#primary-results" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;With the primaries over, I ended with &lt;strong&gt;2483&lt;/strong&gt; positive votes. This put me in 11th place. Sadly, this was not enough to get into the election. I was 185 votes shy of overtaking 10th. Good luck to the candidates that made it.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Live Vote Counter at the end of the Primary" src="https://andrewwegner.com/images/2015-11-21_22_43_25-SO_2015_Election_Vote_Monitor.png"/&gt;&lt;/p&gt;
&lt;p&gt;One of the tools that came out of this election was a way to &lt;a href="http://meta.stackoverflow.com/q/310694/189134"&gt;visualize&lt;/a&gt; various data points to compare candidates. I provided a &lt;a href="http://meta.stackoverflow.com/a/310736/189134"&gt;couple of notes about the outliers&lt;/a&gt; that various candidates show in different aspects of the site. I found it interesting to see what each user had "specialized" in. &lt;/p&gt;
&lt;h2 id="election"&gt;Election&lt;a class="headerlink" href="#election" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Updated December 8, 2015&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The election is over and the new moderators have settled in. We've had our first bout of public drama over one of these moderators actually moderating a chat room too. &lt;em&gt;gasp&lt;/em&gt;&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;a class="headerlink" href="#final-thoughts" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I was closer to the top 10 this time, but still missed it. Even more surprising was that the user that ended up in 3rd in the primaries didn't even come close to getting elected. He was eliminated in the 5th round of final STV votes. I still think I'd make a great moderator for Stack Overflow, but I need to figure out the best way to promote myself in the next election.&lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="moderation"/></entry><entry><title>Link Analysis - Technical Explanation</title><link href="https://andrewwegner.com/link-analysis---technical-explanation.html" rel="alternate"/><published>2015-08-10T23:41:00-05:00</published><updated>2015-08-10T23:41:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2015-08-10:/link-analysis---technical-explanation.html</id><summary type="html">&lt;p&gt;Approximately 10% of links on the Stack Overflow are unavailable. This is an analysis of how I determined that and a discussion on how to improve it&lt;/p&gt;</summary><content type="html">
&lt;h2 id="introduction"&gt;Introduction&lt;a class="headerlink" href="#introduction" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;In my last two posts, I've discussed the number of &lt;a href="https://andrewwegner.com/analysis-of-links-posted-to-stack-overflow.html"&gt;rotten links&lt;/a&gt; on Stack Overflow and a &lt;a href="https://andrewwegner.com/a-proposal-to-fix-broken-links-on-stack-overflow.html"&gt;proposal to fix said links&lt;/a&gt;. In this post, I'm going to discuss how I performed this analysis. &lt;/p&gt;
&lt;h2 id="set-up"&gt;Set up&lt;a class="headerlink" href="#set-up" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="the-database"&gt;The database&lt;a class="headerlink" href="#the-database" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The process began by downloading the March 2013 &lt;a href="https://archive.org/details/stackexchange"&gt;data dump&lt;/a&gt;. I loaded the &lt;code&gt;posts&lt;/code&gt; into a &lt;a href="https://mariadb.org/"&gt;MariaDB&lt;/a&gt; instance on my local machine. This was accomplished with a very simple script and patience, as the script took a while to run.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nb"&gt;load&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;xml&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;local&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;infile&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'/path/to/posts.xml'&lt;/span&gt;
&lt;span class="n"&gt;into&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;
&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;identified&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'&amp;lt;row&amp;gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id="the-data"&gt;The data&lt;a class="headerlink" href="#the-data" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Once this was done, the next step was selecting my random sample of data. I did this by randomly selecting 25% of the days in a year and then pulling all posts for those days across all years of Stack Overflow's existence. The Python script I used to do this was fairly simple:&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;datetime&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;random&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;randint&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;math&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ceil&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;random_date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_seconds&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt;

&lt;span class="n"&gt;percentage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.25&lt;/span&gt;
&lt;span class="n"&gt;days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;366&lt;/span&gt;

&lt;span class="n"&gt;dayslist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;xrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;percentage&lt;/span&gt;&lt;span class="p"&gt;))):&lt;/span&gt;
    &lt;span class="n"&gt;dayslist&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2008&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2008&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;At the end of this run, the days that I cared about are in the &lt;code&gt;dayslist&lt;/code&gt; variable. I used that to pull questions and answers from the database that were created on that month/day combination. In the end, this resulted in just over 25% of the total posts being selected. To ensure that I could replicate the results, I also saved the dates that were selected.&lt;/p&gt;
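&lt;p&gt;As a rough illustration of that selection step, the sketch below keeps only the posts whose creation month and day match one of the sampled dates. It is written for Python 3, and the &lt;code&gt;select_posts&lt;/code&gt; helper and the &lt;code&gt;created&lt;/code&gt; key are assumptions made for this example, not the original query or schema.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;from datetime import datetime


def select_posts(posts, dayslist):
    """Keep posts created on any month/day pair that appears in dayslist."""
    # Compare on (month, day) only, so a date sampled from 2008 matches every year.
    wanted = {(d.month, d.day) for d in dayslist}
    return [p for p in posts if (p["created"].month, p["created"].day) in wanted]


# Hypothetical data: the 'created' key is an assumption, not the real schema.
sample_days = [datetime(2008, 7, 31), datetime(2008, 2, 29)]
posts = [
    {"id": 1, "created": datetime(2012, 7, 31, 9, 30)},
    {"id": 2, "created": datetime(2014, 8, 1, 12, 0)},
]
selected = select_posts(posts, sample_days)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Matching on month and day rather than the full date is what lets a single sampled day pull in posts from every year of the data dump.&lt;/p&gt;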
&lt;h2 id="parsing-the-data"&gt;Parsing the data&lt;a class="headerlink" href="#parsing-the-data" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The next step was to parse out links from the data. I used the following script to extract anchor text and links from a post. &lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def&lt;span class="w"&gt; &lt;/span&gt;links_in_post(post):
&lt;span class="w"&gt;    &lt;/span&gt;"""
&lt;span class="w"&gt;    &lt;/span&gt;Returns&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;list&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;all&lt;span class="w"&gt; &lt;/span&gt;links&lt;span class="w"&gt; &lt;/span&gt;found
&lt;span class="w"&gt;    &lt;/span&gt;:param&lt;span class="w"&gt; &lt;/span&gt;posts:&lt;span class="w"&gt; &lt;/span&gt;A&lt;span class="w"&gt; &lt;/span&gt;list&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;dictionaries&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;'body'&lt;span class="w"&gt; &lt;/span&gt;key&lt;span class="w"&gt; &lt;/span&gt;containing&lt;span class="w"&gt; &lt;/span&gt;HTML&lt;span class="w"&gt; &lt;/span&gt;strings
&lt;span class="w"&gt;     &lt;/span&gt;[
&lt;span class="w"&gt;        &lt;/span&gt;{
&lt;span class="w"&gt;            &lt;/span&gt;'body':&lt;span class="w"&gt; &lt;/span&gt;"&lt;span class="nt"&gt;&amp;lt;b&amp;gt;&lt;/span&gt;This&lt;span class="w"&gt; &lt;/span&gt;is&lt;span class="w"&gt; &lt;/span&gt;HTML&lt;span class="nt"&gt;&amp;lt;/b&amp;gt;&lt;/span&gt;"
&lt;span class="w"&gt;        &lt;/span&gt;},
&lt;span class="w"&gt;    &lt;/span&gt;]
&lt;span class="w"&gt;    &lt;/span&gt;:return:&lt;span class="w"&gt; &lt;/span&gt;A&lt;span class="w"&gt; &lt;/span&gt;list&lt;span class="w"&gt; &lt;/span&gt;of&lt;span class="w"&gt; &lt;/span&gt;tuples&lt;span class="w"&gt; &lt;/span&gt;containing&lt;span class="w"&gt; &lt;/span&gt;anchor&lt;span class="w"&gt; &lt;/span&gt;text&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;URL
&lt;span class="w"&gt;        &lt;/span&gt;[
&lt;span class="w"&gt;            &lt;/span&gt;('Display&lt;span class="w"&gt; &lt;/span&gt;Text',&lt;span class="w"&gt; &lt;/span&gt;'http://example.com')
&lt;span class="w"&gt;        &lt;/span&gt;]
&lt;span class="w"&gt;    &lt;/span&gt;"""
&lt;span class="w"&gt;    &lt;/span&gt;logging.debug("Extracting&lt;span class="w"&gt; &lt;/span&gt;links...")
&lt;span class="w"&gt;    &lt;/span&gt;links&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;[]
&lt;span class="w"&gt;    &lt;/span&gt;images&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;[]
&lt;span class="w"&gt;    &lt;/span&gt;regexp&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;"&lt;span class="ni"&gt;&amp;amp;.+?;&lt;/span&gt;"
&lt;span class="w"&gt;    &lt;/span&gt;list_of_html&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;re.findall(regexp,&lt;span class="w"&gt; &lt;/span&gt;post)
&lt;span class="w"&gt;    &lt;/span&gt;for&lt;span class="w"&gt; &lt;/span&gt;e&lt;span class="w"&gt; &lt;/span&gt;in&lt;span class="w"&gt; &lt;/span&gt;list_of_html:
&lt;span class="w"&gt;        &lt;/span&gt;if&lt;span class="w"&gt; &lt;/span&gt;e&lt;span class="w"&gt; &lt;/span&gt;in&lt;span class="w"&gt; &lt;/span&gt;invalid_entities:
&lt;span class="w"&gt;            &lt;/span&gt;h&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;HTMLParser.HTMLParser()
&lt;span class="w"&gt;            &lt;/span&gt;unescaped&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;h.unescape(e)&lt;span class="w"&gt; &lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;post&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;post.replace(e,&lt;span class="w"&gt; &lt;/span&gt;unescaped)

&lt;span class="w"&gt;    &lt;/span&gt;doc&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;html.fromstring(post)
&lt;span class="w"&gt;    &lt;/span&gt;for&lt;span class="w"&gt; &lt;/span&gt;link&lt;span class="w"&gt; &lt;/span&gt;in&lt;span class="w"&gt; &lt;/span&gt;doc.xpath('//a'):
&lt;span class="w"&gt;        &lt;/span&gt;links.append(Link(text=link.text_content(),&lt;span class="w"&gt; &lt;/span&gt;link=link.get('href')))
&lt;span class="w"&gt;    &lt;/span&gt;for&lt;span class="w"&gt; &lt;/span&gt;image&lt;span class="w"&gt; &lt;/span&gt;in&lt;span class="w"&gt; &lt;/span&gt;doc.xpath('//img'):
&lt;span class="w"&gt;        &lt;/span&gt;images.append(Link(text=image.get('alt'),&lt;span class="w"&gt; &lt;/span&gt;link=image.get('src')))
&lt;span class="w"&gt;    &lt;/span&gt;all_items&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;links&lt;span class="w"&gt; &lt;/span&gt;+&lt;span class="w"&gt; &lt;/span&gt;images
&lt;span class="w"&gt;    &lt;/span&gt;seen&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;set()
&lt;span class="w"&gt;    &lt;/span&gt;unique_items&lt;span class="w"&gt; &lt;/span&gt;=&lt;span class="w"&gt; &lt;/span&gt;[item&lt;span class="w"&gt; &lt;/span&gt;for&lt;span class="w"&gt; &lt;/span&gt;item&lt;span class="w"&gt; &lt;/span&gt;in&lt;span class="w"&gt; &lt;/span&gt;all_items&lt;span class="w"&gt; &lt;/span&gt;if&lt;span class="w"&gt; &lt;/span&gt;item[1]&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;in&lt;span class="w"&gt; &lt;/span&gt;seen&lt;span class="w"&gt; &lt;/span&gt;and&lt;span class="w"&gt; &lt;/span&gt;not&lt;span class="w"&gt; &lt;/span&gt;seen.add(item[1])]
&lt;span class="w"&gt;    &lt;/span&gt;return&lt;span class="w"&gt; &lt;/span&gt;unique_items
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The regular expression finds HTML entities in the post so that the problematic ones (those listed in &lt;code&gt;invalid_entities&lt;/code&gt;) can be unescaped before parsing. This was needed due to weird parsing issues with non-ASCII characters. Fortunately, I wasn't the &lt;a href="http://stackoverflow.com/a/13939198/189134"&gt;first to encounter oddities like this&lt;/a&gt;. The list comprehension at the end of the function keeps only the first anchor text/link tuple for each URL, so every URL is returned once per post. I was surprised how often I'd end up with tuples such as &lt;code&gt;('this', 'http://google.com')&lt;/code&gt; more than once in the same post. This uniqueness saved a lot of processing time later.&lt;/p&gt;
&lt;p&gt;After all links in a post had been extracted, the links and information about the post itself were saved to the database. If a post had no links, it was not saved. The database consisted of three tables.&lt;/p&gt;
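&lt;p&gt;The de-duplication step can also be written as a small standalone helper, which may be easier to read than the one-line comprehension. This is a Python 3 sketch; the &lt;code&gt;Link&lt;/code&gt; named tuple is assumed to have &lt;code&gt;text&lt;/code&gt; and &lt;code&gt;link&lt;/code&gt; fields (matching the keyword arguments used in the function above), and &lt;code&gt;unique_by_url&lt;/code&gt; is a hypothetical name.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;from collections import namedtuple

# Assumed shape of the Link tuple used by links_in_post.
Link = namedtuple("Link", ["text", "link"])


def unique_by_url(items):
    """Return items in their original order, keeping the first one seen per URL."""
    seen = set()
    unique = []
    for item in items:
        if item.link not in seen:
            seen.add(item.link)
            unique.append(item)
    return unique


links = [
    Link("this", "http://google.com"),
    Link("this", "http://google.com"),
    Link("docs", "http://example.com"),
]
deduplicated = unique_by_url(links)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that uniqueness is decided by the URL alone, so two different anchor texts pointing at the same URL collapse into the first one.&lt;/p&gt;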
&lt;ul&gt;
&lt;li&gt;Links - This table contains the base URLs seen in all posts. URLs are distinct. It also contains an ID that will be utilized for linking to the other tables.&lt;/li&gt;
&lt;li&gt;Post Links - This table contains information about links in a post. This includes the specific anchor text/link combinations.&lt;/li&gt;
&lt;li&gt;Link Results - This table contains the results of link status checks.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Processing the posts was fairly time consuming, but the work parallelized easily, which cut the processing time significantly.&lt;/p&gt;
&lt;h2 id="checking-the-links"&gt;Checking the links&lt;a class="headerlink" href="#checking-the-links" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The most time consuming portion of this entire project was actually checking link status. Each link that appeared in the &lt;code&gt;Links&lt;/code&gt; table was checked. As I mentioned in my first post, the original idea was to simply send a &lt;code&gt;HEAD&lt;/code&gt; request to each URL. The idea was to save myself and the end point a tiny amount of bandwidth. I had over a million links to process. I figured a little saved bandwidth wouldn't hurt.&lt;/p&gt;
&lt;p&gt;It turns out this isn't a good idea. When I started seeing larger sites as not being accessible, I got suspicious that something was wrong. These sites were returning status 405 errors. This indicates that the method is not allowed. So, I switched to &lt;code&gt;GET&lt;/code&gt; for every link. The next problem I ran into was that many sites didn't like the default user agent of the spider. They rejected requests with 404 and 401 errors. In the end, I got around this by mimicking Firefox on every request. &lt;/p&gt;
&lt;p&gt;With those kinks worked out, every link was sent a &lt;code&gt;GET&lt;/code&gt; request that looked to be from a Firefox browser. The process would allow 20 seconds per link. If the link didn't respond in that time limit, it was declared inaccessible. &lt;/p&gt;
&lt;p&gt;A week later, I repeated the process with every link that had failed to return a status code below 400. On the third week, I repeated it once more with the links that were still failing. At the end of three weeks, I had a list of sites that were inaccessible to me - on a residential connection - three times over a period of three weeks.&lt;/p&gt;
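&lt;p&gt;The checking rules described above (a &lt;code&gt;GET&lt;/code&gt; request, a Firefox-style user agent, a 20 second limit, and re-running only the failures) can be sketched with the Python 3 standard library. This is not the spider that was used for the project. The user agent string and the function names are assumptions made for illustration.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;import http.client
import urllib.error
import urllib.request

# Assumed Firefox-style user agent; the exact string used originally is not known.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
TIMEOUT_SECONDS = 20


def build_request(url):
    """Build a GET request that identifies itself with the browser-like user agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT}, method="GET")


def check_link(url):
    """Return the HTTP status code, or None when no usable response arrived in time."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=TIMEOUT_SECONDS) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def is_accessible(status):
    """A link counts as accessible when it returned a status code below 400."""
    return status is not None and status in range(100, 400)


def still_failing(urls, checker=check_link):
    """Run one round of checks and return only the URLs that failed it."""
    return [url for url in urls if not is_accessible(checker(url))]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Calling &lt;code&gt;still_failing&lt;/code&gt; once a week and feeding each result into the next call reproduces the three-round schedule. Passing a different &lt;code&gt;checker&lt;/code&gt; makes it possible to try the logic without touching the network.&lt;/p&gt;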
&lt;h2 id="results"&gt;Results&lt;a class="headerlink" href="#results" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The &lt;a href="https://andrewwegner.com/images/status_codes.svg"&gt;SVG image&lt;/a&gt; that I created for the write up was generated with Pygal. The tables were the result of various breakdowns of the data via queries to the status results. &lt;/p&gt;
&lt;h2 id="wrap-up"&gt;Wrap up&lt;a class="headerlink" href="#wrap-up" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I am rather proud of how the results turned out for this project. I went into it expecting about 15% of links to be broken, but I didn't really realize what that meant. Fifteen percent of 21 million total posts is over 3 million. That's a large number. But that estimate also ignored the fact that a large percentage of posts don't contain links. I failed to consider that in my original hypothesis.&lt;/p&gt;
&lt;p&gt;Less than half of my sample had links (2.3M out of 5.6M). Of the 2.3M with links, only 1.5M were unique links. The final result of 10% failed links makes much more sense in this context. Ten percent of 1.5M links means that there are 150K links that are bad. &lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="programming"/></entry><entry><title>A proposal to fix broken links on Stack Overflow</title><link href="https://andrewwegner.com/a-proposal-to-fix-broken-links-on-stack-overflow.html" rel="alternate"/><published>2015-08-07T07:34:00-05:00</published><updated>2015-08-07T07:34:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2015-08-07:/a-proposal-to-fix-broken-links-on-stack-overflow.html</id><summary type="html">&lt;p&gt;My proposal to decrease the number of broken links on Stack Overflow&lt;/p&gt;</summary><content type="html">
&lt;p&gt;This post was &lt;a href="http://meta.stackoverflow.com/q/300916/189134"&gt;published&lt;/a&gt; by &lt;a href="http://meta.stackoverflow.com/users/189134/andy?tab=profile"&gt;me&lt;/a&gt; on Meta Stack Overflow on August 7th, 2015. I've republished it here
so that I can easily update information related to recent developments. If you have questions or comments, I highly
encourage you to visit the &lt;a href="http://meta.stackoverflow.com/q/300916/189134"&gt;question&lt;/a&gt; on Meta Stack Overflow and post there.&lt;/p&gt;
&lt;p&gt;This is a follow up to &lt;a href="https://andrewwegner.com/analysis-of-links-posted-to-stack-overflow.html"&gt;yesterday's post&lt;/a&gt; about how many links on Stack Overflow are starting to rot.&lt;/p&gt;
&lt;hr/&gt;
&lt;h2 id="the-proposal"&gt;The proposal&lt;a class="headerlink" href="#the-proposal" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I propose &lt;a href="http://meta.stackoverflow.com/a/301002/189134"&gt;another hybrid&lt;/a&gt; of the previous &lt;a href="http://meta.stackexchange.com/questions/224895/what-happened-to-the-broken-link-review-queue"&gt;broken link&lt;/a&gt; queue (as was mentioned &lt;a href="http://meta.stackoverflow.com/questions/300916/i-estimate-10-of-the-links-posted-here-are-dead-how-do-we-deal-with-them#comment229798_300916"&gt;above&lt;/a&gt; in &lt;a href="http://meta.stackoverflow.com/questions/300916/i-estimate-10-of-the-links-posted-here-are-dead-how-do-we-deal-with-them#comment229795_300916"&gt;comments&lt;/a&gt; and &lt;a href="http://meta.stackoverflow.com/a/300998/189134"&gt;other&lt;/a&gt; &lt;a href="http://meta.stackoverflow.com/a/300996/189134"&gt;answers&lt;/a&gt;) and an automated process to fix broken links with an archived version (which has also been &lt;a href="http://meta.stackoverflow.com/a/301001/189134"&gt;suggested&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The broken link queue should focus on editing and fixing the links in a post (as opposed to closing it). It'd be similar to the suggested edits queue, but with the focus on correcting &lt;em&gt;links&lt;/em&gt;, not spelling and grammar. This could be done by only allowing a user to edit the links.&lt;/p&gt;
&lt;p&gt;One possibility I envision is presenting the user with the links in the post and a status on whether or not each link is available. If a link is not available, give the user a way to change that specific link. Utilizing &lt;a href="http://stackoverflow.com/a/2054063/189134"&gt;this&lt;/a&gt; post, I have a quick mock up of what I propose such a review task would look like:&lt;/p&gt;
&lt;h2 id="the-queue"&gt;The Queue&lt;a class="headerlink" href="#the-queue" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/brokenlinkqueue.png"&gt;&lt;img alt="Broken Link Mock up" src="https://andrewwegner.com/images/brokenlinkqueue.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;All the links that appear in the post are on the right hand side of the screen. The links that are accessible have a green check mark. The ones that are broken (and the reason for being in this queue) have a red X. When a user elects to fix a post, they are presented with a modal showing only the broken URLs.&lt;/p&gt;
&lt;hr/&gt;
&lt;h2 id="the-automation"&gt;The Automation&lt;a class="headerlink" href="#the-automation" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;With this queue, though, I think an automated process would be helpful as well. The idea is that this would operate similarly to the Low Quality queue, where the system can automatically add a post to the queue if certain criteria are met &lt;em&gt;or&lt;/em&gt; a user can flag a post as having broken links. I've based my idea on what Tim Post outlined in the &lt;a href="http://meta.stackexchange.com/questions/130398/does-stack-exchange-crawl-websites/198357#comment741544_198357"&gt;comments to a previous post&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Automated process performs a "Today in History" type check. This keeps the fixes limited to a small subset of posts per day. It also focuses on older posts, which are more likely to have a broken link than something posted recently. Example: On July 31, 2015, the only posts being checked for bad links would be anything posted on July 31 in any year 2008 through current year - 1.&lt;/li&gt;
&lt;li&gt;Utilizing the &lt;a href="http://archive.org/about/wayback_api.php"&gt;Wayback Machine API&lt;/a&gt;, or similar service, the system attempts to change broken links into an archived version of the URL. This archived version should probably be from "close" to the time the post was originally made. If the automated process isn't able to find an archived version of the link, the post should be tossed into the Broken Link queue.&lt;/li&gt;
&lt;li&gt;When the Community edits a post to fix a link, a new Post History event is utilized to show that a link was changed. This would allow anyone looking at revision history to easily see that a specific change was only to fix links.&lt;/li&gt;
&lt;li&gt;Actions performed in the previous bullets are exposed to 10K users in the moderator tools. Much like recent close/delete posts show up, these do as well. This allows higher rep users to spot check (if they so desire). I think this portion is important when the automated process fixes a link. For community edits in the queue, the history tab in &lt;code&gt;/review&lt;/code&gt; seems sufficient.&lt;/li&gt;
&lt;li&gt;If a large percentage of a post consists of a link (or links) and these links were changed by Community, the post should have further action taken on it in some queue.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Example: A post where X+% of the text is hyperlinks is very dependent on the links being active. If one or more of the links are broken, the post may no longer be relevant (or may be a link-only post). One example I found while doing this was &lt;a href="http://stackoverflow.com/posts/4906230/revisions"&gt;this&lt;/a&gt; answer.&lt;/p&gt;
&lt;p&gt;I don't think that this type of edit from the Community user should bump a post to the front page. Edits done in the broken link queue, though, &lt;em&gt;should&lt;/em&gt; bump the post just like a suggested edit does today. By preventing the automated Community posts from being bumped, we prevent the front page from being flooded, daily, with old posts and these edits. I think that the exposure in the 10K tools and the broken link queue will provide the visibility needed to check the process is working correctly.&lt;/p&gt;
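&lt;p&gt;The first two steps of the automated process described above can be sketched in Python 3. The sketch only builds the list of dates to check and the lookup URL for the Wayback Machine availability endpoint. It does not send any requests. The function names are hypothetical, and using the post date as the &lt;code&gt;timestamp&lt;/code&gt; is one way to ask for a snapshot "close" to when the post was made.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;from datetime import date
from urllib.parse import urlencode


def anniversary_dates(today, first_year=2008):
    """Return today's month/day for every year from first_year up to last year."""
    dates = []
    for year in range(first_year, today.year):
        try:
            dates.append(today.replace(year=year))
        except ValueError:
            # February 29 does not exist in non-leap years, so skip those years.
            pass
    return dates


def wayback_lookup_url(url, posted_on):
    """Build an availability lookup for the snapshot closest to the post date."""
    query = urlencode({"url": url, "timestamp": posted_on.strftime("%Y%m%d")})
    return "https://archive.org/wayback/available?" + query


days_to_check = anniversary_dates(date(2015, 7, 31))
lookup = wayback_lookup_url("http://example.com/page", date(2010, 7, 31))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For July 31, 2015 this yields July 31 of 2008 through 2014, which matches the example in the first bullet. A post whose lookup returns no archived snapshot would then fall through to the Broken Link queue.&lt;/p&gt;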
&lt;h2 id="process-flows"&gt;Process flows&lt;a class="headerlink" href="#process-flows" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Queue Flow:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/brokenqueueflow.png"&gt;&lt;img alt="Queue Flow" src="https://andrewwegner.com/images/brokenqueueflow.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Automated process flow:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/automatedlinkflow.png"&gt;&lt;img alt="Automated Link check flow" src="https://andrewwegner.com/images/automatedlinkflow.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;h2 id="potential-pitfalls"&gt;Potential pitfalls&lt;a class="headerlink" href="#potential-pitfalls" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The automated link checking will likely run into several of the problems I did. Mainly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Some sites respond to a &lt;code&gt;HEAD&lt;/code&gt; request with a 404 instead of a 405. My solution to this was to issue &lt;code&gt;GET&lt;/code&gt; requests for everything.&lt;/li&gt;
&lt;li&gt;Sites don't like certain user agents. My solution to this was to mimic the Firefox user agent. To be a good internet citizen, Stack Exchange probably shouldn't go that far, but providing a unique user agent that is easily identifiable as "StackExchangeBot" (think "GoogleBot"), should be helpful in identifying where traffic is coming from.&lt;/li&gt;
&lt;li&gt;Sites that are down one week and up another. I solved this by spreading my tests over a period of 3 weeks. With the queue and automatic linking to an archived version of the site, this may not be necessary. However, immediately converting a link to an archived copy should be discussed by the community. Do we convert the broken link immediately? Or do we try again in X days and convert it only if it's still down? It was suggested in &lt;a href="http://meta.stackoverflow.com/a/301002/189134"&gt;another answer&lt;/a&gt; that we first offer the poster the chance to make changes before an automatic process takes place.&lt;/li&gt;
&lt;li&gt;The need to throttle requests so that you don't flood a site with requests. I solved this by only querying unique URLs. This still issues a lot of requests to certain, popular, domains. This could be solved by staggering the checks over a period of minutes/hours versus spewing 100s - 1000s of &lt;code&gt;GET&lt;/code&gt; requests at midnight daily.&lt;/li&gt;
&lt;/ul&gt;
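&lt;p&gt;As one possible shape for the staggering mentioned in the last bullet, the Python 3 sketch below assigns each unique URL a start offset so that no single domain receives two requests closer together than a fixed delay. The function name and the five second default are assumptions for illustration, not a recommendation for what Stack Exchange should use.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;from collections import defaultdict
from urllib.parse import urlsplit


def stagger_by_domain(urls, delay_seconds=5.0):
    """Return sorted (offset_seconds, url) pairs, spacing out requests per domain."""
    next_slot = defaultdict(float)  # next free offset for each host
    schedule = []
    # dict.fromkeys drops duplicate URLs while keeping their original order.
    for url in dict.fromkeys(urls):
        host = urlsplit(url).netloc.lower()
        schedule.append((next_slot[host], url))
        next_slot[host] += delay_seconds
    return sorted(schedule)


plan = stagger_by_domain([
    "http://a.com/1",
    "http://a.com/2",
    "http://b.org/",
    "http://a.com/1",
])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Different domains can still be queried at the same moment. Only repeated hits on the same host are pushed later in the schedule.&lt;/p&gt;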
&lt;p&gt;With the broken link queue, I feel the first two would be acceptable. Much like posts in the Low Quality queue appear because of a heuristic, despite not being low quality, links will be the same way. The system will flag them as broken and the queue will determine if that is true (if an archived version of the site can't be found by the automated process). The bullet about throttling requests is an implementation detail that I'm sure the developers would be able to figure out.&lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="programming"/></entry><entry><title>Analysis of links posted to Stack Overflow</title><link href="https://andrewwegner.com/analysis-of-links-posted-to-stack-overflow.html" rel="alternate"/><published>2015-08-06T07:35:00-05:00</published><updated>2015-08-07T00:00:00-05:00</updated><author><name>Andy Wegner</name></author><id>tag:andrewwegner.com,2015-08-06:/analysis-of-links-posted-to-stack-overflow.html</id><summary type="html">&lt;p&gt;Approximately 10% of links on Stack Overflow are unavailable. This is an analysis of how I determined that and a discussion on how to improve it.&lt;/p&gt;</summary><content type="html">
&lt;p&gt;This post was &lt;a href="http://meta.stackoverflow.com/q/300916/189134"&gt;published&lt;/a&gt; by &lt;a href="http://meta.stackoverflow.com/users/189134/andy?tab=profile"&gt;me&lt;/a&gt; on Meta Stack Overflow on August 6th, 2015. I've republished it here
so that I can easily update information related to recent developments. If you have questions or comments, I highly
encourage you to visit the &lt;a href="http://meta.stackoverflow.com/q/300916/189134"&gt;question&lt;/a&gt; on Meta Stack Overflow and post there. &lt;/p&gt;
&lt;p&gt;TL;DR: Approximately 10% of 1.5M randomly selected unique links in the March 2015 &lt;a href="https://archive.org/details/stackexchange"&gt;data dump&lt;/a&gt; are unavailable. To be more precise, that is approximately 150K dead links.&lt;/p&gt;
&lt;hr/&gt;
&lt;h2 id="motivation"&gt;Motivation&lt;a class="headerlink" href="#motivation" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I've been running into more and more links that are dead on Stack Overflow and it's bothering me. In some cases, I've spent the time hunting down a replacement, in others I've notified the owner of the post that a link is dead, and (more shamefully), in others I've simply ignored it and left just a &lt;a href="http://meta.stackoverflow.com/a/262040/189134"&gt;down vote&lt;/a&gt;. Obviously that's not good.&lt;/p&gt;
&lt;p&gt;Before making sweeping generalizations that there are dead links everywhere, though, I wanted to make sure I wasn't just finding bad posts because I was wandering through the review queues. Utilizing the March 2015 data dump, I randomly selected about 25% of the posts (both questions and answers) and then parsed out the links. This works out to 5.6M posts out of 21.7M total.&lt;/p&gt;
&lt;p&gt;Of these 5.6M posts, 2.3M contained links, and 1.5M of these links were unique. I sent each unique URL a &lt;a href="https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods"&gt;&lt;code&gt;HEAD&lt;/code&gt;&lt;/a&gt; request, with a user agent mimicking Firefox&lt;sup&gt;1&lt;/sup&gt;. A week later, I retested everything that hadn't returned a successful response. Finally, a week after that, I tested anything that failed from &lt;em&gt;that&lt;/em&gt; batch one last time. If a site was down in all three tests, I considered it down for this analysis.&lt;/p&gt;
&lt;hr/&gt;
&lt;h2 id="results2"&gt;Results&lt;sup&gt;2&lt;/sup&gt;&lt;a class="headerlink" href="#results2" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;h3 id="by-status-code"&gt;By status code&lt;a class="headerlink" href="#by-status-code" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Good news/Bad News: A majority of the links returned a valid response, but there are still roughly 10% that failed.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://andrewwegner.com/images/status_codes.svg"&gt;&lt;img alt="PIE CHART IMAGE" src="https://andrewwegner.com/images/status_codes.svg"/&gt;&lt;/a&gt;
&lt;em&gt;(This image is showing the top status codes returned)&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The three largest slices of the pie are the status 200s (site working!), status 404 (page not found, but server responded saying the page isn't found) and Connection Errors. Connection errors are sites that had no proper server response. The request to access the page timed out. I was generous with the timeout and allowed a request to live for 20 seconds before failing a link with this status. The &lt;code&gt;4xx&lt;/code&gt; and &lt;code&gt;5xx&lt;/code&gt; errors are status codes that fall in the 400 and 500 range of HTTP responses. These are client and server error ranges, thus counted as a failure. The &lt;code&gt;2xx&lt;/code&gt; slice covers pages that responded with a success message in the 200 range, but it wasn't a &lt;code&gt;200&lt;/code&gt; code. Finally, there were just over a hundred sites that hit a redirect loop that didn't seem to end. These are the &lt;code&gt;3xx&lt;/code&gt; errors. I failed a site with this range if it redirected more than 30 times. There are a negligible number of sites that returned status codes in the 600 and &lt;a href="https://github.com/joho/7XX-rfc"&gt;700&lt;/a&gt; range&lt;sup&gt;4&lt;/sup&gt;.&lt;/p&gt;
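&lt;p&gt;The grouping behind those slices can be sketched as a small Python 3 function. The bucket labels follow the description above, but the function names are hypothetical, and &lt;code&gt;None&lt;/code&gt; is an assumed stand-in for a request that never produced a status code. The 30-redirect limit is not modelled here; a status in the 300 range simply represents a redirect chain that never resolved.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;from collections import Counter


def classify(status):
    """Map a raw result to the label used for its slice of the chart."""
    if status is None:
        return "Connection Error"
    if status in (200, 404):
        # The two most common codes get slices of their own.
        return str(status)
    # Everything else is grouped by its hundreds digit: 2xx, 3xx, 4xx, 5xx, ...
    return "{}xx".format(status // 100)


def tally(statuses):
    """Count how many results fall into each slice."""
    return Counter(classify(status) for status in statuses)


counts = tally([200, 200, 404, None, 503, 204, 301])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With this grouping, only the &lt;code&gt;200&lt;/code&gt; and &lt;code&gt;2xx&lt;/code&gt; buckets count as working links. Every other bucket is treated as a failure.&lt;/p&gt;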
&lt;h3 id="by-most-common"&gt;By most common&lt;a class="headerlink" href="#by-most-common" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As expected, many of the URLs that failed appeared frequently in the sample set. Below is a list of the top 50&lt;sup&gt;3&lt;/sup&gt; URLs that appear in posts most often, but failed three times over the course of three weeks.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Plugins&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;validation&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eclipse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;eclipselink&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;moxy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;php&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;jackson&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;codehaus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;xstream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;codehaus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;opencv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;willowgarage&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;wiki&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;developer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;android&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;painless&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;valums&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;ajax&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;upload&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;sqlite&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;phxsoftware&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;qt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nokia&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;oracle&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;technetwork&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;codeconv&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;138413.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jdk8&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;package&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;oracle&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;javase&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;1.4&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;SimpleDateFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;watin&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sourceforge&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;leandrovieira&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;projects&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;lightbox&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;facebook&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;ccrma&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stanford&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edu&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;courses&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;422&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;projects&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;WaveFormat&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;postsharp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;erichynds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;multiselect&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;widget&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;ha&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ckers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;xss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;jetty&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;codehaus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jetty&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;archive&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2009&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;08&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;want&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;speed&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;pass&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;codespeak&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;lxml&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hpl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;personal&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Hans_Boehm&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;gc&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;demo&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;thickbox&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;book&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;git&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;scm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="n"&gt;_submodules&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;monotouch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;developer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;android&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;timed&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;updates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bassistance&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;de&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;validate&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;demo&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;codeigniter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;user_guide&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;active_record&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;phantomjs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;watin&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db4o&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;qt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nokia&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;referencesource&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;microsoft&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;netframework&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aspx&lt;/span&gt;
&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;github&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;facebook&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;php&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decompiler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;free&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fr&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;pivotal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;github&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jasmine&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;plugins&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;templates&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;google&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;closure&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;w3schools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;ref_entities&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;asp&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;xstream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;codehaus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;tutorial&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;github&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;facebook&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;php&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;maven&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jstl&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jars&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jstl&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;jar&lt;/span&gt;
&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;developers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;facebook&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;offline&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;deprecation&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;www&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parashift&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;++-&lt;/span&gt;&lt;span class="n"&gt;faq&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;lite&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pointers&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;members&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;
&lt;span class="n"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;developers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;facebook&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;mobile&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;ios&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;build&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;downloads&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;php&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pierre&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;fluentnhibernate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tutsplus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;tutorials&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;javascript&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ajax&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ways&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;make&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ajax&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;with&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="n"&gt;dev&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iceburg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jquery&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;jqModal&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id="by-post-score"&gt;By post score&lt;a class="headerlink" href="#by-post-score" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Count of posts by score (top 10) (covers 94% of all broken links):&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;| Score | Percentage of Total Broken |
|-------|----------------------------|
| 0     | 36.4087%                   |
| 1     | 25.1674%                   |
| 2     | 13.4089%                   |
| 3     | 7.2806%                    |
| 4     | 4.2971%                    |
| 5     | 2.7065%                    |
| 6     | 1.8068%                    |
| 7     | 1.2854%                    |
| -1    | 1.1935%                    |
| 8     | 0.9415%                    |
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id="by-number-of-views"&gt;By number of views&lt;a class="headerlink" href="#by-number-of-views" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;Note: This is the number of views at the time the data dump was created, not as of today&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Count of posts by number of views (top 10):&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;| Views        | Percentage of Total Broken |
|--------------|----------------------------|
| (0, 200]     | 24.4709%                   |
| (200, 400]   | 14.2186%                   |
| (400, 600]   | 9.5045%                    |
| (600, 800]   | 6.9793%                    |
| (800, 1000]  | 5.2574%                    |
| (1000, 1200] | 4.1864%                    |
| (1200, 1400] | 3.3699%                    |
| (1400, 1600] | 2.7766%                    |
| (1600, 1800] | 2.3477%                    |
| (1800, 2000] | 1.9550%                    |
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id="by-days-since-post-created"&gt;By days since post created&lt;a class="headerlink" href="#by-days-since-post-created" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;Note: This is the number of days since creation at the time the data dump was created, not as of today&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Count of posts by days since creation (top 10) (Covers 64% of broken links):&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;| Days since Creation | Percentage of Total Broken |
|---------------------|----------------------------|
| (1110, 1140]        | 7.2938%                    |
| (1140, 1170]        | 6.7648%                    |
| (1470, 1500]        | 6.6579%                    |
| (1080, 1110]        | 6.6535%                    |
| (750, 780]          | 6.5535%                    |
| (720, 750]          | 6.5516%                    |
| (1500, 1530]        | 6.3978%                    |
| (390, 420]          | 5.8508%                    |
| (360, 390]          | 5.8258%                    |
| (780, 810]          | 5.5175%                    |
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id="by-ratio-of-viewsdays"&gt;By Ratio of Views:Days&lt;a class="headerlink" href="#by-ratio-of-viewsdays" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Ratio Views:Days (top 20) (Covers 90% of broken links):&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;| Views:Days Ratio | Percentage of Total Broken |
|------------------|----------------------------|
| (0, 0.25]        | 27.2369%                   |
| (0.25, 0.5]      | 18.8496%                   |
| (0.5, 0.75]      | 11.4321%                   |
| (0.75, 1]        | 7.2481%                    |
| (1, 1.25]        | 5.1668%                    |
| (1.25, 1.5]      | 3.7907%                    |
| (1.5, 1.75]      | 2.9310%                    |
| (1.75, 2]        | 2.4033%                    |
| (2, 2.25]        | 1.9788%                    |
| (2.25, 2.5]      | 1.6850%                    |
| (2.5, 2.75]      | 1.4080%                    |
| (2.75, 3]        | 1.1879%                    |
| (3, 3.25]        | 1.0654%                    |
| (3.25, 3.5]      | 0.9391%                    |
| (3.5, 3.75]      | 0.8334%                    |
| (3.75, 4]        | 0.7165%                    |
| (4, 4.25]        | 0.6634%                    |
| (4.25, 4.5]      | 0.5789%                    |
| (4.5, 4.75]      | 0.5508%                    |
| (4.75, 5]        | 0.4833%                    |
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
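&lt;p&gt;The bucket labels in the tables above, such as &lt;code&gt;(0.25, 0.5]&lt;/code&gt;, are right-closed intervals: a value equal to the upper bound belongs to that bucket, and a value equal to the lower bound belongs to the previous one. A minimal sketch of that bucketing follows. The function name and the sample numbers are hypothetical and are not taken from the original analysis.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;
```python
import math


def bucket_label(value, width):
    """Return the right-closed interval (lower, upper] of size width holding value."""
    if not value > 0:
        raise ValueError("value must be positive; the first bucket is (0, width]")
    # ceil puts a value that sits exactly on a boundary into the bucket it closes.
    index = math.ceil(value / width)
    lower = (index - 1) * width
    upper = index * width
    return f"({lower:g}, {upper:g}]"


# Hypothetical post: 300 views, 400 days old at the time of the data dump.
print(bucket_label(300 / 400, 0.25))  # Views:Days ratio of 0.75
print(bucket_label(1150, 30))         # days since creation, 30-day buckets
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With widths that are not exactly representable as floats, a value sitting right on a boundary could land in the neighbouring bucket, so a real analysis would probably lean on a library binning routine instead.&lt;/p&gt;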
&lt;hr/&gt;
&lt;h2 id="discussion"&gt;Discussion&lt;a class="headerlink" href="#discussion" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;What can we do with all of this? How do we, as a community, solve the issue of 10% of our outbound links pointing to places on the internet that no longer exist? Assuming that my sample was indicative of the entire data dump, there are close to 600K broken links (150K unique broken links x 4, because I took 1/4 of the data dump as a sample) posted in questions and answers on Stack Overflow. I assume a large number of links posted in comments would be broken as well, but that's an activity for another month.&lt;/p&gt;
&lt;p&gt;We encourage posters to provide snippets from their links just in case a link dies. That definitely helps, but the resources behind the links and the (presumably) expanded explanation behind the links are still gone. How can we properly deal with this? &lt;/p&gt;
&lt;p&gt;It looks like there have been a few previous discussions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://meta.stackexchange.com/a/198357/186281"&gt;Utilize the Wayback API to automatically fix broken links.&lt;/a&gt; Development on this appeared to stall because of the large number of edits the Community user would be making. It would also keep posts that depended on those links from being surfaced for the community to fix.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://meta.stackexchange.com/questions/224895/what-happened-to-the-broken-link-review-queue"&gt;Link review queue&lt;/a&gt;. It was in &lt;a href="http://meta.stackexchange.com/questions/212023/where-can-i-access-the-link-validation-review-queue"&gt;alpha&lt;/a&gt;, but disappeared in early 2014. &lt;/li&gt;
&lt;li&gt;&lt;a href="http://meta.stackexchange.com/questions/174347/badge-request-for-fixing-dead-links-pipefitter"&gt;Badge proposal for fixing broken links&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
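&lt;p&gt;For reference, the first idea can be sketched against the Wayback Machine's availability endpoint. This is only an illustration, not what the linked proposal implemented: the function names are hypothetical, and the sample response is hand-written in the shape the endpoint returns (&lt;code&gt;archived_snapshots&lt;/code&gt; with a &lt;code&gt;closest&lt;/code&gt; entry).&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;
```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(dead_link, timestamp=None):
    """Build the availability query; timestamp is an optional YYYYMMDD-style string."""
    params = {"url": dead_link}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the archived URL from a decoded response, or None if nothing is archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(dead_link, timestamp=None):
    """Ask the Wayback Machine for a snapshot of dead_link (performs a network call)."""
    with urlopen(availability_url(dead_link, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))


# Hand-written sample in the documented shape, not a real API response.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130101000000/http://watin.sourceforge.net/",
            "timestamp": "20130101000000",
            "status": "200",
        }
    }
}
print(availability_url("http://watin.sourceforge.net/", "20130101"))
print(closest_snapshot(sample))
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Passing the post's creation date as the timestamp would ask for the snapshot closest to when the link was shared, which is presumably closer to what the poster meant to cite than the most recent capture.&lt;/p&gt;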
&lt;hr/&gt;
&lt;h2 id="footnotes"&gt;Footnotes&lt;a class="headerlink" href="#footnotes" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;This is how it ultimately played out. Originally I sent &lt;code&gt;HEAD&lt;/code&gt; requests in an effort to save bandwidth. This turned out to waste a lot of time, because many sites around the internet return a &lt;a href="https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_Error"&gt;&lt;code&gt;405 Method Not Allowed&lt;/code&gt;&lt;/a&gt; in response to a &lt;code&gt;HEAD&lt;/code&gt; request. The next step was to send &lt;code&gt;GET&lt;/code&gt; requests, but with the default Python &lt;a href="http://docs.python-requests.org/en/latest/"&gt;requests&lt;/a&gt; user-agent. A lot of sites returned &lt;code&gt;401&lt;/code&gt; or &lt;code&gt;404&lt;/code&gt; responses to this user agent.&lt;/li&gt;
&lt;li&gt;Links to Stack Exchange sites were not counted in the above results. The failures seen are almost 100% due to a question/answer/comment being deleted. The process ran as an anonymous user, thus didn't have any reputation and was served a 404. A user with appropriate permissions &lt;em&gt;can&lt;/em&gt; still visit the link. I verified a number of 404'd links to Stack Overflow posts and this was the case.&lt;/li&gt;
&lt;li&gt;The 4th most common failure was to &lt;code&gt;localhost&lt;/code&gt;. The 16th and 17th most common were &lt;code&gt;localhost&lt;/code&gt; on ports other than 80. I removed these from the result table with the knowledge that these shouldn't be accessible from the internet.&lt;/li&gt;
&lt;li&gt;There were 7 total URLs that returned status codes in the &lt;code&gt;600&lt;/code&gt; and &lt;a href="https://github.com/joho/7XX-rfc"&gt;&lt;code&gt;700&lt;/code&gt;&lt;/a&gt; range. One such site was &lt;a href="http://learn.code.org/hoc/1"&gt;code.org&lt;/a&gt; with a status code of 752. Sadly, this is not even defined in the joke RFC.&lt;/li&gt;
&lt;/ol&gt;
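&lt;p&gt;A rough sketch of the request strategy from footnote 1, under a few assumptions: it tries &lt;code&gt;HEAD&lt;/code&gt; first, falls back to &lt;code&gt;GET&lt;/code&gt; on a &lt;code&gt;405&lt;/code&gt;, and always sends a browser-like user agent. The header value and the &lt;code&gt;check_link&lt;/code&gt; name are hypothetical, and this is not the script that produced the results above. &lt;code&gt;session&lt;/code&gt; is expected to behave like a &lt;code&gt;requests.Session&lt;/code&gt;; the block itself does not import the library.&lt;/p&gt;
&lt;div class="codehilight code"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;
```python
BROWSER_HEADERS = {
    # Hypothetical value: any realistic browser User-Agent string would do.
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
}


def check_link(session, url, timeout=20):
    """Return the HTTP status code for url, or None if no response arrived.

    session is assumed to behave like requests.Session, for example
    check_link(requests.Session(), "http://example.com/").
    """
    options = {"headers": BROWSER_HEADERS, "timeout": timeout, "allow_redirects": True}
    try:
        response = session.head(url, **options)
        if response.status_code == 405:
            # The server refuses HEAD, so repeat the check with GET.
            # stream=True avoids downloading the body just to read the status.
            response.close()
            response = session.get(url, stream=True, **options)
        status = response.status_code
        response.close()
        return status
    except OSError:
        # requests' RequestException derives from IOError/OSError, so timeouts,
        # DNS failures and refused connections all end up here.
        return None
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Trying &lt;code&gt;HEAD&lt;/code&gt; first keeps the bandwidth saving where servers allow it, at the cost of a second request for every site that answers &lt;code&gt;405&lt;/code&gt;. Given how common that turned out to be, going straight to &lt;code&gt;GET&lt;/code&gt; may be the simpler choice.&lt;/p&gt;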
&lt;h2 id="follow-up"&gt;Follow up&lt;a class="headerlink" href="#follow-up" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;I &lt;a href="https://andrewwegner.com/a-proposal-to-fix-broken-links-on-stack-overflow.html"&gt;posted&lt;/a&gt; a proposal on how I think this could be fixed.&lt;/p&gt;</content><category term="Side Activities"/><category term="Stack Exchange"/><category term="programming"/></entry></feed>