GitHub Stars: The Dark Side of Code Popularity

Bypassing the Popularity Contest: The Secret Black Market of GitHub That Aids Cheating Coders

The Dark Side of GitHub: When Coders Cheat the Popularity Contest

In the realm of technology, GitHub has solidified its position as a programmer’s best friend. It seamlessly combines powerful software management tools with collaboration features, creating a social network for code enthusiasts. However, along with its success, GitHub has unwittingly attracted a rather unsavory element: a black market for fake engagement.

Picture this: an illicit ecosystem of online stores and chat groups openly peddling GitHub stars. These stars, bestowed by users to show interest in a project, can be tallied to rank the most popular endeavors. For a mere $6 in ether, the cryptocurrency of the Ethereum blockchain, ENBLE purchased 50 stars for a dormant GitHub project from the aptly named website, BuyGithub.com. And within hours, the counterfeit endorsements appeared.

These shady stars are just a fraction of the wider black market for online engagement metrics. Coders, investors, and tech-savvy individuals rely on these metrics to identify promising programmers and startups. Online stores also offer upvotes for projects on Product Hunt, the platform that promises to unveil the next big thing in tech, as well as followers and views on Kaggle, the data science community. It seems these vendors are capitalizing on the ambition and, perhaps, desperation of those seeking rapid success in an industry known for the mantra, “fake it till you make it.”

Filippo Menczer, director of Indiana University’s Observatory on Social Media, astutely states, “Almost all online manipulation is some form of hijacking attention for the purpose of making money.” GitHub is no exception. It has become a marketplace of attention, where individuals accrue notoriety, influence, and reputation based on the popularity and usage of their software.

But just how prevalent is this underground economy of fake engagement? Fraser Marlow, head of growth for data orchestration startup Dagster, stumbled upon this market while investigating investors’ reliance on GitHub stars as a signal of open source traction. Marlow’s team purchased stars from two different online stores and used the data they gathered to develop a model to detect fake stars in GitHub repositories. The results were eye-opening: the model flagged 97 percent of the 759 stars on Okcash, a cryptocurrency project, as fake, while for Apache Airflow, a Dagster competitor, it flagged only 1.6 percent of 29,435 stars. These findings build upon previous research that identified over 63,000 suspicious accounts on GitHub.
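To put those percentages in perspective, the short Python sketch below converts them into approximate star counts. It is only back-of-the-envelope arithmetic on the figures quoted above; the function name `estimated_flagged` and the `reported` dictionary are illustrative choices, and the rounded counts are estimates, not numbers published by Dagster.

```python
def estimated_flagged(total_stars: int, flagged_percent: float) -> int:
    """Approximate number of flagged stars implied by a reported percentage."""
    return round(total_stars * flagged_percent / 100)


# Totals and percentages as quoted in the article; the derived counts
# are rounded estimates (an assumption), not figures from the study.
reported = {
    "Okcash": (759, 97.0),
    "Apache Airflow": (29_435, 1.6),
}

for project, (total, pct) in reported.items():
    flagged = estimated_flagged(total, pct)
    print(f"{project}: roughly {flagged} of {total} stars flagged ({pct}%)")
```

Seen this way, even Airflow’s small percentage corresponds to a few hundred flagged stars, simply because its total is so much larger. What separates the two projects is the share of flagged stars, not the raw count.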

GitHub, however, is not turning a blind eye to this issue. Jesse Geraci, the company’s online safety counsel, affirms that GitHub Security has been aware of fake stars for years and actively works to remove them. Removing inauthentic accounts without sweeping up genuine ones is a complex challenge. Nevertheless, GitHub’s anti-abuse team combines manual investigations with sophisticated software techniques to identify and eradicate fake engagement.

Yet the fascination with GitHub stars may be waning. Fraser Marlow humorously likens the obsession to a hangover from the zero-interest-rate policy bubble. While some venture capitalists and firms still obsess over these metrics, their significance has diminished in recent years. Investors now seek various growth signals beyond GitHub stars when evaluating open source startups. Pratima Aiyagari, a partner at venture firm Nauta Capital, emphasizes that GitHub engagement is just one piece of the investment puzzle. Factors such as the founding team, market potential, and numerous other data points are carefully considered.

Interestingly, Baddhi Shop, an online store offering fraudulent metrics, has entered the GitHub market. Alongside GitHub stars, the shop sells Product Hunt upvotes and Kaggle followers and views. These fake engagements are generated by a team of 11 individuals, who meticulously click away on various cloud devices. Baddhi asserts that its actions are not spam and that the shop respects each website’s terms of service. Fake engagement is also purchased daily for Discord, a chat room service popular with crypto projects. Discord takes a firm stance against creating or selling fake accounts, actively removing offending users from the platform.

While selling fake engagement is most notorious on mainstream social platforms like Facebook, smaller platforms like GitHub and Product Hunt are becoming new targets. Stefano Cresci, a researcher focused on disinformation, fake news, and social bots, suggests that vendors have migrated to these lesser-known platforms due to increased scrutiny on larger platforms. The unsettling truth is that online cheating now permeates even niche communities. Justin Hollander, a professor at Tufts University, discovered Twitter bots being utilized to influence urban planning projects. These bots infiltrated projects ranging from California’s SoFi Stadium development to mixed-use projects in Atlanta.

Filippo Menczer of Indiana University aptly compares the proliferation of social bots and fake engagement to pollution, obscuring what is valuable and of high quality. As technology advances, this arms race between social bots and detection methods is set to intensify. Even chatbots powered by AI language models like ChatGPT contribute to the problem, effortlessly creating realistic fake accounts that are nearly indistinguishable from genuine ones. The battle to combat fake engagement is ongoing, and it remains to be seen how new metrics will emerge and how scammers will adapt.

Ultimately, while GitHub stars may have lost some of their luster, they still hold some weight in the investment world. However, investors are now armed with a more nuanced perspective on these metrics. The success of open source companies like MuleSoft and GitLab has shifted investor focus towards a comprehensive analysis of multiple factors, rather than a sole reliance on GitHub engagement.

The more weight a metric carries, the more likely it is to lose its validity. Campbell’s law posits that the more a metric is used for decision-making, the more it will be manipulated. Goodhart’s law warns that once a metric becomes a target, it ceases to be a good measure. Both laws highlight the precarious balance between using metrics as indicators of success and succumbing to manipulative practices.

In conclusion, while GitHub stars may have fallen prey to a black market of fake engagement, the tech community is adapting. Investors and individuals alike are becoming more discerning, using a multitude of signals beyond GitHub stars to evaluate projects and startups. The battle against fraudulent metrics and social bots continues, but with vigilance and technological advancements, we can strive for a more transparent and authentic digital ecosystem.