Not long after it was relaunched with over 55,000 screenshots and a platter of new features, the Game UI Database hit a major roadblock.

Edd Coates, creator of the free repository, noticed the website had been "laggy as hell" and, with the help of server maestro Jay Peet, started to investigate. The fix was alarmingly simple. Blocking a single IP address allowed normal service to resume, but who was the owner of that digital calling card? None other than OpenAI, the generative AI firm behind ChatGPT and Dall-E.

Coates initially shared the news on X, and slammed OpenAI's practice of scraping websites for information that can be used to train its models. The company isn't shy about this. Its website openly states its large language models are trained on three sources including "information that is publicly available on the internet." Of course, a myriad of lawsuits and potential legal challenges involving major newspapers (including the New York Times) and even YouTube creators would argue that "publicly available" doesn't equate to legal (thanks The Guardian and The Verge).

For Coates, the issue here is twofold. For starters, he doesn't agree with OpenAI's methods or business model, but being targeted by the company also drove a wrecking ball through a free resource he'd spent five years building.

"I first noticed that the database was having issues a couple of weeks ago, when pages were taking a lot longer to load. I knew this had nothing to do with the site itself because it had always run smoothly (even with more active online users), so I suspected foul play but couldn’t find any evidence at the time," Coates told Game Developer.

"I was unable to release any updates to the site as the lag was interfering with my admin tools, and I was even getting angry emails and messages from users who rely on the site as part of their day-to-day workflow."

He explained the disruption eventually caused the website to stop working altogether, dishing out "502 Bad Gateway" errors to users. At that stage, Coates sought the help of Jay Peet, who has hosted the database on their private server for the last five years. Peet looked at the site logs and realized the website's resources were being swallowed by a single IP address belonging to OpenAI.

"The homepage was being reloaded 200 times a second, as the [OpenAI] bot was apparently struggling to find its way around the site and getting stuck in a continuous loop," added Coates. "This was essentially a two-week-long DDoS attack in the form of a data heist."
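For readers curious how a single greedy client shows up in site logs, the kind of check Peet describes can be sketched in a few lines of Python. This is only an illustration, not the method Peet used: it assumes an access log where the client address is the first whitespace-separated field (as in the common log format), and the function name `top_clients`, the sample lines, and the addresses (taken from ranges reserved for documentation) are all hypothetical.

```python
from collections import Counter


def top_clients(log_lines, n=3):
    """Return the n client addresses with the most requests.

    Assumes the client address is the first whitespace-separated
    field on each line, as in the common log format.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


# Hypothetical sample lines using documentation-only addresses.
sample = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '198.51.100.4 - - [01/Jan/2025:00:00:02 +0000] "GET /games HTTP/1.1" 200 900',
    '203.0.113.7 - - [01/Jan/2025:00:00:02 +0000] "GET / HTTP/1.1" 200 512',
]

for address, hits in top_clients(sample):
    print(address, hits)
```

On a real log, one address accounting for the overwhelming majority of requests would point to the sort of runaway crawler described above.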
Game UI Database founder questions how OpenAI scraping is "fair or even legal?"
Coates makes no money from the Game UI Database. In fact, they actually run the website at a loss. If, however, they had attempted to monetize the project or leveraged external tools such as Amazon Web Services, OpenAI's unwelcome interest might have caused financial harm.

"If I were relying on [Game UI Database] for ad revenue or membership fees, the downtime caused by OpenAI would have absolutely had an impact on my income," he said. "They were transferring ~70GB of data from the server every ten minutes. Fortunately, I have no bandwidth costs and minimal server fees, so I’m able to provide this resource to everyone for free (as all educational resources should be). But if I were paying AWS for storage, for example, this bandwidth would have cost me around £850 a day."

"OpenAI aren't even being transparent about where their data is coming from, so I would have been solely responsible for that bill. How is that fair or even legal? And I'm certainly not the only one being affected by this."

Coates said the issue is more profound than a potential loss of income, though. "Don't get me started on what they're doing with this data," he continued, pointing out that he spent years meticulously collecting and cataloguing UI references to help other creatives in the game industry only to have that work (which encompasses the efforts of thousands of developers) "stolen by a multi-billion dollar organization."

Coates said the idea that OpenAI is repurposing that work to "hurt and replace the people that I'm trying to help" only adds insult to injury. "It's sick. Generative AI technology simply wouldn't exist without the work of human creatives, and yet we are the ones here being punished without compensation or credit," they added.

As for how Game UI Database succeeded in rebuffing OpenAI's advances, Coates and Peet eventually blocked all of the company's associated IPs at the HTTP server level to prevent it from deploying a workaround.

"When I investigated this problem, it was apparent that we were being scraped or crawled by something. Our usage analytics showed 10 active users, while the actual server was reporting 200 to 300 active requests," said Peet. "I suspected someone was either attempting to DDoS us or that the data was being taken en masse by an automated scraper.

"Blocking the crawlers was not too complex, although the method to do so could be frail if OpenAI were to add or change any of their in-use IP addresses. After modifying the robots.txt (which OpenAI can just ignore if they wish) I explicitly blocked all of OpenAI's IPs at the HTTP server level. After switching to the new config, the website was immediately back to its regular response time."

At the time of writing, Game UI Database is back firing on all cylinders and championing UI work featured in almost 1,400 titles, including ill-fated projects like Concord. Game Developer has reached out to OpenAI for comment.
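The logic behind the IP-level block Peet describes can be illustrated with a short Python sketch. It is not Peet's configuration, which lives in the HTTP server itself and is not published in this article. The sketch only shows the underlying test of whether a client address falls inside a blocked range. The ranges below are placeholders from address blocks reserved for documentation, not OpenAI's real addresses, and `BLOCKED_RANGES` and `is_blocked` are hypothetical names.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT real crawler addresses.
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip, blocked=BLOCKED_RANGES):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in blocked)


print(is_blocked("203.0.113.45"))  # inside the first placeholder range
print(is_blocked("192.0.2.10"))    # outside both placeholder ranges
```

As Peet notes, a list like this is only as good as its upkeep: if the crawler's operator adds or changes addresses, the ranges have to be updated to match.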