Web Archive User's $14k BigQuery Bill Shock After Running Queries On 'free' Dataset

A user left with a surprise bill for thousands of dollars after running queries on Google's BigQuery data warehouse has sparked a debate about how vendors should place limits on the use of their tools.

One user of HTTP Archive – a project that aims to track how the web is built – was recently horrified to get a $14,000 bill from Google.

The HTTP Archive project, which crawls websites and records detailed information about fetched resources, the web platform APIs and features used, and execution traces of each page, hosts a publicly available dataset on the Chocolate Factory's BigQuery cloud-based data warehouse system.

"This website makes it seem like this 'public' dataset is for the community to use, but it is instead a for-profit money maker for Google Cloud and you can lose tens of thousands of dollars," said user Tim on the HTTP Archive forum.

"This official website should be updated to warn people Google is apparently now hosting this dataset to make money. I don't think that was the original mission, but that's what it is today, there's basically zero customer support, and you can lose $14k in the blink of an eye," he added in the discussion post.

An archive maintainer responded that 99 percent of the archive users only view its free monthly reports and annual Web Almanac reports. BigQuery is designed for the 1 percent of "power users" who "need lower level access to the raw data."

The maintainer pointed out that $14,000 would have come from processing about 2.5 petabytes, given Google's rate of $6.25 per TiB. He said Google warns users how much data the query will process when run, yet nonetheless apologized for the user's experience and said he'll add a more explicit warning about BigQuery charging to the website's FAQ page.
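The maintainer's arithmetic can be checked in a few lines. The sketch below assumes a flat on-demand rate of $6.25 per TiB with no free tier or discounts applied; the function name is ours, for illustration only.

```python
# Back-of-the-envelope check of the maintainer's figure.
PRICE_PER_TIB_USD = 6.25   # on-demand rate quoted in the article
BILL_USD = 14_000          # approximate bill reported by the user

TIB = 2**40                # bytes in one tebibyte
PB = 10**15                # bytes in one decimal petabyte


def bytes_for_bill(bill_usd: float, price_per_tib: float = PRICE_PER_TIB_USD) -> float:
    """Return how many bytes a bill of bill_usd pays for at the given rate."""
    return bill_usd / price_per_tib * TIB


scanned = bytes_for_bill(BILL_USD)
print(f"{scanned / TIB:,.0f} TiB, or about {scanned / PB:.2f} PB")
# prints "2,240 TiB, or about 2.46 PB"
```

That works out to 2,240 TiB, which is consistent with the "about 2.5 petabytes" the maintainer cited.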

However, Tim came back into the conversation. He said he was running queries from a Python script with the official GCP libraries, which, unlike the web UI, do not have a mechanism to show costs for a query.

"I think one thing that would help is to highlight people should enable the cost controls prior to running queries, as they are not on by default," he said.
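For readers wondering what such controls might look like from a script, here is a minimal sketch using the google-cloud-bigquery Python client. It relies on two options the client exposes but does not enable by default: a dry run, which reports the bytes a query would process without executing it, and a per-query maximum_bytes_billed cap. The helper names, the $6.25 rate, and the use of default credentials are our assumptions, and this is not necessarily how the user's script was written.

```python
def estimate_cost_usd(total_bytes: int, price_per_tib: float = 6.25) -> float:
    """Convert a byte count into an on-demand cost estimate in US dollars."""
    return total_bytes / 2**40 * price_per_tib


def run_with_cap(sql: str, max_bytes: int):
    """Dry-run a query, print the estimate, then run it with a hard byte cap."""
    # Imported here so the helper above works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project

    # A dry run validates the query and reports bytes without executing it.
    dry_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    dry = client.query(sql, job_config=dry_config)
    print(
        f"Would process {dry.total_bytes_processed:,} bytes, "
        f"roughly ${estimate_cost_usd(dry.total_bytes_processed):,.2f}"
    )

    # With maximum_bytes_billed set, the job fails rather than billing past the cap.
    capped = bigquery.QueryJobConfig(maximum_bytes_billed=max_bytes)
    return client.query(sql, job_config=capped).result()
```

Calling run_with_cap(sql, max_bytes=10 * 2**40) would, under these assumptions, refuse any single query that would bill more than 10 TiB, or about $62.50.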

Tim argued for a circuit-breaker at $5k or less to stop users from running queries unless they manually confirm they want to continue.
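A client-side version of that idea can be sketched in plain Python. The class below is hypothetical and not part of any Google library: it keeps a running total of estimated spend and refuses a query that would cross the limit unless a confirmation callback says to continue.

```python
from typing import Callable, Optional


class BudgetExceeded(RuntimeError):
    """Raised when a query would push spend past the limit without consent."""


class SpendBreaker:
    """Tracks estimated spend and blocks queries once a limit would be crossed."""

    def __init__(
        self,
        limit_usd: float = 5_000.0,
        confirm: Optional[Callable[[float], bool]] = None,
    ) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = 0.0
        # confirm receives the projected total and returns True to continue.
        # Without a callback, the breaker always refuses past the limit.
        self.confirm = confirm or (lambda projected: False)

    def authorize(self, estimated_usd: float) -> None:
        """Record a query's estimated cost, or raise if the limit is crossed."""
        projected = self.spent_usd + estimated_usd
        if projected > self.limit_usd and not self.confirm(projected):
            raise BudgetExceeded(
                f"projected spend ${projected:,.2f} exceeds ${self.limit_usd:,.2f}"
            )
        self.spent_usd = projected
```

A script would call authorize() with each query's estimated cost before submitting it. The sketch only guards against estimates the script itself supplies, so it is no substitute for controls enforced on the billing side.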

One respondent logged on to say that the complainant was an idiot — in a post now hidden by moderators — for running a query without understanding the volume of data it might address. Others may see this as unhelpful.

While Google makes BigQuery's pricing clear on its website, users — particularly students or academics — might arrive at the data from another direction. Maybe a default should be to prevent processing data above a certain threshold unless the user explicitly agrees or they have signed up to a data plan.

The Register has contacted Google for a statement. ®
