Ad-tech Firms Grab Email Addresses From Forms Before They're Even Submitted

Tracking, marketing, and analytics firms have been exfiltrating the email addresses of internet users from web forms prior to submission and without user consent, according to security researchers.

Some of these firms are said to have also inadvertently grabbed passwords from these forms.

In a research paper scheduled to appear at the Usenix '22 security conference later this year, authors Asuman Senol (imec-COSIC, KU Leuven), Gunes Acar (Radboud University), Mathias Humbert (University of Lausanne) and Frederik Zuiderveen Borgesius (Radboud University) describe how they measured data handling in web forms on the top 100,000 websites, as ranked by research site Tranco.

The boffins created their own software to measure email and password data gathering from web forms – structured web input boxes through which site visitors can enter data and submit it to a local or remote application.

Providing information through a web form by pressing the submit button generally indicates the user has consented to provide that information for a specific purpose. But web pages, because they run JavaScript code, can be programmed to respond to events prior to a user pressing a form's submit button.

And many companies involved in data gathering and advertising appear to believe that they're entitled to grab the information website visitors enter into forms with scripts before the submit button has been pressed.

"Our analyses show that users’ email addresses are exfiltrated to tracking, marketing and analytics domains before form submission and without giving consent on 1,844 websites in the EU crawl and 2,950 websites in the US crawl," the researchers state in their paper, noting that the addresses may be unencoded, encoded, compressed, or hashed depending on the vendor involved.
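To illustrate why the encoding matters for detection, here is a minimal Python sketch of how a leak checker might look for an email address in an outgoing request when the address could be plain, URL-encoded, Base64-encoded, or hashed. It is not the researchers' tool: the function names and the choice of variants are assumptions made for illustration, and compression is left out for brevity.

```python
import base64
import hashlib
import urllib.parse
from typing import Dict, List


def email_variants(email: str) -> Dict[str, str]:
    """Return a few encoded or hashed forms of an email address.

    The set of variants is an illustrative assumption; a real detector
    would likely cover many more encodings and hash functions.
    """
    raw = email.encode("utf-8")
    return {
        "plain": email,
        "urlencoded": urllib.parse.quote(email, safe=""),
        "base64": base64.b64encode(raw).decode("ascii"),
        "sha1": hashlib.sha1(raw).hexdigest(),
        "sha256": hashlib.sha256(raw).hexdigest(),
    }


def find_leaks(email: str, payload: str) -> List[str]:
    """Return the names of the variants that appear in a request payload."""
    variants = email_variants(email)
    return [name for name, value in variants.items() if value in payload]


if __name__ == "__main__":
    # Hypothetical tracker URL, used only as sample input.
    request_url = "https://tracker.example/collect?e=user%40example.com"
    print(find_leaks("user@example.com", request_url))
```

A plain substring search for the address would miss the sample request above, because the `@` is percent-encoded. That is why a checker of this kind has to search for every variant rather than the raw string alone.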

Most of the email addresses grabbed were sent to known tracking domains, though the boffins say they identified 41 tracking domains that are not found on any of the popular blocklists.

"Furthermore, we find incidental password collection on 52 websites by third-party session replay scripts," the researchers say.

Replay scripts are designed to record keystrokes, mouse movements, scrolling behavior, other forms of interaction, and webpage contents in order to send that data to marketing firms for analysis. In an adversarial context, they'd be called keyloggers or malware; but in the context of advertising, somehow it's just session-replay scripts.

Gunes Acar, one of the report co-authors, was also the co-author of a similar research project in 2017 that looked at data gathering by session-replay companies Yandex, FullStory, Hotjar, UserReplay, Smartlook, Clicktale, and SessionCam.

Evidently, not much has changed since then, except perhaps that email addresses have become more desirable as unique identifiers now that privacy-oriented browsers like Brave, Firefox, and Safari are taking more steps to block cookies and tracking scripts.

Email addresses, the researchers observe, represent a cookie replacement because they're unique, persistent, and can be used to track people across applications, platforms, and even offline interactions that may be tied to an email address like loyalty card transactions.

The website categories with the most leaking forms include: Fashion/Beauty (11.1 per cent, EU; 19 per cent US); Online Shopping (9.4 per cent EU; 15.1 per cent US); and General News (6.6 per cent EU; 10.2 per cent US).

Websites categorized as Pornography had the best privacy when it comes to surreptitious form data harvesting.

"A somehow surprising result was the following: despite filling email fields on hundreds of websites categorized as Pornography, we have not a single email leak," the researchers say, noting that previous studies have found adult-oriented websites to have relatively fewer third-party trackers than similarly popular general interest websites.

Those pesky regulations

The report authors say that EU websites practicing email exfiltration may be in violation of at least three GDPR requirements: transparency, purpose limitation, and prior consent. Firms found to be violating these rules can be fined up to €20m or 4 per cent of annual worldwide turnover, whichever is higher, per Article 83(5).

The US doesn't have a federal data privacy law, though it's conceivable one of the handful of US states with applicable privacy rules could take action against pre-submission form harvesting. But given the toothlessness of US privacy regulation over the past decade, don't expect much.

The authors say they attempted to contact 58 first-parties and 28 third-parties with GDPR requests. They report receiving 30 responses from the first-parties, which varied from surprise and remediation to justifications of one sort or another.

"fivethirtyeight.com (via Walt Disney’s DPO), trello.com (Atlassian), lever.co, branch.io and cision.com were among the websites that said they had not been aware of the email collection prior to form submission on their websites and removed the behavior," the report says.

Marriott, meanwhile, said the information collected by digital analytics firm Glassbox helps with customer care, technical support, and fraud prevention.

Third-parties Taboola, Zoominfo, and ActiveProspect defended their data collection practices.

Facebook, aka Meta, is among the third-parties involved in this. The researchers say that email addresses or their hashes were spotted being sent to facebook.com from 21 different websites in the EU.

"On 17 of these, Facebook Pixel’s Automatic Advanced Matching feature was responsible for sending the SHA-256 of the email address in a SubscribedButtonClick event, despite not clicking any submit button," the report says.

Advanced Matching – called out recently for harvesting student loan data – is designed to collect hashed customer data, such as email addresses, phone numbers, and names from checkout, sign-in, and registration forms. The researchers speculate that on these sites, Facebook's script treats clicks on non-submit buttons as a click event for the submit button.
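Hashing does not make an email address anonymous, which a short Python sketch can show. The example below assumes the address is trimmed and lowercased before SHA-256 is applied; that normalization step and the function name are assumptions for illustration, not a description of what Facebook's script does.

```python
import hashlib


def hashed_email(email: str) -> str:
    """Return the SHA-256 hex digest of an email address.

    Trimming and lowercasing first is an assumed normalization step,
    chosen here so that trivially different spellings hash alike.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The same address typed on two different sites, with different casing.
    on_site_a = hashed_email("User@Example.com ")
    on_site_b = hashed_email("user@example.com")
    print(on_site_a == on_site_b)  # True
```

Because the same input always yields the same digest, any party that receives the hash from several sites can link those visits to one person without ever seeing the address itself.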

Facebook did not respond to a request for comment.

The report argues that browser vendors, regulators, and privacy tool makers need to deal with this issue because it isn't going away. "Based on our findings, users should assume that the personal information they enter into web forms may be collected by trackers – even if the form is never submitted," the authors conclude. ®
