1.3M comments on net neutrality were likely faked, data expert says

WATCH: Tech expert Marc Saltzman talks about the impacts of getting rid of net neutrality in the U.S. – Jul 12, 2017

New York Attorney General Eric Schneiderman is criticizing the U.S. Federal Communications Commission (FCC), saying it was flooded with fake public comments on net neutrality and did nothing about it.

Schneiderman said an investigation shows hundreds of thousands of fake comments opposing net neutrality were sent to the FCC. A data scientist said the number could actually be more than a million.


This comes in the wake of the FCC’s decision on Tuesday to dismantle net neutrality rules in the U.S., which equalized access to the internet and prevented broadband providers from favouring their own apps and services.

In an open letter to the FCC, Schneiderman said the agency has not provided him with “critical” information for the investigation his office is conducting.


In May 2017, his office analyzed fake comments sent to the FCC that were in favour of getting rid of net neutrality. He said the investigation also found that hundreds of thousands of Americans may have had their identities stolen for this spam campaign.

In June 2017, he contacted the FCC to request certain records related to the comment system. He said his office made the request for logs and other records at least nine times over a span of five months.


“Yet we have received no substantive response to our investigative requests. None,” he wrote.

“Such conduct likely violates state law — yet the FCC has refused multiple requests for crucial evidence in its sole possession that is vital to permit that law enforcement investigation to proceed.”


‘Something fishy going on’

While Schneiderman said there could have been hundreds of thousands of fake comments sent to the FCC about repealing net neutrality, one data scientist said it could actually be more than a million.


Jeff Kao, a software engineer from San Francisco, built a system that analyzed millions of comments submitted to the FCC on the subject. Kao also worked as a summer intern for the FCC in 2010.

He posted the findings Thursday on Medium and said around 1.3 million of the comments sent to the FCC could have been fake — based on the text.

To find this, Kao broke down more than 22-million comments to see which ones were duplicates or unique.
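As a rough illustration of the duplicate-versus-unique split described above, here is a minimal Python sketch. It is not Kao's actual pipeline: the sample comments and the `normalize` and `split_unique_and_duplicates` helpers are hypothetical, and it only catches exact copies after light normalization.

```python
import re
from collections import Counter


def normalize(comment: str) -> str:
    """Lowercase and collapse punctuation/whitespace so trivially different copies match."""
    return re.sub(r"[^a-z0-9]+", " ", comment.lower()).strip()


def split_unique_and_duplicates(comments):
    """Return (texts seen once, {text: count} for texts seen more than once)."""
    counts = Counter(normalize(c) for c in comments)
    unique = [text for text, n in counts.items() if n == 1]
    duplicated = {text: n for text, n in counts.items() if n > 1}
    return unique, duplicated


# Hypothetical sample data, not taken from the FCC docket.
sample = [
    "Please keep net neutrality.",
    "please keep net neutrality",
    "I support repeal of the rules.",
]
unique, duplicated = split_unique_and_duplicates(sample)
```

Exact matching like this would miss comments that were reworded on purpose, which is why the analysis reported in this story also relied on grouping comments by meaning.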


“I found that less than 800,000 of the 22-million comments submitted to the FCC could be considered truly unique,” he said.


He said 1.3-million comments shared similar phrases, such as “Washington bureaucrats,” “unprecedented regulatory power” and “Obama Administration imposed.”

This is because each sentence in the faked comments looks like it was generated by a computer program, he said.

To vary the vocabulary, he said, a mail merge swapped in a synonym for each term, producing comments that sounded unique.
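To show how a mail merge of this kind can turn a handful of phrases into many distinct-looking comments, here is a small Python sketch. The slot phrasings and the `mail_merge` function are hypothetical examples, not the template used in the actual campaign.

```python
import itertools

# Hypothetical template: each slot lists interchangeable phrasings.
SLOTS = [
    ["Americans", "Citizens"],
    ["should be free to choose", "ought to be able to pick"],
    ["the services they want", "the products they prefer"],
]


def mail_merge(slots):
    """Yield every comment formed by picking one phrasing per slot."""
    for choice in itertools.product(*slots):
        yield " ".join(choice) + "."


variants = list(mail_merge(SLOTS))
```

Three slots with two options each already give eight different sentences, and the count multiplies with every added slot or synonym, so a short template can yield a very large number of comments that never match word for word.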

“When laying just five of these side-by-side with highlighting, as above, it’s clear that there’s something fishy going on,” he said.


“But when the comments are scattered among 22 million, often with vastly different wordings between comment pairs, I can see how it’s hard to catch. Semantic clustering techniques, and not typical string-matching techniques, did a great job at nabbing these.”

Out of the 800,000 comments that did not seem to be duplicated, he looked at a random sample of 1,000. He was only able to find three comments that “were clearly pro-repeal of net neutrality.”

Kao said the concept of using fake comments to distort statistics is nothing new. But there are precautions a company can take to avoid them, and the FCC hasn’t taken them, he said.

