1.3M comments on net neutrality were likely faked, data expert says
New York Attorney General Eric Schneiderman is criticizing the U.S. Federal Communications Commission (FCC) after he said it was flooded with fake public comments on net neutrality and did nothing about it.
Schneiderman said an investigation shows hundreds of thousands of fake comments opposing net neutrality were sent to the FCC. A data scientist said the number could actually be more than a million.
This comes in the wake of the FCC’s decision on Tuesday to dismantle net neutrality rules in the U.S., which equalized access to the internet and prevented broadband providers from favouring their own apps and services.
In an open letter to the FCC, Schneiderman said the agency has not provided him with “critical” information for the investigation his office is conducting.
In May 2017, his office analyzed fake comments sent to the FCC that were in favour of getting rid of net neutrality. He said the investigation also found that hundreds of thousands of Americans may have had their identities stolen for this spam campaign.
In June 2017, he contacted the FCC to request certain records related to the comment system. He said his office made the request for logs and other records at least nine times over a span of five months.
“Yet we have received no substantive response to our investigative requests. None,” he wrote.
“Such conduct likely violates state law — yet the FCC has refused multiple requests for crucial evidence in its sole possession that is vital to permit that law enforcement investigation to proceed.”
‘Something fishy going on’
While Schneiderman said there could have been hundreds of thousands of fake comments sent to the FCC about repealing net neutrality, one data scientist said it could actually be more than a million.
Jeff Kao, a software engineer from San Francisco, built a system that analyzed millions of comments submitted to the FCC on the subject. Kao also worked as a summer intern for the FCC in 2010.
He posted the findings Thursday on Medium and said around 1.3 million of the comments sent to the FCC could have been fake, based on an analysis of their text.
To reach that figure, Kao sorted more than 22 million comments into duplicates and unique submissions.
“I found that less than 800,000 of the 22-million comments submitted to the FCC could be considered truly unique,” he said.
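As a rough illustration of what sorting comments into duplicates and unique submissions can involve, here is a minimal Python sketch. It is not Kao's code: the `normalize` and `split_unique_and_duplicated` helpers and the sample comments are hypothetical, and it only catches exact repeats after light cleanup.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace (an assumed cleanup step)."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def split_unique_and_duplicated(comments):
    """Return comments seen once, and a count for each comment seen more than once."""
    counts = Counter(normalize(c) for c in comments)
    unique = [text for text, n in counts.items() if n == 1]
    duplicated = {text: n for text, n in counts.items() if n > 1}
    return unique, duplicated


# Made-up sample comments, for illustration only.
sample = [
    "Keep net neutrality!",
    "keep net  neutrality",
    "Repeal the rules.",
]
unique, duplicated = split_unique_and_duplicated(sample)
```

A count like this would miss comments that were reworded on purpose, which is the case Kao describes next.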
He said 1.3 million comments shared similar phrases, such as “Washington bureaucrats”, “unprecedented regulatory power” and “Obama Administration imposed.”
That is because each sentence in the faked comments looks like it was generated by a computer program, he said.
“They were like Mad Lib comments,” Kao said.
To vary the vocabulary, he said, a mail merge swapped in a synonym for each term to generate unique-sounding comments.
“When laying just five of these side-by-side with highlighting, as above, it’s clear that there’s something fishy going on,” he said.
“But when the comments are scattered among 22 million, often with vastly different wordings between comment pairs, I can see how it’s hard to catch. Semantic clustering techniques, and not typical string-matching techniques, did a great job at nabbing these.”
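The mail-merge pattern Kao describes, and the reason plain string matching misses it, can be sketched in a few lines of Python. This is a toy stand-in, not Kao's semantic clustering: the `SLOTS` phrases are invented for illustration, and the `signature` helper assumes the synonym groups are already known, which a real analysis would have to discover from the text.

```python
import itertools

# Hypothetical template: each slot holds interchangeable phrases.
SLOTS = [
    ["Americans", "Citizens", "People like me"],
    ["oppose", "reject"],
    ["the unprecedented regulatory power", "the sweeping regulatory control"],
    [
        "the Obama Administration imposed on the internet.",
        "imposed on the web by Washington bureaucrats.",
    ],
]


def generate_variants(slots):
    """Build every comment a mail merge could produce from the slots."""
    return [" ".join(choice) for choice in itertools.product(*slots)]


def signature(comment, slots):
    """Replace each known phrase with its slot number so rewordings collapse together."""
    for index, options in enumerate(slots):
        for phrase in options:
            comment = comment.replace(phrase, f"<{index}>")
    return comment


variants = generate_variants(SLOTS)
distinct_strings = set(variants)  # what exact string matching sees
distinct_signatures = {signature(v, SLOTS) for v in variants}  # what grouping by meaning sees
```

Every generated comment is a different string, so an exact-match check treats them all as unique, while grouping by which slot each phrase fills puts them in a single cluster.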
Out of the 800,000 comments that did not seem to be duplicated, he looked at a random sample of 1,000. He was only able to find three comments that “were clearly pro-repeal of net neutrality.”
Kao said the concept of using fake comments to distort statistics is nothing new. But there are precautions an organization can take to guard against them, and the FCC hasn’t taken them, he said.
© 2017 Global News, a division of Corus Entertainment Inc.