Majority of COVID Surveys About Infection, Testing, Vaccination, Treatment 'Classified as Fraud': Peer-Reviewed Journal 'JMIR Formative Research'
A "multilayer fraud detection strategy classified a total of 4,722 (59.40%) entries as fraud," the study confirms.
A new study published Friday in the peer-reviewed journal JMIR Formative Research found that 59.40% of the completed responses to a web-based survey about individuals’ COVID-19 attitudes, beliefs, and behaviors were classified as fraudulent.
Such surveys are common in COVID-related studies that mainstream health officials and media outlets draw on.
The study authors looked at a total of 7,950 completed COVID-related survey responses.
Participants “were asked to complete a 20-minute questionnaire about their experiences, behaviors, and beliefs about COVID-19, risk of infection, testing, vaccination, treatment, and knowledge and beliefs about COVID-19 clinical trials,” the study explains.
‘Multilayer’ Fraud Detection Method
Applying the first stage of their “multilayer” fraud detection method, the researchers found that 4,207 (52.92%) of these entries “were classified as fraud.”
Of those classified as fraudulent:
1,242 (29.52%) reported a neighborhood name that did not match their residential address
648 (15.40%) provided an invalid residential address
1,397 (33.21%) displayed rapid survey submission
42 (1.00%) used a repeated email address
77 (1.83%) reported a nonstandard zip code
398 (9.46%) reported a residential address that was used more than twice
403 (9.58%) did not have a valid recruitment URL
Only 3,743 (47.08%) of the 7,950 cases remained classified as valid at this stage.
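The first-stage figures can be checked with simple arithmetic. The following is a minimal Python sketch, not code from the study: the dictionary labels and variable names are illustrative, and only the counts are taken from the figures reported above.

```python
# Counts per first-stage fraud indicator, as reported above.
# The short labels are illustrative, not the study's own variable names.
first_stage = {
    "neighborhood_mismatch": 1242,
    "invalid_address": 648,
    "rapid_submission": 1397,
    "repeated_email": 42,
    "nonstandard_zip": 77,
    "address_used_more_than_twice": 398,
    "invalid_recruitment_url": 403,
}

TOTAL_RESPONSES = 7950  # completed survey responses examined


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


flagged = sum(first_stage.values())    # entries classified as fraud
remaining = TOTAL_RESPONSES - flagged  # entries still classified as valid

# Each indicator's share is computed against the flagged entries,
# not against all 7,950 responses.
shares = {label: pct(n, flagged) for label, n in first_stage.items()}

print(flagged, pct(flagged, TOTAL_RESPONSES))
print(remaining, pct(remaining, TOTAL_RESPONSES))
for label, share in shares.items():
    print(label, share)
```

The seven counts add up to exactly 4,207, which suggests that each flagged entry was counted under a single indicator, and the percentages in the list are shares of those 4,207 entries rather than of all 7,950 responses.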
The fraud might include individuals misrepresenting themselves in order to appear eligible for a study.
Individuals might also submit duplicate surveys in order to receive multiple incentive payments.
Fraudulent data may also come from “automated operations enacting fraud at a large scale, often referred to as ‘bots.’”
The authors explained that such methods of fraud are “often used to target surveys offering participation compensation payments and can be lucrative when aimed at large web-based surveys, even those offering small payments.”
They emphasized the risk that fraud poses to the quality of COVID survey data:
“Such fraud poses risks not only to research resources but also, importantly, to the integrity of research findings, as fraudulent data can distort results and undermine data quality,” they write. “Specifically, fraudulent responses can introduce additional random noise or potentially add systematic bias to the data.”
However, even among the 3,743 cases initially classified as valid, further indicators of fraud appeared:
1,561 (41.70%) cases had a duplicate response in the free text entry item
394 (10.53%) cases had an IP address from outside the United States or from a virtual private network
and 619 (16.54%) had inconsistencies between the screener and main survey on at least 1 key item
Using their “2-strike” rule, the authors classified an additional 515 (13.76%) of these 3,743 responses as fraud.
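The article does not spell out how the “2-strike” rule works. A single indicator evidently was not enough on its own, since 1,561 cases had a duplicate free-text response but only 515 were reclassified. The Python sketch below shows one plausible reading, assuming a response counts as fraud when it triggers at least two of the three second-stage indicators. The class, field names, and threshold are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical field names for the three second-stage indicators.
    duplicate_free_text: bool
    foreign_or_vpn_ip: bool
    screener_inconsistency: bool


def strikes(r: Response) -> int:
    """Count how many of the three indicators a response triggers."""
    return sum([r.duplicate_free_text, r.foreign_or_vpn_ip, r.screener_inconsistency])


def is_fraud_two_strike(r: Response, threshold: int = 2) -> bool:
    # ASSUMPTION: "2-strike" is read here as "two or more indicators".
    # The study's exact definition may differ.
    return strikes(r) >= threshold


# Made-up example responses, for illustration only.
examples = [
    Response(True, False, False),  # one strike
    Response(True, True, False),   # two strikes
    Response(True, True, True),    # three strikes
]
for r in examples:
    print(strikes(r), is_fraud_two_strike(r))
```

Under this reading, a response with one suspicious trait stays in the valid pool, while two or more traits together move it to the fraud pool.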
“Thus, our multilayer fraud detection strategy classified a total of 4722 (59.40%) entries as fraud” and only “3228 (40.60%) entries as valid,” the authors confirm.
‘Qualtrics’ Fraud Detection Method
The study authors then applied a second, “Qualtrics” fraud detection method to the same 7,950 responses, which identified:
498 (6.26%) cases that failed bot detection (reCAPTCHA)
2,776 (34.92%) cases as fraud by the “RelevantID FraudScore”
and 938 (11.80%) cases as duplicates by the RelevantID DuplicateScore
Ultimately, the Qualtrics fraud detection strategy classified a total of 3,561 (44.79%) entries as fraud and only 4,389 (55.21%) entries as valid.
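The totals for the two strategies can be recomputed from the reported counts. This is a minimal Python sketch for checking the arithmetic, not code from the study; the function and variable names are illustrative.

```python
TOTAL_RESPONSES = 7950  # completed survey responses examined


def summarize(fraud: int, total: int = TOTAL_RESPONSES) -> dict:
    """Return counts and percentage shares for a fraud/valid split."""
    valid = total - fraud
    return {
        "fraud": fraud,
        "fraud_pct": round(100 * fraud / total, 2),
        "valid": valid,
        "valid_pct": round(100 * valid / total, 2),
    }


# Multilayer strategy: first-stage flags plus the "2-strike" additions.
multilayer = summarize(4207 + 515)

# Qualtrics strategy: total as reported.
qualtrics = summarize(3561)

# Sum of the three Qualtrics indicator counts (reCAPTCHA, FraudScore, DuplicateScore).
qualtrics_flags = 498 + 2776 + 938

# Difference in the number of entries classified as fraud.
gap = multilayer["fraud"] - qualtrics["fraud"]

print(multilayer)
print(qualtrics)
print(qualtrics_flags, gap)
```

The three Qualtrics indicator counts add up to 4,212, more than the 3,561 entries classified as fraud, which suggests that some entries triggered more than one indicator. On these figures, the multilayer strategy classified 1,161 more entries as fraud than the Qualtrics strategy did.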
The study authors’ affiliations include:
School of Nursing, University of Pennsylvania, Philadelphia, PA, United States
Department of Psychology, Ashoka University, Sonepat, India
Read the full study here.