How to Counter Fraud in Market Research?

June 10, 2021

Fraud and misrepresentation in reporting are common, and they erode transparency in research. If proper checks are not in place, such loopholes can be catastrophic in the decision-making phase, which relies on the data gathered during the research process.

In Pakistan, research ethics and professionalism are sold cheap. No standards are in place to guide practitioners or to curb violations of research basics. Data is contaminated with biased, wrong, and irrelevant responses, and little effort is made to ensure data verifiability and authenticity. By one estimate, the fraud rate across research activities runs at 25–30%.

Reviewing data for conformity to objectives at each level can mitigate unwanted practices, but this is impractical when the scope of a study is too large.

Some common types of survey fraud

  • The same survey is completed multiple times by one respondent – duplicate respondent error

Requiring a unique IP address and linking submissions to email accounts can minimize this error; using a single platform to collect responses also helps ensure unique submissions. At times the duplication is accidental: accidental “fraud” occurs when survey participants are not aware they are taking the same survey multiple times, which happens when online surveys do not block duplicate entries. Another source of accidental duplication is a respondent who belongs to multiple online panels that are all sending out the same survey. In these cases the respondents are not actually committing fraud; they are simply well-meaning people who accidentally participate more than once in the same survey.
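
Blocking duplicate respondents programmatically is straightforward once each submission carries an identifier. The sketch below is a minimal, hypothetical example (the `email` and `ip` field names are assumed, not taken from any particular survey platform) that keeps only the first submission per respondent:

```python
# Hypothetical survey submissions; field names are illustrative.
responses = [
    {"email": "a@example.com", "ip": "1.2.3.4", "answer": "Yes"},
    {"email": "b@example.com", "ip": "5.6.7.8", "answer": "No"},
    {"email": "a@example.com", "ip": "1.2.3.4", "answer": "Yes"},  # duplicate
]

def deduplicate(rows):
    """Keep only the first submission per (email, ip) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["email"], row["ip"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

clean = deduplicate(responses)
```

In practice the key would be whatever identifier the platform can trust; email plus IP is a common but imperfect compromise.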

On the other hand, there is the group committing intentional fraud: a survey participant deliberately tries to complete a survey multiple times, or provides inaccurate answers to survey questions. The motive is either to alter the results of the survey or to reap extra financial incentives. These individuals routinely join and participate in as many online panels as they can. They reset browser cookies and history after each survey attempt, make common use of “incognito” modes, and use VPN services (commercial and free) to circumvent geographic restrictions. Online surveys that offer bigger incentives tend to attract more intentional fraud. These are individuals you definitely do not want in your online surveys.

Minimizing Fraud – Protection Mechanisms

  • Geo-tagging is used when research targets, such as households or retail branches, may not actually have been visited, or when a visit is in doubt. The first component of a fraud defense is geo-location. Geo-location can be provided by the device itself (mobile devices, etc.), by a data point created by survey participants when signing up, or via the computer's IP address. Geo-location results are generally trusted and acceptable at the country level.
  • Another widely used method that is used to track survey participants is a device/browser fingerprint. This “digital” fingerprint is built up from components and properties of the browser, such as fonts installed, plugins registered, screen size, color depth, and many other variables that uniquely identify a computer, tablet, or smartphone.
  • Another strategy is to put questions (i.e., cheater traps) in an online survey with nonsensical answers that might trap a participant who is rushing through a survey.
  • Cookies are also used to track respondents across and within online surveys. Cookies are widely used to prevent duplicate survey entries and are easy to implement.
  • To reach the right audience and reduce the chances of fraud, random call-backs are made to assess the authenticity of online and offline surveys.
  • At the same time, Ace Research has devised one of the most comprehensive quality-check processes, which ensures the integrity and accuracy of data at all levels and minimizes human error by putting a review mechanism in place.
  • Cross-checking all reports and submissions against the underlying instructions shared with shoppers and surveyors – with a zero-tolerance policy for any deviation – also helps mitigate fraud.
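
The device/browser fingerprint described above is essentially a stable hash over whatever properties the browser exposes. A minimal sketch, assuming the properties have already been collected into a dictionary (the property names here are illustrative, not a real fingerprinting library's API):

```python
import hashlib

def fingerprint(properties):
    """Combine device/browser properties into one stable digest.

    Sorting the keys makes the digest independent of the order in
    which the properties were collected.
    """
    canonical = "|".join(f"{k}={properties[k]}" for k in sorted(properties))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Same device reported with properties in a different order:
device_a = {"fonts": "Arial,Calibri", "screen": "1920x1080", "plugins": "pdf"}
device_b = {"screen": "1920x1080", "plugins": "pdf", "fonts": "Arial,Calibri"}
```

Two submissions sharing a fingerprint are then strong candidates for the duplicate-respondent checks above.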

Other ways to minimize fraudulent responses

Two less common and harder-to-interpret components of a fraud defense are survey metadata and open-end verbatim analysis. Survey metadata includes details such as how long participants took to complete a particular survey or question, how long they spent writing an answer to an open-end question, or how many times they changed their answers. Determining what constitutes fraud from metadata therefore requires some human legwork and tends to vary survey by survey.
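
Flagging "speeders" from completion-time metadata can start with a simple threshold check. A minimal sketch; the 60-second cutoff is purely illustrative and should be tuned survey by survey, as the paragraph above notes:

```python
def flag_speeders(durations_sec, min_seconds=60):
    """Return the indices of respondents who finished suspiciously fast.

    min_seconds is an assumed, per-survey threshold - not a universal rule.
    """
    return [i for i, d in enumerate(durations_sec) if d < min_seconds]

# Completion times (in seconds) for five hypothetical respondents:
times = [312, 45, 280, 12, 405]
suspects = flag_speeders(times)
```

Flagged records still need the human review described above before being discarded.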

The same is true of the analysis of open-end questions. While sentiment analysis is commonplace now, it is fraught with misclassifications for anything but the simplest of cases. Going a step further and performing more sophisticated analysis of open-ends to identify fraudulent responses is even more difficult. However, there are some simple things that can be done to help check for fraudulent open-ends. Checking for gibberish, repetitive answers, and off-topic responses can help identify fraud.
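
Those simple open-end checks can be automated with crude heuristics. The sketch below flags very short answers, near-vowelless gibberish, and one-word-repeated responses; the thresholds are illustrative assumptions, not validated cutoffs:

```python
def looks_fraudulent(text):
    """Crude screen for gibberish or repetitive open-end answers."""
    stripped = text.strip().lower()
    if len(stripped) < 3:          # too short to be a real answer
        return True
    # Gibberish heuristic: keyboard-mashing has almost no vowels.
    vowels = sum(c in "aeiou" for c in stripped)
    if vowels / len(stripped) < 0.15:
        return True
    # Repetition heuristic: the same word typed over and over.
    words = stripped.split()
    if len(words) >= 3 and len(set(words)) == 1:
        return True
    return False
```

Anything flagged here should go to a human reviewer rather than being dropped automatically, since legitimate answers can trip simple heuristics.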

How to Ensure Proper Data Cleaning in Excel?

May 11, 2020

In research, the emphasis is on report writing, because a good report comprehensively explains every stage, along with the relevant outcomes and a valuable way forward.

Reports are the product of data gathered from primary or secondary sources, so it is very important for the data to be authentic, reliable, and up to date. To ensure reliability, the data must be processed to remove errors and mistakes: before we can work with our data, we need to make sure it is valid, accurate, and reliable.

In the age of Big Data, companies may spend as much on maintaining the health of their data, and on cleaning it, as they spend on collecting it in the first place. Consider the issues that can stem from missing or wrong values, duplicates, and typos. The validity, accuracy, and reliability of your calculations depend on your ability to keep your data clean and up to date, as Ace Research's own projects show.

To prepare data for later analysis, it is important to have a clean data table. Depending on the origin of the data, you may need to take some of the following steps to ensure that the data are as complete and consistent as possible.

  1. Assign unique codes to your fields

Unique codes are very useful when sorting and cleaning data: if trouble arises at any stage, you can retrieve the relevant records from your database using the unique codes already assigned to the data set.
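
Assigning such codes can itself be automated so that every record gets one before cleaning starts. A minimal Python sketch (the `RSP` prefix and field names are arbitrary examples):

```python
def assign_codes(records, prefix="RSP"):
    """Attach a unique, zero-padded code to each record so rows can
    always be traced back after sorting or splitting the data."""
    return [
        {**rec, "code": f"{prefix}-{i:04d}"}
        for i, rec in enumerate(records, start=1)
    ]

rows = [{"name": "Ali"}, {"name": "Sara"}]
coded = assign_codes(rows)
```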

  2. Maintain separate sheets if you are working on a huge data set

Often the data set is very large and you cannot work on all of it at once, so it is preferable to maintain a separate file for each change you make. This helps when you need to refer back in case you missed anything at any step.

  3. Get rid of extra spaces

Extra spaces are painfully difficult to spot. You may somehow spot extra spaces between words or numbers, but trailing spaces are not even visible. Here is a neat way to get rid of them:

The Excel TRIM function takes a cell reference (or text) as input. It removes leading and trailing spaces as well as additional spaces between words (leaving single spaces).
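
Outside Excel, the same behavior can be approximated in a few lines. A Python sketch (note one difference: unlike Excel's TRIM, which removes only spaces, `strip()` also removes tabs and newlines at the ends):

```python
import re

def trim(text):
    """Mimic Excel's TRIM: drop leading/trailing spaces and collapse
    runs of spaces between words down to a single space."""
    return re.sub(r" +", " ", text.strip())

cleaned = trim("  John   Doe ")
```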

  4. Select and treat all blank cells

Blank cells can create havoc if not treated beforehand. We often face issues with blank cells in a data set that is used to create reports.

You may want to fill all blank cells with ‘0’ or ‘Not Available’, or may simply want to highlight them. If there is a huge data set, doing this manually could take hours. Thankfully, there is a way to select all the blank cells at once.

  1. Select the entire data set
  2. Press F5 (this opens the Go to dialogue box)
  3. Click on the Special button (at the bottom left).
  4. This opens the Go To Special dialogue box
  5. Select Blank and Click OK

This selects all the blank cells in your data set. If you want to enter 0 or Not Available in all these cells, just type it and press Control + Enter (remember if you press only enter, the value is inserted only in the active cell).
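
The same fill-the-blanks step is easy to express in code once the data leaves Excel. A minimal Python sketch that replaces empty cells in one column with a placeholder (the column values are made up):

```python
def fill_blanks(column, placeholder="Not Available"):
    """Replace empty cells (None or empty string) with a placeholder,
    mirroring the Control + Enter trick above."""
    return [placeholder if v in (None, "") else v for v in column]

sales = ["120", "", None, "85"]
filled = fill_blanks(sales, placeholder="0")
```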

  5. Remove duplicates

There are two things you can do with duplicate data – highlight it or delete it.

Highlight Duplicate Data:

Select the data and Go to Home – Conditional Formatting – Highlight Cells Rules – Duplicate Values.

Specify the formatting and all the duplicate values get highlighted.

Delete Duplicates in Data: 

  • Select the data and Go to Data – Remove Duplicates.
  • If your data has headers, ensure that the checkbox at the top right is checked.
  • Select the Column(s) from which you want to remove duplicates and click OK.

This removes duplicate values from the list.

If you want the original list intact, copy-paste the data at some other location and then do this.
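
If you are scripting this step instead of using the ribbon, the same keep-first-occurrence logic looks like this in plain Python (the column names are illustrative). The input list is left intact, mirroring the copy-paste advice above:

```python
def remove_duplicates(rows, columns):
    """Drop rows that repeat the same values in the given columns,
    keeping the first occurrence; the input list is not modified."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result

data = [
    {"city": "Lahore", "sales": 10},
    {"city": "Karachi", "sales": 7},
    {"city": "Lahore", "sales": 10},
]
unique = remove_duplicates(data, ["city", "sales"])
```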

  6. Highlight errors

There are two ways you can highlight errors in data in Excel:

Using Conditional Formatting

  • Select the entire data set
  • Go to Home – Conditional Formatting – New Rule
  • In New Formatting Rule Dialogue Box select ‘Format Only Cells that Contain’
  • In the Rule Description, select Errors from the drop-down
  • Set the format and click OK. This highlights any error value in the selected dataset

Using Go To Special

  • Select the entire data set
  • Press F5 (this opens the Go To Dialogue box)
  • Click on Special Button at the bottom left
  • Select Formulas and uncheck all options except Errors

This selects all the cells that contain an error. You can now highlight them manually, delete them, or type something into them.
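
When the data is processed outside Excel, "cells with errors" usually means cells that fail to parse. A Python sketch that reports the positions of non-numeric cells in a column (the sample values, including the `#DIV/0!` remnant, are illustrative):

```python
def find_error_cells(column):
    """Return the positions of cells that cannot be read as numbers,
    analogous to spotting error values in an Excel column."""
    bad = []
    for i, cell in enumerate(column):
        try:
            float(cell)
        except (TypeError, ValueError):
            bad.append(i)
    return bad

values = ["10", "20", "#DIV/0!", "30", None]
errors = find_error_cells(values)
```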

  7. Change text to lower/upper/proper case

When you import data from text files, often the names or titles are not consistent. Sometimes all the text could be in lower/upper case or it could be a mix of both. You can easily make it all consistent by using these three functions:

  • LOWER() – Converts all text into lower case.
  • UPPER() – Converts all text into upper case.
  • PROPER() – Converts all text into proper case.
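
The Python equivalents, for when the same clean-up happens outside Excel (`str.title()` is a close but not exact approximation of PROPER()):

```python
name = "aCE reSEARCH"
lower = name.lower()    # like LOWER()
upper = name.upper()    # like UPPER()
proper = name.title()   # close to PROPER()
```
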
  8. Parse data using Text to Columns

When you get data from a database or import it from a text file, it may happen that all the text is cramped in one cell. You can parse this text into multiple cells by using Text to Column functionality in Excel.

  • Select the data/text you want to parse
  • Go to Data – Text to Columns (this opens the Text to Columns Wizard)

Step 1: Select the data type (choose Delimited if your data is not equally spaced and is separated by characters such as a comma, hyphen, or dot). Click Next.

Step 2: Select the delimiter (the character that separates your data). You can choose a pre-defined delimiter or specify anything else using the Other option.

Step 3: Select the data format and the destination cell. If no destination cell is selected, the current cells are overwritten.
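
The same parse can be done in code: splitting on a delimiter and trimming each piece mirrors the wizard's Delimited path. A minimal Python sketch (a comma delimiter and made-up cell contents are assumed):

```python
def text_to_columns(cell, delimiter=","):
    """Split one crammed cell into separate column values, stripping
    stray spaces around each piece."""
    return [part.strip() for part in cell.split(delimiter)]

row = text_to_columns("Ali, Lahore, 35")
```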

  9. Spell check

Nothing lowers the credibility of your work more than a spelling mistake.

Use the keyboard shortcut F7 to run a spell check for your data set in Excel.

  10. Delete all formatting

In my job, I used multiple databases to pull data into Excel, and every database had its own formatting. Once you have all the data in place, here is how you can delete all the formatting in one go:

  • Select the data set
  • Go to Home – Clear –Clear Formats

Similarly, you can also clear only the comments, hyperlinks, or content.

  11. Use Find and Replace to clean data in Excel

Find and Replace is indispensable when it comes to data cleansing. For example, you can select and remove all zeros, change references in formulas, find and change formatting, and so on.
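
As a final illustration, the same find-and-replace pass over a column is a one-liner in code. A Python sketch (the product codes are made up for the example):

```python
def find_and_replace(column, find, replace):
    """Replace every occurrence of 'find' inside each cell of a column."""
    return [cell.replace(find, replace) for cell in column]

codes = ["PK-001", "PK-002", "IN-003"]
updated = find_and_replace(codes, "PK-", "PAK-")
```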