Tuesday, 1 July 2014

Online Survey? You are doing it WRONG!

I am sure many of you have conducted a survey at some point in your life. Many students love to use survey results in their projects to substantiate their stand or to identify trends pertaining to certain issues. With a properly designed and sampled survey, this is certainly viable. The problem is that conducting a proper survey requires so many resources that it is almost impossible, and impractical, for most students to do so. Most students take the easy way out and conduct an online survey because it is convenient. The end result is a bunch of numerical values forced incorrectly into statistics. I have even seen an intern from a certain national body recruiting online survey respondents through Facebook. Since then, I have had serious doubts about the validity of the reports and findings produced by that particular organization.

The ability of a small sample to reflect the opinions of the larger target population does not come about haphazardly. It requires a proper probabilistic sampling method. Otherwise, even a simple statistic such as the average of the responses is wrong, because it implicitly assumes that everyone in the target population has an equal probability of responding to the survey, which is certainly untrue given the way most students conduct their surveys.
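To see how badly a plain average can go wrong, here is a minimal simulation sketch in Python. Everything in it is a made-up assumption for illustration: the population of 100,000 people, the 1-to-5 rating scale, and especially the response probabilities, which simply assume that people with strong opinions are more likely to volunteer.

```python
import random


def simulate(seed=0, population_size=100_000, sample_size=1_000):
    """Compare the mean of a random sample with the mean of a volunteer sample."""
    rng = random.Random(seed)

    # Hypothetical population: each person rates a policy from
    # 1 (strongly against) to 5 (strongly for), all ratings equally common.
    population = [rng.choice([1, 2, 3, 4, 5]) for _ in range(population_size)]
    true_mean = sum(population) / population_size

    # Probability sample: every person has the same chance of being picked.
    random_sample = rng.sample(population, sample_size)
    random_mean = sum(random_sample) / len(random_sample)

    # Self-selected sample: ASSUMED response probabilities per rating.
    # People with strong (especially negative) opinions respond more often.
    response_prob = {1: 0.10, 2: 0.04, 3: 0.01, 4: 0.02, 5: 0.03}
    volunteers = [r for r in population if rng.random() < response_prob[r]]
    volunteer_mean = sum(volunteers) / len(volunteers)

    return true_mean, random_mean, volunteer_mean


true_mean, random_mean, volunteer_mean = simulate()
print(f"Population mean:      {true_mean:.2f}")
print(f"Random sample mean:   {random_mean:.2f}")
print(f"Volunteer sample mean: {volunteer_mean:.2f}")
```

Under these assumptions the volunteer sample is actually larger than the random sample, yet its average lands much further from the population average. Collecting more self-selected responses does not fix the problem.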

Self-Selected Sample 

With the advent of forums, blogs and social media, one of the most common ways for students to conduct such surveys is through these online platforms. I mean, why wouldn't they? With the help of the countless free survey tools available online, it is extremely easy for anyone to set up a questionnaire and distribute it online to gather responses. Most of these tools are easy to use, often producing beautiful graphs and charts at the click of a button, and most importantly, they are convenient.

The main problem lies in the fact that the respondents are self-selected. These volunteers are likely to have stronger opinions on the survey topic, which is what compelled them to do the survey in the first place. If you conduct the survey through social media such as Facebook by asking your friends to fill it in, you may also over-represent people who belong to a very similar demographic group.

As an illustrative example, in 1993, shortly after Bill Clinton became president of the United States, a TV station in Sacramento, California conducted a TV poll seeking viewers’ opinions on the question: “Do you support the president’s economic plan?” Coincidentally, the result of a properly conducted survey asking the same question was published around that time, with the following results:

[Table: responses to the question for the Self-Selected Sample (TV Poll) and the Random Sample, including a ‘Not Sure’ row]

As shown in the table, the two results contradict each other. In the TV poll, the majority of viewers did not support the president’s economic plan, while in the proper survey, 3 in 4 respondents supported it. In addition, none of the respondents in the TV poll was ‘Not Sure’, which highlights the tendency of self-selected respondents to hold stronger opinions.

In summary, a survey that relies on a self-selected sample is a complete waste of time. The result is meaningless or, worse, misleading. The only accurate information you can derive from such a survey is probably the number of people who bothered to respond.


Even if, for some reason, you manage to get a proper sample (e.g. in a very simple survey where your target population consists only of your class), there are many other factors that need serious consideration but are often ignored. One such pitfall is the wording of the questions used in the survey.
Building a questionnaire may seem like a simple task at first, but this couldn't be further from the truth. The way you phrase a question and the order in which respondents answer the questions can have unintended consequences for the resulting responses.
In an experiment conducted by Loftus and Palmer, video footage of automobile accidents was shown to groups of college students. One group was then asked the question: “About how fast were the cars going when they contacted each other?” Another group was asked: “About how fast were the cars going when they smashed into each other?” The average responses of these 2 groups were 51.2 km/h and 65.7 km/h respectively. Simply changing one word in the question led to a very different result.
The ordering of questions can also skew the responses in a survey. For example, if a survey asks “How often do you dine at KFC?” and then asks “Name the top 5 fast-food restaurants you think teenagers visit most frequently”, it is quite likely that respondents will include KFC in their answer to the latter question.
These two examples are definitely not an exhaustive list of the potential pitfalls one can encounter when designing a questionnaire. There are many other considerations researchers should be aware of in order to create a good survey. However, I have no desire to turn this blog post into an entire chapter, so I will skip them. For those who are interested, you may read more about them here

Characteristics of a proper survey

So how do we conduct a proper survey? While each survey should be customized according to the objective of the study, a good survey necessarily shares the following characteristics:

1.       Well-defined target population
Who are the people, or what are the objects, you are trying to make inferences about? Defining the target population also allows us to identify a potential sampling frame.

2.       A sampling frame and sampling method
Sampling frames are lists intended to identify all elements of the target population. However, a perfect sampling frame that contains every element in the target population can be prohibitively expensive or may not even exist. Therefore, we often have to make do with imperfect frames, which may miss some elements of the target population or include ineligible elements.
After obtaining the sampling frame, we need a probabilistic method of selecting the sample from the frame in order for the survey result to be statistically valid. In general, each element in the frame is assigned a known probability of selection, and a sample of predetermined size is drawn according to these probabilities. The simplest approach is Simple Random Sampling, where every element in the frame has the same probability of selection.

3.       Questions carefully phrased to avoid bias, with a pilot study to ensure the questions yield the desired information
As mentioned previously, wording can bias the responses collected. Therefore, care must be taken to ensure the questions in the survey have minimal bias. If need be, a pilot study should be conducted to evaluate the questionnaire’s effectiveness in eliciting the desired information.

4.       Properly trained surveyors
If the survey requires surveyors to collect responses through face-to-face or telephone interviews, adequate training should be provided in order to minimize the surveyors' influence on the respondents.

5.       Ensuring a good response rate
In a sample survey of humans, it is almost impossible to obtain a 100% response rate. Some people in the sample cannot be reached, while others simply refuse to be interviewed. People who respond to a survey could be very different from people who don’t, and this difference can bias the survey result. Therefore, measures should be taken to minimize non-response.
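
As a rough sketch of points 2 and 5, the Python snippet below draws a simple random sample from a sampling frame and then computes the response rate. The frame of 40 student IDs, the helper names simple_random_sample and response_rate, and the pretend list of respondents are all hypothetical and only serve as an illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements so that every element is equally likely to be picked."""
    if not 0 <= n <= len(frame):
        raise ValueError("sample size must be between 0 and the frame size")
    rng = random.Random(seed)
    return rng.sample(frame, n)


def response_rate(sampled, responded):
    """Share of the sampled elements that actually responded."""
    sampled = set(sampled)
    if not sampled:
        raise ValueError("the sample is empty")
    return len(sampled & set(responded)) / len(sampled)


# Hypothetical sampling frame: a class register of 40 students.
frame = [f"student_{i:02d}" for i in range(1, 41)]

# Select 10 students at random; the seed only makes the draw repeatable.
sample = simple_random_sample(frame, 10, seed=42)

# Pretend that only 7 of the 10 selected students answered the survey.
respondents = sample[:7]
rate = response_rate(sample, respondents)
print(f"Selected: {sample}")
print(f"Response rate: {rate:.0%}")
```

Note that the sample is drawn from the frame first and respondents are chased afterwards. Replacing the students who did not answer with whoever is willing to respond would quietly turn the sample back into a self-selected one.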

After reading this post, I hope you realize that designing a proper survey requires a lot of effort and resources. The way you have been conducting surveys is most probably wrong, and the results are nothing but a pile of meaningless numbers. As a student, you may be bound by the syllabus to churn out ‘surveys’, but as a responsible user and consumer of data, you should dump this stupid practice of conducting surveys with no statistical basis into the garbage bin where it belongs.
