Sampling Methods and Zombies


If the entire population you’re studying is small enough, you might be able to include every member (or close enough) of the population in your study. This is known as a census study. Usually it isn’t feasible to get everyone, so you have to take a sample that reflects the characteristics of the target population from which the sample is drawn.

All sampling methods you use should be Volunteer Sampling i.e. participants willingly take part in the study. Participants can take part in a study out of the goodness in their heart or they can be paid or incentivised in some way. There are exceptions where a researcher uses participants who don’t take part in a study willingly but such studies make it much harder for a researcher to obtain ethical approval.

General sampling methods can be divided into two categories: probability sampling and non-probability sampling. In probability sampling, each member of the target population has a known (and higher than zero) probability of being recruited. Many statistical tests presume that the research sample was chosen on a random basis. In non-probability sampling, participants are selected in some non-random fashion.

Examples of Probability Sampling

Random Sampling, every member of the target population has an equal chance of being selected. Individual members of the target population should be similar to each other in terms of relevant variables (as identified by your given study) which could influence the study e.g. all women/men, all students, all dogs, all middle income earners, etc. When you’re selecting who to include in your study you can assign every possible participant ID numbers and then use a ‘random number generator’ to select participants e.g. http://www.mathgoodies.com/calculators/random_no_custom.html
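If you’d rather not rely on a website, the ID-number approach above can be sketched in a few lines of Python. This is only a minimal sketch: the population of 200 IDs, the sample size of 20 and the function name simple_random_sample are made up for illustration.

```python
import random

def simple_random_sample(participant_ids, sample_size, seed=None):
    """Draw sample_size IDs without replacement; every ID is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(list(participant_ids), sample_size)

# Hypothetical target population: 200 potential participants with IDs 1..200.
population_ids = range(1, 201)
chosen = simple_random_sample(population_ids, sample_size=20, seed=42)
print(sorted(chosen))
```

Recording the seed alongside your results lets someone else reproduce exactly the same selection.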

Systematic sampling (also known as Nth name selection technique), after the required sample size is established, members of the target population are selected from a list at fixed intervals e.g. every 3rd person on the list, every 100th person on the list, etc. As with random sampling, individual members of the target population should be similar to each other in terms of relevant variables which could influence the study.
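Here is a rough sketch of the fixed-interval idea in Python. The interval is worked out as population size divided by sample size, and the list of 100 IDs is hypothetical. Picking the starting point at random is my own assumption (it is a common convention, not something required by the description above).

```python
import random

def systematic_sample(population_list, sample_size, seed=None):
    """Select every k-th member of the list, where k = N // n."""
    if not 0 < sample_size <= len(population_list):
        raise ValueError("sample_size must be between 1 and the list length")
    interval = len(population_list) // sample_size
    rng = random.Random(seed)
    start = rng.randrange(interval)  # assumed: random start within the first interval
    # Step through the list at fixed intervals and trim any extra pick.
    return population_list[start::interval][:sample_size]

# Hypothetical list of 100 potential participants, identified by number.
population = list(range(1, 101))
picked = systematic_sample(population, sample_size=10, seed=1)
print(picked)
```

If the list itself is ordered in some repeating pattern, fixed intervals can pick up that pattern, so it is worth checking how the list was put together.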

Random and systematic sampling both carry the risk of not reflecting the target population e.g. 30% of the target population might be children and 70% adults, yet random or systematic sampling could result in a participant sample comprising something like 90% children and 10% adults. In this example, the proportional size differences between target population subgroups do not match the proportional size differences between sample subgroups.

Stratified sampling, the target population is divided into subgroups. Each sample of participants corresponds to a subgroup. Only people (potential participants) who match the given criteria of their subgroup can be recruited into that subgroup e.g. males go into the male group, females go into the female group. Random or systematic sampling is usually used to select potential participants for each subgroup. The proportional size differences between the participant samples should match those between the target population subgroups. Stratified sampling is usually used when you have a mixed target population (e.g. different age groups, ethnicities or genders within a population).
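To make the proportions concrete, here is a minimal sketch using the 30% children / 70% adults split mentioned earlier. The population of 1,000 made-up IDs and the function name stratified_sample are assumptions for illustration; random sampling is used inside each subgroup.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Sample each subgroup in proportion to its share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = len(members) / population_size
        # Rounding can leave the overall total off by one for awkward splits.
        quota = round(total_sample_size * share)
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical target population: 300 children and 700 adults.
strata = {
    "children": [f"child_{i}" for i in range(300)],
    "adults": [f"adult_{i}" for i in range(700)],
}
sample = stratified_sample(strata, total_sample_size=100, seed=7)
print({name: len(members) for name, members in sample.items()})
# prints {'children': 30, 'adults': 70}
```

Unlike a single random draw from the whole population, this guarantees the 30/70 split carries over into the sample.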

Stratification is sometimes introduced after the initial sampling phase. This is known as post-stratification and is usually used midway through a study when the researcher lacks prior knowledge of target population characteristics.

Cluster Sampling (also known as Multi-Stage Sampling). The target population is divided into subgroups (usually on a geographical or time basis). These subgroups may be further divided into subgroups. Instead of studying the entire population, the researcher selects examples from each subgroup and studies them. This is usually used when researchers aren’t sure how common or rare a variable is within a given population. There are multiple stages in this form of research. The first phase involves a researcher identifying subgroups (or clusters) of the target population. In the next phase the researcher takes a sample of each cluster and studies that sample. The researcher may then decide to divide each cluster into sub-clusters, and samples from these are studied. These sub-clusters can be divided again, and the process repeats as necessary. Cluster sampling does not need a sampling frame whereby every aspect of the phenomenon being studied is listed. Instead, cluster sampling helps to generate a sampling frame. Before using cluster sampling, a researcher should be able to make a reasonable estimate of the elements which influence each cluster.
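The staged process can be sketched in Python as a two-stage draw. Treat this as a simplified illustration: the four towns and their resident IDs are invented, and randomly choosing only some clusters in stage 1 is the common textbook variant. Setting n_clusters to the total number of clusters gives the "sample from every cluster" version described above.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: sample within each one."""
    rng = random.Random(seed)
    # Stage 1: randomly choose which clusters to study.
    chosen_clusters = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: randomly sample members inside each chosen cluster.
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen_clusters
    }

# Hypothetical clusters: four towns with made-up resident IDs.
towns = {
    "town_a": [f"a{i}" for i in range(50)],
    "town_b": [f"b{i}" for i in range(80)],
    "town_c": [f"c{i}" for i in range(30)],
    "town_d": [f"d{i}" for i in range(60)],
}
stage_two = two_stage_cluster_sample(towns, n_clusters=2, n_per_cluster=5, seed=3)
for town, residents in stage_two.items():
    print(town, residents)
```

To add a further stage, each town's list could itself be a dictionary of sub-clusters and the same function applied again.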

Students sometimes get confused by cluster/multi-stage sampling so I’ll use an example which I have found to work, zombies.

Example

Imagine a wizard is a researcher and the variable he wants to study is ‘people’s ability to defend themselves from zombie invasion’ (zombie defence). The wizard doesn’t know which countries have the best zombie defences (i.e. the rarity/nature of the variable is unknown) or what other variables influence zombie defence. In order to satisfy his curiosity, the wizard conducts a study by cursing the entire world so that the dead start to rise.

Phase 1: Within a month, some countries (clusters) defeat the zombies, some countries are overrun and some countries are still fighting. Now the wizard has three population groups to compare with each other. However, there are way too many groups within these populations for the wizard to study. It isn’t feasible to study them all (or the wizard is lazy). He therefore selects examples of each population and studies them.

Phase 2: The wizard examines these three populations and notices certain differences (variables which may have an influence). Zombie free countries tend to be areas:

  • Which cremate the dead so that they can’t turn into zombies later (Japan, Sweden and Denmark are safe)
  • Where there was already a zombie culture motivated by movies and books. People therefore knew how to counteract zombies (most of Europe is safe)
  • Where there were lots of weapons available (America, China, Russia and Korea are safe)

The wizard has identified more population subgroups and notices that culture, weapon resources and the media are important variables.

The wizard examines countries which are still fighting. He notices that the rate at which zombies are being killed is nearly the same as the rate at which the zombie infection/curse spreads. The kill/infection ratio is therefore a very important reflection of zombie defences. Areas near graveyards, funeral homes, hospitals, museums and morgues tend to be completely overrun by zombies. Location is another important variable and allows the wizard to identify sub-clusters within countries. It looks like Australia and Canada will survive but Egypt may not be so lucky. The wizard examines countries which are completely overrun by zombies. It is difficult to find anyone alive to interview and it’s therefore more difficult to gather information. Alas, Iceland is no more.

At the end of the study the wizard now has a better understanding of the variable ‘zombie defences’. He now has a sampling frame of population subgroups and knows that culture, resources, location, infection rates and the media are important variables.

Check out this video for what the Canadian Parliament has to say about zombies:


Examples of Non-Probability Sampling

Opportunity Sampling (also known as convenience sampling), involves recruiting whoever is available or easy to contact. This is usually used by college students practising their research skills, when the research results are unlikely to matter and when the research is unlikely to be published. Chances are it will produce a biased sample so avoid this method when possible.

Judgement sampling (also known as Judgemental Sampling or Purposive Sampling), a researcher selects participants based on some knowledge they have or selected participants fit some form of ideal description. I’ve noticed that this sampling method is most commonly used for qualitative studies which involve interviews e.g. a researcher is studying diabetes so they may recruit someone who is a medical expert and/or someone with diabetes.

Quota sampling, this is a combination of stratified sampling and either opportunity sampling or judgement sampling. The target population is divided into subgroups and an ideal target sample size for each subgroup is established. Opportunity sampling or judgement sampling is used to select participants (who must match subgroup criteria) for each subgroup.
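A small sketch of quota sampling combined with opportunity sampling: volunteers are taken in whatever order they turn up, and each one is accepted only if their subgroup still has room. The names, subgroups and quota sizes below are all invented for illustration.

```python
def quota_sample(volunteers, quotas, subgroup_of):
    """Accept volunteers in arrival order until every subgroup quota is full."""
    remaining = dict(quotas)
    recruited = {name: [] for name in quotas}
    for person in volunteers:  # the order people happen to turn up in
        group = subgroup_of(person)
        if remaining.get(group, 0) > 0:
            recruited[group].append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop recruiting
    return recruited

# Hypothetical volunteers as (name, subgroup) pairs.
volunteers = [
    ("Ann", "female"), ("Bob", "male"), ("Cat", "female"),
    ("Dan", "male"), ("Eve", "female"), ("Fay", "female"),
]
result = quota_sample(volunteers, {"female": 2, "male": 2},
                      subgroup_of=lambda person: person[1])
print(result)
```

Notice that nothing here is random: who ends up in the sample depends entirely on who showed up first, which is why this counts as non-probability sampling.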

Snowball Sampling. The researcher locates a few members of the target population and then asks for their help to find other members of the target population. This form of sampling is generally used when the variable a researcher is trying to study is extremely rare or it’s difficult to get in contact with participants with this variable/characteristic e.g. participants might have a rare disease, belong to a rare ethnic group or have a status (e.g. criminal) that they may wish to conceal from the general population (participants are often promised confidentiality by the researcher).
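The way a snowball sample grows can be sketched as following a chain of referrals. This is a toy model only: the participant codes and the referrals dictionary (who can introduce the researcher to whom) are hypothetical, and real recruitment depends on people agreeing to take part.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from a few known members and follow their referrals outwards."""
    recruited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        recruited.append(person)
        # Ask this participant who else they can put you in touch with.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return recruited

# Hypothetical referral network using anonymous participant codes.
referrals = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P5": ["P6"],
}
wave = snowball_sample(["P1"], referrals, max_size=5)
print(wave)
# prints ['P1', 'P2', 'P3', 'P4', 'P5']
```

Everyone recruited traces back to the starting contacts, so the sample tends to over-represent people who know each other.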

How big should my sample be OR how reflective is the sample I’ve used?

There are free sample size calculators available online. Here’s an example: http://www.surveysystem.com/sscalc.htm This link also provides a detailed explanation of the factors that influence the appropriateness of a sample size. In general, the more the merrier. There's a technique called power analysis which calculates how many participants you need based on the statistical strength (effect size) of findings in previous research (it's a form of academic guesswork). If research in similar or identical areas found weak correlations, you may need more participants to find an effect.
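For the curious, here is a rough sketch of the kind of arithmetic involved. The first function uses Cochran's formula with a finite population correction, a standard way to size a survey sample (I am assuming, not claiming, that online calculators work along similar lines). The second uses the Fisher z approximation to estimate how many participants are needed to detect a correlation of a given size. The function names and default values (95% confidence, 5% margin of error, 80% power) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def survey_sample_size(population_size, margin_of_error=0.05,
                       confidence=0.95, proportion=0.5):
    """Cochran's formula with a finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

def n_for_correlation(expected_r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a correlation (two-tailed, Fisher z)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(expected_r)) ** 2 + 3)

print(survey_sample_size(1000))   # prints 278
print(n_for_correlation(0.3))     # prints 85
print(n_for_correlation(0.1))     # a weaker correlation needs far more people
```

The last two lines show the point made above: the weaker the correlation found in earlier research, the more participants you need to have a fair chance of detecting it.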