A webR tutorial

Estimating Mindfulness Practice Among College Students

Introduction

You and your colleague aim to estimate the prevalence of mindfulness practice among college students. Mindfulness has been linked to numerous benefits, including reduced stress and improved mental well-being. Knowing how many students engage in mindfulness practices can help the university develop programs to support student health. The population of interest is all college students at your university.

Simulate the population

For this example, we’ll simulate a data frame that represents the population of all college students at your university. Press Run Code on the code chunk below to create the data frame.

This script simulates a population of 5,000 students, each assigned a unique ID. First, a random seed is set so that the results are reproducible (i.e., so that we all get the same results). The key variable of interest, “Mindfulness,” is binary: students are assigned a 1 if they practice mindfulness and a 0 if they do not. The probability of practicing mindfulness is set at 0.20, so roughly 20% of students are expected to practice mindfulness.

The rbinom() function generates random numbers from a binomial distribution. Its arguments are n (the number of observations, here the number of students), size (the number of trials per observation; with size set to 1, each student either “succeeds” or “fails” in a single trial, i.e., practices mindfulness or does not), and prob (the probability of success in each trial; here, each student has a 20% chance of practicing mindfulness).
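The description above can be sketched in R roughly as follows. This is a minimal sketch, not the tutorial's original chunk: the seed value and the object names `n_students` and `population` are assumptions chosen for illustration, while the column names, the population size, and the rbinom() arguments follow the text.

```r
# Assumed seed value; any fixed integer makes the simulation reproducible.
set.seed(2024)

# Population size stated in the text.
n_students <- 5000

# One row per student: a unique ID and a 0/1 mindfulness indicator.
# size = 1 means a single trial per student; prob = 0.20 is the chance
# that a given student practices mindfulness.
population <- data.frame(
  ID = seq_len(n_students),
  Mindfulness = rbinom(n = n_students, size = 1, prob = 0.20)
)

# Peek at the first few students.
head(population)
```

Because the indicator is drawn at random, the share of 1s will be close to, but usually not exactly, 20%.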

From this population, we can compute the proportion of college students who practice mindfulness. Since we’re imagining that we just simulated data for the whole population, this is the population parameter.

This script calculates the true prevalence of mindfulness by computing the mean of the “Mindfulness” variable. Because the variable is coded 0/1, its mean is the proportion of students who practice mindfulness among the 5,000 students in the simulated population. The cat() function in R concatenates and prints objects: it converts its arguments to character strings and outputs them. Here it is simply a convenient way to label the output for this activity.
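A minimal sketch of that calculation is shown here. The population is re-created at the top only so the chunk runs on its own; the seed and the names `population` and `true_prevalence` are assumptions, not taken from the original chunk.

```r
# Re-create the simulated population (assumed seed and object names).
set.seed(2024)
population <- data.frame(
  ID = seq_len(5000),
  Mindfulness = rbinom(n = 5000, size = 1, prob = 0.20)
)

# The mean of a 0/1 variable is the proportion of 1s,
# i.e., the share of students who practice mindfulness.
true_prevalence <- mean(population$Mindfulness)

# cat() prints the label and the value on one line.
cat("True prevalence of mindfulness:", true_prevalence, "\n")
```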

Two approaches to sampling

Now, let’s imagine that you and your colleague differ in your approach to producing a sample of students to study. One of you chooses to draw a random sample of students from the population (Scenario 1). The other chooses to recruit a sample by placing advertisements around campus that invite students to participate in a study of college student well-being (Scenario 2).

Within your pair, one of you should work through Scenario 1 and the other through Scenario 2.

Scenario 1

You choose to draw a simple random sample of students from a roster of all students at your university.

Purpose of Random Sampling

The goal of simple random sampling is to create a subset of the population that is representative of the entire population. Since each student has an equal chance of being selected, the sample should (on average) reflect the true distribution of characteristics (like mindfulness practice) in the population. Any single sample may differ from the population by chance, but the method itself is unbiased: it does not systematically over-represent or under-represent any group.

Drawing a True Random Sample

The sample_n() function (from the dplyr package) is used to draw a random sample of 500 students from the population. In a true random sample, every student in the population has an equal chance of being selected, whether or not they practice mindfulness and whatever their other characteristics.

Press Run Code on the code chunk below to simulate your sample.

After drawing the random sample, you estimate the prevalence of mindfulness in the population using the proportion of sampled students who practice mindfulness. The code below computes that proportion.
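Scenario 1 can be sketched as follows. To keep the chunk free of package dependencies, it uses base R's sample() as a stand-in for sample_n(); with dplyr loaded, sample_n(population, 500) would draw the same kind of sample. The seed and the names `sampled_rows`, `random_sample`, and `estimate_random` are assumptions for illustration.

```r
# Re-create the simulated population (assumed seed and object names).
set.seed(2024)
population <- data.frame(
  ID = seq_len(5000),
  Mindfulness = rbinom(n = 5000, size = 1, prob = 0.20)
)

# Simple random sample: pick 500 row numbers without replacement,
# every row being equally likely to be chosen.
sampled_rows <- sample(nrow(population), size = 500)
random_sample <- population[sampled_rows, ]

# Sample proportion of students who practice mindfulness,
# used as the estimate of the population prevalence.
estimate_random <- mean(random_sample$Mindfulness)
cat("Estimated prevalence (random sample):", estimate_random, "\n")
```

The estimate should land near the population value, with some sample-to-sample variation.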

Scenario 2

You choose to recruit students to participate in the study through advertisement, allowing students to select themselves into the study.

The code chunk below simulates this process by doing the following:

  1. Assigning Propensity to Participate:

    • Each student in the population is given a “propensity” score based on whether or not they practice mindfulness. Those who practice mindfulness are assigned a higher propensity to participate (0.8), while those who do not practice mindfulness are assigned a lower propensity (0.2).

    • This step reflects the idea that students who practice mindfulness are more likely to join a study about well-being than those who do not.

  2. Calculating Sampling Probability:

    • After assigning propensities, the script converts them into sampling probabilities by dividing each student’s propensity by the sum of all propensities, so that the probabilities add up to 1. Each value is the student’s relative chance of being picked on a given draw.

    • This ensures that the students with higher propensity (those who practice mindfulness) have a greater chance of being included in the sample, but everyone still has some chance.

  3. Generating the Non-Random Sample:

    • Finally, a non-random (i.e., biased) sample is generated: the script selects students using the calculated sampling probabilities, so students who practice mindfulness (higher propensity) are more likely to be selected than those who don’t.

Intuitive Explanation: Students who already practice mindfulness may be much more eager to participate in the study than those who don’t. This code models that situation by assigning higher participation chances to mindfulness practitioners. When selecting participants for the sample, the chances of picking a mindfulness practitioner are higher, reflecting the reality of many surveys where certain groups are more likely to participate than others.
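To get a rough sense of how strong this effect is, the short calculation below approximates the share of mindfulness practitioners we would expect in the self-selected sample. It is a back-of-the-envelope sketch: it assumes the population prevalence is exactly 0.20 and ignores the small effect of sampling without replacement. The variable names are illustrative.

```r
# Values taken from the text.
prevalence     <- 0.20  # share of students who practice mindfulness
propensity_yes <- 0.8   # propensity to participate, practitioners
propensity_no  <- 0.2   # propensity to participate, non-practitioners

# Total selection weight carried by each group in the population.
weight_yes <- prevalence * propensity_yes        # about 0.16
weight_no  <- (1 - prevalence) * propensity_no   # about 0.16

# Approximate share of practitioners among those selected.
expected_share <- weight_yes / (weight_yes + weight_no)
cat("Approximate expected share of practitioners:", expected_share, "\n")
```

Under these assumptions the two groups carry equal total weight, so practitioners make up about half of the sample even though they are only about a fifth of the population.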

Press Run Code on the code chunk below to simulate your sample.

After collating the sample, you estimate the prevalence of mindfulness. The code below computes that proportion.
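The three steps of Scenario 2 and the resulting estimate can be sketched as follows. The propensities (0.8 and 0.2) come from the text; the sample size of 500 is an assumption carried over from Scenario 1, and the seed and the names `Propensity`, `SamplingProb`, `nonrandom_sample`, and `estimate_nonrandom` are illustrative rather than taken from the original chunk.

```r
# Re-create the simulated population (assumed seed and object names).
set.seed(2024)
population <- data.frame(
  ID = seq_len(5000),
  Mindfulness = rbinom(n = 5000, size = 1, prob = 0.20)
)

# Step 1: propensity to participate depends on mindfulness practice.
population$Propensity <- ifelse(population$Mindfulness == 1, 0.8, 0.2)

# Step 2: normalize propensities so they sum to 1.
population$SamplingProb <- population$Propensity / sum(population$Propensity)

# Step 3: weighted draw without replacement (assumed sample size of 500).
sampled_rows <- sample(nrow(population), size = 500,
                       prob = population$SamplingProb)
nonrandom_sample <- population[sampled_rows, ]

# Sample proportion of students who practice mindfulness.
estimate_nonrandom <- mean(nonrandom_sample$Mindfulness)
cat("Estimated prevalence (self-selected sample):", estimate_nonrandom, "\n")
```

Because practitioners are four times as likely to be picked on each draw, this estimate should sit well above the population prevalence.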

Discuss your results

Discussion Guide: Comparing Random and Non-Random Sampling

Once you have each completed your scenario, share your results with one another. Explain to your partner how your sampling process was carried out, and compare your estimated proportions of students who practice mindfulness. Use the following as a guide for additional discussion:

Potential Bias in the Selected Sample:

  • Discuss how the sampling process affected the composition of each sample:
    • How does this affect the conclusions you can draw from the study?
  • Explore the consequences of using the non-random sampling process:
    • Would the findings from the study apply to the general population?

Real-World Applications:

  • Reflect on when it might be acceptable to use non-random sampling (e.g., when convenience or targeting specific groups is necessary).
  • Highlight the importance of acknowledging and adjusting for bias when using non-random samples in research.
  • How might this issue affect your own research?