# R script - 301114 - Nature of Data

Question 1

Our goal is to examine if the sample is representative of the population and if the allocation of medication and placebo is random.

1. Compute two frequency tables:

i. 1 × 3 table of frequencies containing the number of participants in each suburb

ii. 2 × 3 table of frequencies containing the number of participants from each suburb and the type of medication given

2. Test if the distribution of suburbs follows the proportions: 50% Parramatta, 30% Campbelltown and 20% Penrith. Make sure to write out the two hypotheses, compute the p-value, and write a conclusion for the test.

3. Test if the medication type is independent of suburb. Make sure to write out the two hypotheses, compute the p-value, and write a conclusion for the test.

4. Answer only one of the following two questions!

i. Describe in words how the independence of two variables is defined in terms of their probabilities. Write out equations if needed.

ii. Why did you use the chosen hypothesis test? What requirements does it have in order to be successfully used and were those requirements satisfied? In general, what does the p-value represent? Be brief; consider bullet points.
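
The tables and tests in this question could be sketched in R as follows. This is only an illustration under stated assumptions: the data frame `heart`, its columns `suburb` and `treatment`, and the simulated values are hypothetical stand-ins for the assignment's data set, and the chi-squared tests are one reasonable choice rather than a prescribed method.

```r
# Hypothetical data: the names `heart`, `suburb`, `treatment` and the
# simulated values are assumptions, not the assignment's data set.
set.seed(1)
n <- 200
heart <- data.frame(
  suburb = sample(c("Parramatta", "Campbelltown", "Penrith"), n,
                  replace = TRUE, prob = c(0.5, 0.3, 0.2)),
  treatment = sample(c("medication", "placebo"), n, replace = TRUE)
)

# 1 x 3 table: number of participants in each suburb
suburb_counts <- table(heart$suburb)
print(suburb_counts)

# 2 x 3 table: treatment type (rows) by suburb (columns)
treatment_by_suburb <- table(heart$treatment, heart$suburb)
print(treatment_by_suburb)

# Goodness-of-fit test against the stated proportions.
# table() orders labels alphabetically, so the probabilities are matched
# to the table by name rather than by position.
p0 <- c(Parramatta = 0.5, Campbelltown = 0.3, Penrith = 0.2)
gof <- chisq.test(suburb_counts, p = p0[names(suburb_counts)])
print(gof)

# Test of independence between treatment type and suburb
indep <- chisq.test(treatment_by_suburb)
print(indep)

# Expected counts, useful for checking the usual rule of thumb that
# expected cell counts should not be too small
print(gof$expected)
print(indep$expected)
```

Matching `p0` to the table by name guards against silently testing the wrong proportions when the suburb labels are sorted differently from the order in which the proportions are listed.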

Question 2

Before we examine the effect of the medication, we first examine if there is any dependence of heart rate on each participant’s suburb.

1. Compute the sample mean “before medication heart rate” for each suburb.

2. Test if the “before heart rate” mean is equal for all suburbs. Make sure to write out the two hypotheses, compute the p-value, and write a conclusion for the test.

3. Provide the set of 95% confidence intervals for the difference in means between “before heart rate” measurements for each pair of suburbs. State which pairs of suburbs show a difference in means and explain your choice.

4. A colleague suggests you could perform a hypothesis test for each pair of suburbs using a t.test on each data pair. Why is this not a good idea?
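
A minimal sketch of the computations in this question, assuming a hypothetical data frame `heart` with a factor `suburb` and a numeric column `before` (both names and the simulated values are assumptions). One-way ANOVA followed by Tukey's HSD is used here as one common approach; it is not necessarily the method the assignment expects.

```r
# Hypothetical data: names and values are assumptions for illustration only.
set.seed(2)
heart <- data.frame(
  suburb = factor(rep(c("Parramatta", "Campbelltown", "Penrith"),
                      times = c(50, 30, 20))),
  before = rnorm(100, mean = 75, sd = 8)
)

# Sample mean of the "before" heart rate for each suburb
suburb_means <- tapply(heart$before, heart$suburb, mean)
print(suburb_means)

# One-way ANOVA: H0 is that all suburb means are equal
fit <- aov(before ~ suburb, data = heart)
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]
print(anova_p)

# Tukey's HSD: simultaneous 95% confidence intervals for every pairwise
# difference in means; an interval that excludes 0 suggests a difference
tukey <- TukeyHSD(fit, conf.level = 0.95)
print(tukey$suburb)

# Rough family-wise error rate for three unadjusted tests at the 5% level,
# treating the tests as independent (an approximation)
fwer <- 1 - (1 - 0.05)^3
print(fwer)
```

The last line illustrates the concern with running a separate unadjusted test on each pair: the chance of at least one false rejection across the three comparisons is larger than the 5% level of any single test.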

Question 3

To determine the effect of the medication on each participant’s heart rate, we will compare the mean after medication heart rate for those who have been issued the medication, to the mean after medication heart rate of those who were issued a placebo.

1. Compute the mean of the “after medication heart rate” for those that received the medication, and the mean “after medication heart rate” for those that received the placebo.

2. Test if there is a difference in mean “after medication heart rate” for those that received the medication compared to those that received the placebo. Make sure to write out the two hypotheses, compute the p-value, and write a conclusion for the test.

3. Compute the 90% bootstrap confidence interval of the mean difference in “after medication heart rate” between those that received the medication and those that received the placebo.

4. Describe why it is important to use a control treatment (such as the above placebo) when examining the effects of medication.
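
The group comparison and bootstrap interval could look roughly like the sketch below. The data frame `heart`, its columns `treatment` and `after`, the simulated values, and the number of resamples `B` are all assumptions. A Welch t-test and a percentile bootstrap are shown as one possible choice; other tests or bootstrap variants may be equally valid.

```r
# Hypothetical data: names and values are assumptions for illustration only.
set.seed(3)
heart <- data.frame(
  treatment = rep(c("medication", "placebo"), each = 50),
  after = c(rnorm(50, mean = 70, sd = 8), rnorm(50, mean = 75, sd = 8))
)

# Mean "after" heart rate in each treatment group
group_means <- tapply(heart$after, heart$treatment, mean)
print(group_means)

# Welch two-sample t-test (does not assume equal variances);
# H0 is that the two population means are equal
tt <- t.test(after ~ treatment, data = heart)
print(tt)

# Percentile bootstrap for the difference in means (medication - placebo):
# resample with replacement within each group and recompute the difference
med <- heart$after[heart$treatment == "medication"]
pla <- heart$after[heart$treatment == "placebo"]
obs_diff <- mean(med) - mean(pla)

B <- 2000  # number of bootstrap resamples (an arbitrary choice)
boot_diff <- replicate(
  B,
  mean(sample(med, replace = TRUE)) - mean(sample(pla, replace = TRUE))
)

# 90% interval: the 5th and 95th percentiles of the bootstrap distribution
boot_ci <- quantile(boot_diff, c(0.05, 0.95))
print(boot_ci)
```

Resampling within each group keeps the group sizes fixed, which mirrors how the treatment was allocated in the hypothetical data.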

Question 4

Finally, we want to examine the change in heart rate for each participant, regardless of the medication received.

1. Report the slope and intercept for the linear model, modelling the heart rate after medication with respect to the heart rate before medication.

2. Use a permutation approach to test if the population slope is 1. Make sure to write out the two hypotheses, compute the p-value, and write a conclusion for the test.

3. Compute the 95% confidence interval for the slope.

4. Describe in words how we can determine if the linear model is a good fit to the data.
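
One way the regression steps might be sketched in R is shown below. The data frame `heart`, its columns `before` and `after`, the simulated values, the helper `slope()`, and the number of permutations are all assumptions. The permutation scheme relies on the observation that, if the population slope is 1, the difference `after - before` has no linear relationship with `before`; this is one possible construction of the test, not necessarily the intended one.

```r
# Hypothetical data: names and values are assumptions for illustration only.
set.seed(4)
n <- 100
before <- rnorm(n, mean = 75, sd = 8)
after <- 5 + 0.9 * before + rnorm(n, mean = 0, sd = 4)
heart <- data.frame(before, after)

# Linear model of "after" on "before"; coef() returns intercept and slope
fit <- lm(after ~ before, data = heart)
print(coef(fit))

# Least-squares slope of y on x (hypothetical helper)
slope <- function(x, y) cov(x, y) / var(x)

# Under H0: slope = 1, d = after - before is unrelated to before,
# so the slope of d on before should be near 0.
d <- heart$after - heart$before
obs <- slope(heart$before, d)  # equals the fitted slope minus 1

# Permuting d breaks any association with before, giving the null
# distribution of the slope
perm <- replicate(2000, slope(heart$before, sample(d)))

# Two-sided p-value: proportion of permuted slopes at least as extreme
p_value <- mean(abs(perm) >= abs(obs))
print(p_value)

# 95% confidence interval for the slope from the fitted model
slope_ci <- confint(fit, "before", level = 0.95)
print(slope_ci)

# Simple fit diagnostics: R-squared and the residuals against fitted values
print(summary(fit)$r.squared)
print(head(cbind(fitted = fitted(fit), residual = resid(fit))))
```

Plotting `resid(fit)` against `fitted(fit)` and looking for curvature or changing spread is a common informal check of the linear model's adequacy, alongside the R-squared value.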