Homework 01: Simulation and Estimation

Examining the accuracy (bias) and precision (spread) of a few measures of center.

Homework
Published

February 6, 2023

Overview

Suppose we want to examine the accuracy (bias) and precision (spread) of a few estimates for the center or middle of a distribution. The four statistics we are going to compare are: the mean, the midhinge, the trimmed mean and the median.

  • The mean. You can use the mean() function here.
  • The midhinge, the average of \(Q_{1}\) and \(Q_{3}\). Use the function you created in worksheet 1.
  • The trimmed mean is the sample mean with a certain percent of the extreme values removed. We will look at a 3% trimmed mean which you can calculate in R with the function mean(x,0.3).
  • The median is the number that divides the sorted data into two equal halves. Use the median() function.

We will compare the behavior of these statistics by creating sampling distributions of these statistics calculated on samples drawn from a Poisson distribution with known parameter \(\lambda\), using 2 different sample sizes (15, 50) and two different values of \(\lambda\) (0.5, 2). Use 10,000 replications to create your sampling distribution.

You should observe a histogram of the generated data to get a feel for the data, but do not need to include copious amounts of plots in your submission.

For each simulated sampling distribution, calculate the mean, standard deviation and bias for each of the three statistics under consideration. The first one is done for you as an example.

set.seed(8675309) # you must choose a different seed. 
samp.dist <- replicate(10000, {
  x <- rpois(15, .5)
  mean(x)
})

n <- length(samp.dist)
(E.x <- sum(samp.dist)/n)
[1] 0.49894
(sd.x <- sqrt(sum((samp.dist-E.x)^2)/(n-1)))
[1] 0.1822391
(bias <- E.x - .5)
[1] -0.00106

Copy the markdown table code below into your assignment file to help you summarize your findings. Then write a paragraph or two that summarizes your results. Which statistic would you recommend if estimating \(\lambda\)? Does your answer differ depending on sample size \(n\) or the value of \(\lambda\)? Compare the different measures of center with respect to the accuracy and bias. Use complete sentences.

## Mean

| Parameters            | Exp Value | Std Dev |  Bias  |
|-----------------------|-----------|---------|--------|
| $n=15, \lambda = 0.5$ |  0.4989   | 0.1822  | -0.001 |
| $n=15, \lambda = 2$   |           |         |        |
| $n=50, \lambda = 0.5$ |           |         |        |
| $n=50, \lambda = 2$   |           |         |        |

## Trimmed Mean

| Parameters            | Exp Value | Std Dev |  Bias  |
|-----------------------|-----------|---------|--------|
| $n=15, \lambda = 0.5$ |           |         |        |
| $n=15, \lambda = 2$   |           |         |        |
| $n=50, \lambda = 0.5$ |           |         |        |
| $n=50, \lambda = 2$   |           |         |        |

## Midhinge

| Parameters            | Exp Value | Std Dev |  Bias  |
|-----------------------|-----------|---------|--------|
| $n=15, \lambda = 0.5$ |           |         |        |
| $n=15, \lambda = 2$   |           |         |        |
| $n=50, \lambda = 0.5$ |           |         |        |
| $n=50, \lambda = 2$   |           |         |        |

## Median

| Parameters            | Exp Value | Std Dev |  Bias  |
|-----------------------|-----------|---------|--------|
| $n=15, \lambda = 0.5$ |           |         |        |
| $n=15, \lambda = 2$   |           |         |        |
| $n=50, \lambda = 0.5$ |           |         |        |
| $n=50, \lambda = 2$   |           |         |        |

Bonus credit

Create a plot of the bias as a function of n, lambda, and the type of statistic.