Fancy stat terms I

Random stuff: Rademacher distribution

A month or two ago, I saw this comic on xkcd:

xkcd comic This just shows how folks might use different ways to say the same thing (something simple can appear complex and vice versa). And then, in that same week, I was reading an article that mentioned drawing a random variable from a Rademacher distribution– which was a distribution I had not heard of (w/c has a M = 0, SD = 1) and somehow the name alone made it sound complicated. This just means that the variable takes on a value of -1 or +1 with equal probability.

\[ x = \begin{cases} +1 & \text{with probability p = 1/2} \\ -1 & \text{with probability p = 1/2} \end{cases} \]

Sometimes, these are referred to simply as Rademacher weights (and is named after German-American mathematician Hans Rademacher [1892 - 1969]). Although this sounds fancy, you can just flip a coin:

  1. if it’s heads, x = +1
  2. if it’s tails, x = -1

That’s it. Repeat \(m\) times. That’s the distribution.

In r, we can draw 100 numbers from a Rademacher distribution using:

set.seed(244)
x <- sample(c(-1, 1), size = 100, replace = TRUE)

Can inspect the variable:

table(x)
## x
## -1  1 
## 50 50
mean(x)
## [1] 0
sd(x)
## [1] 1.005038

I’ve seen more complicated ways using to generate this using 2x - 1. It is a recoding of a Bernoulli distribution:

x1 <- rbinom(100, 1, .5) * 2 - 1

or just using:

x2 <- c(-1,1)[rbinom(100, 1, .5) + 1] 

But, I think just using sample is simple enough. We can always benchmark the time (if interested). Often, these weights can be used for bootstrapping so speed is of interest.

set.seed(123)
library(rbenchmark)
benchmark(
  'a_sample' = { #using a coin flip
    x <- sample(c(-1, 1), 10000, replace = TRUE)
  },
  
  'b_rbinom' = { #2x - 1
    x1 <- rbinom(10000, 1, .5) * 2 - 1
  },
  
  'c_rbinom2' = { #similar to sample
    x2 <- c(-1,1)[rbinom(10000, 1, .5) + 1] 
  },
  
  replications = 5000,
  columns = c("test", "replications", "elapsed",
              "relative", "user.self", "sys.self")
)
##        test replications elapsed relative user.self sys.self
## 1  a_sample         5000    4.50    1.000      4.41     0.07
## 2  b_rbinom         5000    7.22    1.604      6.89     0.03
## 3 c_rbinom2         5000    7.83    1.740      7.55     0.03

– END

Related

Next
Previous
comments powered by Disqus