## 14. Point biserial correlation

The point-biserial correlation is a special case of the Pearson correlation in which one of the variables has only two possible values, and these values represent different groups.

For instance, it is possible to find the correlation between height and gender. At first, this may seem impossible, because gender is not quantifiable, and you need numbers for both variables to calculate r.

However, you can arbitrarily assign two different numbers to the two different groups and then calculate the correlation. It doesn’t matter which two numbers you assign: as long as the same group gets the larger number, you will get the same r whether you use 1 and 2, or 3 and 17. The r you get, which is called the point-biserial r (symbolized $r_{pb}$), is meaningful.
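A minimal sketch of this invariance, assuming a small made-up sample of heights (the data, the `pearson_r` helper, and the variable names are illustrative and not from the source). It codes the same two groups once as 1 and 2 and once as 3 and 17, then compares the resulting correlations:

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical sample: group labels and heights in cm (invented for illustration)
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
heights = [160, 165, 158, 170, 175, 180, 168, 183]

# Two arbitrary codings that both give the larger number to the same group
codes_1_2 = [1 if g == "F" else 2 for g in groups]
codes_3_17 = [3 if g == "F" else 17 for g in groups]

r_1_2 = pearson_r(codes_1_2, heights)
r_3_17 = pearson_r(codes_3_17, heights)

print(f"coding 1/2:  r_pb = {r_1_2:.4f}")
print(f"coding 3/17: r_pb = {r_3_17:.4f}")
```

Both codings should print the same value, because changing the codes only shifts and rescales the group variable, and Pearson's r is unaffected by a shift or a positive rescaling of either variable.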

Suppose you assign 1 to females and 2 to males and correlate these gender numbers with their heights. In this case, r will measure the tendency for the heights to get larger as the gender number gets larger (i.e., goes from 1 to 2).

If you instead assign the larger gender number to females, the sign of r will reverse while its magnitude stays the same, which is why the sign of $r_{pb}$ is usually ignored.
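The sign reversal can be checked with a short sketch, again assuming invented height data and a hand-written `pearson_r` helper (neither comes from the source). The only difference between the two runs is which group receives the larger code:

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical sample: group labels and heights in cm (invented for illustration)
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
heights = [160, 165, 158, 170, 175, 180, 168, 183]

# Coding A: males get the larger number (females = 1, males = 2)
male_high = [2 if g == "M" else 1 for g in groups]
# Coding B: females get the larger number (males = 1, females = 2)
female_high = [2 if g == "F" else 1 for g in groups]

r_male_high = pearson_r(male_high, heights)
r_female_high = pearson_r(female_high, heights)

print(f"males coded higher:   r_pb = {r_male_high:+.4f}")
print(f"females coded higher: r_pb = {r_female_high:+.4f}")
```

In this made-up sample the males are taller on average, so the first coding gives a positive value and the second gives the same magnitude with a negative sign. The sign therefore reflects only the arbitrary coding choice, not anything about the data.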

Sources:
Essentials of Statistics for the Social and Behavioral Sciences, Barry H. Cohen and R. Brooke Lea

