Gender bias in blinded review

A study finds that men use more broad language, and that women are unfairly discriminated against for this reason. I look into this study.

Jonatan Pallesen
04-25-2019

Introduction

In this study, the authors use grant proposals submitted to the Gates Foundation to investigate discrimination against women. They find that men use more broad language, and that women are unfairly discriminated against for this reason.

Here is a piece about it in Nature and Science. It is getting a lot of play on Twitter:



In this post I do the unthinkable, and skim the paper.


Analysis

First, according to this figure from the paper, men do not tend to use more broad words. The broad words are essentially randomly distributed around the line. (Narrow words do fall below the line more often, though.)


Notice that broad and narrow have specific definitions in the paper, which are not what you would intuitively think. E.g., health and community are narrow, and low-cost is broad. Immunity is narrow, but immune is broad.


Second, they note that the women in the sample are on average significantly less experienced / qualified. This makes it statistically very complicated to disentangle discriminatory effects from the effects of uncontrolled group differences.

For example, they say in the abstract that “Despite blinded review, female applicants receive significantly lower scores, which cannot be explained by reviewer characteristics, proposal topics, or ex-ante measures of applicant quality.” But since there is a large difference in group averages, you cannot conclude from this that the lower scores are due to the use of broad words or to gender. An applicant’s quality is not fully determined by the things they mention, so the difference in scores could be caused by a large number of unmeasured characteristics.
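To make the point about unmeasured characteristics concrete, here is a minimal simulation sketch. It is not the paper's data or model. Everything in it is an assumption for illustration: the function name simulate_group_gap, the size of the group difference, and the idea that the available quality measures are a noisy proxy for true quality. Scores depend only on true quality, so there is no discrimination by construction, yet a regression that controls for the proxy can still show a gap between the groups.

```python
import numpy as np


def simulate_group_gap(n=20000, quality_gap=0.5, proxy_noise=1.0, seed=0):
    """Return the estimated group coefficient in a world with no discrimination.

    All parameter values are illustrative assumptions, not estimates from the paper.
    """
    rng = np.random.default_rng(seed)

    # group = 1 marks the group with lower average (unobserved) quality.
    group = rng.integers(0, 2, size=n)

    # True quality differs between groups on average, but is not observed.
    quality = rng.normal(0.0, 1.0, size=n) - quality_gap * group

    # The analyst only sees a noisy proxy for quality (e.g. publication counts).
    proxy = quality + rng.normal(0.0, proxy_noise, size=n)

    # Reviewer scores depend on true quality only: no direct effect of group.
    score = quality + rng.normal(0.0, 1.0, size=n)

    # Ordinary least squares of score on an intercept, group, and the proxy.
    design = np.column_stack([np.ones(n), group, proxy])
    coef, *_ = np.linalg.lstsq(design, score, rcond=None)
    return coef[1]  # coefficient on group


if __name__ == "__main__":
    print("noisy proxy:  ", simulate_group_gap(proxy_noise=1.0))
    print("perfect proxy:", simulate_group_gap(proxy_noise=0.0))
```

With a noisy proxy the group coefficient comes out clearly negative, even though group has no direct effect on the score. With a perfect proxy it is close to zero. So "controlling for measures of applicant quality" only removes the group difference to the extent that those measures capture quality completely.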

It is very hard to separate signal from noise in such a data set, where in addition to the above issues, there are also potential issues with measurement error, tail effects, randomness, multiple testing and more.


Conclusion

I think this is a typical case of a study that is not bad as such, but whose design is weak to begin with and very unlikely to let us draw conclusions with high confidence. However, because it is a sexy topic and draws conclusions about gender discrimination, it receives a lot of attention. I think it is important to show epistemological humility and not conclude from this one study that science has proven that there is gender discrimination based on word usage, or similar.

One thing that makes this harder for people who would like to be more careful and look into the research themselves is that the article is behind a paywall. I have a hard time understanding why the Gates Foundation would fund research with such potentially broad implications, only to hide it behind a paywall. Not a good system.