The last chapter of Malcolm Gladwell’s book Blink introduced me to the concept of screens used for orchestra auditions. For those who are unfamiliar: a screen is set up to hide the performer’s identity from the people evaluating them.
Using data from actual auditions in an individual fixed-effects framework, we find that the screen increases by 51% the probability a woman will be advanced out of certain preliminary rounds. The screen also enhances, by severalfold, the likelihood a female contestant will be the winner in the final round.
Claudia Goldin, Cecilia Rouse - Orchestrating Impartiality: The Impact of “Blind” Auditions on Female Musicians
These findings got me thinking about ways to apply the same idea to the tech interview process. For many tech companies, the decision to hire a candidate rests on their ability to write code to solve a series of problems, often evaluated via a combination of homework, a technical phone screen, and in-person interviews. Throughout this process, the candidate’s resume and code are reviewed by multiple people, and bias can creep in from surprising sources.
The results show significant discrimination against African-American names: White names receive 50 percent more callbacks for interviews.
Marianne Bertrand, Sendhil Mullainathan - Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination
A Simple Tool to Remove Bias
As a first step, I decided to build a simple tool for blinding public LinkedIn profiles by hiding data that could be used to infer the race, gender, or age of the candidate. I included the option to hide company and educational institution names to avoid advancing candidates based on pedigree alone.
LinkedIn has an API, but it doesn’t expose a user’s public profile unless that specific user logs in via OAuth. As a result, I had to go the screen-scraping route and process the contents using Nokogiri. This leaves the project in an especially brittle state, where a small change on LinkedIn’s side could mean that content is rendered incorrectly or that the app throws errors.
Public profiles on LinkedIn use two totally different templates. Here’s an example of my profile and a former colleague’s profile. I can’t explain the reasoning behind it, but it took a lot of effort to parse and replace text within both profiles without completely breaking either.
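One way to cope with the two templates is to sniff which one came back and dispatch to a matching selector set. The marker string and selectors below are hypothetical, just to show the shape of the approach:

```ruby
# Two selector sets, one per template. All names here are
# made up for illustration; the real markup differs.
SELECTORS = {
  classic:  { name: '.full-name', headline: '.headline' },
  redesign: { name: '#name',      headline: '#headline' }
}.freeze

# Guess the template from a marker that only appears in one
# of the two layouts (hypothetical marker).
def template_for(html)
  html.include?('id="profile-redesign"') ? :redesign : :classic
end

def selectors_for(html)
  SELECTORS[template_for(html)]
end
```

Keeping the selectors in one table at least confines the breakage to a single place when either template changes.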
Certain portions of a resume — summaries, volunteer work, and awards — can provide identifying information about the candidate. Some of it made sense to hide, such as the user’s co-workers and people in their network, but other information, like awards, seemed too important to omit.
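Those judgment calls end up as a policy table: which sections get blinded, which stay visible. The section labels below are my own, not LinkedIn’s, and the hide/keep decisions are just one reasonable set:

```ruby
# Rough blinding policy. Labels are illustrative; the
# hide/keep calls reflect the trade-offs discussed above.
SECTION_POLICY = {
  'photo'        => :hide, # strongest race/gender/age signal
  'name'         => :hide,
  'connections'  => :hide, # surfaces co-workers and network
  'summary'      => :hide, # free text often leaks pronouns
  'awards'       => :keep, # too important to omit
  'volunteering' => :keep
}.freeze

# Sections not in the table default to visible.
def visible_sections(sections)
  sections.select { |s| SECTION_POLICY.fetch(s, :keep) == :keep }
end
```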
The link you see when browsing LinkedIn isn’t the user’s public profile, and there isn’t an API to convert it into one either. I’m guessing this will lead to a lot of head-scratching and a poor user experience.
Valid Public Profile Links
Non-Public Profile Links
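Telling the two link shapes apart can be done with a simple pattern check. The patterns below reflect how the URLs looked to me at the time (`/in/` and `/pub/` paths for public profiles, `/profile/view?id=` for logged-in browsing links) and should be treated as assumptions:

```ruby
# Public profiles live under /in/ or /pub/; the links you see
# while browsing logged-in use /profile/view?id=... instead.
# Patterns are as observed, not documented by LinkedIn.
PUBLIC_PROFILE = %r{\Ahttps?://(www\.)?linkedin\.com/(in|pub)/[\w-]+}

def public_profile_url?(url)
  !!(url =~ PUBLIC_PROFILE)
end
```

Rejecting non-public links early, with an explanatory message, is the only real mitigation for the head-scratching described above.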
LinkedIn has a number of anti-screen-scraping protections in place. As an example, this is what happens when you make a request with certain user-agents or from certain IPs.
$ curl -I 'https://www.linkedin.com/in/philipcorliss'
HTTP/1.1 999 Request denied
Date: Mon, 20 Oct 2014 22:00:20 GMT
Server: ATS
X-Li-Pop: PROD-ELA4
X-LI-UUID: 24URwzD6nhPwF54ilysAAA==
Content-Length: 511
Content-Type: text/html
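One common workaround is to send a browser-like User-Agent with each request; whether that clears the block is situational and LinkedIn can change the rules at any time. A sketch with stdlib Net::HTTP (the UA string is just an example browser signature):

```ruby
require 'net/http'
require 'uri'

# Example browser signature; any realistic UA string works
# the same way here.
BROWSER_UA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9) AppleWebKit/537.36'

# Build the request separately so the headers can be
# inspected without touching the network.
def build_request(url)
  uri = URI(url)
  req = Net::HTTP::Get.new(uri)
  req['User-Agent'] = BROWSER_UA
  [uri, req]
end

def fetch(url)
  uri, req = build_request(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(req)
  end
end
```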
Next, I intend to integrate with GitHub to create a completely blind code review tool that companies can use to assign coding challenges to candidates.
Code & Link
There’s no code provided for this project, as it’s a mess of untested spaghetti built mostly as a proof of concept. But you can visit and use the project freely at BadBias.herokuapp.com.