Every day we use algorithms to make decisions. We use them to decide whether it’s worth getting up and having breakfast, or whether taking the bus would be quicker than cycling to the office. We even use them to decide when to have lunch. Most of us don’t explicitly write down a mathematical formulation for any of these, but they are still algorithms.
For each of our day-to-day algorithms we make choices about which input data to take into account. Perhaps this data is what happened last time you got the bus, or whether you actually have any milk for your cereal to make it worth getting up and having breakfast. But in each case your ‘training data’ is biased towards your own personal experience and what you do, or don’t, count as good data.
Take the breakfast example: you might be biased towards eating Coco Pops for breakfast. You take milk and cereal into account in your data and exclude the dozen eggs and the bread you might have, which would still make a decent breakfast. So when you ‘run your algorithm’ you conclude that you shouldn’t bother getting up because there is no breakfast available. This is a bias that you’ve put into your algorithm through what you chose to select as your input data.
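To make the breakfast example concrete, here is a small Python sketch. It is only an illustration: the function name, the contents of the kitchen (eggs, bread and cereal, but no milk) and the two ‘views’ of what counts as breakfast are all made up for this example.

```python
def breakfast_available(kitchen, foods_considered):
    """Return True if at least one food we chose to consider can be made."""
    return any(
        all(item in kitchen for item in ingredients)
        for ingredients in foods_considered.values()
    )


# Assumption for this sketch: there is cereal, eggs and bread, but no milk.
kitchen = {"cereal", "eggs", "bread"}

# The biased input data: only Coco Pops count as breakfast.
biased_view = {"coco pops": {"cereal", "milk"}}

# A wider choice of input data that also counts the eggs and bread.
wider_view = {**biased_view, "eggs on toast": {"eggs", "bread"}}

worth_getting_up_biased = breakfast_available(kitchen, biased_view)
worth_getting_up_wider = breakfast_available(kitchen, wider_view)

print("Biased view says get up:", worth_getting_up_biased)
print("Wider view says get up:", worth_getting_up_wider)
```

The logic is identical in both runs. The only thing that changes the answer is which data was allowed in.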
What happens when we look at algorithms that are created for companies? These algorithms take historical data and build a model around whatever the coder defines as success. If the coder is biased towards a certain outcome, the data collected may be skewed to match. It may even be the case that the measure of success chosen for the model is itself unfairly biased towards one outcome.
Sometimes these unfairly biased measures of success create a cycle in which the model’s own outputs reinforce the initial measure of success. This type of algorithm is dangerous, as Cathy O’Neil explains in the linked TED talk. A self-perpetuating biased model won’t help create a more equal world and may in fact make matters worse.
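A toy simulation can show how such a cycle feeds itself. This is a deliberately simplified sketch, not a model of any real system: the two groups, the starting share of 60% and the update rule (the applicant pool and the model’s preference both mirror the previous round’s ‘successes’) are assumptions invented for illustration.

```python
def simulate_feedback(share_a, rounds=5):
    """Track group A's share of 'successes' when each round's results
    become the next round's training data.

    Toy assumption: both the pool of candidates and the model's
    preference mirror last round's share, so each group's weight
    is its share squared.
    """
    history = [share_a]
    for _ in range(rounds):
        weight_a = share_a * share_a
        weight_b = (1 - share_a) * (1 - share_a)
        share_a = weight_a / (weight_a + weight_b)
        history.append(share_a)
    return history


biased_start = simulate_feedback(0.6)
even_start = simulate_feedback(0.5)

for round_number, share in enumerate(biased_start):
    print(f"round {round_number}: group A share = {share:.3f}")
```

Under these assumptions a modest initial lean towards group A grows every round until group B has almost disappeared from the results, while a perfectly even start stays even. The numbers themselves mean nothing; the point is the direction of travel.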
If you are being measured by some formula, try to find out what that formula is, ask to have it explained to you if you don’t understand it, and make sure that you aren’t being measured unfairly. Remember: “Algorithms are opinions embedded in code” – Cathy O’Neil (See her TED talk below)