There’s a joke that I love: "There are two types of people in the world: those who can extrapolate from incomplete data." It’s funny! But it also hits a little close to home. I am, of course, referring to the data used to train Machine Learning (ML) models, and how it’s not always what it should be. According to Statista, the global volume of data created, copied, captured and consumed is more than 150 zettabytes, with 90% of this data generated in the last few years! To