Introducing FairNLP

Published on Jan 5, 2024

FairNLP is a new organisation that is developing fair, open-source algorithms for natural language processing. Our goal is to make algorithmic fairness simpler and more accessible, so that everyone can build more ethical applications with less effort. FairNLP is self-funded.

Background

Algorithmic fairness in machine learning is a complicated organisational topic. On the one hand, there is a desire to create ethical algorithms and minimise human harm; on the other, there is a driving force to be first and beat the competition. These two goals often do not go hand in hand, since algorithmic fairness is complex and time-consuming to implement and understand. As a result, ethical considerations are frequently ignored altogether (moving to production and hoping for the best), which can have dangerous consequences. Alternatively, the technology is abandoned entirely, written off as a black box (we don't understand it, so we don't use it). Given that the impressive capabilities of modern NLP systems can't be ignored, this is not a desirable option either.

But if algorithmic fairness is so important, why is it so difficult? To some degree, we ourselves are to blame. In just over a decade, we have pushed the capabilities of both software and hardware to a point where we see widespread adoption in society. A big contributing factor to this trend is the fast-moving pace of open-source ecosystems such as PyTorch, TensorFlow and Hugging Face: barriers are found, innovative solutions are created in response, and the frontiers are pushed, one experiment at a time. Meanwhile, ethical considerations receive far less attention, since they don't directly bring in any money. The race for AI is at its peak, yet fairness as a topic remains severely underaddressed.

Legal framework and the AI Act

With great power comes great responsibility, and there have been numerous cases of deliberate misuse of AI technologies. Beyond criminal intent, however, there are notable reports of organisations that created harmful algorithms without intending to do so (e.g. Microsoft's Tay chatbot, Google's image recognition, the COMPAS recidivism software). In such cases there is no intent to harm, and the results are only discovered once it's too late.

But the legal landscape is changing. In a matter of months, likely even before 2025, the AI Act will become enforceable in the EU. Considering the effect the GDPR has had (and still has) on data-driven applications, we can expect the AI Act to bring a lot of work for practitioners as well.

Where current toolkits fall short

First, it should be noted that there are already open-source communities working on algorithmic fairness (e.g. FairLearn, AIF360). Although we appreciate their academic grounding and thorough documentation, we find them to have a steep learning curve and to be unsuited to the most common use-case: I want to use this well-performing Hugging Face model that I know is biased.
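To make that use-case concrete, the starting point typically looks like the snippet below: an off-the-shelf Hugging Face pipeline that performs well but may encode societal biases. This is only an illustrative sketch; the model name and the gendered probe sentences are examples we chose, not something prescribed by any toolkit.

```python
from transformers import pipeline

# A strong off-the-shelf model; like many pretrained models, it may
# encode societal biases picked up from its training data.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# A simple probe: semantically equivalent inputs that differ only in gender.
# Diverging scores would hint at bias the practitioner already suspects.
print(classifier("She is a doctor."))
print(classifier("He is a doctor."))
```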

[Image: xkcd, "Standards"]

Where FairNLP aims to help

At FairNLP, we acknowledge that algorithmic fairness is complex and requires thorough research, but we also aim to be pragmatic: a simple fairness solution is better than no fairness solution. In a nutshell, this means that debiasing your transformer model shouldn't take more than two additional lines of code: a library import and a wrapper class, as sketched below. What happens behind that wrapper class to ensure fairness can be very complex, but by keeping the interface minimal, we keep the software accessible.
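As a rough sketch of what such an interface could look like (the class below and its `debias_fn` hook are hypothetical names for illustration, not a published FairNLP API), the wrapper pattern might be:

```python
from transformers import pipeline

# Hypothetical sketch of the wrapper interface described above; FairNLP's
# actual module and class names are not part of this post.
class FairPipeline:
    """Wraps an existing pipeline and applies a debiasing step to its output."""

    def __init__(self, wrapped, debias_fn=None):
        self.wrapped = wrapped
        # Placeholder debiasing step; a real implementation could hide
        # arbitrarily complex fairness logic behind this hook.
        self.debias_fn = debias_fn or (lambda outputs: outputs)

    def __call__(self, *args, **kwargs):
        return self.debias_fn(self.wrapped(*args, **kwargs))

# The promised usage: one import, one wrapper class around an existing model.
classifier = pipeline("sentiment-analysis")
fair_classifier = FairPipeline(classifier)
print(fair_classifier("He is a doctor."))
```

The point of the pattern is that all fairness machinery lives behind the call interface: the caller's code changes only at construction time, while the debiasing logic can evolve independently.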