Free and open source software and AI: A pairing for open innovation
In the ever-evolving digital age, the intersection between free and open source software and AI is emerging as a crucial focus of technological innovation. Our analysis explores the dynamic and fruitful link between these two driving forces of digital transformation, examining the philosophical underpinnings of free software and how it intertwines with advances in AI in an increasingly open and collaborative technological landscape.
Free and open source software
Free and open source software (FOSS) is an alternative to the closed source software model associated with most commercial software licences. In its literal sense, the FOSS model promotes innovation and the free flow of source code, as opposed to the proprietary software model, which is based on the owner's prerogative to prohibit third parties from exploiting the owner's rights in the software.
Free software is essentially a philosophical movement based on the idea of freedom of use of software, which sees non-free software as a social problem and an obstacle to innovation. According to the Free Software Foundation, software can be said to be "free" if it respects four fundamental freedoms of users and the community:
- the freedom to run the program at will and for any purpose
- the freedom to study how the program works and modify it to suit specific needs
- the freedom to redistribute copies
- the freedom to make improvements to the program and to distribute them publicly so the whole community can benefit
The open source movement has a more pragmatic connotation, focusing on the accessibility of the source and object code of the software. Open source software must be freely redistributable: the licence must not restrict the right to sell or give away the software, and must not require royalties or other fees for its sale. The licence must allow users to modify the original programs, create derivative works and distribute them under the terms of the original software licence. It may restrict the distribution of modified source code only if it allows patch files to be distributed with the original code, and it must not discriminate against any person or class of persons, or any field of use. The rights in the program must be available to all those to whom it is distributed, without the need for additional licences. If the licensed program is part of a software distribution, the licence must not be specific to that distribution, but must guarantee that each recipient gets the same rights as those granted with the original distribution. The licence must also not contaminate other software distributed with the licensed software, and must respect the principle of technological neutrality.
Despite the differences between the two schools of thought, in practice it's common to use the terms "free software" and "open source software" interchangeably. And it's not uncommon to refer to this phenomenon as free and open source software.
FOSS licences
More than 40 FOSS licences coexist in the current landscape, differing in both the rights granted and the conditions imposed. FOSS licences can be categorised according to the level of protection afforded to the software.
Restrictive licences (also known as "copyleft" licences) restrict the redistribution of derivative works to ensure the code remains open. They grant software users the right to use, copy, modify, improve and redistribute the software, provided that reciprocal conditions are met. In other words, they prevent the application of more restrictive conditions that could return the software to the proprietary model. An example is the GNU General Public License (GPL).
Permissive licences (also called "non-copyleft" licences) allow wide use of the source code, including its modification, even in non-open source programs, and don't restrict the distribution of derivative works. The BSD and Apache licences are two examples of non-copyleft licences.
In between these two extremes are certain grey area licences with weaker copyleft clauses, such as the LGPL or the MPL.
By far the most common open source licences for AI-based software releases are permissive licences.
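The three categories described above can be captured in a small lookup table. This is a minimal sketch covering only the licence families named in the text; the names `LicenceCategory`, `LICENCE_CATEGORIES`, `categorise` and `derivatives_must_stay_open` are hypothetical and chosen for illustration, and the mapping ignores differences between licence versions.

```python
from enum import Enum


class LicenceCategory(Enum):
    COPYLEFT = "copyleft"            # restrictive: derivative works must stay open
    WEAK_COPYLEFT = "weak copyleft"  # grey area between the two extremes
    PERMISSIVE = "permissive"        # non-copyleft: reuse in closed products allowed


# Only the licence families named in the text; not an exhaustive registry.
LICENCE_CATEGORIES = {
    "GPL": LicenceCategory.COPYLEFT,
    "LGPL": LicenceCategory.WEAK_COPYLEFT,
    "MPL": LicenceCategory.WEAK_COPYLEFT,
    "BSD": LicenceCategory.PERMISSIVE,
    "Apache": LicenceCategory.PERMISSIVE,
}


def categorise(licence_family: str) -> LicenceCategory:
    """Look up the category of a licence family, ignoring case and spaces."""
    wanted = licence_family.strip().lower()
    for name, category in LICENCE_CATEGORIES.items():
        if name.lower() == wanted:
            return category
    raise KeyError(f"Licence family not covered by this sketch: {licence_family!r}")


def derivatives_must_stay_open(licence_family: str) -> bool:
    """True only for the fully restrictive (copyleft) category."""
    return categorise(licence_family) is LicenceCategory.COPYLEFT
```

A real compliance tool would work from the full licence text and version rather than from a family name, so the table is best read as a summary of the classification, not as a decision procedure.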
The benefits of the open source-AI intersection
The intersection of the open source and AI worlds represents a turning point in the ongoing technological revolution. The open source community fosters the development of increasingly advanced AI systems, giving public and private actors in the sector rapid access to resources and stimulating innovation without the huge initial investments associated with proprietary software licences.
Examining the source code makes it possible to understand the operating mechanisms of the AI system, increasing transparency and user confidence: if the algorithms are understandable and verified, a high degree of fairness and constant prevention of bias can be guaranteed.
Models and code can be adapted to specific needs, offering a flexibility that closed source solutions cannot provide.
The intersection of AI and open source also fosters the creation of a collaborative ecosystem where developers can share ideas, contribute to improvements and accelerate technological progress.
The availability of the source code also helps to increase the level of security, allowing the functioning of the systems to be constantly monitored and revised so as to respond quickly to threats in cyberspace.
Implementing open source AI
An AI-based system is made up of more components than traditional software, so the definition of open source needs to be adapted and expanded: in the context of AI, there is no source code per se, and the key element is the data used to train the system.
In the case of proprietary AI systems, there is usually a "per token" charge, with the level of charge varying according to the models and their responsiveness. Fees often include hosting and support costs, and the per-token fee is highly volatile. Open source AI systems are often free to download, and although they can be expensive, at least in the installation phase, they prove to be cheaper for customers who use them extensively.
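The cost trade-off described above can be illustrated with a simple break-even calculation. This is a sketch under stated assumptions: the per-token fee is treated as constant (the text notes it is in fact volatile), self-hosting is modelled as a one-off installation cost plus a running cost proportional to volume, and all function names and figures are hypothetical.

```python
import math
from typing import Optional


def hosted_api_cost(tokens: int, price_per_million_tokens: float) -> float:
    """Cost of a pay-per-token service for a given token volume."""
    return tokens / 1_000_000 * price_per_million_tokens


def self_hosted_cost(tokens: int, setup_cost: float,
                     running_cost_per_million_tokens: float) -> float:
    """One-off installation cost plus a running cost that scales with volume."""
    return setup_cost + tokens / 1_000_000 * running_cost_per_million_tokens


def break_even_tokens(price_per_million_tokens: float, setup_cost: float,
                      running_cost_per_million_tokens: float) -> Optional[int]:
    """Token volume from which self-hosting is no more expensive than the API.

    Returns None when the running cost is not below the per-token fee,
    because the installation cost is then never recovered.
    """
    saving_per_million = price_per_million_tokens - running_cost_per_million_tokens
    if saving_per_million <= 0:
        return None
    return math.ceil(setup_cost / saving_per_million * 1_000_000)


# Hypothetical figures: 10 per million tokens via the API, against a
# 20,000 installation cost and 2 per million tokens when self-hosting.
print(break_even_tokens(10.0, 20_000.0, 2.0))  # 2500000000
```

With these illustrative figures, the two options cost the same at 2.5 billion tokens, and only heavier use than that favours the self-hosted system.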
There are many ways to use an open source AI system; the most common scenarios are API or function calls, embedding, linking, modification and translation. It's very common for a product to combine proprietary and open source software, and in most cases this is done through API calls – ie processes by which two pieces of software exchange data – or function calls. In these cases, the source code of the two is neither mixed nor contaminated. A developer could also copy part of a FOSS component into a proprietary product: this is called embedding. The result of this operation is contamination, and it's important to note that copyleft licences require all source code to be open if the resulting product is to be distributed. It's also common for a developer to link or merge a FOSS component with a proprietary product, in which case there is no source code contamination. As with API calls, linking to AI models from other source code is a very common form of implementation. Open code also allows developers to modify a FOSS component, including by adding new FOSS components, correcting and optimising the original components, and deleting certain parts of the code. Finally, source code can be translated, eg from one programming language to another.
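The consequences attached to each scenario can be summarised as a small decision function. This is a simplified sketch of the rules as stated in the paragraph above, not a statement of how any particular licence operates; the names `Integration` and `must_open_all_source` are hypothetical, and the function returns `None` where the text gives no rule.

```python
from enum import Enum
from typing import Optional


class Integration(Enum):
    API_CALL = "API or function call"
    EMBEDDING = "embedding"
    LINKING = "linking"
    MODIFICATION = "modification"
    TRANSLATION = "translation"


def must_open_all_source(mode: Integration, copyleft: bool,
                         distributed: bool) -> Optional[bool]:
    """Say whether the whole product's source must be opened.

    Returns True or False where the text gives an answer, None where it is silent.
    """
    if mode in (Integration.API_CALL, Integration.LINKING):
        # The text says these scenarios do not contaminate the source code.
        return False
    if mode is Integration.EMBEDDING:
        # Copied FOSS code contaminates the product; under a copyleft
        # licence the obligation arises when the product is distributed.
        return copyleft and distributed
    # Modification and translation are described, but no rule is stated.
    return None
```

For example, `must_open_all_source(Integration.EMBEDDING, copyleft=True, distributed=True)` returns `True`, while the same call with `distributed=False` returns `False`, reflecting the point that the copyleft obligation is tied to distribution.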
AI Act and Product Liability Directive: Future prospects for the European approach to FOSS
The European Regulation on Artificial Intelligence (AI Act), in its latest adopted version, has led to confusion about the rules applicable to open source software. Initially, Art. 2(5)(g) of the version of 26 January 2024 exempted AI systems released under open source licences from the obligations of the Regulation, unless they were high risk or fell under Titles II and IV.
A later version addressed the same issue in Art. 2(12), but its wording appeared to make the Regulation applicable to open source systems, even though they were meant to be exempted because they promote sharing and innovation. This error in the last revision led to the belief that FOSS might not be exempted, but the mistake was corrected in the final version of the text published in the Official Journal of the European Union.
The exemption will benefit free software whose parameters (including information about the model architecture) are made public and which is not made available against payment or otherwise monetised (eg by offering paid technical support after release).
To get a clear idea of the European legislator's approach, it's worth looking at recitals 102 and 103 of the latest version of the Regulation. They concern AI systems released under a free and open source licence that allows them to be openly shared and allows users to freely access, use, modify and redistribute them or modified versions of them. Because such systems contribute to research and innovation in the market and offer significant growth opportunities for the EU economy, they should not fall within the scope of the Regulation.
The new Product Liability Directive, which is part of the package of European measures to promote AI, also excludes FOSS from its scope. Once again, this choice seems justified by the promotion of research and innovation in the European market and by the fact that free software that is not developed or supplied in the course of a commercial activity is by definition not "put on the market." On the other hand, where software is provided for remuneration or personal data is used in the context of a commercial activity, the Directive applies.