Choice of LLM and hosting mode, governance framework, business involvement, security and compliance... There are a large number of factors to consider when conducting a generative AI project. Here are some explanations.
Unless you've spent the last two years on a desert island, you can't have escaped the generative AI phenomenon. ChatGPT, Gemini, Claude and Copilot are all getting a huge boost thanks to their performance and intuitive interfaces.
Everyone has had the opportunity to question them, whether privately or professionally, and to be amazed at their ability to respond correctly to a wide variety of subjects.
However, according to Redha Moulla, an independent trainer and consultant in artificial intelligence, there is a huge gap between the 'wow' effect experienced by the average user and the difficulty of successfully completing a generative AI project within a company.
Car "Unlike other technologies, members of senior management have the opportunity to try out ChatGPT. Because it responds quickly to their questions, they don't realise the complexity that lies behind the impression of simplicity.
Their initial interactions lead them to believe that generative AI is easy to deploy.
In fact, once you're past the POC (Proof of Concept) stage, scaling up is far more complicated. According to a study by Deloitte, 68 % of organisations say they have put 30 % or less of their projects into productiona particularly low ratio.
A generative AI project is not a machine learning project
One of the main reasons for failure is the nature of the data used. While 'traditional' machine learning AI essentially uses structured tabular data in CSV or Excel format, theGenerative AI uses unstructured data (text, images, video), which is particularly rich and complex.
[Read our article on the difference between machine learning and deep learning]
"A company's document base is based on Word or PDF documents in very different formats, some of which were created ten years ago, explains Redha Moulla. They have to be sorted, redundant files removed and relevant information extracted. This preparatory work, which is time-consuming, costly and requires manual handling, can prove prohibitive and seal the end of the project.. »
Another difference is that a machine learning algorithm has to be trained from scratch, a generative AI model arrives pre-trained. This difference changes the composition of the teams involved, as Hervé Mignot, Chief AI Officer of digital transformation consultancy Equancy, points out.
"A generative AI project calls for prompt engineering and traditional IT integration skills. It's only at the end, during the fine-tuning phase, that the contribution of data scientists is felt.
In his opinion, many technical profiles can develop their skills in generative AI without having any previous experience in data science. Conversely, " Data scientists may shy away from the idea of participating in generative AI projects. They chose this profession to design models with a strong statistical dimension and prefer to stick with the predictive approach of machine learning".
The trade expert, the justice of the peace
Whereas The involvement of business managers is a key factor in all IT projects, and is even more critical in the field of generative AI. The aim of a large language model (LLM) is to optimise or even reshape workflow processes. " Defining the stages in the work of people you want to 'augment' [using AI] means working closely with operational staff »says Hervé Mignot.
Our expert takes the example of a sales department that wants to automate the response to calls for tender. " What processes will AI be involved in? It can analyse the consultation file, generate a summary note, but also provide answers based on existing content.
For Hervé Mignot, the trade expert is the only judge of the peace. "It is he who decides that an AI model is of sufficient quality to be put into production. The use of agile methods means that we can continuously show operational staff the different iterations of the project.
To facilitate exchanges between the developers of the model and the "Ops", the teams responsible for putting it into production, it recommends adopting an LLMOps (Large Language Model Operations) approach. This set of practices, methods and tools makes it possible to effectively manage the deployment, monitoring and maintenance of LLMs.
Implementing LLMOps best practice also helps to ensure the stability of an AI system over time. " In the event of a version change of the foundation model, the functioning and performance of the application based on it must not be altered.notes Hervé Mignot. Similarly, if the quality of incoming data declines, we need to ensure that it does not degrade the model. This may mean going back to the drawing board to qualify the data".
LLM or SLM? Proprietary or open source?
But first there's the choice of model. Since the launch of ChatGPT, the LLM family has continued to grow. There are proprietary models (Google's Gemini, Anthropic's Claude, etc.) and their open source equivalents (Meta's Llama, Mistral AI, etc.). There is also a distinction between large language models (LLMs), such as GPT-4 and its 175 billion parameters, and 'small' models (SLMs), designed to perform specific tasks.
Hervé Mignot advises starting your generative AI project with a generalist LLM. This enables the relevance of the model to be validated without being hampered by limited performance.
"An SLM can be relevant when the domain covered by the use case is sufficiently small. "Less demanding in terms of computing power, a small model has the advantage of reducing the economic and environmental cost of a generative AI project. It also offers greater security. An SLM coding assistant can be installed locally on the developer's workstation, rather than being hosted in the cloud.
In all cases, it recommends the use of Recovery Augmented Generation (RAG) technology. It consists of improving the responses of a model by relying on an internal knowledge base that is deemed reliable and independent of the LLM training data. For example, the company's document database.
As a service" approach or on-premises hosting
Then there's the question of hosting. There are two possibilities for a company. The most common solution is to use a model hosted in the cloud, via an API. The American hyperscalers, Google Cloud, AWS and Microsoft Azure... but also French players, such as Scaleway and OVHcloud, offer this "as a service" approach.
" Some companies are reluctant to use this method for reasons of data security and confidentiality", moderates Redha Moulla.
Another solution is for the company to host an open source foundation model on its own servers (on-premises). This involves investing in computing power. ad hocThis is a particularly expensive resource. What's more, the company will have to maintain this dedicated infrastructure without being able to pool it as cloud players do. Once again, this can be a particularly costly and complex choice.
Safety, compliance and post-production monitoring
Finally, the life of a generative AI project does not end when it goes into production. The AI model must then be monitored like milk on the fire to ensure that it does not drift over time and become the subject of hallucinations. Studies show that a generative AI model can produce up to 21 % of erroneous content or "hallucinations", which can have harmful consequences.
At the same time, cybercriminals have developed a range of so-called "prompt injection" techniques to force a template to generate unwanted, misleading or toxic content.
If the model is for internal use only, "Safeguards must be put in place to ensure that employees do not have access to unauthorised documents. "adds Redha Moulla. According to a survey by Cybersecurity Ventures, 60 % of data leaks come from internal sources.
Finally, organisations need to anticipate the requirements of the AI Act, the European regulation on AI that will come fully into force in 2026. To comply, they need to map their models in production now and classify them according to their level of risk.
To sum up, a successful generative AI project requires a multidisciplinary approach, combining technical skills, strong business involvement, particular attention to security and compliance, and rigorous management of resources and expectations.