
AI: code generators, a revolution for developers

Published on February 7, 2024


Development is one of the areas most affected by advances in AI. The phenomenon did not wait for ChatGPT. How can AI facilitate software and web design, programming and optimisation? What are the best AI code assistants? What are their limitations?

"AI will be writing 80 % of code within 5 years".
Thomas Dohmke, CEO of GitHub

"AI will be writing 80 % of the code within 5 years" was the prophetic and slightly provocative statement made in 2022 by Thomas Dohmke, the boss of GitHub, the source code collaboration and management platform used by more than 100 million developers.

Generative AIs such as ChatGPT, Copilot and Gemini (formerly Google Bard) have been instrumental in democratising artificial intelligence. Based on large language models (LLMs), these AIs are not just used to create text or images. They can also generate code, comment on it, suggest optimisations or hunt down bugs, saving developers precious time.

Launched in 2021, GitHub Copilot, the most popular code-generating AI, already had more than a million paying users in over 37,000 organisations by October 2023!

That is all it takes to make 2024 the year of AIs specialising in code. It is also reason enough to rekindle the debate, sparked by the low-code/no-code trend, about the disappearance of the developer profession, or at least its inevitable evolution.

It only took 16 minutes to create this game!

With FRVR, it took just 8 minutes to create the code for Space Aliens and a further 8 minutes for the design.

Code, the ideal data to feed AI

Code is one of AI's favourite fields, and was so long before the arrival of ChatGPT in November 2022. Earlier tools include the IBM Watson Code assistant in 2015, DeepCode in 2016 to find security flaws in a program, and Microsoft IntelliCode in 2017 to improve IntelliSense, the semi-automatic code completion that had appeared in Visual Studio ten years earlier.

This is hardly surprising, since these AIs are based on large language models (LLMs) trained on large text-based datasets. And to feed machine learning, these AIs need structured data. Code is the ideal candidate: it is in text format, structured, and available in large quantities as open source in huge repositories such as GitHub or SourceForge.

For example, GitHub Copilot was trained on GitHub's 54 million repositories, including a dataset of 159 GB of Python code.


What is the purpose of code AIs?

Using artificial intelligence has therefore become common practice for many developers, especially as these tools are readily available. They come as extensions for the most common development environments (IDEs), such as Visual Studio Code, Eclipse or JetBrains' specialist software (IntelliJ, PhpStorm, PyCharm, WebStorm...).

To date, there are several hundred development aid tools, covering a wide variety of applications. And the number is growing all the time. New AI-based tools are appearing every week, meeting the demand from developers for more efficient, high-performance solutions.

AI can be used for a wide range of development purposes:

***Code generation and bug fixing account for half of Tabnine users' uses***

Automatic code generation

This is by far the function most used by developers. The AI creates code from a request (a prompt), that is, a natural-language description of what you want, or from code already entered, by autocompleting it. Autocompletion is an improvement on the semi-automatic input systems that have long existed in IDEs: instead of just guessing the next word, the AI generates one or more lines of code based on statistical probabilities.

The AI only makes suggestions. The developer can refine their request iteratively to obtain the desired result.
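To make the idea concrete, here is a hand-written sketch of what prompt-driven generation can look like in Python. The comment plays the role of the prompt and the function body is the kind of suggestion an assistant might propose. The function name `slugify` and its exact behaviour are assumptions chosen for illustration, not output captured from any particular tool.

```python
import re

# Prompt, written by the developer as a comment:
# "Turn a title into a URL slug: lowercase, words separated by hyphens,
#  no punctuation."


def slugify(title: str) -> str:
    """Return a lowercase, hyphen-separated slug for the given title."""
    # Replace every run of characters other than a-z and 0-9 with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop any hyphen left over at the start or end.
    return slug.strip("-")
```

If the first suggestion is not quite right (for instance, if accented letters should be kept), the developer would adjust the prompt and ask again, which is the iterative refinement described above.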

The time savings are enormous. Developers can code faster (55 % faster according to GitHub statistics), more accurately and with a better understanding of the code.

**GitHub Copilot lets you create a button component in a second. All you have to do is ask!**

Bug detection and correction

Code AI is also used to debug code, to validate it against the syntax of a given language, and to detect errors that are difficult to find in complex code. The AI analyses the code submitted to it and detects any errors, such as:

syntax errors: a semi-colon is missing at the end of an instruction in Java or JavaScript, a closing tag is missing in PHP, or there are too many braces in a CSS file...

execution errors: use of an uninitialised or deleted variable (NullPointer), attempt to access an object that is no longer accessible in memory (PointerException), call stack too large (StackOverflow)...

logic errors: division by zero, arithmetic overflow, non-existent function calls, etc.

input-output errors: errors reading from or writing to a storage medium, attempts to open a file that does not exist, etc.

formatting errors: unexpected format in a character string, incorrect data type, etc.
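As a rough illustration of these categories, the sketch below triggers one Python error of each kind and reports which family it belongs to. The helper names `classify_failure` and `recurse_forever` are hypothetical, and the mapping from Python exceptions to the categories above is an assumption made for the example, not a description of how any AI tool works internally.

```python
def recurse_forever() -> None:
    """Call itself without a stopping condition to exhaust the call stack."""
    recurse_forever()


def classify_failure(action) -> str:
    """Run a zero-argument callable and name the kind of error it raises."""
    try:
        action()
    except SyntaxError:
        return "syntax error"
    except ZeroDivisionError:
        return "division by zero"
    except FileNotFoundError:
        return "input-output error"
    except AttributeError:
        return "uninitialised value"
    except ValueError:
        return "formatting error"
    except RecursionError:
        return "call stack too large"
    return "no error"


# One small trigger per category; each lambda is only run by classify_failure.
EXAMPLES = {
    "unclosed parenthesis": lambda: compile("print('hi'", "<snippet>", "exec"),
    "divide by zero": lambda: 1 / 0,
    "missing file": lambda: open("no_such_dir/missing.txt"),
    "method call on None": lambda: None.upper(),
    "text that is not a number": lambda: int("12a"),
    "endless recursion": recurse_forever,
}
```

Calling `classify_failure(EXAMPLES["missing file"])`, for example, returns `"input-output error"`.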

Code optimisation and rewriting

AIs are capable of improving code quality, execution speed and readability. They suggest changes to be made: refactoring code (simplifying structure, making code more modular and reusable, applying SOLID principles, etc.), reducing superfluous loops, parallelizing tasks, optimizing data structures and input-output, etc.

AIs sniff out code smells. These are bad software design practices that make code more difficult to debug and increase the risk of bugs. Code duplication, non-explicit variable names or excessive coupling (classes and modules that are too interdependent) are tracked down in a matter of seconds.
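The following before-and-after sketch shows two of the smells mentioned above, duplication and non-explicit names, and one possible refactoring. All function names and tax rates are invented for the example.

```python
# Before: the same formula is written twice and the names say nothing.
def calc1(a):
    return round(a + a * 0.2, 2)


def calc2(a):
    return round(a + a * 0.055, 2)


# After: one well-named function, with the rate as an explicit parameter.
def price_with_tax(net_price: float, tax_rate: float) -> float:
    """Return the price including tax, rounded to two decimals."""
    return round(net_price * (1 + tax_rate), 2)
```

The refactored version gives the same results, for instance `price_with_tax(100.0, 0.2)` matches `calc1(100.0)`, while leaving a single place to change if the rounding rule evolves.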

In addition, the AI formats code according to the conventions and best practices of a given language while adding comments and documentation. This makes the code easier to read, understand and maintain, so that developers, even those who have not been involved before, can get up to speed on the project quickly.

Code analysis and explanation

To better understand a piece of code, a developer can ask an AI questions in natural language via a chat interface. Some AIs are also capable of generating code visualisations to help developers better understand the structure of the code, its logic and the relationships between the different parts that make it up.

The massive use of AI to explain code has led to a fall in traffic to Stack Overflow, the famous question-and-answer forum consulted by millions of developers, leading it to lay off 28 % of its staff in October 2023.

Code documentation

Completing the code means writing its documentation, a tedious and often neglected stage. All too often, developers are pushed to deliver working code faster and faster, leaving documentation in the background.

Artificial intelligence can change all that. By analysing the code, it can immediately produce complete, formatted documentation, guaranteeing clarity and readability. This revolutionary process considerably reduces technical debt and allows developers to concentrate on their core business: creating code.
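As an indication of what such generated documentation might look like, here is a small Python function with a structured docstring of the kind an assistant could propose. The function `moving_average` and the docstring layout are assumptions made for this sketch. A human reviewer would still need to check that the description matches the behaviour.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Compute the simple moving average of a sequence.

    Args:
        values: Numbers to average, in order.
        window: Number of consecutive values in each average; must be >= 1.

    Returns:
        One average per full window, so the result has
        ``len(values) - window + 1`` items (empty if the window is
        larger than the input).

    Raises:
        ValueError: If ``window`` is smaller than 1.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```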

Detecting security vulnerabilities

You might think that AI would be just as useful for detecting security flaws as it is for correcting code. It is. AIs such as Checkmarx CheckAI, BurpGPT, DeepCode, Codacy and GitHub Copilot detect vulnerabilities and recommend fixes, sometimes applied with just one click. They detect known vulnerabilities and security anti-patterns such as SQL injection or XSS. Some AIs can analyse the flow of data in the source code to identify potential entry and exit points for attacks.
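To show the kind of anti-pattern in question, the sketch below reproduces a classic SQL injection with Python's standard `sqlite3` module, next to the usual fix of a parameterised query. The table, data and function names are invented for the example; it illustrates the flaw itself, not how any of the tools above detects it.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is pasted straight into the SQL text.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Fixed: the value is passed separately from the SQL text.
    query = "SELECT name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


# Input crafted to turn the WHERE clause into a condition that is always true.
PAYLOAD = "' OR '1'='1"
```

With `PAYLOAD` as input, the unsafe version returns every row in the table, whereas the safe version looks for a user literally named `' OR '1'='1` and finds none.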

However, we need to remain vigilant and not rely solely on AI. Manual static and dynamic tests remain imperative.

**Codacy analyses the code and displays the necessary corrections in your Git repository.**

Automated test generation

AI can also generate and run tests automatically. This brings a number of benefits, including time savings, improved code coverage (identifying parts of code that have not been tested manually), better bug detection and a reduction in the costs associated with manual testing.
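By way of illustration, here are the sort of unit tests an assistant might propose for a small function, written with Python's standard `unittest` module. The function `apply_discount` and the chosen test cases are hypothetical. Real generated tests would need the same review as any other generated code.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Return the price after removing a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTests(unittest.TestCase):
    def test_regular_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_keeps_price(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)

    def test_full_discount_is_free(self):
        self.assertEqual(apply_discount(80.0, 100), 0.0)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(80.0, 120)
```

Saved in a file, these tests can be run with `python -m unittest`. Note that they cover boundary values (0 and 100) and an invalid input, which are the cases most easily forgotten when writing tests by hand.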

Test generation tools include DeepCode, which uses machine learning to generate unit and integration tests.

However, this technology is not yet perfect. It saves time, but still needs to be supplemented by manual testing, particularly to test the non-functional aspects of an application or site, such as usability or performance. So AI cannot yet be seen as a substitute for manual testing.

Converting from one language to another

Having mastered several programming languages, most code AIs are able to translate code from one language to another. This is not an easy operation, as moving from one language to another involves converting syntax (changing from block structure to indentation, for example), logic (changing paradigms and control structures), finding equivalent libraries in the target language, converting comments and documentation, and optimising the code for the target language.
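The sketch below gives a small idea of what such a translation involves. A short Java method is shown in comments, followed by a hand-written Python equivalent: braces give way to indentation, and the `HashMap` counting logic is replaced by `collections.Counter` from the Python standard library. Both versions are written for this example and are not the output of a specific tool.

```python
# Original Java (block structure, explicit types, java.util collections):
#
#   static Map<String, Integer> countWords(String text) {
#       Map<String, Integer> counts = new HashMap<>();
#       for (String word : text.toLowerCase().split("\\s+")) {
#           if (!word.isEmpty()) {
#               counts.merge(word, 1, Integer::sum);
#           }
#       }
#       return counts;
#   }

from collections import Counter


def count_words(text: str) -> dict[str, int]:
    """Count how many times each lowercase word appears in the text."""
    # str.split() with no argument splits on any whitespace and skips
    # empty strings, so the explicit isEmpty() check is no longer needed.
    return dict(Counter(text.lower().split()))
```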

So you can go from C++ to Python and vice versa, or from Java to Python in a matter of seconds. Automatic code conversion by AI speeds up the software development process on different platforms and avoids manual translation errors. In addition to the code AIs listed below, there are specialist AIs such as DeepCode, Polyglot, Code Trans and AI Code Translator.

Learn a new language

Code AIs support a large number of programming languages, breaking down the barriers that exist between different developers. So you can easily learn a new language by asking the AI to convert code from a language you already know. As we saw above, the AI can also explain existing code, and you can test yourself by asking it to correct your mistakes. The icing on the cake is that AIs such as ChatGPT or Gemini (formerly Bard) are capable of creating code exercises and MCQs to test your new knowledge.

For example, GitHub Copilot is able to understand and generate code in the main languages and frameworks, including:

Compiled languages: C, C++, C#, Go, Java...
Scripting languages: Bash, JavaScript, PowerShell, Python, Ruby, TypeScript...
Markup languages: CSS, HTML, XML, JSON, YAML...
Database languages: SQL, MySQL, PostgreSQL...
Domain-specific languages: SQL, R...
Frameworks: Django (Python), Angular and React (JavaScript), ASP.NET (C#), Spring Boot (Java), Laravel (PHP), Qt (C++)...

Of course, artificial intelligence can't match the highly specialised and interactive training we offer at ORSYS. Our 246 training courses in software development and web are run by experienced professionals in the field who tailor their courses to your needs. AI does not replace human contact.

The limits of code generation AI

AI is revolutionising the development process at every stage. However, these tools still face numerous limitations and constraints that need to be borne in mind before using them.

A standardised code

Because it is trained on open source code repositories, AI produces standardised, repetitive code that sometimes degrades quality and impoverishes the codebase as a whole. GitClear, a consultancy firm, analysed 153 million changed lines of code written between January 2020 and December 2023. It concluded that:

"Code churn - the percentage of lines that are revised or updated less than two weeks after being created - is expected to double in 2024 compared to its baseline value in 2021, before the advent of AI."

According to GitClear, the increase in copy-and-paste code caused by AI is likely to worsen companies' technical debt over the next few years.
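The metric quoted above, the share of lines revised less than two weeks after being created, is simple arithmetic. The sketch below computes it on made-up data; the function name `churn_rate`, the input format and the sample dates are assumptions for illustration and do not reproduce GitClear's actual methodology.

```python
from datetime import date
from typing import Optional


def churn_rate(
    lines: list[tuple[date, Optional[date]]], window_days: int = 14
) -> float:
    """Return the percentage of lines revised within `window_days` of creation.

    Each item is (created_on, revised_on), with revised_on set to None
    when the line was never changed.
    """
    if not lines:
        return 0.0
    churned = sum(
        1
        for created, revised in lines
        if revised is not None and (revised - created).days < window_days
    )
    return 100 * churned / len(lines)


# Hypothetical history of four lines of code.
SAMPLE = [
    (date(2024, 1, 1), date(2024, 1, 4)),   # revised after 3 days: churn
    (date(2024, 1, 1), date(2024, 1, 31)),  # revised after 30 days: not churn
    (date(2024, 1, 1), None),               # never revised
    (date(2024, 1, 1), None),               # never revised
]
```

On this sample, one line out of four was revised within two weeks, so `churn_rate(SAMPLE)` gives 25.0.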

Uneven quality across languages

Code AIs are like all AIs: the quality of their response depends closely on the quantity and quality of the data used to train them. As a result, the more popular a programming language is in public code repositories such as GitHub, GitLab or Bitbucket, the more relevant the AI suggestions will be.

For example, JavaScript is very present in repositories. AI suggestions for JavaScript will therefore be highly relevant. Conversely, languages that are well-known but comparatively less present in code repositories, such as Julia, SAP's ABAP or MATLAB, will be poorly supported.

Programming biases

Public repositories that serve as training data for AIs may contain code that is poorly written or skewed towards a certain result. This can carry over into AI suggestions, for example by generating discriminatory code.

AI does not understand complex code

At the moment, code generation AIs are unable to handle a project as a whole.

AIs are ideal for generating short pieces of code, such as snippets or functions. Conversely, their suggestions are less relevant for long code. For the moment, working on a complex project means breaking it down so that the AI can analyse it part by part. However, the AI will not be able to take into account the whole picture or integrate the entire context of the project (budget, time, target platform, developer skills, available tools, stakeholder needs, etc.).

AIs suffering from hallucinations

Well known to ChatGPT users, hallucinations are answers that LLMs invent when they cannot find the right answer to a question, and then present as established fact. Applied to development, this can produce incorrect or insecure code that wastes the developer's time.
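A typical symptom is a call to a method that looks plausible but does not exist. In the hand-written sketch below, the first function imitates such a suggestion (Python strings have no `reverse()` method), the second shows a working alternative, and a hypothetical helper, `suggestion_runs`, acts as a very cheap smoke test. It only checks that the code executes, not that its result is right.

```python
def reverse_text_hallucinated(text: str) -> str:
    # Plausible-looking but invented: str has no reverse() method,
    # so this line raises AttributeError when it runs.
    return text.reverse()


def reverse_text(text: str) -> str:
    # What actually works: slicing with a step of -1.
    return text[::-1]


def suggestion_runs(func, sample: str) -> bool:
    """Return True if the suggested function executes on the sample input."""
    try:
        func(sample)
        return True
    except AttributeError:
        return False
```

Running every suggestion at least once, ideally through real unit tests, is the simplest way to catch this kind of invention before it reaches the codebase.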

Copyright and data confidentiality issues

Since AIs make extensive use of public data, their training sets undoubtedly include some code subject to copyright, with risks of plagiarism. What's more, it's important to remember that when you use an AI, you may also be training it. By submitting proprietary code or code subject to confidentiality agreements, you could be breaching your company's policy. Some publishers, such as Tabnine and GitHub for Copilot Business, do not store your code and only train their AI on open source repositories whose licence allows it.

Potential security flaws

Be careful not to rely too heavily on AI for the security of your data. The code generated by AIs is neither tested nor validated by humans. What's more, the quality of the code depends, as we have seen, on the quality of the data on which it has been trained. If the data is compromised or contains errors, the code generated could also be compromised.

What's more, the code generated by AIs can be difficult to audit and understand, making it hard to detect security flaws. Otherwise, only AIs specialised in cyber security will be able to analyse it!

Ethical issues

Programs that involve humans (health, autonomous driving, selection of candidates, etc.) may raise ethical issues that are not dealt with by AIs or that may lead to biased results. Would you entrust the lives of patients or users of autonomous vehicles to an AI?

The future of development will be written by AI

While there is still room for improvement in the tools, AI is making steady progress and is already providing many services to developers, who can concentrate on other, higher added-value, collaborative tasks. For example, they will have more time to attend the many project meetings (coordination, demos, monitoring, validation, retrospective, etc.), where, alas, AI cannot yet replace them!

The main code generation AIs

GitHub Copilot

The most famous code assistant. Not to be confused with Copilot, the conversational AI in Windows 11, Bing and Microsoft 365. GitHub Copilot was developed in 2021 by GitHub, a Microsoft subsidiary, in partnership with OpenAI. Its main advantage is that it can be integrated as an extension into major IDEs such as Visual Studio Code, Visual Studio, Vim, Neovim, JetBrains IDEs and Azure Data Studio.

Price: From 10 $/month or 100 $ per year

Tabnine

Used by 3 million developers, this open source tool works with GPT models to predict and suggest code as the developer writes. The tool also provides snippets for common tasks and a chat function. Tabnine integrates with the major IDEs: Visual Studio, Eclipse, Android Studio and JetBrains IDEs.

Price: from 12 $/month (free limited version)

Amazon CodeWhisperer

Developed by Amazon, CodeWhisperer writes both snippets and complete functions from your request or existing code. It can be used on the command line or via an IDE (VS Code, Visual Studio, JetBrains IDE, AWS Cloud9, etc.). Its Amazon Q conversational assistant provides personalised advice.

Price : from 20 $/month (free for personal use)

Google Gemini (formerly Bard)

The new version of Google's AI is called Gemini and is more efficient, particularly for development. Gemini generates code in around twenty languages, can explain, debug and comment on code and write functions for Google Sheets.

Price: free (for now)

ChatGPT

The famous generative AI was not designed specifically for programming, but it can understand and generate code in the main languages. ChatGPT has a number of limitations: it does not create complete programs, does not integrate with IDEs and is not very effective at detecting security flaws.

Price: from 20 $/month (free limited version with GPT-3.5)

Our expert

ORSYS Editorial Board

Made up of journalists specialising in IT, management and personal development, the ORSYS Le mag editorial team [...]