AI: code generators, a revolution for developers

Published on February 7, 2024

Development is one of the areas most affected by advances in AI, and the phenomenon predates ChatGPT. How can AI help with software and web design, programming and optimization? What are the best AI code assistants? What are their limits?

“AI will write 80% of code within five years,” declared Thomas Dohmke in 2022, in a prophetic and slightly provocative manner. Dohmke is the CEO of GitHub, the platform used by more than 100 million developers.

Generative AIs like ChatGPT, Copilot or Gemini (formerly Google Bard) have marked the democratization of artificial intelligence. Based on large language models (LLMs), these AIs are not just used to create text or images. They can also generate code, comment on it, suggest optimizations or hunt for bugs, saving developers valuable time.

Launched in 2021, GitHub Copilot, the most popular code generation AI, already had more than 1 million paying users across more than 37,000 organizations in October 2023.

That is all it takes to make 2024 the year of AI specialized in coding, and enough to revive the debate, sparked by the low-code/no-code trend, on the disappearance of the developer profession, or at least its inevitable evolution.

It only took 16 minutes to create this game!

Thanks to the AI of FRVR, it only took 8 minutes to create the code for the Space Aliens game and another 8 minutes for the design.

Code, ideal data to power AI

Code is one of AI's favorite fields, and was so well before the arrival of ChatGPT in November 2022. Examples include the IBM Watson Code assistant in 2015, DeepCode in 2016, which finds security vulnerabilities in a program, and Microsoft IntelliCode in 2017, which improves the automatic code completion of IntelliSense, a feature that appeared ten years earlier in Visual Studio.

Unsurprisingly, these AIs are based on large language models (LLMs) trained on large text-based datasets. And to power machine learning algorithms, these AIs need structured data. Code is therefore an ideal candidate: it is text, it is structured, and it is available in large quantities as open source in huge repositories like GitHub or SourceForge.

For example, GitHub Copilot was trained on GitHub's 54 million repositories, including a 159 GB dataset of Python code.

What are code AIs used for?

Using artificial intelligence has therefore become common practice for many developers, especially since these tools are within easy reach. They come as extensions for the most common development environments (IDEs), such as Visual Studio Code, Eclipse or specialized software from JetBrains (IntelliJ, PhpStorm, PyCharm, WebStorm…).

To date, there are several hundred development support tools, covering a wide variety of applications. Their number is constantly increasing. New AI-based tools appear every week, meeting developers' demand for more efficient and effective solutions. 

There are many uses of AI for development:

Code generation and bug fixing represent half of Tabnine users' usage

Automatic code generation

This is by far the function developers use most. The AI creates code from a request (a prompt), that is, a natural-language description of what is wanted, or from the code already entered, by automatically completing it. This autocomplete feature improves on the autocomplete systems that have long existed in IDEs: instead of just guessing the next word, the AI generates one or more lines of code based on statistical probabilities.

The AI only makes suggestions. The developer can iteratively refine their request to obtain the desired result.   
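To give an intuition of what "based on statistical probabilities" means, here is a deliberately tiny sketch. It is not how Copilot or any real assistant works: it assumes a toy bigram model that only counts which token most often follows another in a three-line corpus. The names `train_bigrams` and `suggest` are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(lines):
    """Count, for each token, which tokens follow it in the training lines."""
    follows = defaultdict(Counter)
    for line in lines:
        tokens = line.split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows

def suggest(follows, token, max_tokens=4):
    """Greedily extend `token` with its most frequent follower at each step."""
    completion = []
    for _ in range(max_tokens):
        candidates = follows.get(token)
        if not candidates:
            break  # nothing seen after this token: stop suggesting
        token = candidates.most_common(1)[0][0]
        completion.append(token)
    return completion

# Toy "training data" (assumption: whitespace-separated tokens).
corpus = [
    "for i in range ( n ) :",
    "for item in items :",
    "for i in range ( 10 ) :",
]
model = train_bigrams(corpus)
print(suggest(model, "for"))  # ['i', 'in', 'range', '(']
```

A real LLM replaces these raw counts with a neural network conditioned on a much longer context, but the principle of proposing the most probable continuation is the same.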

The time saving is enormous. Developers can code faster (55% faster according to GitHub statistics), with more precision and with a better understanding of the code.

GitHub Copilot allows you to create a button component in one second. You just have to ask!

Bug detection and correction

A code AI is also used to debug code, to validate it against the syntax of a given language, but also to detect errors that are difficult to find in complex code. Thus, the AI analyzes the code submitted to it and detects possible errors, such as:

  • syntax errors: a semicolon missing at the end of a statement in Java or JavaScript, a missing closing tag in PHP, an extra brace in CSS, etc.
  • runtime errors: use of an uninitialized or deleted variable (null pointer), attempt to access an object that is no longer accessible in memory (invalid pointer), call stack too large (stack overflow)…
  • logic errors: division by zero, arithmetic overflow, call to a non-existent function…
  • input-output errors: read or write errors on a storage medium, attempt to open a file that does not exist, etc.
  • formatting errors: a character string in an unexpected format, incorrect data type…
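Two of the categories above can be illustrated with a few lines of Python. This is a minimal sketch using classic static analysis from the standard `ast` module, not an AI model; the function name `find_issues` is hypothetical, and it only catches syntax errors and divisions by a literal zero.

```python
import ast

def find_issues(source):
    """Return a list of human-readable issues found in `source`."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # The parser stops at the first syntax error it meets.
        return [f"syntax error at line {exc.lineno}: {exc.msg}"]
    issues = []
    for node in ast.walk(tree):
        # Flag divisions whose right-hand side is the literal 0.
        if (isinstance(node, ast.BinOp)
                and isinstance(node.op, (ast.Div, ast.FloorDiv, ast.Mod))
                and isinstance(node.right, ast.Constant)
                and node.right.value == 0):
            issues.append(f"division by zero at line {node.lineno}")
    return issues

print(find_issues("total = 10 / 0\n"))
print(find_issues("def broken(:\n    pass\n"))
```

An AI assistant goes further by spotting errors that depend on context, but it is usually combined with deterministic checks of this kind.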

Optimization and code rewriting

AIs are capable of improving code quality, execution speed and readability. They suggest changes to be made: refactoring code (simplification of the structure, making the code more modular and reusable, application of SOLID principles, etc.), reduction of unnecessary loops, parallelization of tasks, optimization of data structures and input-outputs, etc.   

AIs sniff out code smells, poor software design practices that can make code more difficult to debug and increase the risk of bugs. Thus, code duplications, non-explicit variable names or excessive couplings (classes and modules that are too interdependent) are tracked down in a few seconds.
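As an illustration, here is the kind of before/after rewrite an assistant might suggest for duplicated logic and non-explicit names. The functions and tax rates are invented for the example and do not come from any real tool's output.

```python
# Before: duplicated logic and non-explicit names (two code smells).
def calc1(a):
    return round(a + a * 0.2, 2)

def calc2(a):
    return round(a + a * 0.055, 2)

# After: one reusable function with explicit names.
def price_with_tax(net_price, tax_rate):
    """Return the price including tax, rounded to two decimals."""
    return round(net_price + net_price * tax_rate, 2)

# The refactored function gives the same results as both originals.
print(price_with_tax(100, 0.2) == calc1(100))
print(price_with_tax(100, 0.055) == calc2(100))
```

The arithmetic is left unchanged on purpose: a refactoring should alter the structure of the code, not its behavior.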

Furthermore, the AI formats the code according to the conventions and best practices of a given language while adding comments and documentation. This makes the code easier to read, understand and maintain, so that developers can quickly pick up the project, even if they did not take part in it.

Code analysis and explanation

To better understand code, a developer can question an AI in natural language through a chat interface. Some AIs are also capable of generating code visualizations to help developers better understand the structure of the code, its logic and the relationships between its different parts.

The massive use of AI to explain code has caused the audience of Stack Overflow, the popular Q&A forum consulted by millions of developers, to plummet, prompting the company to lay off 28% of its staff in October 2023.

Code documentation

Completing the code requires writing its documentation, a tedious and often overlooked step. Too often, developers are pushed to deliver working code faster and faster, leaving documentation on the back burner.

Artificial intelligence can be a game changer. By analyzing the code, it can immediately produce complete and formatted documentation, ensuring its clarity and readability. This revolutionary process significantly reduces technical debt and allows developers to focus on their core business: creating code.
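To make the idea concrete, here is a minimal, rule-based sketch of documentation produced from code analysis. It is not an AI: it only assumes that a docstring template can be derived from a function's signature with the standard `inspect` module, leaving the descriptions to be filled in. An AI assistant would also write those descriptions. The helper `docstring_skeleton` is hypothetical.

```python
import inspect

def docstring_skeleton(func):
    """Build a docstring template listing the parameters of `func`."""
    signature = inspect.signature(func)
    lines = [f"{func.__name__}{signature}", "", "Args:"]
    for name in signature.parameters:
        lines.append(f"    {name}: TODO describe")
    lines += ["", "Returns:", "    TODO describe"]
    return "\n".join(lines)

# Undocumented function used as input for the example.
def area(width, height=1.0):
    return width * height

print(docstring_skeleton(area))
```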

Security breach detection

You might expect AIs to be as useful for detecting security vulnerabilities as they are for fixing code. They are. AIs like Checkmarx CheckAI, BurpGPT, DeepCode, Codacy and GitHub Copilot detect flaws and recommend fixes, sometimes applied in one click. They detect known vulnerabilities and security anti-patterns such as SQL injection or XSS. Some AIs can also analyze the flow of data in source code to identify potential entry and exit points for attacks.


However, we must remain vigilant and not rely solely on AI. Conducting manual static and dynamic tests remains imperative.
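The SQL injection anti-pattern mentioned above, and the fix a tool would typically recommend, can be sketched as follows. The example assumes an in-memory SQLite database with a hypothetical `users` table; the function names are illustrative.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Vulnerable: the input is pasted directly into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Safer: the driver binds the value, so it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
print(find_user_unsafe(conn, payload))  # returns every row
print(find_user_safe(conn, payload))    # returns no row
```

The parameterized version treats the payload as a plain string to compare against, which is why it matches nothing.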

Codacy analyzes the code and displays the necessary fixes in your Git repository.

Automated test generation

AI is also capable of generating and running automated tests. This brings many benefits, such as saving time, improving code coverage (identifying parts of code that have not been manually tested), better bug detection and reducing the costs associated with manual testing.

Test generation tools include DeepCode, which uses machine learning to generate unit and integration tests.

However, this technology is not yet perfect. It saves time, but must still be supplemented by manual testing, in particular to test non-functional aspects of an app or site such as usability or performance. We cannot yet consider AI as a substitute for manual testing.
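For reference, here is the kind of unit test suite an assistant might propose for a small function. Both the `slugify` function and its tests are written by hand for illustration, using the standard `unittest` module; they are not the output of any particular tool.

```python
import unittest

def slugify(title):
    """Lowercase `title` and join its words with hyphens."""
    return "-".join(title.lower().split())

class SlugifyTests(unittest.TestCase):
    def test_basic_title(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  AI   code  "), "ai-code")

    def test_empty_string(self):
        self.assertEqual(slugify(""), "")

# Run the suite programmatically and report the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Note that such tests only cover functional behavior. Usability and performance still have to be checked separately.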

Converting from one language to another

Proficient in several programming languages, most code AIs are able to translate code from one language to another. The operation is not easy, because moving from one language to another requires converting the syntax (changing block structure to indentation, for example) and the logic (changing paradigms and control structures), finding equivalent libraries in the target language, converting comments and documentation, and optimizing the code for the target language.

We can thus switch from C++ to Python and vice versa or from Java to Python in a few seconds. AI-powered automatic code conversion accelerates the software development process across different platforms and avoids manual translation errors. In addition to the code AIs below, there are specialized AIs like DeepCode, Polyglot, Code Trans and AI Code Translator.
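Here is a small hand-written illustration of a Java-to-Python conversion. The Java method shown in the comment is a hypothetical input, and the Python function is one plausible idiomatic translation, not the output of a specific tool.

```python
# Original Java (hypothetical input):
#
#   static int sumOfSquares(int[] values) {
#       int total = 0;
#       for (int i = 0; i < values.length; i++) {
#           total += values[i] * values[i];
#       }
#       return total;
#   }

def sum_of_squares(values):
    """Python equivalent: braces become indentation and the
    index-based loop becomes a generator expression."""
    return sum(value * value for value in values)

print(sum_of_squares([1, 2, 3]))  # 14
```

Even this short example hides a difference in logic: a Java `int` overflows silently beyond 32 bits, whereas a Python integer grows without limit. A faithful translation has to decide which behavior to keep.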

Learn a new language

Code AIs support a large number of programming languages, breaking down the barriers between developers. We can thus easily learn a new language by asking the AI to convert code from a language we have already mastered. As we saw above, the AI can also explain existing code or let you test yourself by asking it to correct your errors. The icing on the cake is that AIs like ChatGPT or Gemini (formerly Bard) are able to create exercises and multiple-choice coding questions to test your new knowledge.

For example, GitHub Copilot is able to understand and generate code in major languages and frameworks, including:

  • Compiled languages: C, C++, C#, Go, Java…
  • Scripting languages: Bash, JavaScript, PowerShell, Python, Ruby, TypeScript…
  • Markup and data languages: CSS, HTML, XML, JSON, YAML…
  • Database languages: SQL, MySQL, PostgreSQL…
  • Domain-specific languages: R…
  • Frameworks: Django (Python), Angular and React (JavaScript), ASP.NET (C#), Spring Boot (Java), Laravel (PHP), Qt (C++)…

Of course, artificial intelligence cannot match the specialized and interactive training that we offer at ORSYS. Our 246 training courses in software and web development are led by experienced field professionals who adapt their courses to your needs. AI does not replace human contact.

The limits of code generation AI

AI is revolutionizing the development process at every stage. However, these tools still face many limitations and constraints that must be kept in mind before using them.

A standardized code

Because it is trained on open source code repositories, AI tends to produce standardized, repetitive code, which sometimes degrades quality and impoverishes the codebase as a whole. The consulting company GitClear analyzed 153 million lines of modified code written between January 2020 and December 2023.

According to GitClear, the increase in copying and pasting of code caused by AI is expected to worsen companies' technical debt in the coming years.

Unequal quality depending on the language

Code AIs are like all AIs: the quality of their response depends closely on the quantity and quality of the data used for their training. So, the more popular a programming language is in public code repositories like GitHub, GitLab or Bitbucket, the more relevant the AI suggestions will be.

For example, JavaScript is very present in repositories, so AI suggestions for JavaScript will be very relevant. Conversely, languages that are well known but comparatively rare in code repositories, such as Julia, SAP's ABAP or MATLAB, will be poorly supported.

Programming biases

Public repositories that serve as AI training data may contain poorly written or biased code. These flaws can carry over into the AI's suggestions, up to and including the generation of discriminatory code.

AI does not understand complex code

At the moment, code generation AIs cannot handle a project as a whole.

AIs are perfect for generating short code, code fragments like snippets or functions. Conversely, their suggestions are less relevant on long code. At the moment, working on a complex project requires cutting it up so that the AI analyzes it part by part. However, AI will not be able to take into account the big picture or integrate the entire context of the project (budget, time, target platform, developer skills, available tools, stakeholder needs, etc.).
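Cutting a project into parts can itself be automated. The sketch below assumes the simplest possible strategy, one chunk per top-level function or class of a Python file, using the standard `ast` module. The function `split_into_chunks` and the sample module are hypothetical, and real tools use more elaborate strategies.

```python
import ast

def split_into_chunks(source):
    """Return (name, code) pairs for each top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment returns the exact text of the node.
            chunks.append((node.name, ast.get_source_segment(source, node)))
    return chunks

# Sample module used as input for the example.
module = '''
def load(path):
    return open(path).read()

class Report:
    def render(self):
        return "ok"
'''

for name, code in split_into_chunks(module):
    print(name, len(code.splitlines()))
```

Each chunk can then be submitted separately, at the cost of losing the relationships between chunks. That loss is precisely the limitation described above.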

AI victims of hallucinations

Well known to ChatGPT users, hallucinations are answers that LLMs invent when they cannot find the right answer to a question, and that they present as established fact. Applied to development, this can lead to incorrect or insecure code, which wastes the developer's time.
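One common form of hallucination is a call to a function that does not exist. As a minimal safeguard, and assuming the suggestion targets an importable Python module, a quick check like the following can catch invented names before the code is run. The helper `api_exists` is hypothetical.

```python
import importlib

def api_exists(dotted_name):
    """Return True if `module.attribute` can actually be resolved."""
    module_name, _, attribute = dotted_name.rpartition(".")
    try:
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        # Unknown module, or no module part in the name at all.
        return False
    return hasattr(module, attribute)

print(api_exists("math.sqrt"))         # True
print(api_exists("math.square_root"))  # False: plausible but invented
```

This only verifies that a name exists, not that the code using it is correct, so review and testing remain necessary.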

Copyright and data privacy issues

Because AIs draw largely on public data, their training sets almost certainly include copyrighted code, with the attendant risk of plagiarism. It is also important to remember that when you use an AI, you may be training it. By submitting code that is proprietary or subject to confidentiality agreements, you could be violating your company's policy. Some publishers, like Tabnine and GitHub for Copilot Business, do not store your code and only train their AIs on open source repositories whose license allows it.

Potential security breaches

Be careful not to rely too much on AI for the security of your data. Code generated by AIs is neither tested nor validated by humans. Furthermore, the quality of the code depends, as we have seen, on the quality of the data on which it is trained. If the data is compromised or includes errors, the generated code could also be compromised.

Additionally, AI-generated code can be hard to audit and understand, which makes security vulnerabilities harder to detect. Short of that, only an AI specialized in cybersecurity will be able to analyze it!

Ethical problems

Programs that directly affect people (health, autonomous driving, selection of candidates, etc.) may raise ethical questions that AI does not handle, or may lead to biased results. Would you entrust the lives of patients or users of autonomous vehicles to an AI?

The future of development will be written by AI

Although the tools can still be improved, AI is making constant progress and is already providing many services to developers who can concentrate on other higher value-added and collaborative tasks. Thus, they will have more time to attend the numerous project meetings (coordination, demos, monitoring, validation, retrospective, etc.), where, unfortunately, AI cannot yet replace them!


The main code generation AIs

GitHub Copilot

The best-known code assistant, not to be confused with Copilot, the conversational AI of Windows 11, Bing and Microsoft 365. GitHub Copilot was launched in 2021 by GitHub, a subsidiary of Microsoft, in partnership with OpenAI. Its main advantage is that it integrates as an extension into the main IDEs, such as Visual Studio Code, Visual Studio, Vim, Neovim, JetBrains IDEs and Azure Data Studio.

Price: from $10/month or $100 per year

Tabnine

Used by 3 million developers, this open source tool works with GPT models to predict and suggest code as the developer writes. The tool also provides snippets for common tasks and a chat function. Tabnine integrates with major IDEs: Visual Studio, Eclipse, Android Studio and JetBrains IDEs.

Price: from $12/month (free in limited version)

Amazon CodeWhisperer

Developed by Amazon, CodeWhisperer writes both snippets and complete functions from your request or existing code. It can be used from the command line or via an IDE (VS Code, Visual Studio, JetBrains IDEs, AWS Cloud9, etc.). Its Amazon Q conversational assistant provides personalized advice.

Price: from $20/month (free for personal use)

Google Gemini (formerly Bard)

The new version of Google's AI is called Gemini and is more efficient, particularly in development. Gemini generates code in around twenty languages, can explain, debug and comment on code, and can write functions for Google Sheets.

Price: free (for now)

ChatGPT

The famous generative AI was not designed specifically for programming, but it can understand and generate code in major languages. ChatGPT has many limitations: it does not create complete programs, does not integrate with IDEs, and is not very good at detecting security vulnerabilities.    

Price: from $20/month (free in limited version GPT-3.5)

Our expert

ORSYS Editorial Board

Made up of journalists specialising in IT, management and personal development, the ORSYS Le mag editorial team [...]
