In AI, a parameter is a property learned from the data used to train the model. It is an adjustable element that determines the behaviour and functionality of the AI model.
Parameters play a crucial role in the operation of AI models:
- They influence the way the model interprets data and generates responses.
- They allow the model to learn from the training data and generalise this knowledge to process new inputs.
- They are adjusted during training to optimise the model's performance on specific tasks.
In the case of language models, the parameters are often associated with the weights of the connections between the neurons in the model's neural network. The more parameters the model has, the more details and nuances it can learn from the data, enabling it to produce more complex and natural responses. Parameters are essential because they form the basis of the model's ability to 'understand' and generate language that sounds natural to human users.
Concretely, parameters are the numerical values that define how the model transforms inputs (data) into outputs (predictions).
Example: In a neural network, each connection between neurons has a weight, and each neuron has a bias. These weights and biases are the parameters.
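To make this concrete, here is a minimal sketch in Python with NumPy (the sizes and values are illustrative, not taken from any real model): the weight matrix and the bias vector are the layer's parameters, and they fully determine how inputs become outputs.

```python
import numpy as np

# A single dense layer with 3 inputs and 2 outputs (illustrative sizes).
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2))  # one weight per input-output connection
bias = np.zeros(2)                 # one bias per output neuron

def forward(x):
    # Transform inputs into outputs using the current parameters.
    return x @ weights + bias

x = np.array([1.0, 0.5, -0.2])
print(forward(x))                  # the layer's prediction
print(weights.size + bias.size)    # parameter count: 3*2 + 2 = 8
```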
Role
- The parameters store the model's knowledge, learned from the training data.
- They are adjusted via optimisation algorithms (e.g. gradient descent) to minimise the error between predictions and actual results.
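As an illustration of that second point, here is a small sketch of gradient descent adjusting two parameters, w and b, to fit a straight line. The data and the learning rate are made up for the example.

```python
import numpy as np

# Fit y = w*x + b with gradient descent; w and b are the trainable parameters.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0                      # ground truth the model should recover

w, b = 0.0, 0.0                        # parameters start at arbitrary values
learning_rate = 0.1                    # a hyperparameter, chosen by hand

for step in range(200):
    pred = w * x + b
    error = pred - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Adjust the parameters to reduce the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))        # ≈ 2.0 and 1.0 after training
```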
Types of parameters
Trainable parameters: those the model adjusts during training (e.g. the weights of a neural network).
Hyperparameters: external settings defined before training (e.g. learning rate, number of layers). They are not learned by the model.
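A short sketch of the distinction (all names and sizes here are illustrative): hyperparameters are fixed choices made before training, while the weights and biases are what training actually adjusts.

```python
import numpy as np

hidden_size = 16        # hyperparameter: chosen before training
learning_rate = 0.01    # hyperparameter: not learned by the model

rng = np.random.default_rng(0)
# Trainable parameters: initialised randomly, then adjusted during training.
w1 = rng.normal(size=(4, hidden_size))
b1 = np.zeros(hidden_size)
w2 = rng.normal(size=(hidden_size, 1))
b2 = np.zeros(1)

n_params = sum(p.size for p in (w1, b1, w2, b2))
print(n_params)  # 4*16 + 16 + 16*1 + 1 = 97 trainable parameters
```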
Why are there so many parameters?
- Model capacity:
- The more parameters a model has, the more complex the patterns it can theoretically capture in the data (e.g. GPT-3 with 175B, i.e. 175 billion parameters, vs. BERT with 340M, i.e. 340 million).
- However, too many parameters can lead to overfitting or high computational costs.
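To see where such counts come from, here is a rough back-of-envelope sketch using the common approximation of about 12 × d_model² weights per transformer layer (embeddings and biases ignored), plugged in with GPT-3's published shape:

```python
# Each transformer layer has ≈ 12 * d_model^2 weights
# (≈ 4*d^2 for attention + 8*d^2 for the feed-forward block).
d_model = 12288   # GPT-3's hidden size
n_layers = 96     # GPT-3's number of transformer layers

approx_params = 12 * d_model**2 * n_layers
print(f"{approx_params / 1e9:.0f}B parameters")  # ≈ 174B, close to the 175B total
```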
- Cost and resources:
- Models with billions or even trillions of parameters (e.g. GPT-4) require supercomputers and massive amounts of data.
- Example: training GPT-3 is estimated to have cost several million dollars in compute.
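One way to see why the bill runs into the millions is the widely used 6 × N × D rule of thumb for training FLOPs (6 × parameters × training tokens). The GPU throughput and hourly price below are assumptions for illustration, not measured figures.

```python
n_params = 175e9          # GPT-3 parameter count
n_tokens = 300e9          # tokens GPT-3 was reportedly trained on

flops = 6 * n_params * n_tokens            # ≈ 3.15e23 FLOPs
gpu_flops_per_s = 30e12                    # assumed sustained throughput per GPU
gpu_hours = flops / gpu_flops_per_s / 3600
cost = gpu_hours * 2.0                     # assumed $2 per GPU-hour

print(f"{gpu_hours:,.0f} GPU-hours, ~${cost / 1e6:.1f}M")  # several million dollars
```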
Examples
- GPT-4 (reportedly ~1.8T parameters): each parameter influences text generation depending on the context.
- Stable Diffusion (890M parameters): parameters related to image generation via diffusion layers.
- BERT (340M parameters): parameters used to understand the relationships between words.
Key points
- Parameters ≠ performance: a model with fewer parameters but better training (e.g. Chinchilla) can outperform a larger model, as the sketch below illustrates.
- Balance: finding the right compromise between model size, available data and compute resources is crucial in AI.
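The Chinchilla result mentioned above comes with a handy rule of thumb (Hoffmann et al., 2022): for a fixed compute budget, train on roughly 20 tokens per parameter. The sketch below applies it to show why a smaller, better-trained model can win.

```python
# Chinchilla rule of thumb: compute-optimal training uses ~20 tokens per parameter.
def chinchilla_optimal_tokens(n_params):
    return 20 * n_params

for n in (70e9, 175e9):
    print(f"{n / 1e9:.0f}B params -> ~{chinchilla_optimal_tokens(n) / 1e12:.1f}T tokens")

# GPT-3 (175B) was trained on ~300B tokens, far below the ~3.5T this heuristic
# suggests, which is why Chinchilla (70B parameters, ~1.4T tokens) could
# outperform it despite being much smaller.
```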