Algorithmic complexity
The goal of any algorithm is to provide a solution to a given problem. Minimising the processing time is directly related to the number of calculations involved. Algorithmic complexity is a measure of the performance of a program in terms of the number of calculations it performs: the lower the complexity, the fewer calculations the algorithm performs and the higher its performance.
Loops are the main source of large numbers of calculations: their uncontrolled use makes the number of operations grow rapidly. Beyond a certain point, the computation time becomes an obstacle to using the program. It is therefore important to be able to evaluate the performance of an algorithm before programming it.
Analysis of complexity
There are several methods of complexity analysis, including average-case analysis and pessimistic analysis. Average-case analysis provides information about the average computation time. Pessimistic analysis is the most commonly used: it gives the number of computations required in the worst-case scenario, and its main purpose is to evaluate algorithms under the most unfavourable conditions.
For example, if an algorithm searches for a data item among N initial data items, in the worst case it must examine all of them, so the pessimistic analysis indicates that the search makes N comparisons. The algorithmic complexity is then denoted O(N). The notation O indicates that the performance analysis focuses on an upper bound. The value N indicates that at most N processing operations are performed when there are N initial data items.
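To make the worst case concrete, here is a minimal Python sketch (the function name and data are ours, purely illustrative) that counts the comparisons made by a sequential search:

```python
def count_comparisons(items, target):
    """Sequential search that reports how many comparisons it made."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            break
    return comparisons

data = list(range(1000))               # N = 1000 items
print(count_comparisons(data, 0))      # best case: 1 comparison
print(count_comparisons(data, -1))     # worst case (absent item): N = 1000
```

The pessimistic analysis keeps only the last figure: N comparisons in the worst case, hence O(N).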
Main categories of complexity
Complexity varies from one algorithm to another, but can be grouped into the following categories:
1. Logarithmic algorithms. Their notation is $O(\log N)$. These algorithms are very efficient in terms of processing time. The number of calculations depends on the logarithm of the initial number of data items to be processed. The curve of the $\log(N)$ function flattens out as the amount of data increases, so the number of calculations grows only slowly with the size of the input (see the figure below).
The logarithmic function gives a different result depending on the base on which it is calculated.
e.g. $\log_{10}(1000) = 3$; $\log_2(1000) \approx 9.97$; $\ln(1000) \approx 6.9$
Logarithmic algorithms work by repeatedly halving the data set and continuing on only one of the two halves; the number of halvings is the base-2 logarithm, $\log_2(N)$. To simplify writing the complexity of these algorithms, $\log(N)$ is used instead of $\log_2(N)$: in O notation, the base of the logarithm only changes the result by a constant factor.
Example: binary search (dichotomic search) is one of these algorithms. You look in the half where the item must be located, then repeat the search on that half, dividing it in two again; the search therefore makes roughly $\log_2(N)$ comparisons. The data must, however, be sorted beforehand. A sketch follows below.
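A minimal binary search sketch in Python (the function and variable names are ours, for illustration):

```python
def binary_search(sorted_items, target):
    """Each step halves the search interval: roughly log2(N) comparisons."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid                 # found: return its position
        if sorted_items[mid] < target:
            low = mid + 1              # target is in the upper half
        else:
            high = mid - 1             # target is in the lower half
    return -1                          # not present

names = sorted(["Ada", "Alan", "Edsger", "Grace", "Linus"])
print(binary_search(names, "Grace"))   # 3
```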
2. Linear algorithms. Their notation is of the form $O(N)$. These are fast algorithms: the number of calculations grows linearly with the number of initial data items to be processed.
Example: consider an algorithm that searches for a name in a list of 1,000 names by comparing it with each name in the list. If processing 1,000 initial data items (N items) requires 1,000 basic calculations to find the result, the complexity of the algorithm is O(N). If another algorithm performs 2,000 operations on the same input, its complexity is still O(N): constant factors such as the 2 in 2N are ignored in O notation, as the sketch below illustrates.
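To see why constant factors disappear, here is an illustrative Python sketch (our own example, not from the text) that makes two full passes over the data, about 2N comparisons, yet remains O(N):

```python
def min_and_max(values):
    """Two full passes over the data: roughly 2N comparisons, still O(N)."""
    smallest = values[0]
    for v in values:            # first pass: N comparisons
        if v < smallest:
            smallest = v
    largest = values[0]
    for v in values:            # second pass: N comparisons
        if v > largest:
            largest = v
    return smallest, largest

print(min_and_max([5, 3, 8, 1, 9, 2]))  # (1, 9)
```

Doubling N still doubles the running time, which is exactly what linear growth means; the factor 2 does not change that behaviour.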
3. Linear-logarithmic algorithms. Their notation is of the form $O(N \log N)$. These algorithms repeatedly divide a set into two parts and traverse each part completely.
Example: an algorithm that searches for the longest increasing subsequence. The aim is to find a subsequence of a given sequence whose elements are sorted in ascending order and which is as long as possible. This subsequence is not necessarily contiguous or unique. A sketch follows below.
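The text does not fix a particular algorithm; a standard $O(N \log N)$ approach performs a binary search over the tails of candidate subsequences for each element. A minimal Python sketch, using the standard `bisect` module:

```python
import bisect

def lis_length(sequence):
    """Length of the longest strictly increasing subsequence in O(N log N).

    tails[k] holds the smallest possible last element of an increasing
    subsequence of length k + 1 seen so far.
    """
    tails = []
    for value in sequence:
        i = bisect.bisect_left(tails, value)  # binary search: O(log N)
        if i == len(tails):
            tails.append(value)               # extends the longest subsequence
        else:
            tails[i] = value                  # smaller tail for length i + 1
    return len(tails)

print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))  # 4, e.g. 1, 4, 5, 9
```

Each of the N elements triggers one binary search of cost $\log N$, hence the overall $O(N \log N)$ bound.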
4. Power-type algorithms. Their notation is of the form $O(N^p)$, where p is the exponent.
Example: an algorithm that compares a list of 1,000 names with another list of 1,000 names, iterating through the entire second list for every name in the first, performs 1,000 × 1,000 = 1,000,000 computations. It is of type $O(N^2)$. Algorithms with exponent p = 2 are also called quadratic.
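A minimal sketch of such a quadratic comparison in Python (function and list names are ours):

```python
def common_names(list_a, list_b):
    """Compares every name in list_a with every name in list_b:
    len(list_a) * len(list_b) comparisons, i.e. O(N^2) for equal sizes."""
    matches = []
    for a in list_a:              # outer loop: N iterations
        for b in list_b:          # inner loop: N iterations per outer iteration
            if a == b:
                matches.append(a)
    return matches

print(common_names(["Ada", "Alan", "Grace"], ["Grace", "Linus", "Ada"]))
# ['Ada', 'Grace'] after 3 * 3 = 9 comparisons
```

With two lists of 1,000 names, the nested loops perform the 1,000,000 comparisons mentioned above.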
5. Exponential or factorial algorithms. Their notation is of the form $O(e^N)$ or $O(N!)$. These are the most complex algorithms: the number of calculations grows exponentially or factorially with the amount of data to be processed.
An exponential algorithm processing 10 initial data items will perform about 22,026 calculations ($e^{10}$), while a factorial algorithm will perform 3,628,800 ($10!$). This type of algorithm is commonly found in artificial intelligence (AI) problems.
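As an illustration (a classic textbook case, not taken from the original text), a deliberately naive recursive Fibonacci shows exponential growth in the number of calls:

```python
import math

def fib(n):
    """Deliberately naive recursion: each call spawns two more calls,
    so the total number of calls grows exponentially with n."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(20))                 # fine; fib(40) would already take seconds
print(round(math.e ** 10))     # 22026: exponential algorithm, N = 10
print(math.factorial(10))      # 3628800: factorial algorithm, N = 10
```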
The curves in the figure show how each complexity function grows as N increases:
$O(\log N)$: very slow growth (low complexity).
$O(N)$: linear growth.
$O(N \log N)$: slightly faster growth.
$O(N^2)$: quadratic growth.
$O(e^N)$: exponential growth (very high complexity).
$O(N!)$: factorial growth (extremely high complexity).
Although memory usage is not part of this notion of complexity, the two are often linked: it is common to use more memory in order to minimise the number of operations. We therefore need to check that the memory consumption of a high-performance algorithm does not become a new constraint on its use.
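As an illustration of this trade-off (our own example), memoising the naive Fibonacci above stores every intermediate result in memory and reduces the running time from exponential to linear:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_fast(n):
    """Same recursion as the naive version, but each result is cached
    in memory and computed only once: O(N) time at the cost of O(N) memory."""
    if n < 2:
        return n
    return fib_fast(n - 1) + fib_fast(n - 2)

print(fib_fast(100))  # instantaneous; the cache holds 101 entries
```

The extra memory is modest here, but for other algorithms the cache can grow large enough to become the limiting factor, which is exactly the constraint mentioned above.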