Halstead's Software Metrics: Measuring Code Complexity and Size
Understand Halstead's software metrics for quantifying code complexity. This guide explains the fundamental measures (operators, operands, length, vocabulary), derived metrics (volume, difficulty, effort, bugs), and their application in assessing software size and potential errors.
Halstead's Software Metrics: Measuring Software Complexity
Introduction to Halstead's Metrics
Halstead's software metrics provide a quantitative measure of software complexity. They're based on the idea that a program can be viewed as a collection of tokens (operators and operands). These metrics offer insights into a program's size, complexity, and potential for errors. These metrics were developed by Maurice Howard Halstead.
Basic Halstead Metrics
Halstead's metrics are defined using four basic measures:
- n1: Number of unique operators.
- n2: Number of unique operands.
- N1: Total occurrences of operators.
- N2: Total occurrences of operands.
The total number of tokens is N = N1 + N2.
Derived Halstead Metrics
Several other metrics are derived from these basic measures:
- Program Volume (V): A measure of the program's size in bits. V = N * log2(n), where n = n1 + n2 is the vocabulary size.
- Program Level (L): A measure of the program's implementation level (0 ≤ L ≤ 1, with 1 representing the highest level). L = V* / V.
- Program Difficulty (D): A measure of the program's complexity and potential for errors. D = (n1 / 2) * (N2 / n2).
- Programming Effort (E): Estimated effort in terms of mental discriminations. E = V / L = D * V.
- Estimated Program Length (Nˆ): An estimate of the program's length. Nˆ = n1log2(n1) + n2log2(n2). (Alternative formulas for estimated program length are given in the original text but are omitted here for brevity. These would be added to the HTML as equations.)
- Potential Minimum Volume (V*): The volume of a theoretically minimal program solving the same problem. V* = (2 + n2*) * log2(2 + n2*), where n2* is the number of unique input/output parameters.
- Language Level (L'): A measure of how efficiently the algorithm is implemented in the chosen language. L' = V*/V.
Counting Rules for Operators and Operands in C Code
When performing software metrics analysis or calculating software size in C code, operators and operands are counted according to specific rules. These rules help in determining the complexity of the code and are crucial for estimating the effort, time, and resources required for software development and maintenance. Here are the counting rules for operators and operands in C code:
1. Operators
Operators are symbols that perform operations on operands in C code. The following are the rules for counting operators:
- Arithmetic Operators: +, -, *, /, %, etc.
- Relational Operators: <, >, <=, >=, ==, !=
- Logical Operators: &&, ||, !
- Assignment Operators: =, +=, -=, *=, /=, %=
- Bitwise Operators: &, |, ^, ~, <<, >>
- Unary Operators: ++, --, & (address-of), * (dereference), sizeof
- Conditional Operator: ? :
Each occurrence of these operators counts as one operator.
2. Operands
Operands are the variables, constants, or expressions that operators act upon. The following are the rules for counting operands:
- Variables: Every variable used in an expression is counted as an operand.
- Constants: Every literal constant (e.g., 5, 3.14, 'A') is counted as an operand.
- Sub-expressions: Each sub-expression within parentheses is also counted as an operand.
- Function Calls: The function name and each of the arguments are counted as operands.
Each variable, constant, and sub-expression used in an operator’s expression is counted as one operand.
3. Special Rules
- Increment/Decrement Operators: The increment (++) and decrement (--) operators count as a single operator, but they affect a single operand.
- Operator Overloading (in C++): In C++, operator overloading can make counting more complex. Here, you need to account for the overloaded operators based on their custom definition.
By following these counting rules, you can accurately determine the number of operators and operands in a C program, which helps in calculating software metrics such as Lines of Code (LOC) and Cyclomatic Complexity.
Applying Halstead's Metrics to a Sorting Program
Halstead's software metrics are used to evaluate the complexity of a program based on the number of operators and operands. In this example, we will apply Halstead’s metrics to a simple sorting program written in C. We will first list the operators and operands in the program, and then calculate the various Halstead metrics.
Sorting Program Example
#includevoid bubbleSort(int arr[], int n) { int i, j, temp; for (i = 0; i < n - 1; i++) { for (j = 0; j < n - i - 1; j++) { if (arr[j] > arr[j + 1]) { temp = arr[j]; arr[j] = arr[j + 1]; arr[j + 1] = temp; } } } }
Step 1: Identify Operators and Operands
Operators:
Operator | Frequency |
---|---|
= | 6 |
+ | 4 |
- | 3 |
> | 2 |
if | 1 |
for | 2 |
Operands:
Operand | Frequency |
---|---|
arr | 6 |
n | 2 |
i | 4 |
j | 3 |
temp | 2 |
Step 2: Halstead Metrics Calculations
Based on the program's operators and operands, we will now calculate the Halstead metrics:
1. n (Number of Distinct Operators)
Distinct Operators: =, +, -, >, if, for
n1 (Distinct Operators) = 6
2. N (Total Number of Operators)
Total Operators: 6 (for =) + 4 (for +) + 3 (for -) + 2 (for >) + 1 (for if) + 2 (for for)
N1 (Total Operators) = 18
3. n2 (Number of Distinct Operands)
Distinct Operands: arr, n, i, j, temp
n2 (Distinct Operands) = 5
4. N2 (Total Number of Operands)
Total Operands: 6 (arr) + 2 (n) + 4 (i) + 3 (j) + 2 (temp)
N2 (Total Operands) = 17
5. V (Program Volume)
V = (n1 + n2) × log2(n1 + n2)
V = (6 + 5) × log2(6 + 5) = 11 × log2(11) ≈ 11 × 3.459 ≈ 38.049
V = 38.049
6. E (Program Effort)
E = n1 × N2
E = 6 × 17 = 102
E = 102
7. λ (Program Level of Difficulty)
λ = (n1 / 2) × (N2 / n2)
λ = (6 / 2) × (17 / 5) = 3 × 3.4 = 10.2
λ = 10.2
Summary of Halstead Metrics
Metric | Value |
---|---|
n1 (Distinct Operators) | 6 |
N1 (Total Operators) | 18 |
n2 (Distinct Operands) | 5 |
N2 (Total Operands) | 17 |
V (Program Volume) | 38.049 |
E (Program Effort) | 102 |
λ (Program Level of Difficulty) | 10.2 |