Cyclomatic Complexity
Created: 3/4/2022Updated: 3/4/2022
Cyclomatic complexity is a simple and objective way of measuring a module's testability and maintainability. Unfortunately, most programmers I've met either aren't very familiar with it or have never even heard of it. Let's change that.
Intro
There are countless subjective ways to judge a block of code. We can nitpick over stylistic differences, naming conventions, and a preference for the latest features of a language until we're blue in the face without ever achieving anything resembling an objective quality improvement. Quite often, even "best practices" are rooted in subjective reasoning or anecdotal experiences of certain individuals.
What if I told you there was an objective, mathematical way to assess a block of code? An objective, sans-opinion way to quantify a block of code's maintainability and testability? If this interests you, keep reading and we'll discuss how to take advantage of Cyclomatic Complexity to objectively improve your code quality.
Definition
Cyclomatic Complexity is defined as, "a measure of the complexity of a module's decision structure. It is the number of linearly independent paths and therefore, the minimum number of paths that should be tested. [1]"
How is it calculated?
Here is the formula for calculating Cyclomatic Complexity:
c = e - n + 2
c = complexity
n = nodes (a group of statements without programatic selection, such as, but not limited to, if/else blocks)
e = edges (the flow between nodes)
Examples:
Example 1 GetDayOfWeekIndex:
/** * Complexity: c = e - n + 2 * c = 4 - 4 + 2 * c = 2 */functionGetDayOfWeekIndex() { // node 1 startconst dayOfWeekIndex = newDate().getDay(); // node 1 endif ([0, 6].includes(dayOfWeekIndex)) { // edge of node 1 -> node 2// node 2 startconsole.log('It\'s the weekend!'); // node 2 end } // edge of node 2 -> node 4else { // edge of node 1 -> node 3// node 3 startconsole.log('It\'s a weekday.') // node 3 end } // edge of node 3 -> node 4// node 4 startreturn dayOfWeekIndex // node 4 end}
The above function gets the numeric value of the current day of the week, which is a zero-indexed number between zero and six. Ultimately, we return that number, but not before we log a message depending on whether or not it's a weekend or weekday.
The comments highlight the node boundaries and the edges between them. Based on the formula for Cyclomatic Complexity we reviewed earlier, the complexity of this function is two. This means that there are two linearly independent paths, and that we'll need to call this function at least twice in our tests to get full coverage.
If we removed the if/else block in the middle, then the formula would switch to c = 0 - 1 + 2, which leaves us with a complexity of 1, and therefore only needing (as a minimum) to write a single test.
Example 2 GetDayOfWeekInfo:
/** * Complexity: c = e - n + 2 * c = 14 - 9 + 2 * c = 7 */functionGetDayOfWeekInfo(): [string, 'weekend' | 'weekday'] { // node 1 start const dayOfWeekIndex = newDate().getDay() as0 | 1 | 2 | 3 | 4 | 5 | 6; // node 1 endswitch (dayOfWeekIndex) { case0: // edge of node1x -> node 2// node 2 start return ['Sunday', 'weekend'] // edge of node 2 to end node// node 2 end case1: // edge of node 2 -> node 3// node 3 startreturn ['Monday', 'weekday'] // edge of node 3 to end node// node 3 endcase2: // edge of node 3 -> node 4// node 4 startreturn ['Tuesday', 'weekday'] // edge of node 4 to end node// node 4 endcase3: // edge of node 4 -> node 5// node 5 startreturn ['Wednesday', 'weekday'] // edge of node 5 to end node// node 5 endcase4: // edge of node 5 -> node 6// node 6 startreturn ['Thursday', 'weekday'] // edge of node 6 to end node// node 6 endcase5: // edge of node 6 -> node 7// node 7 startreturn ['Friday', 'weekday'] // edge of node 7 to end node// node 7 endcase6: // edge of node 7 -> node 8// node 8 startreturn ['Saturday', 'weekend'] // edge of node 8 to end node// node 8 end }}
The above method is arguably quite a bit more useful than the first, as it not only returns a human readable string representing the current day of week, but also a string representing whether the current day is a "weekend" or "weekday."
This method is also quite a bit more complex as well. At a complexity of seven, this function would require seven tests to cover all the branches. Several if/else if/else selections, or the switch selection construct tend to be very "complex" from a programatic standpoint, requiring a test per selection. In the next example, we'll go over an alternative, less complex function with the same behavior.
Example 3 GetDayOfWeekInfoRefactored:
/** * Complexity: c = e - n + 2 * c = 0 - 1 + 2 * c = 1 */functionGetDayOfWeekInfoRefactored(): [string, 'weekend' | 'weekday'] { // node 1 startconstdayIndexStringRecord: Record<0 | 1 | 2 | 3 | 4 | 5 | 6, string> = { 0: 'Sunday', 1: 'Monday', 2: 'Tuesday', 3: 'Wednesday', 4: 'Thursday', 5: 'Friday', 6: 'Saturday' } constdayIndexDayTypeRecord: Record<0 | 1 | 2 | 3 | 4 | 5 | 6, 'weekend' | 'weekday'> = { 0: 'weekend', 1: 'weekday', 2: 'weekday', 3: 'weekday', 4: 'weekday', 5: 'weekday', 6: 'weekend' } const dayOfWeekIndex = newDate().getDay() as0 | 1 | 2 | 3 | 4 | 5 | 6; const dayOfWeekString = dayIndexStringRecord[dayOfWeekIndex]; const dayTypeString = dayIndexDayTypeRecord[dayOfWeekIndex]; return [dayOfWeekString, dayTypeString]; // node 1 end}
This function, while achieving the same requirements as the previous function, only has a complexity of one, and therefore can be tested (minimally) with only a single test.
Obviously it's debatable which one of these two methods is more readable. Many would find the switch more readable, and that's just fine. Keep in mind though, that familiarity with a programming tactic can cause something to seem more readable to an individual. Ternaries are a good example of this.
Some find an if/else:
if (predicate) { /* do this */ } else { /* do that */ }
More readable than a ternary:
predicate ? /* do this */ : /* do that */
However, programmers familiar with ternaries usually find them to be plenty readable. Programmers familiar with map data structures will also tend to find the refactored function obvious in its intent.
What's not opinionated is that the refactored function is less complex, with only a single linearly independent path (branch).
To summarize my opinion on this dynamic, I'll say that, writing more complex code for readability purposes is fine, so long as you're willing to cover all those branches...
How does this improve my software quality?
Reducing complexity is synonymous with reducing opportunities for failure. Simpler code is easier to maintain and test because it's making fewer decisions. As a developer, the fewer decision points you're considering in a code module, the less likely you are to mess one of them up, forget about them, etc.
Consider thoroughly explaining what your code is doing to a laymen or diagraming its control flow. It'd be quite a bit less painful to do either of those with code that has low complexity wouldn't it?
Even my eyes glaze over when someone tries to explain business logic consisting of several if/else selections. The effect compounds itself when you consider how the selections you're currently considering may be nested within others. The difficulty to reason about such logic in spoken language is very much related to difficulty in diagnosing a defect or extending existing functionality.
We should actively pursue keeping code as simple as possible for all these reasons and more.
Helpful tools
Now you know what Cyclomatic Complexity is and how to calculate it. You may be wondering if there's an easier way to take advantage of these concepts. Especially if this is a new concept for you, it may be best to look for some tools for whatever language you're currently using.
For example, I use ESLint (a JavaScript / TypeScript linter) to enforce a maximum Cyclomatic Complexity for functions, methods, etc. [2].
In addition to ESLint, I use a VSCode extension called, CodeMetrics to call attention to high complexity modules. This extension doesn't technically calculate true Cyclomatic Complexity, but it's a close enough "approximation" to be very handy [3].
Assuming you're writing tests, and calculating code coverage, you will also notice a very close relationship between complexity and branch coverage when assessing your code coverage reports. Often times you'll notice that you can increase branch coverage more significantly by eliminating complexity than by writing a few tests.
Summary
Cyclomatic Complexity gives us an objective metric to lean on. One that doesn't care about who the author of the code was. Using it as a tool to target areas of high complexity is therefore consistent and reliable. Reducing complexity reduces the opportunities to failure within an application, statistically reducing the likelihood of defects, and significantly easing our testing burden.
It's worth mentioning that there are other methods of calculating complexity, and that McCabe's Cyclomatic Complexity is not entirely without controversy.
Nevertheless, the objectiveness of the metric cannot be disputed. Any engineer valuing high test coverage would also be on a self-defeating mission dismissing Cyclomatic Complexity, given its causal relationship to code branches (and therefore branch coverage).