The Secret Life of Go: Benchmarking

Proving performance, testing.B, and outsmarting the compiler

#Golang #Benchmarking #SoftwareEngineering #BackendDev

Eleanor is a senior software engineer. Ethan is her junior colleague. They work in a beautiful Beaux-Arts library in Lower Manhattan, the kind of place where coding languages are discussed like poetry.

Episode 34

Ethan was aggressively typing a response into a pull request review.

"My coworker is trying to tell me that standard string concatenation is faster than strings.Builder," Ethan said, shaking his head as Eleanor walked by. "We are just joining a prefix to an ID, like "user_" + id. I used a strings.Builder because it avoids allocating new memory every time you append. It's objectively the superior pattern."

Eleanor pulled up a chair. "It is the superior pattern for building large strings in a loop. But for joining exactly two small strings? Your coworker might be right."

"How can they be right? A builder avoids allocations!"

"Performance in Go is a science, Ethan. Not a philosophy," Eleanor said. "We don't argue about performance based on intuition. We measure it. Let's write a benchmark."

The Benchmark Setup

Eleanor instructed Ethan to create a new file named string_test.go. She explained that just like unit tests use testing.T, benchmarks use testing.B.

Ethan quickly sketched out the two approaches:

package main

import (
    "strings"
    "testing"
)

// id is held in a variable so that "user_" + id is evaluated at run time.
// Two string literals would be folded into one constant at compile time.
var id = "12345"

func BenchmarkConcat(b *testing.B) {
    // The testing framework adjusts b.N automatically so the loop
    // runs long enough to be timed reliably.
    for i := 0; i < b.N; i++ {
        _ = "user_" + id
    }
}

func BenchmarkBuilder(b *testing.B) {
    for i := 0; i < b.N; i++ {
        var builder strings.Builder
        builder.WriteString("user_")
        builder.WriteString(id)
        _ = builder.String()
    }
}

Ethan ran the benchmark command in his terminal: go test -bench=.

The output appeared instantly:

BenchmarkConcat-10    1000000000         0.2100 ns/op
BenchmarkBuilder-10   1000000000         0.2100 ns/op

Ethan squinted at the screen. "Zero point two nanoseconds? They both take less than a billionth of a second? And they are exactly the same speed?"

The Dead Code Elimination Trap

Eleanor laughed. "You just fell into the Dead Code Trap. The Go compiler is ruthlessly efficient. It looked at your loops, saw that you assigned the results to the blank identifier _, and realized that the outcome of this work is never actually used anywhere."

"So what did it do?"

"It deleted your code," Eleanor smiled. "During compilation, it simply removed the string operations entirely. You aren't benchmarking string concatenation. You are benchmarking an empty for loop. That's why it's so fast."

The Architect's Fix: The Global Variable Trick

"To benchmark properly in Go, you must force the compiler to keep your code," Eleanor explained. "You do this by saving the result to a package-level variable. Since the compiler can't prove that no other package will ever read that global variable, it is forced to actually execute the operation."

She modified Ethan's code:

// 1. Declare a package-level variable to receive the result
var GlobalResult string

// The ID lives in a variable so the compiler cannot fold
// "user_" + id into a constant at compile time.
var id = "12345"

func BenchmarkConcat(b *testing.B) {
    var r string
    for i := 0; i < b.N; i++ {
        r = "user_" + id
    }
    // 2. Assign the final result to the global variable
    // OUTSIDE the loop, so we don't benchmark the assignment itself.
    GlobalResult = r
}

func BenchmarkBuilder(b *testing.B) {
    var r string
    for i := 0; i < b.N; i++ {
        var builder strings.Builder
        builder.WriteString("user_")
        builder.WriteString(id)
        r = builder.String()
    }
    GlobalResult = r
}

Ethan ran the benchmark command again. This time, the computer hummed for a few seconds before printing the real truth:

BenchmarkConcat-10    85432100          13.4 ns/op
BenchmarkBuilder-10   23105000          48.2 ns/op

Ethan stared at the numbers. "The standard concatenation + is almost four times faster."

"It is," Eleanor confirmed. "For exactly two strings, Go computes the final length up front and performs + with a single memory allocation. strings.Builder starts with an empty buffer that may have to grow as each piece is written, and it adds several method calls on top. The overhead ruins the benefit for tiny operations."

Ethan deleted his pull request comment and changed his code back to a simple +.

"I assumed I was outsmarting the compiler," Ethan admitted.

"We all do, until we run the benchmarks," Eleanor said. "Never guess. Always measure. And always make sure the compiler isn't deleting your test."

Key Concepts from Episode 34

The testing.B Framework

Go has a built-in benchmarking framework. Functions whose names start with Benchmark and that take a *testing.B parameter are run with the command go test -bench=. (the trailing dot is a regular expression that matches every benchmark name).
The test runner increases b.N until the loop runs long enough (one second by default) to report a reliable average ns/op (nanoseconds per operation).
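To watch b.N being adjusted, here is a minimal sketch that runs a benchmark function through testing.Benchmark, which works outside of go test. The names benchItoa, seenN, sink, and runOnce are made up for this illustration, and the exact sequence of b.N values depends on the machine and the Go version.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// sink keeps the loop's result alive so the work is not optimized away.
var sink string

// seenN records every value of b.N the runner tries (helper for this sketch only).
var seenN []int

// benchItoa has the same shape as a BenchmarkXxx function in a _test.go file.
func benchItoa(b *testing.B) {
	seenN = append(seenN, b.N)
	var r string
	for i := 0; i < b.N; i++ {
		r = strconv.Itoa(i)
	}
	sink = r
}

// runOnce executes the benchmark and returns the b.N values that were tried.
func runOnce() []int {
	seenN = nil
	res := testing.Benchmark(benchItoa)
	fmt.Printf("final b.N: %d, ns/op: %d\n", res.N, res.NsPerOp())
	return seenN
}

func main() {
	fmt.Println("b.N values tried:", runOnce())
}
```

The benchmark function is called several times with a growing b.N, which is why any setup placed inside it runs more than once.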

Dead Code Elimination

The Go compiler aggressively optimizes code. If the result of a calculation is never used (for example, it is only assigned to _), the compiler may remove the calculation from the final binary.
This results in "infinitely fast" benchmarks that are actually just measuring an empty loop.

The Global Variable Trick

To prevent dead code elimination in benchmarks, store the result of your operation in a local variable, and then assign that local variable to a package-level global variable after the b.N loop completes.
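This forces the compiler to execute the code because it cannot guarantee the global variable won't be accessed elsewhere. The inputs matter as well: an expression made only of constants, such as two string literals joined with +, is evaluated at compile time, so feed the benchmark values held in variables.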
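The difference between a discarded result and a kept one can be sketched side by side. This is an illustration under assumptions, not a definitive measurement: benchDiscard, benchSink, compare, id, and sink are hypothetical names, and whether the first loop is actually eliminated depends on the compiler version.

```go
package main

import (
	"fmt"
	"testing"
)

// id is a hypothetical input held in a variable, so "user_" + id
// is not a compile-time constant.
var id = "12345"

// sink is the package-level variable that keeps the result observable.
var sink string

// benchDiscard throws the result away; the compiler is free to drop the work.
func benchDiscard(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = "user_" + id
	}
}

// benchSink stores the result in a local and publishes it once after the loop.
func benchSink(b *testing.B) {
	var r string
	for i := 0; i < b.N; i++ {
		r = "user_" + id
	}
	sink = r
}

// compare runs both variants and returns their ns/op figures.
func compare() (discard, kept int64) {
	discard = testing.Benchmark(benchDiscard).NsPerOp()
	kept = testing.Benchmark(benchSink).NsPerOp()
	return discard, kept
}

func main() {
	d, k := compare()
	fmt.Printf("discarded result: %d ns/op\nkept result:      %d ns/op\n", d, k)
}
```

If the first figure comes out far smaller than the second, that is a strong hint the discarded work never ran.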

The Architectural Lesson

strings.Builder is highly optimized for building strings in a loop or joining many parts. However, for simply joining two strings together, the standard + operator is heavily optimized by the compiler and has less overhead.
Performance is empirical. Never rely on intuition for micro-optimizations.
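The other half of the lesson, that strings.Builder pays off when many parts are joined in a loop, can be checked by counting allocations rather than time. The sketch below uses testing.AllocsPerRun. The helpers joinPlus, joinBuilder, and allocs are hypothetical, and the part count of 64 is an arbitrary choice for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps each result alive so the calls are not optimized away.
var sink string

// joinPlus builds the result with repeated +=, which creates a new
// string each time the accumulated value grows.
func joinPlus(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder appends every part into one growing buffer.
func joinBuilder(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

// allocs reports the average heap allocations per call for both helpers.
func allocs(parts []string) (plus, builder float64) {
	plus = testing.AllocsPerRun(100, func() { sink = joinPlus(parts) })
	builder = testing.AllocsPerRun(100, func() { sink = joinBuilder(parts) })
	return plus, builder
}

func main() {
	parts := make([]string, 64)
	for i := range parts {
		parts[i] = "user_"
	}
	plus, builder := allocs(parts)
	fmt.Printf("+= in a loop:    %.0f allocs\nstrings.Builder: %.0f allocs\n", plus, builder)
}
```

Running the same comparison with a two-element slice is a quick way to see where the crossover lies on a given Go version.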

Aaron Rose is a software engineer and technology writer at tech-reader.blog.
