pickuma.

Why Arrays Start at Zero

Zero-based indexing isn't arbitrary — it falls out of how arrays are stored in memory, where an index is really an offset from the array's starting address.


If you’ve ever wondered why arr[0] is the first element and not arr[1], the answer isn’t pedantry or tradition for its own sake. It comes straight from how a computer finds an element in memory.

An Index Is an Offset, Not a Count

An array is a contiguous block of memory. When you create one, the program knows a single thing about where it lives: the address of its first byte, called the base address. Every element after that is found by doing arithmetic from that base.

To locate any element, the machine computes:

address = base + index * elementSize

If you have an array of 4-byte integers starting at address 1000, then element 0 lives at 1000 + 0 * 4 = 1000, element 1 lives at 1000 + 1 * 4 = 1004, and so on. The index isn’t answering “which element in counting order” — it’s answering “how many elements past the start.” The first element is zero elements past the start, so its index is 0.
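
The same arithmetic can be written out by hand in C. This is a minimal sketch, not how you would normally index an array: `byte_offset` and `element_address` are hypothetical helper names, and `int32_t` is assumed here only to match the 4-byte integers in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` in an array of 4-byte integers. */
size_t byte_offset(size_t index) {
    return index * sizeof(int32_t);
}

/* Hypothetical helper: finds element `index` by hand, using
   address = base + index * elementSize, instead of writing base[index]. */
const int32_t *element_address(const int32_t *base, size_t index) {
    assert(base != NULL);
    /* Step through memory one byte at a time from the base address. */
    const unsigned char *first_byte = (const unsigned char *)base;
    return (const int32_t *)(first_byte + byte_offset(index));
}
```

For an array `a`, `element_address(a, 2)` points at the same location as `&a[2]`, and `element_address(a, 0)` is the base address itself.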

Once you see it this way, zero-based indexing stops looking like a quirk. A count-from-one scheme has to account for a subtraction on every access (base + (index - 1) * elementSize), either at run time or by having the compiler fold the adjustment into the base address, purely to undo a human convention. With zero-based indexing the index is the offset, and there is nothing to adjust. The convention is really just the address arithmetic showing through.
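
To make the comparison concrete, here is a small sketch of both schemes side by side. `get_zero_based` and `get_one_based` are illustrative names, not part of any library, and the one-based version is only an assumption about how such a scheme could be layered on top of a C array.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based access: the index is already the offset. */
int get_zero_based(const int *arr, size_t index) {
    return arr[index];          /* base + index * sizeof(int) */
}

/* Hypothetical one-based access: the position is shifted down first. */
int get_one_based(const int *arr, size_t position) {
    assert(position >= 1);      /* position 0 does not exist in this scheme */
    return arr[position - 1];   /* base + (position - 1) * sizeof(int) */
}
```

Both functions return the first element for their respective "first" index: `get_zero_based(a, 0)` and `get_one_based(a, 1)` read the same memory.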

Dijkstra and the Case for Half-Open Ranges

There’s a second, more subtle reason zero-based indexing tends to win, and it’s about counting ranges rather than addresses. In a well-known 1982 note titled “Why numbering should start at zero,” Edsger Dijkstra argued for writing ranges as half-open intervals: include the lower bound, exclude the upper one. Written mathematically, that’s [start, end).

The payoff is that the length of such a range is simply end - start, with no +1 or -1 correction anywhere. The elements 0, 1, 2, 3, 4 are exactly the range [0, 5), which has length 5 - 0 = 5. Two adjacent ranges like [0, 3) and [3, 6) join cleanly with no gap and no overlap, because the end of one is the start of the next. This is exactly the shape of nearly every loop you write:

for (int i = 0; i < n; i++) { ... } // runs n times, indices 0..n-1

The loop touches n elements, the last index is n - 1, and the bound i < n reads as “while still inside the range.” If arrays started at one and ranges were closed on both ends, you’d be sprinkling +1 and -1 adjustments through your code — and off-by-one errors thrive in exactly those adjustments.
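
The two properties of half-open ranges described above, length as a plain subtraction and adjacent ranges joining without overlap, can be checked with a short sketch. The function names `range_length` and `sum_range` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the half-open range [start, end): no +1 or -1 correction. */
size_t range_length(size_t start, size_t end) {
    assert(start <= end);
    return end - start;
}

/* Sums the elements of arr whose indices fall in [start, end). */
long sum_range(const int *arr, size_t start, size_t end) {
    long total = 0;
    for (size_t i = start; i < end; i++) {  /* same shape as the i < n loop */
        total += arr[i];
    }
    return total;
}
```

Splitting [0, 6) at 3 gives [0, 3) and [3, 6): their lengths add up to 6, and `sum_range` over the two halves adds up to `sum_range` over the whole, with no element counted twice or skipped. An empty range such as [2, 2) has length 0 and the loop body never runs.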

A Convention, Not a Law of Nature

It’s worth being precise: zero-based indexing is a convention, cemented largely by C and the languages that inherited its memory model, not a rule the universe enforces. Plenty of languages chose differently. Fortran arrays default to starting at one, MATLAB indexes from one, and Lua’s idiomatic arrays (tables) conventionally start at one. R and several mathematics-oriented languages do the same, because they prioritize matching mathematical notation, where a vector’s first component is usually written with subscript one.

None of these are wrong. They reflect a different priority: closeness to human and mathematical convention over closeness to the machine. The reason zero-based indexing feels “default” to most working programmers today is simply that the most widely used general-purpose languages (C, C++, Java, JavaScript, Python, Go, Rust) all adopted it, and that lineage traces back to the offset-from-base model the hardware uses anyway.

So the next time someone calls zero-based indexing confusing, you can hand them the one-line answer: the index is the distance from the start, and the start is zero away from itself.

FAQ

Does zero-based indexing make programs faster?
On modern hardware the difference is usually negligible, because compilers fold any constant offset into the address calculation at compile time. The original motivation was conceptual cleanliness and avoiding a runtime subtraction, not a measurable speed win in today's code.
Why do some languages like Fortran and MATLAB start at one?
They prioritize matching mathematical notation, where vectors and matrices are traditionally indexed from one. Those languages grew up in scientific and numerical computing, where staying close to the math on paper mattered more than mirroring the machine's memory model.
Is the last valid index always the array's length minus one?
For a standard zero-based array of n elements, yes: valid indices run from 0 to n - 1. Accessing index n reads or writes one element past the end, which is a classic off-by-one bug and, in languages without bounds checking such as C, a source of buffer overruns.
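
As a rough illustration of that last answer, a bounds check in C reduces to a single comparison against n. `index_in_bounds` and `checked_get` are hypothetical helpers written for this sketch; C itself performs no such check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when `index` is valid for a zero-based array of n elements,
   that is, when it lies in the half-open range [0, n). */
bool index_in_bounds(size_t index, size_t n) {
    return index < n;
}

/* Copies arr[index] into *out and returns true, or returns false
   without touching memory when the index is off the end. */
bool checked_get(const int *arr, size_t n, size_t index, int *out) {
    assert(arr != NULL && out != NULL);
    if (!index_in_bounds(index, n)) {
        return false;
    }
    *out = arr[index];
    return true;
}
```

With a three-element array, index 2 is accepted and index 3 is rejected, because 3 is the length rather than a position inside the array.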
