We describe what fencepost errors are, how to avoid them, and provide a precise derivation for why they occur.
Suppose that you’re working on a large codebase in your favorite editor, and suppose that you want to find out how many lines of code a particular class in this codebase takes up.
To do this, you find the first line of code that the class starts on, say line $7$, and then you find the last line of code that the class ends on, say line $373$. You then compute $373 - 7 = 366$ and conclude that this class takes up $366$ lines of code.
However, the correct answer is actually $367$ lines of code, since the last line, line $373$, was not counted in the original answer.
This is an example of a fencepost error. The word “fencepost” comes from one of the simplest examples where fencepost errors occur:
If you build a fence that is $10$ feet long with posts spaced $1$ foot apart, and assuming that each fence section consists of $2$ posts, how many posts do you need to build the fence?
If we divide $10$ feet of fence by $1$ foot per fence section, then we obtain $10$ fence sections. However, each fence section consists of $2$ posts, so while there will be $10$ fence sections, there will be $11$ posts.
We can resolve the fencepost error in the example above by writing down the sequence of lines of code as
Then, because two sequences have the same length if there exists a bijection between them, we can determine the length $N$ of $\eqref{loc_seq}$ by finding a bijection between it and the sequence $1,2,\dots,N$ as follows,
Hence, the length $N$ of $\eqref{loc_seq}$ is $367$.
More formally, consider the integers $r$ and $a$ such that $a > r$. What is the length of the sequence $r, r+1, r+2, \dots, a$? That is, how many integers are there in this sequence?
As done above, the length of this sequence can be determined by finding a bijection between this sequence and the sequence $1,2,\dots,N$, where $N \in \mathbb N$ is the length of the sequence and $N > 1$, as follows,
where $\eqref{gen_seq2}$ is obtained from $\eqref{gen_seq1}$ by subtracting $r$ from each element in $\eqref{gen_seq2}$ and $\eqref{gen_seq3}$ is obtained from $\eqref{gen_seq2}$ by adding $1$ to each element in $\eqref{gen_seq2}$. Hence, the length of the sequence $r,r+1,r+2,\dots,a$ is $(a-r) + 1$.
We can generalize this reasoning to the more general sequence $r,r+b,r+2b,\dots,a$, such that the difference between consecutive elements in the sequence is $b \in \mathbb N$ instead of $1$ as assumed above and $a = r + q \cdot b$ for some $q \in \mathbb N$.
The length of this more general sequence can be obtained as follows,
Hence, the length of the sequence $r,r+b,r+2b,\dots,a$ is $\frac{a-r}{b} + 1$. It is interesting to note that because $a = r + q \cdot b$, then $a, r, q,$ and $b$ can be interpreted as the dividend, remainder, quotient, and divisor, respectively, when $a$ is divided by $b$ via Euclidean division.
This general case is useful when determining how many “points” there are in a given interval. For example, suppose that you are trying to run a numerical simulation over time. You have the desired total simulation time $T > 0$ and a fixed time step size $\Delta t > 0$. Given this information, you would like to compute how many points in time that the simulation will step through, denoted $N$.
That is, you would like to determine the length of the sequence $0, \Delta t, 2\Delta t, \dots, T$. As done above, we can determine the length of this sequence by finding a bijection between this sequence and the sequence $1,\dots,N$, where $N \in \mathbb N$, as follows:
where $\eqref{sim_seq2}$ is obtained from $\eqref{sim_seq1}$ by dividing each element in the sequence by $\Delta t$ and $\eqref{sim_seq3}$ is obtained from $\eqref{sim_seq2}$ by adding $1$ to each element in the sequence. Hence, the number of time points that this simulation will step through is $\frac{T}{\Delta t} + 1$.
To summarize, to compute the correct number of numbers in a sequence and avoid fencepost errors, do the following:
- Given the end-points of the sequence, write down the full sequence that you would like to count. Be sure to avoid including any numbers that you do not want to count.
- Transform the sequence from step 1 into the canonical sequence $1,\dots,N$, where $N \in \mathbb N$.
- The amount of numbers in the sequence is $N$.
Fencepost errors occur when the sequence that is written down in step 1 includes numbers that should not be counted or does not include numbers that should be counted.
For further reading on fencepost errors, see this article.