Hunks (Comparing and Merging Files)
1.1 Hunks
When comparing two files, diff finds sequences of lines common to both files, interspersed with groups of differing lines called _hunks_. Comparing two identical files yields one sequence of common lines and no hunks, because no lines differ. Comparing two entirely different files yields no common lines and one large hunk that contains all lines of both files. In general, there are many ways to match up lines between two given files. diff tries to minimize the total hunk size by finding large sequences of common lines interspersed with small hunks of differing lines.
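The idea of common sequences separated by hunks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a textbook longest-common-subsequence table, which is not necessarily how diff itself searches for a matching, and the names `lcs_pairs` and `hunks` are hypothetical helpers invented for this example.

```python
def lcs_pairs(a, b):
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # length[i][j] holds the LCS length of a[i:] and b[j:].
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                length[i][j] = length[i + 1][j + 1] + 1
            else:
                length[i][j] = max(length[i + 1][j], length[i][j + 1])
    # Walk the table from the start to recover one matching.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif length[i + 1][j] >= length[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def hunks(a, b):
    """Group the lines outside the common subsequence into hunks.

    Each hunk is a pair (lines_only_in_a, lines_only_in_b).
    """
    result = []
    prev_i = prev_j = 0
    # A sentinel pair just past the end of both files flushes the last hunk.
    for i, j in lcs_pairs(a, b) + [(len(a), len(b))]:
        if i > prev_i or j > prev_j:
            result.append((a[prev_i:i], b[prev_j:j]))
        prev_i, prev_j = i + 1, j + 1
    return result
```

With this sketch, `hunks(["x", "y"], ["x", "y"])` returns an empty list, and `hunks(["a", "b"], ["c", "d"])` returns a single hunk holding every line of both inputs, mirroring the two extreme cases described above.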
For example, suppose the file F contains the three lines ‘a’, ‘b’, ‘c’, and the file G contains the same three lines in reverse order ‘c’, ‘b’, ‘a’. If diff finds the line ‘c’ as common, then the command ‘diff F G’ produces this output:
    1,2d0
    < a
    < b
    3a2,3
    > b
    > a
But if diff notices the common line ‘b’ instead, it produces this output:
    1c1
    < a
    ---
    > c
    3c3
    < c
    ---
    > a
It is also possible to find ‘a’ as the common line. diff does not always find an optimal matching between the files; it takes shortcuts to run faster. But its output is usually close to the shortest possible. You can adjust this tradeoff with the --minimal (-d) option (see diff Performance Tradeoffs).
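To see how the choice of common line shapes the output in the F and G example, the following Python sketch renders the normal output format for a matching supplied by the caller. It is an illustration only, not diff's implementation: `normal_diff` is a hypothetical helper, and `matches` is assumed to be an increasing list of 0-based (i, j) pairs naming the lines treated as common.

```python
def normal_diff(a, b, matches):
    """Render the differences between a and b in normal output format,
    treating the 0-based (i, j) pairs in `matches` as common lines."""

    def span(first, last):
        # A line range as it is printed: "3" for one line, "2,3" for several.
        return str(first) if first == last else f"{first},{last}"

    out = []
    prev_i = prev_j = 0
    # A sentinel pair just past the end of both files flushes the last hunk.
    for i, j in list(matches) + [(len(a), len(b))]:
        old, new = a[prev_i:i], b[prev_j:j]
        if old and new:
            out.append(f"{span(prev_i + 1, i)}c{span(prev_j + 1, j)}")
        elif old:
            out.append(f"{span(prev_i + 1, i)}d{prev_j}")
        elif new:
            out.append(f"{prev_i}a{span(prev_j + 1, j)}")
        out += [f"< {line}" for line in old]
        if old and new:
            out.append("---")
        out += [f"> {line}" for line in new]
        prev_i, prev_j = i + 1, j + 1
    return "\n".join(out)


F = ["a", "b", "c"]
G = ["c", "b", "a"]

print(normal_diff(F, G, [(2, 0)]))  # 'c' treated as the common line
print(normal_diff(F, G, [(1, 1)]))  # 'b' treated as the common line
print(normal_diff(F, G, [(0, 2)]))  # 'a' treated as the common line
```

Each of the three matchings leaves four differing lines in total, so none of them is shorter than the others; they differ only in how those lines are grouped into hunks.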