Git’s Database Internals III: File History Queries (GitHub blog)

Post Syndicated from original https://lwn.net/Articles/906604/

The GitHub blog series on how the Git database works continues with this
look at file-history queries.

If these history modes usually have the same output, then why
wouldn’t we always use --full-history
--simplify-merges? The reason is performance. Not only
does simplified history speed up the query by skipping a large
portion of commits, it also allows iterative output. The simplified
history can output portions of the history without walking the
entire history. By contrast, the --simplify-merges algorithm is
defined recursively starting at commits with no parents. Git cannot
output a single result until walking all reachable commits and
computing their diffs on the input path. This can be extremely slow
for large repositories.
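
The contrast between streaming output and a walk that must finish before anything is emitted can be sketched with a toy model. This is an illustration only, not Git's implementation: the `Commit` class, the `same_as` field (parents whose version of the path is identical), and both walk functions are hypothetical names, root commits are assumed to introduce the path, and the second function is a stand-in that visits every reachable commit first rather than a real --simplify-merges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    parents: tuple = ()
    same_as: tuple = ()  # hypothetical: parents with identical content at the path


def simplified_walk(graph, tip, visited):
    """Yield commits that changed the path lazily, walking from the tip."""
    stack = [tip]
    while stack:
        cid = stack.pop()
        if cid in visited:
            continue
        visited.add(cid)
        commit = graph[cid]
        same = [p for p in commit.parents if p in commit.same_as]
        if same:
            # Path unchanged relative to a parent: skip this commit and
            # follow only that parent.
            stack.append(same[0])
        else:
            yield cid
            stack.extend(reversed(commit.parents))


def walk_everything_first(graph, tip, visited):
    """Stand-in for a roots-first computation: nothing is yielded until
    every reachable commit has been visited."""
    reachable = []
    stack = [tip]
    while stack:
        cid = stack.pop()
        if cid in visited:
            continue
        visited.add(cid)
        reachable.append(cid)
        stack.extend(graph[cid].parents)
    for cid in reachable:
        commit = graph[cid]
        if not commit.parents or any(p not in commit.same_as for p in commit.parents):
            yield cid


# Toy linear history c0..c99; the path changes only in c0, c40 and c98.
graph = {}
for i in range(100):
    parents = (f"c{i - 1}",) if i else ()
    changed = i in (0, 40, 98)
    graph[f"c{i}"] = Commit(parents, () if changed else parents)

lazy_seen, eager_seen = set(), set()
first_lazy = next(simplified_walk(graph, "c99", lazy_seen))
first_eager = next(walk_everything_first(graph, "c99", eager_seen))

# Both report c98 first, but after visiting 2 commits versus all 100.
print(first_lazy, len(lazy_seen))
print(first_eager, len(eager_seen))
```

In this toy history both walks list the same commits (c98, c40, c0), but the lazy one has its first answer after touching two commits, while the other touches all one hundred before producing anything.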