Scaling Merge-Ort Across GitHub: A New Era in Merging Strategies

At GitHub, the processes of merging and rebasing occur frequently in the background. When users are ready to merge their pull requests, the resultant merge is already prepared. Enhancing the performance of merges and rebases not only saves time for users but also optimizes backend resources. Recently, Git has introduced new features that GitHub is now utilizing on a large scale. This article explores the changes made and the improvements experienced.

Requirements for a Merge Strategy

Any merge strategy implemented at GitHub must adhere to several essential criteria:

Speed: Given GitHub’s extensive scale, even minor slowdowns can have significant repercussions across millions of activities occurring in hosted repositories daily.
Correctness: The definition of “correct” in merge strategies can be subjective. In ambiguous situations, GitHub strives to align with user expectations, which often mirror the behavior of the Git command line.
No Repository Checkout: To avoid scalability and security issues associated with having a working directory, GitHub’s strategy does not involve checking out the repository.
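
The third requirement can be illustrated with stock Git. The sketch below is not GitHub’s backend code, which this article does not show. It assumes a local Git 2.38 or newer, where git merge-tree --write-tree computes a merge directly in the object database and therefore also works in a bare repository. The helper names merge_tree_argv and merge_without_checkout are hypothetical.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class MergeResult:
    clean: bool    # True when the merge produced no conflicts
    tree_oid: str  # OID of the merged top-level tree


def merge_tree_argv(repo_path: str, ours: str, theirs: str) -> List[str]:
    """Build the argv for a merge that never touches a working directory."""
    return ["git", "-C", repo_path, "merge-tree", "--write-tree", ours, theirs]


def merge_without_checkout(repo_path: str, ours: str, theirs: str) -> MergeResult:
    """Run the merge in the object database (assumes Git >= 2.38)."""
    proc = subprocess.run(
        merge_tree_argv(repo_path, ours, theirs),
        capture_output=True,
        text=True,
    )
    # Exit status 0 is a clean merge and 1 is a merge with conflicts;
    # any other status means the merge could not be attempted at all.
    if proc.returncode not in (0, 1):
        raise RuntimeError(proc.stderr.strip())
    # The first line of stdout is the OID of the resulting tree.
    tree_oid = proc.stdout.splitlines()[0]
    return MergeResult(clean=(proc.returncode == 0), tree_oid=tree_oid)
```

Calling merge_without_checkout("/path/to/repo.git", "main", "feature") returns the merged tree without creating any files outside the repository’s object store, which is the property the requirement asks for.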

Historically, GitHub relied on libgit2 to fulfill these requirements. It provided faster performance than Git’s default merge strategy without necessitating a working directory. However, there were instances where libgit2 could not perform merges that users’ local Git installations could handle easily. This discrepancy resulted in numerous support tickets from users confused about why the GitHub web interface failed to merge files that their local command line could manage.

A New Strategy Emerges

Two years ago, Git introduced a new merge strategy called merge-ort. As detailed by its author on the mailing list, merge-ort is designed to be fast and correct while addressing many limitations of the previous default strategy, merge-recursive. Unlike merge-recursive, it does not require a working directory, and it is faster than even GitHub’s optimized libgit2-based approach. Because merge-ort has since become Git’s default merge strategy, adopting it also brings GitHub’s results in line with what users see on the command line, which is the sense of correctness described above.

The transition to merge-ort was divided into two phases: first for merges and then for rebases.

Merge-Ort for Merges

In September, GitHub announced the integration of merge-ort for handling merge commits. The implementation involved using Scientist to run both code paths in production, allowing for comparison of timing and correctness without significant risk. The process included:

Creating and enabling an experiment with the new code path.
Rolling it out initially to a small fraction of traffic, starting with internal repositories before expanding to a wider audience.
Iteratively measuring gains, verifying correctness, and fixing any bugs encountered.
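
The experiment pattern described in these steps can be sketched as follows. Scientist itself is a Ruby library, so this is a hand-rolled Python illustration of the idea rather than its API. The names run_experiment, Observation, and enabled_fraction are hypothetical, and the sketch is simplified (for example, it always runs the control first).

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Observation:
    matched: bool             # did both code paths agree?
    control_seconds: float    # time spent in the existing code path
    candidate_seconds: float  # time spent in the new code path


def run_experiment(
    control: Callable[[], T],
    candidate: Callable[[], T],
    enabled_fraction: float,
    observations: List[Observation],
    rng: Optional[random.Random] = None,
) -> T:
    """Always return the control result; sometimes also run the candidate."""
    rng = rng or random.Random()

    start = time.perf_counter()
    control_result = control()
    control_seconds = time.perf_counter() - start

    # Only a fraction of calls pay the cost of running the second code path.
    if rng.random() < enabled_fraction:
        start = time.perf_counter()
        try:
            matched = candidate() == control_result
        except Exception:
            # A failing candidate is recorded as a mismatch and never
            # affects what the caller receives.
            matched = False
        candidate_seconds = time.perf_counter() - start
        observations.append(
            Observation(matched, control_seconds, candidate_seconds)
        )

    return control_result
```

Because the caller always receives the control result, the candidate can be wrong or slow without affecting users. Raising enabled_fraction step by step corresponds to the gradual rollout described above.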

The results showed substantial speedups across a range of scenarios, particularly in large and heavily trafficked repositories. For instance, within the github/github monolith, merges became about ten times faster in both the average case and the P99 case. Across the entire experiment, P50 saw the same tenfold speedup, while P99 improved nearly fivefold.

Merge-Ort for Rebases

In addition to merges, GitHub also performs numerous rebases. Customers may opt for rebase workflows in their pull requests, and test rebases are conducted behind the scenes. Consequently, merge-ort was also applied to rebases.

This implementation used a new Git subcommand called git-replay, developed by Elijah Newren, the original author of merge-ort and a notable contributor to Git. The approach involved:

Merging git-replay into GitHub’s fork of Git (the experiment ran on Git 2.39, which did not yet include git-replay).
Utilizing the test suite prior to deployment to identify discrepancies between old and new implementations.
Automating tests by performing test rebases on all open pull requests in github/github and comparing outcomes.
Setting up a Scientist experiment to measure the performance difference between libgit2-powered and merge-ort-powered rebases while monitoring for unexpected behavioral mismatches.
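
The automated comparison of test rebases across open pull requests might look roughly like the sketch below. It is an illustration under stated assumptions, not GitHub’s tooling. The function find_mismatches and the two rebase callables are hypothetical, and an outcome is modeled as the tree OID of the rebased tip, or None when the rebase stops on a conflict.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Assumed outcome model: tree OID of the rebased tip, or None on conflict.
Outcome = Optional[str]
# A rebase implementation takes (base_ref, head_ref) and returns an outcome.
RebaseFn = Callable[[str, str], Outcome]


def find_mismatches(
    pull_requests: Iterable[Tuple[int, str, str]],
    old_rebase: RebaseFn,
    new_rebase: RebaseFn,
) -> Dict[int, Tuple[Outcome, Outcome]]:
    """Rebase each pull request with both implementations; keep disagreements."""
    mismatches: Dict[int, Tuple[Outcome, Outcome]] = {}
    for number, base_ref, head_ref in pull_requests:
        old = old_rebase(base_ref, head_ref)
        new = new_rebase(base_ref, head_ref)
        if old != new:
            # Keyed by pull request number so each case can be inspected by hand.
            mismatches[number] = (old, new)
    return mismatches
```

Comparing trees rather than commit OIDs is a design choice in this sketch: two implementations can produce identical content while differing in commit metadata such as timestamps, and only content differences point to a behavioral mismatch.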

The results were impressive. Across more than 730,000 test rebases, computers spent 2.56 hours executing rebases with libgit2, compared with under 10 minutes using merge-ort. Extrapolating from this sample, running every rebase during that period with merge-ort would have taken approximately 33 hours, versus 512 hours with libgit2.
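
The extrapolation can be checked with a few lines of arithmetic. The three input figures come from the paragraph above. The scale factor is inferred from the two libgit2 figures and is not stated in the article, and the merge-ort figure of 10 minutes is an upper bound, so the results are bounds rather than exact values.

```python
libgit2_sample_hours = 2.56  # measured during the experiment
libgit2_full_hours = 512.0   # extrapolated to all rebases in the period
ort_sample_minutes = 10.0    # "under 10 minutes", so an upper bound

# Scale factor implied by the two libgit2 figures (an inference).
scale = libgit2_full_hours / libgit2_sample_hours

# Apply the same factor to the merge-ort measurement.
ort_full_hours = ort_sample_minutes * scale / 60.0

# Lower bound on the speedup within the sample itself.
min_speedup = (libgit2_sample_hours * 60.0) / ort_sample_minutes

print(f"scale factor: about {scale:.0f}x")
print(f"merge-ort, extrapolated: at most about {ort_full_hours:.1f} hours")
print(f"speedup in the sample: at least {min_speedup:.1f}x")
```

The implied factor of about 200 suggests the experiment covered roughly 0.5% of rebases in the period, and applying it to 10 minutes gives the figure of about 33 hours.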

What’s Next

While significant progress has been made regarding common use cases for merge-ort at GitHub, this is merely the beginning. There are additional opportunities to harness its capabilities for enhanced performance, accuracy, and availability. Future plans include exploring squashing and reverting functionalities as well as investigating potential new product features that could arise from this powerful tool.