Inter Repository Operations

[This is an experiment in publishing a doc piece by piece as blog entries. Please refer to the main page for additional context.]

Mozilla, like most operations, has the Repositories of Record (RoR) set to only allow “fast forward” updates when new code is landed. In order to fast forward merge, the tip of the destination repository (RoR) must be an ancestor of the commit being pushed from the source repository. In the discussion below, it will be useful to say if a repository is “ahead”, “behind”, or “equal” to another. These states are defined as:

  • If the tip of the two repositories are the same reference, then the two repositories are said to be equal (’e’ in table below)
  • Else if the tip of the upstream repository is a ancestor of the tip of the destination repository, the upstream is defined to be behind (’B’ in table below) the source repository
  • Otherwise, the upstream repository is ahead (’A’ in table below) of the source repository.

Landing a change in the normal (2 repository case: RoR and lander’s repository), the process is logically (assuming no network issues):

  1. Make sure lander’s repository is equivalent to RoR (start with equality)

  2. Apply the changes (RoR is now “Behind” the local repository)

  3. Push the changes to the RoR
    • if the push succeeds, then stop. (equality restored)

    • if the push fails, simultaneous landings were being attempted, and you lost the race.

      When simultaneous landings are attempted, only one will succeed, and the others will need to repeat the landing attempt. The RoR is now “Ahead” of the local repository, and the new upstream changes will need to be incorporated, logically as:

      1. Remove the local changes (“patch -R”, “git stash”, “hg import”, etc.).
      2. Pull the changes from RoR (will apply cleanly, equality restored)
      3. Continue from step 2 above

When an authorized committer wants to land a change set on an hg RoR from git, there are three repositories involved. These are the RoR, the git repository the lander is working in, and internal hggit used for translation. The sections below describe how this affects the normal case above.

Land from git – Happy Path

On the happy path (no commit collisions, no network issues), the steps are identical to the normal path above. The git commands executed by the lander are set by the tool chain to perform any additional operations needed.

Land from git – Commit Collision

Occasionally, multiple people will try to land commits simultaneously, and a commit collision will occur (steps 3a, 3b, & 3c above). As long as the collision is noticed and dealt with before addition changes are committed to the git repository, the tooling will unapply the change to the internal hggit repository.

Land from git – Sad Path

In real life, network connections fail, power outages occur, and other gremlins create the need to deal with “sad paths”. The following sections are only needed when we’re neither on the happy path nor experiencing a normal commit collision.

Because these cases cover every possible case of disaster recovery, it can appear more complex than it is. While there are multiple (6) different sad paths, only one will be in play for a given repository. And the maximum number of operations to recover is only three (3). The relationship between each pair of repositories determines the correct actions to take to restore the repositories to a known, consistent state. The static case is simply:

Simplistic Recovery State Diagram

Simplistic Recovery State Diagram

Note

  1. The simplistic diagram assumes no changes to RoR during the duration of the recovery (not a valid assumption for real life). See the text for information on dealing with the changes.
  2. States “BB” & “BA” are not shown, as they represent invalid states that may require restoring portions of the system from backup before proceeding.

In reality, it is impractical to guarantee the RoR is static during recovery steps. That can be dealt with by applying the process described in the flowchart to restore equality and using the tables below to locate the actions.

The primary goal is to ensure correctness based on the RoR. The secondary goal is to make the interim repository as invisible as possible.

Key RoR <-> hggit hggit <-> git Interpretation Next Step to Equality
Ae Ahead equal someone else landed pull from RoR
AA Ahead Ahead someone else landed [1] pull from RoR
AB Ahead Behind someone else landed [1] back out local changes (3a above)
ee equal equal equal nothing to do
eA equal Ahead someone else landed [2] pull to git
eB equal Behind ready to land push from git
Be Behind equal ready to land [2] push to RoR
BA Behind Ahead prior landing not finished, lost from git [3] corrupted setup, see note
BB Behind Behind prior landing not finished, next started [4] back out local changes (3a above) from 2nd landing

Table Notes

[1](1, 2) This is the common situation of needing to update (and possibly re-merge local changes) prior to landing the change
[2](1, 2) If the automation is working correctly, this is only a transitory stage, and no manual action is needed. IRL, stuff happens, so an explicit recovery path is needed.
[3]This “shouldn’t happen”, as it implies the git repository has been restored from a backup and the “pending landing” in the hggit repository is no longer a part of the git history. If there isn’t a clear understanding of why this occurred, client side repository setup should be considered suspect, and replaced.
[4]

Lander shot themselves in the foot - they have 2 incomplete landings in progress. If they are extremely lucky, they can recover by completing the first landing (“hg push RoR” -> “eB”), and proceed from there.

The deterministic approach, which also must be used if landing of first change set fails, is to back out second landing from hggit and git, then back out first landing from hggit and git.) Then equality can be restored, and each landing redone separately.

DVCS Commands
Next Step Active Repository Command
pull from RoR hggit hg pull
pull to git git git pull RoR
push from git git git push RoR
push to RoR hggit hg push

Note

that if any of the above actions fail, it simply means that we’ve lost another race condition with someone else’s commit. The recovery path is simply to re-evaluate the current state and proceed as indicated (as shown in diagram 1).

Flowchart to Restore Equality

Flowchart to Restore Equality

Flowchart to Restore Equality

Using hg & git for the same codebase

Speaker Notes

Following are the slides I presented at the RELENG 2013 workshop on May 20th, 2013. Paragraphs formatted like this were not part of the presented slides - they are very rough speaker notes.

If you prefer, you may view a PDF version.

Hal Wine hwine@mozilla.com
Release Engineering
Mozilla Corporation

Issues and solutions encountered in maintaining a single code base under active development in both hg & git formats.

Background

  • Mozilla Corporation operates an extensive build farm that is mostly used to build binary products installed by the end user. Mozilla has been using Mercurial repositories for this since converting from CVS in 2007. We currently use a 6 week “Rapid Release” cycle for most products.

    Speaker Notes

    We currently have upwards of 4,000 hosts involved in the continuous integration and testing of Mozilla products. These hosts do approximately 140 hours of work on each commit.

  • Firefox Operating System is a new product that ships source to be incoporated by various partners in the mobile phone industry. These partners, experienced with the Android build process, require source be delivered via git repositories. This is close to a “Continuous Release” process.

    Speaker Notes

    A large part of the FxOS product is code used in the browser products. That is in Mercurial and needs to be converted to git. Most new code modules for FxOS are developed on github, and need to be converted to Mercurial for use in our CI & build systems.

Summary

  • What we initially set out to do:
    • Make it purely a developer choice which dvcs to use.

      Speaker Notes

      Ideal was to allow developers to make dvcs as personal a choice as editor.

    • Support multiple social coding sites.

      Speaker Notes

      These social coding sites, such as github and bitbucket, make it much easier for new community members to contribute.

  • That was much tougher than anticipated.
    In theory, git & hg are very close…
    … In practice, “the devil is in the details”.
  • Where we are:
    • Changed direction to support FFOS release to partners.
    • Quickly mirror Repository of Record (RoR) between git & hg.
    • CI/build system remains Mercurial centric.

Challenge Areas

  • Changesets have different hashes in Mercurial and git.
    • We added tooling to support both in static documents such as manifest files.
    • All tools continue to use hg hash as primary value for indexing and linking.
  • Propagation delays of changesets to the “other” system.

    Speaker Notes

    For most use cases, the approximately 20 minute average we’re achieving is acceptable.

    • Compounded by hash differences between two systems.

      Speaker Notes

      A common use case here is a developer wanting to start a self serve build. If the commit was to git, the self serve build won’t be successful until that commit is converted to hg.

      We are continuing work on this. It is closely tied to determining which commit broke the build, when multiple repositories are involved.

  • Build details
    • Movable tags are not popular in git based workflows, but have been a common technique at Mozilla to mark “latest”.

Challenge Areas (Con’t)

  • Mixed philosophies are often linked with mixed repositories.
    • Android never wants history to appear to change. Downstream servers allow only fast forward changesets and deny deletions.

    • Mozilla uses “RoR is authoritative”.

      Speaker Notes

      Either approach is self consistent. It is when the two need to interact that challenges arrise.

  • Conversion failures
    • Occasional hg-git conversion failures, due to implementation details of hg & git.

      Speaker Notes

      • Dates in export patches (e.g. hg uses seconds, git uses minutes, in time)
      • Email validation (git stricter than hg)
    • Since commit already accepted by hg, hg-git must be modified

      Speaker Notes

      This requires inhouse resources to respond urgently to patch the conversion machinery. Without conversion, there are no builds.

Alternate Approaches

  • To support your own “use the DVCS you want” infrastructure requires:
    • production quality hg server
    • production quality git server
    • in house ability to address conversion issues (as already mentioned)
  • I’m aware of two commercial alternatives. Both of these use a centralized RoR which supports git and/or hg interfaces for developer interaction.

    Speaker Notes

    And at least one explicitly does not have a git back end.

  • You can leave it to developers to scratch their own itch independently. Given diversity of workflows, this may be more cost effective than obtaining consensus.

Future Research

Areas of particular interest for further study include:

  • What is the set of enforceable assertions which would ensure the tooling can maintain lossless conversion between DVCS?

  • What minimum conditions must be maintained in conversions to preclude downstream conflicts?

  • What workflows can be supported to minimize issues?

  • Are there best practice incident management protocols for addressing problem commits.

    Speaker Notes

    The common example is a commit contains sensitive material it should not. There are cases were limiting the scope of distribution can have significant business value.