Using Mozilla trees more smartly
A month ago I got a new laptop, requiring me to migrate my Mozilla trees, patches, and related work from old laptop to new. My previous setup was the simplest, stupidest thing that could work: individual clones of different trees, no sharing among those trees, sometimes multiple clones of the same tree for substantial, independent patchwork I didn’t want to explicitly order. Others have tried smarter tricks in the past, and I decided to upgrade my setup.
A new setup
The new setup is essentially this:
- I have one local clone of mozilla-inbound in ~/moz/.clean-base which I never develop against or build against, and never modify except by updating it.
- Whenever I want a mozilla-inbound tree, I clone ~/moz/.clean-base. I change the default-push entry in the new clone to point to the original mozilla-inbound (see the example below). (I don't change the default entry; pulling is entirely local.)
- If I want to push a patch, I pull and update ~/moz/.clean-base. Then I pull and update the local clone that has the patch I want to push. Then I finish my patch and push it. Because default-push points to the remote mozilla-inbound, hg push as usual does exactly what I want.
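For concreteness, here's roughly what creating a new working tree looks like. (The paths and push URL below are illustrative; substitute wherever your tree actually lives and pushes.)

hg clone ~/moz/.clean-base ~/moz/tree

After editing ~/moz/tree/.hg/hgrc, its [paths] section reads something like:

[paths]
default = /home/jwalden/moz/.clean-base
default-push = ssh://hg.mozilla.org/integration/mozilla-inbound/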
Advantages
This setup has many advantages:
- Getting a new mozilla-inbound tree is fast. I never clone the remote mozilla-inbound tree, because I have it locally. It's not modified by a patch queue where I'd have to temporarily checkpoint work, pop to clone, then reapply after.
- Updating a working mozilla-inbound tree is fast. Pulling and updating are completely local with no network delay.
- I only need to update from the remote mozilla-inbound once for new changes to be available for all local trees. Instead of separately updating my SpiderMonkey shell tree, updating my browser tree, and updating any other trees I'm using, at substantial cost in time, one pull in ~/moz/.clean-base benefits all trees.
- My working trees substantially share storage with ~/moz/.clean-base, as the check below demonstrates.
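Because local clones on the same filesystem share history files via hardlinks, the sharing is easy to verify. A rough check, assuming a working tree at ~/moz/tree (du counts hardlinked files only once per invocation):

du -sh ~/moz/.clean-base               # size of the base clone alone
du -sh ~/moz/tree ~/moz/.clean-base    # combined total is well under double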
Pitfalls, and workarounds
Of course any setup has downsides. I've noticed these so far:
- Updating a working tree is a two-step process: first updating ~/moz/.clean-base, then updating the actual tree.
- I'll almost always lose a push race to mozilla-inbound. Even if my local working tree is perfectly up-to-date with my ~/moz/.clean-base, that's generally not up-to-date with the remote tree, particularly as rebasing my patches is now a two-step process. That produces a larger window of time for others to push things after I've updated my clean tree but before I've rebased my working tree.
- I have to remember to edit the default-push in new trees, lest I accidentally mutate ~/moz/.clean-base.
Some of these problems are indeed annoying, but I’ve found substantial workarounds for them such that I no longer consider them limitations.
Automate updating ~/moz/.clean-base
Updating is only a two-step process if I update ~/moz/.clean-base manually, but it’s easy to automate this with a cronjob. With frequent updates ~/moz/.clean-base is all but identical to the canonical mozilla-inbound. And by making updates automatic, I also lose push races much less frequently (particularly if I rebase and push right after a regular update).
I’ve added this line to my crontab using crontab -e to update ~/moz/.clean-base every twenty minutes from 07:00-01:00 every day but Sunday (this being when I might want an up-to-date tree):
*/20 00-01,07-23 * * 1-6 /home/jwalden/moz/inflight/pull-updated-inbound >/dev/null 2>&1
I perform the update in a script, redirecting all output to /dev/null so that cron won't mail me the output after every update. It seems better to have a simpler crontab entry, so I put the actual commands in /home/jwalden/moz/inflight/pull-updated-inbound:
#!/bin/bash
# Refresh the pristine tree: pull new changesets and update the working copy.
cd ~/moz/.clean-base/ || exit 1
hg pull -u
With these changes in place, updating a working tree costs only the time required to rebase it: network delay doesn’t exist. And the intermediate tree doesn’t intrude on my normal workflow.
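Concretely, refreshing a working tree then looks something like this sketch (hg pull --rebase requires the rebase extension to be enabled):

cd ~/moz/tree
hg pull -u          # purely local pull from ~/moz/.clean-base
# or, if in-flight work is committed locally and the rebase extension is enabled:
hg pull --rebase    # pull and rebase local changesets in one step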
Add a hook to ~/moz/.clean-base to prevent inadvertent pushes
My setup depends on ~/moz/.clean-base being clean. Local changes or commits will break automatic updates and might corrupt my working trees. I want ~/moz/.clean-base to only change through pulls.
I can enforce this using a Mercurial prechangegroup hook. This hook, run when a repository is about to accept a group of changes, can gate changes before they're added to a tree. I use such a hook to prevent any changes except by a pull, by adding these lines to ~/moz/.clean-base/.hg/hgrc:
# Prevent pushing into local mozilla-inbound clone: only push after changing a clone's default-push.
[hooks]
prechangegroup.prevent_pushes = python:prevent_pushes.prevent_pushes.hook
This invokes the hook function in prevent_pushes.py:
#!/usr/bin/python

def hook(ui, repo, **kwargs):
    # Mercurial passes the operation that triggered the hook as 'source':
    # 'pull' for pulls, 'push' for local pushes, 'serve' for remote ones.
    source = kwargs['source']
    if source != 'pull':
        print "Changes pushed into non-writable repository! Only pulls permitted."
        return 1  # a nonzero return rejects the incoming changegroup
    print "Updating pristine mozilla-inbound copy..."
    return 0  # zero permits the changes
On my Fedora-based system, I place this file in /usr/lib/python2.7/site-packages/prevent_pushes/ beside an empty __init__.py. Mercurial will find it and invoke the hook whenever ~/moz/.clean-base receives changesets.
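On such a system the installation amounts to something like the following sketch; adjust the path for your Python version:

sudo mkdir -p /usr/lib/python2.7/site-packages/prevent_pushes
sudo touch /usr/lib/python2.7/site-packages/prevent_pushes/__init__.py
sudo cp prevent_pushes.py /usr/lib/python2.7/site-packages/prevent_pushes/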
Only a push from a new clone whose default-push hadn't yet been changed would attempt to modify ~/moz/.clean-base, so the need for this protection might seem small. Yet so far this hook has blocked such changes more than once when I've forgotten to set a default-push, and I expect it will again.
Conclusion
There are doubtless many good ways to organize Mozilla work. I find this system works well for me, and I hope this description of it provides ideas for others to incorporate into their own setups.
I just use a shell script and rsync.
I already edited the .hgrc in each of the clean/* repos to include the default-push.
#!/bin/sh -x
if [ $# -ne 1 ] ; then
  echo "Need a single arg: directory name!"
  exit 1
fi
DIRNAME=$1
if [ -e $DIRNAME ] ; then
  echo "$DIRNAME exists!"
  exit 1
fi
mkdir $DIRNAME
for dir in buildbot-configs buildbotcustom build-tools mozharness; do
  pushd clean/$dir
  hg pull
  hg up -C -r default
  popd
  rsync -azv clean/$dir $DIRNAME/
done

Comment by aki — 03.11.11 @ 10:27
Very similar to mine. Major differences:

* I have an hourly cron job to pull mozilla-inbound. When I want to be as up-to-date as possible, I have an alias named 'pullup' that does ( cd $(hg path default) && hg pull ). Come to think of it, that's probably better spelled hg -R $(hg path default) pull.
* I don't care much about the source in my upstream repo, so I figure pull -u is a waste of time. Both my pullup alias and my cronjob just do an hg pull. If I want the current source, I'll manually do an hg update.
* I don't bother preventing pushes to my upstream repo because I never do a push with no arguments. I just put various aliases into the [paths] section of my ~/.hgrc, and push to those. mi=mozilla-inbound, try=Try. I don't trust myself to have configured my hgrc's properly.
* To avoid the cron mail problem, I just have a broken mail setup on my computer so it only delivers locally. 🙂
* I have a cron job to relink all of my repos, since the hard links get broken easily and quickly. I use cd src; ls | while read f; do [ -d $f/.hg ] && ( cd $f && hg relink ); done
* Except that I have a couple of different upstreams, so for those, I add in a default-relink that points to my mozilla-inbound upstream repo. Given how rarely I update those, that's probably a waste.
* I also have a cron job to attempt a nightly build in all of my objdirs. I put my objdirs underneath my src dirs (so I can do hg commands from the objdir) and name them obj-something. That's probably unhealthy if you use the inotify extension, but inotify is busted in other ways anyway. The nightly builds are really only useful for when I switch back to a repo I haven't used for a little while (since I don't auto-update any of my repos), but sometimes I'll have opt/debug/really-debug objdirs and they'll be handy for my currently active repo or repos.

If ccache could handle builds with varying paths, I'd do a nightly pull -u and build with every configuration I cared about, just to warm up the ccache.

Comment by Steve Fink — 03.11.11 @ 17:03
My mail's only delivered locally too, but I think (based on what seems to happen if I invoke at manually) I'd get random terminal spam about new (local) mail if I didn't silence the job.

The cronjob to relink repositories is a good idea worth using. Although to tell the truth, I find the space-saving aspect of the setup about the least of its advantages, given how cheap disk space is, and how much the new laptop has (well more than I've ever used).
Comment by Jeff — 03.11.11 @ 18:02
I keep my dirty repos pointing to my clean, master repo, but when I’m ready to push a patch from a dirty repo I push directly to mozilla-inbound to reduce the chance of losing a race.
Comment by njn — 03.11.11 @ 18:29
I was doing this back when mozilla-central was created; I have one master, one build and one push tree (which default-pushes to ssh) for each repository that I use, which is comm-central and all its dependencies (which makes it harder to use mozilla-inbound).

I don't update the master clone (I used hg clone -U the first time).

I configure the trees to automatically relink after a local pull.

When I want to check in I use hg pull -R `hg paths default` && hg pull && hg update (I can't use hg pull -u in case the cron job meant that I have nothing to pull).

I don't have a specific cron job, unless you count the shell script I saved in /etc/cron.hourly as a cron job.

Comment by Neil Rashbrook — 04.11.11 @ 06:58
Always looking for ways to improve my workflow, and this post and jlebar's were very helpful (as were the comments and different approaches) – thanks!
Comment by Ian Melven — 04.11.11 @ 15:48
Yeah, super similar to what I'm doing myself with BlueGriffon and my local clones of mozilla-central. Since nobody wrote it down before, nice post. It should probably be cloned somewhere on MDN.
Comment by Daniel Glazman — 09.11.11 @ 00:00