There is also shortcuts in case if the "current branch" is
a persistent revision such as tag or sha1. We check if the
persistent revision is present locally and if it does - do
no fetch anything from the server.
This greately reduces sync time and size of the on-disk repo
Change-Id: I23c6d95185474ed6e1a03c836a47f489953b99be
If the remote is using authenticated HTTP, but does not have
$GIT_URL/clone.bundle files in each repository, an initial sync
would fail around 8 projects in due to the library not resetting
the number of failures after getting a 404.
Work around this by updating the retry counter ourselves.
The urllib2 library is also not thread-safe. Make it somewhat
safer by wrapping the critical section with a lock.
Change-Id: I886e2750ef4793cbe2150c3b5396eb9f10974f7f
Signed-off-by: Shawn O. Pearce <sop@google.com>
Not every version of urllib2 supplies a reason object on the
HTTPError exception that it throws from urlopen(). Work around
this by using str(e) instead and hope the string formatting includes
sufficient information.
Change-Id: I0f4586dba0aa7152691b2371627c951f91fdfc8d
Signed-off-by: Shawn O. Pearce <sop@google.com>
urllib2 returns a malformed HTTPError object in certain situations.
For example, urllib2 has a couple of places where it creates an
HTTPError object with no fp:
if self.retried > 5:
# retry sending the username:password 5 times before failing.
raise HTTPError(req.get_full_url(), 401, "basic auth failed",
headers, None)
When it does that, HTTPError's ctor doesn't call through to
addinfourl's ctor:
# The addinfourl classes depend on fp being a valid file
# object. In some cases, the HTTPError may not have a valid
# file object. If this happens, the simplest workaround is to
# not initialize the base classes.
if fp is not None:
self.__super_init(fp, hdrs, url, code)
Which means the 'headers' slot in addinfourl is not initialized and
info() fails. It is completely insane that urllib2 decides not to
initialize its own base class sometimes.
Change-Id: I32a0d738f71bdd7d38d86078b71d9001e26f1ec3
Signed-off-by: Shawn O. Pearce <sop@google.com>
After a $GIT_URL/clone.bundle has been applied to the new local
repository, perform an incremental fetch using `git fetch` to ensure
the local repository is up-to-date. This allows the hosting server
to offer stale /clone.bundle files to bootstrap a new client.
If a single git fetch fails, it may succeed again after a short
delay. Transient failures are typical in environments where the
remote Git server happens to have limits on how many requests it
can serve at once (the anonymous git daemon, or an HTTP server).
Wait a randomized delay between 30 and 45 seconds and retry the
failed project once. This delay gives the site time to recover
from a transient traffic spike, and the randomization makes it less
likely that a spike occurs again from all of the same clients.
Change-Id: I97fb0fcb33630fb78ac1a21d1a4a3e2268ab60c0
Signed-off-by: Shawn O. Pearce <sop@google.com>
An HTTP (or HTTPS) based remote server may now offer a 'clone.bundle'
file in each repository's Git directory. Over an http:// or https://
remote repo will first ask for '$URL/clone.bundle', and if present
download this to bootstrap the local client, rather than relying
on the native Git transport to initialize the new repository.
Bundles may be hosted elsewhere. The client automatically follows a
HTTP 302 redirect to acquire the bundle file. This allows servers
to direct clients to cached copies residing on content delivery
networks, where the bundle may be closer to the end-user.
Bundle downloads are resumeable from where they last left off,
allowing clients to initialize large repositories even when the
connection gets interrupted.
If a bundle does not exist for a repository (a HTTP 404 response
code is returned for '$URL/clone.bundle'), the native Git transport
is used instead. If the client is performing a shallow sync, the
bundle transport is not used, as there is no way to embed shallow
data into the bundle.
Change-Id: I05dad17792fd6fd20635a0f71589566e557cc743
Signed-off-by: Shawn O. Pearce <sop@google.com>
The manifest project has - by design - not a review URL associated
with it. It is actually not even a 'project' in repo's sense.
This will prevent the commit-msg hook from being added, which is
not necessarily wanted as the project is managed in gerrit.
This commit will enable the commit-msg hook, which in turn will
add the Change-Id-line to every new commit in it. This simplifies
replacing patch sets (by git push ... refs/for/...).
Change-Id: I42d0f6fd79e6282d9d47074a3819e68d968999a7
Signed-off-by: Victor Boivie <victor.boivie@sonyericsson.com>
This commit adds a --br=<branch> option to repo upload.
repo currently examines every non-published branch. This is problematic
for my workflow. I have many branches in my kernel tree. Many of these
branches are based off of upstream remotes (I have many remotes) and
will never be uploaded (they'll get sent upstream as a patch).
Having repo scan these branches adds to my upload processing time
and clutters the branch selection buffer. I've also seen repo get
confused when one of my branches is 1000s of commits different from
m/master.
Change-Id: I68fa18951ea59ba373277b57ffcaf8cddd7e7a40
In the current version of repo checkout, we often get the error:
error: no project has branch xyzzy
...even when the actual error was something else. This fixes it
to only report the 'no project has branch' when that is actually true.
This fix is very similar to one made for 'repo abandon':
https://review.source.android.com/#change,22207
The repo checkout error is filed as: <http://crosbug.com/6514>
TEST=manual
A sample creating a case where 'git checkout' will fail:
$ repo start branch1 .
$ repo start branch2 .
$ touch bogusfile
$ git add bogusfile
$ git commit -m "create bogus file"
[branch2 f8b6b08] create bogus file
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 bogusfile
$ echo "More" >> bogusfile
$ repo checkout branch1 .
error: chromite/: cannot checkout branch1
A sample case showing that we still fail if no project has a branch:
$ repo checkout xyzzy .
error: no project has branch xyzzy
Change-Id: I48a8e258fa7a9c1f2800dafc683787204bbfcc63
The main fix is to give an error message if nothing was actually
abandoned. See <http://crosbug.com/6041>.
The secondary fix is to list projects where the abandon happened.
This could be done in a separate CL or dropped altogether if requested.
TEST=manual
$ repo abandon dougabc; echo $?
Abandon dougabc: 100% (127/127), done.
Abandoned in 2 project(s):
chromite
src/platform/init
0
$ repo abandon dougabc; echo $?
Abandon dougabc: 100% (127/127), done.
error: no project has branch dougabc
1
$ repo abandon dougabc; echo $?
Abandon dougabc: 100% (127/127), done.
error: chromite/: cannot abandon dougabc
1
Change-Id: I79520cc3279291acadc1a24ca34a761e9de04ed4
If git-rerere is enabled, it uses the rr-cache directory that
repo currently creates a symlink from, but doesn't create the
destination directory (inside the project's directory). Git
will then complain during merges and rebases.
This commit creates the rr-cache directory inside the project.
Change-Id: If8b57a04f022fc6ed6a7007d05aa2e876e6611ee
All repo-level hooks are expected to live in a single project at the
top level of that project. The name of the hooks project is provided
in the manifest.xml. The manifest also lists which hooks are enabled
to make it obvious if a file somehow failed to sync down (or got
deleted).
Before running any hook, we will prompt the user to make sure that it
is OK. A user can deny running the hook, allow once, or allow
"forever" (until hooks change). This tries to keep with the git
spirit of not automatically running anything on the user's computer
that got synced down. Note that individual repo commands can add
always options to avoid these prompts as they see fit (see below for
the 'upload' options).
When hooks are run, they are loaded into the current interpreter (the
one running repo) and their main() function is run. This mechanism is
used (instead of using subprocess) to make it easier to expand to a
richer hook interface in the future. During loading, the
interpreter's sys.path is updated to contain the directory containing
the hooks so that hooks can be split into multiple files.
The upload command has two options that control hook behavior:
- no-verify=False, verify=False (DEFAULT):
If stdout is a tty, can prompt about running upload hooks if needed.
If user denies running hooks, the upload is cancelled. If stdout is
not a tty and we would need to prompt about upload hooks, upload is
cancelled.
- no-verify=False, verify=True:
Always run upload hooks with no prompt.
- no-verify=True, verify=False:
Never run upload hooks, but upload anyway (AKA bypass hooks).
- no-verify=True, verify=True:
Invalid
Sample bit of manifest.xml code for enabling hooks (assumes you have a
project named 'hooks' where hooks are stored):
<repo-hooks in-project="hooks" enabled-list="pre-upload" />
Sample main() function in pre-upload.py in hooks directory:
def main(project_list, **kwargs):
print ('These projects will be uploaded: %s' %
', '.join(project_list))
print ('I am being a good boy and ignoring anything in kwargs\n'
'that I don\'t understand.')
print 'I fail 50% of the time. How flaky.'
if random.random() <= .5:
raise Exception('Pre-upload hook failed. Have a nice day.')
Change-Id: I5cefa2cd5865c72589263cf8e2f152a43c122f70
Fix for the bug that leaves a fractional .git directory after attempting to
perform an initial sync to a nonexistent revision. Moved the initialization of
the working directory to after the revision ID has already been checked. Now,
no project/.git directory gets created at all if the revision ID is bad.
Change-Id: I0c9b2a59573410f1d11de7661591bf02e4ce326b
This renaming was done for two reasons:
1. The hooks are actually project-level hooks, not repo-level
hooks. Since we are talking about adding repo-level hooks,
It keeps things less confusing if we name the existing hooks
to be "ProjectHooks"
2. The function is a private function in project.py and so
should have capitalization to match.
I also added a docstring describing this function.
Change-Id: I1d30f5de08e8f9f99f78146e68c76f906782d97e
There was a minor typo that would cause repo to (I believe)
mistakenly identify any file that contained a substring of the
word 'commit-msg' as a commit message hook. For example, the file
'mit' or the file 'msg' would be treated as a commit message hook.
I believe that it was intended that repo only recognize files
named exactly 'commit-msg'.
Change-Id: I93edbddf3da3cf0935641e6efb19b0a8ee6e2308
Commit "Make path references OS independent" (df14a70c45)
broke mirror clients by trying to invoke replace() on None
when there is no worktree.
Change-Id: Ie0a187058358f7dcdf83119e45cc65409c980f11
It hasn't been necessary for a long time, and its
functionality can be accomplished with 'git push'.
Change-Id: Ic00d3adbe4cee7be3955117489c69d6e90106559
Use git clone to initialize a new repository, and when possible
allow callers to use --reference to reuse an existing checkout as
the initial object storage area for the new checkout.
Change-Id: Ie27f760247f311ce484c6d3e85a90d94da2febfc
Signed-off-by: Shawn O. Pearce <sop@google.com>
If the -t flag is given to upload, the local branch name is
automatically sent to Gerrit Code Review as the topic branch name
for the change(s). This requires the server to be Gerrit Code
Review v2.1.3-53-gd50c94e or later, which isn't widely deployed
right now, so the default is opt-out.
Change-Id: I034fcacb405b7cb909147152db427fe69dd7bcbf
Signed-off-by: Shawn O. Pearce <sop@google.com>
If a tagged commit is not reachable by the fetch refspec configured
for the git (usually refs/heads/*) it will not be downloaded by
'git fetch'. The tag can however be downloaded with 'git fetch
--tags' or 'git fetch tag <tag>'.
This patch fixes the situation when a tag is not found after a
'git fetch'. Repo will issue 'git fetch tag <tag>' before giving
up completely.
Change-Id: I87796a5e1d51fcf398f346a274b7a069df37599a
Signed-off-by: Shawn O. Pearce <sop@google.com>
Most users of repo are also using Gerrit Code Review, and will want
the commit-msg hook to be automatically installed into their local
projects so that Change-Ids are assigned when commits are created,
not when they are first uploaded.
(cherry picked from commit a949fa5d20
but squashed with latest hook script from version 2.1.2)
Change-Id: Ie68b2d60ac85d8c2285d2e1e6a4536eb76695547
Signed-off-by: Shawn O. Pearce <sop@google.com>
This is almost always something the user needs to address
before continuing work, so promoting it to a failure (rather
than simply an informational message) seems the right way to
go. As a side-effect, repo will now exit with a non-zero
status code in this situation, so pipelines of the form
`repo sync && make` will fail if there are branches that
are stalled due to uploaded but unmerged patches.
If a local git repository exists within the same folder as a new project that
is added, when the user syncs the repo, the sync will overwrite the local
files under the project's .git repository with its own symlinks. Make sure
that we do not overwrite 'normal' files in repo and throw an error when
that happens.
If an email address in a commit object contains a space, like a few
malformed ones on the Linux kernel, we still want to split only on
the first space.
Unfortunately my brain was too damaged by Perl and originally wrote
the split asking for 2 results; in Python split's argument is how
many splits to perform. Here we want only 1 split, to break apart
the commit identity from the email address on the same line.
Signed-off-by: Shawn O. Pearce <sop@google.com>
We accidentally introduced this message during 1.6.8 by always
invoking `git rebase` when there were no new commits from the
upstream, but the user had local commits.
Signed-off-by: Shawn O. Pearce <sop@google.com>
The revisionExpr field now holds an expression from the manifest,
such as "refs/heads/master", while revisionId holds the current
commit-ish SHA-1 of the revisionExpr. Currently that is only
filled in if the manifest points directly to a SHA-1.
Signed-off-by: Shawn O. Pearce <sop@google.com>
The trick of looking at the reflog for the remote tracking branch
and only going back one commit works some of the time, but not all of
the time. Its sort of relying on the fact that the user didn't use
`repo sync -n` or `git fetch` to only update the tracking branches
and skip the working directory update.
Doing this right requires looking through the history of the SHA-1
source (what the upstream used to be) and finding a spot where the
DAG diveraged away suddenly, and consider that to be the rewind
point. That's really difficult to do, as we don't have a clear
picture of what that old point was.
A close approximation is to list all of the commits that are in
HEAD, but not the new upstream, and rebase all of those where the
committer email address is this user's email address. In most cases,
this will effectively rebase only the user's new original work.
If the user is the project maintainer and rewound the branch
themselves, and they don't want all of the commits they have created
to be rebased onto the new upstream, they should handle the rebase
on their own, after the sync is complete.
Signed-off-by: Shawn O. Pearce <sop@google.com>
We now feed Project a RemoteSpec, instead of the Remote directly
from the XmlManifest. This way the RemoteSpec already has the
full project URL, rather than just the base, permitting other
types of manifests to produce the URL in their own style.
Signed-off-by: Shawn O. Pearce <sop@google.com>
These aren't that widely used, and actually make it difficult for
users to fully mirror a forest of repositories, and then permit
someone else to clone off that forest, rather then the original
upstream servers.
Signed-off-by: Shawn O. Pearce <sop@google.com>
Performance improvements in repo sync caused us to skip out of the
initial Sync_LocalHalf without ever running CopyFiles, so we didn't
create the top level Makefile in new clients whose manifest request
one with a <copyfile> element.
Now we run CopyFiles after the initial read-tree that populates
the project working directory.
Signed-off-by: Shawn O. Pearce <sop@google.com>
If the current branch is published, but all published commits are
merged into the manifest revision, but there is also at least one
unpublished commit on the current branch, we should rebase the
unpublished commit, rather than creating a merge commit.
Signed-off-by: Shawn O. Pearce <sop@google.com>
By creating a background ssh "control master" process which lives
for the duration of our sync cycle we can easily cut the time for
a no-op sync of 132 projects from 60s to 18s.
Bug: REPO-11
Signed-off-by: Shawn O. Pearce <sop@google.com>
Most projects will have their branch heads matching in all branches,
so switching between them should be just a matter of updating the
work tree's HEAD symref. This can be done in pure Python, saving
quite a bit of time over forking 'git checkout'.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This is mostly useful if the number of projects to switch is many
(e.g. all of Android) and a large number of them are behind the
current manifest revision. We wind up needing to run git just to
make the working tree match, and that often makes the command take
a couple of seconds longer than we'd like.
Signed-off-by: Shawn O. Pearce <sop@google.com>
Its quite common for most projects to be matching the current
manifest revision, as most developers only modify one or two projects
at any one time. We can speed up `repo start foo` (that impacts
the entire client) by performing most of the branch creation and
switch operations in pure Python, and thus avoid 4 forks per project.
Signed-off-by: Shawn O. Pearce <sop@google.com>
These used to be used back when we had Gerrit 1.x support and used
HTTP based uploads to transmit changes for review. Since we moved
entirely to Gerrit 2.x, these are no longer called.
Signed-off-by: Shawn O. Pearce <sop@google.com>
Its unlikely that a new version of repo will be delivered in any
given day, so we now check only once every 24 hours to see if repo
has been updated. This reduces the sync cost, as we no longer need
to contact the repo distribution servers every time we do a sync.
repo selfupdate can still be used to force a check.
Signed-off-by: Shawn O. Pearce <sop@google.com>