The smart sync logic takes up about 45% of the overall Execute func
and is about 100 lines of code. The only effect it has on the rest
of the code is to set the manifest_name variable. Since this func
is already quite huge, split the smart sync logic out.
Change-Id: Id861849b0011ab47387d74e92c2ac15afcc938ba
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/234835
Tested-by: Mike Frysinger <vapier@google.com>
Reviewed-by: David Pursehouse <dpursehouse@collab.net>
In commit d9e5cf0e ("sync: invert --force-broken with --fail-fast") the
force-broken option has been deprecated. Accidentally the option has
been changed from Boolean to Value. This breaks all users of repo with:
main.py: error: -f option requires an argument
This is easy to avoid by keeping the type.
Signed-off-by: Stefan Müller-Klieser <s.mueller-klieser@phytec.de>
Change-Id: Ia8b589cf41ac756d10c61e17ec8d76ba8f7031f9
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/235043
Reviewed-by: David Pursehouse <dpursehouse@collab.net>
Reviewed-by: Mike Frysinger <vapier@google.com>
Tested-by: Mike Frysinger <vapier@google.com>
A common pattern in our subcommands is to verify the arguments &
options before executing things. For some subcommands, that check
stage is quite long which makes the execution function even bigger.
Lets split that logic out of the execute phase so it's easier to
manage these.
This is most noticeable in the sync subcommand whose Execute func
is quite large, and the option checking makes up ~15% of it.
The manifest command's Execute can be simplified significantly as
the optparse configuration always sets output_file to a string.
Change-Id: I7097847ff040e831345e63de6b467ee17609990e
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/234834
Reviewed-by: David Pursehouse <dpursehouse@collab.net>
Tested-by: Mike Frysinger <vapier@google.com>
People seem to not expect the sync process to halt immediately if an
error is encountered. It's also basically guaranteed to leave their
tree in an incomplete state. Lets invert the default behavior so we
attempt to sync (both fetch & checkout) all projects. If an error is
hit, we still exit(1) and show it at the end.
If people want the sync to abort quickly, they can use the new option
--fail-fast.
Bug: https://crbug.com/gerrit/11293
Change-Id: I49dd6c4dc8fd5cce8aa905ee169ff3cbe230eb3d
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/234812
Reviewed-by: David Pursehouse <dpursehouse@collab.net>
Tested-by: Mike Frysinger <vapier@google.com>
Callers don't actually see -1 (they'll usually see 255, but the exact
answer here is complicated). Just switch to 1 as that's the standard
value tools use to indicate an error.
Change-Id: Ib712db1924bc3e5f7920bafd7bb5fb61f3bda44f
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/233553
Reviewed-by: David Pursehouse <dpursehouse@collab.net>
Tested-by: Mike Frysinger <vapier@google.com>
The partial clone rework (commit 745be2ede1
"Add support for partial clone") changed the behavior when a single repo
hit a failure: it would always call sys.exit() immediately. This isn't
even necessary as we already pass down an error event object which the
workers set and the parent checks. Just delete the exit entirely.
Change-Id: Id72d8642aefa2bde24e1a438dbe102c3e3cabf48
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/233552
Reviewed-by: David Pursehouse <dpursehouse@collab.net>
Tested-by: Mike Frysinger <vapier@google.com>
A new option, --partial-clone is added to 'repo init' which tells repo
to utilize git's partial clone functionality, which reduces disk and
bandwidth usage when downloading by omitting blob downloads initially.
Different from restricting clone-depth, the user will have full access
to change history, etc., as the objects are downloaded on demand.
Change-Id: I60326744875eac16521a007bd7d5481112a98749
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/229532
Reviewed-by: Mike Frysinger <vapier@google.com>
Tested-by: Xin Li <delphij@google.com>
Neither of the fields here expect floats so make sure we use integer
division when calculating things.
Bug: https://crbug.com/gerrit/10418
Change-Id: Ibda068b16a7bba7ff3efba442c4bbff4415caa6e
There's no reason to support any other encoding in these files.
This only affects the files themselves and not streams they open.
Bug: https://crbug.com/gerrit/10418
Change-Id: I053cb40cd3666ce5c8a0689b9dd938f24ca765bf
Forcefully remove dirty projects if option '--force-remove-dirty' is given.
The '--force-remove-dirty' option can be used to remove previously used
projects with uncommitted changes. WARNING: This may cause data to be lost
since uncommitted changes may be removed with projects that no longer exist
in the manifest.
Change-Id: I844a6e943ded522fdc7b1b942c0a1269768054bc
* Add more file i/o wrappers in platform_utils to allow using
long paths (length > MAX_PATH) on Windows.
* Paths using the long path syntax ("\\?\" prefix) should never
escape the platform_utils API surface area, so that this
specific syntax is not visible to the rest of the repo code base.
* Forward many calls from os.xxx to platform_utils.xxx in various place
to ensure long paths support, specifically when repo decides to delete
obsolete directories.
* There are more places that need to be converted to support long paths,
this commit is an initial effort to unblock a few common use cases.
* Also, fix remove function to handle directory symlinks
Change-Id: If82ccc408e516e96ff7260be25f8fd2fe3f9571a
Since gitiles recommends using # headers over ---/=== underlines,
change the manifest-format.md over and all our help texts.
Change-Id: I96391d41fba769e9f26870d497cf7cf01c8d8ab3
os.remove raises an exception when deleting read-only files on
Windows. Replace all calls with calls to platform_utils.remove,
which deals with deals with that issue.
Change-Id: I4dc9e0c9a36b4238880520c69f5075eca40f3e66
* changes:
Port os.rename calls to work on Windows
Workaround shutil.rmtree limitation on Windows
Add support for creating symbolic links on Windows
Make "git command" and "forall" work on Windows
$ git ls-files | grep py$ | xargs pyflakes
subcmds/stage.py:101: list comprehension redefines 'p' from line 63
subcmds/sync.py:784: list comprehension redefines 'p' from line 664
subcmds/upload.py:467: list comprehension redefines 'avail' from line 454
Change-Id: Ia65d1a72ed185ab3357e1a91ed4450c719e75a7c
With --force-broken it continue to fetch other projects but nothing
is added in directory because it abort some lines later.
Change-Id: I32c4a4619b3028893dc4f98e8d4e5bc5c09adb27
By default, shutil.rmtree raises an exception when deleting readonly
files on Windows.
Replace all shutil.rmtree with platform_utils.rmtree, which adds an
error handler to make files read-write when they can't be deleted.
Change-Id: I9cfea9a7b3703fb16a82cf69331540c2c179ed53
repo sync can sync submodules via the --fetch-submodules option.
However, if the manifest repo has submodules, those will not be synced.
Having submodules in the manifest repo -- while not commonly done -- can
be useful for inheriting a manifest from another project using <include>
and layering changes on top of it. In this way, you can avoid having to
deal with merge conflicts between your own manifests and the other
project's manifests (for example, if you're managing an Android fork).
Add a --submodule option to init that automatically syncs the submodules
in the manifest repo whenever the manifest repo changes.
Change-Id: I45d34f04517774c1462d7f233f482d1d81a332a8
Signed-off-by: Martin Kelly <mkelly@xevo.com>
repo can be configured to download from any number of remote git repos.
However when one fails repo doesn't report which one. Example:
Fatal: remote error: Daily ls-remote rate limit exceeded for IP xx.xx.xx.xx
TEST=repo init -q -u https://chromium.googlesource.com/chromiumos/manifest.git
# Apply patch in ./.repo/repo/
# Simulate a git remote error:
sed -i -e 's#chromiumos/docs#chromiumos/XXdocs#' .repo/manifests/full.xml
repo sync --quiet --force-sync docs
# error message now shows the remote URL
Optional test tip: reduce the time.sleep(random(...)) in ./.repo/repo/project.py
Change-Id: I4509383b6a43a8e66064778e8ed612d8a735c8b6
When repo syncs a manifest that utilizes multiple branches
in the same project, then the sync will use an extra
thread for each "duplicate". For example, if
the manifest includes the project "foo" and "bar"
twice, then "repo sync -jN" will fetch with N+2 threads.
This is caused by _FetchHelper() releasing the thread semaphore
object each time it's called, even though _FetchProjectList()
may call this function multiple times within the scope of a
single thread.
Fix by moving the thread semaphore release to
_FetchProjectList(), which is only called once per thread
instance.
Change-Id: I1da78b145e09524d40457db5ca5c37d315432bd8
When there's a symlink to a directory, os.walk still lists the symlink
in dirs, even if it isn't configured to follow symlinks. This will fail
the listdirs check if the symlink is broken (either before or during the
cleanup). So instead, check for directory symlinks and remove them using
os.remove.
Bug: Issue 231
Change-Id: I0ec45a26be566613a4a39bf694a3d9c6328481c2
When there are nested projects in a manifest, like on AOSP right now:
<project path="build" name="platform/build" />
<project path="build/blueprint" name="platform/build/blueprint" />
<project path="build/kati" name="platform/build/kati" />
<project path="build/soong" name="platform/build/soong" />
And the top "build" project is removed (or renamed to remove the
nesting), repo just wipes away everything under build/ and re-creates
the projects that are still there. But it only checks to see if the
build/ project is dirty, so if there are dirty files in a nested
project, they'll just be blown away, and a fresh worktree checked out.
Instead, behave similarly to how `git clean -dxf` behaves and preserve
any subdirectories that have git repositories in them. This isn't as
strict as git -- it does not check to see if the '.git' entry is a
readable gitdir, just whether an entry named '.git' exists.
If it encounters any errors removing files, we'll print them all out to
stderr and tell the user that we were unable to clean up the obsolete
project, that they should clean it up manually, then sync again.
Change-Id: I2f6a7dd205a8e0b7590ca5369e9b0ba21d5a6f77
The shared object stores confuse git and make it throw away objects which are
still in use. We'll avoid that problem by disabling automatic pruning on those
projects, but there's nothing preventing a user from changing the config back
or pruning a repository manually.
BUG=chromium:375945
TEST=Ran repo sync on fresh ChromeOS checkout, starting with a branch of repo
with this change. Verified that the kernel projects and no others were
identified as having shared object stores, and that repo successfully disabled
automatic pruning in their configs. Re-enabled pruning and ran repo sync just
on one of the kernel directories. Verified that pruning was re-disabled as a
result.
Change-Id: I728ed5b06f0087aeb5a23ba8f5410a7cd10af5b0
When repo sync is used with -f (--force-error) and a project fails to
sync, the sync will continue but then exit with an error status.
However if -n (--network-only) is also used, the exit code is 0, even
when a project failed.
Modify the logic to make sure the sync exits with the correct status.
Bug: Issue 214
Change-Id: I0b5d97a34642c5aa3743750ef14a42c9d5743c1d
By passing --prune to the sync command, the --prune option is
given to the `git fetch`, causing refs that no longer exist on
the remote to be removed.
Change-Id: I3cedacce14276d96ac2d5aabf2d07fd05e92bc02
The .gitcookies file generated by googlesource.com does not have
the header:
# (Netscape) HTTP Cookie File
which causes python's MozillaCookieJar.load to fail with the
error:
"does not look like a Netscape format cookies file"
Prepend the expected header onto the generated cookie file.
We don't bother to check if the header already exists on the
file; repeating it does not cause any problem.
Bug: Issue 207
Change-Id: I7d39720a1d36a6aae00f70691156514ebc04e579
This way any changes made to the main manifest are reflected in the gitc
manifest. It's also necessary to use both manifests to sync since the
information required to update the gitc manifest is actually in the repo
manifest.
This also fixes a few issues that came up when testing. notdefault
groups weren't being saved to the gitc manifest in a method that matched
'sync'. The merge branch wasn't always being set to the correct value
either.
Change-Id: I435235cb5622a048ffad0059affd32ecf71f1f5b
This way any changes made to the main manifest are reflected in the gitc
manifest. It's also necessary to use both manifests to sync since the
information required to update the gitc manifest is actually in the repo
manifest.
This also fixes a few issues that came up when testing. notdefault
groups weren't being saved to the gitc manifest in a method that matched
'sync'. The merge branch wasn't always being set to the correct value
either.
Change-Id: I5dbc850dd73a9fbd10ab2470ae4c40e46ff894de
Updates the repo launcher and gitc_utils to pull the manifest
directory location out of the gitc config file.
Change-Id: Id08381b8a7d61962093d5cddcb3ff6afbb13004b
Add repo start support for GITC checkouts. If the user is in
the GITC FS view, they can now run repo start to check out
the sources and create a new working branch.
When "repo start" is called on a GITC project, the revision
tag is set to an empty string and saved in a new tag:
old-revision. This tells the GITC filesystem to display the
local copy of the sources when being viewed. The local copy
is created by pulling the project sources and the new branch
is created based off the original project revision.
Updated main.py to setup each command's gitc_manifest when
appropriate.
Updated repo sync's logic to sync opened projects and
updating the GITC manifest file for the rest.
Change-Id: I7e4809d1c4fc43c69b26f2f1bebe45aab0cae628
Don't emit a message when the netrc file doesn't exist or couldn't
be opened.
Instead of trying to unpack the result of info.authenticators() and
catching the resulting TypeError when it's None, first store it to
a local and only unpack it if it has a value.
Also remove an unused import.
Change-Id: I5c404d91e48c261c1ab850c3e5f040c4f4c235cb
Use the same cookies and proxy that git traffic goes through for
persistent-http[s] to support authentication for smart-sync.
Change-Id: I20f4a281c259053a5a4fdbc48b1bca48e781c692
Add repo sync support for GITC checkouts. If the user is in the
GITC client directory they can still pull the sources as normal
if they pass in the --force-gitc argument. Otherwise the user
should call repo sync in the GITC view to update the user's
remote view. (This works because .repo in the GITC view will
link to .repo in the client config directory.)
Part of the support for this change is the refactoring of GITC
related code into gitc_utils.py.
Change-Id: I2636aaa50b450b6f091309db8dd0e8f4dbdad579
Previously repo would only print the failing project path if
Sync_NetworkHalf returned false/empty, but if it threw an
exception the print() was never called.
Change-Id: I58c41de43930df5e34b21561c205e062a72e290f
In some cases, a user may wish to continue with a sync even though
it would require overwriting an existing git directory. This behavior
is not safe as a default because it could result in the loss of some
user data, but as an optional flag it allows the user more flexibility.
To support this, add a --force-sync flag to the sync command that will
attempt to overwrite the existing git dir if it is specified and the
existing git dir points to the wrong obj dir.
Change-Id: Ieddda8ad54e264a1eb4a9d54881dd6ebc8a03833
When syncing with the -s or -t option, a smart_sync_override.xml file
is created. This file is left in the file system when syncing again
without the -s or -t option.
Remove the smart sync override manifest, if it exists, when not using
the -s or -t option.
Change-Id: I697a0f6405205ba5f84a4d470becf7cd23c07b4b
The error message only states that writing the manifest failed.
Include the exception message, so it's easier to track down the reason
that the write failed.
Change-Id: I06e942c48a19521ba45292199519dd0a8bdb1de7
In 2fb6466f79 an optimisation was
added to avoid fetching from remotes if the project is fixed to
a revision and the revision is already available locally.
This causes problems for users who expect all objects to be
fetched by default.
Change the logic so that the optimized behaviour is only enabled if
an option is explicitly given to repo sync.
Change-Id: I3b2794ddd8e0071b1787e166463cd8347ca9e24f
Use JSON as it is shown to be much faster than pickle.
Also clean up the loading and saving functions.
Change-Id: I45b3dee7b4d59a1c0e0d38d4a83b543ac5839390
The fetch logic is now shared between the jobs == 1 and
jobs > 1 cases. This refactoring also fixes a bug where
opts.force_broken was not honored when jobs > 1.
Change-Id: Ic886f3c3c00f3d8fc73a65366328fed3c44dc3be
This takes the wrapper importing code from main.py and moves it into
its own module so that other modules may import it without causing
circular imports with main.py.
Change-Id: I9402950573933ed6f14ce0bfb600f74f32727705
the value of Manifest.projects has changed from being the dictionary
to the values of the dictionary. Here we handle this change
correctly on a PostRepoUpgrade.
From a `git diff v1.12.7 -- manifest_xml.py`:
+ @property
def projects(self):
self._Load()
- return self._projects
+ return self._paths.values()
self._paths does contain the projects according to this line of
manifest_xml.py:
484 self._paths[project.relpath] = project
Change-Id: I141f8d5468ee10dfb08f99ba434004a307fed810
This significantly reduces sync time and used brandwidth as only
a tar of each project's revision is checked out, but git is not
accessible from projects anymore.
This is relevant when git is not needed in projects but sync
speed/brandwidth may be important like on CI servers when building
several versions from scratch regularly for example.
Archive is not supported over http/https.
Change-Id: I48c3c7de2cd5a1faec33e295fcdafbc7807d0e4d
Signed-off-by: Julien Campergue <julien.campergue@parrot.com>
* Add .decode('utf-8') where needed
* Add 'b' to `open` where needed, and remove where unnecessary
Change-Id: I0f03ecf9ed1a78e3b2f15f9469deb9aaab698657
It is often useful to be able to include the same project more than
once, but with different branches and placed in different paths in the
workspace. Add this feature.
This CL adds the concept of an object directory. The object directory
stores objects that can be shared amongst several working trees. For
newly synced repositories, we set up the git repo now to share its
objects with an object repo.
Each worktree for a given repo shares objects, but has an independent
set of references and branches. This ensures that repo only has to
update the objects once; however the references for each worktree are
updated separately. Storing the references separately is needed to
ensure that commits to a branch on one worktree will not change the
HEAD commits of the others.
One nice side effect of sharing objects between different worktrees is
that you can easily cherry-pick changes between the two worktrees
without needing to fetch them.
Bug: Issue 141
Change-Id: I5e2f4e1a7abb56f9d3f310fa6fd0c17019330ecd
When the RPC call fails, the error message returned by the server
is printed, but it is not obvious that this is caused by RPC call
failure.
Prefix the error message with a descriptive message that explains
what went wrong.
Change-Id: I4b77af22aacc2e9843c4df9d06bf54e41d9692ff
When syncing using smart sync or smart tag mode, print the url of
the manifest server that is being used.
This is useful in organisations that have multiple manifest servers
used in different manifest branches.
Change-Id: Ib5bc2de5af6f4a942d0ef735c65cbc0721059a61
* manifest_name was never set if opt.smart_sync or opt.smart_tag is used.
* Set it earlier, so that the code handles it correctly when it is None.
* An UnboundLocalError is raised if running `repo sync` without any options:
local variable 'manifest_name' referenced before assignment
* This fixes the above regression caused by commit
53a6c5d93a
Change-Id: I57086670f3589beea8461ce0344f6ec47ab85b7b
Revert "Fix "'module' object is not callable" error", and fix it properly.
* The urlparse module is renamed to urllib.parse in Python 3.
* This commit fixes the code to use "urllib.parse.urlparse"
instead of creating a new module urlib and setting
urlib.parse to urlparse.urlparse.
* Fixes an AttributeError:
'function' object has no attribute 'uses_relative'
This reverts commit cd51f17c64.
Change-Id: I48490b20ecd19cf5a6edd835506ea5a467d556ac
In a couple of files the urlparse method was not being set up
correctly for python < 3 and this resulted in an error being
thrown when trying to call it.
Change-Id: I4d2040ac77101e4e228ee225862f365ae3d96cec
This was broken in b2bd91c, which updated the manifest after it had
been overridden, which made it fall back to the original file (and
not the one from the manifest server).
This builds on 0766900 and overrides the manifest by the one
downloaded from the manifest server completely.
Change-Id: Ic3972390a68919b614616631d99c9e7a63c0e0db
Add a new module with methods for checking the Python version.
Instead of handling Python3 imports with try...except blocks, first
check the python version and then import the relevant modules. This
makes the code a bit cleaner and will result in less diff when/if we
remove support for Python < 3 later.
Use the same mechanism to handle `input` vs. `raw_input` and add
suppression of pylint warnings caused by redefinition of the built-in
method `input`.
Change-Id: Ia403e525b88d77640a741ac50382146e7d635924
Also-by: Chirayu Desai <cdesai@cyanogenmod.org>
Signed-off-by: Chirayu Desai <cdesai@cyanogenmod.org>
* Fix imports.
* Use python3 syntax.
* Wrap map() calls with list().
* Use list() only wherever needed.
(Thanks Conley!)
* Fix dictionary iteration methods
(s/iteritems/items/).
* Make use of sorted() in appropriate places
* Use iterators directly in the loop.
* Don't use .keys() wherever it isn't needed.
* Use sys.maxsize instead of sys.maxint
TODO:
* Make repo work fully with python3. :)
Some of this was done by the '2to3' tool [1], by
applying the needed fixes in a way that doesn't
break compatibility with python2.
Links:
[1]: http://docs.python.org/2/library/2to3.html
Change-Id: Ibdf3bf9a530d716db905733cb9bfef83a48820f7
Signed-off-by: Chirayu Desai <cdesai@cyanogenmod.org>
* Print project name if the "quiet" option is not used.
Change-Id: I99863bb50f66e4dcbaf2d170bdd05971f2a4e19a
Signed-off-by: Chirayu Desai <cdesai@cyanogenmod.org>
Several messages are printed with the `print` method and the message
is split across two lines, i.e.:
print('This is a message split'
'across two source code lines')
Which causes the message to be printed as:
This is a message splitacross two source code lines
Add a space at the end of the first line before the line break:
print('This is a message split '
'across two source code lines'
Also correct a minor spelling mistake.
Change-Id: Ib98d93fcfb98d78f48025fcc428b6661380cff79
Add an option to pass `--no-tags' to `git fetch'.
Change-Id: I4158cc369773e08e55a167091c38ca304a197587
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
If the current manifest is broken then "repo sync" fails because it
can't retrieve the default value for --jobs. Use 1 in this case, in
order that you can "repo sync" to get a fixed manifest (assuming someone
fixed it upstream).
Change-Id: I4262abb59311f1e851ca2a663438a7e9f796b9f6
(Previous submission of this change broke Android buildbot due to
incorrect regular expression for parsing git-config output. During
investigation, we also found that Android, which pulls Chromium, has a
workaround for Chromium's submodules; its manifest includes Chromium's
submodules. This new change, in addition to fixing the regex, also
take this type of workarounds into consideration; it adds a new
attribute that makes repo not fetch submodules unless submodules have a
project element defined in the manifest, or this attribute is
overridden by a parent project element or by the default element.)
We need a representation of git-submodule in repo; otherwise repo will
not sync submodules, and leave workspace in a broken state. Of course
this will not be a problem if all projects are owned by the owner of the
manifest file, who may simply choose not to use git-submodule in all
projects. However, this is not possible in practice because manifest
file owner is unlikely to own all upstream projects.
As git submodules are simply git repositories, it is natural to treat
them as plain repo projects that live inside a repo project. That is,
we could use recursively declared projects to denote the is-submodule
relation of git repositories.
The behavior of repo remains the same to projects that do not have a
sub-project within. As for parent projects, repo fetches them and their
sub-projects as normal projects, and then checks out subprojects at the
commit specified in parent's commit object. The sub-project is fetched
at a path relative to parent project's working directory; so the path
specified in manifest file should match that of .gitmodules file.
If a submodule is not registered in repo manifest, repo will derive its
properties from itself and its parent project, which might not always be
correct. In such cases, the subproject is called a derived subproject.
To a user, a sub-project is merely a git-submodule; so all tips of
working with a git-submodule apply here, too. For example, you should
not run `repo sync` in a parent repository if its submodule is dirty.
Change-Id: I4b8344c1b9ccad2f58ad304573133e5d52e1faef
Enable the following Pylint warnings:
C0322: Operator not preceded by a space
C0323: Operator not followed by a space
C0324: Comma not followed by a space
And make the necessary fixes.
Change-Id: I74d74283ad5138cbaf28d492b18614eb355ff9fe
The repo coding style is to indent at 2 characters, but there are
many places where this is not followed.
Enable pylint warning "W0311: Bad indentation" and make sure all
indentation is at multiples of 2 characters.
Change-Id: I68f0f64470789ce2429ab11104d15d380a63e6a8
The --manifest-server-* flags broke the smartsync subcmd since
the corresponding variables weren't getting set. This change
ensures that they will always be set, regardless of whether we are
using sync -s or smartsync.
Change-Id: I1b642038787f2114fa812ecbc15c64e431bbb829
This minimum version is required for the -c argument to set config on
the command line. Without this option, git by default uses as many
threads per invocation as there are CPUs, so we cannot safely
parallelize without hosing a system.
Change-Id: I8fd313dd84917658162b5134b2d9aa34a96f2772
Fixing some more pylint warnings:
W1401: Anomalous backslash in string
W0623: Redefining name 'name' from outer scope
W0702: No exception type(s) specified
E0102: name: function already defined line n
Change-Id: I5afcdb4771ce210390a79981937806e30900a93c
Previously, if a key was added, a client wouldn't add the key during
the sync step. This would cause issues if a new key were added and a
subsequent release were signed by that key.
Change-Id: I4fac317573cd9d0e8da62aa42e00faf08bfeb26c
We can't just let this run wild with a high (or even low) -j, since
that would hose a system. Instead, limit the total number of threads
across all git gc subprocesses to the number of CPUs reported by the
multiprocessing module (available in Python 2.6 and above).
Change-Id: Icca0161a1e6116ffa5f7cfc6f5faecda510a7fb9
Try to more accurately estimate which projects take the longest to
sync by keeping an exponentially weighted moving average (a=0.5) of
fetch times, rather than just recording the last observation. This
should discount individual outliers (e.g. an unusually large project
update) and hopefully allow truly slow repos to bubble to the top.
Change-Id: I72b2508cb1266e8a19cf15b616d8a7fc08098cb3
Some projects may consistently take longer to fetch than others, for
example a more active project may have many more Gerrit changes than a
less active project, which take longer to transfer. Use a simple
heuristic based on the last fetch time to fetch slower projects first,
so we do not tend to spend the end of the sync fetching a small number
of outliers.
This algorithm is probably not optimal, and due to inter-run latency
variance and Python thread scheduling, we may not even have good
estimates of a project sync time.
Change-Id: I9a463f214b3ed742e4d807c42925b62cb8b1745b
"except Exception as e" instead of "except Exception, e"
This is part of a transition to supporting Python 3. Python >= 2.6
support "as" syntax.
Note: this removes Python 2.5 support.
Change-Id: I309599f3981bba2b46111c43102bee38ff132803
We need a representation of git-submodule in repo; otherwise repo will
not sync submodules, and leave workspace in a broken state. Of course
this will not be a problem if all projects are owned by the owner of the
manifest file, who may simply choose not to use git-submodule in all
projects. However, this is not possible in practice because manifest
file owner is unlikely to own all upstream projects.
As git submodules are simply git repositories, it is natural to treat
them as plain repo projects that live inside a repo project. That is,
we could use recursively declared projects to denote the is-submodule
relation of git repositories.
The behavior of repo remains the same to projects that do not have a
sub-project within. As for parent projects, repo fetches them and their
sub-projects as normal projects, and then checks out subprojects at the
commit specified in parent's commit object. The sub-project is fetched
at a path relative to parent project's working directory; so the path
specified in manifest file should match that of .gitmodules file.
If a submodule is not registered in repo manifest, repo will derive its
properties from itself and its parent project, which might not always be
correct. In such cases, the subproject is called a derived subproject.
To a user, a sub-project is merely a git-submodule; so all tips of
working with a git-submodule apply here, too. For example, you should
not run `repo sync` in a parent repository if its submodule is dirty.
Change-Id: I541e9e2ac1a70304272dbe09724572aa1004eb5c
Fix the following issues reported by pylint:
C0321: More than one statement on a single line
W0622: Redefining built-in 'name'
W0612: Unused variable 'name'
W0613: Unused argument 'name'
W0102: Dangerous default value 'value' as argument
W0105: String statement has no effect
Also fixed a few cases of inconsistent indentation.
Change-Id: Ie0db839e7c57d576cff12d8c055fe87030d00744
Change 9bb1816b removed part of a block of code, but left the
remaining part unreachable. Remove it.
Change-Id: Icdc6061d00e6027df32dee9a3bad3999fe7cdcbc
Add two new command line options, -u/--manifest-server-username and
-p/--manifest-server-password, which can be used to specify a username
and password to authenticate to the manifest server when using the
-s/--smart-sync or -t/--smart-tag option.
If -u and -p are not specified when using the -s or -t option, use
authentication credentials from the .netrc file (if there are any).
Authentication credentials from -u/-p or .netrc are not used if the
manifest server specified in the manifest file already includes
credentials.
Change-Id: I6cf9540d28f6cef64c5694e8928cfe367a71d28d
When using the --smart-sync or --smart-tag option, and the specified
manifest server is hosted on a server that requires authentication,
repo sync fails with the error: HTTP 401 Unauthorized.
Add support for getting the credentials from the .netrc file.
If a .netrc file exists in the user's home directory, and it contains
credentials for the hostname of the manifest server specified in the
manifest, use the credentials to authenticate with the manifest server
using the URL syntax extension for Basic Authentication:
http://user:password@host:port/path
Credentials from the .netrc file are only used if the manifest server
URL specified in the manifest does not already include credentials.
Change-Id: I06e6586e8849d0cd12fa9746789e8d45d5b1f848
`R_HEADS` is imported twice, from both the git_refs and project
modules.
It is actually defined in git_refs, and in project it is imported
from there, so the import of `R_HEADS` from project in the sync
module is redundant. Remove it.
`HEAD` is imported from project, but like `R_HEADS` it is actually
defined in git_refs. Import it from git_refs instead.
Change-Id: I8e2b0217d0d9f9f4ee5ef5b8cd0b026174ac52f4
When connecting to the manifest server, exceptions can occur but
are not caught, resulting in the repo sync exiting with a python
traceback.
Add handling of the following exceptions:
- IOError, which can be raised for example if the manifest server
URL is malformed.
- xmlrpclib.ProtocolError, which can be raised if the connection
to the manifest server fails with HTTP error.
- xmlrpclib.Fault, which can be raised if the RPC call fails for
some other reason.
Change-Id: I3a4830aef0941debadd515aac776a3932e28a943
The threaded 'repo sync' implementation would very often freeze the
process when interrupted by the user with Ctrl-C. The only solution
being to kill -9 the process explicitly from another terminal.
The reason for this is best explained here:
http://snakesthatbite.blogspot.fr/2010/09/cpython-threading-interrupting.html
This patch makes all helper sync threads 'daemon', which allows the
process to terminate immediately on Ctrl-C.
Note that this will forcefully kill all threads in case of interruption; this
is generally a bad thing, but:
1/ This is equivalent to calling kill -9 in another terminal, which
is the _only_ thing that can currently stop the process.
2/ There doesn't seem to be a way to tell the worker threads to
gently stop when they are in a blocking operation anyway (even
in the non-threaded case).
+ Do the same for "repo status -j<count>".
Change-Id: Ieaf45b0eacee36f35427f8edafd87415c2aa7be4
Allows specifying a list of groups with a -g argument to repo init.
The groups act on a group= attribute specified on projects in the
manifest.
All projects are implicitly labelled with "default" unless they are
explicitly labelled "-default".
Prefixing a group with "-" removes matching projects from the list
of projects to sync.
If any non-inverted manifest groups are specified, the default label
is ignored.
Change-Id: I3a0dd7a93a8a1756205de1d03eee8c00906af0e5
Reviewed-on: https://gerrit-review.googlesource.com/34570
Reviewed-by: Shawn Pearce <sop@google.com>
Tested-by: Shawn Pearce <sop@google.com>
This parameter changes the manifest used by 'repo sync' for only
this execution. It should be useful for developers wishing to get
the repo temporarily into a known state, without clobbering their
existing manifest.
Tested by shifting Chrome OS between minilayout and full, and
between several release-builder-generated manifests.
Change-Id: I14194b665195b0e78f368d9ec8b8a83227af2627
There is also shortcuts in case if the "current branch" is
a persistent revision such as tag or sha1. We check if the
persistent revision is present locally and if it does - do
no fetch anything from the server.
This greately reduces sync time and size of the on-disk repo
Change-Id: I23c6d95185474ed6e1a03c836a47f489953b99be
If the manifest is updated and the default sync-j attribute
was modified, honor it during this sync session if the user
has not supplied a -j flag on the command line.
Change-Id: I127ee5c779e2bbbb40b30bddc10ec1fa704b3bf3
Signed-off-by: Shawn O. Pearce <sop@google.com>
This permits manifest authors to suggest a number of parallel
fetch operations against a remote server. For example, Gerrit
Code Review servers support queuing of requests and processes
them in first-in, first-out order. Running concurrent fetches
can utilize multiple CPUs on the Gerrit server, but will also
decrease overall operation latency by having the request put
into the queue ready to execute as soon as a CPU is free.
Change-Id: I3d3904acb6f63516bae4b071c510ad57a2afab18
Signed-off-by: Shawn O. Pearce <sop@google.com>
Each worker thread requires at least 3 file descriptors to run the
forked 'git fetch' child to operate against the local repository.
Mac OS X has the RLIMIT_NOFILE set to 256 by default, which means
a sync -j128 often fails when the workers run out of pipes within
the Python parent process.
Change-Id: I2cdb14621b899424b079daf7969bc8c16b85b903
Signed-off-by: Shawn O. Pearce <sop@google.com>
This is an evolution of 'smart-sync' that adds a new option, -t,
that allows you to specify a tag/label to use instead of the
"latest good build" on the current manifest branch which -s does.
Signed-off-by: Victor Boivie <victor.boivie@sonyericsson.com>
Change-Id: I8c20fd91104a6aafa0271d4d33f6c4850aade17e
Event.isSet was renamed to is_set in 2.6, but we should
use the earlier syntax to avoid breaking compatibility
with older Python installations.
Change-Id: I41888ed38df278191d7496c1a6eed15e881733f4
The bug that this is fixing is described here:
http://code.google.com/p/chromium-os/issues/detail?id=6813
This fix allows the helper threads to signal the main thread that they
saw an error. When the main thread sees the error, it will let all
existing threads finish, then exit with an error.
Change-Id: If3019bc6b0b3ab9304d49ed2eea53e9d57f3095a
Users may wind up with a lot of loose object content in projects they
don't frequently make changes in, but that are modified by others.
Since we bypass many git code paths that would have otherwise called
out to `git gc --auto`, its possible for these projects to have
their loose object database grow out of control. To help prevent
that, we now invoke it ourselves during the network half of sync.
Signed-off-by: Shawn O. Pearce <sop@google.com>
(cherry picked from commit 1875ddd47c)
Windows allows the environment to have unicode values.
This will cause Python to fail to execute the command.
Change-Id: I37d922c3d7ced0d5b4883f0220346ac42defc5e9
Signed-off-by: Shawn O. Pearce <sop@google.com>
This feature is used to convey information on a when a branch has
ceased development or if it is an experimental branch with a few
gotchas, etc.
You add it to your manifest XML by doing something like this:
<manifest>
<notice>
NOTE TO DEVELOPERS:
If you checkin code, you have to pinky-swear that it contains no bugs.
Anyone who breaks their promise will have tomatoes thrown at them in the
team meeting. Be sure to bring an extra set of clothes.
</notice>
<remote ... />
...
</manifest>
Carriage returns and indentation are relevant for the text in this tag.
This feature was requested by Anush Elangovan on the ChromiumOS team.
This adds a new flag -f/--force-broken that will allow the rest of
the sync process to continue instead of bailing when a particular
project fails to sync.
Change-Id: I23680f2ee7927410f7ed930b1d469424c9aa246e
Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Signed-off-by: Shawn O. Pearce <sop@google.com>
This patch does two things for being compatibile with
those Python which are built without threading support:
1. As the Python document and Shawn suggested, import dummy_threading
when the threading is not available.
2. Reserve the single threaded code and make it default.
In cases the --jobs does not work properly with dummy_threading,
we still have a safe fallback.
Change-Id: I40909ef8e9b5c22f315c0a1da9be38eed8b0a2dc
The manifest server doesn't want to have refs/heads passed to it, so
we need to strip that when the branch contains it.
Change-Id: I044f8a9629220e886fd5e02e3c1ac4b4bb6020ba
Do not error if a project is missing on the filesystem, is deleted
from manifest.xml, but still exists in project.list.
Change-Id: I1d13e435473c83091e27e4df571504ef493282dd
This option allows the user to specify a manifest server to use when
syncing. This manifest server will provide a manifest pegging each
project to a known green build. This allows developers to work on a
known good tree that is known to build and pass tests, preventing
failed builds to hamper productivity.
The manifest used is not "sticky" so as to allow subsequent
'repo sync' calls to sync to the tip of the tree.
Change-Id: Id0a24ece20f5a88034ad364b416a1dd2e394226d
This bug happens when a project gets added to the manifest, and
then is renamed. Users who happened to have run "repo sync" after
the project was added but before the rename happened will try to
read the data from the old project, as the manifest was only updated
after all projects were updated successfully.
If a line is blank in project.list, its not a relevant project path,
so skip over it. Existing project.list files may have blank lines if
sync was run with no projects at all, and the file was created empty.
Signed-off-by: Shawn O. Pearce <sop@google.com>
We have no working tree, so we cannot update the project.list
state file, nor should we try to delete a directory if a project is
removed from the manifest. Clients would still need the repository
for historical records.
Signed-off-by: Shawn O. Pearce <sop@google.com>
After a repo sync, some of the project paths might need
to be removed. This changes maintains a list of project
paths from the previous sync operation and compares.
The revisionExpr field now holds an expression from the manifest,
such as "refs/heads/master", while revisionId holds the current
commit-ish SHA-1 of the revisionExpr. Currently that is only
filled in if the manifest points directly to a SHA-1.
Signed-off-by: Shawn O. Pearce <sop@google.com>
Its unlikely that a new version of repo will be delivered in any
given day, so we now check only once every 24 hours to see if repo
has been updated. This reduces the sync cost, as we no longer need
to contact the repo distribution servers every time we do a sync.
repo selfupdate can still be used to force a check.
Signed-off-by: Shawn O. Pearce <sop@google.com>
We now try to sync all projects that can be done safely first, before
we start rebasing user commits over the upstream. This has the nice
effect of making the local tree as close to the upstream as possible
before the user has to start resolving merge conflicts, as that extra
information in other projects may aid in the conflict resolution.
Informational output is buffered and delayed until calculation for
all projects has been done, so that the user gets one concise list
of notice messages, rather than it interrupting the progress meter.
Fast-forward output is now prefixed with the project header, so the
user can see which project that update is taking place in, and make
some relation of the diffstat back to the project name.
Rebase output is now prefixed with the project header, so that if
the rebase fails, the user can see which project we were operating
on and can try to address the failure themselves.
Since rebase sits on a detached HEAD, we now look for an in-progress
rebase during sync, so we can alert the user that the given project
is in a state we cannot handle.
Signed-off-by: Shawn O. Pearce <sop@google.com>
Users may want to upgrade only repo to the latest release, but
leave their working tree state alone and avoid 'repo sync'.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This way users can see how much is left during fetch. Its
especially useful when most syncs are no-ops but there are
hundreds of repositories to poll.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This permits usage of 'repo sync' while offline, as we bypass the
network based portions of the code and do only the local sync.
An example use case might be:
repo sync -n ; # while we have network
... some time later ...
repo sync -l ; # while without network, come up to date
Signed-off-by: Shawn O. Pearce <sop@google.com>
The -d flag moves the project back to a detached HEAD state,
matching what is listed in the manifest. This can be useful to
set a client to something stable (or at least well-known), such as
before a sequence of 'repo download' commands are used to get some
changes for testing.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This makes it easier to update all repositories, without actually
impacting the working directory, or learning about how to use
`repo forall -c 'git fetch $REPO_REMOTE' `.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This is only meant to be passed through while repo upgrades itself
during a sync. It should never be something a user invokes on
their own.
Signed-off-by: Shawn O. Pearce <sop@google.com>
If a client was created with "repo init --mirror" then there are
no working directories present, and no files checked out. Using
a command like "repo status" in this context makes no sense, and
actually throws back a Pytyon traceback at the console when the
underlying commands fail out.
We now tag commands with the MirrorSafeCommand type if they are
able to be executed within a mirror directory safely. Using a
command in a mirror which lacks this base class results in a
useful error letting you know the command isn't supported.
Bug: REPO-14
Signed-off-by: Shawn O. Pearce <sop@google.com>
The mirror option downloads a complete forrest (as described by the
manifest) and creates a replica of the remote repositories rather
than a client working directory. This permits other clients to
sync off the mirror site.
A mirror can be positioned in a "DMZ", where the mirror executes
"repo sync" to obtain changes from the external upstream and
clients inside the protected zone operate off the mirror only,
and therefore do not require direct git:// access to the external
upstream repositories.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This hook is evaluated by `git gc --auto` to determine if it is a
good idea to execute a GC at this time, or defer it to some later
date. When working on a laptop its a good idea to avoid GC if you
are on battery power as the extra CPU and disk IO would consume a
decent amount of the charge.
The hook is the standard sample hook from git.git contrib/hooks,
last modified in git.git by 84ed4c5d117d72f02cc918e413b9861a9d2846d7.
I added the GPLv2 header to the script to ensure the license notice
is clear, as it does not match repo's own APLv2 license.
We only update hooks during initial repository creation or on
a repo sync. This way we don't incur huge overheads from the
hook stat operations during "repo status" or even the normal
"repo sync" cases.
Signed-off-by: Shawn O. Pearce <sop@google.com>