Since commit 454fdaf119 (v2.48), syncing a
manifest without any projects would result in:
Repo command failed: RepoUnhandledExceptionError
Number of processes must be at least 1
Bug: 377546300
Change-Id: Iaa2f6a3ac64542ad65a19c0eef449f53c09cae67
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/443442
Reviewed-by: Erik Elmeke <erik@haleytek.corp-partner.google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
Tested-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
With a large number of sync workers, the sync process may fail on
macOS due to connection errors. The root cause is that multiple
workers may attempt to connect to the multiprocessing manager server
at the same time when handling the first job. This can lead to
connection failures if there are too many pending connections, exceeding
the socket listening backlog.
Bug: 377538810
Change-Id: I1924d318d076ca3be61d75daa37bfa8d7dc23ed7
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/441541
Tested-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
The command _PostRepoFetch will try to self update
during repo sync. That is beneficial but adds
version uncertainty, fail potential and slow downs
in non-interactive scenarios.
Conditionally skip the update if env variable
REPO_SKIP_SELF_UPDATE is defined.
A call to selfupdate works as before, meaning even
with the variable set, it will run the update.
Change-Id: Iab0ef55dc3d3db3cbf1ba1f506c57fbb58a504c3
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/439967
Tested-by: Fredrik de Groot <fredrik.de.groot@haleytek.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Background:
- Manifest object is large (for projects like Android) in terms of
serialization cost and size (more than 1mb).
- Lots of Project objects usually share only a few manifest objects.
Before this CL, Project objects were passed to workers via function
parameters. Function parameters are pickled separately (in chunk). In
other words, manifests are serialized again and again. The major
serialization overhead of repo sync was
O(manifest_size * projects / chunksize)
This CL uses following tricks to reduce serialization overhead.
- All projects are pickled in one invocation. Because Project objects
share manifests, pickle library remembers which objects are already
seen and avoid the serialization cost.
- Pass the Project objects to workers at worker intialization time.
And pass project index as function parameters instead. The number of
workers is much smaller than the number of projects.
- Worker init state are shared on Linux (fork based). So it requires
zero serialization for Project objects.
On Linux (fork based), the serialization overhead is
O(projects) --- one int per project
On Windows (spawn based), the serialization overhead is
O(manifest_size * min(workers, projects))
Moreover, use chunksize=1 to avoid the chance that some workers are idle
while other workers still have more than one job in their chunk queue.
Using 2.7k projects as the baseline, originally "repo sync" no-op
sync takes 31s for fetch and 25s for checkout on my Linux workstation.
With this CL, it takes 12s for fetch and 1s for checkout.
Bug: b/371638995
Change-Id: Ifa22072ea54eacb4a5c525c050d84de371e87caa
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/439921
Tested-by: Kuang-che Wu <kcwu@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Kuang-che Wu <kcwu@google.com>
With 551285fa35, the comment about number
of workers no longer stands - dict is shared among multiprocesses and
real time information is available.
Using 2.7k projects as the baseline, using chunk size of 4 takes close
to 5 minutes. A chunk size of 32 takes this down to 40s - a reduction of
rougly 8 times which matches the increase.
R=gavinmak@google.com
Bug: b/371638995
Change-Id: Ida5fd8f7abc44b3b82c02aa0f7f7ae01dff5eb07
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/438523
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Tested-by: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Gavin Mak <gavinmak@google.com>
When using the smart sync option, we try to construct the target that
was "lunched" from the TARGET_PRODUCT and TARGET_BUILD_VARIANT envvars.
However, an android target is now made of three parts,
{TARGET_PRODUCT}-{TARGET_RELEASE}-{TARGET_BUILD_VARIANT}.
I am leaving the option of creating a target if a TARGET_RELEASE is not
specified in case there are other consumers who depend on that option.
BUG=b:358101714
TEST=./run_tests
TEST=smart sync on android repo and manually inspecting
smart_sync_override.xml
Change-Id: I556137e33558783a86a0631f29756910b4a93d92
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/436977
Tested-by: Yiwei Zhang <yiwzhang@google.com>
Reviewed-by: Yiwei Zhang <yiwzhang@google.com>
Commit-Queue: Yiwei Zhang <yiwzhang@google.com>
Previously repo would abort a sync if there were published changes not
merged upstream. The --rebase option allows the modification of
published commits.
This is a copy of http://go/grev/369694 with the merge conflicts
resolved.
Bug: 40014610
Change-Id: Idac8199400346327b530abea33f1ed794e5bb4c2
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/435838
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Tested-by: Jeroen Dhollander <jeroendh@google.com>
Commit-Queue: Jeroen Dhollander <jeroendh@google.com>
The current logic to create checkout layers doesn't work in all cases.
For example, let's assume there are three projects: "foo", "foo/bar" and
"foo-bar". Sorting lexicographical order is incorrect as foo-bar would
be placed between foo and foo/bar, breaking layering logic.
Instead, we split filepaths based using path delimiter (always /) and
then use lexicographical sort.
BUG=b:325119758
TEST=./run_tests, manual sync on chromiumos repository
Change-Id: I76924c3cc6ba2bb860d7a3e48406a6bba8f58c10
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/412338
Tested-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: George Engelbrecht <engeg@google.com>
In some cases (e.g. in a CI system), it's desirable to be able to
instruct repo to force checkout. This flag passes --force flag to `git
checkout` operations.
Bug: b/327624021
Change-Id: I579edda546fb8147c4e1a267e2605fcf6e597421
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/411518
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Gavin Mak <gavinmak@google.com>
Reviewed-by: George Engelbrecht <engeg@google.com>
Tested-by: Josip Sokcevic <sokcevic@google.com>
If a repo manifest is updated so that project B is placed within a
project A, and if project A had content in new B's location in the old
checkout, then repo sync could break depending on checkout order, since
B can't be checked out before A.
This change introduces checkout levels which enforces right sequence of
checkouts while still allowing for parallel checkout. In an example
above, A will always be checked out first before B.
BUG=b:325119758
TEST=./run_tests, manual sync on ChromeOS repository
Change-Id: Ib3b5e4d2639ca56620a1e4c6bf76d7b1ab805250
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/410421
Tested-by: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Greg Edelston <gredelston@google.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Gavin Mak <gavinmak@google.com>
Prior to this change RepoChangedException would be caught and re-rasied
as a different exception. This would prevent RepoChangedException
handler from running in main.py
Bug: b/323232806
Change-Id: I9055ff95d439d6ff225206c5bf1755cc718bcfcc
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/407144
Tested-by: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
UpdateManifestError inherits from RepoExitError which inherits
from BaseRepoError. None of them takes project as kwargs
causing the error like "UpdateManifestError() takes no keyword
arguments" in b/317183321
[1]: https://gerrit.googlesource.com/git-repo/+/449b23b698d7d4b13909667a49a0698eb495eeaa/error.py#144
Bug: b/317183321
Change-Id: I64c3dc502027f9dda56a0824f2712364b4502934
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/398997
Commit-Queue: Yiwei Zhang <yiwzhang@google.com>
Tested-by: Yiwei Zhang <yiwzhang@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Jason Chang <jasonnc@google.com>
There are still some verbose messages (e.g. "remote: ...") when doing
repo sync after a couple days. Let's hide them behind verbose flag.
Bug: N/A
Test: repo sync
Change-Id: I1408472c95ed80d9555adfe8f92211245c03cf41
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/400855
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Tested-by: Tomasz Wasilczyk <twasilczyk@google.com>
Commit-Queue: Tomasz Wasilczyk <twasilczyk@google.com>
Most times a repo sync after some time (week+) results in a bunch of
messages, which are not very useful for average user:
- discarding 1 commits
- Deleting obsolete checkout.
Bug: N/A
Test: repo sync
Change-Id: I881eab61f9f261e98f3656c09e73ddd159ce288c
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/397038
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Tested-by: Tomasz Wasilczyk <twasilczyk@google.com>
In the case of a project being removed from the manifest, and in the
path in which the project used to exist, and symlink is place to another
project repo will start to warn about partial syncs when a partial sync
did not occur.
Repro steps:
1) Create a manifest with two projects. Project a -> a/ and project b -> b/
2) Run `repo sync`
3) Remove project b from the manifest.
4) Use `link` in the manifest to link all of Project a to b/
Bug: 314161804
Change-Id: I4a4ac4f70a7038bc7e0c4e0e51ae9fc942411a34
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/395640
Reviewed-by: Gavin Mak <gavinmak@google.com>
Tested-by: Matt Schulte <matsch@google.com>
Commit-Queue: Gavin Mak <gavinmak@google.com>
- Bump minimum version to Python 3.6.
- Use f-strings in a lot of places.
Change-Id: I2aa70197230fcec2eff8e7c8eb754f20c08075bb
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/389034
Tested-by: Jason R. Coombs <jaraco@google.com>
Reviewed-by: Mike Frysinger <vapier@google.com>
Commit-Queue: Jason R. Coombs <jaraco@google.com>
Found via pylint:
W0231: __init__ method from base class 'Transport'
is not called (super-init-not-called)
Just fixed for code correctness and to avoid potential future bugs.
Change-Id: Ie1e723c2afe65d026d70ac01a16ee7a40c149834
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/390676
Reviewed-by: Mike Frysinger <vapier@google.com>
Tested-by: Daniel Kutik <daniel.kutik@lavawerk.com>
Commit-Queue: Daniel Kutik <daniel.kutik@lavawerk.com>
When a new shared project is added to manifest, there's a short window
where objects can be deleted that are used by other projects.
To close that window, set preciousObjects during git init. For
non-shared projects, repo should correct the state in the same execution
instance.
Bug: 288102993
Change-Id: I366f524535ac58c820d51a88599ae2108df9ab48
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/390234
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Tested-by: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Mike Frysinger <vapier@google.com>
In Python 3, these exceptions were merged into OSError, so switch
everything over to that.
Change-Id: If876a28b692de5aa5c62a3bdc8c000793ce52c63
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/390376
Reviewed-by: Aravind Vasudevan <aravindvasudev@google.com>
Commit-Queue: Mike Frysinger <vapier@google.com>
Tested-by: Mike Frysinger <vapier@google.com>
If a KeyboardInterrupt is encountered before an error is aggregated then
the context surrounding the interrupt is lost. This change aggregates
errors as soon as possible for the sync command
Bug: b/293344017
Change-Id: Iac14f9d59723cc9dedbb960f14fdc1fa5b348ea3
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/384974
Tested-by: Jason Chang <jasonnc@google.com>
Commit-Queue: Jason Chang <jasonnc@google.com>
Reviewed-by: Gavin Mak <gavinmak@google.com>
Google Python style guide says to import modules.
Clean up all our stdlib imports. Leave the repo ones alone
for now as that's a much bigger shave.
Change-Id: Ida42fc2ae78b86e6b7a6cbc98f94ca04b295f8cc
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/383714
Reviewed-by: Gavin Mak <gavinmak@google.com>
Commit-Queue: Mike Frysinger <vapier@google.com>
Tested-by: Mike Frysinger <vapier@google.com>
The repo project is fetched at most once a day and should be ignored
when checking if the tree is partially synced.
Bug: b/286126621, b/271507654
Change-Id: I684ed1669c3b3b9605162f8cc9d57185bb3dfe8e
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/383494
Commit-Queue: Gavin Mak <gavinmak@google.com>
Tested-by: Gavin Mak <gavinmak@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Partial syncs are not supported and can lead to strange behavior like
deleting files. Explicitly warn users on partial sync.
Bug: b/286126621, b/271507654
Change-Id: I471f78ac5942eb855bc34c80af47aa561dfa61e8
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/382154
Reviewed-by: Jason Chang <jasonnc@google.com>
Reviewed-by: Aravind Vasudevan <aravindvasudev@google.com>
Tested-by: Gavin Mak <gavinmak@google.com>
Commit-Queue: Gavin Mak <gavinmak@google.com>
Reviewed-by: Mike Frysinger <vapier@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Per discussion in go/repo-error-update updated aggregated and exit
errors for sync command.
Aggregated errors are errors that result in eventual command failure.
Exit errors are errors that result in immediate command failure.
Also updated main.py to log aggregated and exit errors to git sessions
log
Bug: b/293344017
Change-Id: I77a21f14da32fe2e68c16841feb22de72e86a251
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/379614
Reviewed-by: Aravind Vasudevan <aravindvasudev@google.com>
Tested-by: Jason Chang <jasonnc@google.com>
Commit-Queue: Jason Chang <jasonnc@google.com>
Save the latest time any project is fetched and checked out. This will
be used to detect partial checkouts.
Bug: b/286126621
Change-Id: I53b264dc70ba168d506076dbd693ef79a696b61d
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/380514
Commit-Queue: Gavin Mak <gavinmak@google.com>
Reviewed-by: Joanna Wang <jojwang@google.com>
Tested-by: Gavin Mak <gavinmak@google.com>
New vs existing project may be a useful measure for analyzing
sync performance.
Bug: b/287105597
Change-Id: Ibea3e90c9fe3d16fd8b863bcae22b21963a6771a
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/377574
Tested-by: Jason Chang <jasonnc@google.com>
Reviewed-by: Joanna Wang <jojwang@google.com>
By chance, _sync_dict can be empty even though repo sync is still
working. In that case, the progress message shows incorrect info. Handle this case and fix a bug where "0 jobs" can show.
Bug: http://b/284465096
Change-Id: If915d953ba60e7cf84a6fb2d137fd6ed82abd3cc
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/375494
Tested-by: Gavin Mak <gavinmak@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Gavin Mak <gavinmak@google.com>
It's possible that number of jobs is more than 0 when we
check length, but in the meantime number of jobs drops to
0. In that case, we are working with float(inf) which
causes other problems
Bug: 284383869
Change-Id: I5d070d1be428f8395df7fde8ca84866db46f2100
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/375134
Reviewed-by: Mike Frysinger <vapier@google.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Tested-by: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Gavin Mak <gavinmak@google.com>
An investigation go/git-repo-shallow shows a number of problems
when doing a shallow git fetch/clone. This change introduces an
environment variable REPO_ALLOW_SHALLOW. When this environment variable
is set to 1 during a repo init or repo sync all shallow git fetch
commands are replaced with partial fetch commands. Any shallow
repository needing update is unshallowed. This behavior continues until
a subsequent repo sync command is run with REPO_ALLOW_SHALLOW set to 1.
Bug: b/274340522
Change-Id: I1c3188270629359e52449788897d9d4988ebf280
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/374754
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Tested-by: Jason Chang <jasonnc@google.com>
Last of the recent `repo sync` UX changes. Show number of fetch jobs eg:
"Fetching: 3% (8/251) 0:03 | 8 jobs | 0:01 chromiumos/overlays/chrom.."
Bug: https://crbug.com/gerrit/11293
Change-Id: I1b3dcf3e56ae6731c6c6cb73cfce069b2f374b69
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/374920
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Gavin Mak <gavinmak@google.com>
Tested-by: Gavin Mak <gavinmak@google.com>
Reviewed-by: Joanna Wang <jojwang@google.com>
"Last synced: X" is printed only after a project finishes syncing.
Replace that with a message that shows the longest actively syncing
project.
Bug: https://crbug.com/gerrit/11293
Change-Id: I84c7873539d84999772cd554f426b44921521e85
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/372674
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Gavin Mak <gavinmak@google.com>
Reviewed-by: Joanna Wang <jojwang@google.com>
Tested-by: Gavin Mak <gavinmak@google.com>
Trace logs emitted from repo are not useful on error for many critical
commands. This change adds errors for critical commands to trace logs.
Change-Id: Ideb9358bee31e540bd84a94327a09ff9b0246a77
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/373814
Reviewed-by: Joanna Wang <jojwang@google.com>
Tested-by: Josip Sokcevic <sokcevic@google.com>
Commit-Queue: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Gavin Mak <gavinmak@google.com>
Give users an indication that `repo sync` isn't stuck if taking a long
time to fetch.
Bug: https://crbug.com/gerrit/11293
Change-Id: Iccdaec918f86c9cc2db5dc12f9e3eef7ad0bcbda
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/371414
Tested-by: Gavin Mak <gavinmak@google.com>
Reviewed-by: Josip Sokcevic <sokcevic@google.com>
Reviewed-by: Joanna Wang <jojwang@google.com>
Commit-Queue: Gavin Mak <gavinmak@google.com>
Apply rules set by https://gerrit-review.googlesource.com/c/git-repo/+/362954/ across the codebase and fix any lingering errors caught
by flake8. Also check black formatting in run_tests (and CQ).
Bug: b/267675342
Change-Id: I972d77649dac351150dcfeb1cd1ad0ea2efc1956
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/363474
Reviewed-by: Mike Frysinger <vapier@google.com>
Tested-by: Gavin Mak <gavinmak@google.com>
Commit-Queue: Gavin Mak <gavinmak@google.com>
If interrupt signal is sent to repo process while sync is running, repo
prints stack trace for each concurrent job that is currently running
with no useful information.
Instead, this change captures KeyboardInterrupt in each process and
prints one line about current project that is being processed.
Change-Id: Ieca760ed862341939396b8186ae04128d769cd56
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/357135
Reviewed-by: Joanna Wang <jojwang@google.com>
Tested-by: Josip Sokcevic <sokcevic@google.com>