An HTTP (or HTTPS) based remote server may now offer a 'clone.bundle'
file in each repository's Git directory. Over an http:// or https://
remote, repo will first ask for '$URL/clone.bundle' and, if present,
download it to bootstrap the local client, rather than relying
on the native Git transport to initialize the new repository.
Bundles may be hosted elsewhere. The client automatically follows an
HTTP 302 redirect to acquire the bundle file. This allows servers
to direct clients to cached copies residing on content delivery
networks, where the bundle may be closer to the end-user.
Bundle downloads are resumable from where they last left off,
allowing clients to initialize large repositories even when the
connection gets interrupted.
If a bundle does not exist for a repository (an HTTP 404 response
code is returned for '$URL/clone.bundle'), the native Git transport
is used instead. If the client is performing a shallow sync, the
bundle transport is not used, as there is no way to embed shallow
data into the bundle.
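The decision logic described above can be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, and the use of an HTTP Range header for resuming is an assumption about how a resumable download could work, not something stated above.

```python
from typing import Optional


def bundle_url(remote_url: str) -> str:
    """Return the URL where a clone.bundle would be offered."""
    return remote_url.rstrip("/") + "/clone.bundle"


def should_try_bundle(remote_url: str, depth: Optional[int]) -> bool:
    """Bundles apply only to http(s) remotes and to non-shallow syncs."""
    if depth is not None:
        # Shallow sync: shallow data cannot be embedded in a bundle.
        return False
    return remote_url.startswith(("http://", "https://"))


def resume_headers(bytes_on_disk: int) -> dict:
    """Assumed resume mechanism: ask only for the bytes not yet on disk."""
    if bytes_on_disk > 0:
        return {"Range": "bytes=%d-" % bytes_on_disk}
    return {}
```

A caller would fall back to the native Git transport whenever should_try_bundle() returns False or the server answers 404 for the bundle URL.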
Change-Id: I05dad17792fd6fd20635a0f71589566e557cc743
Signed-off-by: Shawn O. Pearce <sop@google.com>
Teach repo how to resolve URLs using the url.insteadof feature
that C Git natively uses during clone, fetch or push. This will
later allow repo to resolve a URL before accessing it directly.
We do not want to pre-resolve things and store the resolved URL
into individual projects, as this makes it impossible for the
user to undo the insteadof mapping at a later date.
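A minimal sketch of resolving a URL at access time, assuming the rewrite rules have already been read into a dict (the function name and the dict layout are hypothetical). As in C Git, the longest matching insteadOf prefix wins.

```python
def resolve_url(url, rewrites):
    """Apply url.<base>.insteadOf rules to url.

    rewrites maps each base URL to the list of prefixes it replaces.
    The stored URL is never modified; only the returned copy is rewritten.
    """
    best_len = -1
    resolved = url
    for base, prefixes in rewrites.items():
        for prefix in prefixes:
            if url.startswith(prefix) and len(prefix) > best_len:
                best_len = len(prefix)
                resolved = base + url[len(prefix):]
    return resolved
```

Because the project keeps the original URL, removing the insteadOf entry later restores the old behavior.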
Change-Id: I0f62e811197c53fbc8a8be424e3cabf4ed07b4cb
Signed-off-by: Shawn O. Pearce <sop@google.com>
If the http_proxy environment variable is set, honor it during
the entire repo session for any Python-created HTTP connections.
Change-Id: Ib4ae833cb2cdd47ab0126949f6b399d2c142887d
Signed-off-by: Shawn O. Pearce <sop@google.com>
'repo upload' makes HTTP requests using the urllib2 Python library.
Unfortunately, this library does not work (by default) when the
user is behind a proxy.
This change adds a proxy handler when the 'http_proxy' environment
variable is set.
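For illustration, a proxy handler can be installed along these lines. The sketch uses urllib.request, the Python 3 successor of urllib2, and the function name is hypothetical; it is not the code from this change.

```python
import urllib.request


def build_opener_from_env(environ):
    """Return an opener that routes HTTP through $http_proxy when it is set."""
    handlers = []
    proxy = environ.get("http_proxy")
    if proxy:
        handlers.append(urllib.request.ProxyHandler({"http": proxy}))
    return urllib.request.build_opener(*handlers)
```

Passing os.environ as the argument gives the behavior described above; taking the mapping as a parameter just keeps the sketch easy to exercise.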
Change-Id: Ic4176ad733fc21bd5b59661b3eacc2f0a7c3c1ff
Instead of giving a Python backtrace when there is a connectivity
problem during repo upload, report that we cannot access the host,
and why, with a halfway decent error message.
Bug: REPO-45
Change-Id: I9a45b387e86e48073a2d99bd6d594c1a7d6d99d4
Signed-off-by: Shawn O. Pearce <sop@google.com>
(cherry picked from commit d2dfac81ad)
This fixes the SSH Control Masters to be managed in a thread-safe
fashion. This is important because "repo sync -jN" uses threads to
sync more than one repository at the same time. The problem didn't
show up earlier because it was masked if all of the threads tried to
connect to the same host that was used on the "repo init" line.
os.remove() raises OSError if the file being removed doesn't exist.
Check before calling to ensure we don't raise a useless exception
on an already deleted file.
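Both fixes can be sketched as below. The names are hypothetical, and the key format ("host:port") is an assumption made only for this example.

```python
import os
import threading

_master_lock = threading.Lock()
_master_keys = set()


def claim_master(key):
    """Return True only for the first thread that claims this host key."""
    with _master_lock:
        if key in _master_keys:
            return False
        _master_keys.add(key)
        return True


def remove_if_exists(path):
    """Delete path, doing nothing if the file is already gone."""
    if os.path.exists(path):
        os.remove(path)
```

Only the thread for which claim_master() returns True would go on to start a control master for that host.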
Change-Id: I44c1c7dd97a47fcab8afb6c18fdf179158b6dab7
Signed-off-by: Shawn O. Pearce <sop@google.com>
Be more thorough about checking for an existing ssh master by
running a test command first, and only opening up a new master
if the test fails to connect.
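A rough sketch of the check-then-open order, assuming OpenSSH's "-O check" control command is the test and that "run" is a caller-supplied function (hypothetical) which executes a command and returns its exit status. A real implementation would start the master in the background rather than wait on it.

```python
def ensure_master(host, control_path, run):
    """Open a new control master only if no existing one answers.

    Returns True when a new master was started, False when one was reused.
    """
    control = "ControlPath=" + control_path
    check = ["ssh", "-o", control, "-O", "check", host]
    if run(check) == 0:
        return False  # an existing master answered the test command
    start = ["ssh", "-M", "-N", "-o", control, host]
    run(start)
    return True
```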
Change-Id: I56fe8e7b4dbc123675b7f259e81d359ed0cd55cf
Signed-off-by: Shawn O. Pearce <sop@google.com>
Some users might need to use a different login name than the local
part of their email address for their Gerrit Code Review user
account. Allow it to be overridden with the review.HOST.username
configuration variable.
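The lookup order can be sketched as follows, assuming the configuration is available as a plain dict (the function name is hypothetical):

```python
def review_username(config, host, email):
    """Prefer review.HOST.username; else use the local part of the email."""
    override = config.get("review.%s.username" % host)
    if override:
        return override
    return email.split("@", 1)[0]
```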
Change-Id: I714469142ac7feadf09fee9c26680c0e09076b75
Signed-off-by: Shawn O. Pearce <sop@google.com>
This change allows local SSH configuration to choose the port number
to use when not explicitly set in the manifest.
(cherry picked from commit 4c0f670465)
Change-Id: Ibea99cfe46b6a2cc27f754cc3944a2fe10f6fda4
If the SSH control master process is killed while an active git
fetch is using its network socket, the underlying SSH client may
not realize the connection was broken. This can lead to both the
client and the server waiting indefinitely for network messages
which will never be sent.
Work around the problem by keeping track of any processes that use
the tunnels we establish. If we are about to kill any of the SSH
control masters that we started, ensure the clients using them are
successfully killed first.
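The shutdown order can be sketched like this. The class is hypothetical and assumes each tracked process offers terminate() and wait(), as subprocess.Popen does.

```python
class TunnelRegistry:
    """Track control masters and the client processes that use them."""

    def __init__(self):
        self._masters = []
        self._clients = []

    def add_master(self, proc):
        self._masters.append(proc)

    def add_client(self, proc):
        self._clients.append(proc)

    def close_all(self):
        # Clients go first so none is left waiting on a dead tunnel.
        for proc in self._clients + self._masters:
            proc.terminate()
            proc.wait()
        self._clients = []
        self._masters = []
```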
Change-Id: Ida6c124dcb0c6a26bf7dd69cba2fbdc2ecd5b2fc
Signed-off-by: Shawn O. Pearce <sop@google.com>
Repo can now properly handle url.insteadOf sections in the
user's ~/.gitconfig file. This means that a user can now enjoy
the master-ssh functionality even if they use insteadOf rules in
~/.gitconfig to rewrite git:// URLs to ssh:// style URLs.
Change-Id: Ic0f04a9c57206a7b89eb0f10bf188c4c483debe3
Signed-off-by: Shawn O. Pearce <sop@google.com>
If a file (e.g. ~/.gitconfig) does not exist, we get None
here rather than a string. NoneType lacks rstrip() so we
cannot strip it.
Signed-off-by: Shawn O. Pearce <sop@google.com>
A git-config entry with no value was preventing repo
from initializing. This modifies _ReadGit() to handle
config entries with empty values.
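A sketch of tolerant parsing, assuming the input is the output of `git config --null --list`, where entries end with NUL and a newline separates key from value. The function name is hypothetical and this is not the actual _ReadGit() code.

```python
def parse_config_list(data):
    """Parse NUL-separated config entries into {key: [values]}.

    An entry without a newline has no value; it is recorded as None.
    """
    result = {}
    for entry in data.split("\0"):
        if not entry:
            continue
        if "\n" in entry:
            key, value = entry.split("\n", 1)
        else:
            key, value = entry, None
        result.setdefault(key, []).append(value)
    return result
```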
Signed-off-by: David Aguilar <davvid@gmail.com>
Reported-by: Josh Guilfoyle <jasta00@gmail.com>
If the SSH client terminated abnormally in the background (e.g. the
server shut down while we were doing a sync) then the pid won't exist.
Instead of crashing, ignore it; the result we wanted (a non-orphaned
ssh process) is already achieved.
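In Python 3 terms the fix might look like the sketch below. The function name and the injectable "kill" parameter are hypothetical; the parameter exists only so the behavior can be exercised without real processes.

```python
import os
import signal


def kill_quietly(pid, kill=os.kill):
    """Send SIGTERM to pid; return False if the process no longer exists."""
    try:
        kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        # Already gone, which is the outcome we wanted anyway.
        return False
    return True
```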
Signed-off-by: Shawn O. Pearce <sop@google.com>
If the pickle config file is 0 bytes in length, we may have
crashed (or been aborted) while writing the file out to disk.
Instead of crashing with a backtrace, just treat the file as
though it wasn't present and load off a `git config` fork.
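A minimal sketch of the guard, with a hypothetical function name. Returning None stands for "no usable cache", at which point the caller would fall back to running `git config`.

```python
import os
import pickle


def load_cache(path):
    """Return the cached config, or None if the file is missing or empty."""
    try:
        if os.path.getsize(path) == 0:
            # Likely an interrupted write; treat it as absent.
            return None
        with open(path, "rb") as fd:
            return pickle.load(fd)
    except OSError:
        return None
```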
Signed-off-by: Shawn O. Pearce <sop@google.com>
As users on repo-discuss noticed, we were missing a 'return False'
here to signal that the SSH control master was not used to set up the
network connection.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This way we can put it in a different directory from the config file
itself, e.g. hide it inside ".git" when parsing a ".gitmodules"
file from the working tree.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This can be useful when pulling apart a configuration file, like
finding all entries which match submodule.*.*.
Signed-off-by: Shawn O. Pearce <sop@google.com>
I only tested this with ssh://hostname/ style URLs, so I failed
to test ssh://user@hostname/ format, which failed if the hostname
portion was longer than 1 character.
Signed-off-by: Shawn O. Pearce <sop@google.com>
If the SSH URL doesn't contain a port number, but uses the ssh://
or git+ssh:// syntax we raised a Python runtime error due to the
'port' local variable not being assigned a value. Default it to
the IANA assigned port for SSH, 22.
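One way to cover both the optional user and the optional port is a single pattern, sketched below. The regular expression and function name are illustrative, not the ones used in repo.

```python
import re

# Optional "git+" prefix, optional "user@", host, optional ":port".
_SSH_URL = re.compile(r"^(?:git\+)?ssh://(?:([^@/]+)@)?([^:/]+)(?::(\d+))?/")


def parse_ssh_url(url):
    """Return (user, host, port), or None if url is not an SSH URL."""
    match = _SSH_URL.match(url)
    if match is None:
        return None
    user, host, port = match.groups()
    # 22 is the IANA-assigned SSH port, used when the URL names none.
    return user, host, int(port) if port else 22
```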
Signed-off-by: Shawn O. Pearce <sop@google.com>
By creating a background ssh "control master" process which lives
for the duration of our sync cycle we can easily cut the time for
a no-op sync of 132 projects from 60s to 18s.
Bug: REPO-11
Signed-off-by: Shawn O. Pearce <sop@google.com>
It's quite common for most projects to match the current
manifest revision, as most developers only modify one or two projects
at any one time. We can speed up `repo start foo` (that impacts
the entire client) by performing most of the branch creation and
switch operations in pure Python, and thus avoid 4 forks per project.
Signed-off-by: Shawn O. Pearce <sop@google.com>
The value of the variable TRACE was copied during the import, which
happens before the --trace option can be processed. So instead we
now use a function to determine if the value is set, as the function
can be safely copied early during import.
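The difference can be demonstrated in a single file. The names here are hypothetical stand-ins for the real module contents.

```python
_trace = False


def is_trace():
    """Read the flag at call time rather than at import time."""
    return _trace


def set_trace():
    global _trace
    _trace = True


# What an early "from module import NAME" effectively does:
copied_flag = _trace     # a snapshot of the value, frozen at False
copied_func = is_trace   # a reference to the function, still live

set_trace()              # e.g. after --trace has been parsed
```

After set_trace() runs, copied_flag is still False while copied_func() reports True.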
Signed-off-by: Shawn O. Pearce <sop@google.com>
These are not as expensive as spawning a git command, but they are
not free either. We want to keep track of how many times we wind
up calling them on any particular operation.
Signed-off-by: Shawn O. Pearce <sop@google.com>
We now cache the output of `git config --list` for each of our
GitConfig instances in a Python pickle file. These can be read
back in using only the Python interpreter at a much faster rate
than we can fork+exec the git config process.
If the corresponding git config file has a newer modification
timestamp than the pickle file, we delete the pickle file and
regenerate it. This ensures that any edits made by the user
will be taken into account the next time we consult the file.
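The freshness test described above might be sketched like this (hypothetical function name). A False result means the pickle should be deleted and regenerated.

```python
import os


def cache_is_fresh(config_path, pickle_path):
    """True if the pickle exists and the config file is not newer than it."""
    try:
        return os.path.getmtime(pickle_path) >= os.path.getmtime(config_path)
    except OSError:
        # Either file is missing, so there is nothing usable to read.
        return False
```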
This reduces the time for a no-op repo sync from 0.847s to 0.269s.
Signed-off-by: Shawn O. Pearce <sop@google.com>
In the case of:
  [url "Foo"]
    insteadOf = Bar
We should return "Bar" for the key "url.Foo.insteadof", but not
for the key "url.foo.insteadof". This requires splitting the
key into its components and lowercasing only the section and
variable name, leaving the subsection portion alone.
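A sketch of the key normalization, with a hypothetical function name. Everything between the first and last dot is treated as the subsection, since a subsection may itself contain dots.

```python
def normalize_key(name):
    """Lowercase the section and variable name; keep the subsection as is."""
    parts = name.split(".")
    if len(parts) < 3:
        # No subsection: the whole key is case-insensitive.
        return name.lower()
    section = parts[0].lower()
    subsection = ".".join(parts[1:-1])
    variable = parts[-1].lower()
    return "%s.%s.%s" % (section, subsection, variable)
```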
Signed-off-by: Shawn O. Pearce <sop@google.com>
If the user has multiple projects to upload changes to, and they
are all going to the same review server, we only need to query the
'/ssh_info' data once.
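The idea amounts to a small per-server cache, sketched here with hypothetical names. "fetch" stands for whatever function performs the actual HTTP request.

```python
_ssh_info_cache = {}


def cached_ssh_info(review_url, fetch):
    """Query a review server's ssh_info at most once per process."""
    if review_url not in _ssh_info_cache:
        _ssh_info_cache[review_url] = fetch(review_url)
    return _ssh_info_cache[review_url]
```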
Signed-off-by: Shawn O. Pearce <sop@google.com>
If /ssh_info is protected by an HTML-based login page, we may get
back a "200 OK" response from the server with an HTML document
asking us to authenticate. This can't be parsed into a host name
and port number, so we shouldn't even try.
Valid host names and decimal port numbers cannot contain '<', but
an unexpected HTML login page would. So we test for '<' to give
us a fair indicator that the content isn't what we think it is,
and bail out.
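A sketch of the check, assuming a valid response body is a host name and a decimal port separated by whitespace. That format, the extra field validation, and the function name are assumptions made for this example.

```python
def parse_ssh_info(body):
    """Return (host, port), or None if body does not look like ssh_info."""
    if "<" in body:
        # Probably an HTML login page rather than host/port data.
        return None
    fields = body.split()
    if len(fields) != 2 or not fields[1].isdigit():
        return None
    return fields[0], int(fields[1])
```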
Signed-off-by: Shawn O. Pearce <sop@google.com>
If a review URL is set to 'http://host/Gerrit' because the user
thinks that is the correct way to point repo at Gerrit, we should
be a bit more flexible and fix the URL by replacing the '/Gerrit'
suffix with '/ssh_info'.
Likewise, if a review URL already points at '/ssh_info' for a Gerrit
instance, we should leave it alone.
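The two cases above can be sketched as follows. The function name is hypothetical, and appending '/ssh_info' to any other URL is an assumption about the remaining case.

```python
def ssh_info_url(review_url):
    """Map a configured review URL to its /ssh_info location."""
    url = review_url.rstrip("/")
    if url.endswith("/ssh_info"):
        return url  # already correct; leave it alone
    if url.endswith("/Gerrit"):
        url = url[:-len("/Gerrit")]
    return url + "/ssh_info"
```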
Signed-off-by: Shawn O. Pearce <sop@google.com>
In Gerrit2, uploads are sent over "git push ssh://...", as this
is a more efficient transport and is easier to code from external
scripts and/or direct command line usage by an end-user.
Gerrit1's HTTP POST based format is assumed if the review server
does not have the /ssh_info URL available on it.
Signed-off-by: Shawn O. Pearce <sop@google.com>
This way "forks" of a project, e.g. the Linux kernel, can be set up to
use different destination projects in the review server by creating
different remotes in the client side Git repository.
Signed-off-by: Shawn O. Pearce <sop@google.com>
The mirror option downloads a complete forest (as described by the
manifest) and creates a replica of the remote repositories rather
than a client working directory. This permits other clients to
sync off the mirror site.
A mirror can be positioned in a "DMZ", where the mirror executes
"repo sync" to obtain changes from the external upstream and
clients inside the protected zone operate off the mirror only,
and therefore do not require direct git:// access to the external
upstream repositories.
Signed-off-by: Shawn O. Pearce <sop@google.com>