manifest: refactor the filesystem checking logic for more reuse

This function is currently written with copyfile & linkfile in mind.
Generalize the logic & function arguments slightly so we can reuse
in more places that make sense.

This changes the validation logic slightly too in that we no longer
allow "." for the dest attribute with copyfile & linkfile, nor for
the src attribute with copyfile.  We already rejected those later on
when checking against the active filesystem, but now we reject them
a little sooner when parsing.

The empty path check isn't a new requirement exactly -- repo used to
crash on it, so it was effectively blocked, but now we diagnosis it.

Bug: https://crbug.com/gerrit/14156
Change-Id: I0fdb42a3da60ed149ff1997c5dd4b85da70eec3d
Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/298442
Reviewed-by: Michael Mortensen <mmortensen@google.com>
Tested-by: Mike Frysinger <vapier@google.com>
This commit is contained in:
Mike Frysinger 2021-02-25 18:26:31 -05:00
parent 6093d99d13
commit a00c5f40e7
2 changed files with 33 additions and 5 deletions

View File

@ -1121,8 +1121,33 @@ https://gerrit.googlesource.com/git-repo/+/HEAD/docs/manifest-format.md
return relpath, worktree, gitdir, objdir
@staticmethod
def _CheckLocalPath(path, symlink=False):
"""Verify |path| is reasonable for use in <copyfile> & <linkfile>."""
def _CheckLocalPath(path, dir_ok=False, cwd_dot_ok=False):
"""Verify |path| is reasonable for use in filesystem paths.
Used with <copyfile> & <linkfile> elements.
This only validates the |path| in isolation: it does not check against the
current filesystem state. Thus it is suitable as a first-past in a parser.
It enforces a number of constraints:
* No empty paths.
* No "~" in paths.
* No Unicode codepoints that filesystems might elide when normalizing.
* No relative path components like "." or "..".
* No absolute paths.
* No ".git" or ".repo*" path components.
Args:
path: The path name to validate.
dir_ok: Whether |path| may force a directory (e.g. end in a /).
cwd_dot_ok: Whether |path| may be just ".".
Returns:
None if |path| is OK, a failure message otherwise.
"""
if not path:
return 'empty paths not allowed'
if '~' in path:
return '~ not allowed (due to 8.3 filenames on Windows filesystems)'
@ -1165,12 +1190,12 @@ https://gerrit.googlesource.com/git-repo/+/HEAD/docs/manifest-format.md
# Some people use src="." to create stable links to projects. Lets allow
# that but reject all other uses of "." to keep things simple.
if parts != ['.']:
if not cwd_dot_ok or parts != ['.']:
for part in set(parts):
if part in {'.', '..', '.git'} or part.startswith('.repo'):
return 'bad component: %s' % (part,)
if not symlink and resep.match(path[-1]):
if not dir_ok and resep.match(path[-1]):
return 'dirs not allowed'
# NB: The two abspath checks here are to handle platforms with multiple
@ -1202,7 +1227,8 @@ https://gerrit.googlesource.com/git-repo/+/HEAD/docs/manifest-format.md
# |src| is the file we read from or path we point to for symlinks.
# It is relative to the top of the git project checkout.
msg = cls._CheckLocalPath(src, symlink=element == 'linkfile')
is_linkfile = element == 'linkfile'
msg = cls._CheckLocalPath(src, dir_ok=is_linkfile, cwd_dot_ok=is_linkfile)
if msg:
raise ManifestInvalidPathError(
'<%s> invalid "src": %s: %s' % (element, src, msg))

View File

@ -87,6 +87,8 @@ class ManifestValidateFilePaths(unittest.TestCase):
def test_bad_paths(self):
"""Make sure bad paths (src & dest) are rejected."""
PATHS = (
'',
'.',
'..',
'../',
'./',