Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

algorithm: ErrDigestInvalidFormat on Validate() even for unknown algorithms #36

Conversation

wking
Copy link
Contributor

@wking wking commented Jun 9, 2017

In some cases (e.g. Algorithm("foo").Validate("!")) we know the encoded portion is invalid even if we do not have an entry for foo in anchoredEncodedRegexps. Whatever algorithm-specific additional restrictions are applied via anchoredEncodedRegexps, the encoded portion must satisfy the generic encoded-portion restrictions from the DigestRegexp. This commit adjusts the Algorithm("foo").Validate("!") result to match the current Digest("foo:!").Validate() result, and factors the encoded portion of DigestRegexp out into encodedRegexp and encodedRegexpAnchored as I'd recommended here.

It also guards the Size()-based length check with Available(), because Size() returns zero for unavailable algorithms. Now that we have a check for algorithms that aren't in anchoredEncodedRegexps, we can't rely on lookup-misses there to protect us from running the Size() check on unavailable algorithms. And equating anchoredEncodedRegexps hits with Available was dicey anyway, since it depended on what packages get loaded, etc. In #35 I'm dropping this Size check entirely, but I'm happy to rebase whichever PR lands second.

In most cases, the results match the analogous test in digest_test.
However, Algorithm("foo").Validate("!") returns ErrDigestUnsupported
while Parse("foo:!") returns ErrDigestInvalidFormat.  Because "!" is
an invalid encoded part regardless of algorithm, I think both should
be returning ErrDigestInvalidFormat.  I'll fix that in the next
commit; this one just shows the current situation.

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking wking force-pushed the invalid-format-even-for-some-unrecognized branch from 74ad6a6 to c69c5e5 Compare June 9, 2017 22:45
@wking
Copy link
Contributor Author

wking commented Jun 9, 2017

I've pushed 74ad6a6c69c5e5 fixing some Go formatting and adding Algorithm.ValidateIdentifier to make Digest.Validate more efficient in the valid-digest case. Without it, we'd have both DigestRegexpAnchored and Algorithm.Validate covering the encoded portion.

@wking
Copy link
Contributor Author

wking commented Jun 9, 2017

In case it's hard to pick out in the diff, the user-visible change I'm interested in is here (with some leading tabs removed for this comment):

 	// unsupported, but the encoded part cannot possibly be valid because it contains invalid characters
 	encoded:   "!",
 	algorithm: "foo",
-	err:       ErrDigestUnsupported,
+	err:       ErrDigestInvalidFormat,

And when @dmcgowan questioned the usefulness of that distinction, my motivating example was that it allows middleware (e.g. registries) to reject invalid encoded portions like ! regardless of who the end consumers will eventually be and what additional algorithm-specific encoded restrictions they associate with foo.

…rithms

In some cases (e.g. Algorithm("foo").Validate("!")) we know the
encoded portion is invalid even if we do not have an entry for foo in
anchoredEncodedRegexps.  Whatever algorithm-specific additional
restrictions are applied via anchoredEncodedRegexps, the encoded
portion must satisfy the generic encoded-portion restrictions from the
DigestRegexp.  This commit adjusts the Algorithm("foo").Validate("!")
result to match the current Digest("foo:!").Validate() result, and
factors the encoded portion of DigestRegexp out into encodedRegexp and
encodedRegexpAnchored.

It also guards the Size()-based length check with Available(), because
Size() returns zero for unavailable algorithms.  Now that we have a
check for algorithms that aren't in anchoredEncodedRegexps, we can't
rely on lookup-misses there to protect us from running the Size()
check on unavailable algorithms.  And equating anchoredEncodedRegexps
hits with Available was dicey anyway, since it depended on what
packages get loaded, etc.

Adding Algorithm.ValidateIdentifier() allows us to avoid doubling up
on the encoded check in Digest.Validate(), since both
DigestRegexpAnchored and algorithm.Validate() were covering the
encoded portion.  The new tests are based on digest_test.go's
TestParseDigest.

I've also dropped the Available check from Digest.Valid, because we
can tell:

  sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

is a valid digest even if Algorithm("sha256") happens to be
unavailable (e.g. because the consumer didn't import crypto/sha256).
Available-ness just affects whether you're capable of verifying
content against the digest, not digest-validity.

Running Algorithm.Validate before Algorithm.ValidateIdentifier uses
the presence of an identifier in anchoredEncodedRegexps as a hint to
skip the identifier validation.  This saves some time (ad Derek's
suggestion [1]) and is safe as long as we don't add any invalid
identifiers to that map.

[1]: opencontainers#34 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking wking force-pushed the invalid-format-even-for-some-unrecognized branch from c69c5e5 to 9a46ed6 Compare June 9, 2017 23:20
@wking
Copy link
Contributor Author

wking commented Jun 9, 2017

Shuffled around a bit more with c69c5e59a46ed6 to use presence in anchoredEncodedRegexps as a hint to skip the Algorithm.ValidateIdentifier check in Digest.Validate. This avoids the unnecessary regexp check @dmcgowan was concerned about.

@vbatts
Copy link
Member

vbatts commented Dec 13, 2019

House keeping. With no comments and no current demand for this I'm closing this. If it's still needed we can reopen

@vbatts vbatts closed this Dec 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants