URL Safety #1961

tghosth · 2024-05-16T11:19:00Z

This is a spinoff from #1589 with a focus on URLs.

My first question is, are we aiming this requirement at the dynamic generation of URLs or at the processor of URLs?

elarlang · 2024-05-16T11:40:43Z

Also related:

V5.5.5 [ADDED, MERGED FROM 13.1.1] Verify that different parsers used in the application for the same data type (e.g. JSON parsers, XML parsers, URL parsers), perform parsing in a consistent way and use the same character encoding mechanism to avoid issues such as JSON Interoperability vulnerabilities or different URI or file parsing behavior being exploited in Remote File Inclusion (RFI) or Server-side Request Forgery (SSRF) attacks.

jmanico · 2024-05-16T12:08:04Z

I believe both. However, I believe it’s not our job to set requirements for web servers in my opinion. I state with respect. I do believe we need to be aware of the differences between web servers.

jmanico · 2024-05-16T12:52:31Z

One of my concerns around URL safety is that bad URL handling to lead to a variety of different major security issues. This includes XSS, SSRF and open redirects. JavaScript URL’s alone cause over a third of real world XSS attacks and is on the rise. If we wish to aim for positive requirements in this area, I believe we need requirements for: 1) safe complete URL output encoding (XSS) 2) safe inserting of untrusted data into query string parameters and paths (url encoding and possibly base64encoding) 3) url validation (SSRF and open redirects)

elarlang · 2024-05-17T14:42:47Z

I believe both. However, I believe it’s not our job to set requirements for web servers in my opinion. I state with respect. I do believe we need to be aware of the differences between web servers.

vs

One of my concerns around URL safety is that bad URL handling to lead to a variety of different major security issues. This includes XSS, SSRF and open redirects.

I think those two statements are in conflict - if a web server handles URL incorrectly, it is exactly our goal to provide requirements for that and point to the specs, of how it must be done. It is not expected, that program on one web server is safe and on the other one is not just because they handle URL differently.

The provided topics are quite vague to comment on. For example, how "URL output encoding", "complete URL output encoding" and "safe complete URL output encoding" are different from the requirement we already have? Please provide example attacks or vulnerabilities that should be addressed.

jmanico · 2024-05-18T11:07:52Z

Elar, Like I demonstrated on my own server, if you url encode untrusted data and place it on a path, path traversal is possible on some (popular) HTTP servers. So URL encoding untrusted data on a path is not universally safe. This is why the JWT standard opted for base64url encoding and not url encoding.

elarlang · 2024-05-18T11:19:29Z

If it behaves like you have it implemented, it's done incorrectly. Recommended reading:

https://datatracker.ietf.org/doc/html/rfc3986#section-2.4

Under normal circumstances, the only time when octets within a URI are percent-encoded is during the process of producing the URI from its component parts.
Once produced, a URI is always in its percent-encoded form.

When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters.

Implementations must not percent-encode or decode the same string more than once, as decoding an already decoded string might lead to misinterpreting a percent data octet as the beginning of a percent-encoding, or vice versa in the case of percent-encoding an already percent-encoded string.

In total, you listed 3 problems: please list examples for all of them, one by one, keeping the entire coverage of the risen problem.

elarlang · 2024-05-18T13:00:26Z

Like I demonstrated on my own server, if you url encode untrusted data and place it on a path, path traversal is possible on some (popular) HTTP servers. So URL encoding untrusted data on a path is not universally safe. This is why the JWT standard opted for base64url encoding and not url encoding.

Correct URL encoding in this situation is for functionality to work correctly.

If you have a path traversal vulnerability on the server side, you can not fix it with URL encoding for the client, it is not a trusted service layer in this context. You need to fix it on the server side - with validation and sanitization like every other user input.

So, your point of view (to say that path traversal on the server side is the result of encoding URL incorrectly for the client side) is also conceptually just wrong.

jmanico · 2024-05-18T15:28:21Z

The point I’m making is that it is poor url handling is reality (in both url parsing classes, servers and more) that we must and should recognize, regardless of standards. Many of our other requirements would not be needed if dev’s and components just followed standards, but they do not.

elarlang · 2024-05-18T16:00:22Z

The point I’m making is that it is poor url handling is reality (in both url parsing classes, servers and more) that we must and should recognize, regardless of standards. Many of our other requirements would not be needed if dev’s and components just followed standards, but they do not.
Many of our other requirements would not be needed if dev’s and components just followed standards, but they do not.

This is the first statement I agree with - I recommend you start from your own server :) We are not going to tolerate vulnerabilities just because they are widespread.

As with 3 attempts here and 10s of comments before, I was not able to get answers or understand, what is the problem with current requirements, or what the proposals are to improve them, then I'll give it over to @tghosth .

My summary:

Output encoding

I stand for the statement, that the output encoding is covered with 5.3.1

safe complete URL output encoding (XSS)

safe inserting of untrusted data into query string parameters and paths (url encoding and possibly base64encoding)

Safe inserting means that we need to transfer the data correctly to the next component. If the problem is in the next component, it can not be solved with output encoding.

More detailed comment for that is written here: #1589 (comment)

To use base64uri or whatever other "URL-safe" encoding, it is a question of application logic to transfer its data between its components. If URL encoding is done correctly, we can not require the use of base64uri instead.

Validation

JavaScript URL’s alone cause over a third of real world XSS attacks and is on the rise.

url validation (SSRF and open redirects)

We have requirement 5.1.4 for validating the data from the user, we can mention URL to give it more spotlight (issue #1552)

#	Description	L1	L2	L3	CWE
5.1.4	[GRAMMAR] Verify that structured data is strongly typed and validated against a defined schema including allowed characters, length and pattern (e.g. credit card numbers, e-mail addresses, telephone numbers, or validating that two related fields are reasonable, such as checking that suburb and zipcode match).	✓	✓	✓	20

If validation is done and was not enough, the value must be sanitized.

#	Description	L1	L2	L3	CWE
5.2.2	Verify that unstructured data is sanitized to enforce safety measures such as allowed characters and length.	✓	✓	✓	138

For open redirect and SSRF we have:

#	Description	L1	L2	L3	CWE
5.1.5	[MODIFIED, SPLIT TO 50.7.1] Verify that the application will only automatically redirect the user to a different URL directly from an application URL where the destination appears on an allow list.	✓	✓	✓	601
5.2.6	Verify that the application protects against SSRF attacks, by validating or sanitizing untrusted data or HTTP file metadata, such as filenames and URL input fields, and uses allow lists of protocols, domains, paths and ports.	✓	✓	✓	918

jmanico · 2024-05-18T16:31:21Z

It’s not “my server software” it’s a standard HTTP server used by millions of sites.

tghosth · 2024-05-19T05:31:20Z

Thanks both for the comments and direction.

However, I believe it’s not our job to set requirements for web servers in my opinion. I state with respect. I do believe we need to be aware of the differences between web servers.

I think this is a subtle but important point. We are not writing requirements for how to write a web server (e.g. nginx or Apache or IIS or whatever), we are providing requirements for applications which run on these web servers.

As such, we need to advise on generating standards compliant URLs and how to manually process URLs in a safe way but I don't think we can necessarily account for every edge case or every instance where the underlying web server does not work in a standards compliant way.

In terms of the areas to cover:

1) safe complete URL output encoding (XSS)

@jmanico I would appreciate a more specific example of how this specific point should be covered. The way I understand it is that if a URL needs to be encoded for a web page based on the context where it is being written (HTML element, HTML attribute, Javascript, etc) in the same way that any content would need to be. I am not sure what is needed here in addition to what is in 5.3.1.

2) safe inserting of untrusted data into query string parameters and paths (url encoding and possibly base64encoding)

I agree with @jmanico that this has some specific points which would benefit from their own requirement. Do you think you could provide a suggested draft text for a requirement here @jmanico?

3) url validation (SSRF and open redirects)

In this case I think I agree with @elarlang that we already have a number of requirements that handle these aspects, @jmanico I am not sure we need anything additional here unless you wanted to suggest specific changes to the requirements Elar noted above.

jmanico · 2024-05-19T09:39:13Z

1) safe complete URL output encoding (XSS)

Output encoding complete URL’s in a src or href context is not a complete defense since JavaScript: URL’s still execute XSS even when encoded. This is one of the only situations where encoding is not a complete XSS defense. Validating URL’s is critical for XSS, open redirects, SSRF and more. 2) new requirements: These clarifications are valid for redirects, SSRF and general safe url construction. There are generally two different needs to stop this family of server side url handling. 2a) First, build dynamic URL’s safely. URL encode query string parameters and base64url encode path segments. The following… Site.com?data=urlEncode(data) And Site.com/base64urlEncode(data)/ …are by far the safest way to roll in a universal fashion. Again, this is why the JWT standard chose the base64url option for token encoding. It’s the only universally safe way to encode data on a path to stop path traversal. 2b) Also ensure complete URL’s are valid. Also, validate host names and validate schemes/protocols using allow-list validation. This is necessary for SSRF, open redirects, and to stop XSS when it comes to JavaScript URL’s. This is super important. Over 1/3 of real world XSS is from JavaScript URL’s.

tghosth · 2024-05-19T15:56:34Z

Output encoding complete URL’s in a src or href context is not a complete defense since JavaScript: URL’s still execute XSS even when encoded.

Specifically, you mean that the JavaScript URL will execute when it is clicked on, i.e. it does not execute when the page is rendered but rather when the link is clicked (which is the theoretical purpose of the JavaScript URL anyway). Did I understand correctly?

First, build dynamic URL’s safely....

Is there any OWASP cheatsheet for this as it feels like the rules here are going to be a little complicated?

Also, validate host names and validate schemes/protocols using allow-list validation.

I am going to look to see how we currently talk about this related to SSRF and other stuff and whether we need to add more content to the existing requirements. (I will get back to you :) )

tghosth · 2024-05-23T11:05:59Z

Notes from call:

Validating a complete URL to safely place in a webpage

Handled in #1589

Output encoding a complete URL when placing into a HTML document

Handled in #1589

Validation a complete URL for server side use (redirect/SSRF)

@tghosth to check what requirements we already have for SSRF/redirects, i.e. where we have control over the interpreter.

Assembling a dynamic URL safely
4a) Placing untrusted data in a querystring

Need a new requirement(s) @jmanico

4b) Placing untrusted data in the path of a URL (REST Style)

Need a new requirement(s) @jmanico

4c) Any other URL contexts?

Need a new requirement(s) @jmanico

elarlang · 2024-05-23T13:32:55Z

The point I’m making is that it is poor url handling is reality (in both url parsing classes, servers and more) that we must and should recognize, regardless of standards.

caddyserver/caddy#1582 - worth reading in the context of what is expected and accepted behavior by a web server and what is not.

My point is - ASVS should not go and tolerate going against specifications.

jmanico · 2024-05-23T13:58:06Z

This is not just a caddy issue. Other servers have the same struggle. I understand your point, but I want to acknowledge the reality that URL encoding is not a universal defense. base64encoding is.

I plan to build a requirement that suggests URL encoding and also suggests base64url encoding when URL encoding is not sufficient to address both of our concerns. Again, this is why the JWT standard uses base64url encoding. I'm not the only one who acknowledges this reality.

elarlang · 2024-05-24T06:28:56Z

URL handling

So, if I point to RFC - you say (#1589 (comment)) your understanding of the URL handling is still right. When I point to the Caddy issue, which describes the same behavior as a bug/security issue, you keep saying, that your own server, which handles the URL the same-incorrect way, is still right?

Other servers have the same struggle.

Please provide names and facts. We should only talk about facts and examples, not "but The Others". Also the reason and explanations, why they do so. Or... is it someones configuration issue.

URL building

I plan to build a requirement that suggests URL encoding and also suggests base64url encoding

I'm waiting for your proposals, but note, that a suggestion or recommendation is not a security requirement.

This is what I have pointed out before - if you want to suggest, that it is easier for the interpreter to handle the URL, you can send the data in base64uri encoding. It is not a security hole, when someone does not use base64uri encoding instead of just using correctly URL encoding.

jmanico · 2024-05-24T10:59:52Z

We as developers do not always have the ability to fix the interpreter. We almost never do. And not all ASVS requirements can map neatly to a RFC. Most do not. I rest my case and will focus on the new requirements moving forward.

tghosth · 2024-06-02T08:00:21Z

Hi @jmanico , I believe you have suggested the following requirements:

If building dynamic urls, be sure to URL encode when placing
untrusted data in a querystring parameter and validate the complete URL

and

If building dynamic urls, be sure to URL encode or base64url encode
when placing untrusted data in a path parameter and validate the
complete URL

Do you think these could be merged?
Do these cover all cases related to URLs? Do we need to consider javascript: urls or something as well?

jmanico · 2024-06-06T13:25:45Z

I can live with these being merged. Perhaps:

Verify that when building dynamic URLs, untrusted data is URL encoded or base64url encoded, and ensure the complete URL is validated to confirm it is legal and the protocol is safe.

tghosth · 2024-06-13T09:26:00Z

Verify that when dynamically building URLs, untrusted data is safely encoded based on context (e.g. URL encoded for query parameters, URL or ideally base64url encoded for path parameters) and only safe URL protocols are allowed (e.g. not javascript).

@jmanico what do you think about this one?

jmanico · 2024-06-13T10:36:58Z

I like it am am happy with it. Can I suggest a little cleanup just to make it a little more general?

Verify that when dynamically building URLs, untrusted data is safely encoded according to its context (e.g., URL encoding or base64url encoding for query or path parameters). Ensure that only safe URL protocols are permitted (e.g., disallow javascript: or data:).

tghosth · 2024-06-13T10:44:14Z

Verify that when dynamically building URLs, untrusted data is encoded according to its context (e.g., URL encoding or base64url encoding for query or path parameters). Ensure that only safe URL protocols are permitted (e.g., disallow javascript: or data:).

I removed safely as it may be a little redundant but can you PR this for 5.3?

Should be tagged as [ADDED, SPLIT FROM 5.3.1]

Addressing comment #1961

tghosth added 1) Discussion ongoing Issue is opened and assigned but no clear proposal yet _5.0 - prep This needs to be addressed to prepare 5.0 V5 Temporary label for grouping input validation, sanitization, encoding, escaping related requirements labels May 16, 2024

tghosth mentioned this issue May 16, 2024

5.3.1 has too much going on and should be shortened #1589

Closed

elarlang assigned jmanico May 16, 2024

jmanico added a commit that referenced this issue Jun 14, 2024

Update 0x13-V5-Validation-Sanitization-Encoding.md

a4bb109

Addressing comment #1961

jmanico mentioned this issue Jun 14, 2024

Update 0x13-V5-Validation-Sanitization-Encoding.md #1978

Merged

tghosth closed this as completed in 7ca3414 Jul 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

URL Safety #1961

URL Safety #1961

tghosth commented May 16, 2024

elarlang commented May 16, 2024 •

edited

Loading

jmanico commented May 16, 2024 via email

jmanico commented May 16, 2024 via email

elarlang commented May 17, 2024

jmanico commented May 18, 2024 via email

elarlang commented May 18, 2024

elarlang commented May 18, 2024

jmanico commented May 18, 2024 via email

elarlang commented May 18, 2024 •

edited

Loading

jmanico commented May 18, 2024 via email

tghosth commented May 19, 2024

jmanico commented May 19, 2024 via email

tghosth commented May 19, 2024

tghosth commented May 23, 2024 •

edited

Loading

elarlang commented May 23, 2024

jmanico commented May 23, 2024 •

edited

Loading

elarlang commented May 24, 2024

jmanico commented May 24, 2024 via email

tghosth commented Jun 2, 2024

jmanico commented Jun 6, 2024

tghosth commented Jun 13, 2024 •

edited

Loading

jmanico commented Jun 13, 2024

tghosth commented Jun 13, 2024

URL Safety #1961

URL Safety #1961

Comments

tghosth commented May 16, 2024

elarlang commented May 16, 2024 • edited Loading

jmanico commented May 16, 2024 via email

jmanico commented May 16, 2024 via email

elarlang commented May 17, 2024

jmanico commented May 18, 2024 via email

elarlang commented May 18, 2024

elarlang commented May 18, 2024

jmanico commented May 18, 2024 via email

elarlang commented May 18, 2024 • edited Loading

Output encoding

Validation

jmanico commented May 18, 2024 via email

tghosth commented May 19, 2024

1) safe complete URL output encoding (XSS)

2) safe inserting of untrusted data into query string parameters and paths (url encoding and possibly base64encoding)

3) url validation (SSRF and open redirects)

jmanico commented May 19, 2024 via email

tghosth commented May 19, 2024

tghosth commented May 23, 2024 • edited Loading

elarlang commented May 23, 2024

jmanico commented May 23, 2024 • edited Loading

elarlang commented May 24, 2024

URL handling

URL building

jmanico commented May 24, 2024 via email

tghosth commented Jun 2, 2024

jmanico commented Jun 6, 2024

tghosth commented Jun 13, 2024 • edited Loading

jmanico commented Jun 13, 2024

tghosth commented Jun 13, 2024

elarlang commented May 16, 2024 •

edited

Loading

elarlang commented May 18, 2024 •

edited

Loading

tghosth commented May 23, 2024 •

edited

Loading

jmanico commented May 23, 2024 •

edited

Loading

tghosth commented Jun 13, 2024 •

edited

Loading