RFC 6454 in Plainish English
12th April 2019
Most authors on the Internet can be trusted, for a given value of most. But it only takes one bastard. "User Agent Implementators" should aim to restrict the content loaded in some way, to stop that one guy, because it'd be annoying if sites could just request arbitrary files from your system.
Well, since the browser fetches resources on your (the user's) behalf, the browser will follow page instructions and redirects, happily fetching content such as JavaScript. Without a security model this would be absolutely insane. Over time vendors have converged on a Common Model: the Same-Origin Policy.
The Same-Origin Policy specifies trust by URI. For example, a script can be specified with a URI as its source:
<script src="https://example.com/library.js"></script>
By fetching this resource and executing it, the browser grants the script all the privileges of the page that loaded it. And it will execute it, because it's JavaScript.
Browsers also send sensitive information to URIs. For example, a login form which sends credentials to a location must implicitly trust the destination - there is no system other than trust when POSTing form data, beyond "The developer must have intentionally added this to the page".
It'd be a PITA if you trusted nothing, 'cause you need access to third-party resources when loading a site.
To make the lives of developers (who constantly complain that entering a username, password, yubikey token, and mobile authentication "upsets the user journey" and "shouldn't be done for every internal and external resource loaded by a page") easier, browsers group URIs into "origins" - a group of addresses which share the same scheme, host, and port, and as such can safely be assumed to be co-ordinated. This means that resources hosted locally, such as /css and /js, can safely be assumed to be intended, as attackers can't add content to the server. Can they? Well, if they could, it'd hardly be Firefox's fault.
The following addresses are within the same origin of http, example.com, port 80:
- http://example.com
- http://example.com:80
- http://example.com/file/sample.txt
Each of these addresses is in a different origin, because each differs from the others in scheme, host, or port:
- http://example.com
- http://example.com:8080
- https://www.example.com
- https://example.com:80
- https://example.com
- http://example.org
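The grouping above can be sketched in a few lines of Python. This is only an illustration, not how any browser actually does it: the `origin` and `same_origin` helpers and the `DEFAULT_PORTS` table are made-up names, and the sketch only handles http and https URIs.

```python
from urllib.parse import urlsplit

# Assumption: only the two schemes used in the examples above are handled.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(uri):
    """Return the (scheme, host, port) triple for an http(s) URI."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # A missing port means "the default port for this scheme".
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    """Two URIs share an origin only if all three parts match."""
    return origin(a) == origin(b)


same = [
    "http://example.com",
    "http://example.com:80",
    "http://example.com/file/sample.txt",
]
different = [
    "http://example.com",
    "http://example.com:8080",
    "https://www.example.com",
    "https://example.com:80",
    "https://example.com",
    "http://example.org",
]

print(len({origin(u) for u in same}))       # 1
print(len({origin(u) for u in different}))  # 6
```

Note that the path plays no part in the comparison, which is why /file/sample.txt lands in the same origin as the bare host.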
Although browsers group URIs into origins, not every resource in an origin has the same "authority" (read: privileges). Images, as passive content, (should) have no access to objects. The HTML page and all its scripts, on the other hand, are active, and can access every resource within the origin. Browsers decide how much authority to give a resource by examining its media type (see content sniffing) and treating the resource accordingly.
When hosting untrusted content (i.e. user-generated content, or perhaps external resources), web applications can limit its authority by restricting its content type.
If such untrusted content is insecurely loaded into the HTML document then the origin's authority will be shared with the untrusted content, a vulnerability called cross-site scripting. #originstory
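As a rough illustration of loading untrusted content "securely", here is a minimal Python sketch that escapes a user comment before dropping it into a page. The `render_comment` helper is a made-up name, and the sketch only covers text placed between tags; attributes, URLs, and script contexts each need their own handling.

```python
import html


def render_comment(untrusted):
    """Wrap user-supplied text in a paragraph without letting it become markup."""
    # html.escape replaces <, >, & and quotes with entities, so the browser
    # displays the text instead of parsing it as part of the document.
    return "<p>" + html.escape(untrusted) + "</p>"


payload = "<script>alert(1)</script>"
print(render_comment(payload))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Without the escape call, the payload would run with the full authority of the origin serving the page.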
Content sniffing may grant a low-authority image the authority of a better item, such as an HTML document, if a tricksy attacker were to spoof it intelligently. This is why real men disable it #scotthelmeisbae. At the end of the day, sending data in an arbitrary format where it can be interpreted by the receiver will always be dangerous.
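One way a server might restrict the content type of untrusted uploads and ask the browser not to sniff is sketched below. The `headers_for_untrusted` helper and the `SAFE_TYPES` allow-list are assumptions made up for this example; `X-Content-Type-Options: nosniff` is the real response header for switching sniffing off.

```python
# Assumption: a small allow-list of passive types the site is happy to serve.
SAFE_TYPES = {"image/png", "image/jpeg", "image/gif", "text/plain"}


def headers_for_untrusted(declared_type):
    """Build response headers for a user-supplied file."""
    # Anything outside the allow-list is served as an opaque download,
    # so it never gets the authority of an HTML document.
    if declared_type in SAFE_TYPES:
        content_type = declared_type
    else:
        content_type = "application/octet-stream"
    return {
        "Content-Type": content_type,
        # Tell the browser to trust the declared type rather than guess.
        "X-Content-Type-Options": "nosniff",
    }


print(headers_for_untrusted("image/png")["Content-Type"])  # image/png
print(headers_for_untrusted("text/html")["Content-Type"])  # application/octet-stream
```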