TL;DR-time: XSS & XSRF (web attack vectors)
I'm going to explain Cross-site scripting (XSS) & Cross-site request forgery (XSRF/CSRF) to myself.
Super long names like "Cross-site request forgery" and their scary acronyms don't exactly lend themselves to instant revelations about how they work. Not like S.C.U.B.A., anyway.
Therefore we must know some things™. Welcome to the TL;DR section.
Mallory has her work cut out for her!
Basically, if your web page displays input from users, and that input is not scrubbed by you, that input could be weaponized — e.g. if it loads (or is) a script (yes, that was an em dash).
XSS generally applies when the user is interacting with your page directly.
Any page that you visit can load content from your web page via GET (and possibly by other means if your server's CORS policy is permissive, or when using XSS + XHR). All of this is perfectly valid, because many pages want to provide resources like images.
XSRF generally applies when the user is not interacting with your page directly.
Combination of XSS & XSRF
Obviously, if the attackers can combine these methods, they might be more successful than when employing just one.
Some examples will be given, followed by a section on protection. No guarantees can be given as it pertains to their efficacy.
XSS as a script in query parameters
This vector is generally called "non-persistent".
If the script is allowed to be included in user-facing HTML, the attacker can perform any action that the user is allowed to perform. An example of this is a query parameter name=your%20name that your page reads and displays as the title of some page. Now the attacker crafts a URL such as http://your-domain.com/page?name=<script>alert('xss')</script>, and voilà, your dad just got rekt. (The URL can be further masked using percent-encoding.)
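Here's a minimal sketch of that scenario in Python, using only the standard library. The two handler functions are hypothetical stand-ins for whatever renders your page; the only difference between them is a call to `html.escape` before the value reaches the HTML.

```python
import html
from urllib.parse import parse_qs, urlsplit


def read_name(url: str) -> str:
    """Pull the `name` query parameter out of a URL (percent-decoding included)."""
    return parse_qs(urlsplit(url).query).get("name", [""])[0]


def render_title_unsafe(url: str) -> str:
    """Hypothetical vulnerable handler: echoes the parameter verbatim."""
    return f"<h1>Hello {read_name(url)}</h1>"


def render_title_escaped(url: str) -> str:
    """Same handler, but the value is HTML-escaped before it reaches the page."""
    return f"<h1>Hello {html.escape(read_name(url))}</h1>"


# The crafted URL from the text, masked with percent-encoding.
attack = "http://your-domain.com/page?name=%3Cscript%3Ealert('xss')%3C/script%3E"

print(render_title_unsafe(attack))   # contains a live <script> tag
print(render_title_escaped(attack))  # the tag is rendered as inert text
```

Escaping like this only covers values placed in HTML text; values dropped into attributes, URLs, or inline scripts need their own treatment.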
XSS as a script in a forum post
This vector is generally called "persistent".
If you allow user input in a forum post, user support request ticket, etc. you may be subject to XSS attacks. The attack happens when Mallory is allowed to post user-facing HTML that is not sanitized using HTML escaping. Same scenario as in the non-persistent case: your dad is going to get rekt, only this time he gets rekt every time he refreshes that forum post on gastrointestinal health.
Always sanitize external input. Always validate external input. Disable scripts in your browser (you're not going to do that though, are you, tough guy?). There's some stuff coming in future browsers too.
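Validation can be as dull as an allowlist. The sketch below assumes a made-up policy for display names (1 to 40 letters, digits, spaces, dots, underscores, or hyphens); the pattern and the function name are illustrative, not a recommendation for your data. It complements output escaping rather than replacing it.

```python
import re

# Assumed policy: 1-40 characters from a small, boring alphabet.
NAME_PATTERN = re.compile(r"[A-Za-z0-9 ._-]{1,40}")


def validate_display_name(raw: str) -> str:
    """Return the trimmed name, or raise ValueError if it breaks the policy."""
    name = raw.strip()
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("display name contains unsupported characters")
    return name


print(validate_display_name("  your name "))

try:
    validate_display_name("<script>alert('xss')</script>")
except ValueError as err:
    print(f"rejected: {err}")
```

Rejecting outright, rather than trying to strip the bad parts, keeps the rule easy to reason about.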
This guy made me read a lot of text, and I'm not going to put you through that.
We love to love cookies, and Mallory knows it. That <img> tag we talked about
earlier, it might look like this:
<img src="http://your-domain.com/me/settings?delete_account=true"/>. (You're
probably not going to have an API like that but
bear with me). That resource is going to
be loaded using a GET request, and the browser is going to send your cookies right
along with it. That's right, your dad just keeps getting rekt.
This guy has a novel approach to web browser security as it pertains to XSRF attacks: load every page that requires credentials separately and make sure no other web pages are loaded. Sign out as soon as you are finished. Oh, and restart your browser after signing out, who knows what might be cached? No one is going to do that though.
We're going to help your dad out, because he's a nice guy.
If you're not into TL;DR, there's this: OWASP Cross-Site Request Forgery (CSRF) Prevention Cheat Sheet
Don't expose resource changing actions as GETs
- Not going to help against XSS + XHR + XSRF.
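A rough sketch of the idea, assuming a hypothetical handler for /me/settings that receives the HTTP method and the request parameters. An `<img>` tag can only trigger a GET, so the handler refuses to change state for "safe" methods.

```python
# Methods that should never change state on the server.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def handle_settings(method: str, params: dict[str, str]) -> tuple[int, str]:
    """Hypothetical /me/settings handler returning (status, body)."""
    if params.get("delete_account") == "true":
        if method.upper() in SAFE_METHODS:
            # The <img src="...delete_account=true"> trick ends up here.
            return 405, "Method Not Allowed"
        return 200, "account deleted"
    return 200, "settings page"


print(handle_settings("GET", {"delete_account": "true"}))
print(handle_settings("POST", {"delete_account": "true"}))
```

A cross-site form can still submit a POST, so this narrows the attack surface without closing it.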
Check the header for origin / referer
- Can be circumvented, historically by Flash and more recently by browser plugins. Also, some web proxies strip these headers for various reasons.
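For what it's worth, the check itself is short. This sketch assumes the headers arrive as a mapping with canonical `Origin` / `Referer` keys and that `ALLOWED_ORIGIN` is your own site's origin; both are assumptions for illustration.

```python
from typing import Mapping
from urllib.parse import urlsplit

ALLOWED_ORIGIN = "http://your-domain.com"  # assumed: the site's own origin


def origin_of(url: str) -> str:
    """Reduce a URL to scheme://host[:port]."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def is_same_origin_request(headers: Mapping[str, str]) -> bool:
    """Prefer Origin, fall back to Referer, reject when neither is present."""
    source = headers.get("Origin") or headers.get("Referer")
    if not source:
        return False
    return origin_of(source) == ALLOWED_ORIGIN


print(is_same_origin_request({"Origin": "http://your-domain.com"}))
print(is_same_origin_request({"Referer": "http://your-domain.com.evil.example/x"}))
```

Rejecting requests with neither header is the strict choice; since proxies sometimes strip them, it can also lock out legitimate users.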
Or, as this guy points out, anti-XSRF tokens are a way to validate all requests by generating unique ids and attaching them using e.g. hidden input fields in forms, or required request parameters. Ideally the id is generated per request, but for practical reasons (web apps) it can be per session, or alternatively stored in cookies with a predefined, short lifespan.
- The server generates a hard to guess id, and saves it as a cookie on the client.
- The client reads the cookie (the cookie needs to have httpOnly=false) and attaches it to the request (hidden input, header, queryparam).
- The server has access to both the cookie and the attached parameter, and only approves requests where both are present and match.
- A pure XSRF attack is thwarted because it can't read cookies that are from another domain, and you should have made sure guessing your generated id is hard.
This defense is still vulnerable to XSS + XSRF, because the XSS would be able to read your cookie. Dad's getting rekt again.
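The cookie-plus-parameter steps above can be sketched with the standard library. The cookie name `csrf_token` and both function names are made up for this example; `secrets` supplies the hard-to-guess id and `hmac.compare_digest` does the comparison in constant time.

```python
import hmac
import secrets
from typing import Mapping, Optional


def issue_csrf_token() -> str:
    """Generate a hard-to-guess id to be set as a cookie readable by the page."""
    return secrets.token_urlsafe(32)


def is_request_approved(cookies: Mapping[str, str], submitted: Optional[str]) -> bool:
    """Approve only when the cookie and the attached value both exist and match."""
    cookie_token = cookies.get("csrf_token")  # hypothetical cookie name
    if not cookie_token or not submitted:
        return False
    # Compare as bytes so arbitrary (non-ASCII) input cannot raise TypeError.
    return hmac.compare_digest(cookie_token.encode(), submitted.encode())


token = issue_csrf_token()
print(is_request_approved({"csrf_token": token}, token))    # cookie and field match
print(is_request_approved({"csrf_token": token}, "guess"))  # forged request
```

A forged cross-site request carries the cookie automatically but cannot read it, so it has nothing valid to put in the second slot.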
Cross-origin resource sharing (CORS)
Since I mentioned CORS a few times, here's the lowdown:
CORS is used when a site is including e.g. images from a domain outside the domain from which the request originated. This is useful. Some servers might send Access-Control-Allow-Origin: *, e.g. the server serving Google's fonts:
<link href='https://fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css'>.
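On the server side, the decision boils down to which value (if any) to send back in that header. A minimal sketch, assuming a hand-maintained allowlist; the origins and the `cors_headers` function are placeholders, not any framework's API.

```python
from typing import Optional

# Assumed allowlist; both origins are placeholders.
ALLOWED_ORIGINS = {"http://your-domain.com", "https://partner.example"}


def cors_headers(request_origin: Optional[str], public: bool = False) -> dict[str, str]:
    """Pick response headers for a cross-origin request."""
    if public:
        # Truly public resources (fonts, for instance) can be open to everyone.
        return {"Access-Control-Allow-Origin": "*"}
    if request_origin in ALLOWED_ORIGINS:
        # Echo the specific origin, and tell caches the response varies by it.
        return {"Access-Control-Allow-Origin": request_origin, "Vary": "Origin"}
    # No CORS headers: the browser keeps the response away from the calling page.
    return {}


print(cors_headers("https://partner.example"))
print(cors_headers("http://evil.example"))
```

The wildcard is fine for public assets; anything tied to a user's session is better served with the allowlist branch.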
Somewhat related reading
Some stuff I stumbled across in this recent quest for knowledge:
- Auth0 - Cookies vs Tokens: The Definitive Guide
- Stormpath - Where to Store your JWTs – Cookies vs HTML5 Web Storage
Furthermore, you should see what some people that are probably smarter, more structured, and likely prettier than me had to say:
- Wikipedia: Cross-site scripting
- Wikipedia: Cross-site request forgery (According to this article CSRF is "sometimes pronounced sea-surf" - wat)