Should I escape shell arguments in Perl?

When using system() calls in Perl, do you have to escape the shell args, or is that done automatically?

The arguments will be user input, so I want to make sure this isn't exploitable.


If you use system $cmd, @args rather than system "$cmd @args" (a list rather than a single string), you do not have to escape the arguments, because no shell is invoked (see system). One caveat: if @args is empty, system $cmd, @args is just system $cmd, which does go through a shell when $cmd contains metacharacters. system {$cmd} $cmd, @args never invokes a shell, even if $cmd contains metacharacters and @args is empty (this is documented as part of exec). If the arguments come from user input (or another untrusted source), you will still want to untaint them. See -T in the perlrun docs, and the perlsec docs.
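As a rough illustration of the list form, here is a minimal sketch. The helper name run_no_shell is made up for this example, and $^X (the path of the running perl) is used only as a convenient command to call; the example assumes a Unix-like system.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without ever involving a shell.
# Returns the command's exit code, or -1 if it could not be started
# or was killed by a signal.
sub run_no_shell {
    my ($cmd, @args) = @_;
    # The indirect-object form bypasses the shell even when @args is empty.
    my $rc = system { $cmd } $cmd, @args;
    return -1 if $rc == -1 || ($rc & 127);
    return $rc >> 8;
}

# A hostile-looking value: with a shell, the ';' and '>' would be interpreted.
my $untrusted = q{foo; echo injected > /tmp/oops};

# The child exits 0 only if it received exactly one argument.
my $status = run_no_shell($^X, '-e', 'exit(@ARGV == 1 ? 0 : 1)', $untrusted);
print "exit status: $status\n";
```

Because the value is handed to the child as a single argument, nothing in it is interpreted as shell syntax. This does not replace untainting; it only removes the shell from the picture.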

If you need to read the command's output or send it input, qx and readpipe have no list-form equivalent. Instead, use open my $output, "-|", $cmd, @args or open my $input, "|-", $cmd, @args. This may not be portable, since it relies on a real fork, which I think means Unix only; it might work on Windows with its simulated fork. A better option is something like IPC::Run, which also handles piping commands into other commands, something neither the multi-argument form of system nor the four-argument form of open can do.
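A minimal sketch of the list form of a pipe open, for reading a command's output without a shell. The helper name capture_no_shell is hypothetical, and the example again assumes a Unix-like system and uses $^X only as a stand-in command.

```perl
use strict;
use warnings;

# Hypothetical helper: capture a command's STDOUT without a shell.
sub capture_no_shell {
    my ($cmd, @args) = @_;
    # List form of the "-|" pipe open: $cmd is executed directly with @args.
    open(my $out, '-|', $cmd, @args)
        or die "Cannot run $cmd: $!";
    my @lines = <$out>;
    # close() on a pipe waits for the child and sets $? to its status.
    close($out) or warn "$cmd exited with status $?\n";
    return @lines;
}

# With a shell, $(...) and backticks would trigger command substitution.
my $untrusted = q{$(id); `id`};
my @lines = capture_no_shell($^X, '-e', 'print "$ARGV[0]\n"', $untrusted);
print @lines;
```

The child simply echoes its first argument back, so the untrusted text comes through unchanged rather than being executed.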

On Windows, the situation is a bit nastier. Basically, all Win32 programs receive one long command-line string. The shell (usually cmd.exe) may do some interpretation first, removing < and > redirections for example, but it does not split the string at word boundaries for the program. Each program must do this parsing itself (if it wishes; some programs don't bother). In C and C++ programs, routines provided by the runtime libraries supplied with the compiler toolchain will generally perform this parsing step before main() is called.

The problem is, in general, you don't know how a given program will parse its command line. Many programs are compiled with some version of MSVC++, whose quirky parsing rules are described here, but many others are compiled with different compilers that use different conventions.

This is compounded by the fact that cmd.exe has its own quirky parsing rules. The caret (^) is treated as an escape character that quotes the following character, and text inside double quotes is treated as quoted if a list of tricky criteria is met (see cmd /? for the full gory details). If your command contains any strange characters, it's very easy for cmd.exe's idea of which parts of the text are "quoted" and which aren't to get out of sync with your target program's, and all hell breaks loose.

So, the safest approach for escaping arguments on Windows is:

  1. Escape arguments in the manner expected by the command-line parsing logic of the program you're calling. (Hopefully you know what that logic is; if not, try a few examples and guess.)
  2. Join the escaped arguments with spaces.
  3. Prefix every single non-alphanumeric character of the resulting string with ^.
  4. Append any redirections or other shell trickery (e.g. joining commands with &&).
  5. Run the command with system() or backticks.
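Steps 1 to 3 above might look roughly like the sketch below. It assumes the target program follows the usual MSVC runtime parsing rules (backslashes are only special before a double quote), which will not be true of every program. The helper names quote_msvc_arg and build_cmd_line are made up for this example, and the program name is assumed to be a trusted, plain word.

```perl
use strict;
use warnings;

# Step 1: quote one argument the way MSVC-compiled programs usually parse it.
sub quote_msvc_arg {
    my ($arg) = @_;
    $arg =~ s/(\\*)"/$1$1\\"/g;   # double any backslashes before a quote, then escape the quote
    $arg =~ s/(\\+)\z/$1$1/;      # double trailing backslashes so they do not escape the closing quote
    return qq{"$arg"};
}

# Steps 2 and 3: join with spaces, then caret-escape for cmd.exe.
sub build_cmd_line {
    my ($prog, @args) = @_;
    my $line = join ' ', $prog, map { quote_msvc_arg($_) } @args;
    $line =~ s/([^A-Za-z0-9])/^$1/g;   # prefix every non-alphanumeric character with ^
    return $line;
}

my $line = build_cmd_line('find', 'a&b');
print "$line\n";

# Step 4 would append redirections to $line here (unescaped),
# and step 5 would pass the result to system() or backticks.
```

The caret pass runs over the quotes as well, so cmd.exe never enters its own quoted mode and simply strips the carets before handing the string to the program.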

 sub esc_chars {
     # Backslash-escape shell metacharacters; for example, a!!a becomes a\!\!a.
     my ($str) = @_;
     $str =~ s/([;<>\*\|`&\$!#\(\)\[\]\{\}:'"])/\\$1/g;
     return $str;
 }

If you use system "$cmd @args" (a single string), then you have to escape the arguments, because Perl passes the string to a shell whenever it contains shell metacharacters.

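Fortunately, in a Perl double-quoted string literal, only four characters need escaping:

"    - double quote
$    - dollar
@    - at symbol
\    - backslash

Note that these are Perl's own quoting rules. Once the interpolated string reaches a shell, the shell's rules apply: inside shell double quotes, the backtick is also special, and @ is not.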

The answers to this question were very useful. In the end I followed @runrig's advice, but then used open3() from the core module IPC::Open3 so I could capture the output from STDERR as well as STDOUT.

For sample code of open3() in use with @runrig's solution, see my related question and answer: Calling system commands from Perl
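For readers who just want the general shape, here is a minimal sketch of open3() with a list of arguments. It is not the code from the linked answer; the helper name capture_both is hypothetical, and the example assumes a Unix-like system and small outputs.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);
use Symbol qw(gensym);

# Hypothetical helper: run a command without a shell and collect
# its exit code, STDOUT lines and STDERR lines.
sub capture_both {
    my ($cmd, @args) = @_;
    my $err = gensym;    # open3 needs a pre-made handle for STDERR
    my $pid = open3(my $in, my $out, $err, $cmd, @args);
    close $in;           # nothing to send to the child
    # Reading the two handles one after the other is fine for small
    # outputs, but can deadlock if the child writes a lot to STDERR.
    my @stdout = <$out>;
    my @stderr = <$err>;
    waitpid($pid, 0);
    my $exit = $? >> 8;
    return ($exit, \@stdout, \@stderr);
}

my ($exit, $stdout, $stderr) =
    capture_both($^X, '-e', 'print "out\n"; print STDERR "err\n"; exit 2');
print "exit=$exit stdout=@$stdout stderr=@$stderr";
```

For commands that may produce a lot of output on both streams, something like IO::Select or IPC::Run is a safer way to read them concurrently.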
