Regular expression to remove duplicate backslashes
I have the following task. I have a lot of strings that can contain duplicate backslashes. I need to replace any run of backslashes (of any length) with a single backslash, but when the run is followed by a quote ('), a double quote ("), or a NUL byte, all of the backslashes should be removed. Thanks. My language is PHP. Some tests:
$s1 = 'test\\\\string'; // test\string
$s2 = 'test\\\\\"\\\\\'\\\\string'; // test"'\string
$s3 = 'test\\string\\\\\"'; // test\string"
You can use:
$string = preg_replace("~\\\\+([\"\'\\x00\\\\])~", "$1", $string);
This replaces any run of \ with a single \, and removes the run entirely when it is followed by ", ', or a NUL byte.
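The pattern consists of one or more initial backslashes (\\\\+) and a following symbol that is one of ", ', \x00, or \. The replacement $1 keeps only that following symbol, which effectively removes any preceding backslashes. When the following symbol is itself a backslash, the run collapses to a single backslash.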
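As a quick check, the pattern can be run against the three test strings from the question. This is only a minimal sketch: the helper name collapse_backslashes is made up for illustration and is not a PHP built-in.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of PHP):
// it only wraps the preg_replace() call from the answer.
function collapse_backslashes(string $string): string
{
    return preg_replace("~\\\\+([\"\'\\x00\\\\])~", "$1", $string);
}

// Inputs are the test strings from the question (single-quoted literals);
// the values are the expected results, also written as single-quoted literals.
$tests = [
    'test\\\\string'             => 'test\\string',
    'test\\\\\"\\\\\'\\\\string' => 'test"\'\\string',
    'test\\string\\\\\"'         => 'test\\string"',
];

foreach ($tests as $input => $expected) {
    $actual = collapse_backslashes($input);
    echo $input, ' => ', $actual, ($actual === $expected ? ' (ok)' : ' (MISMATCH)'), PHP_EOL;
}
```

Note that a single backslash followed by an ordinary character, as in the first part of $s3, is left untouched because the pattern only matches when the run is followed by ", ', NUL, or another backslash.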
You need four backslashes in your source code. Two backslashes (\\) in a PHP string literal produce one backslash (\) in the string that is handed to the regular expression engine, because the PHP interpreter uses the backslash to escape special characters like " or \. For the same reason, the regular expression engine in turn needs two backslashes to match one literal backslash.
Or explained the other way around: to match one or more literal backslashes, PCRE needs \\+, because a single backslash in front of the + would escape the + instead. To get \\+ into a PHP string, you then have to put one more backslash in front of each backslash, so that the PHP interpreter does not treat the first backslash as escaping the second.
source code:                      \\\\+
inside regular expression string: \\+
pattern matches:                  \ (one or more times)
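These three levels can be inspected directly in PHP. The sketch below is illustrative only; the variable names are arbitrary. It also shows that switching to single quotes does not reduce the number of backslashes here, since \\ is an escape sequence in single-quoted strings as well.

```php
<?php
// Level 1: what you type in the source code (four backslashes and a plus).
$pattern = "\\\\+";

// Level 2: what PHP stores and PCRE receives: \\+ (3 characters).
var_dump($pattern);             // string(3) "\\+"
var_dump(strlen($pattern));     // int(3)

// Single quotes do not help: \\ still yields one backslash.
var_dump('\\\\+' === $pattern); // bool(true)

// Level 3: what the pattern matches, a run of one or more literal backslashes.
// The subject literal below is a\\\b (three backslashes) once PHP has parsed it.
preg_match('~' . $pattern . '~', 'a\\\\\\b', $m);
var_dump(strlen($m[0]));        // int(3)
```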