regex - perl - splitting a string with quoted characters


Keywords:regex 


Question: 

I want to split the following string at the pipe character, without splitting at the escaped pipe:

"123|ABC|x\|yz|123"  should result in ["123","ABC","x|yz","123"]

Does anyone have such a split regexp for Perl?


1 Answer: 

You could use a negative lookbehind:

use warnings 'all';
use strict;

use Data::Dumper;

my $str = '123|ABC|x\|yz|123';
my @bits = split /(?<!\\)\|/, $str;

print Dumper(@bits);

Results in:

$VAR1 = '123';
$VAR2 = 'ABC';
$VAR3 = 'x\\|yz';
$VAR4 = '123';
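
Note that the third element still carries its escaping backslash, whereas the question asked for x|yz. A minimal sketch of one way to drop it, assuming the backslash is only ever used to escape a pipe, is to run a substitution over the pieces after splitting:

```perl
use strict;
use warnings;

my $str  = '123|ABC|x\|yz|123';

# Split on pipes that are not preceded by a backslash.
my @bits = split /(?<!\\)\|/, $str;

# Remove the backslash in front of each escaped pipe, in place.
s/\\\|/|/g for @bits;

print join(',', @bits), "\n";   # 123,ABC,x|yz,123
```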

As pointed out by Wiktor, if your string was of the form:

my $str = '123|ABC|x\|yz|123\\|456|123\\345';

The 123\\ would be grouped with 456 (although the last string, 123\\345, would be okay):

$VAR1 = '123';
$VAR2 = 'ABC';
$VAR3 = 'x\\|yz';
$VAR4 = '123\\|456';
$VAR5 = '123\\345';

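This is because the negative lookbehind only checks the single character immediately before the pipe. Any pipe preceded by a backslash is treated as escaped, even when that backslash is itself the second half of an escaped backslash (\\) and the pipe is really a delimiter.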
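
If escaped backslashes do need to be supported, one possible approach (a sketch, not the only way) is to stop using split and instead walk the string token by token, so that each backslash consumes the character that follows it. The helper name split_escaped is made up for this example. Also note that inside a single-quoted Perl literal, \\ denotes one backslash, so the sample below writes \\\\ to put two real backslashes into the string.

```perl
use strict;
use warnings;

# Hypothetical helper: split on unescaped pipes, treating a backslash
# as an escape for whatever character follows it.
sub split_escaped {
    my ($str) = @_;
    my @fields = ('');
    while ($str =~ /\G(?:\\(.)|(\|)|([^\\|]+|\\\z))/gs) {
        if    (defined $1) { $fields[-1] .= $1 }  # escaped character, kept literally
        elsif (defined $2) { push @fields, '' }   # bare pipe starts a new field
        else               { $fields[-1] .= $3 }  # plain text, or a lone trailing backslash
    }
    return @fields;
}

# \\\\ in single quotes is two backslashes in the actual string.
my @parts = split_escaped('123|ABC|x\|yz|123\\\\|456|123\\\\345');

print join(',', @parts), "\n";   # 123,ABC,x|yz,123\,456,123\345
```

Here 123\ and 456 come out as separate fields, and the escapes are removed along the way. Unlike split, this sketch keeps trailing empty fields and returns a single empty field for an empty string.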