The issue: I'm writing a library that receives a user-supplied regex containing an unknown number of capture groups, to be run against other input, and I want to extract the values of all capture groups concatenated into one string (for further processing elsewhere).
It is trivial if the number of capture groups is known in advance, as I just specify them:
#!/usr/bin/perl -w
my $input = `seq -s" " 100 200`;
my $user_regex =
qr/100(.*)103(.*)107(.*)109(.*)111(.*)113(.*)116(.*)120(.*)133(.*)140(.*)145/;
if ($input =~ $user_regex) { print "$1 $2 $3 $4 $5 $6 $7 $8 $9 $10\n"; }
correctly produces (ignore the extra whitespace):
101 102 104 105 106 108 110 112 114 115 117 118 119
121 122 123 124 125 126 127 128 129 130 131 132
134 135 136 137 138 139 141 142 143 144
However, if there are more than 10 capture groups, I lose data unless I modify the code. As the number of capture groups is unknown, I currently go with hundreds of manually specified matches ("$1" to "$200") under the no warnings pragma and hope that is enough, but it does not seem particularly clean or robust.
Ideally, I'd like something that works like values %+ does for named capture groups, but for unnamed capture groups. Is this possible in Perl 5.24? Or what less kludgy approach would you recommend for retrieving the contents of all numbered capture groups?
Answer
Maybe you can just capture into an array?
my @captured = $input =~ $user_regex;
if( @captured ) { print join " ", @captured; print "\n"; }
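Two edge cases are worth handling with the array approach: a group that did not take part in the match comes back as undef, and a regex with no capture groups at all returns the list (1) on success. A minimal sketch of one way to cover both, assuming a hypothetical helper named joined_captures (not part of the original answer) and relying on $#+ holding the number of groups in the last successful match:

```perl
use strict;
use warnings;

# Hypothetical helper: returns every numbered capture joined by $sep,
# or nothing (undef in scalar context) if the regex does not match.
sub joined_captures {
    my ($input, $regex, $sep) = @_;

    # List-context match: returns ($1, $2, ...) on success, () on failure.
    my @captured = $input =~ $regex or return;

    # $#+ is the number of capture groups in the last successful match.
    # With zero groups the match returned (1), which is not a capture.
    return '' if $#+ == 0;

    # Drop groups that did not participate (they are undef).
    return join $sep, grep { defined } @captured;
}

print joined_captures("abc", qr/(.)(.)(.)/, " "), "\n";
```

Whether to drop undefined groups or substitute an empty string depends on what the downstream processing expects; the grep here is only one choice.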
If you absolutely must use the numbered capture variables, use eval:
my $input = "abc";
my $re = qr/(.)(.)(.)/;
if( $input =~ $re){
my $num = 1;
print "captured \$$num = ". eval("\$$num") ."\n" and $num++
while eval "defined \$$num";
}
Or just:
my $input = "abc";
my $re = qr/(.)(.)(.)/;
if( $input =~ $re){
my $num = 1;
print "captured \$$num = $$num\n" and $num++ while defined $$num;
}
...but this last example relies on symbolic references, so it doesn't work under use strict.
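If the group numbers matter and strict has to stay enabled, the match offset arrays @- and @+ offer another route without eval: $N is equivalent to substr($input, $-[N], $+[N] - $-[N]). A rough sketch, assuming a hypothetical helper named numbered_captures (not from the original answer):

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [group number, value] pairs
# for every capture group that took part in the match.
sub numbered_captures {
    my ($input, $re) = @_;
    return unless $input =~ $re;

    my @pairs;
    # $#+ is the number of capture groups in the last successful match.
    for my $num (1 .. $#+) {
        # $-[$num] is undef when group $num did not participate.
        next unless defined $-[$num];
        push @pairs, [ $num, substr($input, $-[$num], $+[$num] - $-[$num]) ];
    }
    return @pairs;
}

for my $pair (numbered_captures("abc", qr/(.)(.)(.)/)) {
    my ($num, $value) = @$pair;
    print "captured \$$num = $value\n";
}
```

Unlike the while defined loops above, this does not stop at the first group that failed to participate, so later groups are still reported.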