Capture words using Regular Expressions

2

I have the following text example:

MeusRelatoriosPDF001-Capa-Disclaimer-22012017

Using PHP regular expressions, I want to capture the texts

"Capa", "Disclaimer" and "22012017".

I’m trying to use the function preg_match_all() as follows:

preg_match_all("MeusRelatoriosPDF001-(\w*)-(\w*)-(\w*)",$links,$array);

Here $links holds the text, with the parts separated by hyphens as shown. Note that the three parts are not always all present. For example, $links might contain only

"MeusRelatoriosPDF001-Capa-Disclaimer" or "MeusRelatoriosPDF001-Capa"

The error I get is

"Warning: preg_match_all(): Delimiter must not be alphanumeric or backslash".

Can someone help me capture these texts and put each one in its own position of $array?

  • That error means the regex is missing its delimiters, for example a / at the beginning and another at the end.

  • I tried it as follows: preg_match_all("/MeusRelatoriosPDF001-(\w*)-(\w*)-(\w*)/",$links,$array); and everything in the $array variable came back empty.

  • 1

    Wouldn't an explode() on - be simpler, or does that not solve the problem?
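
For what it's worth, here is a minimal sketch of the explode() idea from the comment above. The helper name splitLinkParts is hypothetical (it is not from the original post), and the sketch assumes the prefix itself never contains a hyphen:

```php
<?php
// Hypothetical helper: split the string on "-" and drop the leading prefix.
function splitLinkParts(string $link): array
{
    $parts = explode('-', $link);

    // The first element is the "MeusRelatoriosPDF001" prefix; keep the rest.
    return array_slice($parts, 1);
}

print_r(splitLinkParts('MeusRelatoriosPDF001-Capa-Disclaimer-22012017'));
print_r(splitLinkParts('MeusRelatoriosPDF001-Capa'));
```

This returns however many parts are present, so the shorter inputs simply yield shorter arrays. It does not check that the prefix is actually MeusRelatoriosPDF001, which a regex would.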

2 answers

4

Every PCRE-based regular expression must be wrapped in delimiters. The most common delimiter is the slash /, but any other character that is not alphanumeric, a backslash, or whitespace can be used.

You can simplify your regex and capture in a single group only the words preceded by a dash -. The captures you want are in group 1, so you access them through index 1, e.g. $m[1][0] or $m[1][1].
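
As a quick illustration of the delimiter rule, the sketch below runs the same pattern with three different delimiters. The variable names ($subject, $patterns, $results) are only for this example:

```php
<?php
$subject = 'MeusRelatoriosPDF001-Capa';

// The same pattern written with three different delimiters: /, # and ~.
$patterns = ['/-(\w+)/', '#-(\w+)#', '~-(\w+)~'];

$results = [];
foreach ($patterns as $pattern) {
    preg_match($pattern, $subject, $m);
    $results[] = $m[1]; // group 1 is the word after the dash
}

print_r($results);
```

All three patterns behave the same way; the delimiter is only a matter of which character is least awkward to escape inside the pattern.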

$link = array('MeusRelatoriosPDF001-Capa-Disclaimer', 'MeusRelatoriosPDF001-Capa', 'MeusRelatoriosPDF001-Capa-Disclaimer-22012017');

foreach ($link as $item){
    // Group 1 captures every word that follows a dash.
    preg_match_all("#-(\w+)#", $item, $m);

    echo "<pre>";
    print_r($m);
    echo "</pre>";
}
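
If you want each captured word in its own variable, one possible sketch is below. Padding the missing positions with null is my own choice, not something the question requires, and the variable names $first, $second and $third are illustrative:

```php
<?php
$item = 'MeusRelatoriosPDF001-Capa-Disclaimer';

preg_match_all('#-(\w+)#', $item, $m);

// $m[0] holds the full matches ("-Capa", ...); $m[1] holds only the captured words.
// array_pad() fills the positions that were not present with null.
[$first, $second, $third] = array_pad($m[1], 3, null);

var_dump($first, $second, $third);
```
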
  • Just as a comment, the flag i has no effect here, since you are capturing \w.

  • 1

    @Dineirockenbach true, corrected. Thank you.

1

Assuming that your links are all in one string, separated by \n.

You can use this regex: ~MeusRelatoriosPDF\d+\-(\w+)(?:\-(\w+))?(?:\-(\w+))?~

$links = 
"MeusRelatoriosPDF001-Capa-Disclaimer-22012017
MeusRelatoriosPDF001-Capa-Disclaimer
MeusRelatoriosPDF001-Capa";

preg_match_all('~MeusRelatoriosPDF\d+\-(\w+)(?:\-(\w+))?(?:\-(\w+))?~', $links, $match);

print_r($match);

Explanation

  • MeusRelatoriosPDF - matches the literal text MeusRelatoriosPDF
  • \d+ - matches a sequence of digits; I left it a bit generic so it also works for other files.
  • \-(\w+) - matches a literal - and captures the word that follows in a group
  • (?:...)? - optional non-capturing group; it matches only if that part is present
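
To get one row per link instead of one array per group, a possible variation is to pass PREG_SET_ORDER. This is only a sketch: the $rows variable and the choice to pad missing parts with an empty string are my assumptions, not part of the answer above.

```php
<?php
$links = "MeusRelatoriosPDF001-Capa-Disclaimer-22012017\n"
       . "MeusRelatoriosPDF001-Capa-Disclaimer\n"
       . "MeusRelatoriosPDF001-Capa";

$pattern = '~MeusRelatoriosPDF\d+-(\w+)(?:-(\w+))?(?:-(\w+))?~';

// PREG_SET_ORDER groups the result by match: one sub-array per link.
preg_match_all($pattern, $links, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $match) {
    // Drop the full match at index 0, then normalise to exactly three entries.
    // Optional groups that did not take part in the match end up as ''.
    $rows[] = array_pad(array_slice($match, 1), 3, '');
}

print_r($rows);
```

Each row then always has three positions, so the shorter links can be told apart by checking for the empty string.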
