Just to complement the other answer, an alternative is:
$string1 = 'string(10) "CURITIBA" string(11) "SP"';
preg_match_all('/"([^"]+)"/', $string1, $matches);
foreach ($matches[1] as $m) {
    echo $m . PHP_EOL;
}
The difference from the other answer is that the regex is "([^"]+)":
- at the beginning and the end we have the quotation marks
- in the middle we have [^"], which is a negated character class. It matches any character that is not a "
- the quantifier + means "one or more occurrences". It is different from *, which means "zero or more occurrences". That is, with * the regex also matches cases where there is nothing between the quotes, while with + it only matches cases where there is at least one character between them (see the difference here and here). Use whichever makes the most sense for you.
- the parentheses form a capture group, so the array of matches has a position that stores the text matched inside them (in this case $matches[1], because it is the first pair of parentheses, so it is the first capture group, which is at index 1)
The result is:
CURITIBA
SP
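To make the difference between + and * concrete, here is a minimal sketch. The input string is an assumption made up for this example (it ends with an empty pair of quotes); only the two patterns come from the explanation above.

```php
<?php
// Hypothetical input: one non-empty quoted value followed by an empty one.
$input = 'string(2) "SP" string(0) ""';

// + requires at least one character between the quotes.
preg_match_all('/"([^"]+)"/', $input, $plus);

// * also accepts an empty pair of quotes.
preg_match_all('/"([^"]*)"/', $input, $star);

var_dump($plus[1]); // contains only "SP"
var_dump($star[1]); // contains "SP" and an empty string
```

Note that when the empty pair is followed by more quoted text, the + version can pair a closing quote with the next opening quote, so check the results against your real data.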
The other difference is that [^"]+ is a little more efficient than .*?. This is because the dot matches any character, including quotation marks, which is why the ? is needed so that the quantifier * does not consume more characters than it should (see the difference here and here). And since the dot can match any character, including a quotation mark if necessary, the regex ends up testing many possibilities until it finds the match (the lazy quantifier, as *? is called, is very useful, but it has its price).
With [^"]+, on the other hand, the regex can simply move forward, because it no longer matches any character, only characters other than ". That guarantees that the quantifier stops as soon as it finds a ". This makes it more efficient; just compare the number of steps here and here.
Obviously, for small strings and a few runs it doesn't make much difference (the gain may be milliseconds or even less). But for larger strings, or for processes where the regex runs many times, it starts to matter (compare here and here). Note that the biggest difference is in the cases where the regex fails because the quotation marks are never closed: the dot generates many more possibilities to be tested, and the regex tests all of them until it finds a match or until it concludes that there is none.
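If you want to measure this yourself, a rough sketch is shown below. The input size, the number of runs and the helper timePattern are all assumptions invented for this example, and the timings will vary with the PHP version and PCRE settings (such as JIT), so treat the numbers as indicative only.

```php
<?php
// Hypothetical input: 10,000 quoted values, built only for this comparison.
$input = str_repeat('string(8) "CURITIBA" ', 10000);

// Hypothetical helper: runs a pattern $runs times and returns
// the captured values together with the elapsed time in seconds.
function timePattern(string $pattern, string $input, int $runs = 20): array
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        preg_match_all($pattern, $input, $matches);
    }
    return [$matches[1], microtime(true) - $start];
}

[$lazyMatches, $lazyTime] = timePattern('/"(.*?)"/', $input);
[$negatedMatches, $negatedTime] = timePattern('/"([^"]+)"/', $input);

// Both patterns extract the same values from this input.
var_dump($lazyMatches === $negatedMatches);
printf("lazy dot: %.4f s\nnegated class: %.4f s\n", $lazyTime, $negatedTime);
```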
Another difference is that, by default, the dot matches any character except line breaks, whereas [^"] matches line breaks too. So if the string contains a line break between the quotes, only the second regex finds a match (compare here and here). In that case, though, you can just add the s flag to the regex: '/"(.+?)"/s', so that the dot also matches line breaks.
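The line break behaviour can be checked with a short sketch like the one below. The input string, with a line break between the quotes, is an assumption made up for this example.

```php
<?php
// Hypothetical input: a quoted value that spans two lines.
$input = "\"CURITIBA\nPR\"";

// Without the s flag, the dot does not match the line break.
preg_match_all('/"(.+?)"/', $input, $dot);

// With the s flag, the dot also matches line breaks.
preg_match_all('/"(.+?)"/s', $input, $dotAll);

// The negated character class matches line breaks without any flag.
preg_match_all('/"([^"]+)"/', $input, $negated);

var_dump($dot[1]);     // empty array: no match
var_dump($dotAll[1]);  // one match, containing the line break
var_dump($negated[1]); // same single match
```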
If you want to be more specific, you can use something like:
preg_match_all('/"([A-Z]+)"/', $string1, $matches);
Now the regex only matches cases where there are just uppercase letters between the quotes ([A-Z]+ is "one or more letters from A to Z"). It would make a difference if you had cases like "123" and wanted to ignore them, for example.
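As a small illustration of that difference, the sketch below runs both patterns over an input that includes "123". The input string is an assumption made up for this example.

```php
<?php
// Hypothetical input: two uppercase values and one numeric value.
$input = 'string(8) "CURITIBA" string(3) "123" string(2) "SP"';

// Only uppercase letters are accepted between the quotes.
preg_match_all('/"([A-Z]+)"/', $input, $lettersOnly);

// Anything other than a quote is accepted between the quotes.
preg_match_all('/"([^"]+)"/', $input, $anything);

var_dump($lettersOnly[1]); // CURITIBA and SP
var_dump($anything[1]);    // CURITIBA, 123 and SP
```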
Using .* may seem easier, but you don't always want "anything". Often you have a well-defined set of characters that you want to consider (or ignore), and in general it is better for the expression to say exactly what you want and what you don't want.
Note: your regex had two quotes at the beginning and two at the end, which is why it didn't find anything.
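When a pattern finds nothing, reading $matches[1] directly is what produces Notice: Undefined offset: 1. A minimal sketch of a guard is shown below. The pattern with doubled quotes is an assumption that only imitates the problem described in the note, since the original pattern is not reproduced here.

```php
<?php
$string1 = 'string(10) "CURITIBA" string(11) "SP"';

// Assumed for illustration: doubled quotes, which never occur in $string1.
$pattern = '/""([^"]+)""/';

// preg_match returns 1 only when the pattern matched,
// so index 1 is read only when it is guaranteed to exist.
if (preg_match($pattern, $string1, $matches) === 1) {
    $result = $matches[1];
} else {
    $result = null; // no match: $matches is empty and has no index 1
}

var_dump($result);
```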
This is happening because your $pattern is incorrect. Let's wait for the people who know regular expressions to help you with this issue. – Victor Carnaval
OK, I don't understand much about regular expressions, I'll wait :) – buddy-stack
Ideally you would edit the question to something like "Get words in quotes through regular expression", because the alert Notice: Undefined offset: 1 is due to the fact that the array $matches does not have the index 1. – Victor Carnaval
Thanks for the tip, I'll change it! – buddy-stack