If the text always has this format, you can simply capture the "R$ ..." parts:
function formatar($preco) {
    // Remove the thousands separator, then format with two decimal places.
    return number_format((float) str_replace('.', '', $preco), 2, '.', '');
}

// Double quotes so that \n is an actual line break.
$texto = "Melhor preço sem escalas R$ 1.367\n-Melhor preço com escalas R$ 994";
if (preg_match('/sem escalas R\$ (\d+(?:\.\d{3})*).*com escalas R\$ (\d+(?:\.\d{3})*)/s', $texto, $matches)) {
    echo "Sem escalas: " . formatar($matches[1]);
    echo "\nCom escalas: " . formatar($matches[2]);
}
\d+ matches one or more digits, and it is followed by \.\d{3} (a dot followed by 3 digits). That whole part is grouped in parentheses and given the quantifier * (zero or more occurrences), so "a dot followed by 3 digits" can repeat zero or more times. That may be overkill, since a ticket will not cost a million reais or more, so it could also be (\d+(?:\.\d{3})?), where the ? makes the group optional.
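To make the difference between the two quantifiers concrete, here is a minimal sketch. The sample values and variable names are hypothetical, and I added the anchors ^ and $ only so that each sample is tested as a whole string:

```php
<?php
// Hypothetical sample values, not taken from the original text.
$amostras = ['994', '1.367', '1.234.567'];

$comAsterisco = '/^\d+(?:\.\d{3})*$/';    // any number of ".ddd" groups
$comInterrogacao = '/^\d+(?:\.\d{3})?$/'; // at most one ".ddd" group

foreach ($amostras as $valor) {
    printf(
        "%-10s with *: %-8s with ?: %s\n",
        $valor,
        preg_match($comAsterisco, $valor) ? 'match' : 'no match',
        preg_match($comInterrogacao, $valor) ? 'match' : 'no match'
    );
}
```

Only the last sample behaves differently: the * version accepts 1.234.567, while the ? version rejects it.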
The whole part I care about (the numeric value) is in parentheses, because that forms a capture group that I can retrieve afterwards. The first price ("sem escalas", i.e. without stopovers) ends up in the first group ($matches[1]) and the second price in $matches[2]. The "dot followed by 3 digits" part, on the other hand, starts with (?: which makes it a non-capturing group, so I don't create extra entries in the $matches array; I'm only interested in the full prices.
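A small sketch of what the non-capturing group avoids, using a shortened hypothetical input (the variable names $comCaptura and $semCaptura are mine):

```php
<?php
$texto = 'R$ 1.367';

// Capturing inner group: the last ".ddd" repetition also lands in the array.
preg_match('/R\$ (\d+(\.\d{3})*)/', $texto, $comCaptura);
print_r($comCaptura); // entries: 'R$ 1.367', '1.367', '.367'

// Non-capturing inner group: only the full price is captured.
preg_match('/R\$ (\d+(?:\.\d{3})*)/', $texto, $semCaptura);
print_r($semCaptura); // entries: 'R$ 1.367', '1.367'
```

With the plain parentheses, index 2 holds a fragment of the price that I would never use, and every later group would shift by one position.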
I also use .* (zero or more characters), and the s flag makes the dot match line breaks as well (since the two texts appear to be on different lines).
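The effect of the s flag can be checked in isolation. This is only a sketch with a shortened, hypothetical input containing a real line break:

```php
<?php
// Double quotes so that \n is a real line break.
$texto = "sem escalas R$ 1.367\ncom escalas R$ 994";

// Without the s flag, the dot stops at the line break and the match fails.
$semFlag = preg_match('/sem escalas.*com escalas/', $texto);

// With the s flag, the dot also matches the line break.
$comFlag = preg_match('/sem escalas.*com escalas/s', $texto);

var_dump($semFlag); // int(0)
var_dump($comFlag); // int(1)
```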
With the prices in hand, I can format them however I see fit. When formatting, I removed the dot, because when the string is converted to a number the dot is treated as the decimal separator (so 1.367 would be interpreted as "one point three six seven" and not as "one thousand three hundred and sixty-seven"). Then I format the number with two decimal places, using the dot as the decimal separator and no thousands separator (see the documentation of number_format for more details).
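The conversion problem is easy to see on its own. The sketch below uses hypothetical variable names, and the last line is just one possible variation (Brazilian-style display), not something the original question asked for:

```php
<?php
$bruto = '1.367';

// Converting directly: the dot is read as the decimal separator.
$errado = (float) $bruto;

// Removing the dot first gives the intended value.
$certo = (float) str_replace('.', '', $bruto);

echo $errado, "\n"; // 1.367
echo $certo, "\n";  // 1367

// Two decimals, dot as decimal separator, no thousands separator.
echo number_format($certo, 2, '.', ''), "\n"; // 1367.00

// Variation: comma as decimal separator, dot as thousands separator.
echo number_format($certo, 2, ',', '.'), "\n"; // 1.367,00
```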
The output of the code is:
Sem escalas: 1367.00
Com escalas: 994.00
Your regex ^[0-9]$ doesn't work because it uses the anchors ^ and $ (respectively the start and end of the string), so it only checks whether the string consists of a single digit (i.e., the string can contain only one character, which must be a digit from 0 to 9).
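A quick way to see this behaviour, as a sketch with hypothetical inputs (the unanchored [0-9]+ in the last call is only there for contrast):

```php
<?php
// A single digit is the only kind of string the anchored regex accepts.
$umDigito = preg_match('/^[0-9]$/', '7');

// Anything longer fails, even if it contains digits.
$textoComPreco = preg_match('/^[0-9]$/', 'R$ 994');

// Without the anchors, and with +, the digits inside the text are found.
$semAncoras = preg_match('/[0-9]+/', 'R$ 994', $m);

var_dump($umDigito);      // int(1)
var_dump($textoComPreco); // int(0)
var_dump($semAncoras);    // int(1)
echo $m[0], "\n";         // 994
```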
Tested and approved! https://ideone.com/iHvAKx
– user60252