Parsing HTML with regex is a bad idea and can lead to madness. There are many ways a regex can fail to read HTML: upper- or lower-case tags, extra spaces between classes, line breaks between HTML elements, and so on.
"Regex" stands for "regular expression", and HTML is not a regular language, so a regex-based approach will invariably break somewhere.
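To make those failure modes concrete, here is a minimal sketch. The pattern and the sample strings are made up for illustration; they are not taken from the question. A naive pattern matches only the exact spelling it was written for and misses equivalent markup:

```php
<?php
// Hypothetical naive pattern: expects one exact spelling of the tag.
$pattern = '/<span class="priceText wide UK">([^<]*)<\/span>/';

$variants = [
    '<span class="priceText wide UK">1.2</span>',     // the exact form the pattern expects
    '<SPAN class="priceText wide UK">1.2</SPAN>',     // upper-case tag name
    '<span class="priceText  wide UK">1.2</span>',    // extra space between classes
    "<span\n class=\"priceText wide UK\">1.2</span>", // line break inside the tag
];

$results = [];
foreach ($variants as $html) {
    // preg_match() returns 1 on a match and 0 when nothing matches
    $results[] = preg_match($pattern, $html);
}

var_dump($results); // only the first variant matches
```

All four strings mean the same thing to a browser, but only the first one satisfies the pattern.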
That said...
The best way is to use a real "parser". Fortunately, there are several options in PHP.
I recommend DOMDocument and DOMXPath, which ship with PHP by default. Here's an example:
HTML
$html = '
<html>
<head></head>
<body>
<div id="isOffered">
<a class="price addBetButton footballBetButton" id="bk_82285689_mk_sel1" href="">
<span class="priceText wide UK">1.2</span>
<span class="priceText wide EU">1.50</span>
<span class="priceText wide US">200</span>
<span class="priceText wide CH">1.50</span>
<span class="priceChangeArrow"></span>
<input class="betCode" type="hidden" value="0]SK@82285689@314222649@NB*1~2*0*-1*0*0]CPN:0" />
<input class="originalBetCode" type="hidden" value="0]SK@82285689@314222649@NB*1~2*0*-1*0*0]CPN:0" />
</a>
</div>
</body>
</html>';
PHP code
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// Select the spans that are children of div#isOffered > a,
// keeping only those whose class list contains 'priceText'
$nodeList = $xpath->query("//div[@id='isOffered']/a/span[contains(concat(' ', @class, ' '), ' priceText ')]");
foreach ($nodeList as $node) {
    if ($node instanceof \DOMElement) {
        // Read the span's value and convert it to a float
        $value = (float) $node->nodeValue;
        // Change the span's value
        $node->nodeValue = (string) ($value * 0.8);
        var_dump($node->nodeValue);
    }
}
// Serialize the modified HTML document
// and store it in the variable $newHtml
$newHtml = $doc->saveHTML();
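Serializing the whole document returns everything, including the DOCTYPE and the html/body wrappers. If you only need the modified fragment, one option is to pass a node to saveHTML(). This is a minimal sketch with a shortened, made-up version of the markup above; the variable names are only illustrative:

```php
<?php
// Shortened sample markup (an assumption for this sketch)
$html = '<html><body><div id="isOffered"><a href=""><span class="priceText wide EU">1.50</span></a></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Locate the container element whose markup we want back
$xpath = new DOMXPath($doc);
$div = $xpath->query("//div[@id='isOffered']")->item(0);

// Passing a node serializes only that node and its descendants
$fragment = $doc->saveHTML($div);

echo $fragment, PHP_EOL;
```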
To keep DOMDocument from emitting warnings on HTML documents with errors, enable libxml's internal error handling before loading the document, and clear the collected errors afterwards:
libxml_use_internal_errors(true);
// ... $doc->loadHTML($html); ...
libxml_clear_errors();
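As a rough sketch of how the error handling fits around the load, assuming a deliberately malformed snippet invented for this example: the previous libxml setting is saved and restored so the rest of the application is not affected. Whether a given defect is actually reported depends on the libxml2 version, so the sketch only collects whatever was recorded.

```php
<?php
// Hypothetical malformed input: the span is never closed
$broken = '<div id="isOffered"><span class="priceText">1.50</div>';

// Collect parser errors internally instead of emitting PHP warnings;
// the function returns the previous setting so it can be restored later
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($broken);

// Inspect (or log) whatever the parser recorded, then free the buffer
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The parser still builds a usable tree from the malformed input
$xpath = new DOMXPath($doc);
$span = $xpath->query("//span[contains(concat(' ', @class, ' '), ' priceText ')]")->item(0);

var_dump($span->nodeValue);
```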
Do you want to do this with PHP or JS? If you're not sure, please explain in more detail what you are trying to do and what functionality you need.
– Sergio
With PHP, using the preg_replace function.
– Cassiano José
Does 1/2 count as a valid value? Does 200 come in as negative?
– Papa Charlie
Using regex to parse HTML is giving in to Cthulhu's call.
– Tivie