Parsing the string character by character, so that classes inside comments such as /* .classe */ can be skipped, is the best way to extract all CSS classes from a string.
<?php
$string = <<<EOF
div.classe1{/*comment div.classe1b*/}
div.-classe2{/*div.-classe2b*/}
div._classe3{/*div._classe3b*/}
.classe4 div a{/*.classe4b div a*/}
.classe5.classe6{/*.classe5b.classe6b*/}
.classe7{/*......classe7b......*/}
.classe8{this one has no comments, but .classe8b is not picked up either because it is inside the braces}
.cl{/*.clb 2 characters*/}
.c{/*.cb 1 character*/}
.d{/*.db 1 character*/}
EOF;
$length = strlen( $string );
$brackets = false; // true while inside a { ... } block
$comment = false; // true while inside a /* ... */ comment
$dot = false; // true right after a "." that may start a class name
$class = '';
$classes = array();
for ( $i = 0, $j = 0; $i < $length; $i++ ) {
    // Look ahead safely: the last character has no successor.
    $next = ( $i + 1 < $length ) ? $string[ $i + 1 ] : '';
    if ( $string[ $i ] === "\x2f" && $next === "\x2a" ) { // "/*" opens a comment
        $comment = true;
        continue;
    } else if ( $string[ $i ] === "\x7b" ) { // "{" opens a block
        $brackets = true;
        continue;
    } else if ( $brackets === false && $comment === false && $string[ $i ] === "\x2e" ) { // "." outside blocks and comments
        $dot = true;
        continue;
    } else if ( $string[ $i ] === "\x2a" && $next === "\x2f" ) { // "*/" closes a comment
        $comment = false;
        continue;
    } else if ( $brackets === true && $string[ $i ] === "\x7d" ) { // "}" closes a block
        $brackets = false;
        continue;
    }
    if ( $dot ) {
        $j = $i + 1;
        // The first character must be A-Z, a-z, "-" or "_".
        if ( ( ( $string[ $i ] >= "\x41" && $string[ $i ] <= "\x5a" ) || ( $string[ $i ] >= "\x61" && $string[ $i ] <= "\x7a" ) || ( $string[ $i ] === "\x2d" ) || ( $string[ $i ] === "\x5f" ) ) === false ) {
            $class = '';
            $dot = false;
            continue;
        }
        $class = $string[ $i ];
        // Append the following characters while they are 0-9, A-Z, a-z, "-" or "_".
        while ( $j < $length && ( ( $string[ $j ] >= "\x30" && $string[ $j ] <= "\x39" ) || ( $string[ $j ] >= "\x41" && $string[ $j ] <= "\x5a" ) || ( $string[ $j ] >= "\x61" && $string[ $j ] <= "\x7a" ) || ( $string[ $j ] === "\x2d" ) || ( $string[ $j ] === "\x5f" ) ) ) {
            $class .= $string[ $j ];
            $j++;
        }
        array_push( $classes, $class );
        $class = '';
        $dot = false;
        $i = $j - 1; // resume scanning right after the class name
    }
}
echo '<pre style="font-size: 14px; font-family: Consolas; line-height: 20px; tab-size: 4;">';
var_export( $classes );
echo '</pre>';
die();
The output of var_export is the array with all the classes. The goal was to skip classes that appear inside comments, and that is achieved. In addition, classes that for whatever reason appear inside braces are skipped as well. This may not work well with @media rules, because the $brackets flag is a simple boolean and does not track nested braces. I added the brace handling last, on the assumption that your CSS is "normal" code without @media; if it does contain @media, simply remove the parts relating to $brackets. The version I first drafted did not have them.
I wrote this code just now, after reading the question, and only ran quick tests, so I am not sure it works 100%. It extracts classes that start with -, _, a-z or A-Z, optionally followed by any number of -, _, a-z, A-Z or 0-9 characters.
array (
0 => 'classe1',
1 => '-classe2',
2 => '_classe3',
3 => 'classe4',
4 => 'classe5',
5 => 'classe6',
6 => 'classe7',
7 => 'classe8',
8 => 'cl',
9 => 'c',
10 => 'd',
)
P.S.: I know the question is tagged JavaScript and the code I posted is PHP, but I wanted to answer anyway because I believe this approach answers your question and gives the best results. PHP and JavaScript are similar enough in syntax and functions that porting it to JavaScript should require only minimal adaptations. I hope this helps.
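As a rough sketch of that adaptation, the same loop might look like this in JavaScript. The names extractClasses, ignoreBlocks and the small character helpers are hypothetical and not part of the PHP code above. The ignoreBlocks parameter is an assumption added so that the brace handling can be switched off, as suggested for CSS containing @media.

```javascript
// Hypothetical helpers: the same character ranges the PHP version tests.
const isLetter = (ch) => (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
const isDigit = (ch) => ch >= '0' && ch <= '9';
const isClassStart = (ch) => isLetter(ch) || ch === '-' || ch === '_';
const isClassChar = (ch) => isClassStart(ch) || isDigit(ch);

// Port of the PHP loop. ignoreBlocks = true mirrors the $brackets handling;
// pass false to drop it (an assumption for CSS that contains @media).
function extractClasses(css, ignoreBlocks = true) {
  const classes = [];
  let inBlock = false;   // inside { ... }
  let inComment = false; // inside /* ... */
  let afterDot = false;  // previous character was a "." that may start a class

  for (let i = 0; i < css.length; i++) {
    const ch = css[i];
    const next = css[i + 1]; // undefined at the end, which matches nothing

    if (ch === '/' && next === '*') { inComment = true; continue; }
    if (ignoreBlocks && ch === '{') { inBlock = true; continue; }
    if (!inBlock && !inComment && ch === '.') { afterDot = true; continue; }
    if (ch === '*' && next === '/') { inComment = false; continue; }
    if (inBlock && ch === '}') { inBlock = false; continue; }

    if (afterDot) {
      afterDot = false;
      if (!isClassStart(ch)) continue;
      let j = i + 1;
      while (j < css.length && isClassChar(css[j])) j++;
      classes.push(css.slice(i, j));
      i = j - 1; // resume scanning right after the class name
    }
  }
  return classes;
}

const css = [
  'div.classe1{/*div.classe1b*/}',
  '.classe5.classe6{/*.classe5b.classe6b*/}',
  '.c{/*.cb*/}',
].join('\n');
console.log(extractClasses(css)); // [ 'classe1', 'classe5', 'classe6', 'c' ]
```

With ignoreBlocks set to false, selectors nested inside an @media block are picked up. The trade-off is that anything following a dot inside a declaration, such as a file extension in a url(), may then be reported as a class.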
Does it have to be with regular expressions? It only takes one /* the .azul class makes the text #123 */ in the code for many of them to fail. – Gustavo Rodrigues
How will this regular expression be used? With a single match, as in the example fiddle, or can the code be changed? Regex is not the ideal tool in this case, but depending on the limitations something can be worked out... – mgibsonbr
@mgibsonbr Any JavaScript solution is valid, no matter how it is implemented. – Kazzkiq