I need to analyze C code using regex, and I'm having trouble determining whether a variable is being assigned a float or an integer value.
Example:
valor_01 = ( 5 * 3 ) / 2.5 + ( 4 % 3 ) ^ 4 ;
valor_01 can be any variable name, something like \w+. I need to detect whether there is a decimal value (2.5 in this example) after the = and before the ;. So far I have come up with the following expression:
(\w+\s?\=\s?).+
Problem: this expression matches the variable name, the equals sign, and everything else on the line, so I cannot tell whether there is a (\d+\.\d+) somewhere in it followed by a ; at the end of the line.
It is as if I needed to capture only:
valor_01 = 2.5 ;
Although it is possible, I am not sure regex is the best solution, because besides checking the number, it would also have to check the context. For example, if the number is inside a string, as in valor = "abc 2.5";, it should not be accepted, since the variable is receiving a string and not a float. And what if there is a float variable in the expression? For example valor = 1 / x;, where x is a float variable declared earlier. You can also write valor = .5; or valor = 1e-2; (scientific notation), which are also floats. In short, there are too many variations and it may not be worth using regex. – hkotsubo
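To make the cases above concrete, here is a minimal sketch of the naive approach (look for digits, a dot, and digits between the = and the ;) run against each example. The names NAIVE_FLOAT_ASSIGNMENT and naiveAssignsFloat are made up for this illustration and are not from the question; the point is only to show where a simple pattern gives the wrong answer.

```javascript
// Naive check: an assignment whose right-hand side contains a literal like 2.5.
// Hypothetical names, used only for this illustration.
const NAIVE_FLOAT_ASSIGNMENT = /^\s*([a-zA-Z_]\w*)\s*=[^;]*\d+\.\d+[^;]*;/;

function naiveAssignsFloat(line) {
  return NAIVE_FLOAT_ASSIGNMENT.test(line);
}

const samples = [
  'valor_01 = ( 5 * 3 ) / 2.5 + ( 4 % 3 ) ^ 4 ;', // true  (correct)
  'valor = "abc 2.5";',                            // true  (wrong: it is a string)
  'valor = 1 / x;',                                // false (wrong if x is a float)
  'valor = .5;',                                   // false (wrong: .5 is a float)
  'valor = 1e-2;',                                 // false (wrong: 1e-2 is a float)
];

for (const line of samples) {
  console.log(naiveAssignsFloat(line), line);
}
```

Only the first line is classified correctly. Each of the other four would need its own special case in the pattern, and the x case cannot be solved by looking at a single line at all.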
Maybe a lexical analyzer is more suitable (I have never used one, but I would start by looking for something like that). Another tip: if you really are going to use regex, \w+ accepts things like 123ab, which is not a valid variable name, so change it to [a-zA-Z_]\w* (or [a-zA-Z_]\w{0,n} to limit the name to n + 1 characters). Note that this does accept _ on its own as a valid name, which I think it is, however strange. – hkotsubo
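A quick sketch of the difference between the two patterns, assuming the whole string must be an identifier (hence the ^ and $ anchors). The helper name isIdentifier is made up here. I use \w* rather than \w+ after the first character so that single-character names such as x also match. It does not reject C keywords like int or while, which would need a separate check.

```javascript
// \w+ alone also accepts names that start with a digit.
const LOOSE = /^\w+$/;

// First character must be a letter or underscore, then any number of word characters.
const IDENT = /^[a-zA-Z_]\w*$/;

// Hypothetical helper: true if `name` is shaped like a C identifier.
// Keywords (int, while, ...) are NOT excluded by this sketch.
function isIdentifier(name) {
  return IDENT.test(name);
}

console.log(LOOSE.test('123ab'));      // true  (too permissive)
console.log(isIdentifier('123ab'));    // false
console.log(isIdentifier('valor_01')); // true
console.log(isIdentifier('x'));        // true
console.log(isIdentifier('_'));        // true
```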
Another complicated case: valor = func(2.5). The function func may well receive a float, but what if it returns an int (or anything else)? Besides the regex being complicated in itself (to handle the cases I mentioned above), it would also have to analyze this kind of thing, and in the end you would be writing a mini C compiler in JavaScript (which is already complicated in itself, and if it relies on regex for that, then...) – hkotsubo
What if the line is inside a comment? In that case it should be ignored, because the variable is not receiving any value (after all, it is commented out). Keep in mind that it is not enough to check whether the line contains //, since C also has multi-line comments. In short, there are too many cases to analyze and the regex would be absurdly complex... – hkotsubo
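If you still want to try, one possible direction (a rough sketch, not a complete solution) is to first scan the source character by character, dropping comments and emptying string and character literals, and only then look for a float literal. The names blankCommentsAndStrings, FLOAT_LITERAL and hasFloatLiteral are made up for this example. It assumes plain C without trigraphs, line continuations or preprocessor tricks, ignores hexadecimal floats, and relies on regex lookbehind, which needs a reasonably recent JavaScript engine.

```javascript
// Hypothetical helper: remove comments and empty out string/char literals.
function blankCommentsAndStrings(src) {
  let out = '';
  let i = 0;
  while (i < src.length) {
    const two = src.slice(i, i + 2);
    if (two === '//') {
      // Line comment: skip up to (not including) the newline.
      while (i < src.length && src[i] !== '\n') i++;
    } else if (two === '/*') {
      // Block comment: skip to the closing */ or to the end of input.
      const end = src.indexOf('*/', i + 2);
      i = end === -1 ? src.length : end + 2;
      out += ' ';
    } else if (src[i] === '"' || src[i] === "'") {
      // String or char literal: skip its contents, honoring backslash escapes.
      const quote = src[i++];
      while (i < src.length && src[i] !== quote) {
        if (src[i] === '\\') i++;
        i++;
      }
      i++; // closing quote
      out += quote + quote; // keep an empty literal as a placeholder
    } else {
      out += src[i++];
    }
  }
  return out;
}

// Decimal float literals: 2.5, 2., .5, 1e-2, 2.5e3.
// The lookbehind avoids matching inside names or hex numbers such as x1e2 or 0x1e2.
const FLOAT_LITERAL =
  /(?<![\w.])(?:(?:\d+\.\d*|\.\d+)(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+)/;

function hasFloatLiteral(src) {
  return FLOAT_LITERAL.test(blankCommentsAndStrings(src));
}

console.log(hasFloatLiteral('valor = "abc 2.5";'));           // false
console.log(hasFloatLiteral('// valor = 2.5;'));              // false
console.log(hasFloatLiteral('/* valor = 2.5; */ valor = 1;')); // false
console.log(hasFloatLiteral('valor = .5;'));                  // true
console.log(hasFloatLiteral('valor = 1e-2;'));                // true
```

This only says whether a float literal appears in real code. It still reports true for valor = func(2.5) even if func returns an int, and false for valor = 1 / x even if x is a float, so the type questions raised above remain unsolved without something closer to a real parser.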