Another alternative is to use regular expressions (System.Text.RegularExpressions): a Regex pattern matches only the runs of digits, and LINQ (System.Linq) produces the result. Regex.Matches returns a collection, so we cast it to an IEnumerable of Match, take only the Value property of each match, and convert that to an array of strings. Originally I then passed that array to the Join method of the String class and called Split on the result. There is actually no need for Join or Split, and the code becomes even simpler without them. I leave both options for comparison.
1) Without Split:
using System.Linq;
using System.Text.RegularExpressions;
// multiple lines
string value = "97A96D112A109X1X15T114H122D118Y128";
var matches = Regex.Matches(value, "[0-9]+");
var arrayOfNumbers = matches.Cast<Match>().Select(m => m.Value).ToArray();
// a single line (alternative to the version above; use one or the other)
string[] arrayOfNumbers = Regex.Matches("97A96D112A109X1X15T114H122D118Y128", "[0-9]+")
    .Cast<Match>().Select(m => m.Value).ToArray();
2) With Split:
string value = "97A96D112A109X1X15T114H122D118Y128";
var matches = Regex.Matches(value, "[0-9]+");
var arrayOfNumbers = String.Join("-", matches.Cast<Match>().Select(m => m.Value).ToArray()).Split('-');
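If the numbers are needed as integers rather than strings, the same pipeline can parse each match. This is only a minimal sketch: the class and method names (NumberExtractor, ExtractNumbers) are hypothetical, and it assumes every run of digits fits in an int, since int.Parse throws an OverflowException otherwise.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Hypothetical helper: returns every run of ASCII digits in the input as an int.
    // Assumes each run fits in an int; int.Parse throws OverflowException if not.
    public static int[] ExtractNumbers(string input)
    {
        return Regex.Matches(input, "[0-9]+")
            .Cast<Match>()                      // works with the non-generic MatchCollection
            .Select(m => int.Parse(m.Value))
            .ToArray();
    }

    public static void Main()
    {
        int[] numbers = ExtractNumbers("97A96D112A109X1X15T114H122D118Y128");
        // prints: 97, 96, 112, 109, 1, 15, 114, 122, 118, 128
        Console.WriteLine(string.Join(", ", numbers));
    }
}
```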
Why not use a regular expression?
– TiagoA
@Tiagoa https://xkcd.com/1171/ Regex is one of the ugliest and weirdest things ever created, and it hides performance issues. Why use something complex when something simple solves the problem? I find it funny that almost everyone talks about code readability and then uses Regex.
– Maniero
Performance depends on the language: https://github.com/mariomka/regex-benchmark. Simple, for me, is replacing the whole alphabet with [a-zA-Z]+.
– TiagoA
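For reference, the [a-zA-Z]+ idea from the comment above could look roughly like this in C#. It is a sketch, not code from the thread: the names LetterSplitter and SplitOnLetters are hypothetical. Note that Regex.Split returns an empty string at the start or end of the result when the input begins or ends with letters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LetterSplitter
{
    // Hypothetical helper: splits the input on every run of ASCII letters.
    // An input that starts or ends with letters yields empty entries at the edges.
    public static string[] SplitOnLetters(string input)
    {
        return Regex.Split(input, "[a-zA-Z]+");
    }

    public static void Main()
    {
        string[] parts = SplitOnLetters("97A96D112A109X1X15T114H122D118Y128");
        // prints: 97-96-112-109-1-15-114-122-118-128
        Console.WriteLine(string.Join("-", parts));
    }
}
```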
@Tiagoa It varies from language to language, but it does not depend on the language. It is physically impossible for a simple A + B expression to be slower than a complex calculation involving almost every possible mathematical operation. Regex is an extremely sophisticated mechanism; it is a language within the language, it has to do things that most programmers don't even dream are possible, and it generates complex code to execute all that. And if you're talking about the optimizations that some languages do, then once you have to compile the expression before you use it, the tragedy takes on absurd proportions.
– Maniero
Even those who defend its use know that it is tragic and that you should only use it if you don't care about performance. In some cases performance may not matter, but in many cases it is the difference between the application being acceptable or not. I have heard that in extremely complex cases doing it by hand is so much work that it can get confusing and even end up slower than Regex, but in practice no one has ever demonstrated this clearly, and it would still have to rely on a programmer error.
– Maniero
I'll tell you what, this test is very flawed. C# has a very good Regex, there is no way it belongs in the last positions, and I no longer trust any result shown there. The person who wrote the test does not know how to use it properly. Too bad you think my answer is bad, because it does it the best way. Some people want to learn the right way, others do not. Nor is the criterion used correct: you showed a test that does not compare a simple Split() against a Regex. Run a test with both; if the Regex is faster, I will delete the answer. And what will you do, besides removing the downvote, if Split() is faster?
– Maniero
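A test with both approaches, as requested above, could be sketched along these lines. Everything here is an assumption for illustration: the class name SplitVersusRegex, the helper names, the separator set (ASCII letters only) and the iteration count are hypothetical. A Stopwatch loop is only a rough measurement, so no result is claimed here; a dedicated benchmarking harness would be more trustworthy.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitVersusRegex
{
    // All ASCII letters, used as separators for the Split-based version.
    private static readonly char[] Letters =
        Enumerable.Range('A', 26).Concat(Enumerable.Range('a', 26))
                  .Select(c => (char)c).ToArray();

    public static string[] WithSplit(string input)
    {
        return input.Split(Letters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static string[] WithRegex(string input)
    {
        return Regex.Matches(input, "[0-9]+").Cast<Match>().Select(m => m.Value).ToArray();
    }

    // Runs the action 'iterations' times and returns the elapsed milliseconds.
    public static long Measure(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not counted
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const string value = "97A96D112A109X1X15T114H122D118Y128";
        const int iterations = 100000; // arbitrary; adjust for your machine
        Console.WriteLine("Split: " + Measure(() => WithSplit(value), iterations) + " ms");
        Console.WriteLine("Regex: " + Measure(() => WithRegex(value), iterations) + " ms");
    }
}
```

Both helpers return the same array for this input, so the timings compare equivalent work.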
Regex is an amazing tool that has already saved my life, for example with find/replace in the IDE. However, it is meant for very specific situations and, like any tool, it must be used responsibly with regard to performance, readability and predictability. You can write an expression that apparently works wonderfully, test it and think everything is fine, until the day you realize it also matches in another context and captures something it shouldn't.
– nmindz
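One common way an expression ends up matching something it shouldn't is a missing anchor. The sketch below illustrates that single pitfall only; the class and method names are hypothetical and the input is made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoringPitfall
{
    // Unanchored: true if the input contains ASCII digits anywhere.
    public static bool ContainsDigits(string input)
    {
        return Regex.IsMatch(input, "[0-9]+");
    }

    // Anchored with \A and \z: true only if the whole input is ASCII digits.
    public static bool IsAllDigits(string input)
    {
        return Regex.IsMatch(input, @"\A[0-9]+\z");
    }

    public static void Main()
    {
        Console.WriteLine(ContainsDigits("12; DROP")); // True: looks valid, but is not only digits
        Console.WriteLine(IsAllDigits("12; DROP"));    // False
        Console.WriteLine(IsAllDigits("12"));          // True
    }
}
```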
A series of vulnerabilities in PHP frameworks (a recent example: Thinkphp) happened because someone decided to use Regex to validate something. In the Thinkphp case it was the class name passed to an IoC/Reflection engine (or something like that); naturally, someone managed to break the expression and from there executed arbitrary code. :D
– nmindz
Regex is analogous to the red emergency hammer with the button for the fire-suppression system: it exists, there is a good reason for it to exist, and there are cases where it is necessary and prevents something worse than the discomfort its use may cause (e.g. everyone gets drenched versus everyone dies charred). But it is not the kind of tool you use for everything, because then it brings more risk than would be reasonable to accept.
– nmindz
https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/csharp.html this is more reliable because it is done by people who understand the language. Note that it is close to C++: https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/csharpcore-gpp.html. Of course, if you are talking about other languages, things change. In Java, for example, Split() uses a Regex internally, so Split() will always be equal to or worse than the pure Regex (although the pure Regex requires more operations, so it can still be a little slower). In Java it is better to do it by hand, that is how bad it is.
– Maniero