How do I split a string wherever a letter occurs?

11

I'd like to split a string with .Split() wherever a letter occurs. Example:

The string: 97A96D112A109X115T114H122D118Y128

would become an array with 9 values: { 97 96 112 109 115 114 122 118 128 }

How can I do that?

Code I have (to use as a basis):

string[] az = txt.Split(/* parameters here */);

6 answers

14


Use Split() with an array containing every character that should act as a separator. For convenience, you can write the separators as a string and convert it to an array of char with ToCharArray(). This is not the fastest option, but the cost is small, and it fits on one line. If you want maximum performance, write the array by hand.

If you need lowercase letters as well, just include them. Check the documentation for the available options if you need a specific splitting behavior.

using static System.Console;
public class Program {
    public static void Main() {
        var partes = "97A96D112A109X115T114H122D118Y128".Split("ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray());
        foreach (var item in partes) WriteLine(item); // just to confirm that it worked
    }
}

See it working on ideone and on .NET Fiddle. I also put it on GitHub for future reference.
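The two remarks above (including lowercase letters, and checking the documentation for other splitting modes) can be sketched as follows. This is a minimal sketch, not part of the original answer: the class name SplitOnLetters, the method name SplitParts and the sample input are hypothetical, and it assumes only ASCII letters matter. It uses the Split overload that takes StringSplitOptions.RemoveEmptyEntries.

```csharp
using System;

public static class SplitOnLetters {
    // Separators: every ASCII letter, upper and lower case (assumption: no accented letters).
    private static readonly char[] Letters =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".ToCharArray();

    // Splits on any ASCII letter and drops the empty entries that appear
    // when two letters are adjacent or a letter starts or ends the text.
    public static string[] SplitParts(string text) {
        return text.Split(Letters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main() {
        // Hypothetical input with lowercase letters, adjacent letters and a trailing letter.
        foreach (var part in SplitParts("97a96D112AB109x")) Console.WriteLine(part);
    }
}
```

Without RemoveEmptyEntries, the adjacent letters "AB" and the trailing "x" in this input would each add an empty string to the result.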

  • Why not use a regular expression?

  • 1

    @Tiagoa https://xkcd.com/1171/ Regex is one of the ugliest and weirdest things you can write, and it hides performance problems. Why use something complex when something simple solves it? I find it funny that almost everyone talks about code readability and then uses Regex.

  • Performance depends on the language: https://github.com/mariomka/regex-benchmark. For me, simple means replacing the whole alphabet with [a-zA-Z]+.

  • 2

    @Tiagoa It varies from language to language, but it does not depend on the language. It is physically impossible for an expression like A + B to be slower than a complex calculation involving almost every possible mathematical operation. Regex is an extremely sophisticated mechanism: it is a language within the language, it has to do things most programmers don't even dream are possible, and it generates complex code to execute them. And if you are talking about the optimizations some languages do, where the expression has to be compiled before you use it, then the tragedy takes on absurd proportions.

  • 2

    Even those who defend its use know that its performance is tragic and that you should only use it if you don't care about performance. In some cases that may not matter, but in many cases it is the difference between the application being acceptable or not. I have heard that in extremely complex cases doing it by hand is so much work that it can get confusing and even end up slower than Regex, but in practice no one has ever demonstrated this clearly, and it would still depend on a programmer error.

  • 2

    I'll tell you what, this test is very weak. C# has a very good Regex, there is no way it belongs in the last positions, so I no longer trust any result shown there. The person who wrote the test does not know how to use it properly. Too bad you think my answer is bad, because it does it the best way. Some people want to learn the right way, others do not. The criterion used is not correct either. You showed a test that does not compare a simple Split() against a Regex. Do a test with both: if the Regex is faster, I delete the answer. And what will you do, besides removing the downvote, if the Split() is faster?

  • 2

    Regex is an amazing tool that has already saved my life, for example in the IDE's find/replace. However, it is a very specialized tool and, like any tool, it must be used responsibly with regard to performance, readability and predictability. You can write an expression that apparently works wonderfully, test it and think everything is fine, until the day you realize it also matches in another context and catches something it shouldn't.

  • 2

    A series of vulnerabilities in PHP frameworks (a recent example: ThinkPHP) happened because someone decided to use Regex to validate something. In the ThinkPHP case it was the class name in an IoC/Reflection engine (it was something like that). Naturally, someone managed to break the expression and from there executed arbitrary code. :D

  • 2

    Regex is analogous to the red hammer and button of the fire-suppression mechanism: it exists, there is a good reason for it to exist, and there are cases where it is necessary and prevents something worse than the discomfort its use may cause (e.g. everyone gets drenched vs. everyone burns to death). But it is not the kind of tool you use for everything, because then it brings more risk than would be reasonable to accept.

  • 1

    https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/csharp.html This one is more reliable because it is written by people who understand the language. Note that C# is close to C++: https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/csharpcore-gpp.html. Of course, in other languages things change. In Java, for example, Split() uses a Regex internally, so Split() will always be equal to or worse than pure Regex (although with pure Regex you have to do more operations, so it can still be a little slower). In Java it is better to do it by hand, bad as that is.
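The comparison requested in the comments (a plain Split() against a Regex on the same input) could be sketched roughly as below. This is only an illustrative sketch under stated assumptions: the class name SplitVersusRegex, the helper Time and the iteration count are hypothetical, and a Stopwatch loop is a crude measurement, not a rigorous benchmark. No timings are claimed here; run it to see the numbers on your own machine.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class SplitVersusRegex {
    private const string Input = "97A96D112A109X115T114H122D118Y128";
    private static readonly char[] Letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray();
    // The Regex object is created once so construction is not measured in the loop.
    private static readonly Regex LetterRun = new Regex("[A-Z]+");

    public static string[] WithSplit(string text) => text.Split(Letters);
    public static string[] WithRegex(string text) => LetterRun.Split(text);

    // Runs the action the given number of times and returns the elapsed milliseconds.
    public static long Time(Action action, int iterations) {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main() {
        const int iterations = 100000; // arbitrary count chosen for illustration
        // Call both once first so JIT compilation is not part of the measurement.
        WithSplit(Input);
        WithRegex(Input);
        Console.WriteLine("Split: " + Time(() => WithSplit(Input), iterations) + " ms");
        Console.WriteLine("Regex: " + Time(() => WithRegex(Input), iterations) + " ms");
    }
}
```

For a measurement you intend to rely on, a dedicated benchmarking library would be a better choice than a hand-written loop.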


7

Try it like this:

string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
string[] az = txt.Split(alphabet.ToCharArray());

6

You could do that using char.IsLetter. It would look like this:

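// requires: using System.Collections.Generic;
string az = "97A96D112A109X115T114H122D118Y128";

// Collect every letter found in the string; these become the separators.
var letras = new List<char>();
foreach (char c in az)
{
    if (char.IsLetter(c))
    {
        letras.Add(c);
    }
}

// Split once, using all the collected letters as separators.
string[] novaString = az.Split(letras.ToArray());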

You then access the parts like this: novaString[0], novaString[1]

6

Another alternative is to use regular expressions (System.Text.RegularExpressions): a pattern matches only the runs of digits, and LINQ (System.Linq) collects the result. Regex.Matches returns a collection, so we cast it to an IEnumerable of Match, select the Value property of each match, and convert the result to an array of strings. My first version then joined that array with String.Join and called Split on the result. There is actually no need for the Join or the Split, and the code becomes even simpler without them. I leave both options for comparison.

1) Without Split:

// multiple lines
string value = "97A96D112A109X1X15T114H122D118Y128";
var matches = Regex.Matches(value, "[0-9]+");
var arrayOfNumbers = matches.Cast<Match>().Select(m => m.Value).ToArray();

// a single line
string[] arrayOfNumbers = Regex.Matches("97A96D112A109X1X15T114H122D118Y128", "[0-9]+")
    .Cast<Match>().Select(m => m.Value).ToArray();

2) With Split:

string value = "97A96D112A109X1X15T114H122D118Y128";
var matches = Regex.Matches(value, "[0-9]+");
var arrayOfNumbers = String.Join("-", matches.Cast<Match>().Select(m => m.Value).ToArray()).Split('-');

5

You can use a list with the characters needed to split your string:

using System.Collections.Generic;

namespace SplitTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var text = "97A96D112A109X115T114H122D118Y128";
            var separatorList = new List<char>();

            // ASCII TABLE
            // 0X41 = A
            // 0X5A = Z

            for (int i = 0X41; i < 0X5B; i++)
            {
                separatorList.Add((char)i);
            }

            var result = text.Split(separatorList.ToArray());

        }
    }
}

3

In addition to the previous answers, you can use a regular expression (Regex) with the Replace function to replace every unwanted character with a dash '-', and then simply call Split with that '-' character.

 string txt = "97A96D112A109X115T114H122D118Y128";
 string[] az = Regex.Replace(txt, "[^0-9]", "-").Split('-');
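As a variation on the same idea, the replace step can be skipped by letting the Regex do the splitting itself through Regex.Split. This is a minimal sketch and not part of the original answer: the class name RegexSplitExample and the method name SplitOnNonDigits are hypothetical, and the pattern "[^0-9]+" assumes that any run of non-digit characters counts as a single separator.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSplitExample {
    // Splits on every run of one or more non-digit characters.
    public static string[] SplitOnNonDigits(string txt) {
        return Regex.Split(txt, "[^0-9]+");
    }

    public static void Main() {
        string txt = "97A96D112A109X115T114H122D118Y128";
        foreach (var part in SplitOnNonDigits(txt)) Console.WriteLine(part);
    }
}
```

Because the pattern ends with +, adjacent letters are treated as one separator and do not produce empty entries in the middle of the result. A separator at the very start or end of the input still yields an empty string at that position.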
