Java: separate row into columns by character position

Hello, I have searched everywhere and still cannot solve this problem.

I have text in this format:

012016010402AAPL34      010APPLE       DRN          R$  000000000415000000000042200000000004150000000000421300000000042080000000003950000000000435000005000000000000012500000000000052664400000000000000009999123100000010000000000000BRAAPLBDR004115

This is a single row where I need to separate the columns by the position of the character, for example:

  • column 1: up to the second character;
  • column 2: up to the third character;
  • column 3: up to the tenth character;
  • And so on and so forth.

This is the regex I found to split the row into two parts, but I couldn't find a clean way to apply it to each column in sequence.

String[] linhaArray = linha.split("(?<=\\G^.{2})");

What I really want is to separate these columns with the delimiter " ; ".

1 answer


Do not use regex. If you want to extract specific positions from a string, use substring. For example:

String linha = "012016010402AAPL34      010APPLE       DRN          R$  000000000415000000000042200000000004150000000000421300000000042080000000003950000000000435000005000000000000012500000000000052664400000000000000009999123100000010000000000000BRAAPLBDR004115";

// put the end position of every column here
int[] posicoes = { 2, 3, 10 };
int inicio = 0;
for (int pos: posicoes) {
    System.out.println(linha.substring(inicio, pos));
    inicio = pos;
}

In the example above I only took 3 columns. The first goes from the beginning of the string (index 0) up to index 2; because the end index is not included, it covers indexes 0 and 1, that is, the first 2 characters.

The second column is just the third character: it goes from index 2 (indexes start at zero) up to index 3, and since the end index is not included, only index 2 is taken.

The third column goes from index 3 up to index 10, that is, from the fourth to the tenth character. To pick up more columns, just add their end positions to the posicoes array.
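Fixed-width layouts are often documented as column widths rather than end positions. As a small sketch under that assumption, a running sum turns widths into the end positions that substring expects. The class name ColumnPositions and the method toEndPositions are hypothetical names used only for illustration.

```java
import java.util.Arrays;

public class ColumnPositions {

    // Converts column widths into cumulative (exclusive) end positions.
    public static int[] toEndPositions(int[] widths) {
        int[] ends = new int[widths.length];
        int sum = 0;
        for (int i = 0; i < widths.length; i++) {
            sum += widths[i];
            ends[i] = sum;
        }
        return ends;
    }

    public static void main(String[] args) {
        // Widths 2, 1 and 7 correspond to the end positions used in the answer.
        int[] posicoes = toEndPositions(new int[] { 2, 1, 7 });
        System.out.println(Arrays.toString(posicoes)); // prints [2, 3, 10]
    }
}
```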


If the idea is to have an array of columns:

int[] posicoes = { 2, 3, 10 };
// one slot per column
String[] linhaArray = new String[posicoes.length];
int inicio = 0;
for (int i = 0; i < posicoes.length; i++) {
    int pos = posicoes[i];
    // substring throws if the end index is greater than the string length
    if (pos > linha.length()) break;
    linhaArray[i] = linha.substring(inicio, pos);
    inicio = pos;
}

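I included a check that stops the loop if a position is larger than the string length, because the documentation says that substring throws an exception (IndexOutOfBoundsException) in that case. Any columns that were not reached stay null in the array.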
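Since the question asks for the columns separated by " ; ", the pieces can be joined with String.join once they are cut. This is only a minimal sketch using the same three end positions; the class name FixedWidthSplitter and the methods splitAt and toDelimited are hypothetical, and the sample line is a shortened copy of the row from the question.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedWidthSplitter {

    // Cuts the line at each end position (exclusive) and returns the pieces.
    // Stops early if a position falls beyond the end of the line.
    public static List<String> splitAt(String line, int[] endPositions) {
        List<String> columns = new ArrayList<>();
        int start = 0;
        for (int end : endPositions) {
            if (end > line.length()) break;
            columns.add(line.substring(start, end));
            start = end;
        }
        return columns;
    }

    // Joins the extracted columns with the given delimiter.
    public static String toDelimited(String line, int[] endPositions, String delimiter) {
        return String.join(delimiter, splitAt(line, endPositions));
    }

    public static void main(String[] args) {
        String linha = "012016010402AAPL34      010APPLE       DRN";
        int[] posicoes = { 2, 3, 10 };
        System.out.println(toDelimited(linha, posicoes, " ; "));
        // prints: 01 ; 2 ; 0160104
    }
}
```

Anything after the last listed position is left out of the result, so the array needs one entry for every column in the layout.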


But if you really want to use regex:

String linha = "012016010402AAPL34      010APPLE       DRN          R$  000000000415000000000042200000000004150000000000421300000000042080000000003950000000000435000005000000000000012500000000000052664400000000000000009999123100000010000000000000BRAAPLBDR004115";
// requires java.util.regex.Matcher and java.util.regex.Pattern to be imported
Matcher matcher = Pattern.compile("^(.{2})(.)(.{7})").matcher(linha);
if (matcher.find()) {
    String[] linhaArray = new String[matcher.groupCount()];
    for (int i = 1; i <= matcher.groupCount(); i++) {
        linhaArray[i - 1] = matcher.group(i);
    }
}

The idea is to have several capture groups, each delimited by a pair of parentheses. Inside each group I put a dot (which matches any character except line terminators by default) followed by a quantifier: for example, {2} means I want exactly 2 characters. The exception is when the quantity is 1, in which case no quantifier is needed.

Then just check how many groups there are with groupCount(), create the array, and copy the value of each group into it.
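With many columns, writing every group by hand gets tedious, so the pattern can be generated from a list of widths instead. The following is a rough sketch of that idea, not part of the original answer; the class name RegexColumns and the methods buildPattern and extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexColumns {

    // Builds a pattern such as ^(.{2})(.{1})(.{7}) from the column widths.
    public static Pattern buildPattern(int[] widths) {
        StringBuilder regex = new StringBuilder("^");
        for (int width : widths) {
            regex.append("(.{").append(width).append("})");
        }
        return Pattern.compile(regex.toString());
    }

    // Returns one string per capture group, or an empty array if the line
    // is too short to fill every column.
    public static String[] extract(String line, int[] widths) {
        Matcher matcher = buildPattern(widths).matcher(line);
        if (!matcher.find()) {
            return new String[0];
        }
        String[] columns = new String[matcher.groupCount()];
        for (int i = 1; i <= matcher.groupCount(); i++) {
            columns[i - 1] = matcher.group(i);
        }
        return columns;
    }

    public static void main(String[] args) {
        String[] columns = extract("012016010402AAPL34", new int[] { 2, 1, 7 });
        System.out.println(String.join(" ; ", columns));
        // prints: 01 ; 2 ; 0160104
    }
}
```

Unlike the substring loop, this version is all or nothing: if the line is shorter than the sum of the widths, the pattern does not match and no columns are returned.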

But I still find the first solution simpler.

  • Thanks hkotsubo, it worked!

  • @Luizfelipeborges I added a solution with regex, but only as a curiosity, because I still think substring is simpler (and, although I have not tested it, probably also more efficient than regex)
