Separate string by number of characters

I’m developing a component to read the file that Dataprev sends every month with the list of death records. The file is plain text (TXT), and every 210 characters correspond to a different person.

The documentation can be seen in this link: SISOBI.

I’m used to separating data like this with a delimiter, using Split(), but this file has none: the fields are separated by number of characters.

I wrote the Action that uploads the TXT file to the application and reads the data contained in it.

Ex:

string exemplo = "13032015joao";

From that string, I need to extract the data and put it in variables such as:

int dia = 13;
int mes = 03;
int ano = 2015;
string nome = "joao";

The number of characters in each field is fixed. For example:

The day always has 2 characters, followed by the month with 2 characters, then the year with 4, and so on until the end of the 210 characters.

If the data had a delimiter, using Split() would look something like this:

var exemplo = "13|03|2015|joao";
string[] stringSeparators = new string[] { "|" };
var result = exemplo.Split(stringSeparators, StringSplitOptions.None);

var dia = Convert.ToInt32(result[0]);
var mes = Convert.ToInt32(result[1]);
var ano = Convert.ToInt32(result[2]);
var nome = result[3]; // the name is text, so no numeric conversion

My question is: how do I split a string by number of characters instead of by a delimiter?

My controller to read the file looks like this:

[HttpPost]
public ActionResult Index(HttpPostedFileBase file)
{
    // Check whether the file is null
    if (file == null)
    {
        TempData["MensagemError"] = "Erro ao realizar o upload do arquivo!";
        return View("Index");
    }

    // Save the TXT file (only the file name is used, not any client-side path)
    string path = Path.Combine(Server.MapPath("~/App_Data/Uploads"), Path.GetFileName(file.FileName));
    file.SaveAs(path);

    // Read the contents of the TXT file
    var fileContents = System.IO.File.ReadAllText(path);

    // Test whether the file is being read
    TempData["Mensagem"] = fileContents;

    return RedirectToAction("Index");
}

Example of the layout:

000028000280000016427201412310000000000MARCIO SEARA RIBEIRO                                                        MARIA PETANIA DE OLIVEIRA SEARA 19780306201412319442067052500000000000000000000007657          
000028000290000016428201412310000000000MAIRE VALENTIM DA SILVA                                                     MAIRE VALENTIM DA SILVA         19281105201412310387867350700000000000000000000007657  
  • What happens when the record is shorter than 210 characters? Is the layout padded with spaces?

  • @Ciganomorrisonmendez It will always be 210 characters, because the layout is padded with spaces or "0". I added an example to the question.

1 answer

The method you are looking for is Substring():

using static System.Convert;
using static System.Console;
                    
public class Program {
    public static void Main() {
        var exemplo = "13032015joao";
        var dia = ToInt32(exemplo.Substring(0, 2));
        var mes = ToInt32(exemplo.Substring(2, 2));
        var ano = ToInt32(exemplo.Substring(4, 4));
        var nome = exemplo.Substring(8);
        WriteLine(dia);
        WriteLine(mes);
        WriteLine(ano);
        WriteLine(nome);
    }
}

See it working on ideone and on .NET Fiddle. I also put it on GitHub for future reference.

You can do a few things to automate this. The code may become shorter and more general, but the logic gets a little more complex. Just for reference, here is a more generic form:

using System;
using static System.Console;
using System.Collections.Generic;
                    
public class Program {
    public static void Main() {
        var exemplo = "13032015joao";
        // the last element could be 200, for example;
        // if that size were guaranteed, the if in the method would not be needed
        var partes = SplitFixed(exemplo, new List<int>() {2, 2, 4, 0});
        foreach(var parte in partes) {
            WriteLine(parte);
        }
        // the conversions could be done here and stored in individual variables
        
        // alternative with types; I am not sure it is worth the effort:
        // doing it properly would take as much work as doing it by hand.
        // This form would really pay off in only a few cases, and ideally the conversion
        // would be done through lambdas containing the conversion code rather than with types
        var partes2 = SplitFixedTyped(exemplo, new List<Tuple<int, Type>>() {
            new Tuple<int, Type>(2, typeof(int)), 
            new Tuple<int, Type>(2, typeof(int)),
            new Tuple<int, Type>(4, typeof(int)),
            new Tuple<int, Type>(0, typeof(string))});
        foreach(var parte in partes2) {
            WriteLine("Dado: {0} - Tipo {1}", parte, parte.GetType());
        }
        
    }
    public static List<String> SplitFixed(string texto, List<int> tamanhos) {
        var partes = new List<String>();
        var posicao = 0;
        foreach(var tamanho in tamanhos) {
            if (tamanho > 0) { // by convention here, 0 means "read to the end"
                partes.Add(texto.Substring(posicao, tamanho));
            } else {
                // ideally this branch would not exist and every size would be defined;
                // 0 can only be used as the last parameter.
                partes.Add(texto.Substring(posicao));
            }
            posicao += tamanho;
        }
        return partes;
    }
    // this implementation is somewhat naive: it does not work in every situation, but it handles the basics
    public static List<object> SplitFixedTyped(string texto, List<Tuple<int, Type>> tamanhos) {
        var partes = new List<object>();
        var posicao = 0;
        foreach(var tamanho in tamanhos) {
            if (tamanho.Item1 > 0) { // by convention here, 0 means "read to the end"
                partes.Add(Convert.ChangeType(texto.Substring(posicao, tamanho.Item1), tamanho.Item2));
            } else {
                // ideally this branch would not exist and every size would be defined;
                // 0 can only be used as the last parameter.
                partes.Add(texto.Substring(posicao));
            }
            posicao += tamanho.Item1;
        }
        return partes;
    }
}

See it working on ideone and on .NET Fiddle. I also put it on GitHub for future reference.

A generic solution can be useful if you have to deal with several fixed-width column files with different layouts. But when writing something generic you have to think about all the possibilities, and it is a good idea to validate that the parameters are in order. I wrote this quickly, without considering everything that can happen.
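
As an illustration of what validating the parameters could mean, here is a minimal sketch of SplitFixed that checks the widths before slicing. The class name FixedWidthChecked and the exact rules (no negative width, 0 allowed only in the last position, total not larger than the text) are assumptions made for this example, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthChecked {
    // Same idea as SplitFixed above, but the widths are validated first.
    // Assumed rules: no negative width, 0 only as the last width,
    // and the widths must not add up to more than the text length.
    public static List<string> SplitFixed(string texto, IReadOnlyList<int> tamanhos) {
        if (texto == null) throw new ArgumentNullException(nameof(texto));
        if (tamanhos == null) throw new ArgumentNullException(nameof(tamanhos));

        var total = 0;
        for (var i = 0; i < tamanhos.Count; i++) {
            var ultimo = i == tamanhos.Count - 1;
            if (tamanhos[i] < 0 || (tamanhos[i] == 0 && !ultimo)) {
                throw new ArgumentException("Only the last width may be 0, and none may be negative.", nameof(tamanhos));
            }
            total += tamanhos[i];
        }
        if (total > texto.Length) {
            throw new ArgumentException("The widths add up to more than the text length.", nameof(tamanhos));
        }

        var partes = new List<string>();
        var posicao = 0;
        foreach (var tamanho in tamanhos) {
            // 0 (already known to be last) means "read to the end"
            partes.Add(tamanho > 0 ? texto.Substring(posicao, tamanho) : texto.Substring(posicao));
            posicao += tamanho;
        }
        return partes;
    }
}
```

With this version, a call such as FixedWidthChecked.SplitFixed("13032015joao", new List<int> { 2, 2, 4, 0 }) returns the same four parts as before, while a layout like { 2, 0, 4 } fails with an ArgumentException instead of silently producing wrong slices.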

At the time, Span and value tuples did not exist as they do today, so this code could be optimized.
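
For reference, a rough sketch of how the simple example might look with ReadOnlySpan<char> and a value tuple. It assumes a runtime where int.Parse accepts a ReadOnlySpan<char> (.NET Core 2.1 or later), which is not the case for the classic ASP.NET MVC project in the question; the names FixedWidthSpan and ParseRegistro are made up for this example.

```csharp
using System;

public static class FixedWidthSpan {
    // Parses one "ddMMyyyy" + name record and returns the fields as a value tuple.
    // Slicing a span does not allocate a new string for each numeric field.
    public static (int Dia, int Mes, int Ano, string Nome) ParseRegistro(string registro) {
        ReadOnlySpan<char> span = registro.AsSpan();
        var dia = int.Parse(span.Slice(0, 2));
        var mes = int.Parse(span.Slice(2, 2));
        var ano = int.Parse(span.Slice(4, 4));
        var nome = span.Slice(8).ToString(); // only the name becomes a new string
        return (dia, mes, ano, nome);
    }
}

public class Program {
    public static void Main() {
        var (dia, mes, ano, nome) = FixedWidthSpan.ParseRegistro("13032015joao");
        Console.WriteLine(dia);
        Console.WriteLine(mes);
        Console.WriteLine(ano);
        Console.WriteLine(nome);
    }
}
```

The value tuple replaces the List<object> of the typed version, so each field keeps its static type and no casts are needed by the caller.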

  • I found this method very interesting. I will use it and look into other possibilities as well. Just one more question: how do I keep reading to the end of the string when it holds a list of records? Example: the string would be 13032015joao14032015Juca. In this example, every 12 characters is a new item.

  • @Renilsonandrade If I understood correctly, it is the same thing: "13032015joao14032015Juca".Substring(0, 12) and then "13032015joao14032015Juca".Substring(12, 12)

  • Yes, the problem is that I don’t know how many times it will repeat. I only know each item will have 12 characters, using this example as a basis.

  • Just write a loop and check whether the position plus the size to be read goes past the total length of the text. If you cannot get it working, I think it makes a good new question, since it is a different problem from the original one here.

  • I’m trying to do it with two loops, one counting up to 12 (each person’s size) and the other using the string’s Count() and running to the end, but I’m struggling. Could you give an example? Or, if you prefer, I can open another question.

  • Open another one, because the problem is not the same and this would turn into a mess. I’ll already start putting an answer together to post there.
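
The loop suggested in the comments above can be sketched as follows. This is only an illustration: the helper name SplitRecords is hypothetical, and it assumes that a trailing fragment shorter than the record size should simply be ignored.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidth {
    // Splits texto into consecutive records of exactly tamanho characters.
    // Assumption: an incomplete fragment at the end is ignored.
    public static List<string> SplitRecords(string texto, int tamanho) {
        if (tamanho <= 0) throw new ArgumentOutOfRangeException(nameof(tamanho));
        var registros = new List<string>();
        // Stop as soon as a full record no longer fits in the remaining text
        for (var posicao = 0; posicao + tamanho <= texto.Length; posicao += tamanho) {
            registros.Add(texto.Substring(posicao, tamanho));
        }
        return registros;
    }
}

public class Program {
    public static void Main() {
        foreach (var registro in FixedWidth.SplitRecords("13032015joao14032015Juca", 12)) {
            // Each record is then sliced by position, as in the answer
            var dia = Convert.ToInt32(registro.Substring(0, 2));
            var mes = Convert.ToInt32(registro.Substring(2, 2));
            var ano = Convert.ToInt32(registro.Substring(4, 4));
            var nome = registro.Substring(8);
            Console.WriteLine($"{dia}/{mes}/{ano} {nome}");
        }
    }
}
```

A single loop is enough: the record size is the step, so there is no need for an inner loop counting up to 12. For the file in the question the same call would use 210 as the size, provided the file has no line breaks between records (something the question does not state).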
