Search in text file and compare with others


-1

I have two lists of strings, Recebido (received) and Fixo (fixed). Using a single lambda expression, I would like to go through the Recebido list, check each entry against the Fixo list, and, for every file that does not exist in Fixo, store its name and then delete it from the Recebido list.

For example, I have the following files to be compared:

odo_prs003a.asp
odo_gen0067b.asp
gen0001.js
estilo.css

And I have this fixed list:

odo_prs003a.asp
cmc002a.asp
cmc0067d.asp
odo_gen0067b.asp
gen0001.js
ass0000.asp
calendar-br.js
css002c.css

Note that the fixed list does not contain estilo.css, so this file is a candidate for deletion.

How do I do it using lambda?

When a file is compared and does not exist in the fixed list, I store its name in an array or list and then delete it.

This is what I have so far; now I need to apply the lambda:

private bool ComparaArquivo(string recebido, string fixo)
{
    List<string> _recebido = new List<string>();
    List<string> _fixo = new List<string>();

    try
    {
        return true;
    }
    catch
    {
        return false;
    }
}
  • 2

    Try to improve and simplify the question; it is very confusing. Try to create a minimal example.

  • Wouldn’t comparing by name help you?

  • I think lambda would be the best way for us at the moment. I did some research and everyone told me that what I want cannot be done this way, so I asked whether lambda would give me better performance, and everyone said yes. I think this is the way, but I’ll keep the post open to see if any ninja shows up here.

  • I made another edit to improve the post. I hope it is better now.

2 answers

2

You can do this with LINQ, using either Query Syntax or Method Syntax (which you called lambda). For the sake of readability, I prefer Query Syntax.

var lista = new List<string> {
    "odo_prs003a.asp",
    "odo_gen0067b.asp",
    "gen0001.js",
    "estilo.css"
};

var lfixa = new List<string> {
    "odo_prs003a.asp",
    "cmc002a.asp",
    "cmc0067d.asp",
    "odo_gen0067b.asp",
    "gen0001.js",
    "ass0000.asp",
    "calendar-br.js",
    "css002c.css",
};

var delQuery = 
    from item in lista
    join fixo in lfixa on item equals fixo into ljoin
    from test in ljoin.DefaultIfEmpty()
    where test == null
    select item;

var delMethod = lista
    .GroupJoin(lfixa, item => item, fixo => fixo, (item, fixo) => new { item, fixo })
    .SelectMany(list => list.fixo.DefaultIfEmpty(), (list, fixo) => new { list.item, fixo })
    .Where(list => list.fixo == null)
    .Select(list => list.item);

var delClasico = new List<string>();        
foreach (var item in lista)
{
    if (!lfixa.Contains(item))
    {
        delClasico.Add(item);
    }           
}

foreach (var item in delQuery)
{
    Console.WriteLine(item);
}

foreach (var item in delMethod)
{
    Console.WriteLine(item);
}

foreach (var item in delClasico)
{
    Console.WriteLine(item);
}

In the example above, I perform the search using Query Syntax (storing the result in delQuery), Method Syntax (storing the result in delMethod), and a conventional loop (storing the result in delClasico).

Note that Method Syntax adds unnecessary complexity here. Query Syntax has exactly the same cost but reads more elegantly.

Even so, I see no advantage in using LINQ in place of a good old loop, whether for performance, readability, or any other hidden reason.
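For comparison, the same "items of lista that are not in lfixa" result can also be obtained with a single lambda and no join. The following is only a sketch: the class name FiltroSimples and the method NaoPresentes are hypothetical, and the case-insensitive comparison is an assumption (reasonable for Windows file names) that the original examples do not make.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FiltroSimples
{
    // Hypothetical helper: returns the items of lista that do not appear in lfixa.
    public static List<string> NaoPresentes(IEnumerable<string> lista, IEnumerable<string> lfixa)
    {
        // A HashSet gives average constant-time lookups, instead of
        // scanning lfixa once for every item of lista.
        // Assumption: file names are compared ignoring case.
        var fixos = new HashSet<string>(lfixa, StringComparer.OrdinalIgnoreCase);

        // The single lambda: keep only the items missing from the fixed set.
        return lista.Where(item => !fixos.Contains(item)).ToList();
    }
}
```

Calling FiltroSimples.NaoPresentes(lista, lfixa) with the two lists declared above should leave only estilo.css. Whether this is faster than the plain loop depends on the list sizes; the lookup structure matters more than the choice between a loop and LINQ.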

Since I see that you are trying to compare two file structures, here is an implementation. It takes two paths and compares all files with the same name, identifying the files that exist in only one of the structures and the files that belong to both structures but are not identical:

public class Arquivo
{
    public string Path { get; set; }
    public string Nome { get; set; }
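    // Path.Combine inserts the directory separator that plain concatenation would omit.
    public string NomeCompleto { get { return System.IO.Path.Combine(this.Path, this.Nome); } }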
    public byte[] Hash { get; set; }
}


static void CompararPastas(string origemPath, string destinoPath)
{
    var arquivosOrigem = new List<Arquivo>();
    var arquivosDestino = new List<Arquivo>();

    var pastaOrigem = new DirectoryInfo(origemPath);
    var pastaDestino = new DirectoryInfo(destinoPath);

    LerPasta(pastaOrigem, origemPath, ref arquivosOrigem);
    LerPasta(pastaDestino, destinoPath, ref arquivosDestino);

    var somenteOrigem =
        from arquivoOrigem in arquivosOrigem
        join arquivoDestino in arquivosDestino on arquivoOrigem.NomeCompleto equals arquivoDestino.NomeCompleto into notInDestino
        from arquivoDestino in notInDestino.DefaultIfEmpty()
        where arquivoDestino == null
        select arquivoOrigem;

    foreach (var arquivo in somenteOrigem)
    {
        // File not present in the destination; you can copy it to the destination.
    }

    var somenteDestino =
        from arquivoDestino in arquivosDestino
        join arquivoOrigem in arquivosOrigem on arquivoDestino.NomeCompleto equals arquivoOrigem.NomeCompleto into notInOrigem
        from arquivoOrigem in notInOrigem.DefaultIfEmpty()
        where arquivoOrigem == null
        select arquivoDestino;

    foreach (var arquivo in somenteDestino)
    {
        // File not present in the origin; you can delete it from the destination.
    }

    var modificados =
        from arquivoOrigem in arquivosOrigem
        join arquivoDestino in arquivosDestino on arquivoOrigem.NomeCompleto equals arquivoDestino.NomeCompleto
        // byte[] != byte[] compares references, so compare the hash contents instead.
        where !arquivoOrigem.Hash.SequenceEqual(arquivoDestino.Hash)
        select arquivoOrigem;

    foreach (var arquivo in modificados)
    {
        // The file in the origin differs from the one in the destination; you can replace the destination file with the origin one.
    }
}

static void LerPasta(DirectoryInfo pasta, string basePath, ref List<Arquivo> arquivos)
{
    foreach (var subPasta in pasta.GetDirectories())
    {
        LerPasta(subPasta, basePath, ref arquivos);
    }

    foreach (var arquivo in pasta.GetFiles())
    {
        LerArquivo(arquivo, basePath, ref arquivos);
    }
}

static void LerArquivo(FileInfo arquivo, string basePath, ref List<Arquivo> arquivos)
{
    var file = new Arquivo();
    file.Path = arquivo.DirectoryName.Replace(basePath, string.Empty);
    file.Nome = arquivo.Name;
    using (var lobjLeitura = arquivo.OpenRead())
    {
        // The hash is SHA-256, so the variable is named accordingly.
        using (var lobjSha256 = SHA256.Create())
        {
            file.Hash = lobjSha256.ComputeHash(lobjLeitura);
        }
    }
    arquivos.Add(file);
}
  • Hey, Toby, I can’t do it that way. I only gave an example, but there are over 3,500 files, and this runs every day, sometimes more than once a day, so I can’t load the lists by hand like this. Look at my edit. There’s no file to upload.

  • @pnet, the hard-coded list is only there as an example; you can run the same queries on much larger lists.

  • @pnet, I have added code that compares files and the folder structure; it may be useful to you.

  • Toby, your example looks good, but I couldn’t get it to run. Why? One list holds the names of the files that are in a folder. The other, the one to compare against, comes from a txt file with several file names inside. So the first list has 10, 12, or n file names, but the other always has a single entry (containing several names), which represents the one txt file. This is giving me some trouble.
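The situation described in the last comment (one list built from a folder, the other from a single txt file) could be handled by turning both sources into collections of names before comparing. This is a minimal sketch under stated assumptions: the txt file is assumed to hold one file name per line, and the names ListasDeArquivos, LerFixo, and LerRecebidos are hypothetical, not taken from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ListasDeArquivos
{
    // Hypothetical helper: reads the fixed list from a text file.
    // Assumption: one file name per line; blank lines are ignored.
    public static HashSet<string> LerFixo(string caminhoTxt)
    {
        var nomes = File.ReadLines(caminhoTxt)
            .Select(linha => linha.Trim())
            .Where(linha => linha.Length > 0);

        // Assumption: file names are compared ignoring case.
        return new HashSet<string>(nomes, StringComparer.OrdinalIgnoreCase);
    }

    // Hypothetical helper: lists the names (without the directory part)
    // of the files directly inside a folder, not including subfolders.
    public static List<string> LerRecebidos(string pasta)
    {
        return Directory.EnumerateFiles(pasta)
            .Select(caminho => Path.GetFileName(caminho))
            .ToList();
    }
}
```

With both collections loaded this way, either of the comparisons shown in the answers can be applied without typing the 3,500 names by hand. If the txt file uses another layout (for example, names separated by semicolons), the parsing in LerFixo would need to change accordingly.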

2

You can use the Except method. In the code below, the variable resultado holds the records of _recebido that are not present in the list _fixo. If you instead want the records that the two lists have in common, you can use the Intersect method.

var resultado = _recebido.Except(_fixo);
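Since the question also asks to delete the missing names from the received list, the Except call could be combined with List<T>.RemoveAll. The sketch below is one possible way to do that; the class LimpezaRecebidos and the method RemoverAusentes are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LimpezaRecebidos
{
    // Hypothetical helper: removes from recebido every name that is missing
    // from fixo and returns the names that were removed.
    public static List<string> RemoverAusentes(List<string> recebido, IEnumerable<string> fixo)
    {
        // Except uses deferred execution, so the result is materialized with
        // ToList() before recebido is modified.
        var ausentes = recebido.Except(fixo).ToList();

        // Remove the missing names from the received list with a single lambda.
        var conjunto = new HashSet<string>(ausentes);
        recebido.RemoveAll(nome => conjunto.Contains(nome));

        return ausentes;
    }
}
```

Note that Except behaves like a set operation and returns each distinct value once, so a name repeated in recebido appears only once in the returned list, while RemoveAll still removes every occurrence from recebido.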

See working on Ideone
