Check separator type in a CSV file

Asked

Viewed 1,331 times

3

I rode a script for import of CSV with PHP but I’m having difficulty with checking the type of formatting CSV. I need to check if he’s separated by ,or ; or \t or : or |, but I’m not getting it.

Follows the Script:

<?php

    $arquivo = isset($_FILES["file_csv"]) ? $_FILES["file_csv"] : FALSE;

    $tipoValido = "text/comma-separated-values";

    if($tipoValido != $arquivo["type"])
    {
        echo "Arquivo em formato inválido! O arquivo deve ser extensão CSV. Envie outro arquivo";
        echo '<br/><a href="index.html">Fazer Upload de Outro Arquivos</a>';
    }
    else
    {
        preg_match("/\.(csv){1}$/i", $arquivo["name"], $ext);

        $arquivo_nome = date("d-m-Y_H-i-s") . "." . $ext[1];

        move_uploaded_file($arquivo["tmp_name"], $arquivo_nome);

        $row = 0;
        $handle = fopen ($arquivo_nome,"r");
        echo "<table border=1>";
        while ( $data = fgetcsv ($handle, 1000, ",") )
        {
            $num = count ($data);
            $row++;


            $coluna1 = $data[0];
            $coluna2 = $data[1];
            $coluna3 = $data[2];
            $coluna4 = $data[3];
            $coluna5 = $data[4];

            //$query = "INSERT INTO tabela ( coluna1 , coluna2 , coluna3 ) VALUES ( '$coluna1' , '$coluna2' , '$coluna3', '$coluna4', '$coluna5' )";
            //mysql_query($query) or die(mysql_error());

            if ($row != 1)
            {

                echo "<tr>";
                echo "<td>".$coluna1."</td>";
                echo "<td>".$coluna2."</td>";
                echo "<td>".$coluna3."</td>";
                echo "<td>".$coluna4."</td>";
                echo "<td>".$coluna5."</td>";
                echo "</tr>";
            }
        }

        echo "</table>";

        echo "<p>Dados da planilha inseridos com sucesso!<br/>Total de linhas inseridas: ".$row."</p>";

        fclose ($handle);

        unlink($arquivo_nome);
    }

?>
  • There are 5 fields this csv always?

  • @rray The idea is this, but if it is dynamic would be even better.

  • 1

    If you give a explode by any of these delimiters can know which is the right delimiter because it returns an array with some element

2 answers

3


I got a solution and I’ll leave it on record in case anyone needs it.

I created a function to check the file delimiter, follow the script complete with the reader of CSV:

<?php

function getArquivoDelimitador($arquivo_recebido, $verificar_linhas = 2)
{
    $arquivo_recebido = new SplFileObject($arquivo_recebido);

    $delimitadores = array(',', "\t", ';', '|', ':');

    $resultado = array();

    for($i = 0; $i < $verificar_linhas; $i++){
        $linha = $arquivo_recebido->fgets();
        foreach ( $delimitadores as $delimitador )
        {
            $regExp = '/['.$delimitador.']/';

            $fields = preg_split( $regExp, $linha );

            if( count( $fields ) > 1 )
            {
                if( !empty( $resultado[$delimitador] ) )
                {
                    $resultado[$delimitador]++;
                } else {
                    $resultado[$delimitador] = 1;
                }
            }
        }
    }

    $resultado = array_keys( $resultado, max( $resultado ) );

    return $resultado[0];
}

$arquivo = isset($_FILES["file_csv"]) ? $_FILES["file_csv"] : FALSE;

$delimitador = getArquivoDelimitador($arquivo['tmp_name']);

$tipoValido = "text/comma-separated-values";

if($tipoValido != $arquivo["type"])
{
    echo "Arquivo em formato inválido! O arquivo deve ser extensão CSV. Envie outro arquivo";
    echo '<br/><a href="index.html">Fazer Upload de Outro Arquivos</a>';
}
else
{
    preg_match("/\.(csv){1}$/i", $arquivo["name"], $ext);

    $arquivo_nome = date("d-m-Y_H-i-s") . "." . $ext[1];

    move_uploaded_file($arquivo["tmp_name"], $arquivo_nome);

    $row = 0;
    $handle = fopen ($arquivo_nome,"r");
    echo "<table border=1>";

    while ( $data = fgetcsv ($handle, 1000, $delimitador) )
    {
        $num = count ($data);
        $row++;


        $coluna1 = $data[0];
        $coluna2 = $data[1];
        $coluna3 = $data[2];
        $coluna4 = $data[3];
        $coluna5 = $data[4];

        //$query = "INSERT INTO tabela ( coluna1 , coluna2 , coluna3 ) VALUES ( '$coluna1' , '$coluna2' , '$coluna3', '$coluna4', '$coluna5' )";
        //mysql_query($query) or die(mysql_error());

        if ($row != 1)
        {

            echo "<tr>";
            echo "<td>".$coluna1."</td>";
            echo "<td>".$coluna2."</td>";
            echo "<td>".$coluna3."</td>";
            echo "<td>".$coluna4."</td>";
            echo "<td>".$coluna5."</td>";
            echo "</tr>";
        }
    }

    echo "</table>";

    echo "<p>Dados da planilha inseridos com sucesso!<br/>Total de linhas inseridas: ".$row."</p>";

    fclose ($handle);

        unlink($arquivo_nome);
    }

?>
  • @ctgPi It is, but it is not, by some that I do not know when it is passed to fgetcsv with the return of $resultado[0] it does not understand. I believe that in the dynamic assembly of the variable it is lost, even concatenating " doesn’t work.

  • @ctgPi did not understand when it refers to '\t' != "\t" in various places, can you indicate me more specifically? So I can try better, of course if you want to edit to make the code better all help is welcome

  • It worked, I removed the switch and is returning correctly, also made the change of < in place of <=, thank you. About reading a small CSV the for is limited to read only the first 2 lines by the variable $verificar_linhas, even so he believes that it would give problem?

  • 1

    Have you ever heard of "Murphy’s Law"? : ) It doesn’t have to be too complicated, just check if the fgets failed to break this case. I don’t know if this is code you just debugged, but echo "<td>".$coluna1."</td>"; needs htmlspecialchars() before concatenating.

  • I also suggest you read that answer.

  • It would be something like that if ($linha)&#xA; {&#xA; break;&#xA; echo "Erro ao processar o arquivo";&#xA; exit();&#xA; } ? I will read yes on the link, I had already searched and I will start using the mysqli thank you.

  • Not in that order because the break ensures that the rest of the code will not run. : ) I think the condition is $linha === false or $linha === FALSE, but I’m not sure. But you don’t need to call Exit(); I can’t want to import a one-line CSV? Or the first line is always the header?

  • I used the !$linha exactly why I didn’t know if it was false or FALSE, I thought this way would be any return false. On the header the correct is that always have, this to guide the columns, but I will remove here so that it is better its reuse.

  • !$linha doesn’t work; I just did a question about that.

  • I saw they answered your question but is there any difference between FALSE and false ?

  • Solve my problem, thank you...

Show 6 more comments

0

In my Java project, uniVocity-parsers, I created an algorithm capable of identifying:

  • value separator (,)

  • line separator

  • quotation marks (double or single)

  • quotes escapement (usually " or "")

See here the algorithm (warning: it is complex). The process is not simple because Voce has to ignore what is between quotes, values with line separator, etc.

It shouldn’t be too hard to recreate this code in PHP.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.