I am writing a program that extracts the words from a file with a specific format. Everything works except that accented characters come out wrong in the output file. I even added two debug lines that print each word to the console, and there the accented characters appear correctly. I believe it is an encoding problem, but I don't know what it is.
The output file looks like this (the odd characters are where the accented letters should be):
Abdala
abdel¡
abduct
achªmenes
Adã³Nide
Code:
while (<$in>) {
    if ($_ =~ /& .*/) { # Check whether the line has the format of interest.
        my @linha = split (//, $_);
        my $count = 0;
        # Find the start of the word.
        while ($linha[$count] !~ /[a-zA-Záéíóúãẽĩõũâêîôûàèìòùäëïöü]/) { $count++; }
        my $inicio = $count;
        # Find the end of the word and compute its length.
        while ($linha[$count] =~ /[a-zA-ZáéíóúãẽĩõũâêîôûàèìòùäëïöüÁÉÍÓÚÃẼĨÕŨÂÊÎÔÛÀÈÌÒÙÄËÏÖÜ]/) { $count++; }
        my $tamanho = $count - $inicio;
        # Get the word in lower case and write it to the file.
        my $palavra = lc (substr ($_, $inicio, $tamanho));
        print $out "$palavra\n";
        print $palavra; # DEBUG
        print "\n"; # DEBUG
    }
}
Try adding
use utf8::all;
at the top, for a start... – JJoao
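For reference, utf8::all is a CPAN module, so it has to be installed first. A minimal sketch of roughly the same setup using only core pragmas is below. It assumes the input really is UTF-8, and the helper name primeira_palavra is hypothetical, introduced only to illustrate the idea; it is not part of the original script.

```perl
use strict;
use warnings;
use utf8;                              # accented literals in this source file are UTF-8
use feature 'unicode_strings';         # lc/uc follow Unicode rules on every string
use open ':std', ':encoding(UTF-8)';   # UTF-8 layers on STDIN/STDOUT/STDERR and on later open() calls

# Hypothetical helper (not in the original script): returns the first run
# of letters on a line, lower-cased, or undef when the line has no letters.
sub primeira_palavra {
    my ($linha) = @_;
    return $linha =~ /(\p{L}+)/ ? lc $1 : undef;
}

# Example with an in-memory line instead of the real input file.
print primeira_palavra("& ADÓNIDE s.m."), "\n";
```

With the data decoded into characters on input, \p{L} matches any letter, so the long hand-written character classes may no longer be needed, and lc should lower-case accented capitals as whole characters instead of altering individual bytes.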
Post a sample of the input file, and check the input file's encoding with the command
file arquivo
– aod
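A similar check can be sketched in Perl itself with the core Encode module, as a complement to the file command. This only tests whether a byte string is well-formed UTF-8, it does not identify other encodings, and the helper name e_utf8_valido is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: returns 1 when the raw bytes are well-formed UTF-8,
# 0 otherwise. decode() dies on malformed input when given FB_CROAK.
sub e_utf8_valido {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, FB_CROAK); 1 } ? 1 : 0;
}

# "ó" is the two bytes C3 B3 in UTF-8, but the single byte F3 in Latin-1.
print e_utf8_valido("ad\xC3\xB3nide"), "\n";   # UTF-8 bytes
print e_utf8_valido("ad\xF3nide"), "\n";       # Latin-1 bytes
```

To test a real file, the bytes would have to be read without any decoding layer, for example from a handle opened with the '<:raw' mode.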