Character 'º' is accepted as UTF8 in an INSERT, but not when I use COPY

I’m trying to migrate data from a Firebird database to PostgreSQL 12.0.

To generate a .csv file containing the table records from Firebird, I use Fbexport:

./fbexport -Sc -D /opt/firebird/bin/measures.fdb -H localhost -U sysdba -P masterkey -F /home/dani/Documents/raw_cfgpar.out -Q "SELECT * FROM CFGPAR"

I created the database passing only its name, so all other parameters took their defaults, including "encoding", which according to the documentation should be inherited as UTF8 from the standard template, template1.

CREATE DATABASE measures;

The table contains integer columns and varchar columns.

When I try to load the records from the file using the COPY command, I get an error in the terminal:

COPY CFGPAR (CODIGO, PARNAME, PARVALUE, VRSCODIGO) from '/home/dani/Documents/raw_cfgpar.out' DELIMITER ',' CSV HEADER;
ERROR:  invalid byte sequence for encoding "UTF8": 0xba

This suggests that the character º is not accepted as UTF8.

But when I insert a record with INSERT through the terminal, the record is saved even though it contains the character º.

INSERT INTO cfgpar (codigo, parname, parvalue, vrscodigo) VALUES (55555, 'ºªã', 'ºº', 666);
INSERT 0 1
measures=# select * from cfgpar;
 codigo | parname | parvalue | vrscodigo
--------+---------+----------+-----------
  66666 | teste㺠| teste    |       555
  55555 | ºªã     | ºº       |       666
(2 rows)

Do I need to convert the CSV file generated by Fbexport to another format? If so, how do I convert it?

  • Have you tried making the encoding explicit? I think Postgres may be assuming a different default encoding. So: COPY CFGPAR (CODIGO, PARNAME, PARVALUE, VRSCODIGO) from '/home/dani/Documents/raw_cfgpar.out' DELIMITER ',' CSV HEADER ENCODING 'UTF8'; https://www.postgresql.org/docs/9.1/sql-copy.html

  • I tried it just now; the error remains the same: invalid byte sequence for encoding "UTF8": 0xba

  • According to the reference below, the byte 0xba is ISO-8859-1, not UTF8; it suggests converting before the migration! Reference: https://stackoverflow.com/questions/25599816/unicodedecodeerror-utf8-codec-cant-decode-byte-0xba-in-position-1266-invali

  • According to the Unicode standard: Unicode code point: U+00BA | Character: º | UTF-8: c2 ba (hex) | Name: MASCULINE ORDINAL INDICATOR
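
The comments above point at the export file being in a single-byte encoding rather than UTF8. The following Python sketch illustrates that reading: it checks that a lone 0xba byte is not valid UTF-8, that it decodes to º under ISO-8859-1, and that º in UTF-8 is c2 ba. It assumes the file produced by Fbexport is ISO-8859-1, which the thread does not confirm (it could be another single-byte encoding such as Windows-1252); the helper name reencode is hypothetical.

```python
# Sketch: why 0xBA fails as UTF-8, and how a Latin-1 export could be re-encoded.
# Assumption: the export file is ISO-8859-1 (Latin-1); the thread does not confirm this.

raw = b"\xba"  # the byte PostgreSQL rejected

# A lone 0xBA is a continuation byte in UTF-8, so strict decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Interpreted as ISO-8859-1, the same byte is U+00BA (masculine ordinal indicator).
char = raw.decode("iso-8859-1")

# Encoded as UTF-8, that character becomes the two bytes c2 ba.
utf8_bytes = char.encode("utf-8")


def reencode(src_path, dst_path, src_encoding="iso-8859-1"):
    """Rewrite a text file as UTF-8, reading it with the assumed source encoding."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

Usage might look like reencode('/home/dani/Documents/raw_cfgpar.out', '/home/dani/Documents/raw_cfgpar_utf8.out'), where the output path is hypothetical, followed by running COPY against the new file. Alternatively, since COPY accepts an ENCODING option, passing ENCODING 'LATIN1' instead of 'UTF8' may let PostgreSQL do the conversion itself, again assuming the file really is Latin-1. This reading would also be consistent with the INSERT succeeding, since a UTF-8 terminal typically sends º already encoded as c2 ba.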

No answers
