Invisible characters

Viewed 1,203 times

1

While building a sample web application, I noticed something odd in a string that the application sends to be rendered in HTML. I would like help understanding it.

At one point while building the page, I validate the object that is about to be persisted, like this:

   // surname (sobrenome): letters only
   String s = v.getSobrenome();
   if(s.isEmpty()||!Pattern.matches("[a-zA-Z]+", s))
      v.setSobrenome("!INVALIDO!");

The only problem is that an invalid character magically appeared between the O of INVALIDO and the exclamation mark. It takes up no space when displayed, but the keyboard cursor (arrow keys) counts it as a position.

When I went into the class folder to copy the file from work and analyze it at home, I ended up opening the .java file in the browser (Firefox) and came across this:

   // surname (sobrenome): letters only
   String s = v.getSobrenome();
   if(s.isEmpty()||!Pattern.matches("[a-zA-Z]+", s))
      v.setSobrenome("!INVALIDO„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„!ƒ");

I cannot think of any way that character could have "appeared" in the class.

The good news is that deleting it solves the problem, but I wonder whether anyone has run into this before, dealt with it, and knows where it comes from.

EDIT

This is starting to get complicated for me. I copied the character and pasted it into a clean class, and when I tried to save, Eclipse gave me an error saying the file could not be saved in a character encoding called "Cp1252".

  • 2

    What is the charset of the file? What charset does Firefox detect? There are several invisible control characters, such as \u200b, \u200e, and \u200f, but I have no idea how you managed to insert one of them by hand in your editor.

  • Eclipse is set to UTF-8. The browser could not identify the encoding, but I believe it is the same one.

  • I believe the cause is a copy-paste from another part of the code, where I used a pattern built with characters from the Windows 7 Character Map. But I had used that pattern before without any such conflict or misinterpretation.

  • I think it is pretty clear what happened, so what is missing is "how to fix it?" I suggest the following: 1) Create a "clean" class, as you did; 2) Copy all the code up to the start of the string "!INVALIDO!" (do not copy the string itself); 3) Copy all the code after the string (again, do not copy the string); 4) Type the string by hand. This should produce code without ghost characters. When saving, check the encoding: you say Eclipse is set to UTF-8, yet it complains about a Cp1252-related error when saving, so check the encoding your project is actually using.

  • After doing as suggested, there is no Cp1252 error; the file saves and works normally.

  • It seems my Eclipse now recognizes Cp1252 as the default, and if I set it as the encoding, it shows „ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ƒ„ in the middle of the string.
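
To see which invisible characters a string actually contains, rather than guessing from how an editor or browser renders it, one option is to walk its code points and report anything in the Unicode "format" or "control" categories. This is only a minimal sketch: the class and method names are made up for illustration, and the sample string assumes the ghost character is a zero-width space (U+200B), which the thread never confirms.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original code.
class InvisibleCharFinder {

    // Returns one entry per code point whose Unicode category is
    // "format" (Cf) or "control" (Cc), with its index and hex value.
    // Note: tabs and line breaks are control characters and are reported too.
    static List<String> findInvisible(String text) {
        List<String> hits = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int type = Character.getType(cp);
            if (type == Character.FORMAT || type == Character.CONTROL) {
                hits.add(String.format("index %d: U+%04X", i, cp));
            }
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: a zero-width space hidden before the last '!'
        String sample = "!INVALIDO\u200B!";
        for (String hit : findInvisible(sample)) {
            System.out.println(hit); // prints "index 9: U+200B"
        }
    }
}
```

Running the same check on the real string from the affected class would show the actual code point, which is more reliable than the rendering in Firefox or Eclipse.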

2 answers

0


After some research, I found that "CP1252" is the default Windows encoding for Western European locales, also known as "Windows-1252" and often loosely called "ANSI".

When I used the Character Map for another pattern in my code, the character must have been corrupted, or some control character was copied along with it, producing this strange ghost-character pattern.

By the way, "CP1252" is also the default in some IDEs, Eclipse among them.
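
The Eclipse save error mentioned in the question is consistent with a character that simply has no representation in Cp1252. A rough way to check this outside the IDE is to ask a `CharsetEncoder` whether it can encode the text. The sketch below makes two assumptions: that the JVM provides the `windows-1252` charset (common, but not one of the charsets Java guarantees), and that the ghost character is a zero-width space, used here only as an example. `EncodingCheck` and `fitsIn` are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of the original code.
class EncodingCheck {

    // True if every character of text can be represented in the named charset.
    static boolean fitsIn(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        String clean = "!INVALIDO!";
        String ghost = "!INVALIDO\u200B!"; // assumption: zero-width space

        System.out.println(fitsIn(clean, "windows-1252")); // true
        System.out.println(fitsIn(ghost, "windows-1252")); // false
        System.out.println(fitsIn(ghost, "UTF-8"));        // true
    }
}
```

If the second call returns false for the real string, the file can only be saved losslessly in an encoding such as UTF-8, or after the character is removed.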

0

Yes, this kind of mistake happens often, and I named it the "Caractere Fantasma" (ghost character).

From what I have observed, this problem happens when developers exchange code snippets through instant messengers.

  • neo, is there no better explanation for this?

  • 1

    I don't have one, Gustavo. I only have what I observed, and I didn't dig into the subject because I don't think it is necessary. If the code is written correctly but does not work, the rule is to delete the spaces and type them again.

  • 2

    Pasting code from the Chrome console or from jsfiddle may also cause this: http://stackoverflow.com/questions/12719859/syntaxerror-unexpected-token-illegal
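
As an alternative to retyping by hand, pasted text can be cleaned programmatically before it goes into a source file. The following is a sketch only, with made-up names: it removes every character in the Unicode "format" category (Cf), which covers U+200B, U+200E and U+200F mentioned in the comments on the question. It would also remove format characters that are there on purpose (for example joiners in some scripts), so it is best applied only to text that is expected to be plain ASCII.

```java
// Hypothetical helper, not part of the original code.
class GhostCharStripper {

    // Removes all Unicode format characters (general category Cf).
    static String strip(String text) {
        return text.replaceAll("\\p{Cf}", "");
    }

    public static void main(String[] args) {
        // Assumption: a zero-width space hidden before the last '!'
        String pasted = "!INVALIDO\u200B!";
        String cleaned = strip(pasted);
        System.out.println(pasted.length());  // 11
        System.out.println(cleaned.length()); // 10
        System.out.println(cleaned);          // !INVALIDO!
    }
}
```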
