Why does PHP allow you to create identifier names with special characters?

Typically, in programming languages and databases, identifier names (variables, functions, classes, methods, tables, fields, etc.) must start with a letter or underscore and may contain digits after that. To avoid problems, accented and special characters are usually avoided.

In PHP, variables must start with a dollar sign ($), but it is still possible to create identifiers with strange names like the ones below:

<?php
header('Content-Type: text/html; charset=utf-8');

function executarAção($ação){
    echo $ação .' <br>';
}

function 웃(){
    echo 'boneco de palito da uml é você mesmo? <br>'; // "UML stick figure, is that really you?"
}

function variavelEstranha(){
    ${0} = 'Olá mundo estranho :D'; // "Hello strange world :D"
    echo ${0};  
}

executarAção('kboom');
웃();
variavelEstranha();

Question:

Why does PHP allow you to create variables and functions with special characters?

Note: in my test, I saved the file as utf-8.

  • Why wouldn't it allow them?

  • Probably the reason is simply the lack of a reason (or of planning) not to allow it.

  • What a bizarre thing, I didn't know you could do this, haha. +1 for the examples.

  • @utluiz Yes, if there is no reason to ban it, you don't need a reason not to ban it.

  • I’ve seen several PHP files starting with <?php # -*- coding: utf-8 -*-... I knew why, but I forgot...

  • And I also don't know why someone would downvote without leaving a hint about what is wrong with the question.

  • This shows PHP's whole lack of coherence... It recognizes the function executarAção, but the native function for measuring the length of a string does not recognize accents.

  • Maybe you're mistaken. Post a question about it so people can analyze it. In some encodings the size of each character, in bytes, varies.

  • @Caffé, it's not a question, it's a demonstration of how PHP can cause these 'confusions'.

  • @Papacharlie At least in this respect, it causes me no confusion. Just don't confuse strlen() with mb_strlen().

  • @Caffé, everyone is free to vote as they see fit, with no obligation to explain why (it would be nice, but it is not required)... I think the +6 vs -1 speaks for itself.

  • I didn't know this was allowed for function names. Interesting. I know you can also use such characters in array keys ('漢 字'=>null)... Outside the PHP world, if PHP is "confusing or bizarre" for allowing this, then so is Oracle, because database, table and column names accept special characters as well.

  • @Danielomine MS SQL Server does not have this restriction either. Actually, I don't know of any language that does not allow the use of "special characters".
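
The strlen() versus mb_strlen() point raised in the comments above can be illustrated with a small sketch. It assumes the string is UTF-8 encoded and that the mbstring extension is loaded; the helper name lengths() is made up for this illustration and is not part of the original discussion.

```php
<?php
// Hypothetical helper (not from the original post): returns the byte count
// and the character count of a string that is assumed to be UTF-8.
function lengths(string $text): array
{
    return [
        'bytes' => strlen($text),             // strlen() counts bytes
        'chars' => mb_strlen($text, 'UTF-8'), // mb_strlen() counts characters (needs mbstring)
    ];
}

// "Ação" written with \u{...} escapes (PHP 7+), so the bytes are UTF-8
// no matter how this file itself was saved.
$result = lengths("A\u{E7}\u{E3}o");
echo $result['bytes'], ' bytes, ', $result['chars'], " characters\n";
// prints "6 bytes, 4 characters"
```

The two accented letters take two bytes each in UTF-8, which is why the byte count and the character count differ.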

1 answer

Because programmers from other places in the world, like Saudi Arabia, for example, would probably like to be able to create a function like this:

نفعلشيئا()

There is no reason for the compiler to ban the use of "special characters" in the source code.

You can agree not to use them in your project for some specific reason, and you can also set up a source code parser to break the build if it finds characters that are prohibited by convention. But the compiler does not know about, or does not care about, aspects of your team's culture.
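
As a rough sketch of such a convention check, the script below uses PHP's built-in tokenizer (token_get_all()) to list names that contain non-ASCII bytes. The function name findNonAsciiIdentifiers() and the sample source are made up for this illustration, and the check only looks at T_STRING and T_VARIABLE tokens, so it is a starting point rather than a complete linter.

```php
<?php
// Hypothetical convention checker (a sketch, not an existing tool):
// lists identifiers and variables that contain non-ASCII bytes.
function findNonAsciiIdentifiers(string $source): array
{
    $offenders = [];
    foreach (token_get_all($source) as $token) {
        // Single-character tokens such as "{" or ";" come back as plain strings.
        if (!is_array($token)) {
            continue;
        }
        [$id, $text, $line] = $token;
        $isName = $id === T_STRING || $id === T_VARIABLE;
        // Any byte outside 0x00-0x7F means the name is not pure ASCII.
        if ($isName && preg_match('/[^\x00-\x7F]/', $text) === 1) {
            $offenders[] = ['line' => $line, 'name' => $text];
        }
    }
    return $offenders;
}

// Sample input for the demonstration only.
$code = <<<'SRC'
<?php
function executarAção($ação) { echo $ação; }
function plainName($value) { echo $value; }
SRC;

$offenders = findNonAsciiIdentifiers($code);
foreach ($offenders as $o) {
    echo "line {$o['line']}: {$o['name']}\n";
}
// A build script could then fail when anything was found, for example:
// exit($offenders === [] ? 0 : 1);
```

Whether such names should fail the build remains a team decision; the interpreter itself accepts them.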

Update: excuse the use of the word "compiler" :-) Some languages are only interpreted, but the answer is the same.

By the way, your example, 웃, seems to be a valid character in Korean ("smiling", according to Google Translate).

Update 2 - a brief reflection: why don't we use special characters in our systems?

What makes a special character?

In this documentation for a Windows feature, I found an interesting definition:

Special characters are characters not found on the keyboard.

Now, on my keyboard I see everything I need to write "executarAção". So one of the definitions of special character is wrong (ours here in this question, or that of the Microsoft documentation writer).

I liked this other definition better:

They are characters such as punctuation and symbols (@ * ! % ; : . ) or blank spaces that are not accepted by the registration system when filling in the user name and password fields.

That is a good definition. It establishes the domain in which its notion of special characters applies: certain fields of the registration system.

So I conclude that the definition of "special characters" varies according to the context. What is a special character here may not be one there.

Our definition of special character:

We, Portuguese-speaking programmers, consider special even the characters that are part of our everyday life: the cedilla and accents. We don't like them in our systems because... because... why, exactly? Of course, we don't even need a formal definition; our experience shows that they only cause problems:

  • Each programmer saves the file with a different encoding, so the cedilla you saved on your machine appears as a strange symbol on mine.

  • Each application that uses our database has been compiled or will be interpreted using a different encoding, so the SELECT I wrote with a cedilla in one application will not find the table created with a cedilla from another application.

  • Each programmer has a different "notion" of which words are accented, so the function I name with an accent will not be easily found by another programmer who believes that the word takes no accent.

  • And so on and so on...

Conclusion: a well-meaning PHP programmer may well find it nice that Portuguese speakers can use their accents, so common, in the source code. If the interpreter has no problem dealing with these characters, why limit their use? And of course, as I mentioned, there are many languages out there using the most diverse "special characters".

The problems I mentioned can be ignored by this PHP programmer because it is "easy" to eliminate them: it is enough that everyone uses UTF-8 encoding in their editors and other tools, and that everyone knows their own language well. We don't bet on that (at least I don't), so we follow our tradition of agreeing not to use special characters.

Clearly, before the advent and widespread adoption of more modern encoding standards (UTF-8), the problems with "special characters" were serious because the encodings were limited. Even then, compilers and interpreters could not do much to limit the use of characters because, as already said, the definition of which characters are special varies.
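
One concrete reason the interpreter has no trouble with these characters: the PHP manual describes a valid label as a letter or underscore followed by letters, digits or underscores, where "letter" also covers any byte from 0x80 to 0xff. Every byte of a multibyte UTF-8 character falls in that range, so the rule works without knowing which encoding the file uses. A minimal sketch of that rule follows; the helper name isValidLabel() is made up for this illustration, and the pattern is adapted from the one given in the manual.

```php
<?php
// Hypothetical helper: checks a name against the label pattern documented
// in the PHP manual (bytes 0x80-0xff are accepted as "letters").
function isValidLabel(string $name): bool
{
    return preg_match('/^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$/D', $name) === 1;
}

foreach (['executarAção', '웃', 'name1', '0', '1name'] as $candidate) {
    echo $candidate, ' => ', isValidLabel($candidate) ? 'valid' : 'invalid', "\n";
}
```

A name like 0 fails this check, which is why the question's example has to write ${0} rather than $0.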

  • And shouldn't ${0} generate an error, e.g. "unexpected {" or "statement (if/while/class) expected before {"?

  • @lost Not necessarily. PHP already knows that you are referring to a variable because of the $, so it only needs to ban the characters it can't handle in that context. There are other languages that, for example, do not prohibit the use of keywords in a context where the compiler knows that the word cannot be acting as a keyword.
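
The ${0} case discussed in these comments can be sketched as follows. The brace form takes an arbitrary expression and uses its string value as the variable name, so the name does not have to be a valid label. The function name braceVariables() is made up for this illustration.

```php
<?php
// Hypothetical demo: ${expr} looks up a variable whose name is the string
// value of expr, so names that plain $name syntax cannot express are reachable.
function braceVariables(): array
{
    ${0} = 'first';            // variable named "0"; writing $0 would be a parse error
    ${'two words'} = 'second'; // a name containing a space

    $name = 'two words';
    return [${'0'}, $$name];   // same variables, reached through other expressions
}

print_r(braceVariables());
```

Both values come back in the returned array, showing that the oddly named variables are ordinary variables once they exist.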
