3-letter abbreviation table for cities by state

In the state of SP, for example, ADA would be Adamantina, ADO Adolfo, AGU Aguai, AGD Agudos, ..., ADC Alvaro de Carvalho, ABR Americo Brasiliense, ..., SJC Sao Jose dos Campos, ... but this "would be" is just a guess.

The three letters will appear on physical media (plates) and in digital form, so they cannot be arbitrary. We would prefer to follow a standard, as IATA (airports) and Anatel did, but not a federal standard (a namespace of ~5500 items for the 3 letters). We want a state-level standard, where for example in SP we have ~650 items to map onto the same space of 3-letter codes.

TECHNICAL NOTE

The 3-letter code plays the role of a hash, and in that sense the classic collision-probability reasoning applies. With 26^3 = 17576 combinations of 3 letters, to keep the chance of collision at acceptable levels we cannot map more than a fraction of them, say 1% to 5% of the 17576 combinations. The federal namespace of 5500 (31%!) went far beyond that and Anatel paid the price: it was forced to use Y instead of I, or meaningless abbreviations, because of the excess of collisions among the most mnemonic choices. The result looks more like car plates than mnemonic acronyms.
By my estimate, the 3% of SP or at most 4% of MG are reasonable and would give good results.
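The occupancy figures in the note can be checked with a few lines of C#. This is only a sketch: the class and method names are hypothetical, and the expected-collision estimate assumes codes are drawn uniformly at random, which mnemonic codes are not, so it should be read as a rough lower bound on the difficulty.

```csharp
using System;

public static class NamespaceLoad
{
    // Number of distinct 3-letter codes over a 26-letter alphabet: 26^3 = 17576.
    public const int TotalCodes = 26 * 26 * 26;

    // Fraction of the code space occupied when `items` names each get one code.
    public static double LoadFactor(int items)
    {
        return (double)items / TotalCodes;
    }

    // Expected number of colliding pairs if every name picked a code uniformly
    // at random (birthday-problem estimate: n(n-1)/2 pairs, each colliding with
    // probability 1/TotalCodes). Assumption: uniform random choice.
    public static double ExpectedCollidingPairs(int items)
    {
        return (double)items * (items - 1) / 2.0 / TotalCodes;
    }
}
```

With the numbers from the question, `NamespaceLoad.LoadFactor(650)` is about 0.037 and `NamespaceLoad.LoadFactor(5500)` is about 0.31. Under the uniform assumption the expected number of colliding pairs grows from roughly 12 at 650 items to roughly 860 at 5500 items.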


If a ready-made, standardized algorithm exists, that would also be a solution. Metaphone, for example, is a standard, but it does not preserve the original alphabet, nor does it generate mnemonic codes from the initials of compound names of two or more words (e.g. São José dos Campos should result in SJC). I believe we do not need to "reinvent the wheel" and that the ideal algorithm for Portuguese already exists... Roughly, the ideal algorithm is simple and does the following:

  1. If the name is a single word, try its first 3 letters.
    Itu and Jaú are resolved automatically; Campinas and Marilia would become CAM and MAR.

    1.1. If there is a collision (e.g. Marinopolis cannot use MAR), adopt the first letter followed only by consonants (e.g. MRN).

  2. Names composed of several words: first try the initials, skipping prepositions; then generate combinations by applying the previous item to the first or last words of the name.
    Example: Santa Rita do Passa Quatro can be SRP, SPQ, ... or STQ, SQT, etc.
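The two rules above can be sketched as a candidate generator plus a first-free-wins assignment. This is one possible reading of the outline, not a standard: the class `SiglaSketch`, the preposition list, the handling of two-word names, and the assumption that names arrive uppercase and without accents are all choices made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SiglaSketch
{
    // Assumed stop-word list; the outline only says "without preposition".
    static readonly HashSet<string> Prepositions =
        new HashSet<string> { "DE", "DO", "DA", "DOS", "DAS", "E" };

    const string Vowels = "AEIOU";

    // Yields candidate codes, most preferred first, for an uppercase unaccented name.
    public static IEnumerable<string> Candidates(string name)
    {
        string[] words = name
            .Split(new[] { ' ', '-', '\'' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(p => !Prepositions.Contains(p))
            .ToArray();

        if (words.Length == 1)
        {
            string word = words[0];
            // Rule 1: the first three letters.
            if (word.Length >= 3) yield return word.Substring(0, 3);
            // Rule 1.1: first letter followed by the next two consonants.
            string rest = new string(word.Skip(1).Where(ch => Vowels.IndexOf(ch) < 0).ToArray());
            if (rest.Length >= 2) yield return word[0] + rest.Substring(0, 2);
        }
        else if (words.Length == 2)
        {
            // Assumed reading of rule 2 for two words: borrow one extra letter from either word.
            string a = words[0], b = words[1];
            if (a.Length >= 2) yield return a.Substring(0, 2) + b[0];
            if (b.Length >= 2) yield return a[0] + b.Substring(0, 2);
        }
        else if (words.Length >= 3)
        {
            // Rule 2: initials of the significant words (first three, then first plus last two).
            string initials = new string(words.Select(p => p[0]).ToArray());
            yield return initials.Substring(0, 3);
            if (initials.Length > 3)
                yield return initials[0] + initials.Substring(initials.Length - 2);
        }
    }

    // Gives each city, in input order, its first candidate that is still free.
    // A city with no free candidate is mapped to null.
    public static Dictionary<string, string> Assign(IEnumerable<string> cities)
    {
        var taken = new HashSet<string>();
        var result = new Dictionary<string, string>();
        foreach (string city in cities)
        {
            string chosen = null;
            foreach (string candidate in Candidates(city))
            {
                if (taken.Add(candidate)) { chosen = candidate; break; }
            }
            result[city] = chosen;
        }
        return result;
    }
}
```

Calling `SiglaSketch.Assign(new[] { "MARILIA", "MARINOPOLIS", "SAO JOSE DOS CAMPOS" })` gives MAR to MARILIA, MRN to MARINOPOLIS and SJC to SAO JOSE DOS CAMPOS. The candidate list is deliberately short, so in a full state list some names would end up without a code and would need further fallback rules.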

Ideally, the algorithm should be well founded and already adopted as a standard elsewhere.

There is another problem for the algorithm to solve, so that it is fair to future users of the acronyms it creates: who deserves the "nicest" acronym?
For example, does BOR go to Borborema (15790 inhabitants and founded in ) or to Borá (890 inhabitants and founded in 1965)? The oldest or the most populous?
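Whatever criterion is chosen, it can be applied by ordering the cities before codes are handed out, so that the city with priority gets first pick. The sketch below assumes "most populous first"; the `City` type and `ByPopulation` method are hypothetical, and a founding year could be used as the sort key in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrioritySketch
{
    // Hypothetical record of a city; only the fields needed for ordering.
    public sealed class City
    {
        public string Name { get; }
        public int Population { get; }
        public City(string name, int population) { Name = name; Population = population; }
    }

    // Most populous first; ties broken by name so the order is deterministic.
    public static List<string> ByPopulation(IEnumerable<City> cities)
    {
        return cities
            .OrderByDescending(c => c.Population)
            .ThenBy(c => c.Name, StringComparer.Ordinal)
            .Select(c => c.Name)
            .ToList();
    }
}
```

With the two cities from the example, Borborema (15790 inhabitants) comes before Borá (890 inhabitants), so under this criterion Borborema would be offered BOR first.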


References

  • Brazilian cities with airports: apparently all of them should have IATA abbreviations, but I couldn’t find a systematic list of cities for Brazil (not to be confused with a list of airports)... A few examples, though I don’t know whether they are official: ANS Anápolis-GO, PTM Patos de Minas-MG, PTS Patos-PB, URA Uberaba-MG, UDI Uberlândia-MG... and the more exotic ones, BSB Brasília-DF, GYN Goiânia-GO, ...

  • Anatel abbreviations: available as a PDF, apparently from 2013 (according to the Date HTTP header).

  • CDHU/SP abbreviations: they do not have 3 letters, but they are all simple, short words, which greatly simplifies the algorithm’s work. They would help as an "official reference" if there is no other starting point... Only available as a PDF, apparently from 2005.

  • Data on cities, by state, etc.: Wikidata. Since writing SPARQL queries can be tedious, some tables, like the one for São Paulo, are already prepared, and a validated dataset can also be used.

  • How about taking Anatel’s and separating by state?

  • @Jeffersonquesado I edited the question to explain it better; check the "TECHNICAL NOTE" and see whether it needs review.

  • @Bacco, thank you for explaining your downvote, but note that I’m only giving the problem some context; I don’t care whether you answer in C# or Perl or anything else... And this is precisely a problem that I made a point of bringing here and not to the main Stack Overflow, because I want to emphasize the context of the Portuguese language first (!). These are numerical questions; I could remove everything I wrote about Brazil and SP, but I would lose the context... How about waiting for the community to weigh in? There are already 2 votes, and I believe someone will answer within a week... Or do you think it is more important to make it look more like a "programming problem"?

  • My vote is not binding; four more close votes would be needed. It may be that people don’t close it. I actually think the question is quite elaborate, but I can’t see a real programming problem in it, at least not one that isn’t too broad. The need itself I find really out of scope (but that’s how I see it, not necessarily how others see it).

  • Anyway, I withdrew the comment and the vote so as not to influence anyone, and I’ll delete these comments later.

  • @Peterkrauss if I understand correctly, you want an algorithm that defines a 3-letter code for each municipality in Brazil? We have 5564 municipalities (second source: https://ww2.ibge.gov.br/home/statica/populacao/count2007/popmunic2007layoutTCU14112007.xls).

  • I initially thought @Bacco was being too orthodox about the scope. The question is interesting, and even though it is strictly out of scope, I thought the rules could be stretched a little. Seeing how the question evolved, I even considered closing it as unclear or as "opinion-based" as well. I decided to go with Bacco’s suggestion because, at heart, this is a question for a geographer to answer.

  • Hi @Diogolindoso, what I tried to express is that abbreviations for the ~5600 municipalities already exist, made by Anatel, and that this is not what we want ("... but not a federal standard"). We are looking for abbreviations for smaller sets, by state (SP, MG, etc.), which never exceed ~800 items. As for the algorithm I outlined, it is a guess. It’s fun to write, but the hard part is finding out whether there is a "standard algorithm" that has already been used or has a solid foundation.

  • If there is no standard (and your previous search seems to indicate this), any path will be arbitrary. You’re looking for the least arbitrary one possible, right? "Possible" within conditions that you are setting yourself. Those conditions would need to be defined more clearly. For example, should population be taken into account? The answer that appeared disregarded it. The collision rules are not clear either. Can acronyms repeat across different states? Once the conditions are defined, the algorithm is practically given. In that case, what would be left to ask?

  • Well, you already answered one of my questions in the comment posted almost at the same time as mine. :) I reiterate that I find the subject interesting, but I do not know how to make the question less problematic.

  • @bfavaretto, yes, we lack parameters and foundations. I do not know whether to edit the question again or to start a GitHub repository ;-) Since I have already started answering, I will run some tests and also keep the question and the answer reasonably compatible. The answer looks good; maybe we’re done here.

  • Any progress, @Peterkrauss?

  • Hi @Rovannlinhalis, I’m waiting for the green light on a project to resume the subject here, with supporting material on GitHub... Actually, I had already published a stopgap algorithm in PHP, similar to the current answer (which was great!), before I came to ask the question... The project does not target the ZIP code, but it can be better understood as a replacement of the numerical ZIP code by a code with letters and numbers, starting with the 2 letters of the state, followed by 3 letters of the municipality. See https://github.com/OSMBrasil/CRP and http://datasets.ok.org.br/city-codes


1 answer

I also searched for something official and found nothing; I believe that if it existed, it would be at IBGE.

The solution could be a custom algorithm. Of course, the generated records should be stored so that, if a new city is created in the state, the existing results stay the same.
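A minimal sketch of that "store the records" idea, assuming the stored codes live in a dictionary keyed by city name (the `StableCodes` class and `GetOrAssign` method are hypothetical names). A city that already has a code keeps it, and a new city only receives a code that is still free:

```csharp
using System;
using System.Collections.Generic;

public static class StableCodes
{
    // Returns the stored code when the city already has one. Otherwise picks the
    // first candidate not used by any stored city, records it, and returns it.
    // Returns null when every candidate is already taken.
    public static string GetOrAssign(
        IDictionary<string, string> stored,   // city name -> code, persisted by the caller
        string city,
        IEnumerable<string> candidates)       // preferred codes, best first
    {
        string existing;
        if (stored.TryGetValue(city, out existing)) return existing;

        var used = new HashSet<string>(stored.Values);
        foreach (string candidate in candidates)
        {
            if (used.Add(candidate))
            {
                stored[city] = candidate;
                return candidate;
            }
        }
        return null;
    }
}
```

Because existing entries are never recomputed, adding a city later cannot change a code that has already been printed on a plate, regardless of where the new name falls in alphabetical order.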


With a little time and several for loops, I wrote this algorithm to generate the acronyms. I did not take the state (UF) into account, and the cities are sorted alphabetically.

The priority is for the first letter to be the start of the first word, the second letter the start of the second word, and the third letter the start of the last word.

If this is not possible, it tries the subsequent letters of the following or previous words (in the case of the last letter).

If that is not possible either, it tries the next 2 letters of the first word, and failing that, it scans the first word in reverse to complete the acronym.

If after all attempts no acronym is available, the success flag remains false and the program warns that it has not generated an acronym for that name.

The commented code follows:

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        List<string> cidades = new List<string>(645);
        #region addCidades
        cidades.Add("BIRITIBA-MIRIM");
        cidades.Add("BOA ESPERANCA DO SUL");
        cidades.Add("CONCHAL");
        cidades.Add("DOBRADA");
        cidades.Add("DOIS CORREGOS");
        cidades.Add("MERIDIANO");
        cidades.Add("MESOPOLIS");
        cidades.Add("MIGUELOPOLIS");
        cidades.Add("MINEIROS DO TIETE");
        cidades.Add("MIRA ESTRELA");
        cidades.Add("MIRACATU");
        cidades.Add("MIRANDOPOLIS");
        cidades.Add("NOVA CAMPINA");
        cidades.Add("PANORAMA");
        cidades.Add("VIRADOURO");
        cidades.Add("VISTA ALEGRE DO ALTO");
        cidades.Add("VITORIA BRASIL");
        cidades.Add("VOTORANTIM");
        cidades.Add("VOTUPORANGA");
        cidades.Add("ZACARIAS");
        #endregion

        // First the list is sorted alphabetically, so that (in the full list) the acronym SAL is generated for the city SALES and not for the city SALTO.
        cidades.Sort(StringComparer.Ordinal);

        // Maps each acronym already in use to the city that owns it.
        Dictionary<string, string> siglas = new Dictionary<string, string>();

        foreach (string c in cidades)
        {
            string[] cs = c.Replace("'", " ").Replace("-", " ").Split(' '); // Split each name into words with Split(' '); names with a hyphen or apostrophe are also split into words.
            bool sucesso = false;
            string sigla = null;
            for (int i = 0; i < cs.Length && !sucesso; i++) // Go through each word
            {
                for (int j = 0; j < cs[i].Length && !sucesso; j++) // Go through each letter of the current word (i)
                {
                    sigla = cs[i][j].ToString(); // First letter of the acronym

                    for (int k = i + 1; k < cs.Length && !sucesso; k++) // Entered only if there are more words after the current word (i)
                    {
                        for (int l = 0; l < cs[k].Length && !sucesso; l++) // Go through each letter of word (k)
                        {
                            sigla = cs[i][j].ToString() + cs[k][l].ToString(); // Second letter of the acronym

                            for (int m = cs.Length - 1; m > k && !sucesso; m--) // Entered only if there are more words beyond word (k), starting from the last one
                            {
                                for (int n = 0; n < cs[m].Length && !sucesso; n++) // Go through each letter of word (m)
                                {
                                    sigla = cs[i][j].ToString() + cs[k][l].ToString() + cs[m][n].ToString(); // Third letter of the acronym

                                    if (!siglas.ContainsKey(sigla)) // If the acronym has not been used yet
                                    {
                                        siglas.Add(sigla, c); // Add it to the dictionary
                                        sucesso = true; // Set the flag so that all the other loops exit
                                    }
                                }
                            }

                            for (int m = l + 1; m < cs[k].Length && !sucesso; m++) // If no third letter came from a word (m), go through the letters after (l) in word (k)
                            {
                                sigla = cs[i][j].ToString() + cs[k][l].ToString() + cs[k][m].ToString(); // Third letter of the acronym

                                if (!siglas.ContainsKey(sigla)) // If the acronym has not been used yet
                                {
                                    siglas.Add(sigla, c); // Add it to the dictionary
                                    sucesso = true; // Set the flag so that all the other loops exit
                                }
                            }

                        }
                    }


                    for (int m = j + 1; m < cs[i].Length && !sucesso; m++) // If no second letter came from a word (k), go through word (i) itself
                    {
                        if (m + 1 < cs[i].Length) // Prefer the next letter, if there is one
                        {
                            sigla = cs[i][j].ToString() + cs[i][m].ToString() + cs[i][m + 1].ToString(); // Build the acronym with the second (m) and third (m+1) letters

                            if (!siglas.ContainsKey(sigla)) // If the acronym has not been used yet
                            {
                                siglas.Add(sigla, c); // Add it to the dictionary
                                sucesso = true; // Set the flag so that all the other loops exit
                            }
                        }

                        for (int n = cs[i].Length - 1; n >= 0 && !sucesso; n--) // Go through word (i) in reverse to pick the third letter
                        {
                            sigla = cs[i][j].ToString() + cs[i][m].ToString() + cs[i][n].ToString(); // Build the acronym with the second (m) and third (n) letters

                            if (!siglas.ContainsKey(sigla)) // If the acronym has not been used yet
                            {
                                siglas.Add(sigla, c); // Add it to the dictionary
                                sucesso = true; // Set the flag so that all the other loops exit
                            }
                        }
                    }

                }
            }


            if (sucesso) // An acronym was generated
            {
                Console.WriteLine("Sigla " + sigla + " gerada para a cidade " + c);
            }
            else // No combination was available
            {
                Console.WriteLine("Não foi gerada sigla para a cidade " + c);
            }


        }

        Console.WriteLine("Siglas geradas: " + siglas.Count + " de um total de " + cidades.Count + " cidades");

        Console.ReadKey();

    }
}

Result:

Sigla BMI gerada para a cidade BIRITIBA-MIRIM
Sigla BES gerada para a cidade BOA ESPERANCA DO SUL
Sigla CON gerada para a cidade CONCHAL
Sigla DOB gerada para a cidade DOBRADA
Sigla DCO gerada para a cidade DOIS CORREGOS
Sigla MER gerada para a cidade MERIDIANO
Sigla MES gerada para a cidade MESOPOLIS
Sigla MIG gerada para a cidade MIGUELOPOLIS
Sigla MDT gerada para a cidade MINEIROS DO TIETE
Sigla MET gerada para a cidade MIRA ESTRELA
Sigla MIR gerada para a cidade MIRACATU
Sigla MIS gerada para a cidade MIRANDOPOLIS
Sigla NCA gerada para a cidade NOVA CAMPINA
Sigla PAN gerada para a cidade PANORAMA
Sigla VIR gerada para a cidade VIRADOURO
Sigla VAA gerada para a cidade VISTA ALEGRE DO ALTO
Sigla VBR gerada para a cidade VITORIA BRASIL
Sigla VOT gerada para a cidade VOTORANTIM
Sigla VOA gerada para a cidade VOTUPORANGA
Sigla ZAC gerada para a cidade ZACARIAS
Siglas geradas: 20 de um total de 20 cidades

I put it on .NET Fiddle.

Note: to keep the code short, I included only 20 cities here. The fiddle has all 645 cities of SP that I had in my database.


Extra:

I generated the acronyms for the 2977 cities I have in my database and there was no problem.

  • Man, awesome. Could you share the idea behind so many loops?

  • @Jeffersonquesado I tried to convey the idea in the description... what do you think?
