There are two good approaches you can take:
Option one: Phonetizing the string
There is a question here where this is widely debated. I didn't like any of the solutions proposed there, so I wrote my own:
    using System;
    using System.Text;

    public static class FoneticaHelper
    {
        // Reduces a Portuguese term to a rough phonetic key.
        // Rule order matters: each Replace runs on the output of the previous ones.
        public static String Fonetizar(String termo)
        {
            if (String.IsNullOrEmpty(termo)) return "";

            var sb = new StringBuilder(termo.ToUpper());

            // Strip accents from vowels.
            sb.Replace("Á", "A");
            sb.Replace("À", "A");
            sb.Replace("Ã", "A");
            sb.Replace("Ê", "E");
            sb.Replace("É", "E");
            sb.Replace("Í", "I");
            sb.Replace("Ó", "O");
            sb.Replace("Õ", "O");
            sb.Replace("Ú", "U");
            sb.Replace("Y", "I");

            // Collapse consonant clusters and similar-sounding letters.
            sb.Replace("BR", "B");
            sb.Replace("BL", "B");
            sb.Replace("PH", "F");
            sb.Replace("MG", "G");
            sb.Replace("NG", "G");
            sb.Replace("RG", "G");
            sb.Replace("GE", "J");
            sb.Replace("GI", "J");
            sb.Replace("RJ", "J");
            sb.Replace("MJ", "J");
            sb.Replace("NJ", "J");
            sb.Replace("GR", "G");
            sb.Replace("GL", "G");
            sb.Replace("CE", "S");
            sb.Replace("CI", "S");
            sb.Replace("CH", "S");
            sb.Replace("CT", "T");
            sb.Replace("CS", "S");
            sb.Replace("Q", "K");
            sb.Replace("CA", "K");
            sb.Replace("CO", "K");
            sb.Replace("CU", "K");
            sb.Replace("CK", "K");
            sb.Replace("C", "K");
            sb.Replace("LH", "L");
            sb.Replace("RM", "SM");
            sb.Replace("N", "M");
            sb.Replace("GM", "M");
            sb.Replace("MD", "M");
            sb.Replace("NH", "N"); // never matches: every N is already M at this point
            sb.Replace("PR", "P");
            sb.Replace("X", "S");
            sb.Replace("TS", "S");
            sb.Replace("C", "S"); // never matches: every C is already K at this point
            sb.Replace("Z", "S");
            sb.Replace("RS", "S");
            sb.Replace("TR", "T");
            sb.Replace("TL", "T");
            sb.Replace("LT", "T");
            sb.Replace("RT", "T");
            sb.Replace("ST", "T");
            sb.Replace("W", "V");

            // Normalize the last letter: drop a trailing S/R/M, turn a trailing L into U.
            // (Z and N cannot occur here because they were replaced above.)
            int tam = sb.Length - 1;
            if (tam > -1)
            {
                switch (sb[tam])
                {
                    case 'S':
                    case 'Z':
                    case 'R':
                    case 'M':
                    case 'N':
                        sb.Remove(tam, 1);
                        break;
                    case 'L':
                        sb[tam] = 'U';
                        break;
                }
            }

            // Drop a trailing "AO".
            tam = sb.Length - 2;
            if (tam > -1)
            {
                if (sb[tam] == 'A' && sb[tam + 1] == 'O')
                {
                    sb.Remove(tam, 2);
                }
            }

            // ---------
            sb.Replace("Ç", "S");
            sb.Replace("L", "R");
            /* if (!_usarVogais)
            {
                sb.Replace("A", "");
                sb.Replace("E", "");
                sb.Replace("I", "");
                sb.Replace("O", "");
                sb.Replace("U", "");
            } */
            sb.Replace("H", "");

            if (sb.Length <= 0) return "";

            // Collapse runs of the same character, but keep repeated digits.
            StringBuilder frasesaida = new StringBuilder();
            frasesaida.Append(sb[0]);
            for (int i = 1; i < sb.Length; i++)
            {
                if (frasesaida[frasesaida.Length - 1] != sb[i] || char.IsDigit(sb[i]))
                    frasesaida.Append(sb[i]);
            }

            return frasesaida.ToString();
        }
    }
I also made the context phonetize values automatically whenever a property name ends with "Fonetizado" ("phonetized"). For example, NomeFonetizado receives the phonetized value of the Nome column automatically when a record is inserted or updated:
    // Goes inside your DbContext class. Assumes the EF6 namespaces System.Data.Entity,
    // System.Data.Entity.Core.Objects, System.Data.Entity.Infrastructure and
    // System.Data.Entity.Validation, plus System.Linq and System.Web.
    public override int SaveChanges()
    {
        var context = ((IObjectContextAdapter)this).ObjectContext;

        // Every added or modified entity that implements IEntidade.
        IEnumerable<ObjectStateEntry> objectStateEntries =
            from e in context.ObjectStateManager.GetObjectStateEntries(EntityState.Added | EntityState.Modified)
            where
                e.IsRelationship == false &&
                e.Entity != null &&
                typeof(IEntidade).IsAssignableFrom(e.Entity.GetType())
            select e;

        var currentTime = DateTime.Now;
        // Falls back to a fixed name when there is no HTTP context.
        var usuario = HttpContext.Current != null ? HttpContext.Current.User.Identity.Name : "MeuUsuario";

        foreach (var entry in objectStateEntries)
        {
            dynamic entityBase = entry.Entity;

            if (entry.State == EntityState.Added || entityBase.DataCriacao == DateTime.MinValue)
            {
                entityBase.DataCriacao = currentTime;
                entityBase.UsuarioCriacao = usuario;
            }

            entityBase.UltimaModificacao = currentTime;
            entityBase.UsuarioModificacao = usuario;

            foreach (var prop in entry.Entity.GetType().GetProperties())
            {
                if (!prop.Name.EndsWith("Fonetizado", StringComparison.Ordinal)) continue;

                // NomeFonetizado -> Nome
                var colunaRelacionada = prop.Name.Substring(0, prop.Name.Length - "Fonetizado".Length);
                var propriedadeOriginal = entry.Entity.GetType().GetProperty(colunaRelacionada);
                if (propriedadeOriginal == null) continue; // no matching source property

                // A null source value yields an empty key instead of a NullReferenceException.
                var valorOriginal = propriedadeOriginal.GetValue(entry.Entity, null);
                var texto = valorOriginal == null ? "" : valorOriginal.ToString();
                prop.SetValue(entry.Entity, FoneticaHelper.Fonetizar(texto), null);
            }
        }

        try
        {
            return base.SaveChanges();
        }
        catch (DbEntityValidationException e)
        {
            // Log each validation error, then rethrow without losing the stack trace.
            foreach (var eve in e.EntityValidationErrors)
            {
                Console.WriteLine("Entity of type \"{0}\" in state \"{1}\" has the following validation errors:",
                    eve.Entry.Entity.GetType().Name, eve.Entry.State);
                foreach (var ve in eve.ValidationErrors)
                {
                    Console.WriteLine("- Property: \"{0}\", Error: \"{1}\"",
                        ve.PropertyName, ve.ErrorMessage);
                }
            }
            throw;
        }
    }
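To make the search side concrete, here is a minimal sketch of how the stored phonetic column could be queried. The Pessoa class and the BuscaFonetica helper are hypothetical names used only for illustration, and the phonetizing function is passed in as a parameter so the example stands alone; in the real application you would pass FoneticaHelper.Fonetizar. The point is that the search term is phonetized once, outside the query, and then compared against the already phonetized column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity: Nome is the original value, NomeFonetizado its phonetic key.
public class Pessoa
{
    public string Nome { get; set; }
    public string NomeFonetizado { get; set; }
}

public static class BuscaFonetica
{
    // Returns the people whose stored phonetic key contains the key of the search term.
    // "fonetizar" is whatever function produced the stored keys (an assumption of this sketch).
    public static List<Pessoa> Buscar(IEnumerable<Pessoa> pessoas, string termo, Func<string, string> fonetizar)
    {
        // Compute the key before the query: a custom C# method cannot be translated to SQL.
        var chave = fonetizar(termo ?? "");
        if (chave.Length == 0) return new List<Pessoa>();

        return pessoas
            .Where(p => p.NomeFonetizado != null && p.NomeFonetizado.Contains(chave))
            .ToList();
    }
}
```

Against a real DbSet the same Where clause should work, because by then the key is just a string parameter.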
Option two: Levenshtein distance
With this algorithm, you compute a distance between two slightly different strings: the number of single-character insertions, deletions, or substitutions needed to turn one into the other. There is a detailed explanation of how to do it here. If the computed distance is within your maximum tolerance, you return the record.
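As a rough illustration, the distance calculation could look like the sketch below. The class name LevenshteinHelper and the IsWithinTolerance method are my own names for this example, not part of any library, and the tolerance value is something you would have to tune for your data.

```csharp
using System;

public static class LevenshteinHelper
{
    // Classic dynamic-programming Levenshtein distance, keeping only two rows in memory.
    public static int Distance(string a, string b)
    {
        a = a ?? "";
        b = b ?? "";
        if (a.Length == 0) return b.Length;
        if (b.Length == 0) return a.Length;

        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + cost;
                current[j] = Math.Min(Math.Min(insertion, deletion), substitution);
            }
            // Swap rows for the next iteration.
            var tmp = previous;
            previous = current;
            current = tmp;
        }
        return previous[b.Length];
    }

    // Hypothetical convenience wrapper: true when the two strings are "close enough".
    public static bool IsWithinTolerance(string a, string b, int maxDistance)
    {
        return Distance(a, b) <= maxDistance;
    }
}
```

For instance, "Bras" and "Brás" differ by a single substitution, so a tolerance of 1 would treat them as a match. Since this is plain C#, it has to run over records already loaded into memory; it cannot be used inside a LINQ to Entities query.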
Gypsy, I found this approach very complex. As I understand it, I would have to store both the phonetic name and the original name in the database, right? I believe that would be unviable in my case, because the volume would be very large. I believe that when I search for "Brás", the database returns both "Brás Cubas" and "Bras Santos", but Entity Framework only shows the ones that contain the accent, because the search parameter had the accent. I believe there is some way to tell Entity Framework to show results both with and without the accent. I have tried a lot of things, but nothing has worked so far.
– Alessandre Martins
Well, if you don't want to store it, Levenshtein distance is the best way. Unfortunately, there is no easy way. I use both approaches, depending on what I need and on which part of the system it is.
– Leonel Sanches da Silva