Hello everyone. I have an application that uses speech recognition, and it works very well in English (en-US). However, when I tried to use it for Portuguese (pt-BR), the results were terrible. It seems as if it is trying to recognize another language! Am I doing something wrong?
Which is the right one to use, Speech SDK 5.1 or Speech Platform 11.0?
I have installed:
Microsoft Speech Platform x64 v11.0
Microsoft Speech SDK 5.1
Microsoft Speech SDK 5.1 Language Pack
Microsoft Server Speech Recognition Language - TELE (en)
Microsoft Server Speech Recognition Language - TELE (en-US)
And some other languages too.
Here is my code:
public static string ProcessAudio(Stream input, string language)
{
    try
    {
        _recon = "";
        CultureInfo cInfo = new CultureInfo(language);

        // "using" disposes the engine even if recognition throws.
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine(cInfo))
        {
            sre.SetInputToWaveStream(input);

            // Vocabulary: single digits and single letters.
            Choices options = new Choices();
            options.Add(new string[] { "1", "2", "3", "4", "5", "6", "7", "8", "9", "0" });
            options.Add(new string[] { "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q",
                                       "r", "s", "t", "u", "v", "w", "x", "y", "z" });

            // Accept between 5 and 50 repetitions of any option.
            GrammarBuilder gb = new GrammarBuilder();
            gb.Append(new GrammarBuilder(options, 5, 50));
            gb.Culture = cInfo; // the grammar culture must match the engine culture

            Grammar g = new Grammar(gb);
            sre.LoadGrammar(g);

            sre.SpeechRecognized += sre_SpeechRecognized;
            sre.SpeechHypothesized += sre_SpeechHypothesized;

            // Synchronous recognition of a single utterance.
            sre.Recognize();
        }
        return _recon;
    }
    catch (Exception ex)
    {
        // Log the failure instead of hiding it; an empty string still signals "nothing recognized".
        Console.Error.WriteLine(ex.Message);
        return "";
    }
}

static void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    _recon += e.Result.Text;
    Console.WriteLine(e.Result.Text);
}

static void sre_SpeechHypothesized(object sender, SpeechHypothesizedEventArgs e)
{
    Console.WriteLine($"{e.Result.Text} conf: {e.Result.Confidence}");
}
How does the user interact with it? What do you want to recognize? Just spelled-out letters and numbers?
– Luishg
Sorry for the delay. I receive an audio file containing spoken letters and numbers to be recognized. There is no direct interaction with the user.
– Matheus Lemos
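Since the question lists several installed packages, one quick diagnostic is to ask the engine which recognizers it can actually see. The sketch below is only an illustration, not part of the original post: it assumes the `System.Speech.Recognition` namespace (the post does not say which namespace the code uses), and the class name `RecognizerCheck` is made up for the example. As far as I know, the Speech Platform 11 "TELE" language packs are exposed through `Microsoft.Speech.Recognition` instead, which has the same `InstalledRecognizers()` method, so swapping the `using` line should let the same check run against that runtime.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition; // assumption: swap for Microsoft.Speech.Recognition to inspect Speech Platform 11

// Hypothetical helper class, named only for this example.
static class RecognizerCheck
{
    static void Main()
    {
        CultureInfo wanted = new CultureInfo("pt-BR");
        bool found = false;

        // List every recognizer visible to this speech runtime.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine($"{info.Culture.Name}: {info.Name}");
            if (info.Culture.Equals(wanted))
            {
                found = true;
            }
        }

        Console.WriteLine(found
            ? "A pt-BR recognizer is visible to this runtime."
            : "No pt-BR recognizer is visible to this runtime.");
    }
}
```

If no pt-BR entry shows up in the list, the language pack is probably installed for a different runtime than the one the code references.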