Recognition of sound patterns

Question

Asked 9 years, 7 months ago

Viewed 1,005 times

2

I would like to know whether there is an Android/Java API for sound pattern recognition.

Example: a recording of birdsong is given as input, and the application returns which bird the sound belongs to.

The application will be used for something similar to this example.

Interesting project, but hardly feasible. Comparing sounds is certainly possible, since I have a guitar-tuning app here on my phone. But to know which bird a sound belongs to, you would need a database with every possible bird song and compare against it, either by frequency or by some other means. But keep going, it's something new.

– Armando Marques Sobrinho

2016/08/13 at 14:26
As I said, Armando, birdsong is just one example. In fact, the number of sounds I will compare is limited: ten different ones at most. Thanks for the encouragement.

– CA_93

2016/08/13 at 23:55
1

@CA_93 This is machine learning. It's not that difficult when the possibilities are limited, as seems to be your case. Take a look at this question and its answers to understand it better: http://answall.com/questions/113343/qual%C3%A9-a-defini%C3%A7%C3%a3o-de-learning-m%C3%a1quina-machine-Learning

– Pablo Almeida

2016/08/14 at 23:10

3 answers

by ederwander • **6,431** points · Answer 1 · 2016-08-15T04:02:55+00:00

Here goes. Satisfactory results can be obtained without using ANNs. Comparing amplitudes (mentioned in another answer here) will never work the way it was proposed. The question is very broad, and I can't simply write an article here with all the steps. It may look complicated, but if you have a good mathematical/algebraic background and know signal processing, you will see that it is laborious rather than complex. Being familiar with these fields is therefore essential, along with a solid foundation in deterministic and stochastic processes.

If I just list the steps here, you may not understand a thing, so it is up to you to dig deeper. The steps are:

Extract the features of each bird (from a recording of each song). This can be done by extracting the MFCCs (Mel-Frequency Cepstral Coefficients).

The MFCCs capture the envelope/formants (contour) of a signal's spectrum in the frequency domain, which tells us consistently the shape of the vocal tract. The frequency bands are equally spaced on the mel scale, which approximates the response of the human auditory system more closely than the linearly spaced frequency bands used in the ordinary cepstrum. Generally 12 coefficients are sufficient. Roughly speaking, it is a bank of filters applied to the spectrum.
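To make "equally spaced on the mel scale" concrete, here is a minimal sketch that computes the center frequencies of such a filter bank. It assumes the commonly used conversion m = 2595 · log10(1 + f/700); the class name `MelScale`, the choice of 26 filters, and the 0 to 8000 Hz range are illustrative assumptions, not something taken from this answer. A full MFCC pipeline would still need windowing, an FFT, the triangular filters themselves, a logarithm, and a DCT.

```java
final class MelScale {
    // Hz -> mel, using the common 2595 * log10(1 + f/700) convention (an assumption).
    static double hzToMel(double hz) {
        return 2595.0 * Math.log10(1.0 + hz / 700.0);
    }

    // mel -> Hz, the inverse of hzToMel.
    static double melToHz(double mel) {
        return 700.0 * (Math.pow(10.0, mel / 2595.0) - 1.0);
    }

    // Center frequencies (in Hz) of `count` filters whose centers are
    // equally spaced in mel between lowHz and highHz (both exclusive).
    static double[] centerFrequencies(int count, double lowHz, double highHz) {
        double lowMel = hzToMel(lowHz);
        double highMel = hzToMel(highHz);
        double step = (highMel - lowMel) / (count + 1);
        double[] centers = new double[count];
        for (int i = 0; i < count; i++) {
            centers[i] = melToHz(lowMel + step * (i + 1));
        }
        return centers;
    }

    public static void main(String[] args) {
        // Hypothetical setup: 26 filters covering 0 to 8000 Hz.
        for (double center : centerFrequencies(26, 0.0, 8000.0)) {
            System.out.printf("%.1f Hz%n", center);
        }
    }
}
```

Printed in Hz, the centers sit close together at low frequencies and spread out at high frequencies, which is the behavior that mimics human hearing.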

At this point you will have a 12-element vector representing the features of the song for each bird you want. I don't want to go into too much detail, but from here on all you need to do is compare your pre-recorded vectors with a new (so far unknown) one and score which of them is the most similar. You can start with simpler comparisons such as Euclidean distance, or try something more elaborate such as Dynamic Time Warping.
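The comparison step could look roughly like the sketch below. It assumes the feature vectors have already been extracted; the class `SongMatcher`, the labels, and the numbers are all hypothetical. `bestMatch` picks the stored vector with the smallest Euclidean distance, and `dtw` is a textbook Dynamic Time Warping distance for the case where each song is kept as a sequence of per-frame vectors instead of a single vector.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class SongMatcher {
    // Euclidean distance between two feature vectors of equal length.
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Label of the reference vector closest to the unknown one.
    static String bestMatch(Map<String, double[]> references, double[] unknown) {
        String bestLabel = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : references.entrySet()) {
            double distance = euclidean(entry.getValue(), unknown);
            if (distance < bestDistance) {
                bestDistance = distance;
                bestLabel = entry.getKey();
            }
        }
        return bestLabel;
    }

    // Dynamic Time Warping distance between two sequences of frame vectors.
    // Frames may be stretched or compressed in time to find the cheapest alignment.
    static double dtw(double[][] x, double[][] y) {
        double[][] cost = new double[x.length + 1][y.length + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= x.length; i++) {
            for (int j = 1; j <= y.length; j++) {
                double step = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = euclidean(x[i - 1], y[j - 1]) + step;
            }
        }
        return cost[x.length][y.length];
    }

    public static void main(String[] args) {
        // Hypothetical 3-element feature vectors (real MFCC vectors would have about 12).
        Map<String, double[]> references = new LinkedHashMap<>();
        references.put("birdA", new double[] {1.0, 2.0, 3.0});
        references.put("birdB", new double[] {5.0, 5.0, 5.0});
        System.out.println(bestMatch(references, new double[] {1.1, 2.1, 2.9}));
    }
}
```

With only around ten candidate sounds, a linear scan like this is enough; DTW mainly helps when the same song is sung faster or slower than the stored recording.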

by Max Fratane • **1,535** points · Answer 2 · 2016-08-13T19:32:51+00:00

Dude, I would recommend using AudioInputStream. With it, you can read the amplitude of the wave, find a pattern in the input sound, and compare that pattern with the patterns you already have in the application, such as the birdsong you mentioned.
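As a rough sketch of reading amplitudes from an AudioInputStream, the code below decodes 16-bit signed PCM into sample values. It assumes desktop Java: as far as I know, `javax.sound.sampled` is not part of the Android SDK, where `android.media.AudioRecord` would be the usual way to get raw samples. The class name `AmplitudeReader` and the in-memory test data are hypothetical, and with more than one channel the returned samples are interleaved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;

final class AmplitudeReader {
    // Decodes raw 16-bit signed PCM bytes into sample values in [-32768, 32767].
    static int[] decode16(byte[] bytes, boolean bigEndian) {
        int[] samples = new int[bytes.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int first = bytes[2 * i];
            int second = bytes[2 * i + 1];
            int high = bigEndian ? first : second;
            int low = bigEndian ? second : first;
            samples[i] = (short) ((high << 8) | (low & 0xFF));
        }
        return samples;
    }

    // Reads the whole stream and returns its samples; only 16-bit signed PCM is handled.
    static int[] readAmplitudes(AudioInputStream in) throws IOException {
        AudioFormat format = in.getFormat();
        if (!AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                || format.getSampleSizeInBits() != 16) {
            throw new IllegalArgumentException("expected 16-bit signed PCM");
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) > 0) {
            buffer.write(chunk, 0, read);
        }
        return decode16(buffer.toByteArray(), format.isBigEndian());
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical in-memory data: two little-endian samples, 1000 and -1000.
        byte[] pcm = {(byte) 0xE8, 0x03, 0x18, (byte) 0xFC};
        AudioFormat format = new AudioFormat(8000f, 16, 1, true, false);
        AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(pcm), format, pcm.length / format.getFrameSize());
        for (int sample : readAmplitudes(in)) {
            System.out.println(sample);
        }
    }
}
```

For a real file, `AudioSystem.getAudioInputStream(File)` would supply the stream in place of the in-memory one used here.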

I recommend some reading:

Example using AudioInputStream

http://ganeshtiwaridotcomdotnp.blogspot.com.br/2011/12/java-extract-amplitude-array-from.html

Javadoc

https://docs.oracle.com/javase/7/docs/api/javax/sound/sampled/AudioInputStream.html

by Piovezan • **15,850** points · Answer 3 · 2016-08-14T22:27:56+00:00

A good way to do it, though it requires some prior familiarity with the concept of Artificial Neural Networks (ANNs), would be to extract from the sound a vector (or two separate vectors) containing the predominant frequencies and amplitudes of that segment, and use this vector as the input to an ANN that was previously trained on recordings of known birds, or more exactly on the frequency and amplitude data of those recordings.
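One way to obtain a "predominant frequency and amplitude" pair for a segment is to look for the strongest bin of its discrete Fourier transform. The sketch below uses a naive O(n²) DFT to stay self-contained; the class name `DominantFrequency` is hypothetical, and a real application would use an FFT library and probably keep several peaks rather than one.

```java
final class DominantFrequency {
    // Returns {frequencyHz, magnitude} of the strongest DFT bin, ignoring the DC bin.
    static double[] strongestBin(double[] samples, double sampleRate) {
        int n = samples.length;
        int bestBin = 1;
        double bestMagnitude = -1.0;
        for (int k = 1; k <= n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            double magnitude = Math.hypot(re, im);
            if (magnitude > bestMagnitude) {
                bestMagnitude = magnitude;
                bestBin = k;
            }
        }
        // Bin k corresponds to the frequency k * sampleRate / n.
        return new double[] {bestBin * sampleRate / n, bestMagnitude};
    }

    public static void main(String[] args) {
        // Hypothetical test signal: a pure 440 Hz tone sampled at 8000 Hz.
        double[] tone = new double[800];
        for (int t = 0; t < tone.length; t++) {
            tone[t] = Math.sin(2.0 * Math.PI * 440.0 * t / 8000.0);
        }
        double[] peak = strongestBin(tone, 8000.0);
        System.out.printf("%.1f Hz, magnitude %.1f%n", peak[0], peak[1]);
    }
}
```

The frequency resolution is the sample rate divided by the segment length, so longer segments distinguish closer frequencies at the cost of time resolution.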

I believe this is a simple and appropriate case for ANNs because the sounds are simple, without mixed frequencies. Of course, it will work better when the samples under test are individual sounds, that is, a single bird singing at a time.

I'm talking about ANNs but I don't know much about them. I only know the Perceptron type, and I believe this is a case for it.
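For reference, a multi-class perceptron fits in a few lines. The sketch below is only an illustration under assumptions: the class name `Perceptron`, the two-element feature vectors, and the training data are made up, and real frequency/amplitude features would need scaling and far more examples. It keeps one weight row per class and, on every mistake, moves the correct class toward the example and the wrongly predicted class away from it.

```java
final class Perceptron {
    // One weight row per class; the last column of each row is the bias.
    private final double[][] weights;

    Perceptron(int classes, int features) {
        weights = new double[classes][features + 1];
    }

    // Returns the index of the class with the highest score for x.
    int predict(double[] x) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < weights.length; c++) {
            double score = weights[c][x.length];
            for (int i = 0; i < x.length; i++) {
                score += weights[c][i] * x[i];
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    // Classic perceptron rule: update only when the prediction is wrong.
    void train(double[][] examples, int[] labels, int epochs) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int n = 0; n < examples.length; n++) {
                int guess = predict(examples[n]);
                if (guess != labels[n]) {
                    adjust(weights[labels[n]], examples[n], 1.0);
                    adjust(weights[guess], examples[n], -1.0);
                }
            }
        }
    }

    private static void adjust(double[] row, double[] x, double sign) {
        for (int i = 0; i < x.length; i++) {
            row[i] += sign * x[i];
        }
        row[x.length] += sign;
    }

    public static void main(String[] args) {
        // Hypothetical, already-normalized feature vectors for three sounds.
        double[][] examples = {
            {1.0, 0.0}, {0.9, 0.1}, {0.0, 1.0}, {0.1, 0.9}, {-1.0, -1.0}, {-0.9, -0.8}
        };
        int[] labels = {0, 0, 1, 1, 2, 2};
        Perceptron model = new Perceptron(3, 2);
        model.train(examples, labels, 20);
        System.out.println(model.predict(new double[] {0.95, 0.05}));
    }
}
```

A perceptron only separates classes with linear boundaries, so if the sounds overlap in feature space a network with hidden layers would be the next thing to try.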

I'm not recommending specific libraries because I don't know them, but it's nothing a quick Google search for ANN (Artificial Neural Networks) and Java won't solve. And these days, the way things are going, there may even be some free service that offers ANNs for you to train and use.

I also leave it to you to decide whether it is better to do this processing on the Android device or on a remote server.