How to make SpeechRecognition listen to system sounds (YouTube, Zoom, etc.) instead of the microphone?


import pyaudio
import wave
import numpy as np
import pyautogui
import speech_recognition as sr


CHUNK = 1024
FORMAT = pyaudio.paInt16
RATE = 44100



p = pyaudio.PyAudio()


# Stream using as_loopback to capture the OS sound
   
stream = p.open(
    format = FORMAT,
    channels = 2,
    rate = RATE,
    input=True,
    frames_per_buffer=CHUNK,
    input_device_index=16,)

# Function to listen to and recognize speech
def ouvir_microfone():
    # Create the recognizer
    microfone = sr.Recognizer()

    # Use the microphone as the audio source
    with sr.Microphone() as source:

        # Calibrate the recognizer for ambient noise
        microfone.adjust_for_ambient_noise(source)

        # Prompt the user to say something
        print("Diga alguma coisa: ")

        # Store what was said in a variable
        audio = microfone.listen(source)

    # Stays None if the speech could not be recognized
    frase = None
    try:
        # Pass the audio to the speech recognizer
        frase = microfone.recognize_google(audio, language='pt-BR')

        stream.stop_stream()
        stream.close()
        p.terminate()

        if "aula" in frase:
            pyautogui.PAUSE = 1
            pyautogui.keyDown('win')  # hold down the Windows key
            pyautogui.press('1')      # press the 1 key
            pyautogui.press('1')      # press the 1 key again
            pyautogui.keyUp('win')
            pyautogui.click(x=1311, y=988)
            pyautogui.write("Presente professor")
            pyautogui.press('enter')
            pyautogui.hotkey('alt', 'tab')

        # Print the recognized phrase
        print("Você disse: " + frase)

    # If the speech was not understood, show a message
    except sr.UnknownValueError:
        print("Não entendi")

    return frase
    

ouvir_microfone()

1 answer

Hello! Try changing the value of input_device_index.

When you create the stream variable, you pass several parameters to PyAudio; input_device_index is the one that selects which device the audio is captured from:

stream = p.open(
    format = FORMAT,
    channels = 2,
    rate = RATE,
    input=True,
    frames_per_buffer=CHUNK,
    input_device_index=16,  # CHANGE HERE: put your own device's index
    as_loopback=True)  # not in stock PyAudio; needs a build with WASAPI loopback support

PS: You may need to list the available devices first to find the right index:

for i in range(0, p.get_device_count()):
    print(i, p.get_device_info_by_index(i)['name'])

This way you can see the list of devices PyAudio can capture from and choose the one you want. Let me know if it works.
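
As a rough sketch of picking the index programmatically instead of reading the printed list by eye: the helper below filters device-info dictionaries (the shape returned by `p.get_device_info_by_index(i)`) by a name keyword and keeps only devices that have input channels. The function name `find_input_devices` and the sample device names are made up for illustration; the real names depend on your drivers.

```python
def find_input_devices(devices, keyword):
    """Return (index, name) pairs for input-capable devices whose name contains keyword.

    `devices` is any iterable of dicts shaped like PyAudio's device info
    (keys used here: 'index', 'name', 'maxInputChannels').
    """
    keyword = keyword.lower()
    matches = []
    for info in devices:
        has_input = info.get("maxInputChannels", 0) > 0
        if has_input and keyword in info.get("name", "").lower():
            matches.append((info["index"], info["name"]))
    return matches


# Hypothetical sample data; with PyAudio you would build the list as
# [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
sample_devices = [
    {"index": 0, "name": "Microphone (USB Audio)", "maxInputChannels": 1},
    {"index": 1, "name": "Speakers (Realtek Audio)", "maxInputChannels": 0},
    {"index": 2, "name": "Stereo Mix (Realtek Audio)", "maxInputChannels": 2},
]

print(find_input_devices(sample_devices, "stereo mix"))
```

The first element of each returned pair is the number you would pass as input_device_index.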

If you still have questions, take a look at that reply; it uses two stream variables and captures from two devices (channels) simultaneously.
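
To feed the captured stream to speech_recognition instead of `sr.Microphone()`, one possible approach (a sketch, not tested against your setup) is to read raw frames from the PyAudio stream, mix the two channels down to one, and wrap the bytes in `sr.AudioData`, which represents mono audio. The helper names `stereo_to_mono` and `frames_to_audio_data` are hypothetical; the code assumes 16-bit samples (`paInt16`) and interleaved stereo, as in the question.

```python
from array import array


def stereo_to_mono(frames):
    """Average interleaved 16-bit stereo samples (L, R, L, R, ...) into mono bytes."""
    samples = array("h")  # 'h' = signed 16-bit integers, native byte order
    samples.frombytes(frames)
    mono = array("h", (
        (samples[i] + samples[i + 1]) // 2
        for i in range(0, len(samples) - 1, 2)
    ))
    return mono.tobytes()


def frames_to_audio_data(frames, rate=44100):
    """Wrap raw stereo frames in an sr.AudioData object (requires SpeechRecognition)."""
    # Imported lazily so stereo_to_mono can be used without the package installed
    import speech_recognition as sr
    return sr.AudioData(stereo_to_mono(frames), rate, 2)  # 2 bytes per 16-bit sample
```

With the stream from the question, you could collect about five seconds of audio (an arbitrary length) with `frames = b"".join(stream.read(CHUNK) for _ in range(int(RATE / CHUNK * 5)))` and then call `microfone.recognize_google(frames_to_audio_data(frames, RATE), language='pt-BR')`.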

  • I couldn’t get it to work, buddy. I don’t know how to make speech_recognition recognize the audio; I’ll show you how it turned out.

  • But has your problem changed, then? Are you able to record the sound coming out of Windows now? Your original problem was recording the PC audio instead of the mic audio. Did you manage to do that?

  • I don’t know if I can help you with the SpeechRecognition package because I’ve never used it. If it cannot read the audio but you are recording correctly, I imagine the problem is in the format of the file being created by PyAudio.

  • If you think the problem is still in recording the audio from your speakers, I recommend enabling the "Stereo Mix" option, as explained on that website, which I took from that answer.
