Azure Speech to Text Service

Picture from ThisIsEngineering (https://www.pexels.com/@thisisengineering/)

Here is yet another coding exercise on Azure Cognitive Services. This time, it is the Azure Speech to Text service in Python. I am referencing this code.

1. Initialize Setup

First and foremost, I need to do the following:

  1. Create a resource group.
  2. Create an Azure Cognitive Services Speech resource in this resource group.
  3. Copy the key and location (region) values of the resource. Note that the endpoint value is not needed.


For simplicity, I store the two values in environment variables, AZURE_SPEECH2TEXT_KEY and AZURE_SPEECH2TEXT_LOC.
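A minimal sketch of reading those two values with a readable error when one is missing. The helper name load_speech_settings is hypothetical (it is not part of the Azure SDK), and the values in the demo mapping are placeholders, not real credentials.

```python
import os


def load_speech_settings(env=os.environ):
    """Return (key, region) from a mapping, or raise a readable error."""
    names = ("AZURE_SPEECH2TEXT_KEY", "AZURE_SPEECH2TEXT_LOC")
    # Treat unset and empty variables the same way: both are unusable.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


# Stand-in mapping so the helper can be tried without real credentials.
demo_env = {"AZURE_SPEECH2TEXT_KEY": "<your-key>", "AZURE_SPEECH2TEXT_LOC": "eastus"}
speech_key, service_region = load_speech_settings(demo_env)
print(service_region)
```

Calling load_speech_settings() with no argument reads from the real environment, so a missing variable is reported by name instead of surfacing as a bare KeyError.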

2. Code

FYI, I am using macOS.

2.1 Install library

pip3 install azure-cognitiveservices-speech==1.23.0
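Since the install pins a specific version, it can be handy to confirm what actually ended up in the environment. A small sketch using only the standard library (Python 3.8 or later); the helper name installed_version is my own.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("azure-cognitiveservices-speech")
print(found or "azure-cognitiveservices-speech is not installed")
```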

2.2 Code

import azure.cognitiveservices.speech as speech_sdk
import os

if __name__ == "__main__":
    # Read the key and region from the environment variables set earlier.
    speech_key, service_region = (
        os.environ["AZURE_SPEECH2TEXT_KEY"],
        os.environ["AZURE_SPEECH2TEXT_LOC"],
    )
    speech_config = speech_sdk.SpeechConfig(
        subscription=speech_key, region=service_region)
    # No audio config is passed, so the default microphone is used.
    speech_recognizer = speech_sdk.SpeechRecognizer(speech_config=speech_config)

    print("Say something...")

    # Listen for a single utterance and wait for the result.
    result = speech_recognizer.recognize_once()

    # Report the outcome based on the reason attached to the result.
    if result.reason == speech_sdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(result.text))
    elif result.reason == speech_sdk.ResultReason.NoMatch:
        print("No speech could be recognized: {}".format(result.no_match_details))
    elif result.reason == speech_sdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Speech Recognition canceled: {}".format(cancellation_details.reason))
        if cancellation_details.reason == speech_sdk.CancellationReason.Error:
            print("Error details: {}".format(cancellation_details.error_details))
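The branching on the result reason can also be pulled into a small helper, which makes it possible to try the logic without a microphone or a key. This is only a sketch: describe_result is a hypothetical name, and it assumes that result.reason and cancellation_details.reason are Enum members whose names match the ones used in the script (RecognizedSpeech, NoMatch, Canceled, Error).

```python
from types import SimpleNamespace


def describe_result(result):
    """Turn a recognition result into one printable line.

    Assumption: the reason attributes expose a .name that matches the
    ResultReason / CancellationReason names used in the script.
    """
    reason = result.reason.name
    if reason == "RecognizedSpeech":
        return "Recognized: {}".format(result.text)
    if reason == "NoMatch":
        return "No speech could be recognized: {}".format(result.no_match_details)
    if reason == "Canceled":
        details = result.cancellation_details
        message = "Speech Recognition canceled: {}".format(details.reason.name)
        if details.reason.name == "Error":
            message += " | Error details: {}".format(details.error_details)
        return message
    return "Unhandled result reason: {}".format(reason)


# Stand-in object that mimics only the attributes the helper reads.
fake = SimpleNamespace(reason=SimpleNamespace(name="RecognizedSpeech"), text="hello")
print(describe_result(fake))
```

In the script, the whole if/elif chain would then reduce to print(describe_result(speech_recognizer.recognize_once())).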

3. Test

I tested it by speaking into the mic, and one of the test outputs is


4. Conclusion

Being a native English speaker and having my voice recognized by this service is nice! Coding-wise, it cannot be simpler than what we have here. I will try out the text-to-speech service next.








