Azure Speech to Text Service

Picture from ThisIsEngineering

Yet another coding exercise on Azure Cognitive Services. This time, we have Azure Speech to Text Service in Python. I am referencing this code.

1. Initial Setup

First, I need to do the following:

Create a resource group
Create an Azure Cognitive Services Speech resource in this resource group.
Copy the key and location values of the resource (note that the endpoint value is not needed).

For simplicity, I store the two values in environment variables, AZURE_SPEECH2TEXT_KEY and AZURE_SPEECH2TEXT_LOC.
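A minimal sketch of reading these two variables with a readable error when one is missing. The helper name load_speech_settings is hypothetical (it is not part of the Azure SDK), and it accepts any mapping so it can be tried without touching the real environment.

```python
import os


def load_speech_settings(env=os.environ):
    """Return (key, region) from the environment, or raise a readable error.

    load_speech_settings is an illustrative helper, not an Azure SDK function.
    """
    names = ("AZURE_SPEECH2TEXT_KEY", "AZURE_SPEECH2TEXT_LOC")
    # Collect every variable that is unset or empty so the error lists them all.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

Compared with indexing os.environ directly, this reports all missing names at once instead of stopping at the first KeyError.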

2. Code

FYI, I am using macOS.

2.1 Install library

pip3 install azure-cognitiveservices-speech==1.23.0
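To confirm which version pip actually installed, a small check with the standard library can help. This is only a sketch: installed_version is a hypothetical helper name, and it assumes Python 3.8 or later, where importlib.metadata is available.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version, or None when the package is not installed.
print(installed_version("azure-cognitiveservices-speech"))
```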

2.2 Code

import azure.cognitiveservices.speech as speech_sdk
import os

if __name__ == "__main__":
    speech_key, service_region = (
        os.environ["AZURE_SPEECH2TEXT_KEY"],
        os.environ["AZURE_SPEECH2TEXT_LOC"],
    )
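    # Authenticate with the key and region copied from the Azure portal.
    speech_config = speech_sdk.SpeechConfig(
        subscription=speech_key, region=service_region)
    # With no audio config given, the recognizer listens on the default microphone.
    speech_recognizer = speech_sdk.SpeechRecognizer(speech_config=speech_config)

    print("Say something...")

    # Listen for a single utterance and block until it has been processed.
    result = speech_recognizer.recognize_once()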

    if result.reason == speech_sdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(result.text))
    elif result.reason == speech_sdk.ResultReason.NoMatch:
        print("No speech could be recognized: {}".format(result.no_match_details))
    elif result.reason == speech_sdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Speech Recognition canceled: {}".format(cancellation_details.reason))
        if cancellation_details.reason == speech_sdk.CancellationReason.Error:
            print("Error details: {}".format(cancellation_details.error_details))
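The branching on result.reason can also be pulled into a small function, which makes it possible to exercise the three cases without a microphone. The sketch below is an illustration only: describe_result is a hypothetical helper, and it assumes that ResultReason and CancellationReason are standard Python enums, so that their members expose a .name matching the identifiers used in the code above.

```python
def describe_result(result):
    """Turn a recognition result into a single printable line.

    Hypothetical helper. It compares enum member names instead of importing
    the SDK, which assumes the SDK's reason types are standard enums.
    """
    reason = result.reason.name
    if reason == "RecognizedSpeech":
        return "Recognized: {}".format(result.text)
    if reason == "NoMatch":
        return "No speech could be recognized: {}".format(result.no_match_details)
    if reason == "Canceled":
        details = result.cancellation_details
        line = "Speech Recognition canceled: {}".format(details.reason)
        # Error details are only meaningful when the cancellation was an error.
        if details.reason.name == "Error":
            line += " ({})".format(details.error_details)
        return line
    return "Unhandled result reason: {}".format(reason)
```

With this in place, the main script would reduce to print(describe_result(speech_recognizer.recognize_once())).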

3. Test

I was testing it out (speaking into the mic), and one of the test outputs is

4. Conclusion

Being a native English speaker and having my voice recognized by this service is nice! Coding-wise, it cannot get much simpler than what we have here. I will try out the text-to-speech service next.

Dennis Seah
