GenerativeAI Python Libaries

https://www.pexels.com/photo/two-women-looking-at-the-code-at-laptop-1181263/
https://www.pexels.com/photo/two-women-looking-at-the-code-at-laptop-1181263/

After deploying GenerativeAI chat completion models from Azure GenerativeAI catalog, we need to know which Python SDK to used.

OpenAI Models

After deploying Azure OpenAI model, we use openai Python library. Either

pip install openai

or

poetry add openai

and we suggest that you use the asynchronous client.

from openai import AsyncAzureOpenAI

client = AsyncAzureOpenAI(
    api_key=<azure_openai_key>,
    api_version=<azure_openai_api_version>,
    azure_endpoint=<azure_openai_endpoint>,
)

or (even better, use the DefaultAzureCredential)

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AsyncAzureOpenAI

azure_credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(
	azure_credential, "https://cognitiveservices.azure.com/.default"
)
client = AsyncAzureOpenAI(
    api_version=<azure_openai_api_version>,
    azure_endpoint=<azure_openai_endpoint>,
    azure_ad_token_provider=token_provider,
)

then we can use the client's chat.completions.create accordingly (with await).


Other GenerativeAI Models deployed in Azure

Models such as Phi3 (our small and mighty model :-)), Llama (mighty from Meta), etc. We use azure-ai-inference

pip install azure-ai-inference

or

poetry add azure-ai-inference

Similarly, we recommend its asynchronous client

from azure.ai.inference.aio import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint=<azure_phi3_endpoint>,
    credential=AzureKeyCredential(<azure_phi3_key>),
)

or (even better, use the DefaultAzureCredential)

from azure.ai.inference.aio import ChatCompletionsClient
from azure.identity import DefaultAzureCredential

client = ChatCompletionsClient(
    endpoint=<azure_phi3_endpoint>,
    credential=DefaultAzureCredential(),  # type: ignore
)

Two things worth mentioning are

  1. We do not need to fetch bearer token like we did for OpenAI SDK.
  2. There is an issue with the typing declaration of credential parameter, it does take DefaultAzureCredential when it claims that it does not. Hence add a ignore typing checking comment is required.

All the above code can be found in https://github.com/dennisseah/llm-operational-metrics where we collect operational metrics for Phi3 and OpenAI models. This can be extended to collect more metrics and other models.





Comments