I was reading about the Synonyms feature in Azure Cognitive Search and decided to test it with its Python API.
These are the dependencies.
python-dotenv==0.21.1
azure-identity==1.12.0
azure-search-documents==11.3.0
azure-search==1.0.0b2
We create two clients: an index client and a search client.
from dotenv import load_dotenv
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents import SearchClient
load_dotenv()
SERVICE_ENDPOINT = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
SERVICE_KEY = os.getenv("AZURE_SEARCH_API_KEY")
INDEX_NAME = "test-synonyms"
index_client = SearchIndexClient(SERVICE_ENDPOINT, AzureKeyCredential(SERVICE_KEY))
search_client = SearchClient(
    SERVICE_ENDPOINT, INDEX_NAME, AzureKeyCredential(SERVICE_KEY)
)
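`os.getenv` returns `None` when a variable is missing, so a typo in the `.env` file may only surface later, once the clients are used. A minimal sketch of a guard for that, assuming a hypothetical helper named `require_env` (it is not part of python-dotenv or the Azure SDK):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        # Covers both a missing variable (None) and an empty string.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this in place, `SERVICE_ENDPOINT = require_env("AZURE_SEARCH_SERVICE_ENDPOINT")` would stop the script right away if the variable is absent.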
Next, we create the synonym map.
SYNONYM_MAP_NAME = "test-syn-map"
from azure.search.documents.indexes.models import SynonymMap
synonyms = [
    "Jenna, Ortega, Wednesday Addams\n",
]
synonym_map = SynonymMap(name=SYNONYM_MAP_NAME, synonyms=synonyms)
index_client.create_synonym_map(synonym_map)
Here, we created one rule that makes 3 terms equivalent: "Jenna", "Ortega" and "Wednesday Addams". Note: there is a "\n" at the end of the rule.
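Writing the rule string by hand is easy to get wrong once there are more terms. Here is a small sketch of a builder for the comma-separated equivalency form used above. The function name `equivalent` is made up for illustration, and the trailing "\n" is kept only to match the rule in this post.

```python
def equivalent(*terms: str) -> str:
    """Build one equivalency rule: a comma-separated list of terms."""
    if len(terms) < 2:
        raise ValueError("An equivalency rule needs at least two terms")
    # Trailing newline kept to match the rule string used in this post.
    return ", ".join(term.strip() for term in terms) + "\n"


synonyms = [equivalent("Jenna", "Ortega", "Wednesday Addams")]
print(synonyms)
```

The resulting list can be passed as `synonyms=` to `SynonymMap` exactly like the hand-written one.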
Let's create some search documents to test these out.
from azure.search.documents.indexes.models import (
    CorsOptions,
    SearchFieldDataType,
    SimpleField,
    SearchableField,
    SearchIndex,
)
INDEX_NAME = "test-synonyms"
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),
    SearchableField(
        name="text",
        type=SearchFieldDataType.String,
        synonym_map_names=[SYNONYM_MAP_NAME],
    ),
]
index = SearchIndex(
    name=INDEX_NAME,
    fields=fields,
    cors_options=CorsOptions(allowed_origins=["*"], max_age_in_seconds=60),
)
index_client.create_index(index)
search_client.upload_documents(
    documents=[
        {"id": "1", "text": "Wednesday Addams in Netflix"},
        {"id": "2", "text": "Wednesday Addams' dance"},
        {"id": "3", "text": "Jenna Ortega's dance went wild in Tik Tok"},
    ]
)
We have created the search index and uploaded 3 documents.
Now, we can test searching.
import json
search_docs = search_client.search('Jenna', search_fields=["text"])
print(json.dumps(list(search_docs), indent=4))
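The same query can be repeated for each term to compare what matches. A rough sketch of a helper for that, assuming `search_client` is the `SearchClient` created earlier and that each hit can be read like a dict with an "id" key, as in the printed output. The name `matched_ids` is mine, not part of the SDK.

```python
from typing import Dict, Iterable, List


def matched_ids(search_client, queries: Iterable[str], field: str = "text") -> Dict[str, List[str]]:
    """Run each query against one field and collect the ids of the hits."""
    results: Dict[str, List[str]] = {}
    for query in queries:
        hits = search_client.search(query, search_fields=[field])
        # Sort the ids so that results are easy to compare across queries.
        results[query] = sorted(hit["id"] for hit in hits)
    return results
```

For example, `matched_ids(search_client, ["Jenna", "Ortega", '"Wednesday Addams"'])` would return one list of ids per query.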
Searching for Jenna, Ortega, or "Wednesday Addams" (note the double quotes, which make it a phrase query) returns all 3 documents in each case.
[
    {
        "id": "1",
        "text": "Wednesday Addams in Netflix",
        "@search.score": 0.5753642,
        "@search.highlights": null
    },
    {
        "id": "3",
        "text": "Jenna Ortega's dance went wild in Tik Tok",
        "@search.score": 0.51623213,
        "@search.highlights": null
    },
    {
        "id": "2",
        "text": "Wednesday Addams' dance",
        "@search.score": 0.5063205,
        "@search.highlights": null
    }
]
Lastly, we try the query rewrite feature by changing the synonym mapping to
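synonyms = [
    "Jenna, Ortega => Wednesday Addams\n",
]
# Assumed step, not shown in the original: push the changed rule to the service.
synonym_map = SynonymMap(name=SYNONYM_MAP_NAME, synonyms=synonyms)
index_client.create_or_update_synonym_map(synonym_map)
Now, when I search for Jenna or Ortega, the query is rewritten to Wednesday Addams: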
search_docs = search_client.search('Jenna', search_fields=["text"])
print(json.dumps(list(search_docs), indent=4))
Searching for Jenna is now rewritten to a search for Wednesday Addams, so the search results are
[
    {
        "text": "Wednesday Addams in Netflix",
        "id": "1",
        "@search.score": 0.5753642,
        "@search.highlights": null
    },
    {
        "text": "Wednesday Addams' dance",
        "id": "2",
        "@search.score": 0.5063205,
        "@search.highlights": null
    }
]
"Jenna Ortega's dance went wild in Tik Tok" is no longer in the search results.
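To make the difference between the two rule forms concrete, here is a toy model in plain Python. It is only a sketch of the behaviour observed above, not how Azure Cognitive Search works internally: matching is a simple case-insensitive substring check, and the names `expand` and `toy_search` are made up for this example.

```python
from typing import Dict, List

DOCS: Dict[str, str] = {
    "1": "Wednesday Addams in Netflix",
    "2": "Wednesday Addams' dance",
    "3": "Jenna Ortega's dance went wild in Tik Tok",
}


def expand(query: str, rule: str) -> List[str]:
    """Return the phrases a query stands for under one synonym rule."""
    query = query.strip().lower()
    if "=>" in rule:
        left, right = rule.split("=>")
        sources = [term.strip().lower() for term in left.split(",")]
        targets = [term.strip().lower() for term in right.split(",")]
        # Explicit mapping: a source term is replaced by the target terms.
        return targets if query in sources else [query]
    terms = [term.strip().lower() for term in rule.split(",")]
    # Equivalency: any listed term stands for all listed terms.
    return terms if query in terms else [query]


def toy_search(query: str, rule: str) -> List[str]:
    """Return ids of documents containing any phrase the query expands to."""
    phrases = expand(query, rule)
    return [
        doc_id
        for doc_id, text in DOCS.items()
        if any(phrase in text.lower() for phrase in phrases)
    ]


print(toy_search("Jenna", "Jenna, Ortega, Wednesday Addams\n"))    # ['1', '2', '3']
print(toy_search("Jenna", "Jenna, Ortega => Wednesday Addams\n"))  # ['1', '2']
```

With the equivalency rule, Jenna still matches itself, so document 3 is found. With the `=>` rule, Jenna is replaced by Wednesday Addams, so document 3 drops out.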
