Posts

Showing posts from November, 2022

Simple Workflow with Azure Durable Function

Image by https://www.pexels.com/@divinetechygirl/

In this blog, we look into Azure Durable Function. It is excellent for implementing a simple workflow when we need to track the state of activities.

Setup

Azure Durable Function is part of the Azure Function App. Hence, we just need to create a Function App in our resource group.

Simple Workflow

For this blog, we devised a simple use case: an HTTP interface that receives a list of image URLs. This interface calls a durable function, which in turn makes multiple calls to the Azure Face Recognition Service (one call per image). From the results returned by the Azure Face Recognition Service, we get a list of happy emotion ratings. Then we find the maximum and minimum values. (See this blog on how the Azure Face Recognition Service works.)

Implementation

From the diagram above, we need

A Durable Function HTTP Starter that takes an HTTP request.
An Orchestrator to fan out the calls to Detect Face.
A Durable Function Activity, Detect Face.

Durable Function

There are a
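The fan-out and fan-in shape of the workflow described above can be sketched without any Azure dependency. This is only an illustration under stated assumptions: it swaps the Durable Function orchestrator for a standard-library thread pool, and `detect_happiness`, `orchestrate`, and the `FAKE_HAPPINESS` ratings are hypothetical stand-ins for the Detect Face activity and its results, not the code from the post.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical ratings standing in for responses from the face service.
FAKE_HAPPINESS = {
    "https://example.com/a.jpg": 0.91,
    "https://example.com/b.jpg": 0.12,
    "https://example.com/c.jpg": 0.57,
}


def detect_happiness(image_url: str) -> float:
    """Stub for the Detect Face activity: one call per image URL."""
    return FAKE_HAPPINESS[image_url]


def orchestrate(image_urls: List[str]) -> Dict[str, float]:
    """Fan out one call per image, then fan in and reduce to max and min."""
    with ThreadPoolExecutor() as pool:
        ratings = list(pool.map(detect_happiness, image_urls))
    return {"max": max(ratings), "min": min(ratings)}


summary = orchestrate(list(FAKE_HAPPINESS))
print(summary)
```

In a real Durable Function, the orchestrator would schedule the activity calls and the runtime would track their state; the thread pool here only mirrors the "many calls, one aggregation" structure.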

Azure Face Recognition Service

Image from https://www.pexels.com/@cottonbro/

In this blog, we look at the Azure Face Recognition Python library. The Azure Face Recognition service is part of Azure Cognitive Service. Please follow this blog to create the cognitive service. The Python code can be found in a gist. The main part of the code is

    def detect(image_url: str):
        detected_faces = face_client.face.detect_with_url(
            url=image_url,
            return_face_attributes=[
                "age", "gender", "headPose", "smile", "facialHair",
                "glasses", "emotion", "hair", "makeup", "occlusion",
                "accessories", "blur", "exposure", "noise"
            ])
        if detected_faces:
            print(json.dumps([info(face) for face in det

Upload files to Azure blob storage container

Image by https://www.pexels.com/@apasaric/

A common use case is uploading a sizable file to Azure Blob Storage (or any form of cloud storage) and then having the file content processed by an Azure Function App. This blog is about an anti-pattern and an effective way to handle this use case.

Setup

Create a resource group and create these services

Azure Function App
Azure Blob Storage

Enable the Managed Service Identity option for the Azure Function App. Next, grant this identity the Storage Blob Data Contributor role on the blob storage. This is an important step; otherwise the generated Shared Access Signature (SAS) token does not have the right permissions.

Add Application Settings in Function App

BLOB_CONTAINER_NAME and BLOB_STORAGE_URL are used in the Python code later. BLOB_CONTAINER_NAME is the container name. I have created one and named it "test". BLOB_STORAGE_URL is like https://<you-blob-storage-account-name>.blob.core.windows.net/

Azure Form Recognizer Python SDK

Image from https://www.pexels.com/@yasinaydin/

In this blog, we look at the Azure Form Recognizer Python SDK by creating a simple medical prescription and examining the results from Azure Form Recognizer. First and foremost, we need to create the Azure Cognitive service and get the access key and endpoint.

Python Implementation

We use its Python client library, azure-ai-formrecognizer. The code is

    import json
    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.formrecognizer import DocumentAnalysisClient

    endpoint = os.environ["AZURE_FORM_REG_ENDPOINT"]
    access_key = os.environ["AZURE_FORM_REG_KEY"]
    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(access_key))

    with open("prescription_fake.pdf", "rb") as fd:
        document = fd.read()

    poller = client.begin_analyze_document("prebuilt-layout", document)
    fr_result = poller.result()
    results = {
        "pages": [],
        "

Azure Search Python SDK

Image from https://www.pexels.com/@gratisography/

I am trying to familiarize myself with the Azure Search API. This is the beginning of this learning journey.

Install Azure Search Service

Create an Azure resource group and create an Azure Search Service. Once it is installed, note the API endpoint and the access key as shown below

Python Implementation

Azure Search Python Library

    pip install azure-search-documents

I used Faker to generate dummy data and load it into Azure Search

    pip install faker

Source Code

    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.indexes import SearchIndexClient
    from azure.search.documents.indexes.models import (
        CorsOptions, ComplexField, SearchIndex, ScoringProfile,
        SearchFieldDataType, SimpleField, SearchableField
    )

    endpoint = os.environ["AZURE_SEARCH_ENDPOINT"]
    key = os.environ["AZURE_SEARCH_API_KEY"]

We have the endpo

Comma Separated Values File Format & Parquet

Image from https://www.pexels.com/@pixabay/

There are many discussions and articles about storing data in CSV (Comma-Separated Values) or Parquet file format, covering file size, read and write times, and so on. I feel that the most important thing is that Parquet stores the column data type information with the data. When we use CSV, we lose all this important information. Let's look at some statistics on file size and read and write times.

1. Experiment Setup

The file formats are CSV, Parquet, Parquet (compressed with gzip), and Parquet (compressed with snappy). We want to measure the following

The time to write data to the file format
The time to read data from the file format
The file size of the file format

The numbers of rows are 10000, 50000, and 100000. We use Faker to generate dummy data and pandas (with pyarrow) to read and write data. The parquet file is NOT multi-parts. Tests are performed on my laptop ( MacbookPro 2.6 GHz 6-Co
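The point about CSV dropping column type information can be seen with a small round trip. This is a minimal sketch, assuming only the Python standard library rather than the pandas and pyarrow setup used in the experiment; the `rows` sample data is made up for illustration, and the Parquet side is not shown because it needs pyarrow.

```python
import csv
import io
from datetime import date

# Made-up sample rows with an int, a float, and a date column.
rows = [
    {"id": 1, "price": 9.5, "created": date(2022, 11, 1)},
    {"id": 2, "price": 12.0, "created": date(2022, 11, 2)},
]

# Write the rows to an in-memory CSV file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "price", "created"])
writer.writeheader()
writer.writerows(rows)

# Read them back: every value comes back as a plain string.
buffer.seek(0)
restored = list(csv.DictReader(buffer))

original_types = {k: type(v).__name__ for k, v in rows[0].items()}
restored_types = {k: type(v).__name__ for k, v in restored[0].items()}
print(original_types)
print(restored_types)
```

After the round trip, the reader has to guess or be told that `id` was an integer and `created` was a date, which is exactly the information a Parquet file carries in its schema.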

Generate Latex formulas from Python code

Image from https://www.pexels.com/@ketut-subiyanto/

A few days ago, Google released an open-source package, latexify-py. It allows Python developers to generate LaTeX formulas from Python functions. I was looking at this notebook and decided to try it out.

    import latexify
    import math

    @latexify.with_latex
    def solve(a, b, c):
        return (-b + math.sqrt(b**2 - 4*a*c)) / (2*a)

    print(solve)

gives us

    \mathrm{solve}(a, b, c) = \frac{-b + \sqrt{b^{{2}} - {4} a c}}{{2} a}

    @latexify.function
    def sinc(x):
        if x == 0:
            return 1
        else:
            return math.sin(x) / x

    print(sinc)

gives us

    \mathrm{sinc}(x) = \left\{ \begin{array}{ll} {1}, & \mathrm{if} \ {x = {0}} \\ \frac{\sin{\left({x}\right)}}{x}, & \mathrm{otherwise} \end{array} \right.

However, it does not work if I tweak the function a little

    @latexify.function
    def sinc(x):
        if x == 0:
            return 1
        return math.sin(x) / x

    print(sinc)

This gives an error, "LatexifyNotSupportedError: Codege

Performance awareness when coding

Image from https://www.pexels.com/@cesar-galeao-1673528/

Python is slow compared to many popular programming languages, largely because of its dynamic nature and versatility. Moreover, it is not statically typed, which limits how much the compiler can optimize the code. This blog is about two cases where we can write better Python code (w.r.t. performance). The same concepts apply when writing in other programming languages. I am running the code posted below on MacbookPro 2.6 GHz 6-Core Intel Core i7.

Setup

We created a decorator for measuring the time taken to execute a function.

    import time

    def timer(message: str):
        def decorator(fn):
            start = time.process_time()
            fn()
            print(f"{message}: time taken = {time.process_time() - start} secs")
        return decorator

Finding the minimum value in a list

We have seen this many times where software engineers sort a list just to get the minimum value. It can be because they are tired or do not pay much
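The sort-versus-scan comparison mentioned above can be tried with a short script. This is a sketch under assumptions, not the code from the post: it uses the standard-library `timeit` module instead of the `timer` decorator, and the helper names `min_by_sorting` and `min_by_scanning` and the list size are hypothetical choices. Sorting does O(n log n) work, while a single scan with `min` does O(n).

```python
import random
import timeit

random.seed(0)  # fixed seed so repeated runs use the same data
data = [random.random() for _ in range(100_000)]


def min_by_sorting(values):
    """Sort a copy of the whole list just to read its first element."""
    return sorted(values)[0]


def min_by_scanning(values):
    """Scan the list once with the built-in min()."""
    return min(values)


# Time five runs of each approach; absolute numbers depend on the machine.
sort_secs = timeit.timeit(lambda: min_by_sorting(data), number=5)
scan_secs = timeit.timeit(lambda: min_by_scanning(data), number=5)

print(f"sort then index: {sort_secs:.4f} secs")
print(f"single scan:     {scan_secs:.4f} secs")
```

Both functions return the same value; only the amount of work differs, so the timings are best read as a relative comparison on one machine.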

Azure Text Analytics for health (Part II)

Image from https://www.pexels.com/@anntarazevich/

Following this previous blog on Azure Text Analytics for health, I reimplemented the code in TypeScript. Honestly, going from Python to TypeScript takes just a few lines of rewriting. I have also created a web UI for it using React and antd components. I have shared the code on GitHub.

You need to create an Azure Cognitive Services Text Analytics Service as described in Azure Text Analytics for health. The rest should be easy if you are familiar with React.

Once the application runs, we get a page like this. Enter some text in the text area. I entered "Patient needs to take 50 mg of ibuprofen. She does not have high blood pressure." and hit the Analyze button. The application is able to identify things like dosage, drug name, and medical condition.

This is cool because Azure Text Analytics has a vertical product for healthcare. I am hoping more verticals will come soon.

Video to Image (JPG)

Image from https://www.pexels.com/@caleboquendo/

There are many situations when we need to capture frames from video clips as images so that we can use these images for processing (such as Machine Learning and Prediction). In this blog, we share how we do this in Python 3.x (I am using 3.9) with some open-source libraries.

1. Open Source Libraries

opencv-python==4.6.0.66
moviepy==1.0.3
argparse==1.4.0

2. Requirements

Capture all frames in a video.
Capture some frames in a video by skipping some frames.
Capture frames in greyscale.

3. Command Line Options

Based on the above requirements, we have

    usage: Video2JPEG [-h] -i INPUT_FILE [-g] [-s FRAMES_TO_SKIP]

    capture frames in video to JPEG files

    optional arguments:
      -h, --help            show this help message and exit
      -i INPUT_FILE, --input-file INPUT_FILE
                            path to video file
      -g, --grey_scale      having images in grey scale
      -s FRAMES_TO_SKIP, --frames-to-skip FRAMES_TO_SKIP
                            nu
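The command line options above could be declared with `argparse` roughly as follows. This is a sketch, not the post's source: the option names and the first two help strings come from the usage text, while the integer type and default of `--frames-to-skip`, its help wording, and the `should_capture` helper (which assumes "skip N" means keeping every (N+1)-th frame) are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the usage text shown above."""
    parser = argparse.ArgumentParser(
        prog="Video2JPEG",
        description="capture frames in video to JPEG files",
    )
    parser.add_argument("-i", "--input-file", required=True,
                        help="path to video file")
    parser.add_argument("-g", "--grey_scale", action="store_true",
                        help="having images in grey scale")
    # Assumption: an integer count with a default of 0 (capture every frame).
    parser.add_argument("-s", "--frames-to-skip", type=int, default=0,
                        help="(assumed wording) frames to skip between captures")
    return parser


def should_capture(frame_index: int, frames_to_skip: int) -> bool:
    """Assumed rule: keep one frame, then skip `frames_to_skip` frames."""
    return frame_index % (frames_to_skip + 1) == 0


# Parse an explicit argument list instead of sys.argv for the demonstration.
args = build_parser().parse_args(["-i", "clip.mp4", "-g", "-s", "2"])
captured = [i for i in range(10) if should_capture(i, args.frames_to_skip)]
print(args.input_file, args.grey_scale, captured)
```

With `-s 2`, the assumed rule keeps frames 0, 3, 6, and 9 of the first ten; with the default of 0 it keeps every frame, which covers the "capture all frames" requirement.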