Azure Batch Compute, Azure ML Service and Azure Databricks

Photo by Manuel Geissinger on Pexels.com

Cloud Computing is ubiquitous because it fuels Digital Transformation. Companies are exploiting cloud platforms and blueprints to jumpstart the execution of their Digital Transformation strategies.

Among other things, Cloud Computing offers

  1. On-demand availability
  2. Scalability
  3. Pay-per-use model
  4. Comprehensive identity, authentication and authorization models
  5. Logging and metric services
  6. Secure connections to data sources

In this blog, we look at the Azure Batch Compute, Azure Machine Learning Service, and Azure Databricks compute platforms. Azure provides these different computing platforms to help customers get their solutions up and running with minimal effort. Fundamentally, they are the same in that they all provide computing resources. However, they are built for different use cases, so they come with different concepts and tool sets.

Azure Batch Compute

This is a lightweight and generic platform for parallel processing. It has the concepts of Job and Task; a Job contains many Tasks. Once the Job is registered with the compute platform, its Tasks are automatically distributed to the different worker nodes for processing.
Azure Batch Compute

We can select the type of virtual machine (VM) in the compute cluster, for instance UbuntuServer 18.04-LTS. After these VMs are provisioned, a Docker image (from the Microsoft Open ACR) is deployed, and the initial setup scripts are executed to install the required libraries. The VMs are then ready to execute the Tasks, and they are removed from the cluster once all the Tasks are processed.
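To make the Pool, Job, and Task concepts concrete, here is a minimal sketch using the azure-batch Python SDK; the account details, pool size, and scripts are assumptions for illustration.

```python
from azure.batch import BatchServiceClient
from azure.batch import models as batchmodels
from azure.batch.batch_auth import SharedKeyCredentials

credentials = SharedKeyCredentials("mybatchaccount", "<account-key>")
client = BatchServiceClient(
    credentials, batch_url="https://mybatchaccount.<region>.batch.azure.com")

# Provision a pool of UbuntuServer 18.04-LTS nodes; the start task runs
# an initial setup script that installs the required libraries.
client.pool.add(batchmodels.PoolAddParameter(
    id="demo-pool",
    vm_size="STANDARD_D2_V2",
    target_dedicated_nodes=2,
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="canonical", offer="ubuntuserver",
            sku="18.04-lts", version="latest"),
        node_agent_sku_id="batch.node.ubuntu 18.04"),
    start_task=batchmodels.StartTask(
        command_line="/bin/bash -c 'pip install pandas'")))

# Register a Job, then add its Tasks; Batch distributes the Tasks
# to the worker nodes automatically.
client.job.add(batchmodels.JobAddParameter(
    id="demo-job",
    pool_info=batchmodels.PoolInformation(pool_id="demo-pool")))

tasks = [batchmodels.TaskAddParameter(
             id=f"task-{i}", command_line=f"python process.py --shard {i}")
         for i in range(10)]
client.task.add_collection("demo-job", tasks)
```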

There is an option to provide a custom Docker image in which we can install the required libraries and include our (program) code. The custom Docker image needs to reside in an Azure Container Registry (ACR) in the same Resource Group as the Azure Batch Compute.
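A hedged sketch of that option with the same SDK; the registry, image name, and credentials are placeholders, and the exact model fields can vary across azure-batch versions.

```python
# Point the pool at a custom image hosted in ACR...
container_conf = batchmodels.ContainerConfiguration(
    type="dockerCompatible",
    container_image_names=["myregistry.azurecr.io/myjob:latest"],
    container_registries=[batchmodels.ContainerRegistry(
        registry_server="myregistry.azurecr.io",
        user_name="myregistry",
        password="<acr-password>")])

# ...by passing container_configuration=container_conf into the
# VirtualMachineConfiguration above, then run each Task in the container:
task = batchmodels.TaskAddParameter(
    id="task-0",
    command_line="python process.py",
    container_settings=batchmodels.TaskContainerSettings(
        image_name="myregistry.azurecr.io/myjob:latest"))
```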

Note: The Azure Blob Storage in the diagram is for storing logs and metrics.

Azure Machine Learning Service

This is a comprehensive service for Machine Learning. It has these main components

  1. Workspace, Experiment and Pipeline - for operationalization
  2. Jupyter Notebook - for development
  3. Datasets - registration of data sources/sinks from the local file system, Azure datastores, APIs, and public/open sources.

A Workspace contains

  1. A Runtime Environment, such as the required conda and pip packages.
  2. A Pipeline, which contains 
    1. a Pipeline Runtime Environment, such as compute cluster information
    2. Execution Steps, such as ML training, scoring, etc.
  3. Experiments, where each Experiment is an execution of a Pipeline (see the sketch below).

Azure ML Service
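Here is a minimal sketch with the azureml-sdk that ties the Environment, Pipeline, and Experiment concepts together; the compute cluster name, script name, and package lists are assumptions.

```python
from azureml.core import Workspace, Experiment, Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.runconfig import RunConfiguration
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()  # reads the workspace's config.json

# Runtime Environment: the required conda and pip packages.
env = Environment(name="training-env")
env.python.conda_dependencies = CondaDependencies.create(
    conda_packages=["scikit-learn"], pip_packages=["azureml-defaults"])
run_config = RunConfiguration()
run_config.environment = env

# Pipeline: an execution step bound to a compute cluster.
train_step = PythonScriptStep(
    name="train", script_name="train.py", source_directory=".",
    compute_target="cpu-cluster", runconfig=run_config)
pipeline = Pipeline(workspace=ws, steps=[train_step])

# Experiment: an execution of the Pipeline.
run = Experiment(ws, "demo-experiment").submit(pipeline)
run.wait_for_completion()
```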

After a pipeline is constructed, a corresponding Docker image is created. This image comprises a base image (from the Microsoft Open ACR), the required conda and pip packages (which are defined in the Workspace Runtime Environment), and the code for ML data preparation, training, predicting, scoring, etc.

The beautiful thing is that Azure ML Service takes care of all of this Docker image creation for us.

The steps in the pipeline can be executed programmatically, and/or published as an ML Pipeline that can be executed via triggers (manual and/or scheduled), as sketched below.
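A hedged sketch of the publish-and-trigger option, continuing from the pipeline above; the names and schedule are illustrative.

```python
from azureml.pipeline.core.schedule import Schedule, ScheduleRecurrence

# `ws` and `pipeline` come from the sketch in the previous section.
published = pipeline.publish(name="demo-pipeline",
                             description="Nightly training run")

# Scheduled trigger: run the published pipeline once a day.
recurrence = ScheduleRecurrence(frequency="Day", interval=1)
Schedule.create(ws, name="nightly-schedule",
                pipeline_id=published.id,
                experiment_name="demo-experiment",
                recurrence=recurrence)
```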

Compared to Azure Batch Compute, this compute service has

  1. Azure Container Registry to host the created Docker images of different pipelines
  2. Jupyter Notebook for development
  3. Pipelines for operationalization

Similar to Azure Batch Compute, the base Docker image can be customized.

Azure Databricks

Azure Databricks is a good candidate when we have to work with big data. It is built on Apache Spark, an engine for large-scale data processing.

Azure Databricks

Typically, we start by writing code in a Jupyter Notebook, and the code is executed on the compute nodes. Azure Databricks handles all the logistics of connecting the Notebook to the designated cluster after we have defined the required runtime environment, such as the required pip packages.

Most of the time, data sources such as Azure Blob Storage, Cosmos DB, etc. are required. Databricks supports a wide range of data sources, so connecting to them is easy.
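For instance, here is a hypothetical PySpark snippet for a Databricks notebook that reads Parquet files from Azure Blob Storage; the storage account, container, and secret scope are placeholders, and spark/dbutils are provided by the notebook runtime.

```python
# Authenticate to the storage account with a key kept in a Databricks secret scope.
spark.conf.set(
    "fs.azure.account.key.mystorageaccount.blob.core.windows.net",
    dbutils.secrets.get(scope="demo-scope", key="storage-key"))

# Read the data into a distributed DataFrame.
df = (spark.read
      .format("parquet")
      .load("wasbs://mycontainer@mystorageaccount.blob.core.windows.net/data/"))
df.show(5)
```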

Once we are happy with the code in the Notebook, we have two options to execute it in a pipeline.

  1. MLflow, which is an open source project. It is easy to add a little more code to the Jupyter Notebook to handle this (see the first sketch after this list).
  2. Azure ML Pipeline. In the previous section, we mentioned the steps in an Azure ML Pipeline; one of these steps can be an Azure Databricks Step (azureml.pipeline.steps.DatabricksStep) - reference (look for DatabricksStep): https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-your-first-pipeline
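A minimal sketch of option 1 using MLflow's tracking API; the model, parameter, and metric are illustrative. On Databricks, the run shows up in the workspace's tracking UI.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run(run_name="demo"):
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)                       # hyperparameter
    mlflow.log_metric("train_accuracy", model.score(X, y))  # metric
    mlflow.sklearn.log_model(model, "model")                # model artifact
```

And a hedged sketch of option 2, running a (hypothetical) notebook as one step of an Azure ML Pipeline via DatabricksStep; the notebook path and attached compute name are assumptions.

```python
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import DatabricksStep

db_step = DatabricksStep(
    name="databricks-train",
    notebook_path="/Users/someone@example.com/train",  # hypothetical notebook
    run_name="train",
    compute_target="my-databricks",  # Databricks compute attached to the workspace
    num_workers=2)
pipeline = Pipeline(workspace=ws, steps=[db_step])
```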

Compared to Azure ML Service, this option has

  1. MLlib, which supports distributed processing for commonly used ML algorithms such as k-means and decision trees (see the sketch after this list).
  2. MLflow, as compared to Azure ML Pipeline.
  3. Two flavors of compute clusters: Interactive and Automated Clusters.
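A minimal sketch of distributed k-means with MLlib's DataFrame API (pyspark.ml); the table and column names are illustrative.

```python
from pyspark.ml.clustering import KMeans
from pyspark.ml.feature import VectorAssembler

# `spark` is the notebook's SparkSession; assemble the feature columns
# into the single vector column that MLlib expects.
assembler = VectorAssembler(inputCols=["x1", "x2", "x3"], outputCol="features")
points = assembler.transform(spark.table("demo.points"))

# Fit k-means; Spark distributes the work across the cluster's worker nodes.
model = KMeans(k=3, featuresCol="features").fit(points)
print(model.clusterCenters())
```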

Similar to Azure Batch and ML compute, the base Docker image can be customized.

Common Services

In the previous sections, we succinctly described how the three computing platforms work. In this section, we discuss the common services provided by Azure.

Identity, Authentication and Authorization

Azure Active Directory (AAD) is the main identity manager, authenticator, and policy manager. This makes deployment and security models easy because we have a common service across all the different Azure resources. Without AAD, we would have fragmented core services, and federating between these services can be a nightmare.

Networking

Virtual Network enables secure communication between applications/components on-premises and across Azure resources.

Secure Store

Azure Key Vault can be used to store passwords and secrets. It is fully integrated with Azure applications, services, and DevOps.
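A minimal sketch with the azure-identity and azure-keyvault-secrets packages; the vault URL and secret name are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# DefaultAzureCredential picks up an AAD identity (managed identity,
# CLI login, etc.) without hard-coding credentials in the application.
client = SecretClient(vault_url="https://myvault.vault.azure.net",
                      credential=DefaultAzureCredential())
db_password = client.get_secret("db-password").value
```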

DevOps

Azure DevOps provides the necessary tools for assembling and monitoring enterprise-ready solutions, especially the integration of DevOps Pipelines with Git: compute pipelines can be created automatically after code changes are committed to Git branches.

Monitoring and Analytics

Azure Application Insights monitors runtime operations. It provides analytics, insights, and alerts.

Conclusion

Designing and implementing solutions on the cloud can be a big task. With the right set of tools, prefabricated compute platforms (the three mentioned in this blog), Azure integration and common services, and Azure DevOps, we start off on the right foot.
