Azure Machine Learning workspace
Azure Machine Learning studio, the web portal for the workspace. All assets can be managed from here.
Automated machine learning
Experiments utilise a compute cluster (see the sketch after this list)
Classification
Select Target
Select best model
Deployment
Azure Container Instances
Kubernetes cluster
Regression
Time series
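A minimal sketch of submitting an automated ML experiment with the Python SDK (v1); the dataset name, target column, and cluster name below are assumptions:

from azureml.core import Workspace, Experiment
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
train_ds = ws.datasets.get('training-data')   # assumed registered dataset

automl_config = AutoMLConfig(
    task='classification',                    # or 'regression' / 'forecasting'
    training_data=train_ds,
    label_column_name='target',               # the selected target column (assumption)
    primary_metric='AUC_weighted',            # metric used to select the best model
    compute_target='aml-cluster',             # assumed compute cluster name
    iterations=20,
)

run = Experiment(ws, 'automl-experiment').submit(automl_config)
run.wait_for_completion(show_output=True)
best_run, best_model = run.get_output()       # retrieve the best model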
Workspace contains
Compute
Compute clusters, scalable clusters for on-demand processing of experiment code (see the sketch after this list)
Inference clusters, deployment targets that use trained models
Attached compute, links to other Azure compute resources, such as virtual machines or Azure Databricks clusters
Compute instances, development workstations for running notebooks, with a fully featured Python environment (Jupyter and JupyterLab)
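As a sketch, a compute cluster can be provisioned from the SDK; the VM size and cluster name are assumptions:

from azureml.core import Workspace
from azureml.core.compute import ComputeTarget, AmlCompute

ws = Workspace.from_config()

compute_config = AmlCompute.provisioning_configuration(
    vm_size='STANDARD_DS11_V2',   # assumed VM size
    min_nodes=0,                  # scale down to zero when idle
    max_nodes=4,
)
cluster = ComputeTarget.create(ws, 'aml-cluster', compute_config)
cluster.wait_for_completion(show_output=True)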
Data
Datastores
Types of datastore
Azure Storage (blob and file containers)
Azure Data Lake stores
Azure SQL Database
Azure Databricks file system (DBFS)
Considerations
When using Azure blob storage, premium level storage may provide improved I/O performance for large datasets. However, this option will increase cost and may limit replication options for data redundancy.
When working with data files, although CSV format is very common, Parquet format generally results in better performance.
You can access any datastore by name, but you may want to consider changing the default datastore (which is initially the built-in workspaceblobstore datastore).
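A sketch of registering a blob container as a datastore and changing the default; the account and container names are assumptions:

from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

blob_ds = Datastore.register_azure_blob_container(
    workspace=ws,
    datastore_name='blob_data',
    container_name='data-container',   # assumed container
    account_name='azstore',            # assumed storage account
    account_key='<account-key>',       # placeholder credential
)

ws.set_default_datastore('blob_data')  # replaces the built-in workspaceblobstore default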
Datasets
Datasets are versioned packaged data objects that can be easily consumed in experiments and pipelines. They are typically based on files in a datastore, though they can also be based on URLs and other sources. Datasets are the recommended way to work with data, and are the primary mechanism for advanced Azure Machine Learning capabilities like data labeling and data drift monitoring. You can create the following types of dataset.
Types of dataset
Tabular
The data is read from the dataset as a table. You should use this type of dataset when your data is consistently structured and you want to work with it in common tabular data structures, such as Pandas dataframes.
File
The dataset presents a list of file paths that can be read as though from the file system. Use this type of dataset when your data is unstructured, or when you need to process the data at the file level (for example, to train a convolutional neural network from a set of image files).
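A sketch of creating and registering both dataset types from the default datastore; the paths and dataset names are assumptions:

from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
blob_ds = ws.get_default_datastore()

# tabular dataset read from CSV files (consistently structured data)
tab_ds = Dataset.Tabular.from_delimited_files(path=(blob_ds, 'data/*.csv'))
tab_ds.register(workspace=ws, name='csv_table', create_new_version=True)

# file dataset listing image files (unstructured, file-level processing)
file_ds = Dataset.File.from_files(path=(blob_ds, 'images/*.jpg'))
file_ds.register(workspace=ws, name='img_files', create_new_version=True)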
Pass a tabular / file dataset to an experiment script (see the sketch after this list)
Script argument
Named input
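As a sketch, both options via ScriptRunConfig, assuming a registered dataset named csv_table and a train.py that defines a --ds argument:

from azureml.core import Workspace, ScriptRunConfig

ws = Workspace.from_config()
tab_ds = ws.datasets.get('csv_table')    # assumed registered dataset

script_config = ScriptRunConfig(
    source_directory='experiment_dir',   # assumed folder containing train.py
    script='train.py',
    # option 1, script argument: pass tab_ds itself; its ID arrives as a string
    # and the script calls Dataset.get_by_id(run.experiment.workspace, id=args.ds)
    # option 2, named input: the script reads run.input_datasets['training_data']
    arguments=['--ds', tab_ds.as_named_input('training_data')],
)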
Notebooks
Pipelines
Models
Experiments
Experiments
An experiment can be run multiple times, with different data, code, or settings; and Azure Machine Learning tracks each run, enabling you to view run history and compare results for each run.
Encapsulate experiment logic in a script. The script can be run in any valid compute context.
Logging metrics and creating outputs
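Inside an experiment script, metrics and output files are recorded through the run context; a minimal sketch (the metric values are illustrative):

import os
from azureml.core import Run

run = Run.get_context()            # current run (an offline stub outside Azure ML)

run.log('accuracy', 0.92)          # log a single named metric
run.log_list('loss', [0.8, 0.5, 0.3])

# files written under ./outputs are uploaded with the run automatically
os.makedirs('outputs', exist_ok=True)
with open('outputs/notes.txt', 'w') as f:
    f.write('run artifacts go here')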
Environments
An environment is used to
Develop your training script.
Reuse the same environment on Azure Machine Learning Compute for model training at scale.
Revisit the environment in which an existing model was trained.
Deploy your model with that same environment.
Environments can broadly be divided into
curated
Curated environments are provided by Azure Machine Learning and are available in your workspace by default. Intended to be used as is, they contain collections of Python packages and settings to help you get started with various machine learning frameworks. These pre-created environments also allow for faster deployment time. For a full list, see the curated environments article.
user-managed
In user-managed environments, you're responsible for setting up your environment and installing every package that your training script needs on the compute target. Also be sure to include any dependencies needed for model deployment.
system-managed
You use system-managed environments when you want conda to manage the Python environment for you. A new conda environment is materialized from your conda specification on top of a base Docker image (see the sketch below).
Azure Machine Learning builds environment definitions into Docker images and conda environments
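A sketch of defining and registering a system-managed environment, where conda materializes the packages; the environment name and package choices are assumptions:

from azureml.core import Workspace, Environment
from azureml.core.conda_dependencies import CondaDependencies

ws = Workspace.from_config()

env = Environment('training-env')
env.python.conda_dependencies = CondaDependencies.create(
    conda_packages=['scikit-learn', 'pandas'],   # assumed training dependencies
    pip_packages=['azureml-defaults'],           # commonly needed for deployment
)
env.register(workspace=ws)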
Create a workspace via one of
Azure Machine Learning Python SDK (see the sketch after the benefits list)
Benefits
Run machine learning operations from your preferred development environment.
Automate asset creation and configuration to make it repeatable.
Ensure consistency for resources that must be replicated in multiple environments (for example, development, test, and production)
Incorporate machine learning asset configuration into developer operations (DevOps) workflows, such as continuous integration / continuous deployment (CI/CD) pipelines.
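A sketch of creating a workspace with the SDK; the subscription, resource group, and region values are placeholders:

from azureml.core import Workspace

ws = Workspace.create(
    name='aml-workspace',
    subscription_id='<subscription-id>',
    resource_group='aml-resources',
    create_resource_group=True,
    location='eastus',
)
ws.write_config()   # saves config.json so Workspace.from_config() works later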
Azure command-line interface (CLI)
Azure portal
Resource group
Storage account
Key vault
Application insights
Container registry
Workspace
VS Code Extension
The Azure cloud platform and Azure Machine Learning enable you to manage
Scalable on-demand compute for machine learning workloads.
Data storage and connectivity to ingest data from a wide range of sources.
Machine learning workflow orchestration to automate model training, deployment, and management processes.
Model registration and management, so you can track multiple versions of models and the data on which they were trained.
Metrics and monitoring for training experiments, datasets, and published services.
Model deployment for real-time and batch inferences.
Deployment targets
Local compute
Azure Machine Learning compute instance
Azure Container Instances
Azure Kubernetes Service
Azure Functions
IoT Edge module
Steps for deployment as a real-time service
1. Register the trained model
2. Define the inference configuration: scoring script with init() and run(), plus the environment
3. Define the deployment configuration
4. Deploy the model (see the sketch after this list)
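The four steps as a sketch, deploying to Azure Container Instances; the model path, service name, and environment name are assumptions:

from azureml.core import Workspace, Environment
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# 1. register the trained model
model = Model.register(workspace=ws, model_name='classifier',
                       model_path='outputs/model.pkl')

# 2. inference configuration: scoring script (init()/run()) plus environment
inference_config = InferenceConfig(source_directory='service_dir',
                                   entry_script='score.py',
                                   environment=Environment.get(ws, 'training-env'))

# 3. deployment configuration for the ACI target
deploy_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

# 4. deploy the model as a real-time service
service = Model.deploy(ws, 'classifier-service', [model],
                       inference_config, deploy_config)
service.wait_for_deployment(show_output=True)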
Authentication
Key
Token
JSON Web Token (JWT)
Use service-principal authentication to verify the client's identity through Azure Active Directory (Azure AD) and call a method of the service to retrieve a time-limited token
Not all targets support all authentication types
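A sketch of retrieving credentials for a deployed service; the service name is an assumption, and token retrieval applies to AKS targets:

from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()
service = Webservice(ws, 'classifier-service')

primary_key, secondary_key = service.get_keys()   # key-based authentication
# token, refresh_by = service.get_token()         # token-based (AKS targets only)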
Elements to a real-time service deployment
Trained model
Runtime environment configuration
Scoring script (see the sketch after this list)
Container image
Container host
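A sketch of the scoring script element: init() loads the registered model when the service starts, run() handles each request; the model name and input format are assumptions:

import json
import joblib
import numpy as np
from azureml.core.model import Model

def init():
    # load the registered model into a global for reuse across requests
    global model
    model_path = Model.get_model_path('classifier')   # assumed registered model name
    model = joblib.load(model_path)

def run(raw_data):
    # score the incoming JSON payload and return the predictions
    data = np.array(json.loads(raw_data)['data'])
    predictions = model.predict(data)
    return json.dumps(predictions.tolist())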