Azure Machine Learning workspace
Azure Machine Learning studio, the web portal for the workspace. All assets can be managed from here.
Automated machine learning
Experiments utilise a compute cluster (see the sketch after this list)
Classification
Select Target
Select best model
Deployment
Azure Container Instances
Kubernetes cluster
Regression
Time series
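A minimal sketch of submitting an automated ML experiment with the Python SDK (v1); the dataset name, target column, and cluster name below are assumptions:

from azureml.core import Workspace, Experiment
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
train_ds = ws.datasets.get('training-data')   # assumed registered dataset

automl_config = AutoMLConfig(
    task='classification',                    # or 'regression' / 'forecasting'
    training_data=train_ds,
    label_column_name='target',               # the selected target column (assumption)
    primary_metric='AUC_weighted',            # metric used to select the best model
    compute_target='aml-cluster',             # assumed compute cluster name
    iterations=20,
)

run = Experiment(ws, 'automl-experiment').submit(automl_config)
run.wait_for_completion(show_output=True)
best_run, best_model = run.get_output()       # retrieve the best model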
Workspace contains
Compute
Compute clusters, scalable clusters for on-demand processing of experiment code (see the sketch after this list)
Inference clusters, deployment targets that use trained models
Attached compute, links to other Azure compute resources, such as virtual machines or Azure Databricks clusters
Compute instances, development workstations for running notebooks, with a fully featured Python environment (Jupyter and JupyterLab)
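As a sketch, a compute cluster can be provisioned from the SDK; the VM size and cluster name are assumptions:

from azureml.core import Workspace
from azureml.core.compute import ComputeTarget, AmlCompute

ws = Workspace.from_config()

compute_config = AmlCompute.provisioning_configuration(
    vm_size='STANDARD_DS11_V2',   # assumed VM size
    min_nodes=0,                  # scale down to zero when idle
    max_nodes=4,
)
cluster = ComputeTarget.create(ws, 'aml-cluster', compute_config)
cluster.wait_for_completion(show_output=True)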
Data
Datastores
Types of datastore
Azure Storage (blob and file containers)
Azure Data Lake stores
Azure SQL Database
Azure Databricks file system (DBFS)
Considerations
When using Azure blob storage, premium level storage may provide improved I/O performance for large datasets. However, this option will increase cost and may limit replication options for data redundancy.
When working with data files, although CSV format is very common, Parquet format generally results in better performance.
You can access any datastore by name, but you may want to consider changing the default datastore (which is initially the built-in workspaceblobstore datastore).
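A sketch of registering a blob container as a datastore and changing the default; the account and container names are assumptions:

from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

blob_ds = Datastore.register_azure_blob_container(
    workspace=ws,
    datastore_name='blob_data',
    container_name='data-container',   # assumed container
    account_name='azstore',            # assumed storage account
    account_key='<account-key>',       # placeholder credential
)

ws.set_default_datastore('blob_data')  # replaces the built-in workspaceblobstore default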
Datasets
Datasets are versioned packaged data objects that can be easily consumed in experiments and pipelines. They are typically based on files in a datastore, though they can also be based on URLs and other sources. Datasets are the recommended way to work with data, and are the primary mechanism for advanced Azure Machine Learning capabilities like data labeling and data drift monitoring. You can create the following types of dataset.
Types of dataset
Tabular
The data is read from the dataset as a table. You should use this type of dataset when your data is consistently structured and you want to work with it in common tabular data structures, such as Pandas dataframes.
File
The dataset presents a list of file paths that can be read as though from the file system. Use this type of dataset when your data is unstructured, or when you need to process the data at the file level (for example, to train a convolutional neural network from a set of image files).
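A sketch of creating and registering both dataset types from the default datastore; the paths and dataset names are assumptions:

from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
blob_ds = ws.get_default_datastore()

# tabular dataset read from CSV files (consistently structured data)
tab_ds = Dataset.Tabular.from_delimited_files(path=(blob_ds, 'data/*.csv'))
tab_ds.register(workspace=ws, name='csv_table', create_new_version=True)

# file dataset listing image files (unstructured, file-level processing)
file_ds = Dataset.File.from_files(path=(blob_ds, 'images/*.jpg'))
file_ds.register(workspace=ws, name='img_files', create_new_version=True)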
Pass a tabular / file dataset to an experiment script (see the sketch after this list)
Script argument
Named input
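As a sketch, both options via ScriptRunConfig, assuming a registered dataset named csv_table and a train.py that defines a --ds argument:

from azureml.core import Workspace, ScriptRunConfig

ws = Workspace.from_config()
tab_ds = ws.datasets.get('csv_table')    # assumed registered dataset

script_config = ScriptRunConfig(
    source_directory='experiment_dir',   # assumed folder containing train.py
    script='train.py',
    # option 1, script argument: pass tab_ds itself; its ID arrives as a string
    # and the script calls Dataset.get_by_id(run.experiment.workspace, id=args.ds)
    # option 2, named input: the script reads run.input_datasets['training_data']
    arguments=['--ds', tab_ds.as_named_input('training_data')],
)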
Notebooks
Pipelines
Models
Experiments
Experiments
An experiment can be run multiple times, with different data, code, or settings; and Azure Machine Learning tracks each run, enabling you to view run history and compare results for each run.
Encapsulate experiment logic in a script. The script can be run in any valid compute context.
Logging metrics and creating outputs
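Inside an experiment script, metrics and output files are recorded through the run context; a minimal sketch (the metric values are illustrative):

import os
from azureml.core import Run

run = Run.get_context()            # current run (an offline stub outside Azure ML)

run.log('accuracy', 0.92)          # log a single named metric
run.log_list('loss', [0.8, 0.5, 0.3])

# files written under ./outputs are uploaded with the run automatically
os.makedirs('outputs', exist_ok=True)
with open('outputs/notes.txt', 'w') as f:
    f.write('run artifacts go here')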
Environments
An environment is used to
Develop your training script.
Reuse the same environment on Azure Machine Learning Compute for model training at scale.
Revisit the environment in which an existing model was trained.
Deploy your model with that same environment.
Environments can broadly be divided into
curated
Curated environments are provided by Azure Machine Learning and are available in your workspace by default. Intended to be used as is, they contain collections of Python packages and settings to help you get started with various machine learning frameworks. These pre-created environments also allow for faster deployment time. For a full list, see the curated environments article.
user-managed
In user-managed environments, you're responsible for setting up your environment and installing every package that your training script needs on the compute target. Also be sure to include any dependencies needed for model deployment.
system-managed
You use system-managed environments when you want conda to manage the Python environment for you. A new conda environment is materialized from your conda specification on top of a base Docker image (see the sketch below).
Azure Machine Learning builds environment definitions into Docker images and conda environments
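A sketch of defining and registering a system-managed environment, where conda materializes the packages; the environment name and package choices are assumptions:

from azureml.core import Workspace, Environment
from azureml.core.conda_dependencies import CondaDependencies

ws = Workspace.from_config()

env = Environment('training-env')
env.python.conda_dependencies = CondaDependencies.create(
    conda_packages=['scikit-learn', 'pandas'],   # assumed training dependencies
    pip_packages=['azureml-defaults'],           # commonly needed for deployment
)
env.register(workspace=ws)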
Create a workspace via one of
Azure Machine Learning Python SDK (see the sketch after the benefits list)
Benefits
Run machine learning operations from your preferred development environment.
Automate asset creation and configuration to make it repeatable.
Ensure consistency for resources that must be replicated in multiple environments (for example, development, test, and production)
Incorporate machine learning asset configuration into developer operations (DevOps) workflows, such as continuous integration / continuous deployment (CI/CD) pipelines.
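A sketch of creating a workspace with the SDK; the subscription, resource group, and region values are placeholders:

from azureml.core import Workspace

ws = Workspace.create(
    name='aml-workspace',
    subscription_id='<subscription-id>',
    resource_group='aml-resources',
    create_resource_group=True,
    location='eastus',
)
ws.write_config()   # saves config.json so Workspace.from_config() works later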
Azure command-line interface (CLI)
Azure portal
Resource group
Storage account
Key vault
Application insights
Container registry
Workspace
VS Code Extension
The Azure cloud platform and Azure Machine Learning enable you to manage
Scalable on-demand compute for machine learning workloads.
Data storage and connectivity to ingest data from a wide range of sources.
Machine learning workflow orchestration to automate model training, deployment, and management processes.
Model registration and management, so you can track multiple versions of models and the data on which they were trained.
Metrics and monitoring for training experiments, datasets, and published services.
Model deployment for real-time and batch inferences.
Deployment targets
Local compute
Azure Machine Learning compute instance
Azure Container Instances
Azure Kubernetes Service
Azure Functions
IoT Edge module
Steps for deployment as a real-time service
1. Register the trained model
2. Define the inference configuration: scoring script with init() and run(), plus the environment
3. Define the deployment configuration
4. Deploy the model (see the sketch after this list)
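The four steps as a sketch, deploying to Azure Container Instances; the model path, service name, and environment name are assumptions:

from azureml.core import Workspace, Environment
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# 1. register the trained model
model = Model.register(workspace=ws, model_name='classifier',
                       model_path='outputs/model.pkl')

# 2. inference configuration: scoring script (init()/run()) plus environment
inference_config = InferenceConfig(source_directory='service_dir',
                                   entry_script='score.py',
                                   environment=Environment.get(ws, 'training-env'))

# 3. deployment configuration for the ACI target
deploy_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

# 4. deploy the model as a real-time service
service = Model.deploy(ws, 'classifier-service', [model],
                       inference_config, deploy_config)
service.wait_for_deployment(show_output=True)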
Authentication
Key
Token
JSON Web Token (JWT)
Use service-principal authentication to verify the client's identity through Azure Active Directory (Azure AD) and call a method of the service to retrieve a time-limited token
Not all targets support all authentication types
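A sketch of retrieving credentials for a deployed service; the service name is an assumption, and token retrieval applies to AKS targets:

from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()
service = Webservice(ws, 'classifier-service')

primary_key, secondary_key = service.get_keys()   # key-based authentication
# token, refresh_by = service.get_token()         # token-based (AKS targets only)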
Elements to a real-time service deployment
Trained model
Runtime environment configuration
Scoring script (see the sketch after this list)
Container image
Container host
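A sketch of the scoring script element: init() loads the registered model when the service starts, run() handles each request; the model name and input format are assumptions:

import json
import joblib
import numpy as np
from azureml.core.model import Model

def init():
    # load the registered model into a global for reuse across requests
    global model
    model_path = Model.get_model_path('classifier')   # assumed registered model name
    model = joblib.load(model_path)

def run(raw_data):
    # score the incoming JSON payload and return the predictions
    data = np.array(json.loads(raw_data)['data'])
    predictions = model.predict(data)
    return json.dumps(predictions.tolist())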