Costa Rica
Last updated: 2025-07-16
Table of Contents
- Step 1: Set Up Your Azure ML Workspace
- Step 2: Create a Compute Instance
- Step 3: Prepare Your Data
- Step 4: Create a New Notebook or Script
- Step 5: Load and Explore the Data
- Step 6: Train Your Model
- Step 7: Evaluate the Model
- Step 8: Register the Model
- Step 9: Deploy the Model
- Step 10: Test the Endpoint
You can use the Azure portal approach:
- Go to the Azure Portal
- Create a Machine Learning workspace:
- Resource group
- Workspace name
- Region
- Once created, launch Azure Machine Learning Studio.
How.to.deploy.Azure.Machine.Learning.mp4
Alternatively, use Terraform configurations to set up an Azure Machine Learning workspace along with compute clusters and supporting resources that form the core of an ML platform. Click here to see Demonstration: Deploying Azure Resources for an ML Platform.
- Go to Azure Machine Learning Studio and select your workspace.
- Select Compute from the left menu, then choose the Compute instances tab.
- Click New:
  - Enter a name for your compute instance.
  - Choose a virtual machine size (e.g., Standard_DS3_v2).
  - Optionally, enable SSH access or assign a user.
- Click Create: Azure will provision the compute instance, which may take a few minutes.
How.to.deploy.a.compute.instance.in.Azure.ML.mp4
- Upload your dataset to an Azure ML datastore or connect to external sources (e.g., Azure Blob Storage, SQL, etc.).
- Use Data > Datasets to register and version your dataset.
For example: Upload the CSV to Azure ML
- Go to Data > + Create > From local files.
- Choose:
  - Name: employee_data
  - Type: Tabular
- Browse and upload the sample_data.csv file.
- Click Next, then Review + Create.
Register the Dataset:
- After upload, Azure will preview the data.
- Confirm the schema (column names and types).
- Click Create to register the dataset.
Upload.the.CSV.to.Azure.ML.mp4
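If you want to follow along without the original file, you can generate a small stand-in CSV. The column names (Name, Age, Department, Salary) and values below are assumptions inferred from the training code later in this guide, not the real dataset:

```python
import pandas as pd

# Hypothetical rows matching the columns referenced later in this guide
df = pd.DataFrame({
    "Name": ["Ana", "Luis", "Marta", "Jose"],
    "Age": [29, 41, 35, 50],
    "Department": ["HR", "IT", "Finance", "IT"],
    "Salary": [52000, 78000, 66000, 91000],
})

# Write the file that will be uploaded and registered as 'employee_data'
df.to_csv("sample_data.csv", index=False)
```

Once written, this file can be uploaded through Data > + Create > From local files exactly as described above.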
- Use the compute instance to open a Jupyter notebook or create a Python script.
- Import necessary libraries:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
Create.a.New.Notebook.or.Script.and.import.lib.mp4
Load the dataset and perform basic EDA (exploratory data analysis):
import mltable
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
ml_client = MLClient.from_config(credential=DefaultAzureCredential())
data_asset = ml_client.data.get("employee_data", version="1")
tbl = mltable.load(f'azureml:/{data_asset.id}')
df = tbl.to_pandas_dataframe()
df
Load.and.Explore.the.Data.mp4
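The "basic EDA" mentioned above typically means inspecting shape, summary statistics, missing values, and category balance. A minimal sketch, using a small synthetic frame in place of the loaded df (the columns are assumptions):

```python
import pandas as pd

# Stand-in for the DataFrame loaded from the registered dataset
df = pd.DataFrame({
    "Age": [29, 41, 35, 50],
    "Department": ["HR", "IT", "Finance", "IT"],
    "Salary": [52000, 78000, 66000, 91000],
})

print(df.head())                        # first rows
print(df.describe())                    # numeric summary statistics
print(df.isnull().sum())                # missing values per column
print(df["Department"].value_counts())  # category balance
```

These checks inform the preprocessing choices in the next step (encoding Department, handling missing values).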
Split the data and train a model:
# Step 1: Preprocessing
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Encode categorical columns
label_encoder = LabelEncoder()
df['Department'] = label_encoder.fit_transform(df['Department'])

# Drop non-informative or high-cardinality columns
if 'Name' in df.columns:
    df = df.drop(columns=['Name'])  # 'Name' is likely not predictive

# Optional: Check for missing values
if df.isnull().sum().any():
    df = df.dropna()  # or use df.ffill() for forward-fill imputation

# Step 2: Define Features and Target
X = df.drop('Salary', axis=1)  # Features: Age and Department
y = df['Salary']               # Target: Salary

# Optional: Feature scaling (especially useful for models sensitive to scale)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Step 3: Split the Data
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

# Step 4: Train a Regression Model
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=100,
    max_depth=None,
    random_state=42,
    n_jobs=-1  # Use all available cores
)
model.fit(X_train, y_train)
E.g.Train.Your.Model.-.Regression.Model.mp4
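After fitting, it is worth checking which features the forest actually relies on. A self-contained sketch using synthetic data (the feature names and data are illustrative, not the real dataset):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))                       # two synthetic features
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=200)   # only the first feature matters

model = RandomForestRegressor(n_estimators=100, random_state=42).fit(X, y)

# feature_importances_ sums to 1.0 across features
print(dict(zip(["Age", "Department"], model.feature_importances_.round(2))))
```

If one feature dominates the importances, a simpler model may perform just as well; if both are near zero, revisit the preprocessing.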
Check performance:
# Step 5: Make Predictions
predictions = model.predict(X_test)
# Step 6: Evaluate the Model
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import numpy as np
mae = mean_absolute_error(y_test, predictions)
mse = mean_squared_error(y_test, predictions)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, predictions)
print("Model Evaluation Metrics")
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
print(f"R² Score: {r2:.2f}")
Distribution of prediction errors:
import matplotlib.pyplot as plt
# Plot 1: Distribution of prediction errors
errors = y_test - predictions
plt.figure(figsize=(10, 6))
plt.hist(errors, bins=30, color='skyblue', edgecolor='black')
plt.title('Distribution of Prediction Errors')
plt.xlabel('Prediction Error')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()
# Plot 2: Predicted vs Actual values
plt.figure(figsize=(10, 6))
plt.scatter(y_test, predictions, alpha=0.3, color='darkorange')
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.title('Predicted vs Actual Salary')
plt.xlabel('Actual Salary')
plt.ylabel('Predicted Salary')
plt.grid(True)
plt.show()
Save and register the model in Azure ML:
import joblib
joblib.dump(model, 'model.pkl')
from azureml.core import Workspace, Model
ws = Workspace.from_config()
Model.register(workspace=ws, model_path="model.pkl", model_name="my_model_RegressionModel")
Register.the.Model.mp4
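Before registering, it can save a deployment round-trip to confirm the pickle reloads correctly. A quick sanity check, using a tiny synthetic model here so the snippet is self-contained:

```python
import joblib
from sklearn.ensemble import RandomForestRegressor

# Tiny stand-in model (in practice, use the model trained above)
model = RandomForestRegressor(n_estimators=10, random_state=0)
model.fit([[25, 0], [35, 1], [45, 2]], [40000, 60000, 80000])
joblib.dump(model, "model.pkl")

# Reload the pickle and confirm it predicts identically to the in-memory model
reloaded = joblib.load("model.pkl")
assert (reloaded.predict([[35, 1]]) == model.predict([[35, 1]])).all()
```

If this round-trip fails (e.g., due to a scikit-learn version mismatch), fix it locally before deploying, where the same failure is much harder to debug.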
Tip
Click here to read the script used.
Create the scoring script as demonstrated in the video below. Click to see a more detailed sample of the scoring file with debugging and logging included.
How.to.create.the.scoring.script.mp4
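For orientation, an Azure ML entry script follows an init()/run() pattern: init() loads the model once at container startup, and run() handles each scoring request. A minimal sketch (the payload schema under "data" is an assumption; adapt it to your inputs):

```python
# score.py -- minimal sketch of an Azure ML entry script
import json
import os

import joblib
import numpy as np

model = None

def init():
    # Azure ML sets AZUREML_MODEL_DIR to the folder holding the registered model files
    global model
    model_dir = os.getenv("AZUREML_MODEL_DIR", ".")
    model = joblib.load(os.path.join(model_dir, "model.pkl"))

def run(raw_data):
    # raw_data is the request body, e.g. '{"data": [[35, 1], [45, 2]]}'
    try:
        data = np.array(json.loads(raw_data)["data"])
        predictions = model.predict(data)
        return predictions.tolist()
    except Exception as e:
        # Returning the error as JSON makes failures visible to the caller
        return {"error": str(e)}
```

The detailed sample linked above adds logging around these same two functions.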
Create the Environment File (env.yml):
Sample.env.yml.that.uses.requirements.txt.mp4
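As shown in the video, the environment file can defer the pip packages to requirements.txt. A hypothetical sketch (the Python version and package pins are assumptions; adjust them to your workspace):

```yaml
# env.yml -- conda environment that delegates pip packages to requirements.txt
name: regression-env
dependencies:
  - python=3.9
  - pip
  - pip:
      - -r requirements.txt
```

The accompanying requirements.txt would then list packages such as scikit-learn, pandas, numpy, joblib, and azureml-defaults (the latter is required for serving the scoring script).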
Create a new notebook:
Creating.new.notebook.and.mv.location.mp4
Create an inference configuration and deploy to a web service:
from azureml.core import Workspace
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice
# Load the workspace
ws = Workspace.from_config()
# Get the registered model
registered_model = Model(ws, name="my_model_RegressionModel")
# Create environment from requirements.txt (no conda)
env = Environment.from_pip_requirements(
    name="regression-env",
    file_path="requirements.txt"  # Make sure this file exists in your working directory
)
# Define inference configuration
inference_config = InferenceConfig(entry_script="score.py", environment=env)
# Define deployment configuration
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
# Deploy the model
service = Model.deploy(
    workspace=ws,
    name="regression-model-service",
    models=[registered_model],
    inference_config=inference_config,
    deployment_config=deployment_config
)
service.wait_for_deployment(show_output=True)
print(f"Scoring URI: {service.scoring_uri}")
Once deployed, you can send HTTP requests to the endpoint to get predictions.
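A sketch of such a request using only the standard library. The scoring URI placeholder and the payload shape are assumptions; substitute the URI printed above, and match the payload to what your score.py expects:

```python
import json
import urllib.request

# Hypothetical scoring URI; use the value printed by service.scoring_uri
scoring_uri = "http://<your-service>.azurecontainer.io/score"

# Two sample rows shaped like the training features after encoding
# (the exact schema depends on your scoring script)
payload = json.dumps({"data": [[34, 2], [45, 0]]})

def score(uri: str, body: str):
    """POST the JSON body to the endpoint and return the parsed response."""
    req = urllib.request.Request(
        uri,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# predictions = score(scoring_uri, payload)  # run once the service is up
```

If the deployed endpoint has authentication enabled, add an `Authorization: Bearer <key>` header obtained from service.get_keys().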