Demonstration: Creating a Machine Learning Model


GitHub brown9804

Last updated: 2025-07-16



Step 1: Set Up Your Azure ML Workspace

You can use the Azure Portal approach:

  • Go to the Azure Portal
  • Create a Machine Learning workspace:
    • Resource group
    • Workspace name
    • Region
  • Once created, launch Azure Machine Learning Studio.
How.to.deploy.Azure.Machine.Learning.mp4

Alternatively, you can use Terraform configurations to set up an Azure Machine Learning workspace along with compute clusters and supporting resources that form the core of an ML platform; click here to see Demonstration: Deploying Azure Resources for an ML Platform

Step 2: Create a Compute Instance

  1. Go to Azure Machine Learning Studio and select your workspace.

  2. Select Compute from the left menu, then choose the Compute instances tab.

  3. Click New:

    • Enter a name for your compute instance.
    • Choose a virtual machine size (e.g., Standard_DS3_v2).
    • Optionally, enable SSH access or assign a user.
  4. Click Create: Azure will provision the compute instance, which may take a few minutes.

    How.to.deploy.a.compute.instance.in.Azure.ML.mp4

Step 3: Prepare Your Data

  • Upload your dataset to an Azure ML datastore or connect to external sources (e.g., Azure Blob Storage, SQL, etc.).
  • Use Data > Datasets to register and version your dataset.

For example: Upload the CSV to Azure ML

  1. Go to Data > + Create > From local files.
  2. Choose:
    • Name: employee_data
    • Type: Tabular
    • Browse and upload the sample_data.csv file.
  3. Click Next, then Review + Create.

Register the Dataset:

  1. After upload, Azure will preview the data.
  2. Confirm the schema (column names and types).
  3. Click Create to register the dataset.
Upload.the.CSV.to.Azure.ML.mp4

Step 4: Create a New Notebook or Script

  • Use the compute instance to open a Jupyter notebook or create a Python script.

  • Import necessary libraries:

    ```python
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
    ```

    Create.a.New.Notebook.or.Script.and.import.lib.mp4

Step 5: Load and Explore the Data

Load the dataset and perform basic EDA (exploratory data analysis):

```python
import mltable
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())
data_asset = ml_client.data.get("employee_data", version="1")

tbl = mltable.load(f'azureml:/{data_asset.id}')

df = tbl.to_pandas_dataframe()
df
```
Load.and.Explore.the.Data.mp4
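Beyond displaying the dataframe, a few quick pandas calls help sanity-check the data before training. The snippet below is a minimal EDA sketch; the small inline dataframe stands in for the registered employee_data asset, and its column names (Age, Department, Salary) are assumptions based on the columns used later in this demo.

```python
import pandas as pd

# Stand-in for df = tbl.to_pandas_dataframe(); columns mirror this demo's dataset
df = pd.DataFrame({
    "Age": [25, 32, 47, 51, 38],
    "Department": ["HR", "IT", "IT", "Finance", "HR"],
    "Salary": [40000, 62000, 90000, 98000, 55000],
})

df.info()                                          # column dtypes and non-null counts
print(df.describe())                               # summary statistics for numeric columns
print(df.isnull().sum())                           # missing values per column
print(df.groupby("Department")["Salary"].mean())   # average salary by department
```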

Step 6: Train Your Model

Split the data and train a model:

```python
# Step 1: Preprocessing
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Encode categorical columns
label_encoder = LabelEncoder()
df['Department'] = label_encoder.fit_transform(df['Department'])

# Drop non-informative or high-cardinality columns
if 'Name' in df.columns:
    df = df.drop(columns=['Name'])  # 'Name' is likely not predictive

# Optional: Check for missing values
if df.isnull().sum().any():
    df = df.dropna()  # or use df.ffill() for forward-fill imputation

# Step 2: Define Features and Target
X = df.drop('Salary', axis=1)  # Features: Age and Department
y = df['Salary']               # Target: Salary

# Optional: Feature scaling (especially useful for models sensitive to scale)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Step 3: Split the Data
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

# Step 4: Train a Regression Model
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=100,
    max_depth=None,
    random_state=42,
    n_jobs=-1  # Use all available cores
)
model.fit(X_train, y_train)
```
E.g.Train.Your.Model.-.Regression.Model.mp4

Step 7: Evaluate the Model

Check performance:

```python
# Step 5: Make Predictions
predictions = model.predict(X_test)

# Step 6: Evaluate the Model
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import numpy as np

mae = mean_absolute_error(y_test, predictions)
mse = mean_squared_error(y_test, predictions)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, predictions)

print("Model Evaluation Metrics")
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
print(f"R² Score: {r2:.2f}")
```

Distribution of prediction errors:

```python
import matplotlib.pyplot as plt

# Plot 1: Distribution of prediction errors
errors = y_test - predictions
plt.figure(figsize=(10, 6))
plt.hist(errors, bins=30, color='skyblue', edgecolor='black')
plt.title('Distribution of Prediction Errors')
plt.xlabel('Prediction Error')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()

# Plot 2: Predicted vs Actual values
plt.figure(figsize=(10, 6))
plt.scatter(y_test, predictions, alpha=0.3, color='darkorange')
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.title('Predicted vs Actual Salary')
plt.xlabel('Actual Salary')
plt.ylabel('Predicted Salary')
plt.grid(True)
plt.show()
```

Step 8: Register the Model

Save and register the model in Azure ML:

```python
import joblib
joblib.dump(model, 'model.pkl')

from azureml.core import Workspace, Model
ws = Workspace.from_config()
Model.register(workspace=ws, model_path="model.pkl", model_name="my_model_RegressionModel")
```
Register.the.Model.mp4

Tip

Click here to read the script used.

Step 9: Deploy the Model

Create the scoring script as demonstrated in the video below. Click to see a more detailed sample of the scoring file with debugging and logging included.

How.to.create.the.scoring.script.mp4

Create the Environment File (env.yaml):

Sample.env.yml.that.uses.requirements.txt.mp4
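As a rough sketch to accompany the video, a conda-style env.yaml that delegates package pinning to requirements.txt could look like this (the environment name and Python version are illustrative assumptions):

```yaml
# env.yaml -- illustrative sketch; name and versions are assumptions
name: regression-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - pip
  - pip:
      - -r requirements.txt
```

For this demo, requirements.txt would typically list scikit-learn, pandas, joblib, and azureml-defaults; note that the deployment code below builds the environment directly from requirements.txt, so that file is the one that must exist in your working directory.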

Create a new notebook:

Creating.new.notebook.and.mv.location.mp4

Create an inference configuration and deploy to a web service:

```python
from azureml.core import Workspace
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

# Load the workspace
ws = Workspace.from_config()

# Get the registered model
registered_model = Model(ws, name="my_model_RegressionModel")

# Create environment from requirements.txt (no conda)
env = Environment.from_pip_requirements(
    name="regression-env",
    file_path="requirements.txt"  # Make sure this file exists in your working directory
)

# Define inference configuration
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# Define deployment configuration
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

# Deploy the model
service = Model.deploy(
    workspace=ws,
    name="regression-model-service",
    models=[registered_model],
    inference_config=inference_config,
    deployment_config=deployment_config
)

service.wait_for_deployment(show_output=True)
print(f"Scoring URI: {service.scoring_uri}")
```

Step 10: Test the Endpoint

Once deployed, you can send HTTP requests to the endpoint to get predictions.
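A minimal request sketch using only the standard library is shown below. The {"data": [[age, department], ...]} payload shape and the SCORING_URI environment variable are assumptions; substitute the URI printed by service.scoring_uri at deployment time, and match the schema your scoring script expects.

```python
import json
import os
import urllib.request

# Hypothetical input rows: [Age, Department-encoded]; must match your score.py schema
payload = json.dumps({"data": [[35, 1], [50, 2]]}).encode("utf-8")
headers = {"Content-Type": "application/json"}

scoring_uri = os.getenv("SCORING_URI")  # e.g. the URI printed by service.scoring_uri
if scoring_uri:
    req = urllib.request.Request(scoring_uri, data=payload, headers=headers)
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read()))  # e.g. {"predictions": [...]}
else:
    print("Set SCORING_URI to call the deployed endpoint; payload:", payload.decode())
```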
