The goal of this sample is to accelerate deployment of Industrial IoT prediction patterns. There is no one-size-fits-all solution; there are many considerations, so please review them before moving your workload to production.
Exploratory Data Analysis (EDA) is the first step before we build any custom models using machine learning. This is a critical and often complex step in which we normalize and clean the data, understand its distribution, outliers, and correlations, and assess the data against various hypotheses and experiments.
Our scenario is around predicting quality-related failures based on machine condition. The telemetry data contains a point-in-time snapshot of all the sensor values; how these values actually impacted quality failure conditions is logged in a different system. For this sample we will use:

- Simulated Sensor Data
  - Generated via an IoT Edge module
  - Contains 40+ different sensor values
  - Contains the production batch number
- Production Quality Data
  - Contains the production batch number
  - Contains a quality error code for each batch
  - 1 = Meets quality expectations | 0 = Does not meet quality expectations
- Prerequisite: you have the Connectivity Deployment Sample working, or already have your IIoT data in Data Explorer.
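To make the relationship between the two datasets concrete, here is a minimal pandas sketch of how the sensor telemetry could be joined with the batch quality labels on the batch number. The column names and values are illustrative assumptions only, not the actual schema of the simulated data:

```python
import pandas as pd

# Illustrative schema only: column names and values are assumptions.
sensor_df = pd.DataFrame({
    "BatchNumber": [101, 101, 102],
    "Timestamp": pd.to_datetime(["2021-01-01 08:00", "2021-01-01 08:01", "2021-01-01 09:00"]),
    "Temperature": [72.1, 72.4, 68.9],   # one of the 40+ sensor values
    "Vibration": [0.02, 0.03, 0.11],
})

quality_df = pd.DataFrame({
    "BatchNumber": [101, 102],
    "Quality": [1, 0],   # 1 = meets quality expectations, 0 = does not
})

# The batch number is the key that ties sensor conditions to quality outcomes
labeled = sensor_df.merge(quality_df, on="BatchNumber", how="inner")
print(labeled)
```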
- Add the new SimulatedManufacturingSensors module to the IoT Edge device created in the sample above:
  - In the Azure Portal, select IoT Hub > IoT Edge > [Your Device] > Set Modules
  - Select Add > IoT Edge Module
  - Set Module Name to `SimulatedManufacturingSensors` and Image URI to `ghcr.io/jomit/simulatedmanufacturingsensors:0.0.1-amd64`, then click Add
  - Click Next and verify that the `upstream` route value is `FROM /messages/* INTO $upstream`
  - Click Next and Create
- Wait a few seconds and verify that the module is deployed and is sending logs
- Verify the data in Data Explorer using the query in `VerifySimulatedData.kql`
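If you prefer to verify from code rather than the Data Explorer UI, a sketch along these lines should work with the azure-kusto-data package; the cluster URI, database, and table names are placeholders for your own deployment, and in practice you would paste in the query from `VerifySimulatedData.kql`:

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
from azure.kusto.data.helpers import dataframe_from_result_table

# Placeholders: replace with your own Data Explorer cluster URI and database
cluster_uri = "https://<your-adx-cluster>.westus2.kusto.windows.net"
database = "<your-database>"

# Reuses the local `az login` session for authentication
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster_uri)
client = KustoClient(kcsb)

# Assumed table name; use the query from VerifySimulatedData.kql in practice
response = client.execute(database, "telemetry | take 10")
print(dataframe_from_result_table(response.primary_results[0]))
```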
- Open the data lake created earlier in the Azure Portal and upload the `batch-quality-data.csv` file to a folder named `qualitydata`
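The upload can also be scripted instead of done through the portal; a minimal sketch with the azure-storage-file-datalake package might look like the following, where the account URL and container (filesystem) name are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholders: replace with your storage account URL and container name
service = DataLakeServiceClient(
    account_url="https://<your-datalake-account>.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
filesystem = service.get_file_system_client("<your-container>")

# Create the qualitydata folder and upload the CSV into it
directory = filesystem.create_directory("qualitydata")
file_client = directory.create_file("batch-quality-data.csv")
with open("batch-quality-data.csv", "rb") as data:
    file_client.upload_data(data, overwrite=True)
```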
The Machine Learning workspace provides end-to-end data science lifecycle management services. It also provides a centralized place to collaborate on artifacts across machine learning development and deployment.
- Create a new machine learning workspace:

  ```bash
  az ml workspace create -w iiotml -g iiotsample -l westus2
  ```

- Create a new compute instance for development (compute instances are typically per user, so prefix the name with your own):

  ```bash
  az ml computetarget create computeinstance --name jomitdev --vm-size STANDARD_DS3_V2 -w iiotml -g iiotsample
  ```
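If you prefer the Python SDK (azureml-core, v1) over the CLI, a rough equivalent of the two commands above is sketched below; the subscription ID is a placeholder and this snippet is not part of the sample itself:

```python
from azureml.core import Workspace
from azureml.core.compute import ComputeInstance, ComputeTarget

# Placeholder subscription ID; workspace and resource group names match the CLI commands above
ws = Workspace.create(
    name="iiotml",
    subscription_id="<your-subscription-id>",
    resource_group="iiotsample",
    location="westus2",
    exist_ok=True,
)

# Compute instances are per user, so prefix the name with your own
compute_config = ComputeInstance.provisioning_configuration(vm_size="STANDARD_DS3_V2")
instance = ComputeTarget.create(ws, "jomitdev", compute_config)
instance.wait_for_completion(show_output=True)
```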
- Go to the Notebooks section in the Machine Learning Studio portal and upload the files from the `notebooks` folder
- Open Machine Learning Studio and select the workspace created above.
- Create a new datastore to connect to the telemetry data lake created earlier.
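The datastore can be registered from the studio UI or from code. A hedged sketch using the v1 SDK and a service principal with access to the lake is shown below; the datastore name, container, account, and service principal values are all placeholders:

```python
from azureml.core import Datastore, Workspace

ws = Workspace.from_config()

# Placeholders: replace with your data lake account, container, and service principal details
datastore = Datastore.register_azure_data_lake_gen2(
    workspace=ws,
    datastore_name="telemetrydatalake",
    filesystem="<your-container>",
    account_name="<your-datalake-account>",
    tenant_id="<your-tenant-id>",
    client_id="<your-sp-client-id>",
    client_secret="<your-sp-client-secret>",
)
print(datastore.name, datastore.datastore_type)
```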
- Open and run the `1_create_raw_dataset.ipynb` notebook
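The notebook is the source of truth for this step; presumably it builds and registers a raw dataset from the datastore, roughly like the sketch below (the datastore name, path pattern, and dataset name are assumptions):

```python
from azureml.core import Dataset, Datastore, Workspace

ws = Workspace.from_config()
datastore = Datastore.get(ws, "telemetrydatalake")  # assumed datastore name

# Assumed path and format: point the dataset at the raw telemetry files in the lake
raw_ds = Dataset.Tabular.from_delimited_files(path=(datastore, "telemetry/**/*.csv"))
raw_ds = raw_ds.register(workspace=ws, name="raw_sensor_data", create_new_version=True)
print(raw_ds.take(5).to_pandas_dataframe())
```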
- Open and run the `2_exploratory_analysis_feature_selection.ipynb` notebook
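As a rough illustration of what feature selection against the quality label can look like (not the notebook's actual code; the file and column names are assumptions), ranking sensors by their correlation with the label is a common first pass:

```python
import pandas as pd

# Assumed intermediate file: joined telemetry + quality labels from the earlier steps
labeled = pd.read_csv("labeled_batches.csv")

# Correlation of every numeric sensor column with the quality label
correlations = (
    labeled.corr(numeric_only=True)["Quality"]
    .drop("Quality")
    .sort_values(key=abs, ascending=False)
)
print(correlations.head(10))  # strongest candidates for model features
```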
- Open and run the `2_frequency_analysis.ipynb` notebook
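Frequency analysis typically looks for periodic behaviour in individual sensor signals; purely as an illustration (again, not the notebook's actual code; file, column, and sample rate are assumptions), a magnitude spectrum of one sensor column could be computed like this:

```python
import numpy as np
import pandas as pd

labeled = pd.read_csv("labeled_batches.csv")   # assumed intermediate file
signal = labeled["Vibration"].to_numpy()       # assumed sensor column
sample_rate_hz = 1.0                           # assumed: one reading per second

# Magnitude spectrum of the mean-removed signal
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)

# Dominant frequencies are candidates for engineered features
top = np.argsort(spectrum)[-5:][::-1]
print(list(zip(freqs[top], spectrum[top])))
```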
- Open and run the `3_baseline_modeling.ipynb` notebook
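For orientation only, a typical baseline for this kind of tabular quality-prediction problem is a simple classifier evaluated on precision and recall, along the lines of the sketch below (the notebook defines the actual baseline; the file and column names here are assumptions):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

labeled = pd.read_csv("labeled_batches.csv")           # assumed intermediate file
X = labeled.drop(columns=["Quality", "BatchNumber"])   # assumed numeric sensor columns
y = labeled["Quality"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

baseline = RandomForestClassifier(n_estimators=200, random_state=42)
baseline.fit(X_train, y_train)

# Precision and recall are the metrics tied back to business impact in the next section
print(classification_report(y_test, baseline.predict(X_test)))
```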
For any Machine Learning project to succeed, it's crucial to tie machine learning metrics to overall business performance. Here's an example of how you might approach this for quality prediction scenarios:
- Build a baseline of the business metrics that you want to improve using ML. For example:
  - Number of quality failures
  - Percentage of scrap
  - Additional time spent on quality rework
  - Cost of quality failures
  - Cost of quality rework
- Select machine learning metrics for model performance based on the use case / scenario. For example:
  - "Precision" attempts to answer: what proportion of positive identifications were actually correct?
  - "Recall" attempts to answer: what proportion of actual positives were identified correctly?
  - For scenarios where the cost of a wrong prediction is high, choose higher "precision"
  - For scenarios where the cost of missing a detection is high, choose higher "recall" (the sketch after the table below shows how both metrics are computed)
- Perform A/B testing and quantify the business metric improvements and cost impact, as shown in the example below:

| Business metric | Current | With ML (precision=50%, recall=90%) | Cost impact |
| --- | --- | --- | --- |
| Number of quality failures per year | 100 | 25 | cost per quality failure - 75% |
| Percentage of scrap | 15% | 9% | cost of scrap - 6% |
| Additional time spent on quality rework | 10% | 2% | cost of rework - 8% |
| ... | ... | ... | ... |
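To make the precision/recall trade-off above concrete, both metrics can be computed directly from a model's predictions on a held-out set; note that in this sample's quality data the failure class is labeled 0, so it is passed as the positive label:

```python
from sklearn.metrics import precision_score, recall_score

# Illustrative labels only: 0 = quality failure, 1 = meets expectations
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Treat the failure class (0) as the positive class the model should catch
precision = precision_score(y_true, y_pred, pos_label=0)  # of batches flagged as failures, how many truly failed
recall = recall_score(y_true, y_pred, pos_label=0)        # of true failures, how many were flagged
print(f"precision={precision:.2f}, recall={recall:.2f}")
```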