This capstone project is the culmination of the Azure ML Engineer path. The primary objective is to develop and deploy a predictive model for stroke occurrence based on various diseases and habits recorded in the dataset. The project is structured into three main tasks:
- Train a Model with AutoML: Utilize Azure AutoML to automate the process of model selection and hyperparameter tuning.
- Train a Model with HyperDrive: Implement HyperDrive to perform hyperparameter optimization manually, ensuring a thorough search for the best model configuration.
- Deploy the Best Model: Compare the models generated by AutoML and HyperDrive, and deploy the superior one. In our case, HyperDrive produced the better model (0.95 accuracy) compared with the AutoML model (0.85 AUC_weighted); note that the two runs were optimized for different primary metrics.
The steps below describe how this project was performed.
The dataset utilized for this project is the Heart Failure Clinical Dataset. It contains clinical features such as age, sex, smoking, and medical history, which are potential predictors of stroke occurrence.
To import the dataset into the Azure ML Studio workspace: the dataset was sourced from the Kaggle link below and downloaded as a CSV file. A Dataset was then created in the workspace by uploading the file from local storage through Azure ML Studio.
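The upload can also be done programmatically. Below is a sketch using the azureml SDK v1; the file name and dataset name are assumptions and should be adjusted to match your workspace.

```python
from azureml.core import Workspace, Dataset

# Assumes a config.json for the workspace is present locally
ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Upload the downloaded Kaggle CSV to the default datastore
datastore.upload_files(
    files=["heart_failure_clinical_records_dataset.csv"],  # assumed local file name
    target_path="data/",
    overwrite=True,
)

# Create a TabularDataset from the uploaded file and register it
dataset = Dataset.Tabular.from_delimited_files(
    path=(datastore, "data/heart_failure_clinical_records_dataset.csv"))
dataset = dataset.register(workspace=ws, name="heart-failure-dataset")
```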
Details of all the trials performed by AutoML and their corresponding results.
The screenshot of the completed AutoML job.
The best model found by AutoML and its properties are detailed below.
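In a notebook, the best run and its fitted model can be retrieved from a completed AutoML run with the azureml SDK v1. This is a sketch; `automl_run` is assumed to be the `AutoMLRun` object returned when the experiment was submitted.

```python
# `automl_run` is the AutoMLRun returned by experiment.submit(automl_config)
best_run, fitted_model = automl_run.get_output()

print(best_run.get_metrics())  # metrics of the best child run
print(fitted_model)            # the fitted pipeline, e.g. a voting ensemble
```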
All the metrics of the best AutoML model
The best model can also be viewed in the Azure ML workspace.
The best model is registered.
The following notebooks, along with the required files and dependencies, were uploaded to the Notebooks section of the Azure ML workspace. The Python (Azure ML) kernel was used to run them.
For the HyperDrive experiment, RandomParameterSampling was employed over the following parameters:
- C: inverse of regularization strength (smaller values cause stronger regularization), sampled from a Uniform distribution between 0.1 and 1.0.
- max_iter: maximum number of iterations to converge, sampled from a Choice of 50, 100, 150, 200, and 250.
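The sampling space above can be expressed with the azureml SDK v1 roughly as follows; the `--C` and `--max_iter` argument names assume the training script accepts them as command-line parameters.

```python
from azureml.train.hyperdrive import RandomParameterSampling, uniform, choice

param_sampling = RandomParameterSampling({
    "--C": uniform(0.1, 1.0),                      # inverse of regularization strength
    "--max_iter": choice(50, 100, 150, 200, 250),  # iteration budget for convergence
})
```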
The following can be used to show the progress of the HyperDrive run.
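In a notebook, progress can be displayed interactively with the RunDetails widget (azureml SDK v1); `hyperdrive_run` is assumed to be the submitted run object.

```python
from azureml.widgets import RunDetails

# Renders a live dashboard of child runs and their metrics in the notebook
RunDetails(hyperdrive_run).show()
```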
The following figure shows the various trials made by HyperDrive; their metric results can be seen in the dashboard.
The best model obtained from AutoML was compared with the best model obtained from HyperDrive. Below are the details.
- AutoML best model: VotingEnsemble, with an AUC_weighted of 0.85.
- HyperDrive best model: LogisticRegression with regularization strength C = 0.997 and max_iter = 50, achieving an accuracy of 0.95.
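Since the AutoML model was scored on AUC_weighted and the HyperDrive model on accuracy, the two numbers measure different things and are not directly comparable. A minimal, self-contained sketch (toy labels and scores, not project data) of how the two metrics differ:

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)

def auc(labels, scores):
    """Probability that a random positive is scored above a random negative
    (pairwise definition of ROC AUC; ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(accuracy(labels, scores))  # 0.75: one positive falls below the threshold
print(auc(labels, scores))      # 0.75: three of four positive/negative pairs ranked correctly
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why a single-number comparison between the two models should be read with care.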
The best model from HyperDrive can be obtained as shown below.
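A sketch with the azureml SDK v1: retrieve the best child run of a completed HyperDrive run and register its model. The experiment name, model name, and output path are assumptions and depend on how the training script saved the model.

```python
from azureml.core import Experiment, Workspace
from azureml.train.hyperdrive import HyperDriveRun

ws = Workspace.from_config()
experiment = Experiment(ws, "stroke-hyperdrive")            # hypothetical experiment name
hyperdrive_run = HyperDriveRun(experiment, run_id="<run-id>")  # an existing, completed run

# Best child run according to the configured primary metric
best_run = hyperdrive_run.get_best_run_by_primary_metric()
print(best_run.get_metrics())

# Register the best model from the run's outputs folder
model = best_run.register_model(model_name="hyperdrive-best-model",
                                model_path="outputs/model.joblib")
```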
The HyperDrive best model, a LogisticRegression classifier, was deployed as an endpoint. To query the endpoint for predictions, use the sample input in `data` and the Python snippet below, replacing `url` with the REST URL copied from the deployed model endpoint.
import json
import urllib.request

# One sample record with eleven feature values, matching the training schema
data = {"data": [[0, 0, 0, 0, 0, False, 0, 0, 0, 0, 0]]}
body = str.encode(json.dumps(data))

url = 'Rest url of the endpoint'  # replace with the REST URL of the deployed endpoint
headers = {'Content-Type': 'application/json'}

req = urllib.request.Request(url, body, headers)
response = urllib.request.urlopen(req)
result = response.read()
print(result)
The best model from HyperDrive is deployed as an endpoint. The deployment must be in a Healthy state for it to serve requests.
The prediction result of the request made to the HyperDrive endpoint is shown below.
Please refer to the screencast below for more details:
https://www.youtube.com/watch?v=Fjs2wnb_BH4
Finally, do not forget to delete the service to avoid unnecessary charges.
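Cleanup can be done from a notebook with the azureml SDK v1; the service name here is an assumption and should match the name used at deployment.

```python
from azureml.core import Workspace
from azureml.core.webservice import Webservice

# Retrieve the deployed web service by name and delete it
ws = Workspace.from_config()
service = Webservice(ws, name="stroke-prediction-service")  # hypothetical service name
service.delete()
```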
In future iterations, the project could be enhanced by:
- Exploring advanced ensemble techniques for better model performance.
- Implementing model monitoring and retraining strategies to ensure continuous improvement in prediction accuracy as the underlying data changes over time.