Jupyter is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modelling, data visualization, machine learning.
A Jupyter Notebook provides a Python runtime environment and a workspace that together can be used to pull in machine learning datasets and carry out tasks such as those listed above.
In the managed Jupyter notebook console, you can upload Jupyter notebook application files that contain descriptive steps as well as code to execute. You can also use the workspace to upload data for processing and formatting for a SageMaker training job.
A prebuilt Jupyter notebook file is provided for this lab.
Right-click this Jupyter-Analogue-Gauge-Application link and select ‘Save Link As’ to download the file to a local directory.
In the Jupyter notebook console, select Upload and choose the ‘analogue-gauge-augmented-manifest-training.ipynb’ file that you just saved locally.
Then click the Upload button inline with the file you selected.
Once the file is uploaded, click on the file name:
You will be prompted to find a kernel matching Python 3.
This will take you to the Jupyter Notebook execution environment, where you will see a page similar to the following:
The Jupyter notebook consists of a number of cells that contain either code or descriptive text elements. The code cells can be executed by selecting them and clicking the Run button.
The notebook will start running and highlight each section as you progress. The first notes section, Initialise project variables, is highlighted.
Read through the initial text cell description then select the first code cell under Initialise project variables.
You should have created an S3 bucket in lab0. If not, create one now, as this lab requires it.
In the Initialise project variables code cell you only need to update two variables: bucket_name and job_name_prefix.
Click into the cell and modify their values as shown below.
# Update the value below to your lab team's S3 bucket
bucket_name = 'custom-labels-console-us-east-1-67a9361d65' # Replace this value with your lab team's bucket name.
# Create unique job name
job_name_prefix = 'analogue-gauge-[YOUR TEAM NAME]' # Replace '[YOUR TEAM NAME]' with your lab team name.
You are now ready to execute the first cell. It creates some user-defined parameters in the Python environment that are needed to create the training job.
Make sure the code cell below the Initialise project variables cell is selected (it will be highlighted if so).
Before the cell has been run for the first time you will see the indicator next to the cell: In [ ]:
Always watch the cell indicator after executing. Don’t move to the next step while the cell is still executing, as shown by the indicator In [ * ]. Some of the code cells can take a while to complete.
Once the variables described above are set and the Initialise project variables code cell is selected, click the Run button.
After a few seconds the cell will complete and the cell indicator will change to In [ 1 ].
The output of the code execution will be similar to the following:
Training Job Name: analogue-gauge-727949722849-2021-01-05-09-58-24
S3 Bucket: deeplens-analogue-gauge-727949722849
Augmented manifest for training data: s3://deeplens-analogue-gauge-727949722849/manifests/train.manifest
Augmented manifest for validation data: s3://deeplens-analogue-gauge-727949722849/manifests/validate.manifest
Output training data path: s3://deeplens-analogue-gauge-727949722849/training-output
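The sample output above suggests how the notebook could derive its names and paths from the two variables you set. The following is a minimal sketch, not the notebook's actual code. It assumes the job name is the prefix followed by a UTC timestamp, and that the key names manifests/train.manifest, manifests/validate.manifest and training-output are fixed as in the sample output. The helper name build_job_config is hypothetical.

```python
import time


def build_job_config(bucket_name, job_name_prefix, now=None):
    """Derive the training job name and S3 paths from the two lab variables."""
    # Assumption: UTC time is used; the notebook may use local time instead.
    if now is None:
        now = time.gmtime()
    # Timestamp layout mirrors the sample output, e.g. 2021-01-05-09-58-24.
    stamp = time.strftime('%Y-%m-%d-%H-%M-%S', now)
    base = f's3://{bucket_name}'
    return {
        'job_name': f'{job_name_prefix}-{stamp}',
        'train_manifest': f'{base}/manifests/train.manifest',
        'validation_manifest': f'{base}/manifests/validate.manifest',
        'output_path': f'{base}/training-output',
    }


if __name__ == '__main__':
    # Placeholder values; substitute your own bucket and team name.
    config = build_job_config('my-team-bucket', 'analogue-gauge-my-team')
    for name, value in config.items():
        print(f'{name}: {value}')
```

Keeping the derived values in one place makes it easy to check that the bucket name is correct before any training job is started.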
Continue reading and working your way through the Jupyter Notebook with the instructions below.
This section creates the AWS SDK objects for interacting with the AWS environment and reads in the Execution Role. Click the Run button.
Once executed you should see an output showing the Execution Role ARN.
Execution Role: arn:aws:iam::727949722849:role/service-role/AmazonSageMaker-ExecutionRole-20210105T143943
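For orientation, a cell of this kind typically looks something like the sketch below. This is an assumption about the general pattern, not a copy of the notebook's code: load_sdk_objects and split_role_arn are hypothetical helper names, and load_sdk_objects only works where the boto3 and sagemaker packages are installed and an execution role is attached, such as on a SageMaker notebook instance.

```python
def load_sdk_objects():
    """Create the SDK objects and read the execution role (SageMaker notebooks only)."""
    # Imported lazily so the rest of this sketch runs without the AWS SDKs installed.
    import boto3
    import sagemaker

    role = sagemaker.get_execution_role()  # ARN of the IAM role attached to the notebook
    session = sagemaker.Session()          # wraps calls to the SageMaker API
    s3 = boto3.resource('s3')              # S3 access, e.g. for uploading data
    return role, session, s3


def split_role_arn(role_arn):
    """Return (account_id, role_name) from an IAM role ARN."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = role_arn.split(':', 5)
    resource = parts[5]  # e.g. role/service-role/<role name>
    return parts[4], resource.rsplit('/', 1)[-1]


if __name__ == '__main__':
    arn = 'arn:aws:iam::727949722849:role/service-role/AmazonSageMaker-ExecutionRole-20210105T143943'
    account_id, role_name = split_role_arn(arn)
    print(f'Account: {account_id}, role: {role_name}')
```

Splitting the ARN is a quick way to confirm that the role belongs to the account you expect before continuing.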
This notes section gives a detailed description of the augmented manifest file that we use to describe the training data set and for data labelling. There is no code to execute in this section.
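As background, an augmented manifest is a JSON Lines file: one JSON object per line, each with a source-ref key pointing at an object in S3 plus one or more label attributes. The sketch below builds such lines under stated assumptions. The attribute name gauge-label, the image keys, the single numeric label and the helper to_manifest_lines are all hypothetical and may differ from what this lab's notebook uses.

```python
import json


def to_manifest_lines(bucket_name, labelled_images, label_attribute='gauge-label'):
    """Build augmented manifest lines from (s3_key, label) pairs."""
    lines = []
    for key, label in labelled_images:
        record = {
            'source-ref': f's3://{bucket_name}/{key}',  # location of the image in S3
            label_attribute: label,                     # hypothetical label attribute
        }
        # One compact JSON object per line, with no enclosing list.
        lines.append(json.dumps(record))
    return lines


if __name__ == '__main__':
    # Placeholder bucket and keys, for illustration only.
    samples = [('images/gauge-001.jpg', 0), ('images/gauge-002.jpg', 1)]
    for line in to_manifest_lines('my-team-bucket', samples):
        print(line)
```

Because each line is an independent JSON object, the file can be streamed to the training job record by record.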