How a machine learning engineer thinks (before ChatGPT)
Part three of a four-part series. Before we get to how a Product Manager thinks about using machine learning in products, we need to understand how a Machine Learning engineer thinks.
When they say machine learning, they mean creating a model of something. Not a physical model like a toy car, but a model of how something behaves. For example, we all carry a model that when the sun sets, it must next come up and bring back light.
A machine learning engineer creates (aka trains) a model based on the patterns and relationships it finds in the data you give it. Give it data as an input, and it predicts an output. Give a model a time of day, and it'll tell you whether the sun is up or down.
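To make that concrete, here's a minimal sketch of that sun-up-or-down model, assuming Python with scikit-learn. The hours and the 6am-to-6pm daylight window are made up for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Training information: hour of day (input) -> sun up or down (output).
# The roughly 6am-6pm daylight window is an illustrative assumption.
hours = [[0], [3], [5], [7], [10], [12], [15], [18], [21], [23]]
sun_is_up = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]  # 1 = up, 0 = down

# "Training" finds the pattern (up between roughly 6 and 18) in the data.
model = DecisionTreeClassifier().fit(hours, sun_is_up)

# Give the model a time of day, and it predicts whether the sun is up.
print(model.predict([[13]]))  # -> [1], sun is up
print(model.predict([[2]]))   # -> [0], sun is down
```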
Before you can create a model of your information, remember to do steps 1-4 in How a data scientist thinks:
1. Understand your situation, challenge, and the outcome if you solve it
2. Prepare information (aka data) related to it
3. Understand your information: how it's spread out, where it's concentrated, potential issues
4. Iteratively fix the bad and vague information
Now you're ready to start the modelling part: "Answer your Questions through analyzing and modelling your information"
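As a quick refresher on step 3 above, here's a minimal sketch of "understanding your information", assuming Python with pandas and an illustrative house-listings table:

```python
import pandas as pd

# Illustrative data: a few house listings, one with a missing price.
listings = pd.DataFrame({
    "size_sqft": [700, 950, 1200, 5000, 800],
    "price": [210_000, 280_000, 350_000, 1_900_000, None],
})

print(listings.describe())    # how values are spread out and concentrated
print(listings.isna().sum())  # potential issues: missing values per column
```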
Steps involved in creating a model of your information
Part Art: the Product Manager with the Data Scientist
1. Choose the most relevant information to what you're trying to predict
Part Art & Science
2. Leverage an existing model or create a new one for each new question you want to answer
3. Select how you'll model it
The structure and process your information will flow through
4. Split your information into three (for training, validating, and testing)
You’ll check your model each time you train it, including the final one, against information it hasn’t seen yet
5. Train your model (using the gradient descent technique) on your training information
6. Measure how close it predicts what you expect using your validation information
7. Tweak what you chose in steps 1 and 3 if its predictions are not good enough, and retrain your model in step 5
8. You have your final model: measure how close it predicts what you expect using your testing information
You'll again tweak what you chose in steps 1 and 3 if its predictions are not good enough, and retrain your model in step 5
9. Deploy your model
Here’s how you do each step
Part Art: the Product Manager with the Data Scientist
1. Choose the most relevant information to what you're trying to predict
- Assess which information (e.g. time, price, location, aka features)
  - has the most influence on what you're trying to predict
  - helps separate highly different outcomes (e.g. cat, not cat)
- Start with more features than you think you'll need
  - Include enough that a human expert in the domain (e.g. finance) could confidently predict the outcome when given only those features
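One common way to assess which features have the most influence, as a minimal sketch assuming scikit-learn, illustrative data, and made-up feature names:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Illustrative data: 200 examples, 4 candidate features, 1 binary outcome.
X, y = make_classification(n_samples=200, n_features=4, n_informative=2,
                           n_redundant=1, random_state=0)
features = ["time", "price", "location", "size"]  # made-up names

# Mutual information scores how much each feature tells you about the outcome.
scores = mutual_info_classif(X, y, random_state=0)
for name, score in sorted(zip(features, scores), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")  # higher score = more influence
```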
Part Art & Science
2. Leverage an existing model or create a new one for each question you want to answer
- Before building a new model from scratch, check whether an existing model can be adapted to the question you're trying to answer
- Consider using transfer learning or model adaptation techniques, as in the sketch below
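A minimal sketch of transfer learning, assuming PyTorch and a recent torchvision: reuse a model someone already trained on millions of images, and only retrain its final layer for your own question (here, the cat / not-cat example):

```python
import torch.nn as nn
from torchvision import models

# Start from a model already trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the existing layers so their learned patterns are kept as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace only the final layer with one for your two outcomes (cat, not cat).
model.fc = nn.Linear(model.fc.in_features, 2)
# Training now only updates this last layer: far less data and time needed.
```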
3. Select how you'll model it
- Choose your model architecture (e.g. CNN, RNN), i.e. how many huge matrices your information will run through during training
- Choose your loss function, i.e. the algorithm that calculates how close the model's predictions are to the outputs you expect
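A minimal sketch of those two choices, assuming PyTorch: a small architecture (the matrices your information flows through) plus a loss function that scores predictions against expected outputs. The layer sizes are illustrative:

```python
import torch
import torch.nn as nn

# Architecture choice: two matrices (linear layers) the input flows through.
model = nn.Sequential(
    nn.Linear(4, 16),  # 4 input features -> 16 intermediate values
    nn.ReLU(),
    nn.Linear(16, 2),  # -> scores for 2 outcomes (e.g. cat, not cat)
)

# Loss function choice: how far predictions are from the expected outputs.
loss_fn = nn.CrossEntropyLoss()

example_input = torch.randn(1, 4)    # one example with 4 features
expected_output = torch.tensor([1])  # the outcome we expect
loss = loss_fn(model(example_input), expected_output)
print(loss.item())  # a single number: lower means closer to what you expect
```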
4. Split your information into three (for training, validating, and testing)
- One part for training the model, say 60%
- One part for validating the model's predictions on each training run, say 20%
  - Use it to tune your model's parameters and select the best-performing version
  - Why not train on it? Because it tells you how well your model generalizes to information it hasn't seen yet, and keeps it from memorizing your training information instead of learning generalizable patterns
- One part for testing your final model's predictions, say 20%
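A minimal sketch of that 60/20/20 split, assuming scikit-learn: two successive splits, since train_test_split only splits in two at a time:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(100).reshape(100, 1), np.arange(100)  # illustrative data

# First carve off 40% to share between validation and testing...
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)
# ...then split that 40% in half: 20% validation, 20% testing.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```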
5. Train your model (using the gradient descent technique) on your training information
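A minimal sketch of gradient descent itself, assuming plain NumPy and a one-feature linear model; the underlying pattern (y = 3x) is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 50)
y = 3 * x + rng.normal(0, 0.1, 50)  # data whose hidden pattern is y = 3x

w = 0.0             # the model: a single weight, starting from a guess
learning_rate = 0.1

for epoch in range(200):               # training runs (epochs)
    predictions = w * x
    error = predictions - y
    gradient = 2 * np.mean(error * x)  # slope of the loss with respect to w
    w -= learning_rate * gradient      # step downhill against the gradient

print(w)  # close to 3: the model has learned the pattern from the data
```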
6. Measure how close it predicts what you expect using your validation information
- Manually examine the errors on examples in your validation information, and try to spot a trend in where most of the errors were made
- Get a single numerical value for how close your model comes to what you expect on data it hasn't seen
  - Use statistical measures like accuracy, precision, recall, F1 score, etc. (sketched below)
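A minimal sketch of those measures, assuming scikit-learn and made-up validation labels and predictions:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Illustrative validation results: what you expected vs. what the model said.
expected  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print("accuracy: ", accuracy_score(expected, predicted))   # overall correctness
print("precision:", precision_score(expected, predicted))  # when it says "cat", how often is it right?
print("recall:   ", recall_score(expected, predicted))     # of all actual cats, how many did it find?
print("f1:       ", f1_score(expected, predicted))         # balance of precision and recall
```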
7. Tweak what you chose in steps 1 and 3 if its predictions are not good enough, and retrain your model in step 5
- Reduce the # of features (price, size, location, etc.)
- Adjust:
  - Learning rate: think of it as a tuning knob. A higher learning rate can train faster but is less stable, and the model may never settle on good predictions. A lower learning rate is more stable but slower, and training may stop before the model learns anything useful
  - Number of training runs (aka epochs): the number of complete passes through the training information. More passes let your model learn more complex patterns, but can also lead to overfitting, i.e. memorizing the training information and predicting badly on information it hasn't seen
  - Regularization: it helps prevent the model from memorizing the training information and encourages learning patterns that generalize to new data
  - Loss function: start with a simple one you can implement quickly
- Go back to step 5
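A minimal sketch of that tweak-and-retrain loop, assuming scikit-learn's GridSearchCV: it tries several regularization strengths and keeps the one that measures best (cross-validation stands in for a fixed validation set here):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, random_state=0)  # illustrative data

# Candidate tweaks: C is regularization strength (smaller = more regularized).
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,  # cross-validation: repeated train/validate splits
)
search.fit(X, y)  # trains and re-measures the model for each tweak

print(search.best_params_, search.best_score_)  # the best tweak and its score
```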
8. You have your final model: measure how close it predicts what you expect using your testing information
- Go back to step 6 if its predictions are not good enough
9. Deploy your model
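Finally, a minimal sketch of one common deployment path, assuming scikit-learn and joblib: save the trained model as a file, then load it wherever predictions are served:

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)  # illustrative data
model = LogisticRegression(max_iter=1000).fit(X, y)

joblib.dump(model, "model.joblib")  # ship this file with your service

served_model = joblib.load("model.joblib")  # inside the serving code
print(served_model.predict(X[:1]))          # input in -> prediction out
```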
Credits
- My learning from: Coursera Machine Learning, Coursera Machine Learning Foundations for Product Managers, Situation Challenge Questions Answers (SCQA framework)
- My editors fixing gaps in my understanding: ChatGPT, Gemini