Applied ML Summit — Highlights
Here are some of my highlights from Google Cloud’s Applied ML Summit:
- Use reinforcement learning to optimize for long-term metrics
- Vertex AI for your full ML pipeline
- Trends in machine learning
Use reinforcement learning to optimize for long-term metrics
Spotify’s Head of Machine Learning, Tony Jebara, explained how Spotify want to optimize their service so that users are still listening many months from now.
Many machine learning models optimize for short-term outcomes. At Spotify, they could keep queuing up songs they know you already enjoy, but over time you’d probably get bored of them. Instead, they want to introduce you to new artists and songs that deepen their relationship with you and keep you engaged with the service. Reinforcement learning is apparently a great fit here, because you can design reward functions that optimize for longer-term happiness.
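To make the short-term vs. long-term distinction concrete, here is a minimal, made-up sketch (not Spotify's actual system; the user model and all the numbers are invented for illustration): a greedy policy that only plays familiar songs maximizes immediate reward, but a policy that mixes in new artists comes out ahead on the discounted return that an RL agent would optimize.

```python
# Toy illustration (not Spotify's system): comparing the discounted return of a
# "play only familiar songs" policy against one that also introduces new artists.
# The reward numbers and user dynamics below are entirely made up.

GAMMA = 0.99     # discount factor: how much future listening is worth today
HORIZON = 500    # number of recommendation steps to simulate


def step(action, boredom, breadth):
    """Toy user model: familiar songs pay off now but raise boredom;
    new artists pay less now but broaden taste, boosting future rewards."""
    if action == "familiar":
        reward = max(0.0, 1.0 - boredom) + 0.2 * breadth
        boredom += 0.01
    else:  # "new_artist"
        reward = 0.3 + 0.2 * breadth
        breadth = min(1.0, breadth + 0.02)
    return reward, boredom, breadth


def discounted_return(policy):
    """Sum of gamma^t * reward_t, the quantity an RL agent optimizes."""
    boredom, breadth, total = 0.0, 0.0, 0.0
    for t in range(HORIZON):
        reward, boredom, breadth = step(policy(t), boredom, breadth)
        total += (GAMMA ** t) * reward
    return total


greedy = lambda t: "familiar"                                  # maximizes immediate reward
mixed = lambda t: "new_artist" if t % 5 == 0 else "familiar"   # regularly explores new artists

print("greedy policy return:", round(discounted_return(greedy), 1))
print("mixed policy return: ", round(discounted_return(mixed), 1))
```

Under these (invented) dynamics the greedy policy's rewards decay to nothing as boredom builds up, while the mixed policy keeps earning reward for the whole horizon, which is exactly the kind of long-term effect a short-horizon objective would miss.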
Spotify analyze the user journeys of their happiest users: how they listen to different tracks/stations/podcasts, how they transition between them, how they re-engage with the service, and so on.
They use those journeys as training data to create their models, which can suggest similar journeys to users who haven’t engaged as heavily with the service.
It’s hard to know how real users will respond to a new model, so Spotify simulate how they expect a user to behave (the simulated user is the “agent” performing the actions), and use those simulations to test new policies before releasing them to production as A/B tests.
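Here is a minimal sketch of that offline-testing idea, assuming a completely invented user model and retention metric (a generic illustration, not Spotify's simulator): candidate policies are scored against a population of simulated users, and only the promising ones would graduate to a real A/B test.

```python
# Minimal sketch of offline policy testing against simulated users.
# The user model, policies, and retention metric are invented for the example.
import random

random.seed(0)


def simulated_user():
    """A made-up user: a hidden appetite for novelty drawn at random."""
    return {"novelty_appetite": random.random(), "satisfaction": 0.5}


def session(user, policy):
    """One listening session under a policy; update the user's satisfaction."""
    novelty = policy()  # fraction of new-artist recommendations this session
    fit = 1.0 - abs(novelty - user["novelty_appetite"])
    user["satisfaction"] = 0.8 * user["satisfaction"] + 0.2 * fit


def retention_rate(policy, n_users=2000, n_sessions=30):
    """Fraction of simulated users still 'satisfied' after n_sessions."""
    retained = 0
    for _ in range(n_users):
        user = simulated_user()
        for _ in range(n_sessions):
            session(user, policy)
        retained += user["satisfaction"] > 0.5
    return retained / n_users


current_policy = lambda: 0.1    # mostly familiar songs
candidate_policy = lambda: 0.4  # mixes in more new artists

print("current policy retention:  ", retention_rate(current_policy))
print("candidate policy retention:", retention_rate(candidate_policy))
# Only if the candidate looks better in simulation would it graduate to a
# real A/B test in production.
```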
It was interesting to hear about this use case for reinforcement learning!
Vertex AI for your full ML pipeline
In May 2021 Google launched Vertex AI, a suite of tools that lets you manage datasets; build, train, and tune models; deploy them to production; and monitor their performance.
You can train your model using AutoML (where the platform figures out most of the details for you, so you need to know very little about machine learning), or you can build custom models from scratch.
I’ve seen AutoML demoed a couple of times at prior Google I/Os, so I was very interested to see that it seems to support tabular and text datasets too.
Most AutoML demos I’ve seen show image classification examples, which are fun to watch, but my suspicion is that professional software engineers would get more out of seeing realistic tabular/text problems solved.
The Vertex AI AutoML docs say that for tabular data it supports regression models, classification models, and forecasting models — which sounds very useful.
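As a rough sketch of what this looks like with the Vertex AI Python SDK (google-cloud-aiplatform), here is an AutoML tabular classification job; the project ID, bucket path, and column names are placeholders, and the exact parameters are worth checking against the current docs.

```python
# Rough sketch of training a tabular classification model with Vertex AI AutoML
# via the google-cloud-aiplatform SDK. Project, bucket, and column names are
# placeholders; regression and forecasting use different job classes.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Create a managed tabular dataset from a CSV in Cloud Storage.
dataset = aiplatform.TabularDataset.create(
    display_name="churn-data",
    gcs_source=["gs://my-bucket/churn.csv"],
)

# Configure an AutoML training job for classification.
job = aiplatform.AutoMLTabularTrainingJob(
    display_name="churn-automl",
    optimization_prediction_type="classification",
)

# Train, specifying the target column and a compute budget.
model = job.run(
    dataset=dataset,
    target_column="churned",
    budget_milli_node_hours=1000,
)

# Deploy the trained model to an endpoint for online predictions.
endpoint = model.deploy(machine_type="n1-standard-4")
```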
AWS have similar AI Services, including SageMaker Autopilot (an AutoML capability for tabular data), Amazon Forecast for forecasting, and many others.
I’d be very interested to hear how successful these AutoML products are on real-world datasets, and whether there are any significant differences between the big cloud platforms.
Trends in machine learning
The closing keynote talked about general trends like:
- Bigger and better libraries of components that can be combined.
- Reduced manual effort: let tools perform hyperparameter tuning for you, or even use AutoML to do all the ML for you.
- Using the cloud to run massively parallel training and ML pipelines.
- Deploying models to mobile and the web with TensorFlow Lite and TensorFlow.js (see the sketch after this list).
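As a small illustration of that last point, here is a sketch of converting a trained TensorFlow SavedModel to TensorFlow Lite (the model path is a placeholder); TensorFlow.js has its own separate converter for running models in the browser.

```python
# Sketch of converting a trained TensorFlow SavedModel to TensorFlow Lite
# for on-device deployment. The model path is a placeholder.
import tensorflow as tf

# Load the SavedModel exported after training and convert it.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional: quantize/shrink
tflite_model = converter.convert()

# Write the flatbuffer that ships inside a mobile or embedded app.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```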
Overall, I enjoyed many of the talks, and it’s exciting to see how fast the field of ML is moving along!