Running machine learning models in production introduces all kinds of interesting questions:
- How do we build resilient systems when we don’t own the compute we run on?
- How do we make consumer feedback and accurate telemetry first-class concerns for machine learning products?
- Machine learning maturity models often prescribe training on production data: inputs that could not only break the system but also change it in hard-to-track ways. How can we de-risk this for a production model?
- Traditional software is expected to do the same thing every time, but machine learning models often aren’t deterministic. How do we gauge whether a system is working when we don’t know exactly what output a working system should produce?
This talk will explore those questions and outline some solutions for teams to consider. You’ll see diagrams and possibly some example code, but the concepts remain language-agnostic and applicable to any ML stack.
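As one concrete illustration of the last question above, here is a minimal sketch of tolerance-based testing for a non-deterministic model: rather than asserting an exact output, sample repeatedly and check that the aggregate behaviour stays inside an agreed band. The `predict` function here is a hypothetical stand-in for a real model call, not part of any particular library.

```python
import random
import statistics


def predict(features):
    """Hypothetical, non-deterministic model call (stand-in for a real model)."""
    # Simulated score with run-to-run noise.
    return 0.8 + random.gauss(0, 0.02)


def test_prediction_within_tolerance():
    # Instead of asserting one exact value, sample many predictions and
    # assert that their mean stays within an agreed tolerance band.
    scores = [predict({"feature": 1.0}) for _ in range(100)]
    mean_score = statistics.mean(scores)
    assert 0.7 <= mean_score <= 0.9, f"mean score {mean_score:.3f} drifted out of band"


if __name__ == "__main__":
    test_prediction_within_tolerance()
    print("ok")
```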