you have been doing ml system design interviews wrong
The questions you should ask, before you even begin
why am I writing this?
ML pays well. ML roles have a strict interview process. ML System Design is needed for L4+ roles. You need to be good at this, period.
I am well aware of the countless system design blogs/youtube videos/podcasts out there. For most of the interviews, those are sufficient. I will attach some resources at the end for convenience. But my intention is to NOT teach you those. You can figure that out yourself :) My goal here is to help you not trip over yourself.
I have sat on both sides of the table for quite a few ML interviews, and helped several of my friends crack good offers (they were cracked people already). Folks usually disregard the behavioral aspect of system design interviews. I am writing down the exact things that have helped me in these rounds. They really help you hit the home run in the last mile. I hope this reaches a wider audience, versus the endless WhatsApp calls that I have found myself giving.
setting the stage.
This is your 3rd round. You probably aced the coding round, flipping some BST, or running around a graph. Good. Now your interviewer hits you with
design a system which takes the position of the sun as the input and predicts whether the Italian Stock Market will go up or down.
There are 3 ways you can go from here:
“I will be using the latest and greatest research in sequential models and slap it on the Sun coordinates. Put it on some GPUs. Profit.”
Go into framework-level details: “I will use CUDA-optimized kernels to put my Logistic Regression model on.”
“1) What” expression. (This was me)
Unfortunately, all 3 approaches will land you on the rejected pile.
so then what’s it about?
I will probably be hated for saying this out loud, and honestly, interviewers really don’t want you to know this, but system design interviews are a more sophisticated version of a ✨ vibe check ✨. Yes, they are technical behavioral rounds in disguise.
There is just no way anyone in the room can know whether a design will actually work or not. So as long as your basics and your assumptions are valid, your answer is acceptable.
“I will use F1 score instead of precision, because I believe recall is equally important” - Works ✅
“I will use precision instead of F1 score, because I believe precision is more important here. But I will keep track of recall separately” - Also works ✅
“I will use precision” - this doesn’t work. ❌
“I will approach this question…” - this doesn’t work either. ❌
By the end of the blogpost, I hope you will find out why they don’t work.
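To make the precision vs F1 statements above a bit more concrete, here is a minimal sketch in Python. The two classifiers and all of their counts are made up by me for illustration, and the function name is hypothetical. The only point is that a model can look great on precision while missing almost every positive, which is why “I will use precision” needs a reason attached to it.

```python
from typing import NamedTuple


class Metrics(NamedTuple):
    precision: float
    recall: float
    f1: float


def precision_recall_f1(tp: int, fp: int, fn: int) -> Metrics:
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(precision, recall, f1)


# Hypothetical counts for two "market goes up" classifiers (not real data).
# The cautious model rarely predicts "up", but is usually right when it does.
cautious = precision_recall_f1(tp=9, fp=1, fn=91)
# The balanced model predicts "up" more often and catches most of the up days.
balanced = precision_recall_f1(tp=70, fp=30, fn=30)

for name, m in (("cautious", cautious), ("balanced", balanced)):
    print(f"{name}: precision={m.precision:.2f} "
          f"recall={m.recall:.2f} f1={m.f1:.2f}")
```

If you only report precision, the cautious model wins. If you report F1, or precision with recall tracked on the side, the picture changes. Which one is “right” depends on the assumption you state out loud.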
then how do we approach?
When you hear the question for the first time, just don’t say anything. Take a 10-15 second pause and use that time to think about the question. You can use filler words like “hmmm…”, “I see…” etc. Use this time to think about why this is even a problem statement.
Using the earlier example of
design a system which takes the position of the sun as the input and predicts whether the Italian Stock Market will go up or down.
poking the problem statement
Here, you need to take a moment and recreate the thought process of your interviewer. Why was this problem given? Why is this even a problem? The better you understand why the interviewer gave you this question, the higher the chance that you will give the answer they want.
My approach is to understand the problem setting. Why is the sun in the picture? Why do we care about the Italian Stock Market? If it helps you think, divide the question into sub-topics and understand why each of them made it into the question.
Here, the interviewer may say things like “We are only active during day time, so we only have the sun’s position as input”. Now you know why the sun made it into the question: they are unable to stay awake at night.
If you solve staying awake at night, then you have changed the rules. Now assuming moon coordinates are easier (for whatever reason), you just made the problem easier to solve.
Takeaway: Don’t treat the problem statement as the starting point. Treat what caused the problem as the starting point.
Now, most of the time, you might end up right back at the problem statement. After asking for all the details, you will go “Oh okay, that’s why the Sun is needed”. And that is COMPLETELY okay. The interviewer is very impressed (even if they don’t show it) that you took the time to understand the context.
asking what metrics we care about
What cannot be measured, cannot be improved
Your next question should be: what is our success metric? What do the stakeholders care about? You will be surprised by the number of times your metric doesn’t align with the metrics they care about, even with the same problem statement.
Asking this question also shows that you are capable of thinking about the problem through the lens of the people who are paying you to solve it. A very L5+ quality that not a lot of engineers have.
You can take some time to suggest some metrics of your own and have a discussion around that.
Takeaway: Never assume the metrics that will be tracked. Always ask. In interviews, as well as life in general :)
production constraints
After that, the next thing you would want to know is the production constraints. Some of the questions to ask here are:
Is it deployed on the edge or cloud?
How many queries per second/day is it expected to serve?
How often does the data go stale? (This will let you know if you want to design an automated mechanism to retrain the model)
If you don’t ask any of these questions before you begin solving the question, you are leaving points on the table. And, trust me, you don’t want to propose a solution and have the interviewer hit you back with “but we cannot do this”. This shows that you started solving the problem without knowing everything there is to know. Very L3 of you.
Takeaway: First know all the cards (constraints) you are dealt, before playing your hand.
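If it helps, here is a rough sketch of how the answers to those three questions can turn into numbers you can reason about on the whiteboard. Everything in it is an assumption of mine for illustration: the class and function names are hypothetical, and the traffic figures, the peak-to-average ratio, the per-replica throughput and the 30 day manual retraining threshold are made-up placeholders, not recommendations.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class ProductionConstraints:
    deployment: str              # "edge" or "cloud"
    queries_per_day: int         # what the interviewer told you
    peak_to_average: float       # assumed ratio of peak to average traffic
    data_stale_after_days: int   # how quickly the data goes stale


def average_qps(c: ProductionConstraints) -> float:
    """Average queries per second implied by the daily volume."""
    return c.queries_per_day / SECONDS_PER_DAY


def replicas_needed(c: ProductionConstraints, qps_per_replica: float) -> int:
    """Back-of-envelope replica count needed to survive peak traffic."""
    peak_qps = average_qps(c) * c.peak_to_average
    return max(1, math.ceil(peak_qps / qps_per_replica))


def needs_automated_retraining(c: ProductionConstraints,
                               manual_retrain_every_days: int = 30) -> bool:
    """If data goes stale faster than you can retrain by hand, automate it."""
    return c.data_stale_after_days < manual_retrain_every_days


# Hypothetical answers from the interviewer.
constraints = ProductionConstraints(
    deployment="cloud",
    queries_per_day=8_640_000,
    peak_to_average=3.0,
    data_stale_after_days=7,
)

print(average_qps(constraints))                            # average load
print(replicas_needed(constraints, qps_per_replica=50.0))  # serving capacity
print(needs_automated_retraining(constraints))             # retraining pipeline?
```

You would never write this in the interview, but saying the same arithmetic out loud shows that your design follows from the constraints and not the other way around.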
now you can start solving the problem
and this will help you think creatively. There are no limitations in a system design round. The only limitation is your creativity. Be creative. One easy way to do this is to borrow ideas from other non-ML systems, for example how distributed systems work or how a database works. As far as the ML algorithms are concerned, it goes without saying that you need to understand them inside out. You get bonus points if you tweak an already known algorithm to fit the problem statement.
ending note
ALWAYS, and I really mean, ALWAYS end with some ideas of how you can improve this system. Even if you don’t come up with concrete methods to improve it, just an attempt or maybe exploring some trade-off, will go a long long way.
some ml system design resources
These are some of the resources I use for practice
https://www.educative.io/courses/machine-learning-system-design (behind a paywall sadly, but covers everything)
[Book club] Designing Machine Learning Systems (if you have a ton of time)
plug ;)
If you like what I write, please consider subscribing. I write about technical topics in ML and topics which can help you in your next ML interview. My job at Cerebras involves pretraining and finetuning LLMs so I write about that. You can find my blog about how attention maps behave during interpolation here.
A small introduction about me: