Becoming a Research Engineer at a Big LLM Lab -- 18 Months of Strategic Job Hunting
Note: This is also published on Substack (where you can give me your mail if you want emails whenever I publish something).
A couple of days ago, I signed as a research engineer with Mistral, one of the few ML foundation model labs with more than a billion dollars in funding.
My excitement on Twitter found quite some resonance — partly in the form of questions for advice. Getting here was not an accident. I have strategically worked towards this outcome for an extended period, and I have a few things to share about what worked for me. In a sense, this blog post is a sequel to How to become an ML Engineer in 5 to 7 steps, where I covered my self-taught path toward becoming a machine learning engineer from a non-CS (though STEM) background. Here, I outline how I worked towards what I hope will be a career-defining role. I started this work after working in my first ML position for about a year.
This is an account of my personal experiences, which I based on advice I got from friends and found online. I don’t claim it’s original, and my sample is n=1, so cherry-pick what resonates for you. I still hope some find it useful.
Now, without further ado, what follows is a discussion of
- Tactic and Strategy
- My Personal Timeline
- Defining the Goal
- The Application Playbook
- Networks
- Career Momentum
- Application Process Touch Points
- The Mindgame
- Conclusion
Tactic and Strategy
To improve your chances of getting a career-inflecting role, I think there are two different kinds of useful actions you can take: strategic and tactical ones.
Tactical actions are relatively low effort, but with a high return in your specific situation. This may be reading up on the latest news on the company you are interviewing with, doing a couple of LeetCode problems to refresh muscle memory, doing mock interviews, or polishing your CV.
Strategic actions are high effort, high return actions that may even seem fruitless in the specific moment, but in aggregate and compounding, give you a substantial advantage. Think about learning a new technology deeply by building a substantial portfolio project, having significant tenure at a reputable organization, building and maintaining a network, or building a personal brand by talking about your work.
In the long run, it’s strategy that makes a successful career. But in each moment, there is often significant value in tactical work. Being prepared makes a good impression, and failing to get career-defining opportunities just because LeetCode is annoying is short-sighted.
Most advice in the application playbook and touch points sections is tactical. However, getting interviews in the first place takes networking, strategic skill development, and communication. This is what I cover in the networking and career momentum sections. Being clear about goals is necessary to develop an effective strategy and to get into situations where tactics even matter.
If that sounds rather abstract, the discussion of my personal timeline may be helpful to illustrate the difference.
My Personal Timeline
In total, landing what I hope will become a career-defining position at Mistral was an 18-month effort. This includes alternating phases of strategic and tactical work.
It was around April 2024 when I decided I wanted to step up in my career. First, I sought out a career discussion with my direct manager. More responsibility or concrete growth trajectories could have been an immediate and relatively low-threshold way to achieve my goal. However, it became clear that there was no short-term or even mid-term way to grow with my responsibilities.
This is when I started to clarify my goals. If I had to change, where to?
I reached out to friends and used my network to talk to people working in big tech companies, start-ups, scale-ups, FAANG, Big Labs, whatever seemed interesting, and wherever I could get an introduction.
My questions were always similar. How did they like their current position? Do they learn a lot, and what are the growth trajectories? How much work do they have to do? What are the necessary skills, impressive portfolio projects, and what else is necessary to land a similar position?
As a result, I kicked off the first phase of strategic skill development. I invested many hours into LeetCode prep and got a bunch of textbooks to catch up on relevant CS fundamentals; most notably distributed systems, data structures, and algorithms.
It became tactical a couple of months later. I sent my first application in August of 2024 and did pretty well in the process. However, after six rounds, around November, I didn’t make the cut and was quite devastated.
While I had planned to send out more applications, refreshing my ML fundamentals, prepping coding interviews, and completing takehomes, all while working full time, kept me so occupied that I did not manage to.
I resigned to be able to fully focus on getting that next role. Now officially jobless, I got into sending applications full-time.
The results were meager. Many positions I was excited about didn’t even invite me for an interview, and in the few interviews I got, I failed because I was nervous and made a couple of easily avoidable mistakes. Did I just get lucky in making it that far in my first process?
Either way, I had to change my process and decided to get strategic once again. Now full-time, I could make a lot more progress in upskilling and tackle much more ambitious portfolio projects.
That’s when I decided to join Recurse Center — a cohort-based, but self-directed programming retreat. Essentially, it gave me the freedom and space to follow my upskilling, but provided me with an awesome group of people doing the same; it gave me an entry on my CV where I could put all these projects, and it was in New York, which sounded like an adventure (they also offer remote).
Most importantly, though, it was three months long. That is not a timeframe I would have been comfortable committing to without any structure out of fear of being seen as a slacker. However, you can achieve quite a lot in three months of dedicated time focusing on getting better.
So I moved to New York and spent until mid-May 2025 learning Rust, contributing 15 pull requests to highly scrutinized open source codebases (ruff and uv), and writing a research paper with my master’s thesis supervisor at the AI Safety Institute of the German Aerospace Center (my research wasn’t ML related, but the institute name certainly helped to make it more relevant).
Back at home, around June 2025, I got tactical and focused all my efforts on applying.
I had about 60 touchpoints with 40 different companies, got a verbal offer with Mistral (alongside a couple others) mid-August, and finally signed in early September.
It’s easy to put a red thread through this process in retrospect. Rest assured, however, that in the moment it was a messy process with lots of doubt and insecurity about whether I would succeed or had quit my job to become a professional hobbyist.
What helped me mentally and in devising a strategy was having a specific goal.
Defining the Goal
Changing a job is a personal project. You have to own the whole process, need to make judgment calls on where to allocate your work, and whether to accept or decline a specific opportunity. In that respect, job seeking is great. Lack of structure gives you a lot of room for agency, and your decisions have great leverage on your future career.
But the same lack of structure makes the whole process difficult, tedious, and frankly unpleasant to navigate. This navigation is a lot easier when you know where you are going. Knowing your goals helps with motivation, reminding you why you chose to do this when things get tough, and it helps you allocate your resources effectively or even say no and walk away when an opportunity is too much of a compromise.
Setting your goals is deeply personal, and I can’t do this goal-setting work for you. You need to be honest about what drives you and realistic with what you can reasonably achieve. But I can give you my own goals as an example.
I wanted to find
- a career inflecting role,
- where I would build rare and valuable technical (software and ML engineering) skills,
- doing work I enjoy,
- while having ownership and impact, but also
- support from senior peers so I can
- grow into technical leadership, but
- stay an individual contributor for the foreseeable future while living in
- reasonable proximity to the people I care about in a place that I would enjoy living in.
To me, these criteria satisfy two important constraints. They are general enough that there are a reasonable number of roles out there that could be a fit; I’d have more shots on goal. But they are also concrete enough that I can rule out and say no to opportunities that are too much of a compromise.
By contrast, a goal like working at a frontier model lab with billion-dollar-plus funding would have been too narrow: it effectively limits your search to maybe 10 companies, all of which are highly competitive to get into. Conversely, just searching for a job that pays well would have been too broad. I could have even tried to switch into consulting or VC and still hit the mark.
How I wrote my goals helped me to devise a profile of the ideal role. Getting ownership and impact early, having room for growth, and being forced to stay an IC are things that are very common at start-ups.
Access to senior peers, the CV pretty privilege of a well-known brand, and both the need and the resources to train niche engineering skills are found more in established companies. Scale-ups seem like a good middle ground that ticks a lot of these boxes, though there are definitely start-ups and larger corporates that do, too.
These criteria not only helped me find places to apply to, they also gave me the confidence to decline roles that didn’t feel right, for example because I would be a small cog in a big bureaucratic machine, or because the start-up was so early that I would have had to optimize for churning out MVPs to test product hypotheses as opposed to building high-performance low-level systems.
I found jobsearch.dev a helpful resource to work through and come up with these criteria (and generally get into the right headspace for applications).
The Application Playbook
With a goal defined, it’s time to get the ball rolling.
I was introduced to this playbook right at the beginning of my journey by a friend working at a large Silicon Valley company, and promptly proceeded not to follow it for about a year. But when I finally did apply it, starting May 2025, with a polished portfolio and extensive LeetCode muscle memory, the results spoke for themselves.
I used my predefined goal to compile a long list of positions and companies of interest. For my top choices, I tried to get in touch with people working there (or followed up with people I got to know in my exploration phase). The goal was to gather insider information on the application processes or sometimes even a referral.
This worked best with network contacts, but I had some luck with cold outreach on Twitter or LinkedIn. For cold outreach, I was writing something along the lines of: “I’m Max and really excited about xyz and strongly considering applying to role abc. Is there anything you can share to help me make the best possible application … ”
I didn’t apply to all companies right away and instead proceeded in batches. Each batch contained one of my (referred) top choices as well as other companies I was less excited about, but would still consider working at.
Proceeding through parallel processes in lockstep made coordination a lot easier. More importantly, I could schedule the lower-stakes interviews before the ones with my top choice. This way, you get some routine and do all the dumb first-time mistakes in a setting where the damage is reasonable.
I did not apply to companies where I was sure I wouldn’t want to work. There needs to be some stake in the process, and I don’t want to waste their time. While interviewing, some of these second-choice companies became first-choice ones throughout the process.
For each batch, it was the goal to make it to the offer stage with multiple companies at the same time. Concrete offers gave a lot of signal to me. Which feels better and why? Additionally, multiple offers provide leverage in negotiations.
Is there anything a company can do to make its package more attractive? Team assignment, signing bonus, remote work? I knew an ask was reasonable because I was offered the same by another place.
The reason I had only one of my top choices in each batch is that I did feel some obligation to the person referring me. If I made it to the offer stage and the offer was competitive, I should take it.
For the others, I wrote them after I had accepted my offer, thanked them for their advice that made getting this role possible, and promised to pay it forward (of which writing this blog post is a part).
To reiterate, the essence of the playbook is:
1) Batch your applications so you can use lower-stakes ones as training grounds
2) Use your network to get referrals and insights into the interview process
3) Be mindful of your referrer’s time and do your best to land the role they are referring you for
Networks
I’ve touched upon the importance of a professional network in the last section when talking about referrals and information about a company and its recruiting process.
While friends in high places are surely useful, the main power of networks lies in the strength of weak ties. One generally has more acquaintances than close friends, and they know people far outside one’s own social circles.
I still call these “acquaintances” friends, because to me, that is what they are. Personally, any “Networking” worked better when I took it as meeting interesting people, being helpful, and making friends, rather than making connections to my benefit. It also feels far less cringe. Either way, getting a warm introduction from a friend is a lot more powerful than cold outreach to a stranger.
I found LinkedIn a useful tool to figure out who knows who. But even more powerful than online stalking was talking to friends about my goals. Friends generally want to help, but to be useful, they need to know what you want to achieve.
While using your network is tactical, building friendships can be strategic.
Online, I did so on Twitter/X. It is the only social media where I, through shared interests and banter in the comments, managed to build new relationships. Of course, there is a difference between a Twitter friend and someone you know for some time in real life, but I’ve got to know some mutual connections well enough that I felt comfortable asking for advice or help.
Offline, it’s about going where people with similar interests go. That can be clubs, meetups, fairs, but also bootcamps, schools, or any other cohort-based programs. The latter are particularly effective, because people attending are more committed and usually in a phase of their life where they are especially open to new friendships.
Online and offline, my formula for making friends in the same domain is (1) doing interesting things, (2) talking about them, and (3) being open and interested.
Doing things: You need to be active. Build projects, go to events, listen to talks, learn, and build craft. The “interesting” part comes almost on its own. If you do stuff that interests you, likely there are others who are also interested in it. It’s also something to bond over with others once you meet them.
Talk: People can only find you interesting if they know what makes you interesting. So post about your projects, talk to your friends about them, or give a presentation at your local Maker Space.
Being open: When people reach out to you, be open to talk and help. Trust and friendships form with repeated interactions. Choose to care about what is important to others. Help them if you can.
How did I follow my own advice?
On Twitter, I was posting (sometimes daily) updates on my learning ML, Haskell, Kubernetes, or Rust, building a Compiler, or the struggles of writing a paper. It served as some public accountability structure and got me in touch with people who were doing the same, and was also a proof of work when someone stumbled across my profile. Blog posts about writing a Compiler (coming soon), an Interpreter, or this one serve the same purpose.
All these are artifacts that others may find interesting. Some reach out via DMs — I try to answer all of them.
Sometimes it’s also me doing the first step. Of course, it helps that I enjoy creating these artifacts, which makes them authentic. It also makes it easier to stomach if something that I spent a lot of work on making doesn’t find any resonance. Creating the piece was already some reward.
If you want to read more about why and how to share your work, even if it isn’t polished, I recommend Show Your Work! by Austin Kleon. It’s more geared toward artists, but the lessons generalize.
Building Career Momentum
Personal hot take: when organisations hire, they want to bet on winners. Winners can be All-Stars or up-and-coming underdogs. Either way, it’s necessary to demonstrate that this particular job is the logical next step on an upward trajectory. The artifacts you create to learn, build career capital, and a network can serve to build momentum toward a particular role.
I wanted my interviewer to think something along the lines of: “This applicant has a rigorous quantitative base from degree ABC, now they became so interested in field DEF that they built a highly relevant portfolio project / OSS contributions / … in their own time”.
This effectively means building career capital, a concept introduced to me in Cal Newport’s So Good They Can’t Ignore You: the building of rare and valuable skills as well as ways (projects, positions, certificates, attended institutions, …) to verify these achievements to the outside.
Ideally, previous roles serve as sufficient capital to open the next door. If there are missing skills — something easy to identify by reading the job description — I went on to strategically select portfolio projects and open source contributions to close the gap.
If there isn’t a clear progression from previous roles to the desired ones, degrees, and to a lesser extent, unaccredited programs such as bootcamps, trainings, summer schools, certificates, and the like, provide pivot points or even a “CV reset”.
Since it’s hard to give generalizable advice, I will discuss a couple of projects that I did to build skills and relevant career capital.
Missing Work with LLMs: Many of the labs want you to have experience working with LLMs. I built theoretical knowledge by reading Natural Language Processing with Transformers and built a project on idiosyncratic style transfer of small language models inspired by the TinyStories dataset. To give the project a little more weight when talking about it and to force it to completion, I signed up for a presentation at an AI Tinkerers event (also a great opportunity to make new friends). This gave me a repository I could add to my portfolio and the measurable success metric “presented to an audience of 100+”.
No CS Background: I’m essentially a self-taught engineer and sometimes felt that people were doubting my theoretical and practical foundations, especially when it came to performance engineering. To this end, I decided to learn Rust, a performance-oriented systems-level language, and get it to a level where I could comfortably contribute to production codebases. I made this measurable by getting 15+ pull requests merged in Ruff and uv. Doing this as part of a three-month batch at Recurse Center gave me the freedom to focus on this full-time, as well as a complete entry on my CV to communicate the achievements.

Missing Publications: Publications are often listed as desired qualifications in ML research positions. I didn’t have any. To address this, I got in touch with my thesis supervisor from four years ago and reworked my thesis into a publication we want to submit to a peer-reviewed journal. As of now, it isn’t published, but I added it as a work in progress on my CV (using the Recurse entry, as both occurred at roughly the same time).
While job descriptions often gave special weight to venues like NeurIPS or ICML, I figured any peer-reviewed research would get me 80% there. I was also convinced that I already possessed the research chops that companies use publications as a proxy for. Extending my thesis research at the intersection of information theory and complex systems was therefore a great effort-to-signal trade-off: I was essentially continuing work I had already done, up to the point where it gives signal to those reading my CV. This is especially true given that my supervisor works for the AI Safety Institute of the German Aerospace Center, which allows for some AI/ML namedropping. It had the additional benefit of working closely with my supervisor again, which gave him lots to talk about when I put him down as a reference.
Each of these was a considerable amount of work. In the end, there is no shortcut to building rare and valuable skills; otherwise, they wouldn’t be rare.
With limited time, it’s sensible to select projects that tick several boxes with a single measure. I used my research to dabble with systems-level programming in C++ (before pivoting to Rust). RC served as a pivot point and gave me the time to dive deep into systems programming and open source work. The talk about my LLM project is evidence of my communication ability, serves as a “business outcome”, and turns dabbling into something serious. This works best for projects where one already possesses the skills and only needs a way to communicate them.
Long term, however, fewer high-quality projects that take considerable effort to complete are almost always better than many shallow ones. Building career momentum is not about gaming the system; it’s about doing the work and learning the skills.
Application Process Touch Points
Once I had built a network, strategic career capital, and enough portfolio projects that I was getting my foot in the door for interviews, it got tactical again.
What follows is a description of the different types of interviews I encountered in application processes and how I prepared for them. Remember the application playbook section for tips on real-life interview practice.
But there were some things I did in preparation for every interview:
- If I knew who the interviewer was, I looked at their LinkedIn, GitHub, Twitter, blog, etc. This helped me to set focus points in preparation and come up with questions for the chit-chat that comes before or after the meat of the interview.
- I read up on recent releases and products of the company or team I was applying to.
- I knew the company values (at least roughly) and why I wanted to work for this organisation specifically.
- I knew the specific position, what I hoped to get out of it, and why my previous experience would make me a perfect candidate.
- I had a couple of questions prepared, which I could ask at the end of the interview if the interviewer gave me the space to do so (they almost always did).
In general, these can be summarised as me trying to be prepared and having done my homework.
All my interviews were remote video calls. That’s why I don’t have any experience preparing for on-sites.
Referrals
I’ve already touched upon this in the networking section above. For all my favourite positions, I was trying to get referrals via friends, friends-of-friends, or people I had interacted with for some time on Twitter or LinkedIn.
I asked warm leads for referrals right away. Those whom I only knew online, I tried to get to know in a call or, even better, a physical meeting, asking for general advice on careers or insights on the role and company. I then only asked for a referral if the vibes permitted (sometimes I was even offered one). Since referrers often get a bonus if I land the job, there are some aligned incentives.
Still, I only got referrals to roles I was really excited about. I think it’s okay to decline an offer in favour of another. But declining an offer without an alternative makes the person who referred me look bad. I didn’t want to risk that with roles I wasn’t completely sure about.
CV
The first thing to kick off the application process was submitting my CV. I don’t think a CV needs to be a comprehensive history of employment. Instead, it should highlight the aspects of my professional experience relevant to the roles that I am gunning for.
In terms of layout, I used different typefaces, font sizes, and colors to make it easy to read while keeping everything else as conservative as possible; one column, no fancy design elements beyond blue colored links.
Basically, I was imagining the hiring manager reading my CV on their phone, on the way to the office kitchen, semi-engaged in a discussion with colleagues. They only have a few glances at the screen to get the gist: I’m an ML Engineer, my previous positions, and relevant projects.
As such, I made it a top priority to keep my CV to a single one-column page. If they weren’t scrolling, everything that is on the second page is lost anyway. The purpose of the single column was to be more parsable to automated applicant tracking systems.
I have four sections on my CV: 1) Work Experience, 2) Portfolio, 3) Education, and 4) Skills. People fresh out of school may want to put education higher, as it is their main qualification.
Each entry contained a small description of my tasks, their successful outcomes, and the technologies used, set in a smaller font. I want the hiring manager to read the text block of the most relevant experience. They should get excited about me instead of being bored by uninteresting details elsewhere.
Whenever available, I added metrics to add credibility and quantify my impact. The one exception is the summary at the top of the CV. I wanted to keep it succinct. This makes it especially important that all claims in the summary are supported by several of the other entries in my CV.
Besides metrics, I’ve added hyperlinks to GitHub code or other public assets. If something is interesting, people can check it out. The fact that the links are blue also allows me to highlight things I really want the reader to see. As eyes quickly scan the document, they will likely register the link text.
For every CV entry, I added a short list of the tools used. Together with the skills section at the end of my CV, this allows me to name-drop a bunch of technologies, which gives a signal and, maybe more importantly, ensures their presence if it is scanned automatically or added to a search index.
My CV worked pretty well, but it took quite some iterations to get it to the point where I was satisfied. I think there is value in letting a draft sit to look at it a few days later with fresh eyes and to ask friends for feedback.
I would like to think that I did a pretty decent job, and if you want some inspiration, my CV is linked on the about page.
Initial Screen
The initial screen was usually the first touchpoint where I had face-to-face contact with companies. It’s a first get-to-know-you with a recruiter or the hiring manager.
After they introduced themselves, the company, and the role, I was given time to introduce myself: “Please walk me through your CV”.
I used this question to give a unique spin to my past experiences and map them to the role instead of talking through my past chronologically.
Ideally, the interviewer should have read my CV. That wasn’t always the case, though, so I developed the habit of implicitly asking, and when I learned they hadn’t, I would take a little more time to chronologically walk through the main stations before adding the unique spin.
This introduction is something that, albeit in shorter form, repeats every interview, and I prepared and practiced this beforehand. Since I didn’t want to bother my friends with this, I did a couple of practice rounds with voice LLMs.
In the end, my introduction looked something like this:
“Hi, I’m Max, a Machine Learning engineer. You’ve probably read my CV, so I will just briefly summarise what makes me uniquely qualified for this role. My background spans Physics, Sociology, and systems-level software engineering. As such, I can work with quantitative data at scale, have experience with messy human-produced data, and can build the software systems around that to train and serve large models.”
Depending on the company (and interviewer) vibe, I would also add a personal touch, such as hobbies and interests. Practicing the same sport as the interviewer made for easy sympathy points.
The initial screen then proceeded with follow-up questions on specific projects and business outcomes that I had listed on my CV. The best advice I can give is to know it thoroughly. This should be easy; it’s your life after all.
Takehome
Takehomes are programming challenges that I was tasked to work on asynchronously. I got them via email, usually with a deadline of a couple of days to a week in the future. Sometimes I was just judged on what I submitted; other times, there was a follow-up interview where we talked through my solution.
The contents of the takehomes are pretty idiosyncratic to the company in question. These are some of the takehomes I’ve encountered personally or seen with friends.
- You are supplied with a specification and can submit code to run against a test suite
- You are given a small ticket and access to their codebase, and should solve the issue (this interview step was compensated with ~500 USD)
- I was given LLM training code, and the resulting model produced gibberish. I had to identify 10 bugs in the code.
For the most part, these takehomes had an expected time to spend on them. I always spent far more because the roles were important to me.
I got my first takehomes while I was still working for my previous employer. Completing the tasks and having a crunch phase at work resulted in a pretty heavy week.
This experience was one of the major drivers for the decision to quit and commit full-time to searching for a new job. I don’t see how one can realistically do several of these challenges simultaneously, at sufficient quality, for different applications while holding a job.
Online Assessment
Besides programming interviews and takehomes, I’ve also encountered online assessments. They are asynchronous, but have a time limit on the order of hours instead of days, as with the takehomes.
Sometimes they are proctored, which means that you have to film yourself and share a screen recording so that they can ensure it’s you taking the test and that you are not using prohibited aids.
In essence, they are similar to programming interviews, just with automated tests and no interviewer present.
Programming Interview
Programming interviews are the core of the software and ML engineering interview process, and they have been part of every interview process I’ve encountered. They come in several different forms, which we discuss shortly.
But first, some general advice. They all serve a common goal: to evaluate how you think, break down a problem, think about edge cases, and work toward the solution of a programming problem.
In the process, companies want to see your communication and collaboration skills. To that end, it’s imperative that you talk out loud. I think it’s fine to read the exercise and think for a minute in silence, but after that, I was verbalizing my thought process to give the interviewer enough signal to judge my ability beyond a binary pass/fail of the task.
Sometimes, the interviewers gave me hints or helped me to avoid exploring an impasse. Of course, it’s better to pass without any interventions, but it’s a lot better to get the code to work with help than not to get it working at all. If I was stuck, I would try to explain where and why. Sometimes that was enough to figure out a solution myself, but it also presented a possibility for the interviewer to nudge me in the right direction.
Let’s now discuss the different kinds of programming interviews in more detail.
Leetcode Type Interview
The most common coding interview questions are LeetCode or LeetCode-style questions that are adapted to the domain in which the company operates.
In all these interviews, I was asked to share my screen. Sometimes I used my own editor, sometimes I coded in Google Docs, but most often I worked in a CoderPad online interpreter.
Whenever there was an interpreter, we ran my code against tests. If they failed, I used the test output for debugging. Generally, it helped to think through the problem first and get buy-in for a high-level pseudo-code solution written out in comments before writing out the actual code.
If I could choose the language for the interview, I would choose Python. Partly because it’s a language I’m pretty well versed in, but also because I didn’t want to deal with memory issues in an algorithmic interview.
Generally, I would recommend going for a high-level language that you are familiar with. There is little value in wrestling with the borrow checker or forgetting to declare a variable when you could just use Python and focus on the algorithm.
There are many resources to prepare for these types of interviews, and getting good at them is pretty much a practice and pattern recognition thing. You build a repertoire of algorithmic techniques and useful data structures and their operations, and with time, develop the expertise to mix and match them into a solution quickly.
I found the NeetCode 150 roadmap particularly useful. It’s an ordered collection of LeetCode problems that introduce new and more advanced concepts one after the other. I found that much more useful than the arbitrary order of LeetCode.
Because I am a big believer in more rigorous (self-)education, I’ve also read Skiena’s Algorithm Design Manual. This is probably a bit too theoretical for immediate interview preparation, but it is a great base for your life as a programmer.
For more conceptual coding interview preparation, I’ve supplemented this with reading Coding Interview Patterns — a practical and interview-focused approach to data structures and algorithms.
In the end, I’ve probably done around 300 LeetCode/NeetCode problems. About half of that was to build muscle memory, plus 20-30 rapid-fire questions in the days leading up to the coding interviews.
The interviews themselves consisted usually of two or three problems, with the first one or two being equivalent to a LeetCode easy as a warm-up, followed mostly by a LeetCode medium, though I’ve encountered hards.
The most effective means of preparation was just practice to get a solid base, supplemented with several mock interviews with friends, where we role-played to get comfortable talking aloud when coding up a solution.
It also helped to get a couple of real interviews under my belt. I was considerably less nervous in the later ones.
API/System Design Coding Interview
I’ve had some coding interviews that had much more of an API or system design focus.
These are different from the typical system design interview conducted on a whiteboard in that I was expected to produce code. They did not concern themselves with the design of a scalable and fault-tolerant distributed system.
For example, I had to design the method interfaces of a BankingSystem class or design a program to sort a text file many times larger than the memory of the machine on which it was run. These interviews didn’t have tests, and pseudocode was sometimes accepted.
Here, I often did not need to fully implement the solution, but had to talk through my rationale with the interviewers, often with the help of pseudocode we were writing in a notepad.
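To make the file-sorting example concrete: a common way to approach it is an external merge sort, where you sort chunks that fit in memory, spill them to disk, and then merge the sorted chunks. What follows is a minimal sketch of that idea, not the solution I gave in the interview. The function name `external_sort` and the parameter `max_lines_in_memory` are made up for illustration, and the sketch assumes a line-oriented UTF-8 text file sorted in plain string order.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(src_path, dst_path, max_lines_in_memory=100_000):
    """Sort the lines of src_path into dst_path while holding only
    max_lines_in_memory lines in memory during the chunking phase."""
    chunk_paths = []
    try:
        # Phase 1: read bounded chunks, sort each in memory, spill to disk.
        with open(src_path, "r", encoding="utf-8") as src:
            while True:
                chunk = list(islice(src, max_lines_in_memory))
                if not chunk:
                    break
                # Ensure every line ends with a newline so the merged
                # output stays one record per line.
                chunk = [ln if ln.endswith("\n") else ln + "\n" for ln in chunk]
                chunk.sort()
                fd, path = tempfile.mkstemp(suffix=".chunk")
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    tmp.writelines(chunk)
                chunk_paths.append(path)

        # Phase 2: k-way merge. heapq.merge reads lazily, so only one
        # line per chunk file is held in memory at a time.
        files = [open(p, "r", encoding="utf-8") for p in chunk_paths]
        try:
            with open(dst_path, "w", encoding="utf-8") as dst:
                dst.writelines(heapq.merge(*files))
        finally:
            for f in files:
                f.close()
    finally:
        # Clean up the temporary chunk files.
        for p in chunk_paths:
            os.remove(p)
```

In an interview, the trade-offs are probably more interesting than the code itself: how to pick the chunk size, what happens when there are more chunks than the number of files you can keep open (merge in several passes), and whether the disk or the CPU is the bottleneck.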
The best means of preparation for these interviews was building a couple of larger-scale projects, which made me encounter similar issues in the wild, and building a sufficient theoretical computer science foundation to understand the problems and trade-offs.
Real-world challenge
Sometimes, a programming interview is made to mimic a real-world challenge that resembles work you may do on the job.
For example, wrangling a messy real-world dataset or implementing a Lambda function that calls an LLM API to scrape certain information from a given website and saves it to a database.
One of these interviews allowed me to use LLM assistance and was pretty much judged on completion speed. I did pretty poorly because at the time, I was actively avoiding AI-assisted coding to build muscle memory in Rust syntax, and hence, I was slow and clunky using the LLM.
I don’t think I’ve had enough of these interviews to make any generalizable recommendations.
System Design Interview
System design interviews serve to gauge your exposure to creating software that solves a business problem while satisfying preset constraints on the load it must be able to handle, as well as its fault tolerance. They are usually open-ended, and the design is iteratively created in a discussion with the interviewer. As with the coding interviews, it is important that you communicate throughout the interview to make sure your interviewer understands how you are thinking about trade-offs. Speaking out loud also helps them to course correct you if you make an assumption that is not warranted.
While it is some work to prepare for system design interviews, there are some great and helpful resources out there. Martin Kleppmann’s Designing Data-Intensive Applications covers the handling of data, from storing it on a single machine and sending it around space-efficiently to distributed consensus algorithms that gracefully handle network partitions.
It took me about six months to work through it while having it on my bedside table. While it’s more than what you need for a single interview, and you definitely don’t need to know everything by heart for interviewing, I believe there are great dividends for your software or ML engineering career. Plus, if you’ve read DDIA, you have the theoretical foundation to go deep on your design decisions.
For more practical and interview-focused preparation, I’ve read Alex Xu’s System Design Interview. It covers a couple of reference system designs, gives interview tactics, and contains example system schemas. It’s not that long and is a fast, easy read, especially compared to DDIA. Another useful reference was the System Design Primer GitHub repository, which contains additional example designs.
Then, there are several useful YouTube channels. I particularly liked System Design Fight Club, ByteByteGo (from the authors of System Design Interview), TRYEXCEPT, and Jordan has no life.
I especially liked Jordan’s videos where he explained technologies or designed reference systems.
A learning hack that worked for me was getting YouTube Premium, which allows for downloads and background play, and then listening to these videos while going for a walk. In conjunction with the more rigorous DDIA, this helped to solidify concepts and understand the big picture that sometimes is lost when working to understand technical details written on a page.
When I was interviewing with a particular company, I browsed their technical blogs and press releases to gather information about recent projects. By reading these resources and then looking for example system design interviews on YouTube in the same or an adjacent domain, I had a pretty decent hit rate: about 50% of the time, I had prepared a system design question similar to the one I got just the day before the interview. Nonetheless, sometimes I got completely different system design questions, such as designing a Snapchat clone.
As with the LeetCode interviews, I did a couple of mock interviews with friends. We used Excalidraw, a website to draw boxes and put text, which is often used in real system design interviews as well. While I’ve experimented with voice AI to prepare for these interviews, I found it too sycophantic to be useful beyond the practice of talking out loud.
Culture fit
The culture fit interview usually occurred after a few rounds. These rounds are designed to understand you as a person, what drives you, and if that matches the company. Questions can be of two types: 1) General behavioral questions and 2) company culture-specific questions.
General Behavioral Questions: These questions are designed to gauge your behaviour in conflict and stress situations.
A couple of examples:
- They may ask if you’ve ever had a conflict with a colleague and how you handled it.
- Did you ever miss a deadline? Why? How did you mitigate the consequences?
- Tell me about a time when you had to behave in a way that was contrary to your personal values.
- Would you rather be overpaid or underrated?
- If you had to decide, would you rather optimize for learning, impact, or people in your career? Why?
Company Culture Questions: These questions are framed around the values of an organisation, which you usually find on their “About” or “Careers” page. You may be asked which of these values resonates most with you (and which doesn’t), and why. You may also be asked to share an exemplary story where you acted in accordance with (or in violation of) that value.
I’ve prepared these stories beforehand to avoid only recalling the perfect anecdote in the shower the next day. Often, I could spin the same story in several different ways.
Mistral, for example, has the value of “Be Audacious” and my two prepared stories were (1) when, after 2 years of sociology studies, I started over with a physics degree despite having dropped physics in 10th grade and (2) me quitting my job with nothing else lined up to burn through my savings, attend Recurse Center in NYC, and learn Rust and systems programming.
With these two stories, one from about 10 years ago and one recent, I can demonstrate continuous audacious behaviour. But I could repurpose the same stories for their “Reason with rigor” value. Both are stories about me rigorously identifying a blind spot in my current skillset (math and systems programming, respectively) and then taking decisive action to course correct.
All in all, I had a long list of about 15 stories and a shortlist of 5 or so that should cover most behavioural and cultural questions.
I did prepare them using the STAR framework:
- Situation: Where were you working, what was the team constellation, and what was the current goal?
- Task: What was your specific task, and why was it difficult?
- Action: What did you do to accomplish your task, overcome the difficulty, and contribute to achieving the team goal?
- Result: What was the final result of your efforts?
I used STAR when putting the story into writing and mapping the different company values. I’ve also followed the STAR framework when telling the story in the interview to make sure I did not forget anything in forming a coherent narrative.
Once again, I did practice culture fit interviews with friends and voice AI. It helped me a lot to have verbalized the stories before and have the necessary vocabulary and story details in recent memory.
Quiz Interview
Knowledge, Quiz, or Fundamentals interviews are designed to map and find the edges of your expertise in a relevant subject area. They are harder to specifically prepare for than System Design or LeetCode interviews because they are less formulaic. In a way, they are designed to gauge the knowledge and experience acquired over a career, which can’t be made up for by cramming the night before.
I did, however, try to strategically refresh what I thought may be relevant based on the job description by skimming through books or lecture notes and listening to podcasts and YouTube videos.
The interviewers usually had prepared a couple of questions in advance. Sometimes they wanted answers to all of them; other times, they were just initial pointers that we used to riff on and have an interesting technical discussion without knowing where it would lead us. Generally, these interviews are purposefully pretty broad.
I’ve encountered questions like
- How would you implement a set in your fork of a Python interpreter implementation, and what is the role of a hash function?
- How can you get error bars on the output of an LLM for a specific checkpoint, and how do you interpret their size?
- What is overfitting, what is double descent, and are modern deep learning models overparametrized?
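To give a flavour of what a good answer to the first question touches on, here is a toy hash set in Python. It is a sketch for illustration only: the class name `HashSet` and its resize policy are my own assumptions, and it uses separate chaining for simplicity, whereas CPython’s real set is implemented with open addressing. The point is the role of the hash function: it maps an item to a bucket so that membership checks only need to look at one small bucket instead of every element.

```python
class HashSet:
    """Toy hash set using separate chaining (illustrative, not CPython's design)."""

    def __init__(self, capacity=8):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0

    def _bucket(self, item):
        # The hash function decides which bucket an item belongs to.
        # Equal items must hash equally, or lookups would miss them.
        return self._buckets[hash(item) % len(self._buckets)]

    def __contains__(self, item):
        return item in self._bucket(item)

    def __len__(self):
        return self._size

    def add(self, item):
        bucket = self._bucket(item)
        if item in bucket:
            return  # sets store each element at most once
        bucket.append(item)
        self._size += 1
        # Keep buckets short: double the table once there are more
        # items than buckets (an arbitrary policy chosen for this sketch).
        if self._size > len(self._buckets):
            self._resize(2 * len(self._buckets))

    def discard(self, item):
        bucket = self._bucket(item)
        if item in bucket:
            bucket.remove(item)
            self._size -= 1

    def _resize(self, new_capacity):
        # Bucket indices depend on the table size, so every item is re-placed.
        items = [item for bucket in self._buckets for item in bucket]
        self._buckets = [[] for _ in range(new_capacity)]
        for item in items:
            self._bucket(item).append(item)
```

Natural follow-ups in such an interview are collision handling, why only hashable (effectively immutable) objects can be stored, and what a poor hash function does to lookup time.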
But I was also asked to explain Rust’s ownership and borrowing system, not because Rust was relevant to the role, but (I guess) because the interviewer saw it on my CV, was interested, and wanted to gauge how well I could explain technical concepts.
The best preparation is to know the stuff on your CV and have enough knowledge on everything listed in the job description to say a couple of intelligent sentences about it.
Since they want to find the edge of your knowledge, it’s usually fine to say “I don’t know”. Whenever I wasn’t completely sure, I also prefaced my answer with something along the lines of
“I haven’t had practical exposure to distributed training until now, so my knowledge is theoretical. But you have data, model, and tensor parallelism …”.
Hiring Manager / CTO Interview
If I didn’t meet the hiring manager or CTO in an initial screen, I usually had a separate interview at the end of the process. When a company commits time to senior people, they are seriously considering hiring you.
Since they have a picture of an applicant’s technical ability and culture fit at that point, the interview is much less rigorous. In essence, it’s a nice chat where the hiring manager or CTO tries to sell the role. Additionally, it’s a space to ask specific questions on how the team works together, what a first project may be, and what one would need to achieve to be deemed a successful hire.
While these interviews seemed somewhat lower stakes since I had already cleared the technical bar, senior staff can definitely fail a candidate if they think the vibe is off. I tried to be nice and excited, and I brought some questions about the company, the role, and the personal motivation of my new superior. I’ve also used these interviews to gather insights about the company strategy and how my role and responsibilities might evolve a year or so in the future.
Reference Check
I’ve encountered reference checks usually as the last step before the contract, though some companies ask for them before moving into the first stages.
This serves to cross-check CV claims and to test whether a candidate has been able to build and maintain relationships at school or previous jobs with people who are willing to vouch for them.
I was asked to provide the name, contact details, relation, and job title of a couple (usually two) former peers so the company could reach out to them.
I did ask my references before putting them down. Of course, because it’s etiquette, but also because I wanted to avoid my former manager being on a sabbatical, hiking through the Amazon without cellular reception, and not answering any inquiries about my previous performance.
I think it’s best to put down people who worked closely with me and are thus familiar with my working style and the quality of my output. Bonus points if it is a manager or someone more senior, but a close relationship is definitely more important than seniority.
My referees then got a call or an email with a couple of questions. Some of my referees asked me beforehand what I wanted them to put down. My answer was always similar: “I think the reference is most valuable if you put your honest opinion about me. But to be mindful of your time, I’m, of course, happy to provide a draft.” The important thing is that you have another person standing with their name and reputation behind the content of the reference, not whether they, you, or ChatGPT wrote the initial draft.
The Mindgame
On the one hand, changing jobs is great. It’s a period full of possibilities and leverage. I was in charge of shaping my future with a lot more negotiation power and ability to walk away than when I had been employed. This is what I tried to remind myself of when things got tough.
It got tough. The process is taxing because it also includes a lot of rejection, uncertainty, inability to make long-term plans because I didn’t even know in which country I would end up, and, of course, living off savings and unemployment insurance.
The worst part was that I was becoming aware of the stigma of being unemployed, which, as time progressed, made it increasingly hard not to simply accept a role that felt like too big a compromise. Then there is the endless wait for results, callbacks, or rejections. It’s nerve-racking, and I could do nothing to speed up the process.
I also needed a lot of volitional energy and agency to navigate the constant friction between competing responsibilities. Should I rather fill the funnel with a couple of new applications, prepare the next rounds, or spend time with a strategic portfolio project to learn a new technology?
While start-ups move fast, I found that competitive jobs at established companies or scale-ups take significant time. For me, that was around 2-3 months. Then it takes 2 weeks of back and forth to negotiate the contract and a couple more weeks to make the switch. So even if everything goes smoothly (and that is an if you cannot count on), a full-time job search is at least 4 months of a transitional state.
Conclusion
There you have it: my experiences and lessons of 18 months of job seeking. In essence, tactics boil down to two things: 1) Try to get an information advantage, and 2) act on this information to be prepared when it matters. This preparation can consist of strategic high-effort projects or tactical quick-win initiatives.
I hope that this blog post gave you some information on what you are in for and how to best prepare.
Outside of tactics, it’s strategy. These are things that take time and cannot be faked. Learn rare and valuable skills, build craft, and don’t forget to collect proof points of your abilities (technical artifacts, business outcomes, job titles, organisation membership) so others can easily gauge your level. Ideally, choose something you enjoy doing. It’s much easier to spend unreasonable amounts of time on getting better at something if you enjoy doing it. Build a network of peers and friends who know you, vouch for you, and send opportunities your way.
These are the things I will now do at Mistral: do good work, learn the codebase and new technologies, own parts of the product, grow more senior, make a couple of friends, and make the lives of my colleagues easier.
Godspeed to all of us!