Amazon data scientist interview (questions, process, prep)


Data scientist interviews at Amazon are challenging. The questions are difficult, specific to Amazon, and cover a wide range of topics.

The good news is that the right preparation can help you maximize your chances of landing a job offer at Amazon (or AWS). We’ve analyzed 150+ data scientist interview questions reported by real Amazon candidates, categorized them, and listed examples below. 

Read on for our ultimate guide for success, including practice questions, links to helpful resources, and preparation tips to help you land that Amazon data scientist role.


1. Interview process and timeline

1.1 What interviews to expect

The Amazon data scientist interview process typically takes four to eight weeks and follows the steps below. If you're interviewing at AWS, you can expect a similar timeline.

  • Resume screen
  • Recruiter phone screen (~30 min)
  • Technical screen (1 or 2 screens, ~60 min each)
  • Onsite interviews (5-6 interviews, 45-60 min each)

1.1.1 Resume screen

First, recruiters will look at your resume and assess if your experience matches the open position. This is the most competitive step in the process, as millions of candidates do not make it past this stage.

If you’re looking for expert feedback on your resume, you can get input from our team of ex-Amazon recruiters, who will cover which achievements to focus on (or ignore), how to fine-tune your bullet points, and more.

Of course, if you have any connections to current Amazon employees, consider asking for a referral. This often helps candidates get their feet in the door. 

1.1.2 Recruiter screen

Once you’ve applied or been directly contacted by a recruiter, the hiring process generally starts with a brief recruiter screen call. This will be a discussion of your background as well as the interviews ahead of you. Prepare answers to simple behavioral questions (see section 2.3) to show why you’re a good fit for Amazon.

You may be speaking directly with your recruiter or with your hiring manager. This may not be someone with a technical background. If your recruiter hasn’t already detailed the process, this is a good time to ask specific questions about what to expect and what to prepare, as the process may differ per role.

Note: some candidates who have been contacted directly by recruiters via LinkedIn may skip the initial call and pass directly to the technical screen(s).

1.1.3 Technical screen

As mentioned above, the Amazon data scientist interview process differs between roles, so there are a few possibilities for your technical screen. You may have a take-home assignment, a video call with live coding, a call focused on machine learning, or a combination of two of these.

While role-specific exercises or online assessments appear to be more common for internships, they are required for the occasional experienced role as well. They will consist of a coding assessment or a case study for you to explore in-depth. You may be asked to present your case study as a second stage of your technical screen, or during one of the onsite interview rounds.

Otherwise, your recruiter will schedule one or two interviews using Amazon Chime. Come prepared to answer machine learning questions and to work out SQL and Python/R questions on a shared notepad document. While other companies like Google or Facebook focus solely on technical skills at this step in the process, Amazon is equally interested in your past experience. Be ready to explain your past projects and the business issues that you’ve solved, detailing concrete steps and framing them in the context of the leadership principles.

1.1.4 Onsite interviews

If you crack the technical screen(s), the next step is to spend a full day onsite at the Amazon offices doing five or six separate rounds, one of which will take the form of an informal lunch interview. Due to COVID-19, this may happen as a “virtual onsite” using Amazon Chime.

These interviews will last 45 to 60 minutes and will be one-on-ones with a mix of people from the team you’re applying to join, including peers, the hiring manager, and a senior executive called the Bar Raiser.

Bar Raisers are not associated with the team you’re applying for. Instead, they focus on overall candidate quality rather than specific team needs. They get special training to make sure Amazon’s hiring standards stay high, so they are a big barrier between you and the job offer.

The format of the interviews differs, but may consist of case studies, technical presentations, Q&As, whiteboarding, or other exercises. Your recruiter should provide you with information on what to expect before going in.

We’ll dig deeper into the question types later in the article, but expect an emphasis on behavioral questions. Each interviewer is usually assigned two or three leadership principles to focus on during your interview. 

1.2 What exactly is Amazon looking for?

At the end of each interview your interviewer will grade your performance using a standardized feedback form that summarizes the attributes Amazon looks for in a candidate. That form is constantly evolving, but we have listed some of its main components below.

A) Notes

The interviewer will file the notes they took during the interview. This usually includes the questions they asked, a summary of your answers, and any additional impressions they had (e.g. communicated ABC well, weak knowledge of XYZ, etc).

B) Technical competencies

Your interviewer will then grade you on technical competencies. They will be trying to determine whether you are "raising the bar" or not for each competency they have tested. In other words, you'll need to convince them that you are at least as good as or better than the average current Amazon data scientist at the level you're applying for.

The exact technical competencies you'll be evaluated against vary by role. But here are some common ones for data science roles:

  • Problem solving
  • Data analysis and manipulation
  • Machine learning / AI
  • Business acumen
  • Coding

C) Leadership principles

Your interviewer will also grade you on Amazon's 16 leadership principles and assess whether you're "raising the bar" for those too. As mentioned above, each interviewer is given two or three leadership principles to grill you on.

D) Overall recommendation

Finally, each interviewer will file an overall recommendation into the system. The different options are along the lines of: "Strong hire", "Hire", "No hire", "Strong no hire".

1.3 What happens behind the scenes

Your recruiter is leading the process and taking you from one stage to the next. Here's what happens after each of the stages we’ve just described:

  • After the phone screens, your recruiter decides to move you to the onsite or not, depending on how well you've done up to that point.
  • After the onsite, each interviewer files their notes into the internal system, grades you, and makes a hiring recommendation (i.e. "Strong hire", "Hire", "No hire", or "Strong no hire").
  • The "Debrief" brings all your interviewers together and is led by the Bar Raiser, who is usually the most experienced interviewer and is also not part of the hiring team. The Bar Raiser will try to guide the group towards a hiring decision. It's rare, but they can also veto hiring even if all other interviewers want to hire you.
  • You get an offer. If everything goes well, the recruiter will then give you an offer, usually within a week of the onsite, but it can sometimes take longer.

It's also important to note that recruiters and people who refer you have little influence on the overall process. They can help you get an interview at the beginning, but that's about it.

2. Example questions

Let’s get into the four primary categories of questions you’ll answer during the Amazon data science interview:

  1. Coding (37% of reported questions)
  2. Machine Learning (27%)
  3. Behavioral (19%)
  4. Statistics (17%)

In the sections below, we've put together a high-level overview of each type of question. In addition, we've compiled a selection of real Amazon data scientist interview questions, according to data from Glassdoor. We've edited the language in some places to improve the clarity or grammar, and, when appropriate, we’ve included a link to a solution.

Many of these questions are asked in the form of case studies. For more information about data science case study interviews, take a look here.

Important Note: Amazon data scientists work across many divisions, such as Amazon Web Services, Alexa, SCOT, logistics, and more. As each role will have different responsibilities, the interview questions you receive will reflect that. The Glassdoor data we’ve used is generalized across all data scientist roles, so consult the preparation materials from your recruiter to know what areas will be the most important for your position.

2.1 Coding questions (37%)

Amazon data scientists must write code and develop sophisticated algorithms that synthesize data coming in from multiple sources. You’ll need to demonstrate that you have the technical knowledge necessary to analyze and manipulate that data.

Expect interviewers to test you on SQL, data structures, algorithms, and some modeling. Most candidates report solving data structure and algorithm questions using Python and solving modeling questions with Python or R.

In most cases you will be coding on a whiteboard (or the virtual equivalent), but some candidates have reported entirely verbal onsite interview rounds. This shows how important communication skills are to Amazon, so practice both writing your scripts on paper and speaking through your reasoning. 

Practice using the example questions below. For more help, use our list of 49 real Amazon coding interview questions.

Amazon data scientist interview questions: coding

SQL

  • Write a SQL query to calculate the month-to-month user retention rate.
  • Describe different JOINs in SQL.
  • What is the most advanced query you’ve ever written?
  • Given a table with three columns (id, category, value), where each id has three or fewer categories (price, size, color), how can you find the ids for which the values of two or more categories match one another?
  • I have Table 1 (TABLE1), with 1 million records and the columns ID and AGE, and Table 2 (TABLE2), with 100 records and the columns ID and SALARY, along with the following script. How many records would be returned?
    SELECT A.ID, A.AGE, B.SALARY
    FROM TABLE1 A
    LEFT JOIN TABLE2 B
      ON A.ID = B.ID
    WHERE B.SALARY > 50000
  • Given a CSV file with ID and QUANTITY columns, 50 million records, and a total size of 2 GB, write a program to aggregate the QUANTITY column.
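
For the LEFT JOIN question above, the key point is that a filter on the right-hand table placed in WHERE discards the unmatched rows, so the query behaves like an inner join. The following is a minimal sketch using Python's built-in sqlite3 module, with tiny hypothetical tables (10 and 4 rows rather than 1 million and 100) and made-up salary values, and assuming ID is unique in both tables.

```python
import sqlite3

# Hypothetical miniature versions of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, age INTEGER)")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)", [(i, 20 + i) for i in range(1, 11)]
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, 40000), (2, 60000), (3, 75000), (4, 50000)],
)

# Filter in WHERE: unmatched rows have salary NULL, and NULL > 50000 is not
# true, so they are dropped. The LEFT JOIN acts like an INNER JOIN here.
where_rows = conn.execute(
    """
    SELECT a.id, a.age, b.salary
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id
    WHERE b.salary > 50000
    ORDER BY a.id
    """
).fetchall()

# Filter in ON: every row of table1 is kept, with salary NULL wherever
# there is no qualifying match in table2.
on_rows = conn.execute(
    """
    SELECT a.id, a.age, b.salary
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id AND b.salary > 50000
    ORDER BY a.id
    """
).fetchall()

print(len(where_rows))  # 2
print(len(on_rows))     # 10
```

Under the same uniqueness assumption, the query in the question can return at most 100 rows: one for each Table 2 record with a salary above 50000 whose ID also appears in Table 1.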

Data structure and algorithms

  • Write Python code to check whether the entries of a list contain the same characters. What is its computational complexity?
  • You have an array of integers and you want to find a certain element. What efficient algorithm would you use, and what is its time complexity?
  • For a long sorted list and a short (4 element) sorted list, what algorithm would you use to search the long list for the 4 elements?
  • Given an unfair coin with the probability of heads not equal to .5, what algorithm could you use to create a list of random 1s and 0s?
  • Given a bar plot, imagine you are pouring water from the top. How do you quantify how much water can be kept in the bar chart? (solution)
  • Write a Python function that displays the first n Fibonacci numbers. (solution)
  • Suppose you have a list of strings, each of which is an English sentence. Output a dictionary out_dict that maps a key n to the list of words that occur in n different sentences. E.g., input: str_list = ["The cat ate the fish", "The cat saw the roses", "The roses are red"]
  • Given an integer n and an array of numbers, output the histogram of the array divided into n bins.
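
As an illustration of the sentence-dictionary question above, here is one possible sketch rather than a reference solution. It assumes words are separated by whitespace and compared case-insensitively (so "The" and "the" are the same word); the function name words_by_sentence_count is our own.

```python
from collections import Counter, defaultdict


def words_by_sentence_count(str_list):
    """Map n -> sorted list of words that occur in exactly n sentences."""
    sentence_counts = Counter()
    for sentence in str_list:
        # set() so a word repeated within one sentence is counted once.
        # Lowercasing is an assumption, not part of the original question.
        sentence_counts.update(set(sentence.lower().split()))

    out_dict = defaultdict(list)
    for word, n in sentence_counts.items():
        out_dict[n].append(word)
    # Sort each list so the output is deterministic.
    return {n: sorted(words) for n, words in out_dict.items()}


str_list = ["The cat ate the fish", "The cat saw the roses", "The roses are red"]
out_dict = words_by_sentence_count(str_list)
print(out_dict[3])  # ['the']
print(out_dict[2])  # ['cat', 'roses']
print(out_dict[1])  # ['are', 'ate', 'fish', 'red', 'saw']
```

Counting takes time proportional to the total number of words, plus the cost of the final sort, which is there only to make the output reproducible.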

Modeling

  • How would you improve a classification model that suffers from low precision?
  • We have two models, one with 85% accuracy, one 82%. Which one do you pick? (solution)
  • When you have monthly time series data with a large number of records, how would you find significant differences between this month and the previous month?
  • How do you inspect missing data and when are they important?
  • Assume you have a file containing data in the form of data = [{"one":a1, "two":b1,...},{"one":a2, "two":b2,...},{"one":a3, "two":b3,...},...] How could you split this data into 30% test and 70% train data?
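
For the 70/30 split question above, a minimal standard-library sketch could look like the following. The function name, the seed, and the stand-in data are our own assumptions. In practice many candidates would reach for a library helper instead, but writing it by hand shows the two steps that matter: shuffle, then slice.

```python
import random


def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of records and split it into (train, test) lists."""
    shuffled = list(records)  # copy so the caller's list is not reordered
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical stand-in for the list of dictionaries in the question.
data = [{"one": i, "two": i * i} for i in range(10)]
train, test = train_test_split(data)
print(len(train), len(test))  # 7 3
```

Shuffling before slicing matters when the file is ordered (for example by date or by label), since otherwise the test set would not be representative of the training set.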

2.2 Machine learning questions (27%)

Amazon data scientists must develop services and solve problems that are endlessly complex and constantly evolving. So your interviewer will test your ability to build innovative algorithms that improve and remain accurate over time.

Depending on the role, your interviewer may ask you to define and discuss specific ideas around system design and machine learning models. More in-depth machine learning rounds will require you to build out a hypothetical model or discuss how to improve existing ones related to real-life Amazon business decisions.

According to Glassdoor, some general topics that have come up before on Amazon machine learning interviews include unsupervised machine learning, bias-variance tradeoff, PCA, and recurrent neural networks, in addition to the full questions below. 

Let’s get into them.

Amazon data scientist interview questions: machine learning

  • How do you interpret logistic regression?
  • How does dropout work?
  • What is L1 vs L2 regularization?
  • What is the difference between bagging and boosting?
  • Explain in detail how a 1D CNN works.
  • Describe a case where you have solved an ambiguous business problem using machine learning.
  • Having a categorical variable with thousands of distinct values, how would you encode it?
  • How do you manage an unbalanced data set?
  • What is an LSTM? Why use an LSTM? How have you used LSTMs in your experience?
  • What did you use to remove multicollinearity? Explain what values of VIF you used.
  • Explain different time series analysis models. What are some time series models other than ARIMA?
  • How does a neural network with one layer and one input and output compare to a logistic regression?
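
To make the last question above concrete: if "one layer" is read as a single sigmoid output unit with no hidden layer, then the network computes sigmoid(w * x + b), which is the same function as logistic regression, and training it on log loss fits the same model. The sketch below illustrates this under that reading, with hypothetical toy data and hyperparameters chosen only for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_single_neuron(xs, ys, lr=0.5, epochs=2000):
    """Gradient descent on mean log loss for y_hat = sigmoid(w * x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # For a sigmoid output with log loss, d(loss)/dz is (y_hat - y).
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


# Hypothetical toy data: the label is 1 when x is positive.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_single_neuron(xs, ys)
print(sigmoid(w * 1.5 + b) > 0.5)   # True
print(sigmoid(w * -1.5 + b) > 0.5)  # False
```

The two models diverge once a hidden layer with a nonlinear activation is added, because the decision boundary is then no longer restricted to a linear function of the input.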

2.3 Behavioral questions (19%)

Amazon’s leadership principles tie into every step of the interview process, and interviewers will test your affinity with them through behavioral questions. Even in the technical rounds, your interviewers are looking for you to live and breathe these 16 principles, so spend extra time studying them.

If you're not already familiar with Amazon's leadership principles, here is the full list:

  1. Customer Obsession - “Leaders start with the customer and work backwards. They work vigorously to earn and keep customer trust. Although leaders pay attention to competitors, they obsess over customers.”
  2. Ownership - “Leaders are owners. They think long term and don’t sacrifice long-term value for short-term results. They act on behalf of the entire company, beyond just their own team. They never say ‘that’s not my job.’”
  3. Invent and Simplify - “Leaders expect and require innovation and invention from their teams and always find ways to simplify. They are externally aware, look for new ideas from everywhere, and are not limited by ‘not invented here.’ Because we do new things, we accept that we may be misunderstood for long periods of time.”
  4. Are Right, A Lot - “Leaders are right a lot. They have strong judgement and good instincts. They seek diverse perspectives and work to disconfirm their beliefs.”
  5. Learn and Be Curious - “Leaders are never done learning and always seek to improve themselves. They are curious about new possibilities and act to explore them.”
  6. Hire and Develop the Best - “Leaders raise the performance bar with every hire and promotion. They recognize exceptional talent, and willingly move them throughout the organization. Leaders develop leaders and take seriously their role in coaching others. We work on behalf of our people to invent mechanisms for development like Career Choice.”
  7. Insist on the Highest Standards - “Leaders have relentlessly high standards — many people may think these standards are unreasonably high. Leaders are continually raising the bar and drive their teams to deliver high quality products, services, and processes. Leaders ensure that defects do not get sent down the line and that problems are fixed so they stay fixed.”
  8. Think Big - “Thinking small is a self-fulfilling prophecy. Leaders create and communicate a bold direction that inspires results. They think differently and look around corners for ways to serve customers.”
  9. Bias for Action - “Speed matters in business. Many decisions and actions are reversible and do not need extensive study. We value calculated risk taking.”
  10. Frugality - “Accomplish more with less. Constraints breed resourcefulness, self-sufficiency, and invention. There are no extra points for growing headcount, budget size, or fixed expense.”
  11. Earn Trust - “Leaders listen attentively, speak candidly, and treat others respectfully. They are vocally self-critical, even when doing so is awkward or embarrassing. Leaders do not believe their or their team’s body odor smells of perfume. They benchmark themselves and their teams against the best.”
  12. Dive Deep - “Leaders operate at all levels, stay connected to the details, audit frequently, and are skeptical when metrics and anecdote differ. No task is beneath them.”
  13. Have Backbone; Disagree and Commit - “Leaders are obligated to respectfully challenge decisions when they disagree, even when doing so is uncomfortable or exhausting. Leaders have conviction and are tenacious. They do not compromise for the sake of social cohesion. Once a decision is determined, they commit wholly.”
  14. Deliver Results - “Leaders focus on the key inputs for their business and deliver them with the right quality and in a timely fashion. Despite setbacks, they rise to the occasion and never settle.”
  15. Strive to be Earth’s Best Employer - “Leaders work every day to create a safer, more productive, higher performing, more diverse, and more just work environment. They lead with empathy, have fun at work, and make it easy for others to have fun. Leaders ask themselves: Are my fellow employees growing? Are they empowered? Are they ready for what's next? Leaders have a vision for and commitment to their employees' personal success, whether that be at Amazon or elsewhere.”
  16. Success and Scale Bring Broad Responsibility - “We started in a garage, but we're not there anymore. We are big, we impact the world, and we are far from perfect. We must be humble and thoughtful about even the secondary effects of our actions. Our local communities, planet, and future generations need us to be better every day. We must begin each day with a determination to make better, do better, and be better for our customers, our employees, our partners, and the world at large. And we must end every day knowing we can do even more tomorrow. Leaders create more than they consume and always leave things better than how they found them.”

As you prepare for your interviews, you'll want to be strategic about practicing "stories" from your past experiences that highlight how you've embodied each of the 16 principles listed above. We'll talk more about the strategy for doing this in section 3 below (the preparation section).

To help you start practicing, we've compiled the following list of behavioral questions asked at Amazon data science interviews. We recommend that you practice each of them. In addition, we also recommend practicing the behavioral questions in our Amazon behavioral interview guide, which covers a broader range of behavioral topics related to Amazon’s leadership principles.

In the questions below, we’ve suggested the leadership principle that each question may be addressing. For any principles that were not reflected in the Amazon data scientist interview questions on Glassdoor, we’ve added a question from the Amazon SDE interview.

Let's get to it.

Amazon data scientist interview questions: leadership principles (behavioral)

  • Tell me about yourself.
  • Tell me about a time you made something much simpler for customers. (Principle: Customer Obsession)
  • Tell me about a project you worked on that was not successful. What would you do differently? (Principle: Ownership)
  • What’s the most innovative idea you’ve ever had? (Principle: Invent and Simplify)
  • Tell me about a time you applied judgement to a decision when data was not available. (Principle: Are Right, A Lot)
  • Why data science? (Principle: Learn and Be Curious)
  • Where do you see yourself within the next 5 years? (Principle: Hire and Develop the Best)
  • How would you improve this [project on your resume] if you had more time? (Principle: Insist on the Highest Standards)
  • Tell me about a time that a goal was hard to achieve. What did you learn from that? (Principle: Insist on the Highest Standards)
  • Tell me about your most significant accomplishment. Why was it significant? (Principle: Think Big)
  • Did you come across a scenario where the deadline given to you for a project was earlier than expected? How did you deal with it and what was the result? (Principle: Bias for Action)
  • Describe the last time you figured out a way to keep an approach simple or to save on expenses. (Principle: Frugality)
  • What is the one feedback/complaint you always get from your colleagues? How are you working on such feedback? (Principle: Earn Trust)
  • Tell me about a time you used data to come up with data-driven statistics. How did you present your findings? (Principle: Dive Deep)
  • Describe a situation in which you had a disagreement with your manager. How did you handle it? (Principle: Have Backbone; Disagree and Commit)
  • How would you measure the impact of a business initiative? (Principle: Deliver Results)
  • Tell me about a time when you had two deadlines at the same time. How did you manage the situation? (Principle: Deliver Results)
  • What is the composition of your current team, and how are you encouraging their growth? (Principle: Strive to be Earth’s Best Employer)
  • How have you left a previous post better than you found it? (Principle: Success and Scale Bring Broad Responsibility)

Note: For the two questions related to the principles Strive to be Earth's Best Employer and Success and Scale Bring Broad Responsibility, we have created our own questions. As these principles are new at the time of publishing, we do not yet have Glassdoor data that addresses them.

2.4 Statistics questions (17%)

Amazon data scientists have to derive useful insights from large and complex datasets, which makes statistical analysis an important part of their daily work. Interviewers will look for you to demonstrate the robust statistical foundation needed in this role.

Review fundamental statistics and practice giving concise explanations of statistical terms, with an emphasis on applied statistics and probability. Some general topics that interviewers have asked about in previous interviews include A/B testing, normalization, and Bayes' theorem.

In addition to these general topics, you’ll find complete questions to work through below.

Amazon data scientist interview questions: statistics

  • What is a p-value?
  • What is the maximum likelihood of getting k heads when you toss a coin n times? Write down the mathematics behind it.
  • There are 4 red balls and 2 blue balls. If you pick 2 balls, what's the probability that they are not the same color?
  • How would you explain hypothesis testing to a newbie?
  • What is cross-validation?
  • How do you interpret OLS regression results?
  • Explain confidence intervals.
  • Name the five assumptions of linear regression.
  • Estimate the probability of a disease in one city, given that the probability is very low nationwide. You randomly ask 1,000 people in this city, and all of them respond negatively (no disease). What is the probability of disease in this city?
  • What is the difference between linear regression and a t-test?
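
Probability questions like the red and blue balls one above are easy to sanity-check by brute-force enumeration. The sketch below assumes the two balls are drawn without replacement, which the question does not state explicitly.

```python
from fractions import Fraction
from itertools import combinations

balls = ["red"] * 4 + ["blue"] * 2

# Enumerate every unordered pair of distinct balls (no replacement).
pairs = list(combinations(range(len(balls)), 2))
mixed = [p for p in pairs if balls[p[0]] != balls[p[1]]]

p_different = Fraction(len(mixed), len(pairs))
print(p_different)  # 8/15
```

This agrees with the closed-form calculation: 2 × (4/6) × (2/5) = 8/15, where the factor of 2 covers both orders (red then blue, and blue then red).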

3. How to prepare

Now that you know what questions to expect, let's focus on how to prepare. Below is our four-step prep plan for Amazon, which also applies to Amazon Web Services. If you're preparing for more companies than just Amazon, then check our generic data science interview preparation guide.

3.1 Learn about Amazon’s culture

Most candidates fail to do this. But before investing tens of hours preparing for an interview at Amazon, you should take some time to make sure it's actually the right company for you.

Amazon is prestigious, and it's tempting to assume that you should apply without considering things more carefully. But it's important to remember that the prestige of a job alone won't make you happy in your day-to-day work. It's the type of work and the people you work with that will.

If you know anyone who works at Amazon or used to work there, as a data scientist or in another role, talk to them to understand what the culture is like. The leadership principles we discussed above can give you a sense of what to expect, but there's no replacement for a conversation with an insider. 

Finally, we would also recommend reading the following resources:

3.2 Practice by yourself

As mentioned above, you'll encounter four main types of interview questions at Amazon: coding, machine learning, statistics, and leadership principles. Use each category below to find resources to help you prepare.

To get an idea of larger innovations and research that you may be taking part in as an Amazon data scientist, take a look at Amazon.science. For more information about how to prepare for case studies, take a look at our guide to data science case interviews.

For the coding interview questions, start with the video below that shows a step-by-step method by Amazon for answering programming questions. Practice the method using example questions such as those in section 2.1, or those related to coding-heavy Amazon positions (e.g. Amazon software development engineer interview guide).

Also, practice SQL and programming questions with medium and hard level examples on LeetCode, HackerRank, or StrataScratch. Take a look at Amazon’s technical topics page, which, although it’s designed around software development, should give you an idea of what they’re looking for. For even more help with SQL, read this analysis of the 3 "types" of SQL problems. Note that in the onsite rounds you’ll likely have to code on a whiteboard without being able to execute it, so practice writing through problems on paper.

For machine learning and statistics questions, Brilliant.org offers online courses designed around statistical probability and other useful topics, some of which are free. Kaggle also offers free courses around introductory and intermediate machine learning, as well as data cleaning, data visualization, SQL, and others.

Search for specific questions and answers around statistics, machine learning, data analysis, and others on StackExchange. Finally, you can post your own questions and discuss topics likely to come up in your interview on Reddit’s statistics and machine learning threads.

For behavioral interview questions, we recommend learning our step-by-step method for answering behavioral questions. You can then use that method to practice answering the example questions provided in section 2.3 above. This is especially important for Amazon’s leadership principles. Make sure you have at least one story or example for each of the principles, from a wide range of positions and projects.

Finally, a great way to practice all of these different types of questions is to interview yourself out loud. This may sound strange, but it will significantly improve the way you communicate your answers during an interview. Play the role of both the candidate and the interviewer, asking questions and answering them, just like two people would in an interview. Trust us, it works.

3.3 Practice with peers

Practicing by yourself will only take you so far. One of the main challenges of data scientist interviews at Amazon is communicating your different answers in a way that's easy to understand.

As a result, we strongly recommend practicing with a peer interviewing you. If possible, a great place to start is to practice with friends. This can be especially helpful if your friend has experience with data scientist interviews, or is at least familiar with the process.

3.4 Practice with ex-interviewers

Finally, you should also try to practice data science mock interviews with expert ex-interviewers, as they’ll be able to give you much more accurate feedback than friends and peers.

If you know a data scientist or someone who has experience running interviews at Amazon or another big tech company, then that's fantastic. But for most of us, it's tough to find the right connections to make this happen. And it might also be difficult to practice multiple hours with that person unless you know them really well.

Here's the good news. We've already made the connections for you. We’ve created a coaching service where you can practice 1-on-1 with ex-interviewers from leading tech companies. Learn more and start scheduling sessions today.