14 Smart Interview Questions for Data Scientists (Best Answers Included)

by Gonzalo March 20, 2019

Way back in 2012, Google’s Chief Economist Hal Varian predicted that data scientist would be the sexiest job of the 21st century, a prediction that today is backed up by, well, data. As technology advances at an exponential pace, whole industries, such as machine learning and artificial intelligence, are being born from data science. Hiring the right data scientist is no small feat. Fierce competition aside, you’ll need to find someone with the right skills, expertise, and the ability to put that knowledge into action. We at Jobbatical are here to help you find the perfect candidate with smart interview questions you should ask your next data scientist.

Outlook

Demand for data scientists has rocketed, and last year it was deemed the number one job in the U.S. The U.S.-based management consulting firm McKinsey & Company estimates that there will be a shortage of 140,000 to 190,000 data scientists in the U.S. alone. And, according to Business Insider, data science ranks as one of the most sought-after and well-paid roles, with a 4.8 out of 5 job satisfaction rating.

As with all things data, the role itself is in a constant state of flux. While gathering, processing, and interpreting data remain at the core of data science, there is now a big focus on the ability to engineer solutions that turn data into actionable insights.

Whether you’re a start-up collecting data for the first time or a well-established business just wanting to know more about the data you’ve collected, data science can be a real game changer for your organization.

The Perfect Fit

First, ease into the interview with some simple questions that will relax the interviewee while giving you some insight into their personality and communication skills.

What is a great day at the office?

While this question may seem obviously simple, you will learn a great deal about the candidate’s personality and work ethic. Do they jump right into work with a passion? Are they overly serious? Do they have a sense of humor? Let them talk freely about their ideal working day in terms of tasks, environment, and collaboration with colleagues; it’s a great ice-breaker.

In your opinion, is having more data a good thing?

A good answer considers both the pros and cons of large datasets. The candidate should cite drawbacks such as the extra time, storage, and computing power that more data requires. A well thought-out answer will also cover the pros: the impact that more data has on existing data models, and how it provides more information that can be used to isolate growth patterns, for example.

How do you think machine learning will impact the future role of a data scientist?

The candidate should show consideration for their existing tasks, identifying those that are likely to be replaced by machine learning in the immediate future, and, as a result, what the focus of the role could shift towards.  

Describe a project in which you encountered an obstacle and explain how you handled it.

A strong candidate will describe a situation, explain why it was a problem, and discuss how they handled it. Their answer should illustrate their problem-solving methodologies and standards for evaluating success, as well as give you an idea of how they handle work-related pressure.

Digging Deeper

Now that you’ve had a chance to assess the candidate’s character, dig a little deeper into the specifics of the data scientist role.  

What is your favorite statistical software and can you explain its pros and cons?

Whether it’s a traditional tool or an emerging technology, the interviewee should demonstrate not only an understanding of the software they use and why they use it but also the benefits the tool provides them and the business.

What are the basic steps taken when analyzing a project?

At a minimum, the candidate should cite a concise list of activities: identify the business problem, explore the data, model the data, analyze it, and reach an actionable conclusion.

How have you used data to impact business?

The candidate should provide an example of an analysis that they, individually or as part of a wider team, carried out and that led to a business change. This could be, for example, a 20% increase in sales due to improved targeted marketing.

Why is an agile mindset so important in today’s world of data science?

The applicant should show their understanding that as more data becomes available, they will need to adapt not only to different types of data but also to the changing ways businesses use that data.

Let’s Get Technical

These are straightforward questions related to data science, designed to reveal the candidate’s understanding of data and how it’s manipulated for analysis. They will also demonstrate the candidate’s ability to communicate effectively.

What is the importance of data cleaning?

Although data cleaning can be a time-consuming and cumbersome task, particularly when mining data from multiple sources, it is critical to ensuring accurate interpretation for business use.
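If you want a concrete picture to probe with, the kind of cleaning a candidate might describe could look something like the minimal sketch below. It assumes a handful of hypothetical records merged from two sources; the field names (id, city, revenue) and the rule that the first record wins on duplicates are illustrative choices, not a standard.

```python
# Hypothetical records merged from two sources; all field names are illustrative.
raw_records = [
    {"id": 1, "city": " Tallinn ", "revenue": "1200"},
    {"id": 2, "city": "tallinn", "revenue": "n/a"},
    {"id": 1, "city": "Tallinn", "revenue": "1200"},  # duplicate id
    {"id": 3, "city": "TARTU", "revenue": "850.5"},
]


def clean(records):
    """Normalize text, parse numbers, and drop duplicate ids (first one wins)."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        if rec["id"] in seen_ids:
            continue  # skip rows whose id has already been kept
        seen_ids.add(rec["id"])
        try:
            revenue = float(rec["revenue"])
        except ValueError:
            revenue = None  # keep the row but mark the value as missing
        cleaned.append({
            "id": rec["id"],
            "city": rec["city"].strip().title(),  # trim spaces, unify casing
            "revenue": revenue,
        })
    return cleaned


cleaned_records = clean(raw_records)
for row in cleaned_records:
    print(row)
```

A strong candidate will point out the judgment calls hiding in even a tiny example like this one, such as whether an unparseable value should be dropped, flagged, or imputed.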

What is data sampling?

Data sampling is a statistical analysis technique. It is used to select, manipulate, and examine a representative subgroup of data points in order to identify trends in the larger dataset.
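To make the idea tangible, here is a small sketch of simple random sampling, which is only one of several sampling methods a candidate might mention. The population is simulated, and the numbers (10,000 order values, a sample of 500, the fixed seed) are assumptions chosen purely for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated order values.
population = [rng.gauss(100, 20) for _ in range(10_000)]

# Simple random sample of 500 points, drawn without replacement.
sample = rng.sample(population, 500)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two means should land close together, which is the point of sampling: a well-chosen subgroup lets you estimate a property of the whole dataset at a fraction of the cost.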

What is linear regression?

Linear regression is a type of predictive analysis that models the relationship between a dependent variable and one or more independent variables by fitting a straight line to the data, so that the dependent variable can be predicted from the others.
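For reference, a one-variable linear regression can be sketched in a few lines using the ordinary least squares formulas. The data below (ad spend against sales) is made up for illustration, and the helper name fit_line is hypothetical; in practice a candidate would more likely reach for a statistics library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example data: ad spend (x) against sales (y).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 6.0 + intercept  # predict sales at a new spend level

print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={predicted:.2f}")
```

A good follow-up question is what the slope means in business terms: here, roughly how much sales change for each additional unit of ad spend, assuming the relationship really is linear.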

What is selection bias?

Selection bias, or sample bias, occurs when a sample of data is unrepresentative. It arises when the data that has been mined, cleaned, and prepared for modeling is not illustrative of the data that the model will see once it is in use.
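The effect is easy to demonstrate with a small simulation. The sketch below assumes a hypothetical customer base in which heavy users are happier than light users, then compares a survey of heavy users only with a simple random sample. Every number in it (the 30% share of heavy users, the score distributions, the sample size of 300) is an invented assumption.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: satisfaction scores (1-10) for 5,000 customers,
# where heavy users tend to score higher than light users.
customers = []
for _ in range(5_000):
    heavy_user = rng.random() < 0.3
    base = 8 if heavy_user else 5
    score = min(10, max(1, round(rng.gauss(base, 1.5))))
    customers.append({"heavy_user": heavy_user, "score": score})

# Biased sample: only heavy users are surveyed (say, via an in-app prompt).
biased = [c["score"] for c in customers if c["heavy_user"]][:300]

# Representative sample: a simple random sample of all customers.
representative = [c["score"] for c in rng.sample(customers, 300)]

true_mean = statistics.mean([c["score"] for c in customers])
biased_mean = statistics.mean(biased)
representative_mean = statistics.mean(representative)

print(f"true mean:           {true_mean:.2f}")
print(f"biased sample:       {biased_mean:.2f}")
print(f"representative mean: {representative_mean:.2f}")
```

The biased sample overstates satisfaction because of who was asked, not because of how the scores were analyzed, which is exactly the trap a candidate should be able to describe.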

Can you provide an example of where you have worked with data that had not been cleaned properly and explain what actions you took to rectify the issue?

The applicant should provide an example of a large dataset and the types of anomalies and inaccuracies within the data. Next, they should provide a description of the steps they took to fix the problem. Did they use any automation tools? How did they monitor for errors? How did they validate data accuracy? Finally, did they establish standardized processes to avoid future issues?  

Wrap it Up

What is the most interesting thing you’ve found from data?

This final question is designed to start an open conversation, to allow the candidate to speak freely about themselves and the data science profession. This question may spark further discussion or bring to mind more questions. The objective is to end on a casual note while still soliciting information about the candidate’s suitability.

An enthusiastic response indicates that the candidate not only has a strategic outlook on how their technical expertise can help the business but is also passionate about what they do.

A good data scientist is methodical in their approach, has excellent analytical and problem-solving skills, and can communicate confidently. They have a high level of business intelligence, enabling them to turn data into a valuable asset and keep the business trajectory skyward.

Looking for a more complete guide on interviewing?

Take a look at our comprehensive guide on How To Interview a Candidate for everything you need to know about getting the best out of any interview, and more interview question templates.
