Machine Learning Interview Questions

1) What is Machine learning?
Machine learning is a branch of computer science concerned with programming systems so that they
automatically learn and improve with experience. For example, robots are programmed so that they
can perform a task based on data they gather from sensors. The system automatically learns programs from
data.
2) What is the difference between data mining and machine learning?
Machine learning is the study, design and development of algorithms that give
computers the capability to learn without being explicitly programmed. Data mining, by contrast, can be
defined as the process of extracting knowledge or previously unknown, interesting patterns from
unstructured data. Machine learning algorithms are often used during this process.
3) What is ‘Overfitting’ in Machine learning?
In machine learning, overfitting occurs when a statistical model describes random error or noise
instead of the underlying relationship. Overfitting is normally observed when a model is excessively
complex, that is, when it has too many parameters relative to the number of training examples.
A model that has been overfit performs poorly on new data.
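The effect described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy data, the positions of the noisy labels, and the two hand-written models (`memorizer` and `threshold`) are hypothetical and not part of the original answer.

```python
# Toy data: the true rule is "label is 1 when x >= 10".
# Two training labels are flipped to act as random noise (hypothetical positions).
train_x = list(range(20))
noisy = {3, 15}
train_y = [int(x >= 10) ^ int(x in noisy) for x in train_x]

# Held-out points that follow the same rule, without noise.
test_x = [x + 0.25 for x in range(20)]
test_y = [int(x >= 10) for x in test_x]


def memorizer(x):
    """1-nearest-neighbour lookup: effectively one parameter per training point,
    so it reproduces the noisy labels as well as the real pattern."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def threshold(x):
    """A hand-chosen single-threshold rule that only captures the underlying relationship."""
    return int(x >= 10)


def accuracy(model, xs, ys):
    """Fraction of points for which the model's prediction matches the label."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)


for name, model in [("memorizer", memorizer), ("threshold", threshold)]:
    print(name,
          "train:", accuracy(model, train_x, train_y),
          "test:", accuracy(model, test_x, test_y))
```

On this toy data the memorizer scores 1.0 on the training set but 0.9 on the held-out points, while the simpler threshold rule scores 0.9 on the noisy training set and 1.0 on the held-out points.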
4) Why does overfitting happen?
The possibility of overfitting exists because the criteria used for training the model are not the same as the
criteria used to judge the efficacy of the model.
5) How can you avoid overfitting?
Overfitting can be reduced by using a lot of data; it tends to occur when you try to learn from a small
dataset. If you only have a small dataset and are forced to build a model from it, you can use a
technique known as cross-validation. In this method the dataset is split into two sections, a training
dataset and a testing dataset: the datapoints in the training dataset are used to fit the model, and the
testing dataset is used only to test it. In k-fold cross-validation the split is repeated so that each part
of the data serves once as the testing dataset.
In this technique, a model is usually given a dataset of known data on which training is run (the
training dataset) and a dataset of unknown data against which the model is tested. The idea of
cross-validation is to define a dataset to “test” the model during the training phase.
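A minimal sketch of k-fold cross-validation in plain Python follows. The helper names (`k_fold_indices`, `cross_validate`, `fit_majority`, `score_accuracy`) are hypothetical, the folds are taken in order without shuffling, and the majority-class "model" is only a stand-in for whatever model is actually being evaluated.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validate(xs, ys, fit, score, k=5):
    """Train on k-1 folds, test on the held-out fold, and average the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model,
                            [xs[i] for i in test_idx],
                            [ys[i] for i in test_idx]))
    return sum(scores) / len(scores)


def fit_majority(xs, ys):
    """Stand-in 'model': always predict the most common training label."""
    return max(set(ys), key=ys.count)


def score_accuracy(label, xs, ys):
    """Accuracy of a constant prediction on the held-out fold."""
    return sum(y == label for y in ys) / len(ys)


if __name__ == "__main__":
    xs = list(range(10))
    ys = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
    print(cross_validate(xs, ys, fit_majority, score_accuracy, k=5))
```

Because every datapoint is used for testing exactly once, the averaged score makes fuller use of a small dataset than a single split does.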
6) What is inductive machine learning?
Inductive machine learning is the process of learning by example, in which a system tries to induce a
general rule from a set of observed instances.
7) What are the five popular algorithms of Machine Learning?
a) Decision Trees
b) Neural Networks (backpropagation)
c) Probabilistic networks
d) Nearest Neighbor
e) Support vector machines
8) What are the different types of algorithm techniques in Machine Learning?
The different types of techniques in Machine Learning are:
a) Supervised Learning
b) Unsupervised Learning
c) Semi-supervised Learning
d) Reinforcement Learning
e) Transduction
f) Learning to Learn
9) What are the three stages to build the hypotheses or model in machine learning?
a) Model building
b) Model testing
c) Applying the model
10) What is the standard approach to supervised learning?
The standard approach to supervised learning is to split the set of examples into a training set and
a test set.
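The split described above can be sketched as follows. The function name `split_examples`, the 25% test fraction, and the fixed seed are assumptions chosen for illustration, not a prescribed setup.

```python
import random


def split_examples(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and split them into (training set, test set)."""
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train_set, test_set = split_examples(range(8))
    print(len(train_set), "training examples,", len(test_set), "test examples")
```

The model is then fitted on the training set only, and the test set is kept aside to estimate how well it generalises.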

The following is the interview experience of Mr. Rajat Gupta.

Some of the basic ML questions he was asked in the interview are as follows:

  • Explain the differences between linear and logistic regression. What are their assumptions? Why can’t linear regression be applied to categorical data?

  • Describe the Bias-Variance Tradeoff. Explain the concepts of underfitting and overfitting. What exactly is the point of regularisation?

  • Explain the various versions of gradient descent and their benefits and drawbacks.

  • Bagging vs. boosting: what’s the difference?

  • Explain Random Forest.

  • Give examples of use cases where precision and recall are used, and explain how to calculate them.

  • Describe the ROC curve. What do the ROC curve’s axes represent? Expound on the ROC curve’s two extreme points, (0, 0) and (1, 1), and on the point (0, 1).

  • What is AUC, and what is its physical interpretation? Is it feasible to get an AUC of less than 0.5? What is the absolute worst AUC you can achieve?

  • Explain the vanishing and exploding gradients problem. Outline various approaches to solving these problems.

  • Why is it necessary to include a pooling layer in CNNs? What is the difference between max pooling and average pooling?