Machine Learning Interview Questions (Level: Medium)

1) What are the ‘Training set’ and the ‘Test set’?
In machine learning and related areas of information science, the set of data used to discover a
potentially predictive relationship is known as the ‘Training Set’. The training set is the set of
examples given to the learner, while the test set is the set of examples held back from the learner
and used to test the accuracy of the hypotheses it generates. The training set and the test set are
kept distinct.
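A minimal sketch of holding back a test set, assuming a simple random split. The function name
train_test_split, the 25% test fraction and the toy examples are illustrative assumptions, not part
of the original answer.

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold back a fraction as the test set."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled examples are held back; the rest go to the learner.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (input, target) pairs.
examples = [(x, 2 * x + 1) for x in range(20)]
train_set, test_set = train_test_split(examples)
```

The learner would only ever see train_set; test_set is used afterwards to estimate accuracy on
unseen examples.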
2) What are the various approaches to machine learning?
The different approaches in Machine Learning are
a) Concept Vs Classification Learning
b) Symbolic Vs Statistical Learning
c) Inductive Vs Analytical Learning
3) What is not Machine Learning?
a) Artificial Intelligence
b) Rule based inference
4) What are the functions of ‘Unsupervised Learning’?
a) Find clusters of the data
b) Find low-dimensional representations of the data
c) Find interesting directions in data
d) Find interesting coordinates and correlations
e) Find novel observations / clean databases
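As an illustration of the first item, finding clusters, here is a minimal sketch of k-means on
one-dimensional data. The function name k_means_1d, the starting centers and the sample values are
assumptions chosen for the example; k-means is only one of many clustering methods.

```python
def k_means_1d(values, centers, iterations=10):
    """Alternate between assigning values to the nearest center and re-averaging centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled data with two visible groups.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = k_means_1d(values, centers=[0.0, 6.0])
```

No labels are supplied at any point; the grouping is discovered from the data alone.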
5) What are the functions of ‘Supervised Learning’?
a) Classification
b) Speech recognition
c) Regression
d) Time series prediction
e) String annotation
6) What is algorithm independent machine learning?
Machine learning whose mathematical foundations are independent of any particular classifier or
learning algorithm is referred to as algorithm independent machine learning.
7) What is the difference between artificial intelligence and machine learning?
Machine Learning is the design and development of algorithms that learn behaviours from empirical
data. Artificial intelligence includes machine learning but also covers other aspects such as
knowledge representation, natural language processing, planning, robotics etc.
8) What is a classifier in machine learning?
A classifier in Machine Learning is a system that takes a vector of discrete or continuous feature
values as input and outputs a single discrete value, the class.
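The input/output contract of a classifier can be sketched with a hand-written rule. The features
(a temperature and a rain flag), the threshold and the class names below are hypothetical; a learned
classifier would derive such a mapping from training data instead.

```python
def classify(features):
    """Map a feature vector (one continuous, one discrete value) to a single class label."""
    temperature_c, is_raining = features
    if is_raining:
        return "stay_in"
    return "go_out" if temperature_c >= 15.0 else "stay_in"

prediction = classify((20.0, False))
```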
9) What are the advantages of Naive Bayes?
A Naïve Bayes classifier converges more quickly than discriminative models like logistic regression
when its conditional independence assumption holds, so it needs less training data. Its main
disadvantage is that it cannot learn interactions between features.
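The point about feature interactions can be illustrated with a small sketch. It assumes a
categorical Naive Bayes with add-one smoothing, written from scratch; the function names and the
XOR data are illustrative. In XOR the label depends only on the interaction between the two
features, so the per-feature statistics carry no signal.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count classes and, per (feature index, class), how often each value occurs."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts

def posterior(model, row, n_values=2):
    """Return P(class | row) under the independence assumption, with add-one smoothing."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total
        for i, value in enumerate(row):
            score *= (feature_counts[(i, label)][value] + 1) / (count + n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

# XOR: neither feature is informative on its own.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]
model = train_naive_bayes(rows, labels)
xor_posteriors = [posterior(model, row) for row in rows]
```

Every posterior comes out as an even split between the two classes, because each feature looks
identical across classes when considered separately.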
10) In what areas is Pattern Recognition used?
Pattern Recognition can be used in
a) Computer Vision
b) Speech Recognition
c) Data Mining
d) Statistics
e) Information Retrieval
f) Bioinformatics
11) What is Genetic Programming?
Genetic programming is an evolutionary technique used in machine learning. It evolves a population
of candidate programs by repeatedly testing them and selecting the best among them.
12) What is Inductive Logic Programming in Machine Learning?
Inductive Logic Programming (ILP) is a subfield of machine learning which uses logic programming
to represent background knowledge, examples and hypotheses.
13) What is Model Selection in Machine Learning?
Model Selection is the process of choosing one model from a set of candidate mathematical models
that describe the same data set. Model selection is applied in the fields of statistics,
machine learning and data mining.
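One common way to carry out model selection is to compare candidates by their error on held-out
data. The sketch below assumes two candidate models (a constant and a least-squares line), a mean
squared error criterion and made-up data; all names are illustrative.

```python
def fit_constant(points):
    """Candidate 1: always predict the mean target."""
    mean_y = sum(y for _, y in points) / len(points)
    return lambda x: mean_y

def fit_line(points):
    """Candidate 2: ordinary least-squares line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)

def mean_squared_error(model, points):
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

def select_model(candidates, train, validation):
    """Fit every candidate on the training data and pick the lowest validation error."""
    scores = {name: mean_squared_error(fit(train), validation)
              for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical data: a linear trend with a small alternating disturbance.
data = [(x, 3.0 * x + 2.0 + (0.5 if x % 2 else -0.5)) for x in range(12)]
train, validation = data[:8], data[8:]
best_name, scores = select_model({"constant": fit_constant, "line": fit_line},
                                 train, validation)
```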
14) What are the two methods used for calibration in Supervised Learning?
The two methods used for predicting good probabilities in Supervised Learning are
a) Platt Calibration
b) Isotonic Regression
These methods are designed for binary classification, and extending them to multiclass problems is
not trivial.
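Both methods can be sketched in a few lines for the binary case. The code assumes Platt
calibration is fitted by plain gradient descent on the log loss (the original method uses a more
careful optimizer and smoothed targets) and that isotonic regression is fitted with the
pool-adjacent-violators procedure. The scores, labels, learning rate and step count are made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            error = sigmoid(a * s + b) - y
            grad_a += error * s / n
            grad_b += error / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

def fit_isotonic(labels_sorted_by_score):
    """Pool adjacent violators: merge neighbouring blocks until the means are non-decreasing."""
    blocks = []  # each block is [sum of labels, count]
    for y in labels_sorted_by_score:
        blocks.append([float(y), 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Hypothetical classifier scores (sorted ascending) with their true binary labels.
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
a, b = fit_platt(scores, labels)
calibrated = [sigmoid(a * s + b) for s in scores]
isotonic = fit_isotonic(labels)
```

Platt calibration yields a smooth S-shaped mapping with only two parameters, while isotonic
regression yields a step function that can follow any monotonic shape.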
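15) Which method is frequently used to prevent overfitting?
Techniques such as cross-validation and regularization are commonly used to prevent overfitting. In
the calibration setting of the previous question, ‘Isotonic Regression’ is preferred when there is
sufficient data, because it is prone to overfitting on small data sets.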
16) What is the difference between heuristics for rule learning and heuristics for decision
trees?
The difference is that the heuristics for decision trees evaluate the average quality of a number of
disjoint sets, while rule learners only evaluate the quality of the set of instances that is covered
by the candidate rule.
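The contrast can be sketched as follows, assuming size-weighted entropy as the decision tree
heuristic and precision as the rule learning heuristic. Both are common choices but not the only
ones; the function names and the labels are hypothetical.

```python
import math

def entropy(labels):
    n = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result

def split_quality(subsets):
    """Decision-tree style: impurity averaged over all disjoint subsets, weighted by size."""
    total = sum(len(s) for s in subsets)
    return sum(len(s) / total * entropy(s) for s in subsets)

def rule_quality(covered, target):
    """Rule-learner style: precision on the covered instances only."""
    return covered.count(target) / len(covered)

# A candidate test splits eight instances into two disjoint sets.
left = ["yes", "yes", "yes", "no"]
right = ["no", "no", "yes", "no"]
tree_score = split_quality([left, right])  # looks at both sets
rule_score = rule_quality(left, "yes")     # looks only at what the rule covers
```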
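17) What is a Perceptron in Machine Learning?
In Machine Learning, the Perceptron is an algorithm for supervised learning of binary classifiers.
It is a linear classifier that assigns an input to one of two classes based on a weighted sum of
its features; multiclass variants also exist.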
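A minimal sketch of the classic single-layer perceptron learning rule in its simplest two-class
form, trained here on the logical AND function. The function names, learning rate and epoch count
are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Output 1 if the weighted sum plus bias is positive, otherwise 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Adjust weights only when a sample is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            update = lr * (y - predict(weights, bias, x))  # zero when the prediction is correct
            weights = [w + update * xi for w, xi in zip(weights, x)]
            bias += update
    return weights, bias

# Logical AND is linearly separable, so the rule converges.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

On data that is not linearly separable, such as XOR, this rule does not converge.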
18) Explain the two components of a Bayesian logic program?
A Bayesian logic program consists of two components. The first component is a logical one; it
consists of a set of Bayesian Clauses, which capture the qualitative structure of the domain. The
second component is a quantitative one; it encodes the quantitative information about the domain.
19) What are Bayesian Networks (BN)?
A Bayesian Network is a graphical model that represents the probabilistic relationships among a set
of variables.
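A small sketch of how such a network factorizes a joint distribution, assuming a hypothetical
three-variable network in which Rain and Sprinkler are independent parents of WetGrass. All
probabilities are made-up numbers for illustration.

```python
from itertools import product

p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
# P(wet = True | rain, sprinkler)
p_wet_given = {
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) = P(rain) * P(sprinkler) * P(wet | rain, sprinkler)."""
    p_wet_true = p_wet_given[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_wet_true if wet else 1.0 - p_wet_true)

# Inference by enumeration: sum the joint over the unobserved variables.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))
p_wet = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))
p_rain_given_wet = sum(joint(True, s, True) for s in [True, False]) / p_wet
```

Observing wet grass raises the probability of rain above its prior of 0.2, which is the kind of
reasoning the graph structure supports.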
20) Why are instance based learning algorithms sometimes referred to as Lazy learning
algorithms?
Instance based learning algorithms are also referred to as Lazy learning algorithms because they
delay the induction or generalization process until classification is performed.
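The delayed generalization can be sketched with a k-nearest-neighbour classifier, a typical
instance based learner. The class name NearestNeighbors, the choice k=3 and the sample points are
assumptions for illustration.

```python
from collections import Counter

class NearestNeighbors:
    def __init__(self, k=3):
        self.k = k
        self.examples = []

    def fit(self, points, labels):
        # "Training" only stores the instances; no model is induced here.
        self.examples = list(zip(points, labels))
        return self

    def predict(self, query):
        # Generalization happens now: rank stored instances by squared distance to the query.
        def squared_distance(example):
            return sum((a - b) ** 2 for a, b in zip(example[0], query))
        nearest = sorted(self.examples, key=squared_distance)[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

# Hypothetical two-dimensional points in two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["a", "a", "a", "b", "b", "b"]
model = NearestNeighbors(k=3).fit(points, labels)
near_a = model.predict((1.1, 1.0))
near_b = model.predict((4.0, 4.0))
```

All of the computational work sits in predict, which is why such learners are cheap to train but
comparatively expensive at classification time.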