Supervised machine learning algorithms define models that capture relationships among data. Classification is an area of supervised machine learning that tries to predict which class or category some entity belongs to, based on its features.
For example, you might analyze the employees of some company and try to establish how some of their features, or variables, depend on others. Such features include the level of education, number of years in a current position, age, salary, odds of being promoted, and so on. The set of data related to a single employee is one observation. The features or variables can take one of two forms:
- Independent variables, also called inputs or predictors, don’t depend on other features of interest (or at least you assume so for the purpose of the analysis).
- Dependent variables, also called outputs or responses, depend on the independent variables.
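To make the terminology concrete, here is a minimal sketch in plain Python of how observations, inputs, and outputs might be laid out. The field names and every value are invented for illustration. Treating `promoted` as the dependent variable is an assumption made for this example, not something the employee data dictates.

```python
# Hypothetical employee records: each dictionary is one observation.
# All names and values here are made up for illustration only.
employees = [
    {"education_years": 16, "years_in_position": 3, "age": 29, "salary": 52000, "promoted": 0},
    {"education_years": 18, "years_in_position": 5, "age": 35, "salary": 71000, "promoted": 1},
    {"education_years": 12, "years_in_position": 1, "age": 24, "salary": 38000, "promoted": 0},
    {"education_years": 20, "years_in_position": 7, "age": 41, "salary": 90000, "promoted": 1},
]

# Assumed split: these features are treated as independent variables (inputs)...
input_names = ["education_years", "years_in_position", "age", "salary"]
# ...and this one as the dependent variable (output), here a binary class label.
output_name = "promoted"

# X holds one row of inputs per observation; y holds the matching outputs.
X = [[row[name] for name in input_names] for row in employees]
y = [row[output_name] for row in employees]

print(len(X), "observations with", len(X[0]), "inputs each")
print("inputs of the first observation:", X[0])
print("outputs:", y)
```

Because the output in this sketch takes only the values 0 and 1, predicting it from the inputs would be a classification problem in the sense described above.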