Regression is a technique for investigating the relationship between one or more independent variables (features) and a dependent variable (outcome). In machine learning, it’s used for predictive modeling, where an algorithm learns to predict a continuous outcome.
Some examples of regression are:
- Predicting rainfall from temperature and other factors
- Determining market trends
- Predicting road accidents caused by rash driving
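To make the idea of predicting a continuous outcome concrete, here is a minimal sketch of the simplest case: one feature and a straight-line fit by ordinary least squares. The data values and the names `fit_line`, `feature`, and `outcome` are made up for illustration and are not tied to any of the examples above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: independent variable and continuous outcome.
feature = [1, 2, 3, 4, 5]
outcome = [2.1, 4.0, 6.2, 7.9, 10.1]

slope, intercept = fit_line(feature, outcome)

# Use the fitted line to predict the outcome for an unseen feature value.
prediction = intercept + slope * 6
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

In practice a library such as scikit-learn would usually handle the fitting, but the arithmetic above is all a one-feature linear regression needs.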
Terminology Related to Regression Analysis:
- Dependent Variable: The main factor in regression analysis that we want to predict or understand is called the dependent variable. It is also called the target variable.
- Independent Variable: The factors that affect the dependent variable, or that are used to predict its values, are called independent variables. They are also called predictors.
- Outliers: An outlier is an observation with a very low or very high value compared with the other observed values. Outliers can distort the fitted model, so they should be detected and handled (for example, investigated, corrected, or removed) before fitting.
- Multicollinearity: If the independent variables are highly correlated with each other, the condition is called multicollinearity. It should be avoided in the dataset, because it makes it hard to determine which variable affects the target most.
- Underfitting and Overfitting: If our algorithm works well on the training dataset but not on the test dataset, the problem is called overfitting. If our algorithm does not perform well even on the training dataset, the problem is called underfitting.
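The last three terms can be checked numerically. The sketch below is only illustrative and uses the Python standard library: the 1.5 × IQR rule is one common rule of thumb for flagging outliers, a pairwise Pearson correlation is a simple first check for multicollinearity, and a constant predictor and a nearest-neighbour "memorizer" stand in for an underfitting and an overfitting model. All data values and function names are hypothetical.

```python
from statistics import mean, quantiles


def iqr_outliers(values):
    """Flag values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def pearson(a, b):
    """Pearson correlation between two predictors (near +/-1 means collinear)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def mse(model, xs, ys):
    """Mean squared error of a model (a function of x) on a dataset."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


# Outliers: one reading is far from the rest.
readings = [10, 11, 11, 12, 12, 13, 95]
outliers = iqr_outliers(readings)  # [95]

# Multicollinearity: x2 is roughly 2 * x1, so the two predictors overlap.
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson(x1, x2)

# Underfitting vs. overfitting: compare training error with test error.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.2, 3.9, 6.1, 8.3, 9.8, 12.1]
test_x = [1.5, 2.5, 3.5, 4.5, 5.5]
test_y = [3.0, 5.0, 7.0, 9.0, 11.0]

train_mean = mean(train_y)


def constant_model(x):
    """Too simple: ignores x entirely and always predicts the training mean."""
    return train_mean


def memorizer(x):
    """Too flexible: returns the stored outcome of the nearest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


const_train = mse(constant_model, train_x, train_y)
const_test = mse(constant_model, test_x, test_y)
memo_train = mse(memorizer, train_x, train_y)
memo_test = mse(memorizer, test_x, test_y)

print("outliers:", outliers)
print(f"correlation between x1 and x2: {r:.3f}")
print(f"constant model  train MSE={const_train:.2f}  test MSE={const_test:.2f}")
print(f"memorizer       train MSE={memo_train:.2f}  test MSE={memo_test:.2f}")
```

The constant model has a large error on both datasets, which is the underfitting pattern. The memorizer has zero error on the training data but a clearly higher error on the test data, which is the overfitting pattern.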