Cross-validation is a technique for estimating how well a machine learning model performs on data it has not seen during training. In k-fold cross-validation, the parameter k is the number of groups into which the dataset is split.
The process starts by randomly shuffling the entire dataset, which is then divided into k groups, also known as folds. The following procedure is applied to each fold in turn:
- Assign one fold as the test set and the remaining k-1 folds as the training set.
- Train the model on the training set. Each cross-validation iteration trains a new model that is independent of the models from prior iterations.
- Evaluate the model on the test set and save the score for that iteration.
- After all k iterations, average the saved scores to obtain the final score.
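
The procedure above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the helper names (`k_fold_indices`, `cross_validate`), the toy `MeanPredictor` model, the mean squared error metric, and the synthetic data are all assumptions chosen to keep the example self-contained.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding deals out the shuffled indices so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


class MeanPredictor:
    """Toy model (illustrative only): always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def mean_squared_error(y_true, y_pred):
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def cross_validate(X, y, k, seed=0):
    """Return the per-fold scores and their average."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # The other k-1 folds together form the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        # A fresh model is trained in every iteration, independent of earlier ones.
        model = MeanPredictor().fit(
            [X[j] for j in train_idx], [y[j] for j in train_idx]
        )
        preds = model.predict([X[j] for j in test_idx])
        scores.append(mean_squared_error([y[j] for j in test_idx], preds))
    # The final score is the average over all k iterations.
    return scores, mean(scores)


# Synthetic data, made up for the example.
X = [[float(i)] for i in range(20)]
y = [2.0 * i for i in range(20)]
fold_scores, final_score = cross_validate(X, y, k=5)
print(fold_scores, final_score)
```

In practice a library routine would usually replace the hand-written splitting, but writing it out makes clear that every sample lands in the test set exactly once and in the training set k-1 times.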