What is data mining?

Data mining is the practice of searching large amounts of data for patterns and trends that, through data analysis, can be used to solve problems. Data mining tools and applications help businesses anticipate trends and make better-informed decisions.

Data mining, which uses advanced statistical techniques to identify useful information in data sets, is a critical element of data analysis and one of the key areas of data science. Data mining is one step in the knowledge discovery in databases (KDD) process, a data science methodology for gathering, processing, and analyzing data. The terms data mining and KDD are sometimes used interchangeably, but they are more commonly regarded as distinct concepts.

Data mining is the process of analyzing large amounts of data in an effort to find correlations, patterns, and insights.

A lot of data mining is about experimentation: computing all sorts of correlation coefficients and summary statistics (mean, mode, standard deviation); looking at outliers and trying to find what makes them different; testing different hypotheses to see whether the data supports them; and so on.
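
To make that kind of experimentation concrete, here is a minimal sketch in Python using only the standard library. The data set (daily visits and orders for a hypothetical shop), the helper names summarize, pearson, and outliers, and the z-score cutoff of 2.0 are all assumptions made up for illustration; real data mining work would usually rely on dedicated libraries and far larger data.

```python
import statistics


def summarize(values):
    """Return the summary statistics mentioned above for one column."""
    return {
        "mean": statistics.mean(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def outliers(values, threshold=2.0):
    """Values whose z-score exceeds the threshold (an assumed cutoff)."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Made-up toy data: one unusually busy day at the end of the series.
visits = [100, 120, 120, 130, 140, 150, 160, 400]
orders = [10, 12, 12, 13, 14, 15, 16, 41]

print(summarize(visits))
print(pearson(visits, orders))
print(outliers(visits))
```

In this toy data the last day stands out as an outlier, which is the sort of observation that would prompt a follow-up question such as "what was different about that day?".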

The mining analogy is quite apt, actually. You have a mountain of data (rocks) and your goal is to mine it to find the nuggets of insight (gold).