Data analysis is best practiced on real-world data sets, which can be obtained from Kaggle. Students and beginners can start by analysing data in Excel with pre-defined formulae, pivot tables, charts, and statistical plug-ins, but sooner or later they need to progress to Python for data analysis.
A major challenge students may face is that the data sets on Kaggle are already cleansed and ready for analysis, so students do not get real experience of working with messy, unstructured data.
- Mark your training and test set.
- Use a basic implementation of an ANN, SVM, etc. to learn and predict from the data.
- Test your implementation and record its performance.
- Now modify your learner to improve performance by a large margin, for example through better data cleaning, evolutionary or other algorithms on top, or modifying the initial data.
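
The steps above can be sketched roughly as follows. This is only a minimal illustration, assuming scikit-learn is installed; the bundled breast-cancer data set, the 25% test split, and feature scaling as the "improvement" are all my own choices for the example, not the only way to do it.

```python
# Minimal sketch of the workflow: split, baseline, record, improve.
# Assumptions: scikit-learn is available; the data set, split size and
# the choice of scaling as the improvement are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a small data set that ships with scikit-learn (no download needed).
X, y = load_breast_cancer(return_X_y=True)

# Step 1: mark the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 2: basic SVM on the raw, unscaled features.
baseline = SVC()
baseline.fit(X_train, y_train)

# Step 3: test the implementation and record its accuracy.
baseline_acc = baseline.score(X_test, y_test)

# Step 4: modify the learner, here by standardising the features first.
improved = make_pipeline(StandardScaler(), SVC())
improved.fit(X_train, y_train)
improved_acc = improved.score(X_test, y_test)

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"improved accuracy: {improved_acc:.3f}")
```

Comparing the two printed numbers on the same held-out test set shows whether the change actually helped. The same pattern works if you swap in a different learner or a different preprocessing step.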