If you are trying to build ML projects, watch out for the following pitfalls:
- Missing software infrastructure. You need a way to stream, store, and process the required data. Having no deployment system is another sure pitfall.
- Mislabeling during data collection.
- Not validating manually. Do the predictions make sense? Are the features really important, or are they just noise or predictive by chance? Are there edge cases that can produce erratic results?
- Training-serving skew. The distribution of the features that correspond to your labels changes over time; have you identified the time period over which that happens?
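
One way to watch for the last pitfall is to compare a feature's distribution in the training data against what the model sees in production. The sketch below uses the population stability index (PSI), which is one common drift measure among several, not the only option. The function name, the bin count, and the synthetic Gaussian samples are all assumptions made for illustration; real data and thresholds will differ.

```python
import math
import random


def population_stability_index(train_values, serve_values, n_bins=10):
    """Compare two samples of one numeric feature; a larger value means more drift."""
    lo = min(train_values)
    hi = max(train_values)
    # Equal-width bins over the training range; fall back to 1.0 for a constant feature.
    width = (hi - lo) / n_bins or 1.0

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so serving values outside the training range land in the edge bins.
            idx = int((v - lo) / width)
            idx = max(0, min(n_bins - 1, idx))
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    train_frac = bin_fractions(train_values)
    serve_frac = bin_fractions(serve_values)
    return sum((s - t) * math.log(s / t) for t, s in zip(train_frac, serve_frac))


# Hypothetical data: one serving sample matches training, the other has a shifted mean.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
serve_same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
serve_shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(train, serve_same)
psi_shifted = population_stability_index(train, serve_shifted)
print(f"PSI, same distribution:    {psi_same:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

Running a check like this per feature over successive time windows (say, weekly) gives a rough answer to the question of when the skew sets in. What PSI value counts as "too much" is a judgment call that depends on the feature and the model.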