ML pitfalls to avoid

If you are trying to build ML projects, watch out for the following pitfalls:

  1. Lacking the software infrastructure to stream, store, and process the required data. Not having a deployment system is an equally sure pitfall.
  2. Mislabeling during data collection.
  3. Not validating manually (do the predictions make sense? are the features really important, or is it just noise or prediction by chance? are there edge cases that can produce erratic results?)
  4. Training-serving skew (the distribution of the features corresponding to each label changes over time; have you identified the time period over which it shifts?)
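
One way to watch for the skew described in pitfall 4 is to compare a feature's values at training time against the values seen at serving time. The sketch below is a minimal illustration, not a prescribed method: it computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs) using only the standard library. The function names `ks_statistic` and `has_skew`, the `threshold` of 0.1, and the synthetic Gaussian data are all assumptions made for the example; a real project would pick a threshold per feature and run the check on a schedule.

```python
import bisect
import random


def ks_statistic(train_values, serve_values):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    train_sorted = sorted(train_values)
    serve_sorted = sorted(serve_values)
    n, m = len(train_sorted), len(serve_sorted)
    max_gap = 0.0
    # The largest gap between two step functions occurs at a sample point,
    # so it is enough to evaluate both CDFs at every observed value.
    for x in train_sorted + serve_sorted:
        cdf_train = bisect.bisect_right(train_sorted, x) / n
        cdf_serve = bisect.bisect_right(serve_sorted, x) / m
        max_gap = max(max_gap, abs(cdf_train - cdf_serve))
    return max_gap


def has_skew(train_values, serve_values, threshold=0.1):
    """Flag a feature as skewed when the KS statistic exceeds the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return ks_statistic(train_values, serve_values) > threshold


# Synthetic data (an assumption for the demo): one serving sample drawn
# from the training distribution, and one whose mean has shifted by 1.0.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(2000)]
serve_same = [rng.gauss(0.0, 1.0) for _ in range(2000)]
serve_shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]

print("same distribution skewed?   ", has_skew(train, serve_same))
print("shifted distribution skewed?", has_skew(train, serve_shifted))
```

Running the same comparison over successive time windows of serving data (for example, week by week) is one way to estimate the period over which the distribution drifts. Note that this only compares one feature at a time; it does not detect changes in how features relate to labels.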