There are four commonly used ways, or architectures, for deploying an ML model:
- Model Embedded in Application (detailed post)
- Dedicated model API (detailed post)
- Model published as Data (Streaming)
- Offline predictions
The choice depends on the type of the final application, along with its utility, flexibility, and ease of use. No single approach is inherently better or worse than another, though the offline-predictions mode has started to become slightly outdated.
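To make one of these patterns concrete, here is a minimal sketch of the dedicated model API approach: a small web service wraps a trained model and exposes a prediction endpoint. This is only an illustrative example, not a prescribed implementation; the Flask framework, the `model.joblib` artifact name, and the JSON payload shape are all assumptions.

```python
# Minimal sketch of the "dedicated model API" pattern.
# Assumptions: a scikit-learn estimator saved as model.joblib (hypothetical path),
# and requests that send a JSON body like {"features": [5.1, 3.5, 1.4, 0.2]}.
import joblib
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # load the trained model once at startup


@app.route("/predict", methods=["POST"])
def predict():
    # Parse the incoming JSON payload and reshape into a single-row feature matrix
    payload = request.get_json(force=True)
    features = np.array(payload["features"]).reshape(1, -1)
    prediction = model.predict(features)
    return jsonify({"prediction": prediction.tolist()})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

A client would then call the service over HTTP (e.g. `POST /predict` with the features as JSON), which is what distinguishes this pattern from embedding the model directly inside the application.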