Dedicated ML Model API - Deployment Architecture

In this type of architecture, a microservice is set up that serves predictions to the main application independently through an API. The interface could be REST, gRPC, SOAP, or a message queue. A simple way to do this is to set up a Flask server that returns predictions through a REST API. Though this architecture is slightly more complex than the embedded approach, it is more flexible: the microservice can be upgraded independently, without changing the main application, as long as the API endpoint and its request and response format stay the same.
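
The split between the model service and the main application can be sketched as follows. This is a minimal illustration, not a production setup: it uses Python's standard-library http.server in place of Flask so that it runs without third-party packages, and the /predict path, the JSON field names, and the predict function (a stand-in that averages its inputs) are all assumptions made for the example.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import Request, urlopen


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


class PredictionHandler(BaseHTTPRequestHandler):
    """Model microservice: accepts POST /predict with a JSON body."""

    def do_POST(self):
        if self.path != "/predict":  # assumed endpoint name
            self.send_error(404, "Unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = {"prediction": predict(payload["features"])}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; real services should log requests


def start_model_service(port=0):
    """Start the service on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), PredictionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_prediction(base_url, features):
    """Main-application side: it only knows the endpoint and the JSON contract."""
    request = Request(
        base_url + "/predict",
        data=json.dumps({"features": features}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.loads(response.read())["prediction"]


if __name__ == "__main__":
    service = start_model_service()
    url = "http://127.0.0.1:%d" % service.server_address[1]
    print(request_prediction(url, [1.0, 2.0, 3.0]))
    service.shutdown()
    service.server_close()
```

Because request_prediction depends only on the URL and the JSON format, the body of predict could be replaced with a newer model without touching the calling code, which is the flexibility described above.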