Preprocess-prediction pattern



There are cases when the requirements for preprocess and the one for prediction are different. Usually the preprocess runs normalization or standardization for numericals, one-hot encoding for categories, vocab filtering for NLP, resizing for image, and so forth, and they can be pretrained or can be rule-based. On the other hand, the prediction implies pretrained and built, or binalized, model file that can take input to return inference. The preprocess maybe just defined in Python, and the prediction model may be built with Tensorflow or ONNX in binary. These differences enables to build separate servers or containers to optimize development and operation of each resource.
Since you divide preprocess and prediction model into separate resources, you need to consider tuning, network connection, versioning for each of them. It makes the system complex than the web single pattern, while the preprocess-prediction pattern enables you to use the serving resource efficient, as well as independent development and fault isolation. There are many cases to apply the pattern in case of using deep learning for inference.
The diagram2 shows deploying a proxy in front of the preprocess and prediction, to make them as microservices. In this case, the proxy will mediate data retrieval, preprocess and prediction services. It will allow you to develop those resources separately with independent library, language and infrastructure. However, it will add components to manage that requires you to understand all the components to troubleshoot in case of incident.


Simple prep-pred pattern


Microservice prep-pred pattern




Needs consideration