Training code in serving pattern



The code and logic used for machine learning development, which covers training, evaluation, experimentation and everything else before release, usually differ from those used for production serving. The difference may also extend to the libraries each side depends on. For instance, your training code will contain data splitting, batch training, parameter tuning and so on, but you will rarely use any of these for serving. This divergence between training and serving is one of the difficulties of putting machine learning into a production system. Keep unused code out of production in order to limit the scope of updates, isolate faults and keep the code easy to read. In particular, do not ship training-specific code into your serving environment.
The resources used for training, such as CPU, GPU, RAM, network and storage, may differ from those used for serving. In web services, the network and storage used for training may be restricted to internal access, while those used for serving are open to the public. In deep learning, it is very common to train on GPU and serve on CPU, or to use different GPUs for training and for inference. For these reasons, an environment configuration written for training may be unnecessary, or an outright misconfiguration, in the serving environment. A training setting that serving simply ignores is harmless, but one that actually takes effect in serving may cause an incident.
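One way to keep training settings from leaking into serving is to give each environment its own configuration type and to reject unknown keys when the serving configuration is loaded. The following is a minimal sketch under assumptions: the class names, field names, default values and the load_serving_config helper are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TrainingConfig:
    # Hypothetical training-only settings: GPU device, large batches,
    # and a dataset location that is reachable only internally.
    device: str = "cuda"
    batch_size: int = 256
    dataset_path: str = "/internal/datasets/train"


@dataclass(frozen=True)
class ServingConfig:
    # Hypothetical serving-only settings: CPU inference behind a public port.
    device: str = "cpu"
    max_batch_size: int = 8
    port: int = 8080


def load_serving_config(raw: dict) -> ServingConfig:
    """Build a ServingConfig, failing loudly on keys serving does not define."""
    allowed = {f.name for f in fields(ServingConfig)}
    unknown = sorted(set(raw) - allowed)
    if unknown:
        # A training setting reached the serving environment: stop at startup
        # instead of letting it silently take effect.
        raise ValueError(f"settings not valid for serving: {unknown}")
    return ServingConfig(**raw)
```

With this shape, load_serving_config({"port": 9000}) returns a serving configuration, while load_serving_config({"batch_size": 256}) raises ValueError at startup rather than causing an incident later.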
On the other hand, training and serving do need to share some code and configuration in order to keep performance, latency and accuracy consistent. It is important to distinguish what to separate from what to standardize.
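Preprocessing is a typical candidate for the shared part: if serving transforms inputs differently from training, accuracy degrades without any error being raised. The sketch below assumes a simple standardization step and uses hypothetical function names. In a real project, fit_scaler would live with the training code, while preprocess would sit in a small module imported by both sides.

```python
from typing import Dict, List, Sequence


def fit_scaler(rows: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Training only: compute the per-column mean and standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = []
    for col, mean in zip(columns, means):
        std = (sum((x - mean) ** 2 for x in col) / n) ** 0.5
        # Avoid division by zero for constant columns.
        stds.append(std if std > 0 else 1.0)
    return {"mean": means, "std": stds}


def preprocess(row: Sequence[float], params: Dict[str, List[float]]) -> List[float]:
    """Shared: apply the same standardization in training and in serving."""
    return [(x - m) / s for x, m, s in zip(row, params["mean"], params["std"])]
```

Training calls fit_scaler once and stores the resulting parameters next to the model. Serving loads those parameters and calls only preprocess, so the splitting, fitting and tuning code never has to be deployed.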





