This title appears in the Scientific Report: 2020

Designing a Data Logistics and Model Deployment Service
Rybicki, Jedrzej (Corresponding author)
Jülich Supercomputing Center; JSC
IARIA 2020
ALLDATA 2020, The Sixth International Conference on Big Data, Small Data, Linked Data and Open Data, Lisbon (Portugal), 2020-02-23 - 2020-02-27
Contribution to a book
Contribution to a conference proceedings
Data-Intensive Science and Federated Computing
In Big Data applications, it is often necessary to integrate data from different sources to fuel machine learning models. In this paper, we describe a prototype implementation of data logistics and model deployment services. Our goal was to create a one-stop-shop solution that supports a generic data science life cycle. The life cycle starts with formalized and repeatable data selection and processing, provided by the data logistics service. The data are then used for model creation in a typical machine learning fashion. The resulting model is placed in a model repository to enable easy model management, sharing, and deployment. The functionality of the proposed prototype is verified with a use case from environmental science.