Description: Designing a Data Logistics and Model Deployment Service

This title appears in the Scientific Report: 2020

Designing a Data Logistics and Model Deployment Service

Personal Name(s):	Rybicki, Jedrzej (Corresponding author)
Contributing Institute:	Jülich Supercomputing Center; JSC
Imprint:	IARIA 2020
Physical Description:	22-26
ISBN:	978-1-61208-775-7
Conference:	ALLDATA 2020, The Sixth International Conference on Big Data, Small Data, Linked Data and Open Data, Lisbon (Portugal), 2020-02-23 - 2020-02-27
Document Type:	Contribution to a book; Contribution to a conference proceedings
Research Program:	Data-Intensive Science and Federated Computing

In Big Data applications, data from different sources must often be integrated to fuel machine learning models. In this paper, we describe a prototype implementation of data logistics and model deployment services. Our goal was to create a one-stop-shop solution that supports a generic data science life cycle. The life cycle starts with formalized and repeatable data selection and processing, provided by the data logistics service. The data are then used for model creation in a typical machine learning fashion. The resulting model is placed in a model repository to enable easy model management, sharing, and deployment. The functionality of the proposed prototype is verified with a use case from environmental science.