The Climate (CDS) and Atmosphere (ADS) Data Stores are the core infrastructures supporting the implementation of the Copernicus Climate Change (C3S) and Atmosphere Monitoring (CAMS) Services, providing access to essential state-of-the-art data, tools, applications and other digital information. Although currently in operation, the underlying infrastructure requires further modernisation to cope with new challenges, demanding requirements and a changing technological environment, taking on board operational experience, user feedback, lessons learned, know-how and more advanced technologies. The improved version will be made more accessible, embracing open-source scientific software and ensuring compatibility with state-of-the-art solutions such as machine learning, data cubes and interactive notebooks. In summary, the goal of the tendered work is to evolve the current infrastructure into a modern, more usable and interoperable data store infrastructure that will engage with a wider user community and facilitate stronger integration with related platforms, such as WEkEO, maximising shared capabilities and resources.
The data store will be hosted in an in-house cloud infrastructure and shall be re-engineered so as to maximise its capabilities by moving to a hyper-scalable container-based application. To reinforce usability and interoperability, the portal will implement discovery, view and download services following the standards mandated by the European INSPIRE directive.
The selected contractor will be required to guarantee a smooth transition to the new infrastructure, supporting the migration and adaptation of current catalogue content and applications.
The tender will include a dedicated work package for taking over the maintenance of the current Data Store infrastructure until it is replaced by the new infrastructure. ECMWF will facilitate the conditions and actions required to ensure a smooth transfer of responsibilities and knowledge from the current contractor.
The new data store infrastructure will also host a dedicated, centrally managed repository for observation data, with the aim of facilitating the integration and accessibility of these data across the different system functionalities.
Work at all stages of the project will be done in close collaboration with ECMWF. Development will follow an agile methodology or equivalent, implementing a CI/CD (Continuous Integration and Continuous Delivery) approach. Technological choices will be based on existing open-source software, with Python as the preferred programming language. The result of the tendered work will itself be released under an open-source licence. The architecture of the system will have to be open and extendable in terms of the number of concurrent users, the data volumes handled and the number of workflows and tools.
The ECMWF Copernicus Data Stores infrastructure will provide an operational service supporting heavy workloads and will have to be closely monitored. Usage statistics will have to be collected to support informed decision making, meet reporting obligations and carry out capacity planning.
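
As an illustration only, the kind of usage statistics mentioned above could be derived from per-request records along the following lines. This is a minimal Python sketch under stated assumptions: the record schema (RequestRecord and its fields), the function name summarise_usage and the sample values are all hypothetical and are not part of the tender or of any existing Data Store interface.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RequestRecord:
    """One served data request (hypothetical schema, for illustration only)."""
    day: date
    dataset: str
    user_id: str
    bytes_delivered: int


def summarise_usage(records):
    """Aggregate request count, distinct users and delivered volume per dataset."""
    requests = defaultdict(int)
    volume = defaultdict(int)
    users = defaultdict(set)
    for record in records:
        requests[record.dataset] += 1
        volume[record.dataset] += record.bytes_delivered
        users[record.dataset].add(record.user_id)
    return {
        name: {
            "requests": requests[name],
            "distinct_users": len(users[name]),
            "bytes_delivered": volume[name],
        }
        for name in requests
    }


# Made-up sample records; dataset names and sizes are placeholders.
sample = [
    RequestRecord(date(2024, 1, 1), "dataset-a", "user-1", 1_000),
    RequestRecord(date(2024, 1, 1), "dataset-a", "user-2", 2_500),
    RequestRecord(date(2024, 1, 2), "dataset-a", "user-1", 500),
    RequestRecord(date(2024, 1, 2), "dataset-b", "user-3", 4_000),
]

summary = summarise_usage(sample)
for name, stats in sorted(summary.items()):
    print(name, stats)
```

Per-dataset totals of this kind could feed both reporting and capacity planning; an operational system would more likely compute them from a monitoring or logging backend than from in-memory records.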