Data management constitutes one of the major challenges that a geographically- distributed e-Infrastructure has to face, especially when remote data access is involved. We discuss an integrated solution which enables transparent and efficient access to on-line and near-line data through high latency networks. The solution is based on the joint use of the General Parallel File System (GPFS) and of the Tivoli Storage Manager (TSM). Both products, developed by IBM, are well known and extensively used in the HEP computing community. Owing to a new feature introduced in GPFS 3.5, so-called Active File Management (AFM), the definition of a single, geographically-distributed namespace, characterised by automated data flow management between different locations, becomes possible. As a practical example, we present the implementation of AFM-based remote data access between two data centres located in Bologna and Rome, demonstrating the validity of the solution for the use case of the AMS experiment, an astro-particle experiment supported by the INFN CNAF data centre with the large disk space requirements (more than 1.5 PB).

An integrated solution for remote data access / Sapunenko, Vladimir; D'Urso, Domenico; Dell'Agnello, Luca; Vagnoni, Vincenzo; Duranti, Matteo. - In: JOURNAL OF PHYSICS. CONFERENCE SERIES. - ISSN 1742-6588. - 664:4(2015), p. 042047. (Intervento presentato al convegno 21st International Conference on Computing in High Energy and Nuclear Physics, CHEP 2015 tenutosi a jpn nel 2015) [10.1088/1742-6596/664/4/042047].

An integrated solution for remote data access

D'URSO, Domenico;
2015-01-01

Abstract

Data management constitutes one of the major challenges that a geographically- distributed e-Infrastructure has to face, especially when remote data access is involved. We discuss an integrated solution which enables transparent and efficient access to on-line and near-line data through high latency networks. The solution is based on the joint use of the General Parallel File System (GPFS) and of the Tivoli Storage Manager (TSM). Both products, developed by IBM, are well known and extensively used in the HEP computing community. Owing to a new feature introduced in GPFS 3.5, so-called Active File Management (AFM), the definition of a single, geographically-distributed namespace, characterised by automated data flow management between different locations, becomes possible. As a practical example, we present the implementation of AFM-based remote data access between two data centres located in Bologna and Rome, demonstrating the validity of the solution for the use case of the AMS experiment, an astro-particle experiment supported by the INFN CNAF data centre with the large disk space requirements (more than 1.5 PB).
2015
An integrated solution for remote data access / Sapunenko, Vladimir; D'Urso, Domenico; Dell'Agnello, Luca; Vagnoni, Vincenzo; Duranti, Matteo. - In: JOURNAL OF PHYSICS. CONFERENCE SERIES. - ISSN 1742-6588. - 664:4(2015), p. 042047. (Intervento presentato al convegno 21st International Conference on Computing in High Energy and Nuclear Physics, CHEP 2015 tenutosi a jpn nel 2015) [10.1088/1742-6596/664/4/042047].
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11388/179717
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 0
social impact