File and object replication in data grids

Abstract:

Data replication is a key issue in a data grid and can be managed in different ways and at different levels of granularity: for example, at the file level or the object level. In the high-energy physics community, data grids are being developed to support the distributed analysis of experimental data. We have produced a prototype data replication tool, the Grid Data Management Pilot (GDMP), which is in production use in one physics experiment and relies on middleware from the Globus Toolkit for authentication, data movement, and other purposes. We present a new, enhanced GDMP architecture and prototype implementation that uses Globus data-grid tools for efficient file replication. We also explain how this architecture can address object replication issues in an object-oriented database management system. File transfer over wide-area networks requires specific performance tuning to achieve optimal data transfer rates. We present performance results obtained with GridFTP, an enhanced version of FTP, and discuss tuning parameters.

Published in: Proceedings 10th IEEE International Symposium on High Performance Distributed Computing

Date of Conference: 07-09 August 2001

Date Added to IEEE Xplore: 07 August 2002

Print ISBN:0-7695-1296-8

Print ISSN: 1082-8907

DOI: 10.1109/HPDC.2001.945178

Conference Location: San Francisco, CA, USA

1 Introduction

Data replication is an optimization technique well known in the distributed systems and database communities as a means of achieving better access times to data (data locality) and/or fault tolerance (data availability) [Bres99], [Karg99], [Tewa99]. This technique appears clearly applicable to data distribution problems in large-scale scientific collaborations, due to their globally distributed user communities and distributed data sites. As an example of such an environment, we consider the high-energy physics community, where several thousand physicists want to access the terabytes and even petabytes of data that will be produced by large particle detectors around 2006 at CERN, the European Organization for Nuclear Research.
