Chapter 6
Data Storage, Retrieval and Management

DOI: 10.4018/978-1-61520-703-9.ch006

Introduction

The latest advances in network and distributed-system technologies now allow the integration of a vast variety of services with almost unlimited processing power, using large amounts of data. Sharing of resources is often viewed as the key goal for distributed systems, and in this context the sharing of stored data appears to be the most important aspect of distributed resource sharing. Scientific applications are the first to take advantage of such environments, because current and future high performance computing experiments have pressing requirements: ever higher volumes of produced data must be stored and managed.

While these new environments reveal huge opportunities for large-scale distributed data storage and management, they also raise important technical challenges that need to be addressed. The ability to support persistent storage of data on behalf of users, the consistent distribution of up-to-date data, the reliable replication of fast-changing datasets, and the efficient management of large data transfers are just some of these new challenges.

In this chapter we discuss how adequately the existing distributed computing infrastructure supports the required data storage and management functionalities. We highlight the issues raised by storing data over large distributed environments and discuss recent research efforts dealing with the challenges of data retrieval, replication and fast data transfers. The interaction of data management with other emerging, data-sensitive technologies, such as workflow management, is also addressed.

Data Storage

Many approaches to building highly available and incrementally extendable distributed data storage systems have been proposed. Solutions span from distributed storage repositories to massively parallel and high performance storage systems.
A large majority of these aim to virtualize the data space, allowing users to access data on multiple storage systems that may be geographically dispersed.

Copyright © 2010, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
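
The idea of a virtualized data space mentioned under Data Storage can be illustrated with a small sketch. It is not taken from any system discussed in the chapter: the names `StorageSystem`, `VirtualDataSpace`, `put`, `get` and `locations` are hypothetical, and each storage system is modeled as an in-memory dictionary rather than a real remote service. The sketch only shows the core mechanism, which is a catalog that maps a logical name to physical copies held on several storage systems, so the user never addresses a particular site.

```python
from dataclasses import dataclass, field


@dataclass
class StorageSystem:
    """Hypothetical storage back end at one site (an in-memory stand-in)."""
    site: str
    blobs: dict = field(default_factory=dict)  # physical key -> bytes


class VirtualDataSpace:
    """Hypothetical catalog mapping logical names to physical copies."""

    def __init__(self):
        # logical name -> list of (storage system, physical key)
        self._catalog = {}

    def put(self, logical_name, data, targets):
        """Store one copy of the data on every target storage system."""
        entries = []
        for system in targets:
            physical_key = f"{system.site}/{logical_name}"
            system.blobs[physical_key] = data
            entries.append((system, physical_key))
        self._catalog[logical_name] = entries

    def locations(self, logical_name):
        """Return the sites recorded as holding a copy."""
        return [system.site for system, _ in self._catalog.get(logical_name, [])]

    def get(self, logical_name):
        """Return the data from the first storage system that still has it."""
        for system, physical_key in self._catalog.get(logical_name, []):
            if physical_key in system.blobs:
                return system.blobs[physical_key]
        raise KeyError(logical_name)


# Usage: the caller works only with the logical name.
site_a = StorageSystem("site-a")
site_b = StorageSystem("site-b")
space = VirtualDataSpace()
space.put("experiment/run1.dat", b"payload", [site_a, site_b])

# Simulate the loss of one copy; the logical name still resolves.
del site_a.blobs["site-a/experiment/run1.dat"]
print(space.get("experiment/run1.dat"))
```

A real system would also have to handle consistency between copies, catalog distribution and transfer costs, which are the challenges the chapter goes on to discuss; the sketch deliberately leaves them out.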