With the emergence of information technologies, an overwhelming amount of data and information is generated every day. Storing and processing this huge volume of data is commonly referred to as big data management. Cloud storage systems enhance the reliability and availability of data by introducing redundancy, i.e., data replication, into the system, thereby protecting data integrity against node failures, which occur frequently in any large-scale storage system. However, efficiently determining the level of redundancy, i.e., the number of data replicas, is not a trivial task for a cloud service provider (CSP). Traditional methods, which use a fixed number of replicas for all users regardless of each user’s budget, do not achieve
efficiency in terms of the CSP’s financial benefit. This paper presents an efficient replication scheme that allows a CSP to determine the optimal number of replicas for each user, subject to the user’s budgetary constraint and the CSP’s resource capacity, while maximizing the CSP’s financial benefit. Numerical simulations were performed to assess the validity of our approach. The results demonstrate the scalability of the proposed scheme, which can be applied to real systems with an arbitrary number of users.