Geometric Similarity Preserving Embedding-Based Hashing for Big Data in Cloud Computing

International Journal of Research and Scientific Innovation (IJRSI) | Volume VI, Issue VI, June 2019 | ISSN 2321–2705

Boukari Souley¹, Abubakar Usman Othman¹

¹Faculty of Science, Department of Mathematical Sciences, Abubakar Tafawa Balewa University, Bauchi, Nigeria.

Abstract- Indexing techniques are used on big data for efficient information retrieval from very large and complex datasets held in distributed storage in cloud computing. The availability of broadband access, mobile devices such as smartphones and tablets, body-sensor devices and cloud applications has greatly contributed to the rapid growth in data volume, or big data. Existing indexing techniques are inadequate for the indexing requirements of big data, so an efficient index scheme is needed for efficient retrieval. Finding the approximate nearest neighbour (ANN) of a given query is essential for efficient similarity search in huge databases. Density Sensitive Hashing (DSH) achieves good performance, but it does not fully utilise the discriminating information in the data points, and it relies on long binary hash codes to achieve high precision and recall. As the binary code length increases, storage cost and search time increase, which slows performance. To address these problems, this research proposes the Geometric Similarity Preserving Embedding-Based Hashing (Geo-SPEBH) method to improve search accuracy and reduce memory cost for large-scale image retrieval. The technique aims to preserve the underlying geometric information among data, and exploits prior information that utilises the reconstructive relationship of the data to learn compact and effective hash codes. Geo-SPEBH makes full use of the geometric structure properties of the data. Extensive experiments conducted on a cloud simulator such as CloudSim are expected to show that the proposed scheme outperforms state-of-the-art techniques.

Keywords: Big Data, Hashing, Indexing, Image, Similarity preserving, CloudSim, Interface.
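
To make the trade-off between code length, storage cost and search time concrete, the following is a minimal sketch of hashing-based approximate nearest neighbour search. It uses plain random-hyperplane hashing, which is neither DSH nor Geo-SPEBH, and all names in it (make_hyperplanes, hash_code, hamming, nearest_by_hamming) are hypothetical and chosen only for illustration. Each item is stored as a num_bits-bit integer, so storage per item and the cost of each Hamming comparison both grow with the code length.

```python
import random

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (illustrative; not learned from data)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def hash_code(vector, hyperplanes):
    """Map a real-valued vector to a binary code, one bit per hyperplane."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")

def nearest_by_hamming(query_code, codes):
    """Index of the stored code closest to the query in Hamming distance."""
    return min(range(len(codes)), key=lambda i: hamming(query_code, codes[i]))

# Toy data: 100 random 8-dimensional points hashed to 16-bit codes.
NUM_BITS, DIM = 16, 8
rng = random.Random(1)
database = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(100)]
planes = make_hyperplanes(NUM_BITS, DIM)
codes = [hash_code(x, planes) for x in database]

query = database[42]
query_code = hash_code(query, planes)
best = nearest_by_hamming(query_code, codes)
```

A learned hashing method replaces the random hyperplanes with projections fitted to the data, with the aim of reaching the same retrieval quality with fewer bits.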

I. INTRODUCTION

Cloud computing is a web-based computing model that provides access to a shared pool of resources. Advances in mobile technology have allowed mobile devices such as smartphones and tablets to be used in a variety of applications (Thilkanathan et al., 2014). Widespread broadband Internet access (Huang et al., 2010), coupled with these handheld devices, has made it easy to collect digital information in the form of structured and unstructured data (Gartner et al., 2013), and has contributed to the availability of large volumes of data known as big data.
