International Journal of Advanced Computing
and Mechanical Systems

DEDUCT: A Secure Deduplication Framework for Textual Data in Cloud Environments

Authors: Himabindu B, Vempalli Mallikarjuna, Sadiyam Padmanabhan Mukeshkumar, Kencha Harsha Vardhan, Dasari Chandra

Abstract

The widespread adoption of cloud storage services has intensified challenges associated with data redundancy, excessive storage consumption, and the protection of sensitive information. A considerable share of cloud storage space is consumed by repeated copies of textual data, including documents, reports, emails, and system logs uploaded by multiple users. Although data deduplication is widely recognized as an effective approach for reducing redundancy and improving storage efficiency, conventional deduplication techniques typically rely on plaintext data comparison. This reliance exposes confidential user information to cloud service providers, who may not always be fully trusted. While secure deduplication methods have been proposed to mitigate these risks, many existing solutions remain vulnerable to brute-force attacks, metadata leakage, and scalability limitations, particularly when handling large volumes of textual data. This paper presents DEDUCT, a secure and efficient deduplication framework specifically designed for textual data in cloud environments. The proposed framework allows the cloud server to detect and eliminate duplicate data without gaining access to the original file content. DEDUCT combines text preprocessing, chunk-based fingerprint generation, cryptographic hashing, and secure encryption techniques to preserve data confidentiality while enabling reliable duplicate detection. In addition, a proof-of-ownership mechanism is employed to prevent unauthorized users from exploiting deduplication advantages. The proposed approach balances storage optimization with strong security guarantees by ensuring that only encrypted data and protected metadata are processed by the cloud. Experimental observations indicate that DEDUCT substantially reduces storage redundancy while strengthening resistance to inference, guessing, and confirmation attacks, making it well suited for privacy-aware cloud storage applications.
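The pipeline named in the abstract (text preprocessing, chunk-based fingerprint generation, cryptographic hashing, server-side duplicate detection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify chunk sizes, hash constructions, or key handling, so the word-based chunking, the `CHUNK_WORDS` value, the HMAC-SHA-256 tag, the shared `key`, and every function and class name below are assumptions made for illustration. Encryption of the chunks and the proof-of-ownership step are deliberately left out.

```python
import hashlib
import hmac

CHUNK_WORDS = 4  # hypothetical chunk size, kept small for the demo


def preprocess(text: str) -> list[str]:
    """Lowercase the text and split on any whitespace run."""
    return text.lower().split()


def chunk(words: list[str], size: int = CHUNK_WORDS) -> list[str]:
    """Group words into fixed-size chunks (the last one may be shorter)."""
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def fingerprint(chunk_text: str, key: bytes) -> str:
    """Keyed SHA-256 tag (assumed construction); the server sees only this."""
    return hmac.new(key, chunk_text.encode("utf-8"), hashlib.sha256).hexdigest()


def tags_for(text: str, key: bytes) -> list[str]:
    """Client side: preprocess, chunk, and fingerprint a document."""
    return [fingerprint(c, key) for c in chunk(preprocess(text))]


class DedupIndex:
    """Toy server-side index that stores fingerprints, never plaintext."""

    def __init__(self) -> None:
        self.seen: set[str] = set()

    def upload(self, tags: list[str]) -> int:
        """Record tags; return how many were new and would need storage."""
        new = [t for t in dict.fromkeys(tags) if t not in self.seen]
        self.seen.update(new)
        return len(new)


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical shared secret
    index = DedupIndex()
    first = tags_for("The quick  brown fox jumps over the lazy dog", key)
    second = tags_for("the QUICK brown fox\njumps over the lazy dog", key)
    print(index.upload(first), index.upload(second))
```

In this sketch the two uploads differ only in case and whitespace, so preprocessing maps them to the same tags and the second upload adds nothing to the index. Using a keyed tag rather than a bare hash is one common way to make offline guessing of low-entropy text harder for anyone without the key; whether DEDUCT takes this route is not stated in the abstract.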

Keywords

Secure Deduplication; Cloud Storage; Textual Data Security; Data Privacy; Cryptographic Encryption

How to Cite this Article

Himabindu B, Vempalli Mallikarjuna, Sadiyam Padmanabhan Mukeshkumar, Kencha Harsha Vardhan, Dasari Chandra. "DEDUCT: A Secure Deduplication Framework for Textual Data in Cloud Environments". International Journal of Advanced Computing and Mechanical Systems. 2026/01/13;2(1):1-9. doi:10.5281/zenodo.18235545