A Hash-based Image Encryption Algorithm Cheddad, A., Condell, J., Curran, K and Mc Kevitt, P School of Computing and Intelligent Systems, Faculty of Computing and Engineering University of Ulster, Northland Road, Derry, BT48 7JL Abstract-There exist several algorithms that deal with text encryption. However there has been little research carried out to date on encrypting digital images or video files. This paper describes a novel way of encrypting digital images with password protection using 1D SHA-2 algorithm coupled with a compound forward transform. A spatial mask is generated from the frequency domain by taking advantage of the conjugate symmetry of the complex imagery part of the Fourier Transform. This mask is then XORed with the bit stream of the original image. Exclusive OR (XOR), a logical symmetric operation, that yields 0 if both binary pixels are zeros or if both are ones and 1 otherwise. This can be verified simply by modulus (pixel1, pixel2, 2). Finally, confusion is applied based on the displacement of the cipher‟s pixels in accordance with a reference mask. Both security and performance aspects of the proposed method are analyzed, which prove that the method is efficient and secure from a cryptographic point of view. One of the merits of such an algorithm is to force a continuous tone payload, a steganographic term, to map onto a balanced bits distribution sequence. This bit balance is needed in certain applications, such as steganography and watermarking, since it is likely to have a balanced perceptibility effect on the cover image when embedding. I. INTRODUCTION Much research has been done in the area of steganography which is the science of concealing data in a transmission medium in such a way that it would not draw the attention of eavesdroppers. Steganography has various useful applications such as for human rights organizations (since encryption is prohibited in some countries), smart IDs where individuals‟ details are embedded in their photographs (content authentication), data integrity by embedding checksum, medical imaging and secure transmission of medical data to name a few. Various algorithms have been proposed to implement steganography in digital images. They can be categorized into three major clusters, algorithms using the spatial domain such as S-Tools [1], algorithms using the transform domain such as F5 [1] and algorithms taking an adaptive approach combined with one of the former two methods, e.g., ABCDE (A Block- based Complexity Data Embedding) [2]. Most of the existing steganographic methods rely on two factors: the secret key and the robustness of the algorithm. However, all of them either do not address the issue of encryption of the payload prior to embedding or merely give a hint of using one or more of the conventional block cipher algorithms. Hence, Westfeld et al. concluded their CRYSTAL (Cryptography and encoding in the context of steganographic algorithms) project with an important observation that “Crypto-Stego interaction is not very well researched yet” [3]. The renowned generic block cipher algorithms, such as Data Encryption Standard (DES), Advanced Encryption Standard (AES), International Data Encryption Algorithm (IDEA), etc, are not suitable to handle bulky data, i.e., digital images, due to their intensive computational process [4] [5] unless accelerated by hardware implementations. Additionally, such symmetric-key cryptographic algorithms are found unfit for digital images characterized with some intrinsic features such as bulk data capacity and high pixel correlation and redundancy [6] [7], especially when confidentiality is needed. Other limitations were reported in [8] in light of multimedia communication. Various hash algorithms are available such as MD5 (Message Digest 5) and SHA-2 (Secure Hash Algorithm) which hash data strings, thus changing their state from being natural to a seemingly unnatural state. A hash function is more formally defined as the mapping of bit strings of an arbitrary finite length to strings of fixed length [9]. Here the aim is to extend SHA-2 (the terminology and functions used as building blocks are described in the US Secure Hash Algorithm documentation) to encrypt digital 2D data. The introduction of two transforms combined with the output of SHA-2 creates a strong image encryption setting. Security systems are built on increasingly strong cryptographic methods that foil pattern and statistical analysis attempts. Encryption is particularly useful for Intellectual Property Management and Protection (IPMP) standardization group and multimedia communications that prefer handling media streams compliant to certain multimedia coding standard, such as JPEG or MPEG-1/2/4 standard [10]. The proposed approach retains the structures readily available in the unencrypted bit stream. Such structures, often specified by special header patterns, would comply with standard multimedia codec. Thus, an encrypted video for instance would be still decoded successfully. This paper is organized as follows; section II discusses related work, followed by the proposed encryption method in section III and security analysis is provided in section IV. Section V provides results and discussions. Finally, a conclusion is drawn in section VI. II. RELATED WORK ANALYSIS The research on the design of secure image encryption tends to focus on transferring images into chaotic maps. Chaos theory, which essentially emerged from mathematics and physics, deals with the behaviour of certain nonlinear dynamic systems that exhibit a phenomenon under certain conditions known as chaos which adopt the Shannon requirement on diffusion and confusion [11]. Due to its attractive features such as its sensitivity to initial conditions and the random-like outspreading behaviour, chaotic maps are employed for various applications of data protection [9]. In the realm of 2D data, Shih [12] outlines the following method, called Toral automorphism map, in order to spread the neighbouring pixels into largely dispersed locations. The Transformation is represented through the following formula: 1 1 detl l 1 1 or 1, and l and N denote an arbitrary integer and the width of a square image respectively. Also, x and y represent the original pixel coordinates and x’ and y’ the new location coordinates. We refer to the determinant here as „det‟. Applying Eq. 1 to the sample image „Lena‟, after exactly 17 iterations, termed as the stable orbit, the chaotic map converged into the original image. This Discrete Time Dynamic System (DTDS) is also the basic framework used in [13]. Regarding this method, it is important to note: A) Since the algorithm uses a determinant in its process, the input matrix can only be square. This constraint was highlighted also in [4]. A work around this problem might be in applying the algorithm on square blocks of a given image repetitively. However, that would generate noticeable peculiar periodic square patterns given the nature of the process and of course this is not an interesting fact as it conflicts with the aim of generating chaotic maps. B) As far as the security systems are concerned, the convergence of the translated pixels into their initial locations, i.e., image exact reconstruction after some iteration, is also not an appealing factor. This is an observed phenomenon in a variety of chaotic based algorithms. Given one of the iterations is used, if an attacker gains knowledge of the algorithm and obtains the parameter “l”, which is actually not difficult to crack using a brute force attack, he/she will be able to invest some time to add more iterations that will reveal the original image. For example, Wang et al. [14] show that for such systems if two parameters are set to 10 and 8, then regardless of image contents, any image with the dimensions of 256x256 will converge after 128 iterations. This periodicity brings insecurity to the process as methods for computing the periodicity can be formulated such as the one proposed in [15] [16]. In a more detailed and concise attempt to introduce image encryption, Pisarchik et al. [17] demonstrated that any image can be represented as a lattice of pixels, each of which has a particular colour. The pixel colour is the combination of three components: red, green, and blue, each of which takes an integer value C= (Cr, Cg, and Cb) between 0 and 255. Thus, they create three parallel CMLs (Chaotic Map lattices) by converting each of these three colour components to the corresponding values of the map variable,x (xr,xg,xb) and use c c c c these values as the initial conditions,x x . Starting from different initial conditions, each c 0 chaotic map in the CMLs, after a small number of iterations, yields a different value from the initial conditions, and hence the image becomes indistinguishable because of an exponential divergence of chaotic trajectories [17]. They introduced seven steps for encrypting images and seven steps for decryption. Moreover, four parameters were used of which two were regulated. Their settings can have a tremendous effect on the chaotic map quality. Therefore, the receiver must know the decryption algorithm and the parameters which act as secret keys. The algorithm is well formulated and adequately presented; it yields good results for RGB images as proclaimed by the authors. It was noticed that they used a rounding operator which was applied recursively along the different iterations. The major concern would be in recovering the exact intensity values of the input image as the recovered image shown in their work might be just an approximation because of the aforementioned operator. This is important, especially in the application of steganography where the objective is to recover the exact embedded file rather than its approximation. The raised point was remarked independently in [18] where they stated that a sensitive generator, i.e., a generator with a rounding operator, can produce two different binary sequences (after some iterations) for the same initial values and parameters if generated on two different machines which round off fractions after unmatched decimal places. However, a desired algorithm must be efficient, repeatable and portable (i.e., works in the same way in different software/hardware environments) [19]. As a result, such a chaotic encryption system is not invertible under double precision arithmetic [20]. Usman et al. [4] describe a method for generating chaotic maps to encrypt medical images by repetitive pixel arrangement and column and row permutations. The pixel arrangement is achieved through the following system: X(i,j)Y(k,l),where k (j (i 1)N 1)/L1 (2) l (j (i 1)N 1)mod(L)1 Here, k, l denote the mapped spatial coordinates of the original location at i, j. N and L are the height of the original image and transformed image respectively in such a way that: (KxL)(MxN),where:K M. The authors show some experiments in which the deciphered phase was missing. It is suspected that the rounding operator introduced in Eq. 2 will force some pixels to collude at the same location resulting in the loss of information needed for the original image reconstruction. Zou et al. [21] reduce the number of iterations using 2D generalised Baker transformations to enhance the key space. Unfortunately, most chaotic maps are unstable due to the periodicity of the mapping [13] [22]. Systems based on such maps are prone to attacks, such as the broken system shown in [23]. Other types of image encryption include the Fourier plane encoding algorithm, introduced in [24], which is attacked in [25] using an initial guess of the Fourier plane random phase while searching over a key space to minimise a cost function between the decrypted image for a given key and the original image. This spurred a variety of authors to apply the Fourier transform such as the works in [26] [27]. Shin and Kim [28] presented a phase-only encryption scheme using the Fourier plane. To generate this phase encrypted data, a zero-padded original image, multiplied by a random phase image, is Fourier transformed and its real-valued data is encrypted with key data by using phase-encoded XOR rules. Since the original information is encrypted on the Fourier plane, the decryption cannot retrieve the original image without perceptual degradation, i.e., PSNR in the interval (20dB, 42.23dB). One time pad hash algorithms, known also as stream ciphers, were believed to be unsuitable for image encryption since they would require a key of the size of the ciphered image itself [4]. Sinha and Singh [29] use MD5 to generate image signature by which they encrypt the image itself using a bitwise exclusive-OR (XOR) operation; they coupled that with an error control code, i.e., Bose-Chaudhuri Hochquenghem (BCH). The ciphered image was larger than the original because of the added redundancy due to applying the BCH. Since the message digest is smaller than the image, they XOR the signature block by block which eventually left some traces of repetitive patterns. Hence, their method was commented on in [30] in which they show also how insecure the method is with some experiments, a fact that provoked Sinha and Singh [29] to debate the arguments in their recent published reply in [31]. Gao and Chen [32] propose an image encryption algorithm based on hyper-chaos, which uses a matrix permutation to shuffle the pixel positions of the plain-image, i.e., logistic map, and then the states combination of hyper-chaos is used to change the grey values of the shuffled-image (diffusion). Their proposed algorithm did not survive attacks for too long. Rhouma and Belghith [33] successfully broke such cryptosystem using a chosen plaintext attack and a chosen cipher-text attack that recover the ciphered-image without any knowledge of the key value. Zeghid et al. [5] propose a new modified version of AES which involves the design of a secure symmetric image encryption technique. The AES is extended to support a key stream generator for image encryption which can overcome the problem of textured zones existing in other known encryption algorithms. The problems of AES based algorithm are, computational time complexity, generation of repetitive spatial patterns, and the sensitivity to image manipulation. In the realm of information hiding some steganographic applications prefer to use conventional pseudo-random number generator (PRNG) algorithms which form the basic and essential ingredient for any stochastic simulation in which random variables and other random objects are simulated by deterministic algorithms [19]. III. PROPOSED ENCRYPTION METHOD This proposal exploits the strength of a 1D hash algorithm, namely SHA-2 and extends it to handle 2D data such as images. “Secure one-way hash functions are recurring tools in cryptosystems just like the symmetric block ciphers. They are highly flexible primitives that can be used to obtain privacy, integrity and authenticity” [34]. SHA-2 hash standard underlies four secure hash algorithms SHA-224, SHA-256, SHA-384, and SHA-512. These algorithms are the result of a continuous development of SHA-1. SHA algorithms are used to provide a condensed fixed length representation known as message digest of an input message. The length of the unique digest ranges from 160 to 512 bits depending on the used algorithm [35]. The security of SHA-256, SHA-384, and SHA-512 matches the security of AES with complexity of the best attack as 2128, 2 192 and 2256, respectively, have been announced by the National Institute of Standards and Technology (NIST) [36]. SHA-2 can be described in two phases: pre-processing and hash computation. Pre- processing comprises padding, parsing the padded message into m-bit blocks, and setting any initialization values to be used in the hash generation. The hash computation generates a message schedule from the padded message which is used, along with functions, constants and word operations, to iteratively generate a series of hash values [37]. For exhaustive details on SHA-2 algorithms see the online specification1. The DCT and FFT are incorporated into the process to increase the disguise level and thus generate a random-like output that does not leave any distinguishable patterns of the original image. The ordering of the transforms is very crucial since the algorithm‟s strength lands itself to exploiting the symmetrical property of the FFT‟s imaginary part. The exhaustive step by step description of the encryption algorithm is illustrated in Fig. 1. The method works as a one-time pad cipher; therefore, the decryption will follow the same digital process but with the cipher input into the system, i.e., symmetric encryption. Starting with a password phrase K supplied by the user the algorithm generates a SHA-2, i.e., SHA-256, based hash string H (K) which forms the initial condition. The vector H, treated as a string of hexadecimal characters, is then converted to its decimal version and finally transformed to a bit stream matrix of fixed dimension K’={8x32}. Parallel to this, the original image A is converted to a bit stream and reshaped to the order8xMN. The partially extended key, herein K’, is still short to accommodate the image bit stream. Therefore, the algorithm performs key full expansion towards the needed dimension, herein 8xMN . Obviously, this step would result in repetitive patterns that would make the ciphered image prone to attacks, a problem that was independently noticed in [4]. To cope with this situation the method applies a thresholded DCT, where Eq.4 is used, followed by FFT to provide the diffusion requirement and to tighten the security. Note that nested transforms are not scant in the literature, for example O'Ruanaidh and Pun [38] used FFT followed by log- polar mapping and FFT to embed a watermark. Let the resized key be where the subscripts M and N denote the width and height 8,MN dimensions of the image, respectively. The FFT operates on the DCT transform of 8,MN subject to Eq. (4): 1 7 MN1 f(u,v) F(x,y)e2i(xu/8yv/MN) 8MN x0 y0 ,satisfying Eq. (4) (3) where,F(x,y)DCT( ),subject to (*) 8,MN 1 if DCT( )0 F(x,y) 8,MN (*) 0 Otherwise F( x , y ) is the thresholded DCT of the resized 2D bit stream of the 1D hash string generated from applying SHA-2 and denoted herein by . Similarly f(u,v) is the thresholded 8,MN Fourier Transform of the function F(x,y). 1 US Secure Hash Algorithms, (2006). Secure Hash Standard [Online]. Available: http://www.ietf.org/rfc/rfc4634.txt. Generating a pseudo-random binary sequence from the orbit of f(u,v) requires the mapping of the state of the system to its binary values{0,1}. One clear method for converting a real number to a discrete bit symbol is to use a rule as shown in Eq. 4. Given the output of Eq. (3) we can derive the corresponding binary map as depicted in Fig. 2: 1 if imag(f(u,v))thr Map(x,y) (4) 0 Otherwise where thr is an appropriately selected threshold value and imag()denotes the imaginary part of the complex function which can be compared directly with a threshold thr. For a balanced binary sequence and for robustness, thrshould be chosen such that the probability P (imag(f(u,v))<thr) = P (imag(f(u,v))>thr). Fortunately, the imaginary part of the signalf(u,v) is always symmetrical around zero (see Fig. 6 for validity of this property). Therefore,thr 0is an explicit solution. Since the coefficients in this calculation are converted to a binary map the reverse construction of the password phrase is impossible. Hence the name Irreversible Fast Fourier Transform (IrFFT). The generated bit-pattern exhibits sufficient randomness, which it will be proved, to provide cryptographic security as shown in the security analysis section. This map finally is XORed with the bit stream version of the image. The result is then converted into grayscale values and then reshaped to form the ciphered image. A post-processing step is introduced providing pixel substitution based on a new randomized image using K2=H(H(K)), this is illustrated with a simple example in Fig. 1 (bottom). The coding phase uses the Map (Eq. 4) to encrypt the bit stream of image A and produce a new encrypted matrixA, in such a way that: {AD(A,Map)} (5) auth where,D(A,Map)denotes the decoding of Awith the same key generated Map. Ideally, should be equal to {Ø} and starts to deviate from that whenAundergoes an image auth processing attack. Another noticed phenomenon which has been exploited was the sensitivity of the spread of the FFT coefficients to changes in the spatial domain. Therefore when we coupled this with the sensitivity of the SHA-2 algorithm to changes of the initial condition, i.e., password phrase, we can meet easily the Shannon law requirements. For instance slight changes in the password phrase will, with overwhelming probability, result in a completely different hash and therefore completely different Map. In reference to Fig.1 the encryption procedure is summarised as follows: 1- Generate a SHA-based hash string of the user password. 2- Convert the hash into {8x32} binary matrix, SHA-256 is used. This step forms the initial key expansion. 3- Resize this matrix to fit into the image binary matrix, this is denotes as 8,MN where M, N are the original image dimensions. 4- Feed the result into a compound transform, DCT followed by FFT with those constraints described in Eq. 3&4. 5- XOR the result with the image bit stream 6- Generate a new randomized binary Map of size (M x N) for the post-processing step (pixel substitution) as shown with a simple example in Fig.1 (bottom) 7- Encrypted image. The decryption process starts with step 6 as a pre-processing step then followed by steps 1-5. Fig. 1. Block diagram of the steps used in the proposed algorithm for image encryption and decryption. (bottom) shows a simple example to illustrate the swapping process used for pixel substitution. Fig. 2. Generation of random pattern using the frequency transform of the signal. Note that the pattern has the advantage of straightforwardly providing a balanced bit stream So, the core idea here is to transform this sensitivity into the spatial domain where 2D- DCT and 2D-FFT can be applied so that to introduce the aforementioned sensitivity to the two dimensional space. As such, images can be easily encoded securely with password protection. Note that this scheme efficiently encrypts grayscale and binary images. However, for RGB images we noticed that using the same password for the three primaries will yield some traceable patterns inherited from the original image (RGB colours are highly correlated). This is easily overcome through the following two choices: either the user supplies three passwords each of which encrypts one colour channel or more conveniently the system generates other two unique keys from the original supplied password. For instance, a single key can be utilized to generate the following different hash functions H(K),H(K), and H(H(K)) to encrypt the R, G and B channels, respectively. K denotes the supplied key, the arrows indicate the string reading directions and H(H()) denotes double hashing. IV. SECURITY ANALYSIS OF THE ENCRYPTION METHOD This section analyses the security aspects of the proposed method. Encryption algorithms are assumed to be robust to different statistical and visual attacks, moreover key sensitivity and key space should be adequate. In addition to that, and being a tailored method for steganography, the result should exhibit high randomness and a balanced bit values. Therefore, this section is split into five sub-sections, namely, key space analysis, key sensitivity, adjacent pixels analysis, randomness and other security merits. A. Key Space Analysis The key space analysis of the proposed algorithm essentially involves analysing the underlying SHA-2 algorithm. SHA-2 accepts any key of any length less than 264 bits. SHA is secure because it is computationally infeasible to find a message which corresponds to a given message digest, or to find two different messages which produce the same message digest. SHA has been extensively adopted in several organisations and has received much scrutiny from the cryptography community. The proposed encryption algorithm is flexible enough to migrate to a newer version of the SHA‟s algorithmic family or other secure hash functions especially knowing that no collisions have been found in SHA-2. Any version of SHA‟s family, be it SHA-224, SHA-256, SHA-384, or SHA-512, can all be written as an {8x?} matrix where “?” denotes (the length of the hash/2) and 8 comes from the binary of the hexadecimal value of the ASCII pairs. B. Key Sensitivity Analysis As it was the aim from the design, the algorithm is proven to be very sensitive to initial conditions, see Fig. 3. That was due to the plugged in hash algorithm and the FFT which made the technique immune to malleability attack. Fig.3 (a) shows the encrypted “boat” image using the word Steganography as a password, note that this password is case sensitive, Fig.3(b) successful decryption of Fig.3(a) and Fig.3 (c, d) illustrate decryption failure with slightly different password and hash code shown in bold, respectively. (a) (b) (c) (d) Fig.3. Key sensitivity test: (a) encrypted image, (b) decrypted (a) using the key „Steganography‟ '40662a5f1e7349123c4012d827be8688d9fe013b', (c) decrypted (a) using the wrong key „Steganographie‟ 'c703bbc5b91736d8daa72fd5d620536d0dfbfe01' (d) decrypted (a) using slightly modified hash „40662a5f1e7349123c4012d827be8688d9fe013B‟. C. Adjacent Pixels Analysis To test for statistical properties of the original image and the encrypted version we carried out tests based on the linear relationship between two adjacent pixels horizontally, vertically and diagonally. It is observed that natural images with natural data have a high correlation ratio between neighbouring pixels, see Fig.4. To measure this relationship the correlation coefficient was calculated of each pair of pixels (as shown in Table 1). The comparison given in Table 1 shows that the proposed method outperforms other recent methods reported in the literature. To establish a fair evaluation the same test image was used. In the horizontal, diagonal and vertical directions the encrypted version under this scheme had the highest performance. Bear in mind that unlike various methods, the proposed algorithm does not involve an extensive and computationally intensive iterations process. The encrypted image shown in Fig. 3 is automatically generated once the program is invoked with a key. The process does not retain any image statistics. This can be seen by comparing histograms of the plain and encrypted images, the original histogram is flattened and has a uniform distribution. Fig. 4. Correlation analysis of 5000 pairs of horizontal adjacent pixels chosen randomly from: (top) the original plain boat image (Fig. 3 (b)), (bottom) the encrypted image (Fig. 3 (a)) using the proposed method.
Description: