Math 1580 - Solutions to Information Theory Problems



Problem # 4.42: Consider the cipher that has three keys, three plaintexts, and four ciphertexts that are combined using the following encryption table (which is similar to Table 4.12 used in Example 4.53 on page 246). \[ \begin{array}{|c||c|c|c|} \hline & m_1 & m_2 & m_3 \\ \hline \hline k_1 & c_2 & c_4 & c_1 \\ \hline k_2 & c_1 & c_3 & c_2 \\ \hline k_3 & c_3 & c_1 & c_2 \\ \hline \end{array} \] Suppose further that the plaintexts and keys are used with the following probabilities: \[ f(m_1)=f(m_2)=\frac{2}{5},\qquad f(m_3)=\frac{1}{5},\qquad f(k_1)=f(k_2)=f(k_3)=\frac{1}{3}. \]

(a) Compute \(f(c_1)\), \(f(c_2)\), \(f(c_3)\), and \(f(c_4)\).

(b) Compute \(f(c_1\mid m_1)\), \(f(c_1\mid m_2)\), and \(f(c_1\mid m_3)\). Does this cryptosystem have perfect secrecy?

(c) Compute \(f(c_2\mid m_1)\) and \(f(c_3\mid m_1)\).

(d) Compute \(f(k_1\mid c_3)\) and \(f(k_2\mid c_3)\).
Solution: (a) \[\begin{aligned} f(c_1) &= f(k_1)f_M(d_{k_1}(c_1)) + f(k_2)f_M(d_{k_2}(c_1)) + f(k_3)f_M(d_{k_3}(c_1)) \\ &= f(k_1)f(m_3) + f(k_2)f(m_1) + f(k_3)f(m_2) \\ &= \frac13\cdot\frac15+\frac13\cdot\frac25+\frac13\cdot\frac25 \\ &= \frac13. \\ f(c_2) &= f(k_1)f_M(d_{k_1}(c_2)) + f(k_2)f_M(d_{k_2}(c_2)) + f(k_3)f_M(d_{k_3}(c_2)) \\ &= f(k_1)f(m_1) + f(k_2)f(m_3) + f(k_3)f(m_3) \\ &= \frac13\cdot\frac25+\frac13\cdot\frac15+\frac13\cdot\frac15 \\ &= \frac4{15}. \\ f(c_3) &= f(k_1)f_M(d_{k_1}(c_3)) + f(k_2)f_M(d_{k_2}(c_3)) + f(k_3)f_M(d_{k_3}(c_3)) \\ &= f(k_1)\cdot0 + f(k_2)f(m_2) + f(k_3)f(m_1) \\ &= 0+\frac13\cdot\frac25+\frac23\cdot\frac15 \\ &= \frac4{15}. \\ f(c_4) &= f(k_1)f_M(d_{k_1}(c_4)) + f(k_2)f_M(d_{k_2}(c_4)) + f(k_3)f_M(d_{k_3}(c_4)) \\ &= f(k_1)f(m_2) + f(k_2)\cdot0 + f(k_3)\cdot0 \\ &= \frac13\cdot\frac25+0+0\\ &=\frac2{15}. \end{aligned}\]

(b) \[\begin{aligned} f(c_1\mid m_1) &= f(k_2) = \frac13. \\ f(c_1\mid m_2) &= f(k_3) = \frac13. \\ f(c_1\mid m_3) &= f(k_1) = \frac13. \end{aligned}\] So we see that \(f(c_1\mid m_i)=f(c_1)=\frac13\) for all messages. But that doesn't mean that we have perfect security, since we need to check the other \(f(c_j\mid m_i)\). So for example, \(f(c_2\mid m_2)=0\), since if the plaintext is \(m_2\), then \(c_2\) is not a possible ciphertext. On the other hand, from (a) we have \(f(c_2)=\frac4{15}\). Hence \(f(c_2\mid m_2)\ne f(c_2)\), so this cryptosystem does not have perfect security..

(c) \[\begin{aligned} f(c_2\mid m_1) &= f(k_1) = \frac13. \\ f(c_3\mid m_1) &= f(k_3) = \frac13. \end{aligned}\]

(d) \[\begin{aligned} f(k_1\mid c_3) & =0.\\ f(k_2\mid c_3) &= \frac{f(c_3\mid k_2)f(k_2)}{f(c_3)} = \frac{f(m_2)f(k_2)}{f(c_3)} = \frac{(2/5)(1/3)}{4/15} = \frac12. \end{aligned}\] (For the last computation, we used the value \(f(c_3)=\frac{4}{15}\) from (a).)


Problem # 4.43:
Suppose that a shift cipher is employed such that each key, i.e., each shift amount from \(0\) to \(25\), is used with equal probability and such that a new key is chosen to encrypt each successive letter. Show that this cryptosystem has perfect secrecy by filling in the details of the following steps.

(a) Show that \(\sum_{k\in \mathcal{K}} f_M(d_k(c)) = 1\) for every ciphertext \(c\in\mathcal{C}\).

(b) Compute the ciphertext density function \(f_C\) using the formula \[ f_C(c) = \sum_{k \in \mathcal{K}} f_K(k)f_M(d_k(c)). \]

(c) Compare \(f_C(c)\) to \(f_{C\mid M}(c\mid m)\).
Solution: (a) For a given ciphertext \(c\), as \(k\) ranges over all possible keys, i.e., \(k\) ranges over all possible shift amounts, the possible decryptions \(d_k(c)\) of \(c\) range over all of the possible plaintexts. Hence \[ \sum_{k \in {\cal K}} f_M(d_k(c)) = \sum_{m \in {\cal M}} f_M(m) = 1. \]

(b) Since we are assuming that each key is used with equal probability and there are 26 keys, we have \(f_K(k)=\frac1{26}\) for every key \(k\). Hence \[ f_C(c) = \sum_{k \in {\cal K}} f_K(k)f_M(d_k(c)) = \sum_{k \in {\cal K}} \frac{1}{26}f_M(d_k(c)) = \frac{1}{26}, \] where for the last equality we used (a).

(c) For a given ciphertext \(c\) and given plaintext \(m\), there is exactly one key \(k\) such that \(c=e_k(m)\). (If we think of the ciphertexts, plaintexts, and keys as numbers modulo \(26\), then \(e_k(m)\equiv m+k\pmod{26}\), so we have \(c=e_k(m)\) if and only if \(k\equiv c-m\pmod{26}\).) Hence \(f_{C|M}(c\mid m)\) is the probability that the key \(k\) is the particular key that encrypts \(m\) to \(c\), so \(f_{C|M}(c\mid m)=f_K(k)=\frac1{26}\), where the last equality comes from the assumption that every key is used with equal probability. Comparing this with (b), we see that \(f_C(c)=f_{C\mid M}(c\mid m)\) for every ciphertext \(c\) and plaintext \(m\), hence the system has perfect security.


Problem # 4.44: Suppose that a cryptosystem has the same number of plaintexts as it does ciphertexts (\(\#\mathcal{M}=\#\mathcal{C}\)). Prove that for any given key \(k\in\mathcal{K}\) and any given ciphertext \(c\in\mathcal{C}\), there is a unique plaintext \(m\in\mathcal{M}\) that encrypts to \(c\) using the key \(k\). (We used this fact during the proof of Theorem 4.55. Notice that the proof does not require the cryptosystem to have perfect secrecy; all that is needed is that \(\#\mathcal{M}=\#\mathcal{C}\).)
Solution: Fix \(k\in\mathcal{K}\). The encryption map \(e_k:\mathcal{M}\to\mathcal{C}\) is injective by definition of a cryptosystem, so our assumption that \(\#\mathcal{M}=\#\mathcal{C}\) implies that \(e_k\) is also surjective, and hence is a bijective map from \(\mathcal{M}\) to \(\mathcal{C}\). This is equivalent to the assertion that for every \(c\in\mathcal{C}\), there is a unique \(m\in\mathcal{M}\) satisfying \(e_k(m)=c\), which is the desired result.


Problem # 4.48: Let \(X\) be an experiment (random variable) with outcomes \(x_1,\ldots,x_n\) occurring with probabilities \(p_1,\ldots,p_n\), and similarly let \(Y\) be an experiment with outcomes \(y_1,\ldots,y_m\) occurring with probabilities \(q_1,\ldots,q_m\). Consider the experiment \(Z\) consisting of first performing \(X\) and then performing \(Y\). Thus the outcomes of \(Z\) are the \(mn\) pairs \((x_i,y_j)\) occurring with probabilities \(p_iq_j\). Use the formula for entropy (4.51) to prove that \[ H(Z) = H(X) + H(Y). \] Thus entropy is additive on independent compound events, which is a special case of Property \(\textbf{H}_3\) on page 250.
Solution: Using the formula for entropy, we compute \[ \begin{aligned} H(Z) &= - \sum_{i=1}^n\sum_{j=1}^m p_iq_j\log(p_iq_j) \\ &= - \sum_{i=1}^n\sum_{j=1}^m p_iq_j(\log p_i+\log q_j) \\ &= - \sum_{i=1}^m p_i\log p_i \sum_{j=1}^n q_j - \sum_{i=1}^n p_i \sum_{j=1}^m q_j\log q_j \\ &= H(X)\cdot1 + 1\cdot H(Y). \end{aligned} \]


Problem # 4.49: Let \(F(t)\) be a twice differentiable function with the property that \(F''(t)<0\) for all \(t\) in its domain. Prove that \(F\) is concave in the sense of (4.52). Conclude in particular that the function \(F(t)=\log t\) is concave for all \(t>0\).
Solution: We take \(s\) and \(t\) to be fixed and consider the function \[ G(\alpha) = F\bigl((1-\alpha)s+\alpha t\bigr) - \alpha F(t) - (1-\alpha) F(s). \] We need to show that \(G(\alpha)>0\) for all \(0\le\alpha\le1\). We give a proof by contradiction, so we assume that there is some \(0<\beta<1\) such that \(G(\beta)<0\) and we derive a contradiction.

We first observe that \[ G(0)=G(1)=0. \] The assumption that \(G(\beta)<0\) means that the infimum of \(G\) on the interval \([0,1]\) is negative, and since \([0,1]\) is compact, we can find a \(0<\gamma<1\) where \(G\) attains its minimum. (The point \(\gamma\) will be in the interior of the interval because, as already noted, we have \(G(0)=G(1)=0\).) In particular, since \(G\) has a minimum at \(\gamma\), we have \(G'(\gamma)=0\).

The extended mean value theorem says that for any (small) \(h>0\) we can write \[ G(\gamma+h) = G(\gamma) + G'(\gamma)h + \frac12 G''(\delta)h^2 \quad\text{for some \(\gamma<\delta<\gamma+h\).} \] But since \(G(\gamma)\) is the minimum value of \(G\), we have \(G(\gamma)\le G(\gamma+h)\), and further \(G'(\gamma)=0\). This leads to the inequality \[ G(\gamma) \le G(\gamma+h) = G(\gamma) + G'(\gamma)h + \frac12 G''(\delta)h^2 = G(\gamma) + \frac12 G''(\delta)h^2, \] which in turn implies that \[ G''(\delta)\ge0. \] However, directly from the definition of \(G\) we have \[ G''(\delta) = F''\bigl((1-\delta)s+\delta t\bigr)(t-s)^2, \] and the statement of the exercise says that \(F''(x)<0\) for all \(x\), hence \(G''(\delta)<0\). This contradiction completes the proof.


Problem # 4.51(a): Let \(X\) and \(Y\) be independent random variables.

(a) Prove that the equivocation \(H(X\mid Y)\) is equal to the entropy \(H(X)\).
Solution: Independence means that \(f(x\mid y)=f(x)\), so \[ \begin{aligned} H(X\mid Y)&=-\sum_{x,y} f(y)f(x\mid y)\log f(x\mid y)\\ &= -\sum_{x,y} f(y)f(x)\log f(x)\\ &= -\sum_{y} f(y)\sum_x f(x)\log f(x)\\ &= 1\cdot H(X). \end{aligned} \]


Problem # 4.53: Suppose that the key equivocation of a certain cryptosystem vanishes, i.e., suppose that \(H(K\mid C)=0\). Prove that even a single observed ciphertext uniquely determines which key was used.
Solution: By definition of equivocation, we have \[ 0 = H(K\mid C) = -\sum_{k\in\mathcal{K},\, c\in\mathcal{C}} f(c)f(k\mid c)\log_2 f(k\mid c). \] The values of the density functions are all between \(0\) and \(1\), so every term \(f(c)f(k\mid c)\log_2 f(k\mid c)\) in the sum is non-positive (because the two factors in front are non-negative, and the value of the logarithm is non-positive). The only way that a sum of non-positive quantities can equal \(0\) is for every term in the sum to equal \(0\). Hence \[ f(c)f(k\mid c)\log_2 f(k\mid c) = 0 \quad\text{for every \(k\in\mathcal{K}\) and every \(c\in\mathcal{C}\).} \] Suppose now that we observe some ciphertext \(c'\). This means that \(f(c')>0\), i.e., the ciphertext \(c'\) appears with non-zero probability. It follows that for every key \(k\), either \[ f(k\mid c')=0\quad\text{or}\quad f(k\mid c')=1, \] since the quantity \(f(k\mid c')\log_2 f(k\mid c')\) has to vanish. But we also know that \[ \sum_{k\in\mathcal{K}} f(k\mid c') = 1, \] since some key must have been used. It follows that there is a unique key \(k'\) satisfying \(f(k'\mid c')=1\), and for every other key \(k\ne k'\) we have \(f(k\mid c')=0\). Hence the single observed ciphertext \(c'\) uniquely determines the key.