This post continues my previous post. I picked up my Verilog implementation of the Data Encryption Standard (DES) and completed the encryption operation. Eventually, I intend to wrap this IP in a 3DES block and integrate it into a firmware build for the Zynq7000 as a memory-mapped peripheral for use by the processing system. I am not going to cover the DES algorithm itself in detail, but a reference I used in my previous post has an excellent overview.

Disclaimer: I am a complete Verilog novice. I'm certain this implementation contains poor design choices, bad practices, and bugs. Eventually I intend to have someone knowledgeable review and critique my work so I can improve it.

## DES in Python

As before, I wanted to finish my DES implementation in Python before moving on to Verilog. In the last post, I completed the round key schedule. Next, I implemented the remaining operations and tidied up the code. I am not going to go through the Python implementation in detail because it isn't the focus of this series of posts. The Python script was extremely useful for printing fine-grained debug messages that I could then compare against the Verilog implementation. The complete Verilog and Python implementations will be available in the repository of a future post when the project is finished.

The inputs and resulting output from my Python implementation are:

``````Plaintext data: 00000001 00100011 01000101 01100111 10001001 10101011 11001101 11101111
Key:            00010011 00110100 01010111 01111001 10011011 10111100 11011111 11110001
Ciphertext out: 10000101 11101000 00010011 01010100 00001111 00001010 10110100 00000101

``````

For the sake of completeness, I verified my implementation against a Python DES library using a short script:

``````from des import DesKey

# Input key, converted to hex:
# 0b0001001100110100010101110111100110011011101111001101111111110001
key = "133457799BBCDFF1"
key_arr = bytes(bytearray.fromhex(key))

# Plaintext, converted to hex:
# 0b0000000100100011010001010110011110001001101010111100110111101111
plaintext = "0123456789ABCDEF"
pt_arr = bytes(bytearray.fromhex(plaintext))

#Create the key
Key0 = DesKey(key_arr)

#Use it to encrypt the plaintext
cipher = Key0.encrypt(pt_arr)

#Print the ciphertext
# = 85e813540f0ab405
# = 1000010111101000000100110101010000001111000010101011010000000101
print(cipher.hex())
``````

The output from my implementation and the library implementation match as expected.
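
Comparing the two results by eye is awkward because the script prints space-separated bytes, the library prints hex, and the Verilog test bench prints one unbroken binary string. As a rough illustration, a pair of small helpers can normalize all three forms. The names `from_bits` and `to_grouped_bits` are hypothetical and are not taken from the script described above.

```python
def from_bits(text: str) -> int:
    """Parse a binary string into an int, ignoring spaces and underscores."""
    return int(text.replace(" ", "").replace("_", ""), 2)


def to_grouped_bits(value: int, width: int = 64, group: int = 8) -> str:
    """Format an int as zero-padded binary in space-separated groups."""
    bits = format(value, f"0{width}b")
    return " ".join(bits[i:i + group] for i in range(0, width, group))


# Ciphertext as printed by the script, and as printed by the library
mine = from_bits("10000101 11101000 00010011 01010100 "
                 "00001111 00001010 10110100 00000101")
library = int("85e813540f0ab405", 16)

print(mine == library)
print(to_grouped_bits(library))
```

The same `from_bits` call also accepts the underscore-separated literals used in a Verilog test bench, which keeps the test vectors copy-pasteable between the two languages.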

## HDL Implementation

I continued implementing the rest of the encryption process in my IP block. I have one clock-driven `always` block that completes all the steps of the encryption process.

At the top of the block, I declared registers to hold intermediate values.

`````` //Previous C and D key components
reg [0:27] cprev;
reg [0:27] dprev;

//Previous L and R data components
reg [0:31] lprev;
reg [0:31] rprev;

//Current C and D key components
reg [0:27] cn;
reg [0:27] dn;

//Current L and R data components
reg [0:31] ln;
reg [0:31] rn;

//Round key
reg [0:47] round_key;

//Transposed concatenation of Rn and Ln
reg [0:63] rl;
``````

I combined the logic from my previous key-schedule-only module with the rest of the encryption rounds. I won't cover the same functions I covered in my last post. The first step is the `PC-1` permutation of the key.

``````pc1_key = pc1(key);
``````

Next, the initial permutation `IP` of the plaintext data.

``````plain_p = ip(plaintext);
``````

`IP` is carried out by a dedicated function. As with other permutations, I represented `IP` with a look-up array, `ip_table`, initialized in the `initial` block of the module.

``````function [0:63] ip;
input [0:63] data;
integer i;
begin
for(i = 0; i < 64; i=i+1)
begin
ip[i] = data[ip_table[i]];
end
end
endfunction
``````
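
For comparison, the same table-driven permutation can be sketched in Python. This is a minimal model rather than the script from this project: `permute` and `IP_TABLE` are illustrative names, and the table holds the 1-based bit positions as published in the DES standard, whereas the Verilog function above indexes `data` directly and so needs 0-based values in `ip_table`.

```python
# Standard DES initial permutation table (1-based positions, bit 1 = MSB).
# Each row starts at the listed value and steps down by 8.
IP_TABLE = [start - 8 * col
            for start in (58, 60, 62, 64, 57, 59, 61, 63)
            for col in range(8)]


def permute(value: int, table: list[int], in_width: int) -> int:
    """Output bit i (MSB first) is input bit table[i] (1-based, MSB first)."""
    out = 0
    for pos in table:
        bit = (value >> (in_width - pos)) & 1
        out = (out << 1) | bit
    return out


plain_p = permute(0x0123456789ABCDEF, IP_TABLE, 64)
print(f"{plain_p:016X}")  # CC00CCFFF0AAF0AA
```

Because the output width is simply the table length, the same helper would also serve for the other DES permutations, given their tables.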

Next, `C0` and `D0` are obtained from the `PC-1` permuted key. In my implementation, I stored these values in `cprev` and `dprev`.

``````cprev = pc1_key[0:27];
dprev = pc1_key[28:55];
``````

Similarly, the `L0` and `R0` values are obtained from the initial permutation of the plaintext.

``````lprev = plain_p[0:31];
rprev = plain_p[32:63];
``````
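
The part-selects above rely on the vectors being declared with ascending ranges such as `[0:63]`, so index 0 is the most significant bit. When mirroring this in Python with plain integers, the slice has to be translated into a shift and a mask. The helper below is a sketch under that assumption; `bit_slice` is a hypothetical name and the input is just an example 64-bit value.

```python
def bit_slice(value: int, width: int, first: int, last: int) -> int:
    """Return bits first..last of a width-bit value, where index 0 is the MSB.

    Mirrors a Verilog part-select such as plain_p[0:31] on a vector
    declared with an ascending range like reg [0:63].
    """
    count = last - first + 1
    shift = width - 1 - last
    return (value >> shift) & ((1 << count) - 1)


plain_p = 0xCC00CCFFF0AAF0AA  # example 64-bit value
lprev = bit_slice(plain_p, 64, 0, 31)
rprev = bit_slice(plain_p, 64, 32, 63)
print(f"{lprev:08X} {rprev:08X}")  # CC00CCFF F0AAF0AA
```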

Next, I created a loop to complete the sixteen rounds of encryption. The first step is to calculate the current round `C` and `D` using the left-rotation schedule and shift function.

``````cn = left_rotate(cprev, shift_table[i-1]);
dn = left_rotate(dprev, shift_table[i-1]);
``````
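
A Python model of the rotation is handy for checking the per-round `C` and `D` values. The sketch below assumes the standard DES shift schedule and a 28-bit half; it reuses the names `left_rotate` and `shift_table` from the Verilog for readability, but it is not the code from the project, and the starting value is only an example.

```python
# Standard DES left-shift schedule for rounds 1..16
SHIFT_TABLE = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]


def left_rotate(value: int, amount: int, width: int = 28) -> int:
    """Rotate a width-bit value left by amount bits."""
    mask = (1 << width) - 1
    return ((value << amount) | (value >> (width - amount))) & mask


c0 = 0xF0CCAAF  # example 28-bit half
c = c0
for i in range(1, 17):  # rounds are numbered from 1, hence the i - 1 index
    c = left_rotate(c, SHIFT_TABLE[i - 1])

# The shifts sum to 28, so sixteen rounds bring the half back to its start
print(c == c0)
```

The fact that the schedule sums to 28 is a useful sanity check on both the table and the rotate function.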

The round key is created using the `PC-2` permutation on the concatenation of `C` and `D`.

``````round_key = pc2({cn, dn});
``````

`L` is assigned the previous round's `R`, `R(N-1)`.

``````ln = rprev;
``````

`R` is calculated by XORing the previous `L` with the output of the function `f`.

`````` rn = lprev ^ f(rprev, round_key);
``````
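
These two assignments are the whole Feistel structure, and a nice property of that structure is that it can be undone no matter what `f` does. The sketch below illustrates this with a deliberately simple stand-in round function. `toy_f` is NOT the DES `f` function, and the round keys are arbitrary numbers, so this only demonstrates the round structure, not DES itself.

```python
def toy_f(right: int, key: int) -> int:
    """Stand-in round function (an assumption for illustration, not DES f)."""
    return ((right * 0x9E3779B1) ^ key) & 0xFFFFFFFF


def feistel_encrypt(left, right, round_keys, f):
    """Run one Feistel round per key, then swap the halves at the end."""
    for key in round_keys:
        # ln = rprev; rn = lprev ^ f(rprev, round_key)
        left, right = right, left ^ f(right, key)
    return right, left


def feistel_decrypt(left, right, round_keys, f):
    """Decryption is the same network with the keys in reverse order."""
    return feistel_encrypt(left, right, list(reversed(round_keys)), f)


keys = list(range(1, 17))  # arbitrary example keys
l0, r0 = 0xCC00CCFF, 0xF0AAF0AA
a, b = feistel_encrypt(l0, r0, keys, toy_f)
print(feistel_decrypt(a, b, keys, toy_f) == (l0, r0))  # True
```

This is why adding decryption to a design like this mostly comes down to running the key schedule in the opposite order.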

The function `f` takes the `R(N-1)` plaintext half and the round key as arguments. The 32 bits of data are expanded to 48 bits with a bit selection table, `e`. The output of this expansion is XORed with the round key.

``````//F function
function [0:31] f;
input [0:31] data;
input [0:47] round_key;
reg [0:47] e;
begin
//XOR the expanded data bits with the round key
e = ebit(data) ^ round_key;
``````
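
The expansion step can be modeled in Python in a few lines. This sketch generates the standard E bit-selection table from its regular pattern (each row takes six consecutive bits and overlaps its neighbours by two) rather than typing it out. The names `E_TABLE` and `expand` are illustrative, and the data and round key values are just example inputs.

```python
# Standard DES E bit-selection table (1-based positions, bit 1 = MSB):
# 32 1 2 3 4 5 / 4 5 6 7 8 9 / ... / 28 29 30 31 32 1
E_TABLE = [((4 * row + offset - 1) % 32) + 1
           for row in range(8)
           for offset in range(6)]


def expand(data: int) -> int:
    """Expand a 32-bit half to 48 bits by selecting bits per E_TABLE."""
    out = 0
    for pos in E_TABLE:
        out = (out << 1) | ((data >> (32 - pos)) & 1)
    return out


data = 0xF0AAF0AA           # example 32-bit half
round_key = 0x1B02EFFC7072  # example 48-bit round key
e = expand(data) ^ round_key
print(f"{expand(data):012X} {e:012X}")  # 7A15557A1555 6117BA866527
```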

The resulting 48-bit value is split into eight six-bit parts. Each of these parts is passed to its own `S` box matrix. The first and last bits provide the row index and the middle four bits provide the column index. I created a function for the `S` box operation and represented the `S` boxes as multidimensional arrays.

``````reg [7:0] sbox_1 [0:3][0:15];
reg [7:0] sbox_2 [0:3][0:15];
reg [7:0] sbox_3 [0:3][0:15];
reg [7:0] sbox_4 [0:3][0:15];
reg [7:0] sbox_5 [0:3][0:15];
reg [7:0] sbox_6 [0:3][0:15];
reg [7:0] sbox_7 [0:3][0:15];
reg [7:0] sbox_8 [0:3][0:15];
``````

These boxes were initialized in the `initial` block. The `S` function takes the box number (1-8) and the 6-bit input value. The row and column index are calculated and used to access the values in the respective `S` box.

``````//S-box function
function [0:3] sbox;
input [0:3] box_num;
input [0:5] val;
begin
case(box_num)
4'd1: sbox = sbox_1[{val[0],val[5]}][val[1:4]];
4'd2: sbox = sbox_2[{val[0],val[5]}][val[1:4]];
4'd3: sbox = sbox_3[{val[0],val[5]}][val[1:4]];
4'd4: sbox = sbox_4[{val[0],val[5]}][val[1:4]];
4'd5: sbox = sbox_5[{val[0],val[5]}][val[1:4]];
4'd6: sbox = sbox_6[{val[0],val[5]}][val[1:4]];
4'd7: sbox = sbox_7[{val[0],val[5]}][val[1:4]];
4'd8: sbox = sbox_8[{val[0],val[5]}][val[1:4]];
//Box numbers outside 1-8 are never passed in; return a defined value anyway
default: sbox = 4'd0;
endcase
end
endfunction
``````
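
The row and column indexing is easy to get backwards, so it is worth checking against a small Python model. The sketch below uses the standard `S1` table only; `sbox_lookup` is a hypothetical helper name, and the 6-bit input is treated as an integer whose most significant bit corresponds to `val[0]` in the Verilog.

```python
# Standard DES S-box 1, indexed as S1[row][column]
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(box, val: int) -> int:
    """Look up a 6-bit value: outer bits pick the row, inner four the column."""
    row = (((val >> 5) & 1) << 1) | (val & 1)  # {val[0], val[5]}
    col = (val >> 1) & 0xF                     # val[1:4]
    return box[row][col]


# 011011 -> row 01 (1), column 1101 (13)
print(sbox_lookup(S1, 0b011011))  # 5
```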

The outputs of the `S` function calls are concatenated to produce an intermediate 32-bit value (eight 4-bit results).

``````f[0:3]   = sbox(1, e[0:5]);
f[4:7]   = sbox(2, e[6:11]);
f[8:11]  = sbox(3, e[12:17]);
f[12:15] = sbox(4, e[18:23]);
f[16:19] = sbox(5, e[24:29]);
f[20:23] = sbox(6, e[30:35]);
f[24:27] = sbox(7, e[36:41]);
f[28:31] = sbox(8, e[42:47]);
``````
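
The slicing and packing in these eight assignments can be expressed as a loop in a Python model. The sketch below shows only the plumbing: the tables passed in are placeholders that return the column index, NOT the real DES S-boxes, so the output is meaningful only as a check that each 6-bit chunk lands in the right 4-bit slot. `apply_sboxes` and `PLACEHOLDER` are hypothetical names.

```python
def apply_sboxes(e: int, sboxes) -> int:
    """Split a 48-bit value into eight 6-bit chunks and pack eight 4-bit results."""
    out = 0
    for n in range(8):
        chunk = (e >> (42 - 6 * n)) & 0x3F         # e[6n : 6n+5], MSB first
        row = (((chunk >> 5) & 1) << 1) | (chunk & 1)
        col = (chunk >> 1) & 0xF
        out = (out << 4) | sboxes[n][row][col]     # f[4n : 4n+3]
    return out


# Placeholder tables (NOT the DES S-boxes): every entry equals its column index
PLACEHOLDER = [[list(range(16)) for _ in range(4)] for _ in range(8)]

result = apply_sboxes(0x6117BA866527, PLACEHOLDER)
print(f"{result:08X}")  # C8FD03A3, the middle four bits of each chunk
```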

Finally, the intermediate value is permuted using a permutation function `P`. As with the other permutations, `P` is represented as a table that is initialized in the `initial` block of the module.

``````f = p(f);
``````

The final step in the round is to set up the `N-1` variables for subsequent rounds.

``````rprev = rn;
lprev = ln;
cprev = cn;
dprev = dn;
``````

After the sixteen encryption rounds are completed, the `R` and `L` from the final round are concatenated in reverse order. This value is then permuted with the inverse of `IP`. The result of this permutation is the ciphertext, written to the module's `ciphertext` output.

``````//Final concatenation
rl = {rprev, lprev};

//Assign ciphertext to inverse IP permutation
ciphertext = ip_inv(rl);
``````
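
In a Python model, the inverse table does not need to be typed in separately because it can be derived from the `IP` table. The sketch below does that and applies it to an example pair of final-round halves chosen so that the result matches the ciphertext shown earlier. The names are illustrative, and the tables use the 1-based positions from the DES standard.

```python
# Standard DES initial permutation table (1-based positions, bit 1 = MSB)
IP_TABLE = [start - 8 * col
            for start in (58, 60, 62, 64, 57, 59, 61, 63)
            for col in range(8)]

# Invert it: if IP sends input bit p to output bit i, the inverse sends i to p
IP_INV_TABLE = [0] * 64
for i, pos in enumerate(IP_TABLE):
    IP_INV_TABLE[pos - 1] = i + 1


def permute(value: int, table: list[int], in_width: int) -> int:
    """Output bit i (MSB first) is input bit table[i] (1-based, MSB first)."""
    out = 0
    for pos in table:
        out = (out << 1) | ((value >> (in_width - pos)) & 1)
    return out


r16, l16 = 0x0A4CD995, 0x43423234  # example final-round halves
rl = (r16 << 32) | l16             # R first, then L
ciphertext = permute(rl, IP_INV_TABLE, 64)
print(f"{ciphertext:016X}")  # 85E813540F0AB405
```

Deriving the inverse this way also gives a cheap self-test, since permuting any value with `IP_TABLE` and then `IP_INV_TABLE` should return the original.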

## Encryption Output in Simulation

I modified the DES IP block to accept a 64-bit plaintext input and to output the 64 bits of ciphertext.

``````module DES(
input clock,
input  [0:63] key,
input  [0:63] plaintext,
output reg [0:63] ciphertext
);

``````

I modified the test bench for the block to provide a 64-bit plaintext input, drive the clock, and print the resulting ciphertext.

```````timescale 1ns / 1ps

module test_bench;

reg clock = 0;
reg [0:63] key = 0;
reg [0:63] plaintext = 0;
//Driven by the UUT output, so it must not also be assigned 0 here
wire [0:63] ciphertext;
//UUT
DES uut (
.key(key),
.plaintext(plaintext),
.ciphertext(ciphertext),
.clock(clock)
);

integer k = 0;
initial
begin
key =       64'B00010011_00110100_01010111_01111001_10011011_10111100_11011111_11110001;
plaintext = 64'B00000001_00100011_01000101_01100111_10001001_10101011_11001101_11101111;
#5 clock = 1;
#10 $display("Cipher = %b", ciphertext);
#5 $finish;
end

endmodule

``````

Running the simulation results in the expected ciphertext output.

``````Vivado Simulator 2020.1
Time resolution is 1 ps
source test_bench.tcl
# set curr_wave [current_wave_config]
# if { [string length $curr_wave] == 0 } {
#   if { [llength [get_objects]] > 0} {
#     set_property needs_save false [current_wave_config]
#   } else {
#      send_msg_id Add_Wave-1 WARNING "No top level signals found. Simulator will start without a wave window. If you want to open a wave window go to 'File->New Waveform Configuration' or type 'create_wave_config' in the TCL console."
#   }
# }
# run 1000ns
Cipher = 1000010111101000000100110101010000001111000010101011010000000101

``````

## Summary

I'm happy to have this design working as expected. The next step will be to add support for decryption before eventually wrapping the block and tying it in to the processing system on the Zynq7000. I will cover those next steps in future posts.