# Computing sin & cos in hardware with synthesisable Verilog

The CORDIC algorithm is a clever method for accurately computing trigonometric functions using only additions, bitshifts and a small lookup table.

## The Algorithm

It’s well known that rotating the vector $$(1, 0)$$ anticlockwise about the origin by an angle $$\theta$$ gives the vector $$(\cos \theta, \sin \theta)$$. We will use this as the basis of our algorithm:

function cordic(θ) {
[c, s] ← rotate([1, 0], θ)
return c, s
}

Let’s split up this big rotation into $$N$$ smaller rotations, with the angle of rotation in step $$i$$ given by $$\alpha_i$$. At the moment, we don’t care what the values of $$\alpha_i$$ are, we just want their sum to be equal to $$\theta$$.

α ← [
...
]

function cordic(θ) {
c ← 1
s ← 0
for i in 0 .. N-1 {
[c, s] ← rotate([c, s], α[i])
}
return c, s
}

How do we rotate a vector? The page for rotation on Wikipedia tells us that it is equivalent to left-multiplying by a particular matrix:

α ← [
...
]

function cordic(θ) {
c ← 1
s ← 0
for i in 0 .. N-1 {
rotation_matrix ← [[cos α[i], -sin α[i]],
[sin α[i],  cos α[i]]]
[c, s] ← rotation_matrix × [c, s]
}
return c, s
}

Let’s expand out this matrix multiplication:

α ← [
...
]

function cordic(θ) {
c ← 1
s ← 0
for i in 0 .. N-1 {
c_new ← cos α[i] × c - sin α[i] × s
s_new ← sin α[i] × c + cos α[i] × s
c ← c_new
s ← s_new
}
return c, s
}

We will use the fact that $$\sin \alpha_i = \cos \alpha_i \tan \alpha_i$$ to factorise by $$\cos \alpha_i$$:

α ← [
...
]

function cordic(θ) {
c ← 1
s ← 0
for i in 0 .. N-1 {
c_new ← cos α[i] × (c - tan α[i] × s)
s_new ← cos α[i] × (s + tan α[i] × c)
c ← c_new
s ← s_new
}
return c, s
}

Now, we will consider the values of $$\alpha_i$$. Let’s assume that each rotation angle $$\alpha_i$$ has a fixed magnitude $$\beta_i$$ with respect to $$\theta$$, but can be either positive or negative depending on the value of $$\theta$$. In this way, the rotating vector can be directed to converge on a particular result vector.

We will accomplish this using a new variable $$\phi$$, initially at 0. Each step, we compare $$\phi$$ to $$\theta$$. If $$\phi$$ is less than $$\theta$$, we need to rotate by a positive angle (i.e. $$\alpha_i = \beta_i$$). If $$\phi$$ is greater than $$\theta$$, we need to rotate by a negative angle ($$\alpha_i = -\beta_i$$).

β ← [
...
]

function cordic(θ) {
c ← 1
s ← 0
φ ← 0
for i in 0 .. N-1 {
if φ < θ {
direction ← 1
} else {
direction ← -1
}
α ← direction × β[i]
c_new ← cos α × (c - tan α × s)
s_new ← cos α × (s + tan α × c)
c ← c_new
s ← s_new
φ ← φ + α
}
return c, s
}

Let’s eliminate the variable $$\alpha$$ and use $$\beta_i$$ directly:

β ← [
...
]

function cordic(θ) {
c ← 1
s ← 0
φ ← 0
for i in 0 .. N-1 {
if φ < θ {
direction ← 1
} else {
direction ← -1
}
c_new ← cos (direction × β[i]) × (c - tan (direction × β[i]) × s)
s_new ← cos (direction × β[i]) × (s + tan (direction × β[i]) × c)
c ← c_new
s ← s_new
φ ← φ + (direction × β[i])
}
return c, s
}

$$\cos (-x) = \cos x$$ and $$\tan (-x) = -\tan x$$, so this program is equivalent to:

β ← [
...
]

function cordic(θ) {
c ← 1
s ← 0
φ ← 0
for i in 0 .. N-1 {
if φ < θ {
direction ← 1
} else {
direction ← -1
}
c_new ← cos β[i] × (c - direction × tan β[i] × s)
s_new ← cos β[i] × (s + direction × tan β[i] × c)
c ← c_new
s ← s_new
φ ← φ + (direction × β[i])
}
return c, s
}

Now what about the values of $$\beta_i$$? If we assign them to be such that $$\tan \beta_i = 2^{-i}$$, then we have:

β ← [
atan 2^0,
atan 2^1,
...
atan 2^(N-1)
]

function cordic(θ) {
c ← 1
s ← 0
φ ← 0
for i in 0 .. N-1 {
if φ < θ {
direction ← 1
} else {
direction ← -1
}
c_new ← cos β[i] × (c - direction × 2^(-i) × s)
s_new ← cos β[i] × (s + direction × 2^(-i) × c)
c ← c_new
s ← s_new
φ ← φ + (direction × β[i])
}
return c, s
}

Multiplication by $$2^{-i}$$ is a shift-right by $$i$$ places:

β ← [
atan 2^0,
atan 2^1,
...
atan 2^(N-1)
]

function cordic(θ) {
c ← 1
s ← 0
φ ← 0
for i in 0 .. N-1 {
if φ < θ {
direction ← 1
} else {
direction ← -1
}
c_new ← cos β[i] × (c - direction × (s >> i))
s_new ← cos β[i] × (s + direction × (c >> i))
c ← c_new
s ← s_new
φ ← φ + (direction × β[i])
}
return c, s
}

Let’s move the multiplications by $$\cos \beta_i$$ into a separate loop:

β ← [
atan 2^0,
atan 2^1,
...
atan 2^(N-1)
]

function cordic(θ) {
c ← 1
s ← 0
φ ← 0
for i in 0 .. N-1 {
if φ < θ {
direction ← 1
} else {
direction ← -1
}
c_new ← c - direction × (s >> i)
s_new ← s + direction × (c >> i)
c ← c_new
s ← s_new
φ ← φ + (direction × β[i])
}
for i in 0 .. N-1 {
c ← c × cos β[i]
s ← s × cos β[i]
}
return c, s
}

In fact, all these multiplications can be replaced with a single value, which we will call $$K$$. $$K$$ is equal to the product of $$\cos \beta_i$$ for all $$i$$ between $$0$$ and $$N-1$$ inclusive. To put it mathematically,

$K = \prod_{i=0}^{N-1} \cos \beta_i$

It can be shown through the use of trigonometric identities that:

$K = \prod_{i=0}^{N-1} \frac{1}{\sqrt{1 + 2^{-2i}}}$

$$K$$ tends to approximately $$0.607252935$$ as $$N$$ increases.

Therefore, we have:

β ← [
atan 2^0,
atan 2^1,
...
atan 2^(N-1)
]

K ← 1
for i in 0 .. N-1 {
K ← K × cos β[i]
}

function cordic(θ) {
c ← K
s ← 0
φ ← 0
for i in 0 .. N-1 {
if φ < θ {
direction ← 1
} else {
direction ← -1
}
c_new ← c - direction × (s >> i)
s_new ← s + direction × (c >> i)
c ← c_new
s ← s_new
φ ← φ + (direction × β[i])
}
return c, s
}

Finally, we will change $$\phi$$ so that it now holds $$\theta$$ minus “whatever it was holding before this change”. This replaces the comparison against $$\theta$$ with a comparison against $$0$$, which is simpler to compute:

β ← [
atan 2^0,
atan 2^1,
...
atan 2^(N-1)
]

K ← 1
for i in 0 .. N-1 {
K ← K × cos β[i]
}

function cordic(θ) {
c ← K
s ← 0
φ ← θ
for i in 0 .. N-1 {
if φ > 0 {
direction ← 1
} else {
direction ← -1
}
c_new ← c - direction × (s >> i)
s_new ← s + direction × (c >> i)
c ← c_new
s ← s_new
φ ← φ - (direction × β[i])
}
return c, s
}

What we now have is an algorithm that is possible to implement in hardware, but is still equivalent to the original algorithm.

## The implementation

We will assume that all numbers are stored as 32-bit fixed-point numbers, with the radix point between the second-most-significant and third-most-significant bits. This allows for numbers ranging between -2 and 2, which includes the interval $$-\frac{\pi}{2}$$ to $$\frac{\pi}{2}$$ that all other inputs can be reduced into.

First off, we declare the constants that we will use:

define K 32'h26dd3b6a  // = 0.6072529350088814

define BETA_0  32'h3243f6a9  // = atan 2^0     = 0.7853981633974483
define BETA_1  32'h1dac6705  // = atan 2^(-1)  = 0.4636476090008061
define BETA_2  32'h0fadbafd  // = atan 2^(-2)  = 0.24497866312686414
define BETA_3  32'h07f56ea7  // = atan 2^(-3)  = 0.12435499454676144
define BETA_4  32'h03feab77  // = atan 2^(-4)  = 0.06241880999595735
define BETA_5  32'h01ffd55c  // = atan 2^(-5)  = 0.031239833430268277
define BETA_6  32'h00fffaab  // = atan 2^(-6)  = 0.015623728620476831
define BETA_7  32'h007fff55  // = atan 2^(-7)  = 0.007812341060101111
define BETA_8  32'h003fffeb  // = atan 2^(-8)  = 0.0039062301319669718
define BETA_9  32'h001ffffd  // = atan 2^(-9)  = 0.0019531225164788188
define BETA_10 32'h00100000  // = atan 2^(-10) = 0.0009765621895593195
define BETA_11 32'h00080000  // = atan 2^(-11) = 0.0004882812111948983
define BETA_12 32'h00040000  // = atan 2^(-12) = 0.00024414062014936177
define BETA_13 32'h00020000  // = atan 2^(-13) = 0.00012207031189367021
define BETA_14 32'h00010000  // = atan 2^(-14) = 6.103515617420877e-05
define BETA_15 32'h00008000  // = atan 2^(-15) = 3.0517578115526096e-05
define BETA_16 32'h00004000  // = atan 2^(-16) = 1.5258789061315762e-05
define BETA_17 32'h00002000  // = atan 2^(-17) = 7.62939453110197e-06
define BETA_18 32'h00001000  // = atan 2^(-18) = 3.814697265606496e-06
define BETA_19 32'h00000800  // = atan 2^(-19) = 1.907348632810187e-06
define BETA_20 32'h00000400  // = atan 2^(-20) = 9.536743164059608e-07
define BETA_21 32'h00000200  // = atan 2^(-21) = 4.7683715820308884e-07
define BETA_22 32'h00000100  // = atan 2^(-22) = 2.3841857910155797e-07
define BETA_23 32'h00000080  // = atan 2^(-23) = 1.1920928955078068e-07
define BETA_24 32'h00000040  // = atan 2^(-24) = 5.960464477539055e-08
define BETA_25 32'h00000020  // = atan 2^(-25) = 2.9802322387695303e-08
define BETA_26 32'h00000010  // = atan 2^(-26) = 1.4901161193847655e-08
define BETA_27 32'h00000008  // = atan 2^(-27) = 7.450580596923828e-09
define BETA_28 32'h00000004  // = atan 2^(-28) = 3.725290298461914e-09
define BETA_29 32'h00000002  // = atan 2^(-29) = 1.862645149230957e-09
define BETA_30 32'h00000001  // = atan 2^(-30) = 9.313225746154785e-10
define BETA_31 32'h00000000  // = atan 2^(-31) = 4.656612873077393e-10

Then, we introduce the module and its ports.

module cordic(
angle,

clock,    // Master clock
reset,    // Master asynchronous reset (active-high)
start,    // An input signal that the user of this module should set to high when computation should begin
angle_in, // Input angle
cos_out,  // Output value for cosine of angle
sin_out   // Output value for sine of angle
);

input clock;
input reset;
input start;
input [31:0] angle_in;
output [31:0] cos_out;
output [31:0] sin_out;

wire [31:0] cos_out = cos;
wire [31:0] sin_out = sin;

We need five registers: cos, sin and angle from the above algorithm, a counter register count, and a state register state. The first three are 32 bits wide, since they are storing fixed-point numbers as described above. count is 5 bits since 32 iterations will be used, and state is only 1 bit since we only need two states.

The inputs to this registers will be determined in a sequential logic block, so five more registers cos_next, sin_next, angle_next, count_next and state_next are defined, although these will be automatically reduced to logic functions during synthesis.

reg [31:0] cos;
reg [31:0] sin;
reg [31:0] angle;
reg [4:0] count;
reg state;

reg [31:0] cos_next;
reg [31:0] sin_next;
reg [31:0] angle_next;
reg [4:0] count_next;
reg state_next;

always @(posedge clock or posedge reset) begin
if (reset) begin
cos <= 0;
sin <= 0;
angle <= 0;
count <= 0;
state <= 0;
end else begin
cos <= cos_next;
sin <= sin_next;
angle <= angle_next;
count <= count_next;
state <= state_next;
end
end

The two states that we will use are 0, an idle state, and 1, a state indicating computation is occurring.

The logic block is defined as follows:

always @* begin
// Set all logic regs to a value to prevent any of them holding the value
// from last tick and hence being misinterpreted as hardware registers.
cos_next = cos;
sin_next = sin;
angle_next = angle;
count_next = count;
state_next = state;

if (state) begin
// Compute mode.
cos_next = cos + (direction_negative ? sin_shr : -sin_shr);
sin_next = sin + (direction_negative ? -cos_shr : cos_shr);
angle_next = angle + (direction_negative ? beta : -beta);
count_next = count + 1;

if (count == 31) begin
// If this is the last iteration, go back to the idle state.
state_next = 0;
end
end

else begin
// Idle mode.
if (start) begin
cos_next = K;         // Set up initial value for cos.
sin_next = 0;          // Set up initial value for sin.
angle_next = angle_in; // Latch input angle into the angle register.
count_next = 0;        // Set up counter.
state_next = 1;        // Go to compute mode.
end
end
end

The cos_shr and sin_shr signals refer to the values of cos and sin shifted right by count places:

wire [31:0] cos_signbits = {32{cos}};
wire [31:0] sin_signbits = {32{sin}};
wire [31:0] cos_shr = {cos_signbits, cos} >> count;
wire [31:0] sin_shr = {sin_signbits, sin} >> count;

The direction_negative signal is high if direction in the pseudocode algorithm is negative (if $$φ < 0$$), and low if it is positive (if $$φ ≥ 0$$). It is therefore defined to simply be equal to the sign bit of angle:

wire direction_negative = angle;

Lastly, we will define the lookup table used for the beta values:

wire [31:0] beta_lut [0:31];
assign beta_lut = BETA_0;
assign beta_lut = BETA_1;
// ...
assign beta_lut = BETA_31;

wire [31:0] beta = beta_lut[count];`