Sunday, 7 August 2016

The ALU

Arithmetic Logic Unit

The clue's in the name - this block handles both arithmetic and logical operations. We want to implement a total of ten operations:
  • Addition with carry
  • Subtraction with borrow
  • Increment
  • Decrement
  • AND
  • OR
  • XOR
  • Shift right from and into the carry
  • Shift left from and into the carry
  • Compare
Though with the aim of simplifying the hardware, we're only going to implement the first eight of these directly; we'll look at the last two towards the end of this article.

Let's start with the arithmetic, since the logical instructions are trivial. If we use the conventional two's complement notation, we can do it all with adders. The 74hc283 (datasheet here) can add two four bit numbers and a carry input to give a four bit plus carry result; two can be used for eight bit data simply by taking the carry out of the low nibble adder and using it as the carry in for the high nibble.

So far, so good (though we have not yet addressed the carry, for reasons which will become obvious). But what about subtraction?

Subtraction is mathematically equivalent to adding a negative number. Using two's complement notation, a number has its sign changed (i.e. it is multiplied by minus one) by first inverting all the bits and then adding one, so -1 in decimal becomes 0b11111111 in binary. Let's add an inverter:

We use the multiplexor to select either the inverted or non-inverted B, and then add it to A and the carry. For the subtraction to be correct, we have to set the carry; if the carry is clear, the result will be one less than expected - the carry is acting as an inverted borrow. Also, the carry out also behaves as an inverted borrow - it is cleared if the calculation overflows and needs to borrow from a further stage, otherwise it is set. So when we add, we have a carry, and when we subtract, we have an inverted borrow - but this is internally consistent and works properly for multi-byte arithmetic:

clc
lda param1_lo
adc param2_lo    ; carry is clear if no overflow
sta result_lo
lda param1_hi
adc param2_hi
sta result_hi

and

sec
lda param1_lo
sbc param2_lo    ; ~borrow is set if no underflow
sta result_lo
lda param1_hi
sbc param2_hi
sta result_hi

But what about increment and decrement operations? They're extremely common, and while we could of course just add or subtract immediate values that takes an extra byte in the instruction and an extra clock cycle to get that data. We'd also have to preset the carry/~borrow - another instruction. It's faster to have this internally arranged.

I've added a further multiplexor. This one allows us to replace the B input with a fixed zero. There are also now a couple of control lines which control the multiplexors - so we can select either the normal or inverted version of the B input or zero.

To increment, we select the zero, don't invert it, and add it to A - making sure that we set the carry. So the result is A+1. To decrement, we invert the zero to get minus one, and add that but with the carry cleared.

A little combinational logic ensures that the carry is arranged correctly: it's passed in unchanged for addition and subtraction, or set or cleared as required for the increment and decrement. The mode bits - three of them - tell the ALU which operation it should be performing.

There's one extra function that's sneaked in there: a compare input. To perform a compare, you need to subtract the operand from the accumulator, noting the flags but discarding the result. To do that, it's convenient not to need to worry about the carry flag, which must be set. The compare input looks after that for us.

Now we can look at the logic functions, which are indeed trivial (though note that black lines in the drawing are all eight bits wide, so there's a lot of wiring to do!).

The logic routes are obvious; the mess of wiring at the bottom left is a shift right unit. Each bit of the input is wired to the next bit to the right of the output, with the carry being used to provide the top bit. The lowest bit of the input will be moved to the carry output. A multiplexor controlled by the mode bits selects the desired logical output.

The only thing left to do then is a mechanism to select the logic or arithmetic output and to route the flags from the appropriate place, and a way to detect a zero condition.

Two more multiplexors are controlled by the high bit of mode to select the appropriate outputs, and a comparator against zero (in practice, an eight-input nor gate) gives a high output if the output is zero.

And that's the ALU complete. At an estimate, this will need four chips for the adder, eight chips for the eight-bit multiplexors and another for the carry multiplexor, two each for the three logic channels, one for the zero detect, and a handful of smaller single or double gate chips. Twenty or so chips. That should fit on our target half-eurocard size without too much trouble.

But can we do it another way? Of course we can... the complete ALU is nothing more than combinational logic with sixteen input bits of data, four bits to control the function (including the compare input), and one bit for carry in. It has ten bits of output: eight bits of output, one carry, and one zero flag. We can put this in a programmable device - a PROM of some kind.

A glance through the catalogue will reveal a startling lack of 2M x 10 bit parts - but one might consider a 4*4 ALU - in that case, each prom requires four bits each for A and B, four bits to select the function, and the carry - and produces four bits of output plus carry and zero. This is a much more manageable 8k x 6 bits, so two of these, with the carry passed through. An extra bit doubles the number of functions you can provide, should you wish (though a *left* shift is tricky). One issue is that PROM access is not as fast as discrete logic, and you need to pass through two PROMs in series. This may be a speed limiting factor at a lower speed than might be wanted.


Here's the logisim circuit for the complete ALU.

There's just one thing I haven't mentioned - the missing instruction SHL - shift left through the carry. It turns out it doesn't need any extra hardware... we simply add the accumulator to itself using the normal ADC logic. It may take an extra clock cycle to get the accumulator into the B input, but I can live with that for a simpler ALU.

No comments:

Post a Comment