Educational feature 2

Hello readers, how's it going? Good. Anyway, do you remember a while ago when Next Generation posted some detailed specs about the functions of the Hitachi SH4 processor, rumoured to be a vital component of the new Sega console? We had a bit of a laugh, because simple UK people that we are, we didn't understand a word. All that technical speak is now translated for your pleasure, thanks to the UK:Resistance technical support department. And some nice tech bloke called Simon.

In answer to your question of 17/09/97, namely "Does anybody understand this?" about the alleged Saturn II specs - well, I understand them, because it was me who pointed out these Hitachi SH4 CPU specs to the guys at Next Generation when I came back from Microprocessor Forum in California this time last year. The Next Gen guys printed the news the next day, although at that time it had not been confirmed that Sega would indeed plump for the Hitachi chip for Saturn II. Much more recently, NextGen have had confirmation of the SH4 winning the Saturn II deal, and so they ran the same article again.
Sorry the tech specs don't make much sense in layman's terms, but they are meant for comparative purposes and wouldn't mean much to someone outside the microprocessor industry. Here's a stab at explaining what it's all about:
1. 360 Dhrystone v1.1 MIPS
Dhrystone is a synthetic benchmark used to rate CPU performance. You run the benchmark, it times itself, and gives the CPU a MIPS rating. The number is meaningless in itself, but can be useful in comparisons. So, for example, the current Saturn has two Hitachi SH2s for its main processors. These manage about 25 Dhrystone v1.1 MIPS each. From this you can deduce that the SH4 CPU is about 7 times faster than the combined power of the SH2s in the current Saturn (360 against 2 x 25 = 50).
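For anyone who wants to check the sums, here is a rough sketch in Python. The figures are the ones quoted above; the function name `speedup` is made up for illustration, and the sketch assumes the two SH2 ratings simply add together, which real software would rarely manage.

```python
def speedup(new_mips, old_mips_each, old_cpu_count):
    """Ratio of one new CPU's rating to the combined rating of the old CPUs.

    Assumption: the old CPUs' ratings add up perfectly, which is optimistic.
    """
    return new_mips / (old_mips_each * old_cpu_count)


# Figures from the text: SH4 at 360 MIPS, two SH2s at about 25 MIPS each.
ratio = speedup(360, 25, 2)
print(f"SH4 is roughly {ratio:.1f} times the combined SH2 rating")
```

Because the SH2 figure is only "about 25", the result is a ballpark number rather than a measurement.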
2. Scalar product in 3 cycles, fully pipelined (single-precision floating point) using just 1 instruction
Performing scalar product calculations is fundamental to geometry processing, so a CPU being designed for graphics needs to be able to do these fast. A scalar product involves multiplying two vectors together element by element and adding up the results, and in the case of 3D graphics, these vectors are 4 elements long. So, the SH4 can do:

(a b c d) . (x y z w) -> a*x + b*y + c*z + d*w

...in 3 cycles, where a, b, c etc are all 32-bit single precision floating point numbers. Basically, this thing will be able to throw lots of triangles around and fast! On most CPUs this same calculation would take at least 8 cycles, maybe more.
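As a rough illustration of the work involved, here is a sketch in Python of a 4-element scalar product in the usual mathematical sense: multiply matching elements, then add the results into a single number. The function name `scalar_product` is made up for this example, and Python floats are double precision rather than the 32-bit single precision the SH4 spec refers to.

```python
def scalar_product(u, v):
    """Scalar (dot) product of two 4-element vectors.

    Written out longhand so the operation count is visible:
    4 multiplies and 3 additions.
    """
    if len(u) != 4 or len(v) != 4:
        raise ValueError("both vectors must have exactly 4 elements")
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]


# (1 2 3 4) . (5 6 7 8) = 5 + 12 + 21 + 32
print(scalar_product((1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)))  # prints 70.0
```

A CPU that issues those seven operations one at a time needs several cycles just to get through them, which is why doing the whole lot with one instruction matters.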
3. Matrix transform in 7 cycles, partially pipelined, single-precision floating point, using 1 instruction. That's 16 multiplies and 12 additions, all single-precision fp, in 1 instruction.
Matrix transform is another important part of geometry processing. It involves multiplying a vector by a matrix, which boils down to:

/ a1 a2 a3 a4 \      |x|       |x'|
| b1 b2 b3 b4 |  \/  |y|  ---  |y'|
| c1 c2 c3 c4 |  /\  |z|  ---  |z'|
\ d1 d2 d3 d4 /      |w|       |w'|

This is the thing that needs 16 multiplies and 12 additions. The SH4 can do it in a total of 7 cycles. That's a *wow* number! The "partially pipelined" means the SH4 can start another transform before the previous one has finished, increasing the throughput. Basically it can do one of these transforms every 4 cycles. Most other machines would struggle to do one every 28 cycles...
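The transform and the cycle counts can be sketched in Python as below. The names `transform`, `pipelined_cycles` and `serial_cycles` are made up for illustration. The cycle model is a simplifying assumption: a new transform starts every 4 cycles, each takes 7 cycles to finish, and the comparison machine manages one of the 28 operations per cycle with nothing overlapping.

```python
def transform(m, v):
    """Multiply a 4x4 matrix (a list of 4 rows) by a 4-element vector.

    Each output element takes 4 multiplies and 3 additions,
    so the whole thing is 16 multiplies and 12 additions.
    """
    return [
        row[0] * v[0] + row[1] * v[1] + row[2] * v[2] + row[3] * v[3]
        for row in m
    ]


def pipelined_cycles(count, latency=7, interval=4):
    """Cycles for `count` back-to-back transforms (assumed simple pipeline model)."""
    if count <= 0:
        return 0
    # The first transform takes the full latency; each extra one adds one interval.
    return latency + interval * (count - 1)


def serial_cycles(count, ops_per_transform=28):
    """Cycles if every multiply and add takes one cycle and nothing overlaps."""
    return ops_per_transform * count


# A matrix that shifts a point by (2, 3, 4), applied to the point (1, 1, 1).
shift = [
    [1, 0, 0, 2],
    [0, 1, 0, 3],
    [0, 0, 1, 4],
    [0, 0, 0, 1],
]
print(transform(shift, [1, 1, 1, 1]))  # prints [3, 4, 5, 1]
print(pipelined_cycles(1000), "cycles pipelined vs", serial_cycles(1000), "serial")
```

Under those assumptions a long run of transforms costs close to 4 cycles each on the SH4 against 28 on the simple machine, which is where the roughly sevenfold gap comes from.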
Hope this is some help, and at least a little bit interesting. The Hitachi SH4 is one *hot* chip, designed with consoles very much in mind.

Hooray! Let's give Simon a big round of applause, not just for understanding that all, but for being bothered to type it all in and send it to us. Cheers mate, now can you tell me why my Mac keeps crashing all the time?