Stepping back a bit. EGM March 1994.
Atari Jaguar and Genesis 32X racer comparison video is up.
Atari Jaguar (S-Video) and Genesis 32X (Composite) racers. Checkered Flag (0:17), Virtua Racing Deluxe (7:00), Power Drive Rally (21:52), Atari Karts (24:24), BC Racers 32X (27:42), Motocross Championship (37:03), Supercross (41:21), Super Burnout (46:10).
Was the 32X as powerful as the Jag? It seems like the only thing the 32X couldn't match or best was sound...on the other hand I swear I read somewhere that somebody at Sega commented on the 32X after it was gone and said that programmers were already pushing it to the limit and there really wasn't much more they could do with it, that it wasn't the powerhouse Sega claimed it was and the 32X wouldn't be able to really do anything more than it already was, any idea if that's true?
Did the Jag have any true potential or was it squandered by sloppy development?
From what I can tell, the Jaguar was squandered by underdeveloped, low-budget games and poor development kits. Checkered Flag not being as good as Virtua Racing Deluxe should simply be a matter of development talent, time and money spent. The Jaguar should technically be able to best the 32X if all of the bus contention between its various chips were handled. Everything I have found about both systems' PR specs for 3D shows the 32X ahead of the Jaguar for both flat and textured polygons, though. The 32X has one clear problem, the extremely limited RAM, which was less of a problem with the 32X CD and no problem for carts with RAM built in going into 1996 and 1997.
The Genesis 32X and Jaguar without the CD were never going to beat a CD based 3D system, but they had a lot of raw processing capability to succeed sales wise before the next generation really took off in late 1997.
I wonder if games like Metal Slug would have been possible on the 32X, and I mean as close as possible to the actual game play with the sound and animation.
Metal Slug would have needed enough ROM to handle the animation and that's about it I think. I can't think of any reason why a 4 MegaByte (32-mbit) game couldn't come pretty close to the Playstation and Saturn games, much less a 5 Megabyte ROM. The need for more RAM should only be necessary for textured (and lit?) polygonal 3D games and height mapped games. I could be thinking incorrectly about how the 32X displays everything through the frame buffers though.
Responding to one, adding to the other...
The Jaguar was more powerful, but harder to work with. First, due to Atari pushing the Jaguar out early, the chipset had bugs that prevented programmers from putting JRISC code in main ram - all GPU and DSP code had to be run from their respective local ram. A couple Jaguar developers have since learned how to work around those bugs so that GPU and DSP code can be put in main ram. This would have been a big boost for games back when the Jaguar was relevant. So the first problem was nearly all game code had to be run by the 68000, which limited the speed; worse, the more the 68000 ran, the more bandwidth is stolen from the other chips... which is the second issue.
After the chipset bugs, the second biggest problem with the Jaguar was bandwidth - the Jaguar has a high bandwidth to main ram, but there's just one bus. The GPU and DSP have to run in their local ram to stay off the bus; if they access the rom or the main ram, they have to wait for a free cycle to main ram. The 68000 is slow (relatively speaking), so when it's running, it hogs the main ram/rom bus heavily. When you wish the GPU/DSP/BLITTER to run fast, you need to STOP the 68000 (an instruction that halts the 68000 until an int occurs) so it doesn't take bus cycles.
Third biggest problem - video is in the main ram. There is no separate buffer for video, so you have to draw to the main ram for your display, and then you have to fetch the data from the main ram to actually display it (that's what the Object Processor does - fetch areas of memory for display on the screen). Actually, this is only USUALLY true - the OP can fetch video data from anywhere - the rom, the ram, the GPU or DSP local ram... it's just that the rom is slower than ram, and local ram is being used by the GPU/DSP, so 99% of the time, video is in main ram. The time spent fetching video data takes away from the bandwidth available for everything else.
Despite all that, the Jaguar is a real powerhouse for the time - two MBytes of 64 bit wide ram, dual 32-bit RISC CPUs AND a hardware BLITTER, as well as a 13 MHz 68000. The rom could be 32, 16, or 8 bits wide (32 was standard). If they had known how to run GPU and DSP code from main ram, and if they had a decent compiler, the Jaguar would have done far better.
The 32X has things it did very well - it's an excellent design for all that it's an add-on. The 68000 running code in the Genesis work ram doesn't take cycles from the SH2 processors in the 32X. The SH2s have 4KB caches that help keep them off the bus - they are like the local ram in the Jaguar GPU/DSP. The SH2 were rather robust and bug-free, and had decent compiler support. You could run SH2 code from rom, sdram, or the internal cache.
The video is kept in separate dual buffers; while one buffer is written to, the other is fetched for display. Neither interferes with the other. The 32X is much easier for developers to avoid bus contention.
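For anyone who hasn't worked with this kind of setup, the dual-buffer idea above can be sketched roughly like this. It is only an illustration of the general technique: the `DoubleBuffer` class and its method names are made up for this post and are not the 32X's actual register interface.

```python
class DoubleBuffer:
    """Toy model of two frame buffers: one drawn to, one shown on screen.

    Hypothetical illustration only; not real 32X hardware behaviour.
    """

    def __init__(self, size):
        # Two independent blocks of memory, both starting out cleared.
        self.buffers = [bytearray(size), bytearray(size)]
        self.draw_index = 0  # which buffer software is currently writing to

    @property
    def draw(self):
        """Buffer the CPU writes the next frame into."""
        return self.buffers[self.draw_index]

    @property
    def display(self):
        """Buffer the video hardware is currently scanning out."""
        return self.buffers[1 - self.draw_index]

    def swap(self):
        """Flip roles, normally done during vertical blank."""
        self.draw_index = 1 - self.draw_index


fb = DoubleBuffer(4)
fb.draw[0] = 7          # drawing does not disturb what is on screen
before_swap = fb.display[0]
fb.swap()
after_swap = fb.display[0]
```

Until `swap()` is called, writes to the draw buffer never touch the displayed one, which is why drawing and display fetches don't fight each other.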
However, the 32X has its own flaws - mostly due to cost-cutting measures to keep the price as low as possible (since it's an add-on). The primary flaw is no hardware video acceleration. The only thing the 32X video chip can do is display video, and fill a line with a solid color. EVERYTHING has to be done in software on the 32X. Given the superior nature of the buses and the processors used, it's not quite as bad as you would expect. The SH2 is an excellent processor for doing things like that.
Second biggest problem - lack of ram. The 32X was mostly designed to run from cart. It seems clear that SEGA felt there was still life in carts... and so did Nintendo. Had the 32X lived on, we'd have probably seen 16, 24, or even 32 MByte carts on the 32X. A full version of Doom would have been possible in a 12 to 16 MByte cart.
Third biggest problem - the buses on the 32X are only 16 bits wide. This is the main source of the difference in speed between the 32X and the Jaguar; while the Jaguar used 64-bit paths wherever it could, it was all 16-bit on the 32X... 16 bit bus to the cart, 16 bit bus to the video ram, 16 bit bus to the sdram. The only part that was full width/speed was the cache ram, and that's only because it was INSIDE the SH2!
Conclusion: the Jaguar was much more powerful, but bugs and limited programmer tools prevented devs from taking advantage of the power. The 32X was easier to work with, but was only supported for about 6 months, with a few games straggling out over a year. There simply wasn't time to see the improvement that always occurs in the second generation of software for consoles as devs get used to working on the console.
I missed this the last dozen times I read this. Why would 12 to 16 MegaByte carts be required when Doom 95 is only 4 MegaBytes? I didn't realize that Doom 95 was only the first episode of Doom 1. Come on people, the Internet was invented for conflict, tell me when I'm wrong! I expect nothing less, especially in a "me too" thread such as this one turned out to be.
Anyway, it looks like both Doom 1 and Doom 2 can be compressed to around 40 Megabytes total, with the Doom 1 WAD being 11 Megabytes on its own. So, now I understand why all of the console ports are based on the Jaguar version or worse (SNES).
Only 4KB of space per SH-2 for 32-bit processes? Sounds like the 32X is just a 16-bit add-on to me.
Why don't comments like this ever get put in people's signatures? Oh yeah, see above comment on why the Internet was invented.
Yeah, the first episode by itself (the shareware version) is fairly small.
Doom 1 + Doom 2 ~ 24MBytes, UNCOMPRESSED. That link is to a SkullTag setup of D1 & 2 that includes lots of skulltag data packs. Not to mention that the QT4 DLL included is 8MB all by itself.
I'd have loved to have seen a Doom 1 + Doom 2 full game for 32X on a 24MB cart. Or Final Doom (TnT + Plutonia) on a 32MB cart.
Only 4KB of INTERNAL CACHE! That was HUGE for the time. Remember that the 68030 only had 512 BYTES of cache (split between code and data)! The 486 only had 8KB of cache. The SH2 was designed to be competitive with the 486 in areas where memory protection or user states weren't needed (embedded appliances or consoles). The SH2 is faster than the 486 at the same clock rate.
Also remember that while the SH2 in the 32X was restricted to a 16-bit bus, it at least did burst reads from SDRAM, allowing it to fetch 8 words in 12 clock cycles.
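To put a rough number on that burst figure, here is a back-of-envelope calculation. The 8 words in 12 cycles comes from the post above; the 23 MHz clock is an assumption (the commonly quoted SH2 speed in the 32X), and the result is a best-case figure that ignores refresh and contention from other bus masters.

```python
def burst_bytes_per_cycle(words=8, word_bits=16, cycles=12):
    """Average bytes moved per clock during one burst read."""
    return words * word_bits / 8 / cycles


def burst_mb_per_sec(clock_hz, words=8, word_bits=16, cycles=12):
    """Best-case burst throughput in MB/s (1 MB = 1,000,000 bytes)."""
    return clock_hz * burst_bytes_per_cycle(words, word_bits, cycles) / 1e6


ASSUMED_SH2_CLOCK_HZ = 23_000_000  # assumption, not from the thread

per_cycle = burst_bytes_per_cycle()
peak = burst_mb_per_sec(ASSUMED_SH2_CLOCK_HZ)
print(f"{per_cycle:.2f} bytes/cycle, about {peak:.1f} MB/s peak")
```

So even over a 16-bit bus, a burst averages a bit over one byte per clock, which is a big part of why the cache line fills are tolerable.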
Chilly, I am constantly awed by the depths of your hardware knowledge.
I wish I could help with some of the grunt work on making that happen. I am good with code and data entry so long as there is a clear process and a way to check my work.
Oh, okay, I wouldn't have figured that 4KB could do much, but I still have a hard time thinking in cycles and scanlines. I don't think I've seen a direct comparison to the 486 before, that is very interesting. Man, it will be great to see what you can come up with. Between your work on the 32X and now the Saturn, I am getting an itch to dig through what documentation has been unearthed to see what I can find out too.
Well, the place to start is with existing examples... make sure you can build them yourself, then try doing some modifications. For example, take my port of xrick, and then try changing the controller arrangement.
Well, my work on the 32X goes beyond most docs and examples... Sega never got around to DMA sound, for example. The Saturn is much more documented and used by Sega's examples.
A very late response here, but I hope this is still relevant. (or at least for others browsing this for info)
There's a little more to it than this on the Jaguar:
As far as the J-RISCs go, only one of them (the one in the TOM chip, designated as the CPU or GPU depending on where you look) actually has a workaround to work in main RAM. Even so, neither has a hardware-managed cache, but only scratchpad, so programming is still more painful to optimize for than the likes of the SH2s allow. (among various other architectural trade-offs)
The RISC core in JERRY is far more limited than TOM's in several areas though, effectively making TOM (mostly) the only useful general purpose RISC core in the system. JERRY has a buggy memory interface that has a massive delay in accessing main memory; to quote Kskunk on this:
http://www.atariage.com/forums/topic...0#entry1771522
There's three big bus performance problems in Jerry, but only one is caused by the 68K. Jerry has a 32-bit bus, but with a 68K installed, Jerry must run in 16-bit mode. (This is because Tom sees all bus masters as the same -- so both the 68K and Jerry must use the same bus width.)
The next problem is that Jerry's memory pipeline is hard-coded to delay 6 cycles. So with a 68K, Jerry takes 24 cycles to read a 64-bit word. With a 68020, Jerry would use 12 cycles to read a 64-bit word, but that is still much worse than Tom which can read a 64-bit word in 2 cycles.
The final problem is that Jerry's memory pipeline has buggy writes, so writing a 16-bit word takes not 6 cycles, but 12.
They may have intended Jerry to work primarily as an audio synthesizer (i.e., all computation, little memory access). It's pretty hard to use it for much else due to its slow memory interface.
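The cycle counts in that quote follow from simple arithmetic: split the 64-bit word into bus-width pieces and pay the fixed 6-cycle delay for each piece. A small sketch, using only the numbers quoted above (the function name is just for illustration):

```python
def cycles_per_read(total_bits, bus_bits, cycles_per_access):
    """Cycles to read total_bits over a bus of bus_bits width,
    paying a fixed cost for every individual bus access."""
    accesses = -(-total_bits // bus_bits)  # ceiling division
    return accesses * cycles_per_access


JERRY_DELAY = 6  # hard-coded pipeline delay per access, per the quote

jerry_with_68k = cycles_per_read(64, 16, JERRY_DELAY)    # 16-bit mode
jerry_with_68020 = cycles_per_read(64, 32, JERRY_DELAY)  # 32-bit mode
tom = 2                                                  # per the quote

print(jerry_with_68k, jerry_with_68020, jerry_with_68k // tom)
```

With a 68K installed that works out to 24 cycles against Tom's 2, so Jerry is 12 times slower per 64-bit read, and the write bug (12 cycles per 16-bit write instead of 6) makes stores worse still.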
In addition to this, it should also be noted that there's some key graphical areas where the Jaguar doesn't feature any buffering to make use of 64-bit wide or page-mode bandwidth needed to optimize main RAM bandwidth.
For 2D and 3D texture mapping effects (affine mapping type stuff, including scaling and rotation effects), the blitter has only limited hardware support, making it able to render only one pixel at a time (regardless of bit depth), so a common 16-bit color depth scene could only be rendered 1 pixel per read/write (for a total of 11 cycles per texel rendered, reading texels from main RAM and then writing to the framebuffer). That's still a lot better off than the 32X, which is limited to software texture mapping, and simple scaled objects (no rotation effects) can be done as sprites in the object processor at full bandwidth.
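As a rough feel for what 11 cycles per texel means, here is a quick estimate. The 11-cycle figure is from the paragraph above; the 26.59 MHz clock and the 320x240 screen size are assumptions for illustration, and the result is an upper bound that pretends the blitter never waits on the 68000, the object processor, or anything else on the bus.

```python
CYCLES_PER_TEXEL = 11           # from the post above
ASSUMED_CLOCK_HZ = 26_590_000   # assumption: commonly quoted Jaguar clock
ASSUMED_SCREEN = (320, 240)     # assumption: a typical display resolution


def texels_per_second(clock_hz, cycles_per_texel=CYCLES_PER_TEXEL):
    """Peak textured pixels per second if the blitter owns the bus."""
    return clock_hz / cycles_per_texel


def full_screen_fps(clock_hz, width, height):
    """Frames per second if every pixel is texture mapped exactly once."""
    return texels_per_second(clock_hz) / (width * height)


rate = texels_per_second(ASSUMED_CLOCK_HZ)
fps = full_screen_fps(ASSUMED_CLOCK_HZ, *ASSUMED_SCREEN)
print(f"about {rate / 1e6:.2f} Mtexels/s, {fps:.1f} full screens per second")
```

Under those assumptions the ceiling is only around 30 full screens of textured pixels per second before any overdraw or game logic, which fits with how few fully textured Jaguar games ran smoothly.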
The blitter is extremely fast at filling smooth shaded polygons though (significantly higher peak fillrate than the PS1 GPU) and pretty much equal to solid filling polygons (flat shaded), which is one of the big areas the designers emphasized for the system's 3D capabilities. It's still line by line drawing though, with the RISC "GPU" assisting rasterization.