Thread: Comparison of 5th generation ("32/64-bit") game console hardware

  1. #1
    Hero of Algol kool kitty89's Avatar
    Join Date
    Mar 2009
    Location
    San Jose, CA
    Age
    23
    Posts
    9,127
    Rep Power
    49

    Comparison of 5th generation ("32/64-bit") game console hardware

    Not quite sure if this should be in Blast processing (given the tech-head-ish nature of the discussion), or insert coin (as it has a lot more than Sega consoles involved), but anyway:

    This is sort of a catch-all thread for technical comparisons of the various 5th generation consoles (and perhaps computer hardware) to give a place for various off-topic discussions that arise in various threads. (and in part due to 16-bitter's Saturn tech thread getting merged into the mega-thread a while ago -and the "best chipset" thread disappearing prior to that)

    So to start, there's this topic (or mix of topics) stemming from the Saturn vs System 32 thread:
    http://www.sega-16.com/forum/showthr...aturn%29/page2


    In addition to what was already brought up there, I've been wondering about the 3DO's texture mapping ability too. (especially whether it's limited to single pixel reads/writes like the Sega CD/Jaguar/Saturn, or if it at least has a 32-bit destination buffer . . . that, and whether the 3DO could texture map directly from VRAM without hitting main RAM -it must support it on a hardware level, like the Jaguar, but I'm not sure if the libraries did . . . it also would obviously have been significantly slower than pulling from main RAM, but there's CPU contention to consider -at some point, having 100% CPU bandwidth and 100% slow texture mapping bandwidth would be preferable)

    That, and whether the Saturn does any added buffering to allow faster texture rendering in 8bpp mode (like a 16-bit destination buffer).
    6 days older than SEGA Genesis
    -------------
    Quote Originally Posted by evilevoix View Post
    Dude it’s the bios that marries the 16 bit and the 8 bit that makes it 24 bit. If SNK released their double speed bios revision SNK would have had the world’s first 48 bit machine, IDK how you keep ignoring this.

  2. #2
    Wildside Expert
    Join Date
    May 2011
    Posts
    144
    Rep Power
    4

    Hi Kitty, my reply here

    Quote Originally Posted by kool kitty89 View Post
    It does exceed the sprite capabilities, right? (or have I been misled on the object processor's capabilities?)
    So it compares more favorably against the PSX (which doesn't have the tilemap advantage) . . . or for Saturn games pretending to be the PSX (or Neo Geo) and only using "sprites."
    If you take the entire Jaguar's best case, and compare that against the Saturn VDP1 only, then it appears comparable, but it's not quite as simple in practice, and the hard limits on sprite overdraw are based on max pixels per horizontal line, not total pixels per frame.
    Once any backgrounds are involved there's no comparison at all.

    Also when maxing out the Jaguar this way you end up starving the CPU of any bandwidth - so all processing ends up having to run in vertical blank. And with a complex sprite system the overhead of maintaining the object list can become excessive.



    Quote Originally Posted by kool kitty89 View Post
    I meant blitter-like, in the same sense that the Saturn's VDP1 (or 3DO's cel engine -which Kskunk recently described as a "Suzie-like blitter" ) is a blitter-type set-up and Sega CD's graphics system is a blitter. (calling it a GPU also doesn't make sense in the modern context . . . though the Jaguar's GPU is a lot closer to that -the only mass-market "GPU" in that context was probably the TMS340 series, everything else was an advanced VDC or blitter-like graphics system )
    After all, the Jaguar's "blitter" was also in a similar range, much more so with the Jaguar II. (given it did pretty much everything the PSX's "GPU" did and more)
    Not really, even the Jaguar II still was a 'blitter' - the PSX GPU was a full geometric drawing engine for 2D triangles and lines , with all setup handled internally. ( Plus the name of the chip was GPU )

  3. #3
    Mastering your Systems Hero of Algol TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    23
    Posts
    9,071
    Rep Power
    68

    So, unless you call a ~6-7% hit (for a 320x224/240x16bpp 60 Hz) to bus time "massive overhead", that's not a problem. (how do you think the jaguar manages to scan the framebuffer from shared DRAM . . . actually it can scan multiple framebuffers and composite them via the object processor -treating them as separate "sprite" objects, but for a simple single frame buffer scanning for a 320x224x16-bit 60 Hz display, it would be about 8% of the bus time)

    Efficient, heavily buffered, serial bus sharing is the way to maximize bandwidth on a shared FPM DRAM bus. (the PSX had a MUCH easier time of it than the jaguar given the multiple buses and inclusion of caches in addition to line buffers -the 1996 Jaguar II added more extensive buffering for many more operations as well as actual caching -the N64 undoubtedly has extremely heavy caching and buffering to make use of its narrow, high-latency, VERY high page-change penalty, but very high peak bandwidth bus -the PS2 does that too with 16-bit RDRAM with massive penalties if you use the system "wrong")

    Pretty much all graphics hardware since then has made use of single buses, though some used special RAM to assist things too (VRAM mainly just cuts out the framebuffer scanning overhead, so not that significant in the long run, multi-bank RAM was rather interesting, SGRAM's dual page holding ability was cheaper than VRAM and considerably more useful -better than multi-bank RAM in some respects, but in the end, most opted for plain FPM/EDO/SDR/DDR/etc with increasingly wide buses and increasingly heavy buffering)
    The hit must be way higher... for example DRAM memory can do 16 reads OR writes in 1 time unit, but it can only do 3...4 reads AND writes in that same time unit. That is a massive loss. Now if the reads and writes happen in the same DRAM row then you get no performance hit.

    Also massive caching will help, BUT you are forgetting the caches have to be written back to the memory at some point and while you write them you cannot do anything else, which is why I said it only softens the blow, and does not eliminate it.

    PSX can have only 2x higher peak performance than Saturn, due to 32bit VRAM bus or 16+16, if there were 2x 32bit banks the bandwidth must be much higher than 133MB/sec so it is probably 16+16 setup, which makes more sense given what chips are used on the mobo and what later models look like. Some early mobos use 4x 8bit 60ns DRAM chips. 60ns is access time within a row, and for 8bit memory it only gives 16.7MB/sec bandwidth per chip, you got 4 of them so it's around 67MB/sec. Access that is done on different pages will result in much poorer performance, about 40% less, random access is 104ns.
    That 133MB/sec figure is only plausible when some really small chunk of memory is being manipulated so it all fits into the internal caches. There is no way this is achieved in any other way.
    I only read that there's 2KB of cache, and that isn't a whole lot of memory

    I am designing a computer right now, and while its maximum bus bandwidth is 67MB/sec, you will only reach it in ideal condition when only one single device is on the bus and no other is. Now if I add CPU to the mix, some DMAs happening every once in a while I will end up with a figure nowhere near that 67MB/sec, I am not yet sure how big the hit is but it's minimally 25%. I also use 60ns EDO DRAM in the system and within DRAM you will not achieve that 67MB/sec ever, there's refresh and other aspects that get in the way. Refresh messes up all your timings, a nice chunk of performance is lost to refreshes, though in video applications you can just interleave your data so that the chips never need refresh cycles, but that assumes VRAM access that is all over DRAMs so you can omit refresh circuitry from the DRAM controller and live with slight performance penalty if the accesses aren't properly thought out (random access is SLOW)
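
    As a rough check on the chip arithmetic above, here is a minimal sketch that converts a DRAM cycle time and bus width into peak bandwidth. The function name and the decimal definition of MB (10**6 bytes) are assumptions made for illustration; the 60 ns, 104 ns, 8-bit and four-chip figures are the ones quoted in the post.

```python
def bandwidth_mb_per_s(cycle_ns: float, bus_bits: int) -> float:
    """Peak bandwidth in MB/s (1 MB = 10**6 bytes), one transfer per cycle_ns."""
    bytes_per_access = bus_bits / 8
    accesses_per_s = 1e9 / cycle_ns
    return bytes_per_access * accesses_per_s / 1e6


# One 8-bit chip with a 60 ns in-page access time.
per_chip = bandwidth_mb_per_s(60, 8)
# Four 8-bit chips side by side behave like one 32-bit bus.
four_chips = bandwidth_mb_per_s(60, 32)
# Worst case: every access pays the 104 ns random-access cycle.
random_access = bandwidth_mb_per_s(104, 32)

print(f"per chip:      {per_chip:.1f} MB/s")
print(f"four chips:    {four_chips:.1f} MB/s")
print(f"random access: {random_access:.1f} MB/s "
      f"({1 - random_access / four_chips:.0%} lower)")
```

    This only models the raw cycle time; refresh and contention from other bus masters would pull the real figure lower still.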
    Death To MP3, :3
    Mida sa loed ? Nagunii aru ei saa "Gnirts test is a shit" New and growing website of total jawusumness !

  4. #4
    Road Rasher
    Join Date
    Apr 2011
    Posts
    471
    Rep Power
    17

    Awesome

    This is my dream thread, and I must admit these tech talks were kind of the reason I was drawn into joining the forums.

    Is it a worthy rep? Personally I think so!

    Anyway, keep on rolling...

  5. #5
    Wildside Expert
    Join Date
    May 2011
    Posts
    144
    Rep Power
    4

    Quote Originally Posted by TmEE View Post
    Also massive caching will help, BUT you are forgetting the caches have to be written back to the memory at some point and while you write them you cannot do anything else, which is why I said it only softens the blow, and does not eliminate it.
    For the GPU the cache is read only ( texture cache ) - and the drawing order is intended to maximise fast writes

    Quote Originally Posted by TmEE View Post
    PSX can have only 2x higher peak performance than Saturn, due to 32bit VRAM bus or 16+16, if there were 2x 32bit banks the bandwidth must be much higher than 133MB/sec so it is probably 16+16 setup, which makes more sense given what chips are used on the mobo and what later models look like.
    sounds roughly correct - don't forget Saturn VDP1 also needs bandwidth to read drawing commands as well as textures, and loses bandwidth when drawing commands are copied from the SH2 side. PSX drawing commands come from host memory, there's no VRAM side.

  6. #6
    Hero of Algol kool kitty89's Avatar
    Join Date
    Mar 2009
    Location
    San Jose, CA
    Age
    23
    Posts
    9,127
    Rep Power
    49

    Quote Originally Posted by Crazyace View Post
    If you take the entire Jaguar's best case, and compare that against the Saturn VDP1 only, then it appears comparable, but it's not quite as simple in practice, and the hard limits on sprite overdraw are based on max pixels per horizontal line, not total pixels per frame.
    Once any backgrounds are involved there's no comparison at all.
    Hmm, and any blitter rendered stuff would STILL add to that overdraw limitation as the framebuffer is scanned out as part of the object list, right? (which does allow neat things like fast full-screen scaling -and, of course, use of the OPL's line buffers for scanning the frame buffer at max bandwidth)

    So the Jaguar is good for what it manages at low cost (and for being an older -though extremely cutting-edge- design), but not so much in raw performance. (then again, take almost the same hardware, but remove the super tight cost constraints -like being fixed to a unified bus, cheap CPU without cache, or -especially- relatively slow FPM DRAM, and things would change a lot -like with 35 ns EDO DRAM and something in line with an SH2 in place of the 68k, even without adding a 2nd bus -especially with the 2nd bank enabled for minimized page breaks . . . or no added CPU at all, but enough R&D time/resource to debug the GPU and DSP and add a cache in at least one of the 2 )

    Flare did exceptionally well with low-cost limits on design and Atari's funding limits (and the very small R&D staff involved). Not to knock Sega's R&D though, they probably could have managed better (albeit spending a lot more on R&D to do so), but the Saturn seems to have gone in a rather odd direction as far as an efficient/unified design and one keeping costs to practical levels for the time. (not just the use of SDRAM, multiple buses for less-buffered graphics chips, or such, but things like the oddly overbuilt CD-ROM interface -SH1 with 512k of SDRAM dedicated to it where a cheap MCU and a small chunk of DRAM would probably be fine, hell, they could have built off of the MCD's 68000 based interface and chipset for that matter -tweaking things to support 2x speed mode and probably boosting the cache to at least 32k)

    If the Saturn had been backwards compatible with the MD+MCD, it would have made a lot more sense to have some odd inefficiencies . . . except they probably could have had a comparably powerful (as far as consumers could see) and more cost effective machine and still made it backwards compatible.

    Albeit there's also the rumor that went around the industry (and in the press) of the Saturn being significantly built-up late in design (usually claimed to be a result of the PSX threat, but it may have been more due to the 3DO/Jaguar and perhaps newer PC games -Sega of Japan was apparently taking the Jaguar's 1993 hype as a serious threat, partially spurring the 32x as well -and preceding "Super Megadrive" project; of course, they also had a good idea of what Nintendo was working with when the SGI partnership was announced -since Sega's own engineers had already scrutinized and critiqued that prototype chipset previously).
    Though if they really DID do such a redesign, it was certainly rather sloppy, or at very least not carefully cost controlled. (the earlier design was supposedly primarily built around VDP2 with a single SH1 as the CPU -presumably dual-purpose as CD-ROM controller as in the MCD- with the same sound system, less main RAM, either software rendered "sprites" to VDP2 or -more likely- a primitive incarnation of VDP1 lacking the 3D drawing modes -which could also explain why the warped quad modes have broken "half transparency" effects)

    The Saturn's VDP1 seems rather similar to the 3DO's Cel engine in a number of ways, though the Saturn obviously avoids the contention issue by adding a multi-bus video subsystem and uses RAM clocked as fast as the processor too. (but the 3DO has a 32-bit bus to work with, so potentially close to the same bandwidth -not sure how many operations actually supported 32-bit accesses though)
    If the 3DO's Cel does actually allow most things in 32-bits (fill, gouraud shading, non-rotated 2D blits) that would get it a lot closer to VDP1 performance on its much slower RAM, especially if it had a 32-bit write buffer for texture mapping. (I doubt it has a 32-bit read buffer though)
    If the Cel doesn't do much of anything at 32-bits, the only reason to even have 32-bit VRAM would be for framebuffer scanning bandwidth. (ie for the rather unnecessary/overkill high-res/high-depth modes) A 32-bit write buffer would also make it faster than the Jaguar's peak texture mapping rate even if using VRAM as source and destination.


    Also when maxing out the Jaguar this way you end up starving the CPU of any bandwidth - so all processing ends up having to run in vertical blank. And with a complex sprite system the overhead of maintaining the object list can become excessive.
    Isn't it preferable to use the GPU to build the object lists? (especially in such intensive cases . . . and where the GPU has no overhead for 3D or such)


    Not really, even the Jaguar II still was a 'blitter' - the PSX GPU was a full geometric drawing engine for 2D triangles and lines , with all setup handled internally. ( Plus the name of the chip was GPU )
    I thought the Jaguar II allowed triangle rasterization in a single blitter operation (not set up a line at a time like the Jaguar), with just the vertex information needed. (the Saturn's VDP1, 3DO's Cel, and PSX GPU all need separate processors to generate the vertex data too -the GTE is obviously not part of the GPU)

    Couldn't the jaguar blitter also freely draw vector lines as well? (albeit not all that useful for full rasterized 3D . . . and something the old Flare 1 video controller also supported -actually, that had the neat "color hold" mode too, allowing filled polygons to be formed from plain outlines -though you'd still need to handle hidden line removal, and you could only render filled polygons in that mode, no textures/bitmaps/sprites and no dithering -it definitely saves a lot of blitter bandwidth for pure polygonal games -the 32x's RLE mode might have allowed that to some extent too, but probably with significantly more overhead for formatting things to RLE correctly -and the flexibility of using textures/bitmap stuff as well)


    On another note, was I mistaken when I spoke of MARIA's line RAM being on-chip? (I'd gotten the impression that MARIA loads a scanline into on-chip RAM via DMA, and double buffers it to allow nearly free access of the main bus during active display to DMA the next line while the other line is being clocked out to the screen -of course, you'll end up with the CPU only working in vblank if you use all the bandwidth)
    I'm not sure if GTIA/ANTIC has anything remotely close to that (if it does, it's certainly not double buffered -which would mean you could only fill it in hblank), but I'd definitely gotten the impression that MARIA did it that way.

    The 7800 obviously isn't a 5th gen machine, but I was just using that as an analogy to the later line buffer systems used. (and how early such set-ups were being implemented in LSI) Albeit, applying that system to make efficient use of FPM accesses came much later.







    Quote Originally Posted by TmEE View Post
    The hit must be way higher... for example DRAM memory can do 16 reads OR writes in 1 time unit, but it can only do 3...4 reads AND writes in that same time unit. That is a massive loss. Now if the reads and writes happen in the same DRAM row then you get no performance hit.
    Yes, there's a good bit of contention in general, but as far as the framebuffer is concerned, it shouldn't be a major hit. (I'm not positive on the GPU's low-level functionality, but it should be stealing the bus for long segments to buffer a scanline at a time -or close to it- and staying off the bus the rest of the time)
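
    To put numbers on the framebuffer scan-out share discussed here, the sketch below works out the display's read rate and divides it by a peak bus figure. The helper names are hypothetical. The 133 MB/s peak is the PSX figure quoted in this thread; the Jaguar peak assumes a 64-bit bus with the 75 ns FPM access time mentioned elsewhere in the thread, which is an assumption layered on top of the posts rather than something they state directly.

```python
def scanout_mb_per_s(width: int, height: int, bytes_per_pixel: int,
                     refresh_hz: int) -> float:
    """Bytes the display must read per second, in MB/s (1 MB = 10**6 bytes)."""
    return width * height * bytes_per_pixel * refresh_hz / 1e6


def bus_share(scanout_mb: float, bus_peak_mb: float) -> float:
    """Fraction of peak bus bandwidth consumed by scan-out alone."""
    return scanout_mb / bus_peak_mb


scan_224 = scanout_mb_per_s(320, 224, 2, 60)   # 16bpp, 60 Hz
scan_240 = scanout_mb_per_s(320, 240, 2, 60)

psx_peak = 133.0                    # MB/s, figure quoted in the thread
jag_peak = 8 * 1e9 / 75 / 1e6       # assumed: 8 bytes every 75 ns

print(f"320x224 scan-out: {scan_224:.2f} MB/s")
print(f"share of 133 MB/s bus:  {bus_share(scan_224, psx_peak):.1%} "
      f"to {bus_share(scan_240, psx_peak):.1%}")
print(f"share of Jaguar bus:    {bus_share(scan_224, jag_peak):.1%}")
```

    These are best-case shares: they assume the scan-out reads run entirely in page mode, which is the point of buffering a scanline at a time.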

    Also massive caching will help, BUT you are forgetting the caches have to be written back to the memory at some point and while you write them you cannot do anything else, which is why I said it only softens the blow, and does not eliminate it.
    Yes, always an issue with efficient bus sharing of DRAM based systems with page mode support (old, slow DRAM systems have similar trade-offs as SRAM since all accesses are slow anyway).
    And again, it's not just caching, but additional buffering on top of that. (line buffers or "phrase buffers" especially important for un-cached operations -again, not sure on the specifics of the PSX GPU, but one direct comparison given to the jaguar was use of 64-bit source and 256-bit destination buffers to reach PSX level performance on the Jaguar's bus -working in cache would allow even higher bandwidth for cached textures, but there's the cache fill overhead to consider too)

    PSX can have only 2x higher peak performance than Saturn, due to 32bit VRAM bus or 16+16, if there were 2x 32bit banks the bandwidth must be much higher than 133MB/sec so it is probably 16+16 setup, which makes more sense given what chips are used on the mobo and what later models look like. Some early mobos use 4x 8bit 60ns DRAM chips. 60ns is access time within a row, and for 8bit memory it only gives 16.7MB/sec bandwidth per chip, you got 4 of them so it's around 67MB/sec. Access that is done on different pages will result in much poorer performance, about 40% less, random access is 104ns.
    60 ns seems way too slow, it should be 30 ns at least (for page mode accesses), it's too slow to allow 133 MB/s on a 32-bit bus. (then again, DRAM naming is a bit weird -wiki's example of "60 ns" referring to RAS low to valid data out, and that same RAM being capable of 25 ns FPM cycle times -so definitely compatible with the PSX . . . the Jaguar has 75 ns FPM accesses and 175 ns -actually longer due to 5 26.6 MHz cycles being the closest you can go- . . . so I guess that wouldn't actually be "75 ns DRAM" by convention -which would also imply that FPM cycle times of the MCD/SVP/32x's "80 ns" DRAM could be much faster than 80 ns)
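
    The timing arithmetic in this reply can be sketched as follows: how short a cycle a 32-bit bus needs to reach 133 MB/s, and how a DRAM timing in nanoseconds gets rounded up to whole clock cycles. The function names are made up for illustration; the 26.6 MHz clock and the 75 ns and 175 ns timings are the Jaguar figures quoted above.

```python
import math


def required_cycle_ns(target_mb_per_s: float, bus_bits: int) -> float:
    """Longest cycle time that still delivers target_mb_per_s on this bus."""
    bytes_per_access = bus_bits / 8
    return bytes_per_access / (target_mb_per_s * 1e6) * 1e9


def cycles_needed(access_ns: float, clock_mhz: float) -> int:
    """Whole clock cycles needed to cover access_ns (rounded up)."""
    return math.ceil(access_ns * clock_mhz / 1000)


def actual_ns(cycles: int, clock_mhz: float) -> float:
    """Real duration of that many clock cycles."""
    return cycles * 1000 / clock_mhz


print(f"32-bit bus at 133 MB/s needs <= {required_cycle_ns(133, 32):.1f} ns per access")

for timing_ns in (75, 175):
    n = cycles_needed(timing_ns, 26.6)
    print(f"{timing_ns} ns at 26.6 MHz -> {n} cycles = {actual_ns(n, 26.6):.1f} ns")
```

    The second part shows why the 175 ns figure ends up longer in practice: the controller can only wait a whole number of clocks.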

    The PSX's DRAM should be operating at 1 access per 33.8 MHz clock (when in page mode) and (given the ~104 ns figure) adding 3 wait states for full random read/writes. (so ~33 MB/s rather than 133 -and minimizing those page breaks is essential to good performance, which is a huge part of that caching/buffering, maintaining page mode accesses as much as possible -most efficient for moving large chunks of data around, worst for rendering a lot of small objects where you'd have nearly constant page breaks even for plain solid color fills or cached textures)
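
    A minimal sketch of why page breaks matter so much, under the assumptions stated above: one access per 33.8 MHz clock in page mode, 3 added wait states (4 clocks total) for a random access, and a 32-bit bus. The function and its parameters are illustrative, and the hit rates in the loop are arbitrary sample values, not measurements.

```python
def effective_mb_per_s(clock_mhz: float, bus_bits: int, page_hit_rate: float,
                       page_cycles: int = 1, random_cycles: int = 4) -> float:
    """Average bandwidth in MB/s for a given fraction of in-page accesses."""
    avg_cycles = (page_hit_rate * page_cycles
                  + (1.0 - page_hit_rate) * random_cycles)
    bytes_per_access = bus_bits / 8
    # clock_mhz is millions of clocks per second, so this yields MB/s directly.
    return clock_mhz * bytes_per_access / avg_cycles


for hit_rate in (1.0, 0.9, 0.5, 0.0):
    mb = effective_mb_per_s(33.8, 32, hit_rate)
    print(f"page hit rate {hit_rate:.0%}: {mb:6.1f} MB/s")
```

    Long fills and framebuffer reads sit near the top of that range, while lots of tiny polygons push the average toward the bottom.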

    The PSX's DRAM is faster than the Saturn's RAM, at very least for page mode reads/writes. (~33 MHz vs 28.6/26.6 MHz) That's hardly surprising as EDO DRAM commonly went up to 33 MHz and the higher end stuff at least went to 40 MHz (most stuff from 50 MHz, and especially 66 MHz upward tends to be SDRAM). IIRC, common FPM DRAM rarely went beyond 22 MHz.

    If the PSX was only using 60 ns FPM accesses, it would be using super cheap DRAM in-line with the jaguar (the jaguar 2 was actually planned to use RAM in that speed range, the Jag 1 was 75 ns in FPM), and they probably would have invested in a GPU with 64-bit DMA (though the CPU would still be stuck on a slower 32-bit bus -with more wait states for random accesses too). It would also mean they were using much, much cheaper RAM than the Saturn (or 3DO VRAM) and only slightly more than the Jaguar or 3DO (CPU work RAM), so I highly doubt that's the case given Sony's position. (not to mention the 133 MB/s CPU and video bus speeds listed in various documents)

    That 133MB/sec figure is only plausible when some really small chunk of memory is being manipulated so it all fits into the internal caches. There is no way this is achieved in any other way.
    It's plausible for framebuffer reads and block fills to "VRAM" where fast page mode is sustained for long periods. (fills of tiny polygons would have a lot more overhead on average per pixel)
    Texture mapping would only reach close to that speed for larger polygons using cached textures (small polygons end up with a lot more page breaks), and somewhat less for uncached textures. (though there's cache fill overhead to consider too -also more wasteful for small polygons that don't even use much of the texture)

    I only read that there's 2KB of cache, and that isn't a whole lot of memory
    I'd gotten the impression there was an additional GPU data/command cache in addition to that, but I can't seem to find that reference again, so maybe I just remembered it wrong. (obviously it's got a lot of buffering for accelerating non-cached operations too)

    I am designing a computer right now, and while its maximum bus bandwidth is 67MB/sec, you will only reach it in ideal condition when only one single device is on the bus and no other is. Now if I add CPU to the mix, some DMAs happening every once in a while I will end up with a figure nowhere near that 67MB/sec, I am not yet sure how big the hit is but it's minimally 25%. I also use 60ns EDO DRAM in the system and within DRAM you will not achieve that 67MB/sec ever, there's refresh and other aspects that get in the way. Refresh messes up all your timings, a nice chunk of performance is lost to refreshes, though in video applications you can just interleave your data so that the chips never need refresh cycles, but that assumes VRAM access that is all over DRAMs so you can omit refresh circuitry from the DRAM controller and live with slight performance penalty if the accesses aren't properly thought out (random access is SLOW)
    Yes, which is why any bus sharing MUST be heavily buffered and handled in long bursts as much as possible (better to end up with more wait states for bus masters than end up with tons of page breaks -of course, some things must have priority and can't wait, like when the framebuffer needs to be scanned -again, a good FPM based system would be buffering a long chunk of the framebuffer into a line buffer, possibly an entire scan line at a time)

    Only old/slow systems (into the mid/late 80s) with no facilities for page mode wouldn't have to worry about that . . . they were just stuck with slow random accesses. (like the 250 ns of the ST or 280 of the Amiga)
    Last edited by kool kitty89; 06-09-2011 at 06:03 PM.

  7. #7
    Mastering your Systems Hero of Algol TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    23
    Posts
    9,071
    Rep Power
    68

    Framebuffer can have some caching done but you only get very few cycles between pixels so even if you cache you don't win much, since you can only cache few pixels before you run out of time... DRAMs suck :P

    It's most definitely 60ns, all photos of PSX mobos show 60ns DRAMs. The 60ns figure is the fastest access you can do on the chip, which is access within a page. EDO is FPM with slightly reduced access time and that's all.

    sounds roughly correct - don't forget Saturn VDP1 also needs bandwidth to read drawing commands as well as textures, and loses bandwidth when drawing commands are copied from the SH2 side. PSX drawing commands come from host memory, there's no VRAM side.
    I wonder if there is any internal cache for the drawing commands.... ? I don't want to mess with Saturn just yet, I got way too many other things to finish first xD

  8. #8
    Wildside Expert
    Join Date
    May 2011
    Posts
    144
    Rep Power
    4

    Quote Originally Posted by kool kitty89 View Post
    Albeit there's also the rumor that went around the industry (and in the press) of the Saturn being significantly built-up late in design (usually claimed to be a result of the PSX threat, but it may have been more due to the 3DO/Jaguar and perhaps newer PC games -Sega of Japan was apparently taking the Jaguar's 1993 hype as a serious threat, partially spurring the 32x as well -and preceding "Super Megadrive" project; of course, they also had a good idea of what Nintendo was working with when the SGI partnership was announced -since Sega's own engineers had already scrutinized and critiqued that prototype chipset previously).
    Though if they really DID do such a redesign, it was certainly rather sloppy, or at very least not carefully cost controlled. (the earlier design was supposedly primarily built around VDP2 with a single SH1 as the CPU -presumably dual-purpose as CD-ROM controller as in the MCD- with the same sound system, less main RAM, either software rendered "sprites" to VDP2 or -more likely- a primitive incarnation of VDP1 lacking the 3D drawing modes -which could also explain why the warped quad modes have broken "half transparency" effects)
    The only thing that seemed 'built up' to me in the Saturn is 1M of the 2M memory - it's a different slower type. VDP1 & VDP2 are totally designed to function together, so only an idiot would suggest that one of them was an 'add on'.

    Quote Originally Posted by kool kitty89 View Post
    The Saturn's VDP1 seems rather similar to the 3DO's Cel engine in a number of ways, though the Saturn obviously avoids the contention issue by adding a multi-bus video subsystem and uses RAM clocked as fast as the processor too. (but the 3DO has a 32-bit bus to work with, so potentially close to the same bandwidth -not sure how many operations actually supported 32-bit accesses though)
    If the 3DO's Cel does actually allow most things in 32-bits (fill, gouraud shading, non-rotated 2D blits) that would get it a lot closer to VDP1 performance on its much slower RAM, especially if it had a 32-bit write buffer for texture mapping. (I doubt it has a 32-bit read buffer though)
    If the Cel doesn't do much of anything at 32-bits, the only reason to even have 32-bit VRAM would be for framebuffer scanning bandwidth. (ie for the rather unnecessary/overkill high-res/high-depth modes) A 32-bit write buffer would also make it faster than the Jaguar's peak texture mapping rate even if using VRAM as source and destination.
    3DO's Cel draws in a similar way to VDP1 - but has some nice image quality features ( counterweights to give 'super sampled' feel to some graphics )

    Quote Originally Posted by kool kitty89 View Post
    Isn't it preferable to use the GPU to build the object lists? (especially in such intensive cases . . . and where the GPU has no overhead for 3D or such)


    I thought the Jaguar II allowed triangle rasterization in a single blitter operation (not set up a line at a time like the Jaguar), with just the vertex information needed. (the Saturn's VDP1, 3DO's Cel, and PSX GPU all need separate processors to generate the vertex data too -the GTE is obviously not part of the GPU)
    gpu/cpu - they still need b/w to fetch and write data.

    Jag II had trapezoidal support - but not arbitrary triangles.

  9. #9
    Hero of Algol kool kitty89's Avatar
    Join Date
    Mar 2009
    Location
    San Jose, CA
    Age
    23
    Posts
    9,127
    Rep Power
    49

    Quote Originally Posted by TmEE View Post
    Framebuffer can have some caching done but you only get very few cycles between pixels so even if you cache you don't win much, since you can only cache few pixels before you run out of time... DRAMs suck :P
    Run out of time until the next read/write cycle after refresh. (and if you had double buffered scan-lines on-chip with one reading to the screen and the other being filled, there's a lot of flexibility -iirc, that's what the 7800's MARIA did back in 1983/84 with on-chip line RAM to build the scan line from the display list, albeit limited to pretty low resolutions/depths)
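
    The double-buffered line RAM idea described here can be sketched roughly like this. It is a toy model of the general ping-pong scheme only: the class and method names are invented, and nothing here reflects MARIA's actual registers, DMA behaviour or timing.

```python
class DoubleBufferedLineRAM:
    """Two line buffers: one is scanned out while the other is filled."""

    def __init__(self, width: int) -> None:
        self.buffers = [[0] * width, [0] * width]
        self.display = 0  # index of the buffer currently being scanned out

    def fill_next_line(self, pixels: list) -> None:
        """Stand-in for the DMA burst: write the upcoming line into the back buffer."""
        back = self.buffers[1 - self.display]
        back[:] = pixels

    def swap(self) -> None:
        """At the end of a scanline the two buffers trade roles."""
        self.display = 1 - self.display

    def scan_out(self) -> list:
        """Pixels being clocked out to the screen for the current line."""
        return list(self.buffers[self.display])


# Hypothetical 3-line, 4-pixel-wide image standing in for display-list output.
image = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]

line_ram = DoubleBufferedLineRAM(width=4)
shown = []
for row in image:
    line_ram.fill_next_line(row)       # happens while the previous line displays
    line_ram.swap()
    shown.append(line_ram.scan_out())  # the freshly filled line is now on screen

print(shown)
```

    The fill can be done in one long burst at any point during the previous line, which is what leaves the rest of the bus time free.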

    It's most definitely 60ns, all photos of PSX mobos show 60ns DRAMs. The 60ns figure is the fastest access you can do on the chip, which is access within a page. EDO is FPM with slightly reduced access time and that's all.
    OK, then the "60 ns" reference on wiki's DRAM description is incorrect?
    http://en.wikipedia.org/wiki/Dynamic...#Memory_timing

    If that IS the case, that means the 133 MB/s figure for the CPU and GPU is false and the RAM used is only moderately faster than the 80 ns FPM RAM used in the 3DO. (or MCD/SVP/32x for that matter)
    It also means the PSX was using pretty damn cheap RAM in it on top of the lesser number of buses and individual components and Sony's vertical integration on top of that. (making the cost/price gap a lot wider than I was thinking)

    However, the numerous mentions of 133 MB/s DRAM bus connections (including from some PSX programmers) really doesn't mesh with 60 ns DRAM. (it definitely could be EDO DRAM at 33 MHz . . . some PC video cards were using faster EDO DRAM than that around the same time -the original Rage used 40 MHz EDO DRAM)

    And I know FPM DRAM is almost the same as EDO (and both asynchronous), but it was my impression that mass market FPM DRAM available maxed out somewhat lower than EDO DRAM.

    I wonder if there is any internal cache for the drawing commands.... ? I don't want to mess with Saturn just yet, I got way too many other things to finish first xD
    If there is, that would improve flexibility a good bit . . . and also make software rasterization easier. (one of the suggestions of what the jaguar should have had is a blitter command cache to allow the GPU and blitter to run in parallel for 3D drawing much more easily)








    Quote Originally Posted by Crazyace View Post
    The only thing that seemed 'built up' to me in the Saturn is 1M of the 2M memory - it's a different slower type. VDP1 & VDP2 are totally designed to function together, so only an idiot would suggest that one of them was an 'add on'.
    It really depends how much the system was modified (if at all), but we'll never know the real story unless someone like Sato takes interest in discussing the topic. (or if some of Sega of America's engineers have some information -if anyone, it would probably be Joe Miller)

    As for VDP1+VDP2, yes they make sense together . . . but without VDP1 it wouldn't be much different than what NEC did with the PCFX (except the PCFX had the old SuperGrafx sprite+BG engine as well -albeit that would make as much sense as VDP1 being replaced by a modified/doubled MD VDP).
    Engineering VDP1 at the last minute (ie within the final 10 months before release) is really far-fetched, but one more realistic possibility might have been VDP1 being a more primitive incarnation of what it is now, perhaps lacking the 3D (warped) quad drawing modes and just having the 2D "sprite" engine with scaling and rotation support. (rather like an updated derivative of the graphics coprocessor in the MCD's gate array -and a VDP1 like that would assist in 3D rather like the Jaguar's blitter)
    Possible evidence for that is the broken half-transparency effect in warped primitives. (total speculation though and very circumstantial)

    As for the RAM, I've seen that as "evidence" before, but it doesn't seem like much of a smoking gun to me. (it seems more likely that it was just put in there as cheap auxiliary storage, though the fact that it's only 16 bits wide -in spite of using 2 256kx16-bit DRAMs- is a bit odd -and it would have been just as strange if coupled with the SH1 as the main CPU -though the SH1's SDRAM in the Saturn is also 16 bits wide like in the 32x, but that makes more sense since both cases use a single 16-bit wide SDRAM chip)

    This was brought up before a few times, notably here:
    http://www.sega-16.com/forum/showthr...-hype&p=277055 (chilly willy's post about the rumor going around the industry back then)
    then a lot more discussion on it here:
    http://www.sega-16.com/forum/showthr...2X-hype/page12 (mainly between Chilly Willy and me)
    Also this rather vague, but interesting press release by EDGE in late 1993
    Quote Originally Posted by Team Andromeda View Post
    Yes it was around November time, covered in Edge issue 4 1994 [typo, really 1993]. Long after the 1st Saturn tech specs were leaking out. In Edge Issue 5 the Saturn specs were as follows (being brief)

    Hitachi: SH7032 CPU RISC CHIP Running at 27 MHz
    Memory: 3Mb RAM
    Sound: 32 channels, support for PCM and FM
    Release: November 94

    Again, if there is any truth to the Saturn redesign (especially if it's anything like the SH1 CPU with no VDP1 -or more likely, a 2D-only VDP1), it probably was not a response to the PSX specifically and probably started earlier on as a response to the poor performance compared to the Jaguar and 3DO (and their related hype), perhaps PC games as well. (and/or the realization that upcoming arcade games would be virtually unworkable)

    Such a major redesign should have started by mid 1993 at the very latest.

    3DO's Cel draws in a similar way to VDP1 - but has some nice image quality features ( counterweights to give 'super sampled' feel to some graphics )
    The interpolation is nice, but overkill and a bit unnecessary for the time. (forcing things to 480i also limited framerate -not just in terms of peak framerate, but flexibility of a variable framerate -divisions of 30/25 rather than 60/50)
    The N64 did that for a handful of games (320x240 interpolated to 640x480) and I honestly don't like it that much. (Episode I Racer does it in low detail mode, not sure what others -oddly, the intro and title screen are in 240p)

    Jag II had trapezoidal support - but not arbitrary triangles.
    It used warped quads like the 3DO and Saturn then? (which technically can be folded into triangles at the expense of heavily warping textures, screwing up gouraud shading, and cutting fillrate in half due to constant 50% overdraw -but several Saturn games did exactly that, using pre-warped textures to get around one of the problems . . . but I can't help but think a software line-by-line triangle rasterizer -set up by one of the CPUs and filled by VDP1- would have been faster overall, definitely for any games where fillrate is the limit and not CPU resources)

    Actually, doing a software rasterizer on the Saturn might get around the broken translucency effect since you're rendering in 2D scaled/rotated mode and not warped mode.
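
    To make the line-by-line idea a bit more concrete, here is a minimal sketch of the CPU side only: splitting a triangle into one horizontal span per scan line. The function name and the (x, y) tuple format are made up for illustration, and how each span would actually be handed to VDP1 (a line command, a 1-pixel-high scaled sprite, etc.) is left out entirely, so treat this as an assumption-laden outline rather than anything a real Saturn game did.

```python
def triangle_spans(v0, v1, v2):
    """Return a list of (y, x_left, x_right), one per scan line the triangle covers.

    Vertices are (x, y) tuples with integer y. Hypothetical helper for illustration.
    """
    # Sort vertices top to bottom so edge v0->v2 is the long edge.
    (x0, y0), (x1, y1), (x2, y2) = sorted((v0, v1, v2), key=lambda v: v[1])

    def edge_x(xa, ya, xb, yb, y):
        # x position where the edge a->b crosses scan line y
        if yb == ya:
            return xa  # horizontal edge: just use its first endpoint
        return xa + (xb - xa) * (y - ya) / (yb - ya)

    spans = []
    for y in range(y0, y2 + 1):
        x_long = edge_x(x0, y0, x2, y2, y)
        if y < y1:
            x_short = edge_x(x0, y0, x1, y1, y)   # upper short edge
        else:
            x_short = edge_x(x1, y1, x2, y2, y)   # lower short edge
        left, right = sorted((x_long, x_short))
        spans.append((y, round(left), round(right)))
    return spans


def pixel_count(spans):
    """Total pixels written if every span is filled inclusively."""
    return sum(right - left + 1 for _, left, right in spans)
```

    Each pixel gets written once this way, versus the folded-quad approach where roughly half the drawn pixels are overdraw. Whether the per-span command overhead on VDP1 eats that gain is exactly the open question.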


    That reminds me: I've seen some conflicting details on what the VDP1 alpha blending bugs cause in warped mode and when they are a problem.
    From what I understand, plain 2D modes (1 or 2 point sprites) work fine for "half transparency" and VDP2 BG alpha blending effects work fine with everything (which makes sense since they're totally separate and treat everything on the VDP1 buffer as plain pixels with no added information other than priority -which might include alpha layer information in some modes, not sure), so that leaves problems with VDP1 objects and other VDP1 objects.
    It's my impression that any 2D sprites over any other part of the framebuffer can use half transparency properly, but warped primitives (with or without texture) will not work for that effect. (either they show up as solid objects or show garbage when transparency is enabled, I'm not sure)


    That reminds me:
    I've read over a lot of the Jaguar manual, but I don't seem to remember any mention of alpha blending/translucency/"transparency" effects for the blitter or object processor.
    Does the Jaguar support any alpha blending effects in hardware? (be it in CRY or RGB modes)




    Oh, and I made a stupid mistake with that comment on the Slipstream's "color hold" mode . . . you can still use normal bitmap graphics, you just need to cater to the specifics of that mode. (and not leave any framebuffer regions as 0s that you don't want to get filled according to color hold -all it does is repeat the last color scanned in the line as long as 0s are encountered, and change to another color when a non-0 pixel is encountered -so if you wanted to genlock over another screen, you'd need to include "transparent" as one of the indexes above zero and limit total color count to 14 or 254 depending on the mode)
    So it's just about as flexible as the 32x's RLE mode, but works a bit differently. (and probably requires less bandwidth and logic complexity in the video controller than RLE -which uses 16-bit values to set an 8-bit color and then an 8-bit count for pixel run-length)
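
    A rough sketch of how the two schemes differ when expanding a scan line, going only by the descriptions above. The function names are invented, and the details are assumptions: which byte of the 16-bit RLE word holds the count, whether the count is stored as-is or as count-minus-one, and what color is "held" at the start of a line are all guesses here, not taken from the hardware docs.

```python
def expand_color_hold(line, initial=0):
    """Expand one scan line of indexed pixels in a 'color hold' style mode.

    A 0 pixel repeats the last non-zero index seen on the line.
    `initial` (the color held before the first non-zero pixel) is an assumption.
    """
    out = []
    last = initial
    for px in line:
        if px != 0:
            last = px       # a non-zero pixel sets the new held color
        out.append(last)    # zeros just repeat whatever is held
    return out


def decode_rle(words):
    """Decode 16-bit run-length words into a flat list of 8-bit color indexes.

    Assumed layout: high byte = run length (stored as-is), low byte = color.
    """
    out = []
    for w in words:
        count = (w >> 8) & 0xFF
        color = w & 0xFF
        out.extend([color] * count)
    return out
```

    For example, expand_color_hold([3, 0, 0, 5, 0, 1]) gives [3, 3, 3, 5, 5, 1] while still taking one framebuffer entry per pixel, whereas decode_rle([0x0307, 0x0102]) gives [7, 7, 7, 2] from just two words. Color hold saves drawing work rather than memory; RLE saves memory but needs a counter in the video controller.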
    Last edited by kool kitty89; 06-11-2011 at 02:58 AM.

  10. #10
    Mastering your Systems Hero of Algol TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    23
    Posts
    9,071
    Rep Power
    68

    Default

    The only thing that seemed 'built up' to me in the Saturn is 1M of the 2M memory - it's a different slower type. VDP1 & VDP2 are totally designed to function together, so only an idiot would suggest that one of them was an 'add on'.
    Yes exactly, that slower type of RAM is really slapped on the design; when you look at Saturn schematics you see it's hacked on there, like a tumor. A note says it's only for holding data, not running code.

    I did take a look at DRAM specs and tried to find the type used in the PSX, and it seems there are FPMs in use, not EDO. I messed up on page cycles: on EDO they can be quite fast, 25ns, but on FPM it's 45ns, so that raises the possible bandwidth to 22MB/sec per chip. The only overhead is the initial and final parts of the access, so accesses need to be long for it to be minimal.
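
    The 22MB/sec number falls out of the page cycle time if each chip is 8 bits wide (that width is an assumption; it's what makes the figure work). A quick sketch of the arithmetic, ignoring the row-open overhead at the start of each access and using 1 MB = 10^6 bytes:

```python
def page_mode_mb_per_s(bus_bits, page_cycle_ns):
    """Peak in-page bandwidth in MB/s (1 MB = 10**6 bytes).

    One transfer of bus_bits every page_cycle_ns; initial RAS overhead ignored.
    """
    bytes_per_transfer = bus_bits / 8
    transfers_per_us = 1000.0 / page_cycle_ns
    return bytes_per_transfer * transfers_per_us


fpm_per_chip = page_mode_mb_per_s(8, 45)    # about 22.2 MB/s
edo_per_chip = page_mode_mb_per_s(8, 25)    # 40 MB/s
fpm_32bit_bus = page_mode_mb_per_s(32, 45)  # about 88.9 MB/s across four 8-bit chips
```

    So four such FPM chips side by side on a 32-bit bus would still peak under 90MB/sec in page mode, short of 133MB/sec, while the 25ns EDO page cycle would clear it.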

    EDIT: I may have been looking at the wrong kind of RAM there... the 4 chips are 512KB each and that is 2MB total, which means they're main RAM, not VRAM. VRAM is a single 1MByte "synchronous GRAM" chip. Looking its properties up right now... definitely something expensive and exotic.
    Seems to be 32-bit SDRAM with 2 banks, and in that case, yes, the 133MB/sec figure is all good.
    Looks like quite a fancy chip in the datasheet. PSX is saved by the 32-bit bus of VRAM, that gives it double. The chip allows for much higher speeds than 133MB/sec though, so I guess it's not used to its fullest. It's 12ns, with a 2 or 3 cycle latency setting for up to 55MHz or 83MHz operation; the 133MB/sec figure suggests the chip is run at only 33.x MHz (probably 33.8688MHz).
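
    For reference, the clock-based arithmetic behind that guess, as a small sketch. It assumes one 32-bit transfer per clock with no command or latency overhead, and 1 MB = 10^6 bytes; the function name is just for illustration.

```python
def clocked_bus_mb_per_s(bus_bits, clock_mhz, clocks_per_transfer=1):
    """Peak bandwidth in MB/s (1 MB = 10**6 bytes) for a synchronous bus."""
    return (bus_bits / 8) * clock_mhz / clocks_per_transfer


at_psx_clock = clocked_bus_mb_per_s(32, 33.8688)  # about 135.5 MB/s
at_83mhz = clocked_bus_mb_per_s(32, 83)           # 332 MB/s at the chip's top rating
```

    A 32-bit bus at 33.8688MHz works out to about 135MB/sec peak, so the commonly quoted 133MB/sec is in the right range for one transfer per clock (133.3 would be exactly 33.33MHz x 4 bytes).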
    Last edited by TmEE; 06-11-2011 at 04:45 AM.
    Death To MP3, :3
    Mida sa loed ? Nagunii aru ei saa "Gnirts test is a shit" New and growing website of total jawusumness !

  11. #11
    Wildside Expert
    Join Date
    May 2011
    Posts
    144
    Rep Power
    4

    Default

    KoolKitty,

    JagII did not use warped quads - just trapezoids with flat top/bottoms ( so 2 would be needed for an arbitrary triangle ) - it's just a slight modification to the blitter - the Lynx also could draw trapezoids

    TmEE,

    The PSX supported 133MB/s bursts on both main and video memory, so you are probably just reading the chip specs wrongly. ( Every company always quotes peak figures though )

  12. #12
    Mastering your Systems Hero of Algol TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    23
    Posts
    9,071
    Rep Power
    68

    Default

    On main RAM I cannot really see 133MB/sec happening; that would assume single-cycle access on the memory, but that is not possible on such DRAM. 133MB/sec inside a (CPU) cache is definitely plausible.
    On VRAM it's definitely possible to get that 133MB/sec and beyond, and not even just in burst accesses. I can see VRAM being clocked at 67MHz to minimize overhead on SDRAM commands etc.; judging from the datasheet of the SGRAM chip there are quite a lot of empty wait cycles.

  13. #13
    Wildside Expert
    Join Date
    May 2011
    Posts
    144
    Rep Power
    4

    Default

    As I said - 133MB/s bursts ( 1 transfer/cycle ) do happen during DMA - that is a fact.

  14. #14
    Mastering your Systems Hero of Algol TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    23
    Posts
    9,071
    Rep Power
    68

    Default

    Then the DRAM has to be EDO not FPM, (and DRAM controller has to run faster than CPU does)

  15. #15
    Wildside Expert
    Join Date
    May 2011
    Posts
    144
    Rep Power
    4

    Default

    It was always described as SDRAM, so that's what I assumed it was.
