Thread: Advantages of SNES hardware vs. Genesis hardware

  1. #1756
    Hero of Algol kool kitty89's Avatar
    Join Date
    Mar 2009
    Location
    San Jose, CA
    Age
    29
    Posts
    9,724
    Rep Power
    62

    Does anyone know how the SNES actually fares in DMA bandwidth for the VDP (for VRAM updates)?

    I remember at one point it was mentioned that, on paper, it should be similar to the MD in H32, but in reality it was hampered by problems that led to it falling far short of that. IIRC, the MD VDP does 198 bytes per vblank scanline in H40 and 166 bytes in H32, and a limited amount in active display (32 bytes?). But I've never seen any figures for the SNES.
    6 days older than SEGA Genesis
    -------------
    Quote Originally Posted by evilevoix View Post
    Dude it’s the bios that marries the 16 bit and the 8 bit that makes it 24 bit. If SNK released their double speed bios revision SNK would have had the world’s first 48 bit machine, IDK how you keep ignoring this.
    Quote Originally Posted by evilevoix View Post
    the PCE, that system has no extra silicone for music, how many resources are used to make music and it has less sprites than the MD on screen at once but a larger sprite area?

  2. #1757
    I remain nonsequitur Shining Hero sheath's Avatar
    Join Date
    Jul 2010
    Location
    Texas
    Age
    41
    Posts
    13,313
    Rep Power
    129

    I used what information was available in the RE docs years ago to come up with a comparable DMA bandwidth for the Genesis and SNES. The "tech community," contrarians they tend to be, would discuss the numbers without helping to confirm or deny them. What I came up with was:

    Genesis:
    Transfer Rate: 7.2 KB per 1/60th second12

    SNES:
    Transfer Rate: 5.72 KB per 1/60th second shared by 8 Channels 33
    "... If Sony reduced the price of the Playstation, Sega would have to follow suit in order to stay competitive, but Saturn's high manufacturing cost would then translate into huge losses for the company." p170 Revolutionaries at Sony.

    "We ... put Sega out of the hardware business ..." Peter Dille senior vice president of marketing at Sony Computer Entertainment

  3. #1758
    Road Rasher Mask of Destiny
    Join Date
    Apr 2013
    Location
    SF Bay Area, California
    Posts
    274
    Rep Power
    18

    Quote Originally Posted by kool kitty89 View Post
    IIRC, the MD VDP does 198 bytes per vblank scanline in H40 and 166 bytes in H32, and a limited amount in active display. (32 bytes?)
    In H40, it's 204 bytes per scanline for 68K->VRAM transfers when the display is off or during the 37 lines it's not rendering. For 68K->VSRAM or 68K->CRAM it's 198 words per line (396 bytes). For active lines, it's 18 bytes for VRAM and 18 words for VSRAM and CRAM. For H32 mode it's 165 bytes (320 words for VSRAM and CRAM) per scanline when the display is off or during the 37 lines the VDP is not busy rendering. For active lines, it's 16 bytes (16 words for VSRAM and CRAM).

  4. #1759
    Mastering your Systems Shining Hero TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    28
    Posts
    10,063
    Rep Power
    107

    The SNES can transfer one byte every 8 master clock cycles. During active display, no access is possible to VRAM or OAM, and access to CGRAM is mostly unsuccessful (you cannot do midframe palette swaps or VRAM updates on the SNES). All VRAM/OAM/CGRAM updates have to be done in VBL (or forced blanking).

    On the SNES a line is normally 1364 cycles long (some lines are 4 cycles longer or shorter, but for simplicity we ignore that).
    NTSC machines have 262 lines total, of which 224 are active; this leaves 38 lines for VBL.
    PAL machines have 312 lines total, of which 224 or 239 are active, leaving 88 or 73 lines for VBL.

    Bytes per line ~ 1364 / 8 ~ 170.5
    NTSCbytesPerFrame = (1364 * 38) / 8 = 6479
    PAL1bytesPerFrame ~ (1364 * 73) / 8 ~ 12446.5
    PAL2bytesPerFrame = (1364 * 88) / 8 = 15004


    In the case of the MD, things are more complex due to the refreshes involved and the clock speed change during the line, but it has been verified experimentally that in H32 there are 161 access slots in VBL lines and 16 in active lines, and in H40 there are 198 in VBL lines and 18 in active lines.
    Unlike the SNES, the MD can transfer data to VRAM, VSRAM and CRAM during active display, albeit slowly. On the SNES you get data corruption when doing that. Access to VSRAM and CRAM is 16 bits per transfer, and VRAM is 8 bits.

    The MD has 262 lines in NTSC, of which 224 are active, leaving the same 38 lines for VBL as the SNES.
    There are 313 lines in PAL, of which 224 or 240 are active, leaving 89 or 73 for VBL.

    Code:
    +----+------------+---------+---------+---------+
    | Hz | Resolution | Passive | Active  |  Total  |
    +----+------------+---------+---------+---------+
    | 60 | 256 * 224  |   6118  |   3584  |   9702  |
    |    | 320 * 224  |   7524  |   4032  |  11556  |
    +----+------------+---------+---------+---------+
    | 50 | 256 * 240  |  11753  |   3840  |  15593  |
    |    | 320 * 240  |  14454  |   4320  |  18774  |
    |    | 256 * 224  |  14329  |   3584  |  17913  |
    |    | 320 * 224  |  17622  |   4032  |  21654  |
    +----+------------+---------+---------+---------+
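
    A quick way to sanity-check the table: each figure looks like the per-line slot count from this post multiplied by the number of VBL or active lines. A minimal Python sketch of that arithmetic follows; the names (SLOTS_PER_LINE, md_slots_per_frame) are made up for illustration, and it counts one unit of transfer per access slot, as the table does.

```python
# Per-line access slot counts quoted in the post
# (H32 = 256 pixels wide, H40 = 320 pixels wide).
SLOTS_PER_LINE = {
    "H32": {"vbl": 161, "active": 16},
    "H40": {"vbl": 198, "active": 18},
}


def md_slots_per_frame(mode, total_lines, active_lines):
    """Return (passive, active, total) access slots per frame.

    Illustrative helper only: passive slots come from the lines outside
    active display, active slots from the displayed lines.
    """
    per_line = SLOTS_PER_LINE[mode]
    passive = per_line["vbl"] * (total_lines - active_lines)
    active = per_line["active"] * active_lines
    return passive, active, passive + active


if __name__ == "__main__":
    # (label, mode, total lines, active lines), using the line counts from the post.
    rows = [
        ("60 Hz 256*224", "H32", 262, 224),
        ("60 Hz 320*224", "H40", 262, 224),
        ("50 Hz 256*240", "H32", 313, 240),
        ("50 Hz 320*240", "H40", 313, 240),
        ("50 Hz 256*224", "H32", 313, 224),
        ("50 Hz 320*224", "H40", 313, 224),
    ]
    for label, mode, total_lines, active_lines in rows:
        passive, active, total = md_slots_per_frame(mode, total_lines, active_lines)
        print(f"{label}: passive={passive} active={active} total={total}")
```

    Swapping in different per-line slot counts only requires editing SLOTS_PER_LINE.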
    EDIT: One thing I forgot about is refresh that lasts 40 cycles every line on SNES. The bus is locked during that time.

    Bytes per line is ~ (1364-40) / 8 ~ 165.5
    NTSCbytesPerFrame = ((1364-40) * 38) / 8 = 6289
    PAL1bytesPerFrame ~ ((1364-40) * 73) / 8 ~ 12081.5
    PAL2bytesPerFrame = ((1364-40) * 88) / 8 = 14564

    The SNES is still a little bit faster than the MD in H32.
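
    The same arithmetic as a small Python sketch, in case anyone wants to plug in other line counts. It assumes the figures from this post (1364 master cycles per line, 40 cycles of refresh per line, 8 cycles per byte); the constant and function names are hypothetical.

```python
MASTER_CYCLES_PER_LINE = 1364   # typical line length quoted in the post
REFRESH_CYCLES_PER_LINE = 40    # bus is locked for refresh on every line
CYCLES_PER_DMA_BYTE = 8         # DMA moves one byte every 8 master cycles


def snes_dma_bytes(vblank_lines, include_refresh=True):
    """Estimate DMA bytes available over the given number of VBL lines."""
    usable = MASTER_CYCLES_PER_LINE
    if include_refresh:
        usable -= REFRESH_CYCLES_PER_LINE
    return usable * vblank_lines / CYCLES_PER_DMA_BYTE


if __name__ == "__main__":
    # 38 VBL lines for NTSC, 73 or 88 for PAL, as listed in the post.
    for label, lines in (("NTSC", 38), ("PAL1", 73), ("PAL2", 88)):
        print(label, snes_dma_bytes(lines))
```

    Passing include_refresh=False gives the figures from before the EDIT.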
    Last edited by TmEE; 10-06-2013 at 03:41 AM.
    Death To MP3, :3
    Mida sa loed ? Nagunii aru ei saa "Gnirts test is a shit" New and growing website of total jawusumness !
    If any of my images in my posts no longer work you can find them in "FileDen Dump" on my site ^

  5. #1760
    Outrunner roundwars's Avatar
    Join Date
    Jan 2010
    Location
    California, USA
    Age
    28
    Posts
    574
    Rep Power
    16

    Now do PCE

  6. #1761
    Mastering your Systems Shining Hero TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    28
    Posts
    10,063
    Rep Power
    107

    The TG16/PCE does not have DMA, but I know that the CPU cannot use up the VRAM bandwidth in that machine. Also, there are no restrictions on when you can transfer data; the whole frame is available for max speed (and solely because it has one BG layer :P).

    CPU speed is 7159090Hz, there's ~60 frames per second and 262 lines per frame.
    ~119318 cycles per frame
    ~455 cycles per line

    Bandwidth here relates directly to how fast the CPU can push data into VRAM, and I don't know how many cycles one such operation takes. The bandwidth value is the 2 figures above divided by "that missing cycle count".
    An educated guess is that it takes one cycle to fetch the instruction, a second to load the data, and a third to store it. That would be the best-case scenario, in which case there would be:

    39772 bytes per frame
    151 bytes per line
    Last edited by TmEE; 10-06-2013 at 09:20 AM.

  7. #1762
    Death Bringer Raging in the Streets Black_Tiger's Avatar
    Join Date
    Oct 2006
    Location
    Vancouver
    Age
    41
    Posts
    4,278
    Rep Power
    99

    Quote Originally Posted by TmEE View Post
    The TG16/PCE does not have DMA, but I know that the CPU cannot use up the VRAM bandwidth in that machine. Also, there are no restrictions on when you can transfer data; the whole frame is available for max speed (and solely because it has one BG layer :P).

    CPU speed is 7159090Hz, there's ~60 frames per second and 262 lines per frame.
    ~119318 cycles per frame
    ~455 cycles per line

    Bandwidth here relates directly to how fast the CPU can push data into VRAM, and I don't know how many cycles one such operation takes. The bandwidth value is the 2 figures above divided by "that missing cycle count".
    An educated guess is that it takes one cycle to fetch the instruction, a second to load the data, and a third to store it. That would be the best-case scenario, in which case there would be:

    39772 bytes per frame
    151 bytes per line
    Does this mean that the SuperGrafx has restrictions for when it can transfer data?

  8. #1763
    Road Rasher Mask of Destiny
    Join Date
    Apr 2013
    Location
    SF Bay Area, California
    Posts
    274
    Rep Power
    18

    Quote Originally Posted by TmEE View Post
    In the case of the MD, things are more complex due to the refreshes involved and the clock speed change during the line, but it has been verified experimentally that in H32 there are 161 access slots in VBL lines and 16 in active lines, and in H40 there are 198 in VBL lines and 18 in active lines.
    This is not entirely correct. In H40 there are 210 VRAM access slots, six of which get used for refresh during Vblank/display off. Additionally, the DMA engine can read one word from the 68K's bus per access slot; however, two reads get missed whenever there's a VRAM refresh cycle. When VRAM is the target, this doesn't matter as it takes two slots to retire a single word anyway; however, CRAM and VSRAM do not have this limitation since they're more or less word wide. So it's 204 bytes to VRAM or 198 words to CRAM or VSRAM. The picture is similar in H32. There are 171 slots per line, five of which are refresh so we get 166 bytes to VRAM or 161 words to CRAM/VSRAM per line (I apologize for the off-by-one error in my previous post, temporarily forgot that I was using zero based slot numbers in my code).
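
    One way to read the slot arithmetic above, as a hedged Python sketch. The function name is made up; the slot and refresh counts are the ones quoted in this post, and the model is just "one byte per non-refresh slot for VRAM, one word per slot minus two missed reads per refresh for CRAM/VSRAM".

```python
def md_dma_per_blank_line(total_slots, refresh_slots):
    """Return (vram_bytes, cram_vsram_words) for one VBL / display-off line."""
    # VRAM target: two slots retire one word, so each non-refresh slot
    # effectively moves one byte.
    vram_bytes = total_slots - refresh_slots
    # CRAM/VSRAM target: one word per slot, but every refresh cycle makes
    # the DMA engine miss two reads from the 68K bus.
    cram_vsram_words = total_slots - 2 * refresh_slots
    return vram_bytes, cram_vsram_words


if __name__ == "__main__":
    print("H40:", md_dma_per_blank_line(210, 6))
    print("H32:", md_dma_per_blank_line(171, 5))
```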

    Quote Originally Posted by TmEE View Post
    The MD has 262 lines in NTSC, of which 224 are active, leaving the same 38 lines for VBL as the SNES.
    There are 313 lines in PAL, of which 224 or 240 are active, leaving 89 or 73 for VBL.
    224/240 lines are displayed, but the VDP is busy rendering for 225. The reason for this is that sprites are processed and rendered to a buffer a line early.

  9. #1764
    Hero of Algol kool kitty89's Avatar
    Join Date
    Mar 2009
    Location
    San Jose, CA
    Age
    29
    Posts
    9,724
    Rep Power
    62

    Quote Originally Posted by TmEE View Post
    The MD has 262 lines in NTSC, of which 224 are active, leaving the same 38 lines for VBL as the SNES.
    There are 313 lines in PAL, of which 224 or 240 are active, leaving 89 or 73 for VBL.

    Code:
    +----+------------+---------+---------+---------+
    | Hz | Resolution | Passive | Active  |  Total  |
    +----+------------+---------+---------+---------+
    | 60 | 256 * 224  |   6118  |   3584  |   9702  |
    |    | 320 * 224  |   7524  |   4032  |  11556  |
    +----+------------+---------+---------+---------+
    | 50 | 256 * 240  |  11753  |   3840  |  15593  |
    |    | 320 * 240  |  14454  |   4320  |  18774  |
    |    | 256 * 224  |  14329  |   3584  |  17913  |
    |    | 320 * 224  |  17622  |   4032  |  21654  |
    +----+------------+---------+---------+---------+
    Don't you lose 2 vblank scanlines per frame to something else? (synchronization or something . . . I forget, but iirc Tomaitheous mentioned it before to the effect that for NTSC you've got 260 usable scanlines per frame, and any DMA time would be 260 - (vertical res) or for PAL 311 - (vertical res)). Thus something like Virtua Racing with 192 lines active would have 68 lines usable for vblank DMA each frame in 60 Hz or 119 in 50 Hz.

    EDIT: One thing I forgot about is refresh that lasts 40 cycles every line on SNES. The bus is locked during that time.

    Bytes per line is ~ (1364-40) / 8 ~ 165.5
    NTSCbytesPerFrame = ((1364-40) * 38) / 8 = 6289
    PAL1bytesPerFrame ~ ((1364-40) * 73) / 8 ~ 12081.5
    PAL2bytesPerFrame = ((1364-40) * 88) / 8 = 14564

    The SNES is still a little bit faster than the MD in H32.
    What about at 3.58 MHz CPU/bus mode? Is the VDP locked at those 8-cycle access times, or could it be bumped to 6 cycles for fast ROM (or on-cart SRAM)?




    Quote Originally Posted by TmEE View Post
    The TG16/PCE does not have DMA, but I know that the CPU cannot use up the VRAM bandwidth in that machine. Also, there are no restrictions on when you can transfer data; the whole frame is available for max speed (and solely because it has one BG layer :P).

    CPU speed is 7159090Hz, there's ~60 frames per second and 262 lines per frame.
    ~119318 cycles per frame
    ~455 cycles per line

    Bandwidth here relates directly to how fast the CPU can push data into VRAM, and I don't know how many cycles one such operation takes. The bandwidth value is the 2 figures above divided by "that missing cycle count".
    An educated guess is that it takes one cycle to fetch the instruction, a second to load the data, and a third to store it. That would be the best-case scenario, in which case there would be:

    39772 bytes per frame
    151 bytes per line
    Last time I remember seeing Tomaitheous post on the HuC6280's block-move performance, I remember it being 5 or 6 cycles per byte (I think it was either, depending on some circumstances), but I forget the specifics. I would have thought a bit faster than that given the 650x style bus access timing and the general overhead associated with sequential data accesses for block move instructions (we're not talking a stock 6502/C02 here).
    Hmm, maybe it was per-word and not byte . . . the VDP does use a 16-bit I/O port for CPU writes, so that might have been the context.

    In any case it's still less than the peak capacity for VRAM updates, so there'd be potential for enhancing that with dedicated hardware in add-ons. (aside from basic DMA block move, a more general purpose blitter could do that and more . . . not to mention the context of a 2nd VDP being framebuffer+blitter oriented rather than the identical one the SuperGrafx added, well that and if NEC actually worked that into the CD upgrade peripherals rather than weirdly release a new console at an odd time like with the SuperGrafx )

    Theoretical peak, given the main bus and I/O port for the VDP (assuming we're talking about the existing VDP and not some hypothetical one with a dual-bus DMA engine sitting between the main and VRAM buses), should be two 7.16 MHz cycles per byte (3.58 MB/s or ~228 bytes per line)



    Quote Originally Posted by Mask of Destiny View Post
    This is not entirely correct. In H40 there are 210 VRAM access slots, six of which get used for refresh during Vblank/display off. Additionally, the DMA engine can read one word from the 68K's bus per access slot; however, two reads get missed whenever there's a VRAM refresh cycle. When VRAM is the target, this doesn't matter as it takes two slots to retire a single word anyway; however, CRAM and VSRAM do not have this limitation since they're more or less word wide. So it's 204 bytes to VRAM or 198 words to CRAM or VSRAM. The picture is similar in H32. There are 171 slots per line, five of which are refresh so we get 166 bytes to VRAM or 161 words to CRAM/VSRAM per line (I apologize for the off-by-one error in my previous post, temporarily forgot that I was using zero based slot numbers in my code).
    224/240 lines are displayed, but the VDP is busy rendering for 225. The reason for this is that sprites are processed and rendered to a buffer a line early.
    Hmm, that may be part of the 2 "unusable" lines I was thinking of . . . I could have sworn there was some mention of synchronization too. (maybe 1 line lost to set-up and one to sync)

    Also, interesting on the sprite loading, I'd assumed that was done in hblank . . . wait, nevermind, for hblank you'd need to start a line early too. (start with trailing hblank from previous line and end with leading hblank on current line)

  10. #1765
    Mastering your Systems Shining Hero TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    28
    Posts
    10,063
    Rep Power
    107

    SNES DMA is always MCLK / 8, regardless of ROM speed setting.

    I also got some info that the fastest way to push data to VRAM on the PCE takes 5 cycles per byte, but that way doubles all the data size requirements. The normal block-move way takes 7 cycles per byte. So:

    ~23863 bytes per frame
    ~91 bytes per line

    or

    ~17045 bytes per frame
    ~65 bytes per line
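
    For reference, the per-frame and per-line figures above can be reproduced with a short Python sketch. It assumes the 7159090 Hz clock, ~60 frames per second and 262 lines per frame quoted earlier in the thread, truncates to whole bytes, and uses made-up names.

```python
CPU_HZ = 7_159_090        # CPU clock quoted earlier in the thread
FRAMES_PER_SECOND = 60    # approximate frame rate used in the thread
LINES_PER_FRAME = 262


def pce_bytes(cycles_per_byte):
    """Return (bytes_per_frame, bytes_per_line), truncated to whole bytes."""
    cycles_per_frame = CPU_HZ / FRAMES_PER_SECOND
    cycles_per_line = cycles_per_frame / LINES_PER_FRAME
    return (
        int(cycles_per_frame / cycles_per_byte),
        int(cycles_per_line / cycles_per_byte),
    )


if __name__ == "__main__":
    # 5 cycles/byte: fastest method mentioned; 7 cycles/byte: normal block move.
    for cycles in (5, 7):
        per_frame, per_line = pce_bytes(cycles)
        print(f"{cycles} cycles/byte: ~{per_frame} bytes/frame, ~{per_line} bytes/line")
```

    Calling pce_bytes(3) gives the earlier best-case guess of 3 cycles per byte.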

    Quote Originally Posted by Mask of Destiny View Post
    This is not entirely correct. In H40 there are 210 VRAM access slots, six of which get used for refresh during Vblank/display off. Additionally, the DMA engine can read one word from the 68K's bus per access slot; however, two reads get missed whenever there's a VRAM refresh cycle. When VRAM is the target, this doesn't matter as it takes two slots to retire a single word anyway; however, CRAM and VSRAM do not have this limitation since they're more or less word wide. So it's 204 bytes to VRAM or 198 words to CRAM or VSRAM. The picture is similar in H32. There are 171 slots per line, five of which are refresh so we get 166 bytes to VRAM or 161 words to CRAM/VSRAM per line (I apologize for the off-by-one error in my previous post, temporarily forgot that I was using zero based slot numbers in my code).

    224/240 lines are displayed, but the VDP is busy rendering for 225. The reason for this is that sprites are processed and rendered to a buffer a line early.
    I'll have to update my docs a bit then, that info is good !
    Now that I look at it, the H32 matches up with SMS speed * 2 and thus my observations. I have to take a closer look at SMS refresh behaviour, it uses PSRAM like memory for VRAM with address latches so VDP can mux address and data pins.

    Regarding sprites - They are prepared in HBL area, so wouldn't that mean that only part of the line is locked off (BG tiles are fetched and displayed on the fly) ? Or a line really gets wasted fully ?

    Quote Originally Posted by Black_Tiger View Post
    Does this mean that the SuperGrafx has restrictions for when it can transfer data?
    The SuperGrafx has 2 separate chips, so you will not have access limits; however, you've got to manage the same amount of VRAM bandwidth on 2 chips now.
    Last edited by TmEE; 10-07-2013 at 03:05 PM.

  11. #1766
    Road Rasher Mask of Destiny
    Join Date
    Apr 2013
    Location
    SF Bay Area, California
    Posts
    274
    Rep Power
    18

    Quote Originally Posted by kool kitty89 View Post
    Also, interesting on the sprite loading, I'd assumed that was done in hblank . . . wait, nevermind, for hblank you'd need to start a line early too. (start with trailing hblank from previous line and end with leading hblank on current line)
    Sprite rendering occurs in 3 phases: Scanning the cached half of the full sprite table for sprites that are on the current line. The first 20 matches (16 for H32) are recorded in an internal buffer somewhere for use in the second stage. The second stage involves reading the uncached portion of the sprite table entries for those 20 (or less) sprites. This is used to generate a list of tile lines that should be drawn to the line buffer. Stage three involves reading the appropriate tile lines and writing them to the appropriate locations in the line buffer.

    From the VDP's internal perspective a line begins at the start of the right border of the line that visually precedes it. Phase 3 of sprite rendering for the current line begins immediately at the beginning of the line (from the VDP's internal perspective, so in the right border of the previous line). 4 slots later Phase 1 of sprite rendering for the next line begins (this time is an estimate based on when rendering resumes after VBLANK) and continues in parallel with Phase 3 from the current line. Phase 2 for the next line is intermingled with background plane rendering during the active display of the current line.

    Quote Originally Posted by TmEE View Post
    Regarding sprites - They are prepared in HBL area, so wouldn't that mean that only part of the line is locked off (BG tiles are fetched and displayed on the fly) ? Or a line really gets wasted fully ?
    As mentioned above, the full sprite rendering process is actually spread out through the entire line. In theory, the VDP could turn all the slots that aren't relevant for the sprite rendering process into external access slots, but it doesn't. Presumably it wasn't considered worth the added complexity.

    As a small side note, that extra line of rendering actually starts 4 slots into the line (which is why I assume that's when the sprite table scan starts), but you don't actually get any benefit as the 4 slots you gain are lost at the end of the frame/field.

  12. #1767
    Mastering your Systems Shining Hero TmEE's Avatar
    Join Date
    Oct 2007
    Location
    Estonia, Rapla City
    Age
    28
    Posts
    10,063
    Rep Power
    107

    Ok, more good info !

    When I connected VRAM signals to RGB out as an experiment, I could see sprite pixels on the output 16 pixels before the right end of the visible area; I recall it was 8 sprite pixels per 1 screen pixel... I could also see BG tiles in the output 16 pixels before they were actually shown. I need to do that stuff again sometime !

    EDIT :

    I just wired a probe to one of the RGB signals and poked VRAM data lines :

    You can see the HUD and Sonic sprites' data on the right side, and you can also see BG tiles appearing before they are seen, with "empty" space between. I guess when there are more sprites around, their data is seen in those holes... gonna test this shortly.
    EDIT3 :


    *BG tiles are fetched 16 pixels before they are shown; scrolling will shift the time things end up on the screen
    *Each 16 pixels seem to contain tile fetches for both BGs and some more stuff which is likely the tilemap entry
    *Some of that is definitely related to sprites, as there is a change when you change the sprite with the A button. There are 20 zones, and you can only have 20 sprites per line.
    *BG tiles are only loaded on the active lines.
    *Right before the sprites there's some "garbage" which I guess relates to the sprite list, then follow the sprite tile fetches. My TV will not show the blanking, so I would have to use my scope to see what more happens, but that means one line at a time, maybe several but things are blurry then. I thought of using my capture card, but DScaler complains that overlays don't work, and it seems they really don't; I could not get them working...
    *In any case, sprite GFX data seems to be fetched exclusively in the blanking area.
    *GFX fetching takes a variable number of cycles; it does not always use up a fixed amount of time because of the next point:
    *Something to do with the sprite table itself happens right after the GFX data. I guess it's the caching part.
    *The first sprite GFX data is loaded one line before active display starts
    *Sprite data is loaded on the last active line as well, but there's no line coming to show it... so one line is being partially wasted for sure.
    *When a sprite goes into the top or bottom border, something happens on the right side in both top and bottom regarding data fetching.
    *Going into the left or right border will not change anything in the fetch area.
    *When there are no sprites on the line, the VDP will fetch the same sprite GFX as it did on the last line with sprites.
    Last edited by TmEE; 10-07-2013 at 08:36 PM.

  13. #1768
    Road Rasher
    Join Date
    Jun 2011
    Posts
    472
    Rep Power
    14

    Quote Originally Posted by roundwars View Post
    Now do PCE
    Cool! Now do the Super A'Can!!!...just kidding. Although, apparently it can do 256 sprites and 4 background layers, so the thing must have almost Neo Geo level 2D capabilities.
    This thread keeps getting more interesting! Thanks guys!

  14. #1769
    Road Rasher Mask of Destiny
    Join Date
    Apr 2013
    Location
    SF Bay Area, California
    Posts
    274
    Rep Power
    18

    The timing of what the VDP does in a line is already well documented by Nemesis here and by me here. It's much easier with a logic analyzer

    Quote Originally Posted by TmEE View Post
    *BG tiles are fetched 16 pixels before they are shown, scrolling will shift the time things end up on the screen
    Essentially the way scrolling works is that the VDP actually renders 42 columns and has a small buffer for column data. The coarse portion of the hscroll value is used for a column offset and the fine portion is to figure out what portion of the buffer and current bg data should be used. This is why the left portion of the screen gets screwed up when you use the 2-cell vscroll mode (there's no vscroll data for that extra pair of columns) and why the window bug exists.

    Quote Originally Posted by TmEE View Post
    *Each 16 pixels seem to contain tile fetches for both BGs and some more stuff which is likely the tilemap entry
    *Some of that is definitely related to sprites as there is change as you change sprite with A button. There's 20 zones, and you can only have 20 sprites per line.
    The sequence is:
    Plane A Name table entry
    Refresh | Sprite Tile data (first 2 columns only) | External access slot
    Plane A tile row
    Plane A tile row
    Plane B Name table entry
    Sprite attribute table | Sprite Tile data (first 2 columns only)
    Plane B tile row
    Plane B tile row

    Each of those lines represents a 4-byte serial read from VRAM (except for refresh and the external access slot, which is a single-byte parallel read or write). Each takes 4 cycles of SC, which is double the pixel clock.

    There are 21 of these, but one of them reads the final bit of sprite tile data rather than refresh/external access/sprite attribute table data.
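
    As a rough cross-check of the numbers in this thread, the pattern above can be written out in Python. This is only a sketch: the constant and function names are invented, and the 8-pixel cell width is an assumption based on the standard 8x8 tiles rather than something stated in this post.

```python
# The 8-slot sequence listed above, one entry per VRAM access slot.
SLOT_PATTERN = [
    "Plane A name table entry",
    "Refresh / sprite tile data / external access slot",
    "Plane A tile row",
    "Plane A tile row",
    "Plane B name table entry",
    "Sprite attribute table / sprite tile data",
    "Plane B tile row",
    "Plane B tile row",
]

SC_CYCLES_PER_SLOT = 4    # each slot takes 4 SC cycles
SC_CYCLES_PER_PIXEL = 2   # SC runs at double the pixel clock
PATTERN_REPEATS = 21      # "There are 21 of these"
CELL_WIDTH_PIXELS = 8     # assumption: 8-pixel-wide cells


def pattern_geometry():
    """Return (pixels_per_slot, pixels_per_pattern, columns_rendered)."""
    pixels_per_slot = SC_CYCLES_PER_SLOT // SC_CYCLES_PER_PIXEL
    pixels_per_pattern = len(SLOT_PATTERN) * pixels_per_slot
    columns_rendered = PATTERN_REPEATS * pixels_per_pattern // CELL_WIDTH_PIXELS
    return pixels_per_slot, pixels_per_pattern, columns_rendered


if __name__ == "__main__":
    print(pattern_geometry())
```

    Under those assumptions one pass through the pattern covers 16 pixels, and 21 passes cover 42 columns, which lines up with the 16-pixel groups and the 42 rendered columns mentioned in this post.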

    Quote Originally Posted by TmEE View Post
    *Right before the sprites there's some "garbage" which I guess relates to the sprite list, then follow the sprite tile fetches. My TV will not show the blanking, so I would have to use my scope to see what more happens, but that means one line at a time, maybe several but things are blurry then. I thought of using my capture card, but DScaler complains that overlays don't work, and it seems they really don't; I could not get them working...
    The only thing between the end of the BG rendering and the beginning of the sprite tile fetch is two external access slots (8 SC cycles total, 4 pixels). There are a few things in the middle of the sprite tile data fetch though. There are 2 more external slots and there's the read of the hscroll data, and then the final two sprite tile rows are read in the middle of that extra BG column pair.

    Quote Originally Posted by TmEE View Post
    *in any case sprite GFX data seems to be fetched exclusively in blanking area.
    Depends on what you mean by blanking area. If you mean the border area + HSYNC, then yes.

    Quote Originally Posted by TmEE View Post
    *GFX fetching takes a variable number of cycles; it does not always use up a fixed amount of time because of the next point:
    *Something to do with the sprite table itself happens right after the GFX data. I guess it's the caching part.
    Sprite table caching gets done on write. If you change the sprite attribute table address, the cache is not updated. However, only half of each entry is cached. The other half of the entries for the 20 sprites on the appropriate line are read in the middle of doing the BG rendering.
    Last edited by Mask of Destiny; 10-09-2013 at 01:17 PM.

  15. #1770
    Hero of Algol kool kitty89's Avatar
    Join Date
    Mar 2009
    Location
    San Jose, CA
    Age
    29
    Posts
    9,724
    Rep Power
    62

    Quote Originally Posted by TmEE View Post
    SNES DMA is always MCLK / 8, regardless of ROM speed setting.

    I also got some info that the fastest way to push data to VRAM on the PCE takes 5 cycles per byte, but that way doubles all the data size requirements. The normal block-move way takes 7 cycles per byte. So:

    ~23863 bytes per frame
    ~91 bytes per line

    or

    ~17045 bytes per frame
    ~65 bytes per line
    Hmm, so the HuC6280 is slower at block moves than the MD's 68000. (20 cycles per 32 bits for block copy iirc, or 5 cycles per byte)


    The SuperGrafx has 2 separate chips, so you will not have access limits; however, you've got to manage the same amount of VRAM bandwidth on 2 chips now.
    Yep . . . more space to buffer animation in VRAM though, but that much more to update for games with dynamic animation. (OTOH, compared to games slaving much of the bandwidth to dynamic tiles, you'd save a LOT there with the hardware BG+sprite additions of the 2nd VDP) Having a DMA engine that could max out VRAM bandwidth capacity would have been way more powerful still. (albeit, if there was no bus priority control, you'd have similar issues with bus contention as the MD does for DMA, including problems with software driven sample playback routines . . . albeit, had the SGX added DMA for PCM as well, you'd solve that too)




    Quote Originally Posted by Mask of Destiny View Post
    Sprite rendering occurs in 3 phases: Scanning the cached half of the full sprite table for sprites that are on the current line. The first 20 matches (16 for H32) are recorded in an internal buffer somewhere for use in the second stage. The second stage involves reading the uncached portion of the sprite table entries for those 20 (or less) sprites. This is used to generate a list of tile lines that should be drawn to the line buffer. Stage three involves reading the appropriate tile lines and writing them to the appropriate locations in the line buffer.

    From the VDP's internal perspective a line begins at the start of the right border of the line that visually precedes it. Phase 3 of sprite rendering for the current line begins immediately at the beginning of the line (from the VDP's internal perspective, so in the right border of the previous line). 4 slots later Phase 1 of sprite rendering for the next line begins (this time is an estimate based on when rendering resumes after VBLANK) and continues in parallel with Phase 3 from the current line. Phase 2 for the next line is intermingled with background plane rendering during the active display of the current line.
    Hmm, interesting. The line buffers don't span an entire screen's width of pixels, I assume (NOT like the 7800/Panther/Jaguar), but are shorter, more limited buffers. (otherwise it seems like there'd be a lot more potential flexibility there)


    With that much buffering, it seems like optimizing around cheap, commodity FPM DRAM would have made a lot of sense over the VRAM they used. (at 16 bits wide it should have had similar peak bandwidth to the dual-port 8-bit VRAM, but cheaper)

    Then again, there's the same strangeness for Atari using SRAM for the Panther when the line buffer oriented object processor would have allowed relatively efficient use of FPM DRAM.
