Thread: Barone's Hacking Corner

  1. #31
    Road Rasher Bibin's Avatar
    Join Date
    Jun 2010
    Posts
    437
    Rep Power
    29

    Default

    It's great to see an organized list of planned and in-progress works.

    Would there be any interest in doing some instrument replacement hacks for Super Street Fighter 2? With the great sound driver and color hacks we have, and the backdrop replacement in progress, the lackluster music work would be the next target.

  2. #32
    VA1LT CHIP ENABLED Master of Shinobi OverDrone's Avatar
    Join Date
    Jun 2012
    Posts
    1,534
    Rep Power
    42

    Default

    Rep must be spread...

    Great initiative. Unbelievable that the Tengen version of Marble Madness is harder than the arcade; I didn't know that.

    I think there are some deserving games on that list, but if I could add a suggestion it would be to reinstate the sampled hit FX from the arcade in Double Dragon - the effect is weedy as hell in the retail release, and pretty much kills any impact the game could have. The palette is enhanced already and there is a hack that apparently makes the movement speed more like the arcade, so yeah.

    Not sure about the ROM space available for such an endeavour; this port is only 512k, isn't it?

  3. #33
    Hero of Algol
    Join Date
    Aug 2010
    Posts
    7,611
    Rep Power
    168

    Default

    Quote Originally Posted by Bibin View Post
    It's great to see an organized list of planned and in-progress works.

    Would there be any interest in doing some instrument replacement hacks for Super Street Fighter 2? With the great sound driver and color hacks we have, and the backdrop replacement in progress, the lackluster music work would be the next target.
    That could be cool but I'd like to team-up with someone who actually knows about FM Synth composing, 'cause I know shit.


    Quote Originally Posted by OverDrone View Post
    Rep must be spread...

    Great initiative. Unbelievable that the Tengen version of Marble Madness is harder than the arcade; I didn't know that.

    I think there are some deserving games on that list, but if I could add a suggestion it would be to reinstate the sampled hit FX from the arcade in Double Dragon - the effect is weedy as hell in the retail release, and pretty much kills any impact the game could have. The palette is enhanced already and there is a hack that apparently makes the movement speed more like the arcade, so yeah.

    Not sure about the ROM space available for such an endeavour; this port is only 512k, isn't it?
    Hmm, yes, DD is a very very good suggestion. It's almost there with all the community updates but the sfx still ruin it.
    ROM space is no problem for that one.




    Guys, here's an interesting and unexpected by-product of my tests with Super Hang-On:
    https://drive.google.com/file/d/0B0c...ew?usp=sharing

    The link above contains an .IPS patch to be applied to the Super Hang-On (W) (REV 00) ROM (it's very easy to find it).
    It will remove ALL elevation changes of ALL tracks (it actually broke the calc loop but the game runs fine).
    It's enough to run the game at 30 fps most of the time on Kega Fusion, which is inaccurate as fuck and whose CPU power would be roughly equivalent to a 68000 clocked at 9 MHz instead of 7.67 MHz like it should be.

    On real hardware it's still not enough to unleash the 30 fps beast, but it will probably give you less severe slowdown throughout the game.
    A bit more of a speed-up should be noticeable on a 68010-equipped Mega Drive.

    Obviously it's not my goal to have the game broken like this but it's a proof of concept IMO:
    - Kega Fusion IS inaccurate, quit arguing yo ROM'ers.
    - The elevation change calculation is the major obstacle to achieving 30 fps.

    OTOH, I've found other loops using MULS and DIVS instructions in situations where they could be skipped, so there's a lot of room for optimization IMO.
    However, I must say, the original developers probably didn't have a choice. With all the alternative branches I'll need to add in order to try to speed up the game, I believe it wouldn't fit in a 4 Mbit cartridge. If that was a hard limit for them, then it could be a valid excuse. A 5 Mbit cartridge would have been enough, though, from what I've seen so far.
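
    For anyone who would rather script the patching step than use a GUI tool, here is a minimal sketch of an IPS applier. It assumes the commonly documented IPS layout (a `PATCH` header, records made of a 3-byte offset, a 2-byte size and the data, an RLE record when the size is 0, and an `EOF` trailer). The function name `apply_ips` is hypothetical and not tied to any existing tool.

```python
def apply_ips(rom: bytes, patch: bytes) -> bytes:
    """Apply an IPS patch to a ROM image and return the patched bytes."""
    if patch[:5] != b"PATCH":
        raise ValueError("not an IPS patch (missing PATCH header)")
    out = bytearray(rom)
    pos = 5
    while True:
        chunk = patch[pos:pos + 3]
        if len(chunk) < 3:
            raise ValueError("truncated patch: EOF marker not found")
        if chunk == b"EOF":
            break
        offset = int.from_bytes(chunk, "big")
        size = int.from_bytes(patch[pos + 3:pos + 5], "big")
        pos += 5
        if size == 0:
            # RLE record: 2-byte run length followed by a single fill byte
            run = int.from_bytes(patch[pos:pos + 2], "big")
            data = patch[pos + 2:pos + 3] * run
            pos += 3
        else:
            data = patch[pos:pos + size]
            pos += size
        end = offset + len(data)
        if end > len(out):
            # Grow the image if a record writes past the current end
            out.extend(b"\x00" * (end - len(out)))
        out[offset:end] = data
    return bytes(out)
```

    Usage would be along the lines of reading the ROM and the patch as bytes, calling `apply_ips(rom, patch)`, and writing the result to a new file so the original dump stays untouched.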
    Last edited by Barone; 09-08-2015 at 05:04 AM.

  4. #34
    Bite my shiny, metal ***! Hero of Algol retrospiel's Avatar
    Join Date
    Mar 2008
    Location
    Cologne, FRG
    Posts
    7,816
    Rep Power
    91

    Default

    It's really nice to see games like Super Hang On, Metal Fangs and Fastest One on your list. You're probably the only one who would ever even consider hacking them, so I'm really looking forward to your work on those.


    Edit:
    Quote Originally Posted by Barone View Post
    Obviously it's not my goal to have the game broken like this but it's a proof of concept IMO [...] The elevation change calculation is the major obstacle to achieving 30 fps.

    OTOH, I've found other loops using MULS and DIVS instructions in situations where they could be skipped, so there's a lot of room for optimization IMO.
    That sounds amazing! - Maybe you could release the final patch in two versions, one with elevations and one without so people can choose what version they prefer.

    Either way, I can't wait to try the version you uploaded.
    Last edited by retrospiel; 09-08-2015 at 07:57 AM.
    The Mega Drive was far inferior to the NES in terms of diffusion rate and sales in the Japanese market, though there were ardent Sega users. But in the US and Europe, we knew Sega could challenge Nintendo. We aimed at dominating those markets, hiring experienced staff for our overseas department in Japan, and revitalising Sega of America and the ailing Virgin group in Europe.

    Then we set about developing killer games.

    - Hayao Nakayama, Mega Drive Collected Works (p. 17)

  5. #35
    Road Rasher Bibin's Avatar
    Join Date
    Jun 2010
    Posts
    437
    Rep Power
    29

    Default

    Quote Originally Posted by Barone View Post
    That could be cool but I'd like to team-up with someone who actually knows about FM Synth composing, 'cause I know shit.
    If you can find where to pull instruments from / where to put them back, I would have a stab at improving the patches Capcom used.

  6. #36
    Master of Shinobi Alianger's Avatar
    Join Date
    Sep 2005
    Location
    Sweden
    Age
    34
    Posts
    1,603
    Rep Power
    41

    Default

    Cool, really looking forward to some of these!
    I recently saw a let's play of MD Battletoads by Austin Mackert where he compares it to the NES version in detail. It might be of use to you.
    https://www.youtube.com/watch?v=eUu2U31xsTI

    Side by side vid by the same guy:
    https://www.youtube.com/watch?v=MGh9KKEi6yA
    On a wave of mutilation

    https://minirevver.weebly.com/

  7. #37
    Comrade as in friend. Master of Shinobi ComradeOj's Avatar
    Join Date
    Dec 2012
    Location
    New Mexico, USA
    Age
    24
    Posts
    1,321
    Rep Power
    56

    Default

    @Barone

    Nice! I just tried out the patched ROM on my 10 MHz Genesis. It definitely does seem a bit faster.

    It's enough to run the game at 30 fps most of the time on Kega Fusion, which is inaccurate as fuck and whose CPU power would be roughly equivalent to a 68000 clocked at 9 MHz instead of 7.67 MHz like it should be.
    This. I never knew how much faster Fusion's emulated 68k is compared to the real one until I started writing software for the Genesis. I did some benchmarks a while back with various emulators.

    Here's how Fusion stacked up:
    Platform                        Score (higher is better)
    Real hardware (10 MHz 68010)    7104
    AtGames handheld                7009
    Real hardware (10 MHz 68000)    6756
    KEGA Fusion                     6333
    Regen                           5300
    GENS                            5271
    Real hardware (stock CPU)       5220
    Exodus                          5174
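
    To put those scores in perspective, they can be normalized against the stock CPU. The sketch below only reuses the numbers from the table. The "equivalent clock" figure assumes the benchmark scales linearly with 68000 clock speed, which is an assumption and not something the benchmark guarantees.

```python
STOCK_MHZ = 7.67  # stock 68000 clock quoted earlier in the thread

scores = {
    "Real hardware (10 MHz 68010)": 7104,
    "AtGames handheld": 7009,
    "Real hardware (10 MHz 68000)": 6756,
    "KEGA Fusion": 6333,
    "Regen": 5300,
    "GENS": 5271,
    "Real hardware (stock CPU)": 5220,
    "Exodus": 5174,
}

stock = scores["Real hardware (stock CPU)"]

# Speed of each platform relative to a stock Mega Drive
relative = {name: score / stock for name, score in scores.items()}

# Rough clock a real 68000 would need to match each score (linear-scaling assumption)
equivalent_mhz = {name: STOCK_MHZ * ratio for name, ratio in relative.items()}

for name, ratio in sorted(relative.items(), key=lambda item: -item[1]):
    print(f"{name:<30} {ratio:.2f}x  ~{equivalent_mhz[name]:.1f} MHz")
```

    Under that assumption Fusion lands at about 1.21x stock, which is in the same ballpark as the "roughly 9 MHz" estimate quoted above.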

    I think you've already seen it, but I put up a clip of REV00 Super Hang-On running on a 7.67 MHz 68010. Maybe it will be helpful in seeing what the game would be like with more optimized MUL/DIV routines running on a stock system. It shows some hills and the beeping checkpoint, which seems faster when overclocking. https://youtu.be/wAJ0Dsaerq0?t=16s

    Great job! Hopefully you can get it optimized, while keeping the elevation changes.
    Modded consoles:
    Master System (v7040) with s-video & direct AV out
    Model 1 with 10mhz overclock & halt switches
    Model 1 with 10mhz 68010
    Model 2 VA2.3 with unfiltered Mega Amp, & s-video
    Model 3 VA1 with compatibility fixes & s-video
    32X with s-video
    Visit my web site at www.mode5.net
    Or my collection of homebrew Genesis games, programs, and music on SEGA-16!

  8. #38
    Wildside Expert RedAngel's Avatar
    Join Date
    May 2006
    Location
    Spain
    Age
    38
    Posts
    120
    Rep Power
    18

    Default

    I was amazed the first time I saw Super HangOn on a MD2 slowing down like Messi doing maths.

    These guys from gamehacking.org made some really nice Game Genie fix codes for Mega Man: The Wily Wars, like a fix for the delay when Mega Man starts to walk and a faster bullet speed so it plays like the original games:

    http://gamehacking.org/vb/threads/13...Wily-Wars-(MD)
    I have tested it and now the game plays 100X better, fucking Crapcom.

  9. #39
    rent a hero! Road Rasher maxxfarras's Avatar
    Join Date
    Feb 2009
    Location
    Hermosillo Mexico
    Posts
    302
    Rep Power
    22

    Default

    Very nice findings about Super Hang-On, Barone! This is really exciting.
    I can't wait to see the final result of your optimizations for this game.

    Good luck with all these great projects.

    Edit:
    Bibin: if you and Barone can improve the music of SSF2, that could be really amazing!
    We have to take control of the chaos.

  10. #40
    Hero of Algol
    Join Date
    Aug 2010
    Posts
    7,611
    Rep Power
    168

    Default

    Quote Originally Posted by Bibin View Post
    If you can find where to pull instruments from / where to put them back, I would have a stab at improving the patches Capcom used.
    Thanks a lot for the prompt offer. I'll let you know if/when I get the addresses of the whole thing.



    Some updates about Super Hang-On:
    - I have like 3 different versions, one of them with lots of branches and all the shit in order to try to avoid unnecessary calculations, but none of them gave the result I'd like. The problem is that all of those alternate versions end up worsening the cycle timing of the worst-case scenario, and the worst-case scenario seems to be quite common, if not the most common one; so any speed-up at its expense, even if it saves 200+ cycles in some cases, is not enough to reduce slowdown on a stock machine.

    - It would be awesome, an impossible dream, if we had something like a performance analyzer for the Mega Drive; it would make it much easier to pinpoint which parts of the code are the real bottlenecks and which changes have a positive effect overall. But that's just a vague dream.

    - Some small optimizations can be performed in the main loop of the game without worsening the worst-case scenarios and most expensive loops; it will be necessary to replace the list of JSR instructions (and their respective RTS instructions) (1 JSR + 1 RTS = 32 cycles) with LEA and JMP pairs (1 LEA (An) + 1 JMP xxx.W (to go) + 1 JMP (An) (to return) = 22 cycles or 1 LEA (An) + 1 JMP xxx.L (to go) + 1 JMP (An) (to return) = 24 cycles). It's a very small gain and it's only applicable in certain situations.
    However, the main loop has a lot of those, so it may be worth implementing in order to make the frame rate a bit more consistent on a 10 MHz overclocked MD, for example.

    - The main issue seems to be the multiplication and division instructions being used all the time. And it's very very difficult to find a workaround for their usage because those calcs are using signed values with HUGE numbers and with both numbers varying most of the time.
    Even the idea of using a table with pre-calculated values seems to be NOT feasible here given the number of different calcs and possible results I can have.
    I don't want to even try to simplify the engine in terms of reducing the precision of the calcs because it would hurt the core of its design IMO, so it would make no sense to me.

    - The rest of the logic-related code seems to be very well optimized and structured so IDK what to do about it at this moment. But I'll keep trying to improve it, there's still a good chunk of the code to be explored and analyzed.
    But to reach 30 fps on a stock MD preserving the precision of the original engine seems very, very unlikely to me now.
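
    Two of the points above can be put into rough numbers. The sketch below takes the cycle costs exactly as quoted in this post (it does not re-derive them from the 68000 timing tables), and the figure of 40 calls per frame is purely hypothetical. The table estimate assumes both operands can span the full 16-bit range that MULS accepts and that each result is stored as a 32-bit value.

```python
# Cycle costs as quoted in the post
JSR_RTS = 32      # 1 JSR + 1 RTS
LEA_JMP_W = 22    # LEA + JMP xxx.W + JMP (An)
LEA_JMP_L = 24    # LEA + JMP xxx.L + JMP (An)


def cycles_saved(calls: int, old: int = JSR_RTS, new: int = LEA_JMP_W) -> int:
    """Cycles saved per frame when `calls` subroutine calls are converted."""
    return calls * (old - new)


# Hypothetical: 40 converted calls per pass through the main loop
print(cycles_saved(40))                  # short (xxx.W) jumps
print(cycles_saved(40, new=LEA_JMP_L))   # long (xxx.L) jumps

# Why a full precomputed multiplication table is out of reach:
# every pair of 16-bit operands, 4 bytes per 32-bit product.
full_table_bytes = (2 ** 16) ** 2 * 4
cart_bytes = 4 * 1024 * 1024 // 8        # a 4 Mbit cartridge
print(full_table_bytes // cart_bytes)    # how many 4 Mbit carts the table would fill
```

    Even with generous assumptions the call conversion only recovers a few hundred cycles per frame, while a complete table would be thousands of times larger than the cartridge, which fits the conclusion that neither trick fixes the MULS/DIVS cost on its own.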




    Well, thanks to the SHO thing I've learned quite a bit about 68000 instruction timing, and I decided to have a look at "The Corporation" since its frame rate is really abysmal and the game is still mostly unplayable even on a 12 MHz overclocked MD.
    It does use a lot of multiplication and division instructions but this one seems to have some non-optimized shit going on:
    - It seems to have way too many subroutines and the vast majority of them using JSR and RTS instructions even when those instructions could be replaced by less expensive stuff.
    - MOVE.L being used where MOVEQ is *clearly* applicable.
    - Lots of data manipulation from memory to memory instead of using registers (which is like the #1 rule of 68000 programming).

    So yeah, I'm giving this one a try and, who knows, maybe I'll have better luck this time.

  11. #41
    Hero of Algol
    Join Date
    Aug 2010
    Posts
    7,611
    Rep Power
    168

    Default

    I've made good progress on Corporation so far:
    Code:
    Corporation (US version) - Optimization Notes
    
    #1 For the loop originally beginning at 4C86 and ending at 4D07
    
    1.1)
    From:
    MOVE.l #$00000064, D0  (Hex opcode: 20 3C 00 00 00 64) (Cost: 12 cycles)
    
    To:
    MOVEQ  #$00000064, D0  (Hex opcode: 70 64)             (Cost:  4 cycles)
    ------------------------------------------------------------------------
    Saved: 8 cycles | 4 bytes 
    
    
    1.2)
    From:
    ADDI.w #$0046, D1     (Hex opcode: 06 41 00 46) (Cost: 8 cycles) 
    ADDI.w #$0046, D6     (Hex opcode: 06 46 00 46) (Cost: 8 cycles)
    
    To:
    MOVEQ #$00000046, D0  (Hex opcode: 70 46)       (Cost: 4 cycles)
    ADD.w D0, D1          (Hex opcode: D2 40)       (Cost: 4 cycles)
    ADD.w D0, D6          (Hex opcode: DC 40)       (Cost: 4 cycles)
    ----------------------------------------------------------------
    Saved: 4 cycles | 2 bytes 
    
    
    1.3)
    From:
    JSR $00004C86 (Hex opcode: 4E B9 00 00 4C 86) (Cost: 20 cycles)
    ...
    RTS           (Hex opcode: 4E 75)             (Cost: 16 cycles)
    
    To:
    JMP $000xxxxx (Hex opcode: 4E F9 00 0x xx xx) (Cost: 12 cycles) ; xxxxx means target address so I've made one routine copy for each call, dirty but efficient
    ...
    JMP $0000xxxx (Hex opcode: 4E F9 00 00 xx xx) (Cost: 12 cycles) ; xxxx means return address so I've made one routine copy for each call, dirty but efficient
    --------------------------------------------------------------------------------------------------------------------------------------------------------------
    Saved: 12 cycles
    
    Where:
    Target#1: 4E F9 00 0E B3 00 
    Return#1: 4E F9 00 00 CF 3E
    
    Target#2: 4E F9 00 0E B3 90 
    Return#2: 4E F9 00 00 D1 4E
    
    Target#3: 4E F9 00 0E B4 20 
    Return#3: 4E F9 00 00 D1 7A
    
    Target#4: 4E F9 00 0E B4 B0 
    Return#4: 4E F9 00 00 D3 74
    
    Target#5: 4E F9 00 0E B5 40 
    Return#5: 4E F9 00 00 D3 B2
    
    Target#6: 4E F9 00 0E B5 D0 
    Return#6: 4E F9 00 00 D3 F8
    
    Target#7: 4E F9 00 0E B6 60 
    Return#7: 4E F9 00 00 D4 38
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------
    Total saved in each iteration of this loop: 24 cycles
    Works on real hardware, yes.
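
    The Target/Return byte strings in the notes all start with 4E F9, the 68000 encoding of JMP with an absolute long address, so they can be sanity-checked mechanically. The sketch below decodes the seven targets, checks how far apart the routine copies sit, and re-adds the per-step savings. The helper `decode_jmp` is a hypothetical name; the byte strings and cycle figures are copied from the notes.

```python
JMP_ABS_L = bytes.fromhex("4EF9")  # opcode word for JMP xxx.L


def decode_jmp(hex_bytes: str) -> int:
    """Return the destination address of a 6-byte JMP xxx.L instruction."""
    raw = bytes.fromhex(hex_bytes)  # fromhex ignores the spaces
    if len(raw) != 6 or raw[:2] != JMP_ABS_L:
        raise ValueError("not a JMP xxx.L instruction")
    return int.from_bytes(raw[2:], "big")


targets = [
    "4E F9 00 0E B3 00", "4E F9 00 0E B3 90", "4E F9 00 0E B4 20",
    "4E F9 00 0E B4 B0", "4E F9 00 0E B5 40", "4E F9 00 0E B5 D0",
    "4E F9 00 0E B6 60",
]
addresses = [decode_jmp(t) for t in targets]

# Distance between consecutive routine copies
strides = {b - a for a, b in zip(addresses, addresses[1:])}

# Savings per step, as listed in the notes (cycles)
savings = {"1.1 MOVEQ": 8, "1.2 ADD via D0": 4, "1.3 JMP pair": 12}

print([f"${a:06X}" for a in addresses])
print({hex(s) for s in strides})
print(sum(savings.values()))
```

    The copies come out evenly spaced, and the three savings add up to the 24 cycles per iteration stated in the notes.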
    Last edited by Barone; 09-19-2015 at 10:07 AM.

  12. #42
    Raging in the Streets goldenband's Avatar
    Join Date
    Dec 2009
    Posts
    4,385
    Rep Power
    90

    Default

    ^Wow! Awesome work you're doing, Barone. Are you able to see tangible results (i.e. any change in the apparent framerate) onscreen yet?

  13. #43
    Hero of Algol
    Join Date
    Aug 2010
    Posts
    7,611
    Rep Power
    168

    Default

    Quote Originally Posted by goldenband View Post
    ^Wow! Awesome work you're doing, Barone.
    Thanks man! I'm really happy with what I'm seeing here.


    Quote Originally Posted by goldenband View Post
    Are you able to see tangible results (i.e. any change in the apparent framerate) onscreen yet?
    24 cycles is way too small of a gain to impact the overall performance in a noticeable way.
    BUT:
    - 24 cycles in such a short loop like that one is a very good improvement IMO.
    - The code structure of this game is WAAAAAAY better in terms of allowing changes than SHO's; which means that this version of Corporation has very loose and sloppy coding in it.
    - These gains are real gains for the most part, I mean, no extra overhead was added to the original code like I had to do with SHO; so every change I'm doing has only benefits.
    - This loop I've optimized is like the 6th or 7th most often called and it's very, very short. There are lots of longer and more important (performance-wise) parts of the code to be optimized.
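
    To give a feel for why 24 cycles alone is not visible on screen, here is a back-of-the-envelope sketch. It assumes the 7.67 MHz stock clock mentioned earlier in the thread and a nominal 60 Hz refresh, which is an approximation. How many times the loop actually runs per frame is not known here, so the function only answers the reverse question.

```python
import math

CPU_HZ = 7_670_000   # stock 68000 clock (7.67 MHz)
FRAME_HZ = 60        # nominal NTSC refresh rate, an approximation


def cycles_per_frame(cpu_hz: int = CPU_HZ, frame_hz: int = FRAME_HZ) -> float:
    """CPU cycles available during one video frame."""
    return cpu_hz / frame_hz


def iterations_for_share(share: float, saved_per_iteration: int = 24) -> int:
    """Loop iterations per frame needed for the saving to reach `share` of a frame."""
    return math.ceil(share * cycles_per_frame() / saved_per_iteration)


print(round(cycles_per_frame()))     # cycle budget of a single frame
print(iterations_for_share(0.01))    # iterations needed to win back 1% of a frame
```

    In other words, the loop would need to run dozens of times per frame before this one change recovers even 1% of the frame budget, which is why the bigger routines matter more.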

    Let's see what I can achieve in the next day.
    Last edited by Barone; 09-19-2015 at 10:06 AM.

  14. #44
    Smith's Minister of War Hero of Algol Kamahl's Avatar
    Join Date
    Jan 2011
    Location
    Belgium
    Age
    29
    Posts
    8,423
    Rep Power
    137

    Default

    This is all really interesting, well done Barone. Rep given.
    This thread needs more... ENGINEERS

  15. #45
    Outrunner Wesker's Avatar
    Join Date
    Feb 2006
    Posts
    637
    Rep Power
    28

    Default

    "You must spread some Reputation around before giving it to Barone again."

    I didn't know you were going to dig into Corporation. The Core Design FPS which is also known as Cyber-Cop, right? What are you aiming to do with this one, improving the fps count and such?

    By the way, would you be interested in taking a look at Universal Soldier and trying to retool it back to the original intended Turrican II conversion as much as possible like it was discussed here?
