Some interesting forum posts I found just now.
Since KGL completely lacks any lighting code, I decided to take a swing at it.
I have made an implementation of dynamic vertex lighting based on some resources provided by nVidia, although those are written for OpenGL.
To accelerate things, I have used the SH4's fast math functions where possible.
Here is a basic scene I wrote for testing; some cylindrical and rectangular columns.
In this image, the scene is lit with one diffuse light, and a low level of ambient light:
This image is the same scene with the same diffuse light, except now the specular term is also being calculated:
Same scene, now with a high level of ambient light:
So, I have the basic lighting model up and running, but it leaves one thing to be desired: shadows that interact with the lighting.
My question is, how do we use the PVR to achieve shadows?
I know there is a modifier volume that can be set to "PVR_MODIFIER_CHEAP_SHADOW", but I really don't know how to use modifier volumes.
Any info appreciated!
BTW this guy seems to have achieved some very nice effects with PVR shadows:
Small update, added attenuation calculations, and posted a short clip on youtube:
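For anyone curious, distance attenuation is usually the standard constant/linear/quadratic falloff. A minimal sketch; the coefficients are illustrative, not the ones from the clip:

```c
/* Classic constant/linear/quadratic light attenuation:
   att = 1 / (kc + kl*d + kq*d^2), clamped so a nearby light
   never exceeds full intensity. */
static float attenuate(float dist, float kc, float kl, float kq)
{
    float att = 1.0f / (kc + kl * dist + kq * dist * dist);
    return att > 1.0f ? 1.0f : att;
}
```

The result just scales the diffuse and specular terms per vertex.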
I spent quite a lot of time on that one, and I guess this is going to be
my favorite platform for trying things out, or just for aimless coding hours.
I tried several times writing a 3D Graphics Engine for Dreamcast.
The very first version was written in C, but I rapidly switched to C++ (and a lot of assembly...).
It now consists of asynchronous file loaders, simple thread support, a scene graph renderer
and a strange multi texturing material system that needs a redesign.
The graphics data is created with Cinema 4D and exported with a custom plugin.
I am currently thinking about a realtime plugin for Cinema that sends modification
parameters for the running scene over to the Dreamcast target, so you can change things in Cinema
and see the changes on the Dreamcast in realtime.
Below are some screenshots and some descriptions of my work so far.
On the Sega Dreamcast you must clip all polygons at least against the z-near plane, which can break up all the nicely flowing TnL code. Also, since you're working with triangle strips most of the time, you somehow have to take care of breaking a strip up into single polygons. The right picture shows some strange mipmapping behaviour on clipped polygons that I haven't figured out yet.
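The near-plane clip itself is a small Sutherland-Hodgman pass per triangle; the annoying part is that a clipped triangle can come back as a quad, which is exactly what breaks the strip submission. A sketch, with my own vertex struct rather than the engine's:

```c
typedef struct { float x, y, z; } vtx;

static vtx lerp_v(vtx a, vtx b, float t)
{
    vtx r = { a.x + (b.x - a.x) * t,
              a.y + (b.y - a.y) * t,
              a.z + (b.z - a.z) * t };
    return r;
}

/* Clip one triangle against z >= znear (Sutherland-Hodgman).
   Writes up to 4 vertices into out and returns the count:
   0 (fully culled), 3, or 4 (a quad when exactly one vertex
   lies behind the plane). */
static int clip_znear(const vtx tri[3], float znear, vtx out[4])
{
    int n = 0;
    for (int i = 0; i < 3; i++) {
        vtx a = tri[i], b = tri[(i + 1) % 3];
        int ain = a.z >= znear, bin = b.z >= znear;
        if (ain) out[n++] = a;
        if (ain != bin) {           /* edge crosses the plane */
            float t = (znear - a.z) / (b.z - a.z);
            out[n++] = lerp_v(a, b, t);
        }
    }
    return n;
}
```

In a real pipeline the same interpolation has to be applied to UVs and colours too, which is where the mipmap selection on clipped polygons can go wrong.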
Before we can do some stencil-based volume shadows we first must find the object outlines and then extrude a shadow volume with some special polygons. Stencil shadows on the Dreamcast are quite easy to do, but they have their drawbacks. The biggest disadvantage is that you have only one type of volume, which tells the other polygons whether they are inside a volume or not. However, the results tend to look quite impressive...
...until you quickly run into some problems. First, you can have as many volumes inside the scene as you want. But the polygons of one volume must not cross, i.e. it must be convex. As far as I could figure out there are only two solutions to the torus problem you can see above. Either you break up the torus into multiple segments and let each segment cast a different volume, or you create a different volume for each backfacing polygon, which tends to slow down the rendering on the graphics chip a lot.
And there is one more major problem. Notice the wrong shadowing in the middle of the picture. This happens when the camera enters a shadow volume. The polygons of the originally closed volume are clipped against the z-near plane, and the volume is opened. There is a nice solution to that famous capping problem by John Carmack, but it works only with an 8-bit stencil buffer. On the Dreamcast we have some kind of what I call a 1.5-bit stencil buffer: for one volume object there is only 1 bit of stencil, so you're either in or out. But you can have multiple overlapping separated volumes without any problems. So what we have to do is generate capping polygons at the z-near plane in order to close the clipped volume.
I wrote some cheap algorithm that generates capping polygons based on the clipping information. It does not work for all arbitrary meshes but it turned out to be fine in normal cases. You usually don't have such strange volume meshes like spheres or tori, but rather cylinders and stuff.
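For completeness, the outline-finding step mentioned earlier can be sketched as follows: an edge belongs to the silhouette when the two triangles sharing it disagree about facing the light. This is a brute-force O(n²) version with my own data layout, purely illustrative; a real engine would use precomputed edge adjacency:

```c
typedef struct { float x, y, z; } vec3;
typedef struct { int v[3]; } tri;

static vec3 v_sub(vec3 a, vec3 b)
{ vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }

static vec3 v_cross(vec3 a, vec3 b)
{
    vec3 r = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
    return r;
}

static float v_dot(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* A triangle faces the light if the light lies on the front
   side of its plane. */
static int faces_light(const vec3 *verts, tri t, vec3 light)
{
    vec3 n = v_cross(v_sub(verts[t.v[1]], verts[t.v[0]]),
                     v_sub(verts[t.v[2]], verts[t.v[0]]));
    return v_dot(n, v_sub(light, verts[t.v[0]])) > 0.0f;
}

/* Collect silhouette edges: edges of light-facing triangles that are
   shared (with opposite winding) by a back-facing triangle. These are
   the edges to extrude away from the light into volume quads. */
static int find_silhouette(const vec3 *verts, const tri *tris, int ntris,
                           vec3 light, int edges[][2], int maxedges)
{
    int nedges = 0;
    for (int i = 0; i < ntris; i++) {
        if (!faces_light(verts, tris[i], light)) continue;
        for (int e = 0; e < 3; e++) {
            int a = tris[i].v[e], b = tris[i].v[(e + 1) % 3];
            int shared = 0;
            for (int j = 0; j < ntris; j++) {
                if (j == i || faces_light(verts, tris[j], light)) continue;
                for (int f = 0; f < 3; f++) {
                    int c = tris[j].v[f], d = tris[j].v[(f + 1) % 3];
                    if (a == d && b == c) shared = 1;  /* opposite winding */
                }
            }
            if (shared && nedges < maxedges) {
                edges[nedges][0] = a;
                edges[nedges][1] = b;
                nedges++;
            }
        }
    }
    return nedges;
}
```

For a convex mesh the resulting edge loop is a single closed outline, which is why the one-bit volumes behave; a torus produces multiple loops and runs straight into the problems above.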
Above you can see some multitexturing and environment mapping with specular highlights. On the left there is only one environment texture; on the right there is a base texture plus an environment map that makes up the black blots. On the Dreamcast there is no real multitexturing support in hardware. Each additional texture layer must be rendered with translucent polygons, i.e. you have to send almost the same geometry vertex data to the graphics chip multiple times. My intention was to reduce the burden on the CPU (all geometry processing is done by the CPU on the Dreamcast) by organizing the data in such a way that it can easily be cloned in one go. And in some cases it is even almost for free!
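One way to read "cloned in one go": copy the whole transformed strip with a single memcpy and patch only the fields that differ for the second layer. The vertex layout here is illustrative, not the real PVR/KOS struct:

```c
#include <string.h>

/* One strip vertex; this layout is an illustration, not the actual
   hardware vertex format. */
typedef struct { float x, y, z, u, v; unsigned argb; } strip_vtx;

/* Clone a transformed strip for a second texture layer: same
   positions (already transformed, so no re-TnL cost), new UVs and
   a translucent blend colour. */
static void clone_for_second_pass(const strip_vtx *src, strip_vtx *dst,
                                  int n, const float *uv2,
                                  unsigned blend_argb)
{
    memcpy(dst, src, (size_t)n * sizeof(strip_vtx));
    for (int i = 0; i < n; i++) {
        dst[i].u    = uv2[2 * i + 0];
        dst[i].v    = uv2[2 * i + 1];
        dst[i].argb = blend_argb;
    }
}
```

When the second layer reuses the same UVs (e.g. a lightmap baked to the same unwrap), even the patch loop shrinks, which is the "almost for free" case.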
Now that we've got multitexturing up and running we can continue with some more interesting stuff: bump mapping. I don't remember seeing any game out there that makes use of it. But I must also confess that in some cases you won't be able to recognize it if you don't know it's there. On the Dreamcast bump mapping can be done quite easily, too. Basically all you have to do is build a lookup table for the atan2 computation. However, things get complicated again if you want more than one light source to affect the bump-mapped surface.
Left image: no bump map.
Right image: heavy bump map applied; looks damn cool when you move the light source.
Bottom image: base texture + bump map + environment map (pink/black).
Last but not least some standard post processing effects.
Top left: normal image.
Top right: noisy pixel displacement.
Bottom left: box filter blur.
Bottom right: radial blur / crash zoom.
I also found this discussion of Sony's "How far have we got" 2003 document which apparently includes comments by the document author (llmarie post 9):
As the author of the document, I would like to make a few comments:
What a shame that there are still many developers that are not even using VU0.
Very true, but VU0 is not as easy and straightforward to use as VU1. There are many reasons why this is so. We can't condemn developers for that; I am just trying to encourage them to give it a second thought.
It's quite a bit different from the 15-20 million figure that has been used by the console's advocates (in fact this would seem to roughly indicate that the DC was to the PS2 what the PS2 is to the Xbox in terms of performance).
Let's not get confused between "top speed" and "average speed". A Porsche can go over 200 km per hour, but drive it over hundreds of kilometers and look at the average speed you get.
There are regulations and constraints, there are pauses and traffic jams.
It's all the same on PS2. You can achieve up to 22MP/s, but the average will still be around 5MP/s.
As has been mentioned several times on these boards, the PS2's CPU does appear to be the weak point of the system.
True, although not the CPU itself, but rather the CPU efficiency. It is slowed down by bus accesses and stalls, etc... That makes it the weak point.
But I too, am quite surprised at the slide which says "Best performing games use up to 8% VU0". Not sure if they scanned the just released J&D2, R&C2, or SH3 and ZoE: Anubis from Japan. Because that statistic is really surprising.
I did indeed, and the stats do include the results from those games.
It is not an easy platform to program for, but definitely a very interesting architecture and it does allow for interesting techniques. It is nice to see that it hasn't reached the end of its life yet.
If you have any more questions about the document, please let me know.
There is one thing I was wondering about - the app with 40x overdraw, is it a big secret which is it?
I'm not sure I'm supposed to say it, but I see no harm in it, especially as the game is doing pretty well technically. It was LOTR2; it is full of giant particles.
do you mind if I ask where you pull your information from? Individual hand-testing of every game through the PA yourselves only, or is there a lot of developer info-swapping going on as well?
Over the last two years we have scanned a large number of games. There is a developer community too, but they have their secrets, which I cannot discuss here.
Do you test only the maximum performance a title can bring into play, or average it with all the various levels a game can deliver? (Thinking specifically of racing titles here, as you can play with just yourself on a track, or potentially load it up with many more cars--meaning much more needed pushing power.)
I usually take a random point in the game. It might be unfair, but it is unfair in the same way for each game, so on average the results are still relevant.
I must say your "maximum" count is going to get a lot of people jumping around, but it's supposed to read more like "maximum average polys at 60fps for the entire title," right?
Reading the document without attending the talk is not so good, because of course you don't get the explanation for each assertion.
For example, the maximum count is indeed an average and is nowhere near the maximum the PS2 can achieve at peak.
If people like big numbers, a developer has contributed some code running a 600K poly textured scene at 60fps. It is public too, and can be found on playstation2-linux.com
I rather doubt the majority of ANY games for any of the consoles push over 5M at this point
I think so too. In most cases though, it is fair to say that the game has been designed not to need more than that, rather than it being limited by the hardware.
BTW: I loved your comments on the Data Packing slide talking about palletized textures
Thanks. During the talk I asked if there were any artists, I didn't expect any, but there were several of them actually. Busted.
the PA actually underestimates the polys drawn
Wrong, the PA shows you _exactly_ the number of polygons drawn, regardless of how many you actually intended to draw.
How does this affect your averages?
It affects the stats indeed. But in the end, if you send 100K polys and only 50K get drawn, then you only drew 50K. I think it is fair to count only the polygons that are drawn on screen.
You are working for the developer support at SCEE (as listed in your presentation); this results in the following questions:
I am working for the developer support at SCEE, this results in the following answer:
No comments, sorry.
Hope this helps,
Missed a few questions, sorry:
Would this be 145,000 peak, average (over a section of the game) or sustained (over successive frames)?
145k sustained it was, but not at 60fps. That makes it 70K polys at 60 on average, so 125K at 60 is still the fastest (and those are actual displayed polygons). But those are just numbers, in most games the quality of the picture does not depend so much on the number of polygons.
I'm just interested to know specifically what the figures are referring to
I read the figures by looking at the rendering part of the scans only. That is to say, I didn't include the waiting time for the VSync, for instance. I did include the time waiting for texture uploads, on the other hand, which could make a renderer capable of 15MP/s run at 10 or 8 or less.
How successful do you feel you will be at utilizing VU0 in future and upcoming titles?
As we said, using VU0 is not easy. Results will certainly vary, we are still waiting to be surprised.
As a comment, it has been mentioned that when I said "VU0 usage" that wasn't taking VU0 as COP2 into account. That is true, and during the talk I did explain that the figures were for VU0 running independently from the CPU, which was the relevant point I wanted to make.