Here is a screenshot from my translation of Nate Glasser's demo of Terrain Splatting via pixelshader or fixedfunction pipeline.

This demo only draws two triangles, but this is a per-Texture approach, suitable for a square 'chunk' of terrain, such as seen in my previous terrain demos.. but previously, I had used one single precalculated texture per terrain chunk in order to avoid realtime blending of textures.
Posted on 2007-10-14 04:42:16 by Homer
That's the article and C sourcecode which I transcribed.

Posted on 2007-10-15 00:32:33 by Homer
I'm adding functionality to allow the user to select textures and edit alphamaps at runtime.
I have been thinking about the costs of adding more blending textures, in order to provide channels for 'blood splats' and 'footprints' and 'bullet holes' etc.
For starters, they'd need their own UV coords for proper placement, which means our VertexSize grows.

Just for fun, I duplicated the render call.
Suprising result.
For fixedfunction pipeline, the framerate was about half, as we would expect.
But for pixelshader, the cost was noticeably less than we would expect!

This is really good news.
Using this new technique, we have N textures loaded, not 'one per terrain chunk'.
If we can blend them fast enough, we don't have to consume huge amounts of video memory.

But let's think about this - in the past, I kept one texture per terrain chunk, and thought that was costly (for large worlds anyway).

Now, we have 4 blending channels, requiring 4 alphamaps (which are textures) per terrain chunk, without counting the 4 textures used as blend sources (4 in the demo).

For one chunk of terrain, we have 8 textures.
For more chunks, at LEAST 4 textures (alphamaps) per chunk.


What if we pre-render to a (mipmapped) Texture rather than to the framebuffer?
We can get back to a cost of one texture per terrain chunk at runtime, and we can free both the cpu and the gpu, so our framerate will be off the chart, and we will have bonus party time to spend.
The rendering speed is sufficient that we don't actually have to pregenerate textures for the entire terrain, just what is newly-visible in the distance.
That means theres no huge cost up front, and no huge cost at runtime!!

Posted on 2007-10-19 00:32:55 by Homer
I have created a class called DXPixelmap which acts as a texture container.
Deriving from Pixelmap, it provides powerful pixel operations, and acts as a bridge between gdi and directx drawing functionality... quite useful.
It means we can load, save, draw and edit a texture using any mixture of pixelmap, directx, and gdi calls.
We can draw it on an application control, we can draw it on a 3D surface, we can do both at once.

I have a modeless dialog which shows the four input terrain textures on buttons.
Simply click a texture and you can reload that texture at runtime.
You can see the result in the main rendering window instantly.

I'm going to increase the size of the AlphaMap textures, and use a tiling approach to terrain chunks.
One big alphamap which is applied to the entire terrain, rather than many small ones.
If the AlphaMap texture size == the HeightMap texture size, then we have one 'blender control point' for each Vertex in our tesselation... thats going to be far more than we need, of course.
I'm going to try one quarter size for starters.

I will be adding a simple pixel editor for crafting alpha maps in the very near future.

Posted on 2007-10-19 11:44:21 by Homer
This image looks pretty :)

Posted on 2007-10-19 12:49:46 by Homer
Now the edit dialog looks like this:
Posted on 2007-10-20 03:13:31 by Homer
Its now possible to edit all four AlphaMaps individually with a built-in pixel editor dialog.
The results can be seen instantly in the main window.
I expect to allow the user to apply texture weights directly to the terrain in the main window later.

I have been thinking about how I propose to go about tesselating the terrain geometry from the heightmap information, because I plan to use a QuadTree to store the geometry.
The quadtree will allow me to have a terrain whose geometric LOD varies as required to approximate the theoretical surface to a given degree of error.
Furthermore, it becomes quite simple to implement from that a view-dependent dynamic LOD.

But rather than just tesselate a zillion triangles, and then have to sort them into a tree, AND simplify them by 'joining' them, I have thought of a much better way.

Lets consider just the four corner texels of the HeightMap texture.
At root level, our Terrain consists of a Textured Quad between these texels.
To make a QuadTree, we have to find the MidPoint between those texels,  thus defining a Fifth Point.
We then treat each Quarter as a subspace, represented by a Child Node, and iterate it, until we reach some error threshhold, whereapon we have 'discovered a Leaf Node'.
At this point, we generate Actual Geometry for JUST A TEXTURED QUAD and store it in that Leaf.

Now we have a rather unbalanced Tree whose Nodes represent ever-smaller chunks of world, and whose Leaves contain textured Quads.

We performed as few pixel lookups as possible, we performed no Sorting of geometry, we performed no Welding of triangles, we performed no Splitting of triangles, we avoided Simplifying the geometry and yet we now have a geometry whose complexity meets our demands, and which may or may not be dynamically dependant apon the current View.
In fact in the case of view dependant dynamic lod, we can even implement frustum culling within the Tree rebuilding function :)
While creating the tree, renderable geometry could be output to a buffer or list rather than stored in the Leaves.

Any comments?

Posted on 2007-10-22 13:15:55 by Homer
I am also inclined to try out the idea I posted some time ago called Sparse Radial Oversampling.
Perhaps I will make that project in parallel from this point.

Back to QuadTree generation.
I'm deliberately gearing my thoughts about culling to the 2D realm for now.
This is good for You, Dear Reader, because it means there will be pretty pictures to explain my concepts :)

We have loaded a (square) 2D picture for our HeightMap, its a picture of some mountains from overhead, and the brightness indicates height. Great. Now what?

Now, we need to agree on a way to describe a 2D rectangular polygon - the winding order will become important.


0-------->1          ------> +X
/|\          |          |
|            |          |
|            \|/          |
3<--------2          \|/ 

This has absolutely nothing to do with rendering,ok? We need to be able to refer to and manipulate theoretical rectangles, we're not trying to draw anything just yet.

OK, we need to define a rectangle that describes the entire world.
We will be passing that to our TreeGenerator function.
This function is often written as a recursive function, but that is not required.
I will write my tree building function to loop rather than to recurse, its easy enough.

Back on the subject of Rendering (which also MAY be recursive), when we walk the QuadTree nodes, we wish to check for intersection of view frustum's XZ approximation and a Node's rectangle defined in the XZ plane.
We are ignoring Height, so we can use 2D thinking here.
And this is why the Winding Order is important.
We can use a 2D version of plane tests, which has the effect of extending each Edge of the test polygon to infinity (and beyond!).
We can then use a series of 2D point-plane equiv. tests to perform the intersection check.
This is analogous to searching for a theoretical plane of separation.
We can tell if the polygons are "partly or fully occluded", or "separated", which is exactly what we need.
But it relies on us being certain about the Winding Order of the 2D polygons.
There are algorithms to check the winding order, however, I am not patient, I want to see results!
Perhaps later..

Anyway, if a 2d box is partially or fully visible, then we walk its children until we find Leaf nodes and then we draw them.
If a 2d box isnt visible, neither are its children, ho hum, you know this stuff.

But maybe you've never seen it used to guide the construction of a quadtree BEFORE rendering.
We only want our Tree to contain what is Visible.

Come to think of it, why bother building a Tree at all?
We could just write a discover-and render-visible-leaves function to replace our recursion-free treebuilder function.
Now the tree, and the leaves, are theoretical.
We dont need child pointers in our nodes, we dont even need nodes!!

OK I am not belittling the value of trees - theyre still essential in other places, just not required to perform view-dependent rendering of terrain based on regular grid.

Posted on 2007-10-24 02:24:25 by Homer
Because I haven't been able to decide yet how my partitioning will look, in the meanwhile I have implemented some code dealing with 'Mouse Pick Rays'.
I define a Ray which goes from the mousecursor position, above the 3d rendering window, and heads out into the 3D Scene at some 3D angle based apon the current camera view and projection settings.
IE, we are 'projecting a ray from screenspace into worldspace'.
This is done by 'back-projection', which uses an 'inverse combined matrix' to transform a vector between two coordinate systems (ScreenSpace and WorldSpace).

In my Text rendering function, I've added code to test for an intersection between that Ray and the first of the two Triangles I am rendering.
Should the intersection be found, I am calculating data for a new vertex at the intersection point (including two sets of UV coordinates and VertexNormal), and I am rendering onscreen some Text to display the (first) Texture UV coordinate at the intersection point.

So, now my toy can detect interaction between the mousecursor and 3D Triangles, and can calculate the texture UV for an intersection point anywhere inside a Triangle.

These features are both extremely useful , wouldn't you say?
The core function is a wrapper around D3DXIntersectTri, which returns useful but incomplete information.
My function takes the returned Barycentric coordinates and uses them to interpolate a new Vertex.
Like D3DXIntersectTri, it returns TRUE or FALSE, but it returns more data for TRUE.

Here's the code for anyone interested - and I'll post the Mouse PickRay code if you ask :)

;Test for intersection of Ray and Textured Triangle
;If found, return Vertex
IntersectTriangleRay proc pvOut,pV0,pV1,pV2,pvOrigin,pvDirection
LOCAL fDist:real4
LOCAL ftemp
invoke D3DXIntersectTri,pV0,pV1,pV2,pvOrigin,pvDirection,addr bary.x,addr bary.y,addr fDist
.if eax==TRUE

;Calculate intersection point P based on A)Ray and B)Distance to Intersection
mov eax,pvDirection
mov edx,pvOrigin
mov ecx,pvOut
fld fDist
fmul .D3DXVECTOR3.x
fadd .D3DXVECTOR3.x
fstp .QuadVertex.vPosition.x
fld fDist
fmul .D3DXVECTOR3.y
fadd .D3DXVECTOR3.y
fstp .QuadVertex.vPosition.y
fld fDist
fmul .D3DXVECTOR3.z
fadd .D3DXVECTOR3.z
fstp .QuadVertex.vPosition.z

;Calculate Texture UV coords at P based on Barycentric coords and Triangle Vertices
mov ebx,pV2
mov eax,pV1
mov edx,pV0

;This inline macro makes this code a lot more readable
FromBary macro ComponentName
fld  .QuadVertex.ComponentName
fsub .QuadVertex.ComponentName
fmul bary.x
fld .QuadVertex.ComponentName
fsub .QuadVertex.ComponentName
fmul bary.y
fadd .QuadVertex.ComponentName
fstp .QuadVertex.ComponentName

FromBary vTex0.x
FromBary vTex0.y
FromBary vTex1.x
FromBary vTex1.y
FromBary vNormal.x
FromBary vNormal.y
FromBary vNormal.z

;And we are done!
mov eax,TRUE
IntersectTriangleRay endp

Posted on 2007-10-27 23:38:59 by Homer
The code in the previous post has been updated (corrected, and made more readable).

The formula I was using to interpolate (from Barycentric edge weights) the texture coords and Normal for theoretical new vertex at intersection point was wrong.
It did not correctly handle triangles at arbitrary orientations, or whose vertices were in arbitrary order !!
Now, it does :)

Here's the correct formula, which Google failed two furnish over two days of agony:

Output.u = (((B.u - A.u) * Bary.u) + ((C.u - A.u) * Bary.v)) + A.u
Output.v = (((B.v - A.v) * Bary.u) + ((C.v - A.v) * Bary.v)) + A.v

To interpolate Normals, we use the same formula, just adding a third dimension:

Normal.x = (((B.x - A.x) * Bary.u) + ((C.x - A.x) * Bary.v)) + A.x
Normal.y = (((B.y - A.y) * Bary.u) + ((C.y - A.y) * Bary.v)) + A.y
Normal.z = (((B.z - A.z) * Bary.u) + ((C.z - A.z) * Bary.v)) + A.z

Attached is a screen shot which shows the UV coordinates being correctly displayed, noting that our two Triangles have their vertices provided in arbitrary order - we get correct UV values Every Time, no matter which way the triangle is rotated, and no matter which Winding Order is employed.

Now that we can 'pick the texel under the mouse' in the 3D window, it should be fairly easy to finish implementing the in-place terrain texture blend editor (aka 'terrain painting tool'), which will allow the user to 'directly' edit the appearance of the terrain.

After that, I might look at implementing iocp-based networking, transforming the editor into server and client sides, for party-mode editing of a served world.
N users will be able to edit the same level at once, that's where I wanna be.
It simply involves designing inter-application messages for various editor events.
Example event : painting a Pixel of a Texture.

Posted on 2007-10-28 23:52:42 by Homer
A problem with mousepicking over a resized window was fixed.

The 'ray/triangle intersection' test was moved out of the Rendering code, and into the handler for WM_MOUSEMOVE... whenever the mouse moves, we check if the user is pressing the mousebutton, and if so, we calculate our stabbing ray and try to stab them triangles.
If we hit one, we write the current blending weight (from sliders) to the appropriate pixel in the combined alphamap.
Now our expensive intersection checking code is only called when the user is Painting, taking the load off the cpu. This optimization was possible because the Triangles are defined in WorldSpace. If they were defined in 'object space' (model space), we'd have to pay for our sins one way or another.
To be honest, on this machine I did not notice any improvement, but this machine is a hot rod :P

Direct painting of terrain texture blending weights was implemented !!!
You can use Sliders to set your 'texture blend', similar to selecting a color in a paint tool.
Now you can paint that mix of textures directly onto the world :)

At the moment, the direct approach only affects the Combined alphamap, so you need to be using Pixelshader rendering technique to see the effect.
I'll probably add code to break down the Combined map into separate channels for use by the more lame FixedFunctionPipeline technique.

Here's a screenshot showing how easy it should be .. its not art, is it?

next I'll be adding code to decompose the combined alphamap into alpha channel textures, and then I can get rid of the four associated texture files, opting to store only the combined map in file form.
Posted on 2007-10-30 00:21:28 by Homer
I'd just like to point out that the data required to load, save and generally reproduce our handywork is just one 32x32 pixel image, with 4 channels (32 bit depth) - just like a Cursor or Icon !
The alphamap we use to control the blending is FAR SMALLER than our main input textures !!!
One side effect of this is that when we 'paint' the world, we SEEM to be using a kind of big round Brush, and it really feels quite nice and easy to use.

Code was added so that both the single-channel and Combined alphamaps are edited together, now it doesnt matter whether your hardware supports pixelshader or not.

This thing is really fun to play with !!!

Posted on 2007-10-30 08:29:08 by Homer
I will now begin to move all relevant code into a new Terrain object.

Last night I thought of a way to speed up my frustum culling algo.
Instead of performing tests against the 6 planes of the frustum, I will test first against the frustum's boundingsphere, and if we detect intersection, then perform secondary test to make sure the subject is in Front of the NEAR Plane (ie its not behind us).
This is more than accurate enough, and we can screw with the radius of the boundingsphere to make it more or less aggressive.
I estimate this algo to be at least 400% faster than testing against 6 planes when testing Points, and even faster when testing against more complex geometric entities :)

Posted on 2007-10-31 00:25:50 by Homer
Its time for the first public release of the demo.
This 'retail build' should work on any machine.
Please tell me if you have problems!

I have included some public domain textures, rather than those which nate glasser provided, with courtesy of nvidia. You can replace them, and/or load new ones at runtime.

Keith : 4096 kb isnt enough for me. 16 would do the trick for the immediate future, tia.
Posted on 2007-10-31 06:59:58 by Homer
Early reports:

Quadcore 2400 , nvidia 8500 video = roughly 1100 fps
Quadcore 2400 , nvidia 8600 video = roughly 1800 fps
Quadcore 2400 , dual nvidia 8800 video = roughly 4000 fps

I wish I paid the extra money :P
Posted on 2007-10-31 09:14:13 by Homer
Core 2 Duo (E6750 @ 2.66 GHz),Nvidia GF 8600 GT (PCIE16), 2GB DDR2 @ 800MHz: ~1600 FPS with fixed function pipeline and ~1900 FPS with pixel shaders

AMD Athlon XP @ 2GHz, Nvidia GF 4 4200 (AGP8), 768MB DDR @ 333 MHz: ~350 FPS with fixed function pipeline and ~650 FPS with pixel shaders, but with pixel shaders it's not displaying correctly. probably due to the lack of the required shader version (GF4 Ti supports only pixel shader version 1.3)

Posted on 2007-10-31 14:52:36 by ti_mo_n
Thanks for the feedback - all good :)

Well, I disallow pixelshader if the device capabilities declare that PS1.4 is unavailable... so it shouldn't have worked at all under pixelshader :P

If I get time later today I might have a go at writing some different versions of that shader for other PS versions, with a little gpu register juggling it should be ok.
Thats something I've wanted to look into anyway.
This is my first pixelshader demo so I am very much on the learning curve here, which is great :)

I've just been reading everything I can about local error metrics, used to determine view-dependant LOD.
Most of the algorithms describe a maximum allowable displacement error (height), scaled by the distance from the Camera origin.

One such example formula was to divide the error metric E by the Distance Squared.
I think this is too harsh a falloff.

If we say that E is defined in WorldSpace terms, then we can transform E using the view and projection matrices, which should result in a view-dependant scaling of E.
Now we can compare the scaled value to our hardcoded limit to make our decision whether to continue splitting triangles or not.

To me, having just rewritten my mouse-pick raycasting code, this seems bloody obvious.

I used the inverse view*projection matrices to cast rays from camera space into world space.
This tends to make small values bigger as the distance increases.

Now we wish to transform this value E from world space into camera space...we wish to make our value E get smaller with distance.

Seems logical enough :)

Posted on 2007-10-31 18:57:37 by Homer
I hope 8800GT (not GTS nor GTX) are going to be cheap :)
Posted on 2007-10-31 19:25:40 by f0dder
Back to this value E.

Our QuadTree structure holds this value at each Node.
When we build the Tree, we calculate the 'variance' for each Quad.
This is done by sampling (heightmap) points within the Quad and measuring the overall (height) difference.... and maybe taking into account samples from a NormalMap...
The Tree building recursion stops when E is less than some value we predetermined.

Now we have our Tree. We wanna Render it!
As we walk our QuadTree, we take each Node's E value and shrink it as I described, then compare it to yet another predetermined limit value, and if less, and we wish to stop recursing, we have found the geometry we wish to Draw, its exactly defined by the Quad of the Node we have landed on.

Will this look good enough without all the splitting and joining and messy stuff??
It sure sounds fast!
Posted on 2007-10-31 20:06:25 by Homer
I've decided to change the geometry used to define a Quad.
Attached is an image showing a Quad being defined as Four Triangles formed as a TriangleFan between Five Vertices.

If you imagine this geometry repeated four times as Quadrants of the parent, you will understand why I have chosen this geometry over the more simple 'diagonal sandwich'.

Here's my proposed Node structure.
It could be a lot smaller than this if we used more parameters in our recursive functions, in fact we could pretty much just store a tree of error metric values.. but this form is ready for immediate rendering at any LOD which is just fine for early days - KISS principle applies.

;QuadTree Node
QNode struct
fVariance real4 ? ;Error Metric
fBoundingRadius real4 ?
vMid QuadVertex <> ;Origin of NodeSpace
v0 QuadVertex <> ;Four corner vertices
v1 QuadVertex <>
v2 QuadVertex <>
v3 QuadVertex <>
pChild0 dd ? ;Four pointers to Child Nodes (Quadrants)
pChild1 dd ? ;They can be NULL, indicating that
pChild2 dd ? ;a Quadrant does not require Splitting
pChild3 dd ? ;(can be represented by two Triangles)
QNode ends

Posted on 2007-11-01 04:18:12 by Homer