SW3D: The Future is Now

Guidelines for the development of SW3D 2


Author: Noisecrime

Date: 14.07.08

Version: 0.92





Contents


1.0 Overview

1.1 Foreword

1.2 Summary

1.3 Key points



2.0 Avoiding issues of the past:

2.1 New Xtra - SW3D 2

2.2 Director Compatibility

2.3 3D XDK

2.4 Development Cycle



3.0 Core Features:

3.1 Shaders pixel/vertex

3.2 Render State Access

3.3 Render To Texture & Render To Image

3.4 Particle Systems Manager

3.5 Texture Formats

3.6 Texture Non-power of two support

3.7 Alpha component access

3.8 File Format – Multiple UV layers



4.0 Performance and Efficiency:

4.1 API Efficiency

4.2 Batching

4.3 Buffering geometry data on card

4.4 Fast Geometry manipulation and access

4.5 Animation – bones and keyframing

4.6 New Static Model Format



5.0 Issues to Resolve:

5.1 Transparency Sorting

5.2 Lights


1.0 Overview



1.1 Foreword

This document is still a work in progress and does not claim to be definitive, although it should present most, if not all, of the major areas to address when considering updating SW3D. It is important to note that it only examines the 3D rendering engine, and not related aspects such as geometry detection (collisions, efficient raycasting) or other subsystems that might be useful for a 3D engine.




1.2 Summary

SW3D was practically out of date the day it was released back in 2000. At the time that didn't matter so much, as it introduced a whole new and exciting feature to developers who had few other viable options. However, after 8 years of negligible updates it has not so much fallen behind the curve as dropped off the graph completely.


Such a long period of inaction means it is no longer sufficient to address the problems with SW3D by adding a couple of cool effects (e.g. normal mapping), a few bug fixes and additive blending. That would be far too little, too late to bring it up to the level of its competitors.


What is needed is a substantial commitment by Adobe, along with a major revision of the SW3D framework. I still believe the architecture is sound; perhaps not cutting edge, but for a general-purpose 3D engine it should still suffice. However, careful decisions will need to be made about how the xtra is extended, to ensure that it does not cause performance bottlenecks or restrict developers.


The key point in updating SW3D, for me, is access to a wealth of low-level functions. This will provide an engine that can be pushed by advanced 3D developers without being restricted by arbitrary design decisions, while at the same time providing the basis on which to build simple-to-use, specific 'tick box' features and effects for more novice developers. Expanding on this theme would be the release of a 3D XDK permitting third-party developers to extend the engine's abilities further, a method that has worked so well for Director in general.


Development of SW3D should not get hung up on providing a 3D editor, GUI enhancements, 3D timelines (i.e. a 3D score) or shader editors, as these are a considerable cost/time investment that can only detract from the potential of updating the underlying 3D engine. These elements can instead be added at a later date, or even provided by the Director community itself.


So please don't consider SW3D feature and effect requests in isolation from one another; look at what is being asked for and implement a generalised, fully accessible framework which can be used to build them instead.




1.3 Key points

Make a clean break: a new xtra, SW3D 2, targeting DX9/OpenGL 2.0 class hardware, with the original SW3D retained as the 'basic' engine.

Expose low-level functionality (render states, shaders, render buffers) rather than closed-box effects, and build the user-friendly effects on top of that framework.

Release a 3D XDK so third-party developers can extend the engine.

Prioritise performance: API efficiency, batching and buffering geometry on the card.


2.0 Avoiding issues of the past:


2.1 New Xtra - SW3D 2

Why risk the problems seen with the D11 text and Unicode update, or accept silly restrictions enforced by the need to maintain backwards compatibility? Simply phase out the old xtra in favour of a nice new shiny one.


The original SW3D will still be supported and maintained, and may even receive a few minor fixes such as addressing additive blending, but will remain essentially fixed at the feature set we have now. This becomes the 'basic' 3D engine of Director, one that developers can use when targeting a very low-end machine specification.


SW3D 2 will take up the mantle. It will raise the required system specification to cards made in the last 3-5 years, freeing it from the fixed-function pipeline and allowing it to embrace the new shading languages to provide state-of-the-art effects. Ideally it would then support DX9 and OpenGL 2.0 hardware only, with all the benefits this provides.


This instantly removes any concerns or design restrictions around compatibility with the old xtra, and even with w3d files. It removes draconian restrictions based on supporting 10-year-old graphics cards. It avoids having to provide graceful fallbacks from shaders to fixed-function-pipeline versions of the same effect (time consuming to add). And it allows marketing to come up with a groovy new name, 'SW3D Extreme' ;)


It's important to note that a new version of the xtra does not mean completely new code. This should all be built on the framework provided by Intel's SW3D, which is still quite respectable for a generalised 3D engine. So it's not starting from scratch; it's just making a break with the past so as to avoid its restrictions.



2.2 Director Compatibility

D11 has a number of issues that currently make it less desirable to use than older versions. It would be a great advantage if SW3D 2 could be built to maintain compatibility with D10 or earlier versions. Those who have yet to come to trust, or need, a D12 built on the D11 codebase would then not be locked into it. It may even offer a new revenue stream if customers can purchase the xtra separately.




2.3 3D XDK

Heavily promoted but never materialised for SW3D, a failing that has clearly hindered any potential development over the last 8 years. It's vital not to make the same mistake, although with the correct architecture, based around shaders and low-level API access, an XDK may no longer be quite as important as in the past.



2.4 Development Cycle

With no knowledge of the amount of investment Adobe would give to 3D, and considering the lacklustre updates of the past, it may be worth considering a new development cycle, if not for Director then at least for SW3D. 3D is an area of rapid innovation, one which does not sit around waiting for 18-month dev cycles.


SW3D needs more continuous development in order to adapt quickly to the new technology found in 3D APIs.




3.0 Core Features:



3.1 Shaders pixel/vertex

With unrestricted access to running shaders, these are the future not only of 3D but of SW3D. They should not be underestimated in terms of the abilities they can give the developer in achieving their goals with 3D.


There have been many requests for various 3D effects, such as normal mapping; most of these now rely on shaders, or are far easier to create with them than they ever were with the fixed-function pipeline. However, it would be naive to simply implement closed-box shader versions of these effects, as that would prevent newer techniques or effects from being implemented by the developer; instead they'd have to wait for Adobe to add them.


It is therefore vital that shaders are implemented in SW3D in an unrestricted manner, offering the Director developer access to all aspects of shader technology: not only allowing them to write shaders (GLSL, HLSL, Cg), but to pass arbitrary 'uniforms' and attribute array values. At the same time, certain commonly used data constructs, such as the TBN (Tangent, Binormal, Normal) matrix, should be calculated and supplied by SW3D; there may be others. In addition, other aspects of SW3D will need to be updated, for example supporting more texture dimensions (1D, 2D, 3D, cubemap) and ideally NPOT (non-power-of-two) textures.


Essentially, then, the inclusion of shader support should itself just be a framework, one that permits developers to implement pretty much any shader or technique they come across. Adobe should then build specific requested shader effects, such as normal mapping, on top of this framework to provide a user-friendly means of adding them.
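

As a purely illustrative sketch, the Lingo below shows the sort of access such a framework could expose. Every shader-related call here (newShader with a #glsl type, the source properties, setUniform, setAttributeArray) is a proposed name, not existing API:

  -- Proposed SW3D 2 shader access; none of these shader calls exist yet.
  scene = member("3Dworld")

  -- compile a GLSL program from source held in text cast members
  sh = scene.newShader("ripple", #glsl)
  sh.vertexSource = member("rippleVS").text
  sh.pixelSource = member("ripplePS").text

  -- arbitrary uniforms, including texture references (samplers)
  sh.setUniform("waveHeight", 0.35)
  sh.setUniform("lightDir", vector(0.2, -1.0, 0.3))
  sh.setUniform("diffuseMap", scene.texture("grass"))

  -- per-vertex attribute arrays, e.g. a list of tangent vectors
  -- built elsewhere for a TBN basis
  sh.setAttributeArray("tangent", tangentList)

  -- assigned like any other shader
  scene.model("terrain").shaderList[1] = sh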


RenderMonkey is a great example of the level of control and accessibility that should be provided at the Lingo level (see also 3.2).


The power and flexibility of shaders (especially in DX10, where shaders can create vertices) may provide a better means to implement some requested features (improved particle systems, hardware skinning, sharing animations, improved Lingo animation etc.) and avoid the need for a complex XDK.


If for any reason dual-API (OpenGL/DirectX) support is not feasible, then the obvious solution is to support just OpenGL, since it is cross-platform.



Shader Requirements

Textures – 1D, 2D, 3D, Cubemap

Textures – NPOT

Texture Compression – DXT3, DXT5 – others? (the DDS container can hold a cubemap in a single file)

Setting uniforms – including setting texture references (samplers)

Setting per-vertex attribute arrays – although conversion from a Lingo list to native data is a bottleneck.


Supply shader code via text members or strings


How to deal with different shader versions?

By stipulating a specific system-level requirement, such as OpenGL 2.0 and PS3.0. I don't believe that moving to the lowest common denominator is a wise move for a high-tech 3D engine. At most, perhaps some backward scalability could be given to support PS2.x.


How to deal with cross-API compatibility

Should pixel shaders be implemented for a single API – OpenGL in this case, due to its cross-platform support – or for both OpenGL and DirectX?

As developers will be able to add their own shaders, support functions should be added to disable selection of any API other than those the developer has targeted. That way a Windows developer could focus on a purely DirectX application and not have to worry about a user trying to run it under OpenGL.


How to deal with compatibility issues

SW3D should always fall back gracefully to displaying the model unshaded if pixel shaders fail, to avoid any potential for fatal errors or the application simply not running on a user's system.
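

A minimal sketch of guarding against this from the movie side as well: getRendererServices() exists in the current xtra, but its pixelShaderVersion property and the pre-built 'plainUnshaded' shader are assumptions for this example.

  -- Fall back to an unshaded look when shader support is missing.
  -- pixelShaderVersion is a hypothetical rendererServices property.
  on startMovie
    scene = member("3Dworld")
    if getRendererServices().pixelShaderVersion < 2.0 then
      repeat with i = 1 to scene.model.count
        scene.model[i].shaderList[1] = scene.shader("plainUnshaded")
      end repeat
    end if
  end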



3.2 Render State Access

Render states define how a sequence of polygons will be rendered to the display. They control aspects such as the depth buffer, blending (additive etc.), alpha testing, fog, lights and clipping planes.


Many of these states are currently set internally by SW3D, locking out the developer, so that achieving a given effect requires it to be pre-programmed into the xtra. Opening up access to them at the node level (model or group node) would enable all manner of visual effects to be controlled by the developer. We only need to look at how useful the undocumented _noDepth flag for models has been to see how access to these states could be used.
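

For illustration, per-node access might read like this from Lingo; the renderState object and all of its properties are invented for this sketch:

  -- Hypothetical per-node render states; none of these properties exist.
  glow = member("3Dworld").model("glowSprite")
  glow.renderState.blendMode = #additive   -- additive blending at last
  glow.renderState.depthWrite = FALSE      -- the _noDepth idea, made official
  glow.renderState.fogEnabled = FALSE
  glow.renderState.alphaTest = TRUE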


One drawback is that changing these values per node might interfere with other optimisation methods, such as texture/shader batching, but the gains in terms of potential special effects and useful techniques should outweigh any negatives. It will just need careful design consideration.


For an excellent example of the render states that should be accessible, check out AMD/ATI's RenderMonkey application.



3.3 Render To Texture & Render To Image

This requires two modes: RenderToTexture, and RenderToImage (to an image object or cast member).


The first is the most efficient and probably the most used: rendering scene data to a texture, which is then itself rendered in the final scene. Useful for shadow mapping, reflections etc.


The latter is for those cases where the user effectively needs a 'screenshot' of a 3D scene; by utilising renderbuffers, this no longer needs to be of the image currently displayed on stage.


The real trick with this feature is how to control rendering into the renderbuffer – how the user defines to SW3D which scene objects need to be rendered. The most obvious candidate is the scenegraph, but this should be as scalable as possible.


For example, for shadow mapping I want to render *just* a character model (or possibly several, but no architectural/environment geometry) to a buffer as a black/white image from the direction of the light. One method could be a dummy scenegraph node that is not in the hierarchy of the camera's render node; a renderbuffer camera could then be attached to the dummy node to render just those models. However this is highly inefficient, as it means duplicating models and, if they are animating, keeping the duplicates in sync. A far better way would be to flag models within the scenegraph as needing to be rendered into a buffer; all flagged models are collected and rendered first, then the scene is rendered as normal.
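

A sketch of how that flagging approach might look from Lingo. newRenderTexture, renderTarget and renderFromCamera are all proposed names; the remaining calls are existing API:

  -- Hypothetical buffer pass: only the flagged model is drawn into the
  -- shadow texture, which is then used like any other texture.
  scene = member("3Dworld")
  shadowTex = scene.newRenderTexture("shadowMap", 512, 512)

  scene.model("hero").renderTarget = shadowTex  -- flag into the buffer
  lightCam = scene.newCamera("lightView")
  lightCam.transform.position = scene.light("sun").worldPosition

  shadowTex.renderFromCamera(lightCam)          -- buffer pass runs first
  scene.model("floor").shaderList[1].textureList[2] = shadowTex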


Supporting more recent hardware (e.g. OpenGL 2.0 or DX9.0) should mean it is safe to assume access to 'render buffers', which will greatly simplify implementing this feature. It is also essential for a large number of effects, such as shadow mapping and real-time dynamic reflections.



3.4 Particle Systems Manager

The old emitters need to be scrapped and replaced with a new, modern, flexible particle management system.

The original still has bugs (frame-based rather than time-based, fails when using more than one camera – to be confirmed – and no additive blending) and is hopelessly inadequate for today's particle requirements. As a feature it often appears neglected by developers due to the bugs, lack of control and the complexity of designing an emitter system; it's just too much work.


Today we don't just need particle emitters but whole particle management systems, as the best effects, even very simple ones, make use of multiple emitters. There should therefore be a particle-management object in SW3D that can be loaded with 'definitions' and is responsible for updating them each frame.
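

As an illustrative sketch only, driving such a manager from Lingo might look like this; newParticleManager, addEmitter, setStartTime and playAt are invented, as are the definition property names:

  -- Hypothetical particle manager: one effect built from several
  -- emitters, updated by the manager each frame.
  pm = member("3Dworld").newParticleManager("explosionFX")

  flash = pm.addEmitter([#texture: "flash", #lifetime: 0.2, #numParticles: 1])
  smoke = pm.addEmitter([#texture: "smoke", #lifetime: 3.0, #numParticles: 60])
  sparks = pm.addEmitter([#texture: "spark", #lifetime: 1.0, \
    #numParticles: 200, #blendMode: #additive])

  -- a score-like timeline offsets each emitter within the effect
  pm.setStartTime(sparks, 0.05)
  pm.setStartTime(smoke, 0.10)
  pm.playAt(vector(0, 0, 0))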


This is no easy task and would be a perfect opportunity for Adobe to buy in some middleware. However, it is vitally important that we also get an editor, as this is one area where there are few or no substitute editors that could be used. An editor is required because a single particle system needs a timeline controlling multiple emitters, akin to the score in Director.



Particle Illusion (http://www.wondertouch.com/) is an application that illustrates both the complexity of today's particle systems and the ease with which developers can utilise them. It is designed mainly for off-line rendering and as such is likely far more advanced than is really needed, but Particle Illusion SE illustrates very well how good-looking particle effects require a more comprehensive API than is currently offered in SW3D.



3.5 Texture Formats

Support all texture types – 1D, 2D, 3D, cubemap – and compression formats such as DXT3 and DXT5. Once pixel shader support is added, cubemaps are essential anyway, so we might as well add them all, since both 1D and 3D textures can be used to great effect.



3.6 Texture Non-power of two support

This has been supported on GPUs for a number of years now and would undoubtedly make implementing elements such as 2D interfaces much easier. It needs double-checking, but I'm pretty sure that as a baseline all DX9/OpenGL 2.0 cards have to support it.



3.7 Alpha component access

I remember running into various issues due to the alpha component of colour values not being supported in SW3D: definitely when setting per-vertex alpha via the colour vertex array, but I'm sure there were other cases too.
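

A sketch of what per-vertex alpha ought to look like; rgba() is an invented constructor, and reaching colorList through meshDeform like this is also an assumption:

  -- Hypothetical per-vertex alpha via the colour vertex array.
  mesh = member("3Dworld").model("fadePlane").meshDeform.mesh[1]
  repeat with i = 1 to mesh.colorList.count
    mesh.colorList[i] = rgba(255, 255, 255, min(255, i * 10))  -- alpha ramp
  end repeat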



3.8 File Format – Multiple UV layers

Support must be added for multiple sets of UVs per model/mesh in the w3d file, as this is currently only supported at runtime. I'd suggest a minimum of 4 sets.
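

A sketch of the runtime side once the file format carries the extra sets; the per-layer textureCoordinateSet property is proposed, while textureList exists today:

  -- Hypothetical second UV set loaded from the w3d file: base texture
  -- on UV set 1, a lightmap on UV set 2.
  room = member("3Dworld").model("room")
  room.shaderList[1].textureList[2] = member("3Dworld").texture("lightmap")
  room.shaderList[1].textureCoordinateSet[2] = 2  -- layer 2 uses UV set 2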


4.0 Performance and Efficiency:



4.1 API Efficiency

When examining the OpenGL output of an SW3D scene it is very obvious that it is poorly optimised. Frequently, for the same model, various states and attributes are repeatedly set or cleared when there is no need to do so. One good example is the textureCoordinates vertex array, which is cleared at the start of each model, updated to the model's specific values, used to render the model, and then cleared again!



4.2 Batching

This is a method of reducing the performance overhead of state changes. At its simplest, it just 'batches' up all models that share the same shader, so that the shader states are set once for that batch regardless of the number of models actually being rendered. Its most obvious use is in rendering many instances of the same object, such as trees, and it avoids developers/artists having to manually group identical models into a single mesh.



4.3 Buffering geometry data on card

To my mind, one of the biggest performance drains in SW3D is the lack of caching or buffering of geometry on the card. Without this, all of the geometry data for the viewport being rendered must be uploaded to the card each frame, which is clearly sub-optimal and may well run into bandwidth issues.


Implementation needs to be carefully considered, possibly offered as a per-model developer option. At its simplest, all static geometry should be cached. Animating geometry can be cached too, but it is a little more complex to deal with.
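

A per-model hint could be as simple as the following; bufferMode is an invented property:

  -- Hypothetical buffering hints: static geometry is uploaded to the
  -- card once, dynamic geometry is re-uploaded only when it changes.
  scene = member("3Dworld")
  scene.model("levelMesh").bufferMode = #static
  scene.model("waterPlane").bufferMode = #dynamic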


The most common approach to using vertex buffer objects (or their equivalent) is to create a series of different-sized buffers and then place specific models into whichever buffer has room left to hold the data. This avoids fragmentation issues on the card and should be considered in the implementation.



4.4 Fast Geometry manipulation and access

Mesh deform was a step in the right direction, but it is amazingly slow compared to C++ – generally so slow that it has little practical purpose if you are trying to deform models with high polygon counts, many models at a time, or at high framerates. It needs a complete overhaul with the emphasis on performance above all else.


However, an easy route to supporting this might be via (vertex) shader technology. For example, say I want to animate water with waves: I can simply create a base grid mesh and have a vertex shader apply the vertical offset to each vertex, instead of having to set all the vertices via mesh deform.
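

A sketch of the water example, combining the proposed shader calls from 3.1 (newShader with #glsl, vertexSource, setUniform) with a few lines of GLSL held in a Lingo string:

  global gWaveShader

  on startMovie
    -- tiny GLSL vertex shader: displace each vertex vertically over
    -- time; the fragment stage is left to the fixed-function default.
    vs = "uniform float time;" & RETURN & \
         "void main() {" & RETURN & \
         "  vec4 p = gl_Vertex;" & RETURN & \
         "  p.y += sin(p.x * 0.2 + time) * 0.5;" & RETURN & \
         "  gl_Position = gl_ModelViewProjectionMatrix * p;" & RETURN & "}"
    gWaveShader = member("3Dworld").newShader("waves", #glsl)
    gWaveShader.vertexSource = vs
    member("3Dworld").model("waterGrid").shaderList[1] = gWaveShader
  end

  on exitFrame
    gWaveShader.setUniform("time", the milliseconds / 1000.0)
    go the frame
  end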



4.5 Animation – bones and keyframing

This needs to be updated from a per-model modifier to a per-mesh modifier. That would enable multiple models to use the same animating mesh, instead of having to create multiple models each with its own animation modifier – useful when you want many models all playing the same animation with minimal performance loss. For example, in an RTS a unit of 20 'orcs' could be made up of 4 or so animating meshes, giving 4 groups of 5 orcs playing the same motion, instead of 20 orc models each with its own animation modifier.
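

A sketch of the orc example under this proposal: sharing a model resource between models already works today, while a bonesPlayer on the resource rather than the model is exactly the change being requested.

  -- 20 orcs, but only 4 animation modifiers running: each group of
  -- 5 models shares one animating mesh.
  scene = member("3Dworld")
  repeat with g = 1 to 4
    res = scene.modelResource("orcRes" & g)
    res.bonesPlayer.play("walk")  -- proposed: per-mesh, not per-model
    repeat with i = 1 to 5
      orc = scene.newModel("orc_" & g & "_" & i, res)
      orc.transform.position = vector(g * 50.0, i * 50.0, 0)
    end repeat
  end repeat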


However, if caching of model data and shader-based skinning are implemented, there might be no need for this.



4.6 New Static Model Format

It could be beneficial to introduce a new model format or modifier that could take the 'polygon soup' of, say, a game level and, using a more efficient space-partitioning scheme than the scenegraph, partition the polygons into vertex buffers. This would optimise both the amount of geometry that has to be rendered (space partitioning) and texture/shader batching, as it is all done automatically.


Essentially, the modern-day equivalent of running a BSP rendering engine.

5.0 Issues to Resolve:



5.1 Transparency Sorting

Sadly there is no magic bullet for per-polygon transparency sorting, and requests to 'fix transparent sorting issues' are frequently naive. It can be addressed, but depending on the situation it could come at a huge cost in performance.



Sorting everything

Impractical: not only do you have to sort at the polygon level (slow), you must also sort between models, and this can lead to extremely inefficient rendering sequences – in the most extreme cases literally switching texture states (shaders) every other polygon. This is guaranteed to destroy performance on even the most powerful 3D card.



Sorting the model

This is relatively easy to do, and I posted a demo with source illustrating it in real time on dir3D-l many years ago. It is usable, but due to the slow mesh-manipulation access in SW3D its performance is limited.
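

The core of that approach, roughly, in Lingo: sort the faces of one model back-to-front by centroid depth each frame. Writing the reordered face list back through meshDeform is assumed to be allowed here; everything else is existing API:

  -- Painter's sort at the model level.
  on sortModelFaces scene, modelName
    m = scene.model(modelName)
    mesh = m.meshDeform.mesh[1]
    camPos = scene.camera(1).worldPosition
    depths = []
    repeat with f = 1 to mesh.face.count
      corners = mesh.face[f]  -- three vertex indices
      c = mesh.vertexList[corners[1]] + mesh.vertexList[corners[2]]
      c = (c + mesh.vertexList[corners[3]]) / 3.0  -- face centroid
      add depths, [(camPos - (m.transform * c)).magnitude, f]
    end repeat
    sort depths  -- ascending: nearest faces first
    newFaces = []
    repeat with d = depths.count down to 1  -- reverse: furthest first
      add newFaces, mesh.face[depths[d][2]]
    end repeat
    mesh.face = newFaces  -- assumed writable
  end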


The same algorithm implemented natively in SW3D would be much faster (and possibly a BSP tree could improve it further), making it a practical solution for specific cases, but it is not a complete solution. For starters, each polygon is still rendered via the painter's algorithm, which means two intersecting transparent polygons can never reliably be drawn in the correct order. It does not deal with the sub-meshes of a model using different shaders, and it does not address different transparent models intersecting each other.



Alpha testing

I believe this is still the ideal method today (especially in games) when dealing with vegetation such as bushes and trees. An alpha test, unlike blending, produces a boolean result, which means that any time a pixel is rendered it can also write a depth value. In conjunction with the depth buffer, this removes any need for sorting at all!


To avoid the blockiness of this method, it is common practice to render the same vegetation again in a standard blend pass; only the edges are re-rendered (since the depth buffer has accurate values), thus smoothing and blurring the edges for improved quality.
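

Expressed with the hypothetical render states from 3.2, the two-pass setup would be along these lines:

  -- Pass 1: alpha test with depth writes gives clean, sort-free foliage.
  -- Pass 2: a duplicate model re-renders only the edges with a blend.
  -- renderState and its properties are the invented API from 3.2.
  bush = member("3Dworld").model("bush")
  bush.renderState.alphaTest = TRUE
  bush.renderState.alphaRef = 0.5  -- boolean pass/fail per pixel
  bush.renderState.depthWrite = TRUE

  edge = member("3Dworld").model("bushEdge")  -- duplicate of the bush
  edge.renderState.alphaTest = FALSE
  edge.renderState.blendMode = #alpha  -- standard blend smooths the edges
  edge.renderState.depthWrite = FALSE  -- depth already laid down in pass 1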






5.2 Lights

Currently lights do not adhere to their position in the scenegraph hierarchy relative to the camera, meaning that lights which should not be applied, because they sit in a different part of the scenegraph tree to the camera, are still applied during rendering.