I have access to an actual Android phone now, a Samsung Galaxy S Plus.
First thing I did was to debug the texturing.
If you have an Android phone with OpenGL ES 1.0, try this:
http://bohemiq.scali.eu.org/OpenGLES.apk

On my phone it got about 45 fps (with CPU skinning, since there is no shader support in OpenGL ES 1.0).

Then I took OpenGL ES 2.0 to task:
http://bohemiq.scali.eu.org/OpenGLES2.apk

It runs entirely on shaders, and gets 60 fps on my phone.

Here is a video of it: http://www.youtube.com/watch?v=zSphFf2j2zM
Posted on 2012-02-26 14:45:17 by Scali

Another thing I will be doing is to make a cleaner separation between GLUT and my own code. Since the iPhone doesn't have GLUT, I had to use an alternative framework there. By making clean entrypoints for the init() and renderFrame() functions, it will be much easier to adapt the code to run inside any OpenGL wrapper.
I may want to ditch GLUT anyway, since the real GLUT is old and abandoned, and the newer implementations I've tried weren't that great.
I was told to check out GLFW, so I might just give that a whirl. Would be a good test to see if it really works without GLUT.
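Roughly, what I have in mind is something like this (a sketch with illustrative names, not the actual interface): the host framework owns the window, the context and the message loop, and the engine only exposes a few entrypoints:

    // Engine-side entrypoints; the host (GLUT, the iPhone glue code, a .NET
    // wrapper) creates the window and the OpenGL context, then calls these.
    void engineInit(int width, int height);    // once, after the context exists
    void engineResize(int width, int height);  // whenever the window size changes
    void engineRenderFrame(float time);        // draw one frame; the host swaps buffers
    void engineShutdown();                     // free resources before the context dies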

Another thing that the iPhone doesn't have is GLU. I only used the gluPerspective() function, but that was partly because of laziness. There should really be a proper function for that in the GLUX library. GLU and GLUT are just more pieces of legacy that should be stripped from my framework.
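Replacing gluPerspective() is not much work anyway. A minimal sketch of what such a GLUX function could look like (the name gluxPerspective() is hypothetical):

    #include <math.h>

    // Builds the same column-major matrix as gluPerspective(fovy, aspect, zNear, zFar);
    // load the result with glLoadMatrixf() or glMultMatrixf().
    void gluxPerspective(float fovyDeg, float aspect, float zNear, float zFar, float m[16])
    {
        float f = 1.0f / tanf(fovyDeg * 3.14159265f / 360.0f); /* cot(fovy/2) */
        for (int i = 0; i < 16; i++) m[i] = 0.0f;
        m[0]  = f / aspect;
        m[5]  = f;
        m[10] = (zFar + zNear) / (zNear - zFar);
        m[11] = -1.0f;
        m[14] = (2.0f * zFar * zNear) / (zNear - zFar);
    }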


Following up on this...
For now, I have decided not to look into other rendering frameworks. I already had to write custom code for iPhone and Android to host my OpenGL engine.
And now I want to host the OpenGL engine inside a .NET application.

Namely, we have been using an internal tool to visualize the BHM files in a tree, and inspect their contents. Since we mainly use it for storing 3d objects and scenes, we put in a window that could preview the 3d data. This way you could quickly inspect exported data from 3dsmax.

Initially I wanted to release the BHM visualization tool without the 3d component as part of the open source project. But then we had the problem that we had to maintain two versions: one with the 3d renderer, and one without. After all, we don't want to release the 3d renderer as open source.
Then I figured I could make a single version which would try to load the engine component dynamically. If it can't find it, the window is not displayed.
From there, I figured I could also make an engine component based on the OpenGL engine, which is already open source anyway.
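The loading trick itself is simple; in native terms it amounts to something like this (a sketch: the DLL name and the showPreviewWindow() helper are made up, and the actual .NET tool would use its own equivalent mechanism):

    // Try to load the (closed-source) engine component at runtime;
    // if it isn't there, the 3d preview window is simply never shown.
    HMODULE hEngine = LoadLibrary(TEXT("Engine3D.dll"));
    if (hEngine != NULL)
    {
        typedef void (*InitFunc)(HWND);
        InitFunc engineInit = (InitFunc)GetProcAddress(hEngine, "engineInit");
        if (engineInit != NULL)
            showPreviewWindow(engineInit);
    }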

So today I've been working on my own Windows-specific code for creating an OpenGL context and such, and I no longer need GLUT on Windows.
I've also made a minimal .NET wrapper, so now the OpenGL context can be created on any WinForms Control.
So... by the looks of it, the visualizer can be released complete with the 3d renderer soon. I just need to do a lot of code cleanup.
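The Windows-specific part boils down to something like this (a minimal sketch, with error handling omitted); the .NET wrapper only has to pass the Control's window handle down to it:

    #include <windows.h>

    // Create an OpenGL context on any existing window handle
    // (a WinForms Control.Handle works just as well as a native HWND).
    HGLRC createGLContext(HWND hWnd, HDC* outDC)
    {
        PIXELFORMATDESCRIPTOR pfd = { sizeof(PIXELFORMATDESCRIPTOR) };
        pfd.nVersion   = 1;
        pfd.dwFlags    = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
        pfd.iPixelType = PFD_TYPE_RGBA;
        pfd.cColorBits = 32;
        pfd.cDepthBits = 24;

        HDC hDC = GetDC(hWnd);
        SetPixelFormat(hDC, ChoosePixelFormat(hDC, &pfd), &pfd);
        HGLRC hRC = wglCreateContext(hDC);
        wglMakeCurrent(hDC, hRC);
        *outDC = hDC; // hold on to this DC; see the CS_OWNDC story below
        return hRC;
    }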

I'll still be loading the engine component dynamically, so that I can test with both the OpenGL and D3D renderers. After all, the D3D renderer is the one we're using mostly.
Posted on 2012-03-01 10:38:39 by Scali
I'm glad you decided to drop GLUT.
My recommendation for a windowing solution is to use SDL, since it's already cross-platform and the license is liberal, meaning one set of code will work everywhere.
Don't bother writing a specific platform solution where one already exists across many platforms!
Posted on 2012-03-02 04:21:16 by Homer
Well... there are two problems with using third-party solutions.
The first is that most of them insist on hosting their own window and doing everything for you, which is okay if you just want an app with one window, but that's not what I am looking for right now.

The second is that I hate SDL with a vengeance. It seems SDL is the main reason why many emulators run like total crap on my Android phone. I can't even emulate a C64 at full speed without frameskip? I mean, really? On a 1.4 GHz phone?
My OpenGLES stuff runs at 60 fps, and UAE4Droid will easily do 50 fps with an emulated Amiga, so why would a C64 be so slow? Same with DOSBox: it can barely reach low-end 386 speeds, and you need to use frameskip. What on earth for?

Likewise, I noticed quite significant differences in speed between the original GLUT, OpenGLUT and FreeGLUT. How can you make something this simple that slow?

So no, I'll just roll my own code, thank you.
Posted on 2012-03-02 05:32:56 by Scali
An interesting observation while rolling my own OpenGL initialization code:
Normally you are told that you always need to use the CS_OWNDC style for the window that you are attaching OpenGL to.
Now the problem is, in .NET this style does not exist.
At first I just ignored it, as I was not quite sure what the significance of that flag was for OpenGL anyway, or whether .NET would set it by default (after all, it's still using native windows underneath).
And I just happened to be developing on my laptop, which ran the code just fine on its Intel X3100 IGP.
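For reference, the conventional native setup that this advice refers to is simply a window class registered with that style:

    WNDCLASS wc = {};
    wc.style         = CS_OWNDC;   // give every window of this class its own private DC
    wc.lpfnWndProc   = WndProc;    // your window procedure
    wc.hInstance     = hInstance;
    wc.lpszClassName = TEXT("OpenGLWindow");
    RegisterClass(&wc);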

When I tested the code on my desktop, with an nVidia GeForce GTX460, the code failed.
That is, it initialized correctly, and the render loop was running, but nothing was being drawn on my window.
So I tested the native code with CS_OWNDC, and then the OpenGL code would run as expected. So the problem is related to the DC somehow.

At first I tried to manually set the CS_OWNDC style on the window class of the .NET window in native code. However, that did not seem to work either. If I tried GetDC() after that, it returned an invalid handle. So I decided to dive a little deeper into CS_OWNDC, and found this page:
http://blogs.msdn.com/b/oldnewthing/archive/2006/06/01/612970.aspx
(Great blog anyway, full of interesting Windows factoids, and I really like the author's style).

So, apparently CS_OWNDC caches the DC for your window, and makes sure you always receive the same DC when you call GetDC(). It never destroys or reinitializes this DC, so the state becomes persistent, unlike with regular windows.

And that explains what I have been seeing: As a nice programmer, I try to call GetDC() only when I need the DC, and do ReleaseDC() as soon as I'm done.
The problem here is: I attach the OpenGL context to the DC, then I release it again. When I call GetDC() again with CS_OWNDC, I get the same DC, with the OpenGL context attached. So if I then want to call SwapBuffers(), it works fine.
If I don't have CS_OWNDC, I can get an entirely different DC, and SwapBuffers() won't do anything, because there is no OpenGL context attached to that particular DC.

So, what to do when you can't set CS_OWNDC? Well, just hog the DC handle. I just do a GetDC() when I create and attach the OpenGL context, and I don't ReleaseDC() until the OpenGL context is discarded.
Effectively this is no different from CS_OWNDC, as the DC would persist in the DC cache anyway, as long as the window was alive.
However, what is different is that if you call GetDC() multiple times for the same window, you get different DCs every time. As a result, the application behaves exactly like other 'normal' windows, without persistent state, and you won't run into strange problems when you think you have two different DCs, but they are actually the same one. Also, CS_OWNDC can conflict with CS_CLASSDC, a problem that we do not have now. So I personally think that hogging the DC handle is a slightly nicer way than using the CS_OWNDC flag.
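In code, the 'hogging' pattern is simply this (a minimal sketch):

    // Without CS_OWNDC: acquire the DC once and hold it for the lifetime
    // of the OpenGL context, instead of GetDC()/ReleaseDC() per operation.
    static HDC   g_hDC;
    static HGLRC g_hRC;

    void attachGL(HWND hWnd)
    {
        g_hDC = GetDC(hWnd);               // acquired once...
        // ...SetPixelFormat() etc. as usual...
        g_hRC = wglCreateContext(g_hDC);
        wglMakeCurrent(g_hDC, g_hRC);
    }

    void presentFrame()
    {
        SwapBuffers(g_hDC);                // ...so this is always the DC the context is attached to
    }

    void detachGL(HWND hWnd)
    {
        wglMakeCurrent(NULL, NULL);
        wglDeleteContext(g_hRC);
        ReleaseDC(hWnd, g_hDC);            // ...and only released when the context is discarded
    }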

It's just a tad strange that:
1) OpenGL on Windows promotes, even forces, such dirty use of DCs. The preferred usage of DCs on Windows in normal applications is to release a DC as soon as you're done.
Why does OpenGL want to attach itself to the DC in the first place? Direct3D attaches to the window handle instead; DCs are not used at all.
2) Apparently on some systems it will work if you don't use CS_OWNDC, yet GetDC() and ReleaseDC() every time, while on others it doesn't.
I wonder why it works anyway... Does the driver do something clever under the bonnet, so that it attaches to the window anyway, rather than just the DC? Or does it somehow make the system return the same DC every time (either through setting CS_OWNDC silently, or some other magic)?

Update: I've tested on a system with an ATi Radeon X1900XTX card as well, and it does the same thing as the Intel driver: it works even if you don't use CS_OWNDC and don't store the DC permanently. So it seems nVidia is the odd one out here...
Posted on 2012-03-02 12:20:43 by Scali

My recommendation for a windowing solution is to use SDL, since it's already cross-platform and the license is liberal, meaning one set of code will work everywhere.


If you want to abstract away different platforms, along with some OpenGL nuisances, SFML is a decent approach.
Posted on 2012-03-02 23:56:26 by SpooK
Speaking of OpenGL nuisances...
I tested my code with the same multithreaded window framework that I used for D3D earlier, and I ran into an interesting issue:
[Screenshot: ten OpenGL windows rendering concurrently; the framerates differ visibly between the windows]
The framerates are very uneven, and the animation is not smooth either. This was on a GeForce GTX460. So I figured I'd also try it on some other machines, one with an ATi Radeon X1900XTX, and one with an AMD Radeon HD5770. Both ran more smoothly.
In D3D the GTX460 had no such problem, so I just reported it as a driver issue. I guess nVidia wants to prioritize OpenGL tasks... which works fine with a single task, but as Raymond Chen of The Old New Thing always says: what if TWO applications did this?
Or in this case, just two or more threads from the same application.
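For context: the setup is one plain render thread per window, each with its own DC and context. A simplified sketch of the idea (not the actual framework code; 'running' and drawScene() stand in for the real logic):

    // One render thread per window; each thread owns its window's DC and GL context.
    DWORD WINAPI renderThread(LPVOID param)
    {
        HWND hWnd = (HWND)param;
        HDC hDC = GetDC(hWnd);
        // ...SetPixelFormat() as usual...
        HGLRC hRC = wglCreateContext(hDC);
        wglMakeCurrent(hDC, hRC);          // a context is current on one thread only

        while (running)                    // flag cleared by the UI thread on shutdown
        {
            drawScene();
            SwapBuffers(hDC);
        }

        wglMakeCurrent(NULL, NULL);
        wglDeleteContext(hRC);
        ReleaseDC(hWnd, hDC);
        return 0;
    }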

Anyway, here's the binaries if you want to test it (Win32, .NET 4.0):
http://bohemiq.scali.eu.org/OGLMT.zip
Posted on 2012-03-07 10:19:10 by Scali
Due to scheduling, using multiple threads to render a single frame is fraught with disaster and I would never recommend it. I would #1 recommend only EVER using one thread to perform rendering.
Now, I know you're thinking, what about background loaders and so on?
The render path can be controlled from other threads, this is not the same thing as PERFORMING rendering from multiple threads.
If you need to drive your visual object transforms from a thread other than the rendering thread, then you're going to need some way of synchronizing them other than a hard mutex; usually this will be based on time, and involve artificially lagging something.

Even setting your thread priorities won't save you from being scheduled.
Posted on 2012-03-07 23:43:15 by Homer

Due to scheduling, using multiple threads to render a single frame is fraught with disaster and I would never recommend it. I would #1 recommend only EVER using one thread to perform rendering.


Firstly, did you even bother to look at the screenshot? It's not a single frame, it's 10 concurrent renderers. So your entire post does not apply. I am talking about completely different issues (as usual). I want to use multiple OpenGL/D3D windows from the same application, so that I can render to multiple GPUs and screens at the same time, and have additional accelerated windows inside the UI of the application. There is no rendering from different threads than the rendering thread, there are just multiple rendering threads. Which works fine in general, except that on nVidia there seems to be a driver glitch when using OpenGL (but not D3D).

Secondly, since D3D and especially OpenGL are connected to the window and thread they are created with, it is generally not even possible to use the same context from different threads (although D3D11 allows you to create additional contexts with limited functionality, which you can use to prepare commands that are later executed by the main context in the main thread).
It's only logical, since the context is connected to one GPU (and SLI/CrossFire is also one GPU as far as the application is concerned: multiple GPUs are virtualized into a single one for the application), and you need to send commands in a strict order.
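A minimal sketch of that D3D11 mechanism, assuming 'device' and 'immediate' are the existing device and immediate context:

    // Record commands on a deferred context in a worker thread...
    ID3D11DeviceContext* deferred = NULL;
    device->CreateDeferredContext(0, &deferred);
    // ...issue draw calls on 'deferred' here...
    ID3D11CommandList* cmdList = NULL;
    deferred->FinishCommandList(FALSE, &cmdList);

    // ...then execute them, in strict order, on the immediate context in the main thread.
    immediate->ExecuteCommandList(cmdList, FALSE);
    cmdList->Release();
    deferred->Release();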
Posted on 2012-03-08 03:24:27 by Scali
OpenGL also allows you to create multiple contexts, and to share resources across contexts. I'm not saying it can't be done; I'm saying one thread per context means no artificial stalling due to mutexing.

But I see you're already doing that, and the only reason that your windows are not rendering at the same rate is that the OS thread scheduler is not democratic: it performs poorly in terms of load balancing.

So I think my comments were in line.

Posted on 2012-03-08 21:09:11 by Homer

But I see you're already doing that, and the only reason that your windows are not rendering at the same rate is that the OS thread scheduler is not democratic: it performs poorly in terms of load balancing.


Incorrect.
Because as I already said:
1) When using Direct3D on the same videocard, there is no issue with load balancing:
[Screenshot: the same windows using Direct3D, all rendering at an even rate]
I have also done tests with just rendering the windows with a framecounter in the title bar, and no D3D or OpenGL rendering at all. The OS scheduler is not the problem.

2) I have tested on two other systems, which did not show the problem either. Even though they are slower systems, and the framerates were lower on average, the animation remained smooth because the load remained balanced properly.


So I think my comments were in line.


I don't. You were talking about, and I quote: "using multiple threads to render a single frame".
Which is the opposite of my situation: "Rendering multiple frames, with one thread each".
The rest of your post was just Captain Obvious on multithreading. I'm not sure with what intent you posted it, but if you think I didn't already know about that, you're sorely mistaken.

And this post again... If you bothered to READ my earlier post, you'd know that I had already eliminated the OS scheduler as the root cause, because, and I quote:
"This was on a GeForce GTX460. So I figured I'd also try it on some other machines, one with an ATi Radeon X1900XTX, and one with an AMD Radeon HD5770. Both ran more smoothly.
In D3D the GTX460 had no such problem, so I just reported it as a driver issue."

Next time, have the decency to READ a post before you reply.
Posted on 2012-03-09 01:40:59 by Scali
I am absolutely certain that Windows DirectX would be doing a GetDC(windowhandle) under the hood, because it's GOT to query for the rendering surface before it can write to it; OpenGL is just exposing this factoid.
I like that you bothered to re-post about the thread latency issue, but blaming the drivers doesn't mean s***. They are chained: are they app level? Running already and concurrent with the OS (bet there's still a balancer on it)? Surely you took some time to qualify your results and didn't merely observe?
Posted on 2012-03-09 05:31:52 by Homer

I am absolutely certain that Windows DirectX would be doing a GetDC(windowhandle) under the hood, because it's GOT to query for the rendering surface before it can write to it; OpenGL is just exposing this factoid.


No it doesn't.
A DC is just a GDI-specific object.
DirectX is implemented at the same level as GDI:
http://msdn.microsoft.com/en-us/library/windows/desktop/ff729480(v=vs.85).aspx
In fact, since Vista, GDI is actually running on top of Direct3D, at least, when using Aero.
So there is no reason why you would need a DC object whatsoever for Direct3D. And likewise, you would not need one for OpenGL either, in theory.
As I said, the DC thing is probably a legacy thing. OpenGL has been around on Windows NT longer than Direct3D has, and was originally a software-only implementation. In that light, its ties with GDI are more logical.


I like that you bothered to re-post about the thread latency issue, but blaming the drivers doesn't mean s***. They are chained: are they app level? Running already and concurrent with the OS (bet there's still a balancer on it)? Surely you took some time to qualify your results and didn't merely observe?


Excuse me if I don't play along with your little charade.
Posted on 2012-03-09 06:04:54 by Scali
I perform a bunch of concurrent threaded operations in my framework; each has a spend limit and can bail out early if the limit is breached, subject to the minimum cap on framerate. OpenGL contexts are device contexts; the DC isn't deprecated, it's a thing.
Posted on 2012-03-09 06:21:52 by Homer

OpenGL contexts are device contexts; the DC isn't deprecated, it's a thing.


OpenGL contexts are OpenGL contexts. They are *a* context, not *the* Device Context as defined by GDI and GetDC().
Which should be quite obvious, since GDI and GetDC() are Windows-specific, and OpenGL is not.
And even on Windows, the OpenGL context is NOT the DC (as pointed out, DCs are not persistent, unlike OpenGL contexts). An OpenGL context is referred to by an HGLRC (handle to a GL rendering context) in Windows, not by an HDC.

http://msdn.microsoft.com/en-us/library/dd374379(v=vs.85).aspx
Remarks

A rendering context is not the same as a device context.
Posted on 2012-03-09 06:27:18 by Scali
Under OpenGL, it's like Lua contexts - you create a master context, and register all resources there, and notify the system that you intend to share them with a set of child contexts. So the problems of multithreading (almost) go away.
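On Windows that boils down to wglShareLists() (a minimal sketch; hDC1 and hDC2 are assumed to be two DCs with compatible pixel formats):

    // Create a master and a child context, then share the master's resources
    // (textures, display lists, buffer objects) with the child.
    HGLRC master = wglCreateContext(hDC1);
    HGLRC child  = wglCreateContext(hDC2);
    wglShareLists(master, child);   // call this before the child owns any resources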
Posted on 2012-03-13 01:19:36 by Homer
nVidia just released a driver update: 296.10.
The starvation issue is still there, though.
nVidia has not responded to my bug report yet either.
Posted on 2012-03-14 07:56:21 by Scali
Just got a reply from nVidia. They confirmed the bug and said that the engineers already identified the problem in the driver. A fix will be made available in a future release. Cool!
Posted on 2012-04-05 14:37:49 by Scali
Just installed the new 301.42 driver.
And they fixed my bug! That was quick!
Posted on 2012-05-24 02:20:50 by Scali
very cool!
Posted on 2012-05-25 04:16:10 by Homer