Hello, all!

I'm not good with objects and with english :)

But here is one more crazyness I found about objects.

For some time I was using (well, trying to implement) my own object model.
(there alot "to do" in it, of coz)

Just one thing I've noticed:
When we use VTable, it usually contains some data (for example pointers to some metods, variables, etc)
Each data field is usually DWORD (we are on WIntel platform)

Now the main idea:
If field contains a pointer (to some address), then address is always < 8000 0000h for user mode.
So we surelly know, that the 31st bit must be alwais zero.
Let's say, we need not to save this bit in to memory ;) But anyway we will use DWORD.
So I use this bit for other purposes (in my case this flag shows has my object custom destructor or uses standart one)
It allow me to have smalest object size = DWORD (if it uses only one metod)
And one more, this object can be "suspended" (objects I'm using are automatically executed during main loop)
For this purpose I use the 30th bit as a flag.

Of cource, this "economy" will not give us much space saving if we're using objects that are ~100DWORD size, but if we have 1000000 small objects...

Since I've never seen this technic before, what do you think about it?

P.S. may be :stupid: ?
Posted on 2003-08-24 22:15:12 by S.T.A.S.
For many small objects nothing beat an array. ;)
Posted on 2003-08-24 22:34:07 by bitRAKE

For many small objects nothing beat an array. ;)

Yes, I have "array" of very small "objects" ;)

throw in another DWORD for 32 flags.

Be sure, I'll do so one day...
I mean flags not for "objects" but for "objects manager".
Really I use "big" and "smal" objects at the one time. I just want an universal format, that will work good with both of them (well, "in most cases")

My idea is really very strange, if we will think high level, but hard facts are: my CPU is able to execute up to ~4G instructions per second, but just able to read (theoretically) 2,1Mb/sec (really ~1700). I think It's better to calculate something, than to read/write/move from mem.

If RAM has "no limit" in size, so we shouldn't save it. But cache is not so big, what if i wanna all my objects to be in cache?
Posted on 2003-08-24 23:48:44 by S.T.A.S.

My idea is really very strange, if we will think high level, but hard facts are: my CPU is able to execute up to ~4G instructions per second, but just able to read (theoretically) 2,1Mb/sec (really ~1700). I think It's better to calculate something, than to read/write/move from mem.
Ah, you have more wisdom than you let on! My CPU can execute ~18G instructions per second and similarly disproportional memory speed -- there is no shortage of CPU cycles per memory access. The fact that people are able to test the difference in processors at these speeds is a product of poor compiler technology. Really makes us seem crazy for wanting to do it in ASM. ...then along comes hyperthreading to redeem the clock cycle care of the ASM programmer. ;)
Posted on 2003-08-25 00:09:11 by bitRAKE
Hi, bitRAKE!
What a nice CPU do you have?


..then along comes hyperthreading to redeem the clock cycle care of the ASM programmer. ;)

I know a bit about HT. It seems to me, that Intel has take a look at an old hack of its i8080 (made by Zilog)
Z80 has two sets of registers, and instructions to change beetwin them. It was quite usefull thing...
Hardware swithing, IMHO, will help only C programmers (with Intel compiler, only). Why I need 2 logical "CPU" , if one isn't able to get all the data that it can use? Then the second will just trashe cashe...
I read somewhere ot this board, that multithreading is even worse :o

I made this thread not to say "I know how to do the best object model", I'm just interesting in all possibilites "haw to do someting better". Discussion, IMHO, is one of best ways...

P.S. Just one real thing I've seen on CPU with HT : Zyxel UNO driver completly freezes win XP, when HT is on :grin:
Posted on 2003-08-25 01:31:09 by S.T.A.S.
I have a +2Ghz Athlon XP - it can execute 9 NOP's per cycle. :)
Posted on 2003-08-25 01:49:55 by bitRAKE

I have a +2Ghz Athlon XP - it can execute 9 NOP's per cycle. :)

That's what I just get!
But my code runs on it with the same speed as it does on my previous Athlon1000 :mad:
Thought CPU has at least 60% faster core.
That is why I'm thinking about such "odd" things like this "One more OOP crazyness" :rolleyes:
Posted on 2003-08-25 02:40:08 by S.T.A.S.
Okay, here's another, crazier idea:
parallel tables.

Let's view objects of the same class as a spreadsheet.

Let each row be one object, and each column be an attribute.


XPos YPos ZPos Color
Particle1 .. .. .. ..
Particle2 .. .. .. ..
Patricle3 .. .. .. ..


Now, most object languages arrange the data row-by-row; that is, in memory, they do:
Particle1.XPos
Particle1.YPos
Particle1.ZPos
Particle1.Color
Particle2.XPos
Particle2.YPos
Particle2.ZPos
Particle2.Color
Particle3.XPos
Particle3.YPos
Particle3.ZPos
Particle3.Color

However, AFAIK nothing prevents us from doing:
Particle1.XPos
Particle2.XPos
Particle3.XPos
Particle1.YPos
Particle2.YPos
Particle3.YPos
Particle1.ZPos
Particle2.ZPos
Particle3.ZPos
Particle1.Color
Particle2.Color
Particle3.Color


Considering the fact that SOME methods will need to work with several different objects, but only a few of the attributes. For instance, a particle system's PullAllParticlesDown method will prolly work with only the Y positions of all the particles. By lumping the YPos attributes together, we can make it more likely that, for the method's run, most of the data is in the cache. If a method makes use of 2 or three attributes from several objects, we, as ASM programmers, have the power to arrange the parallel tables so that they sit right next to each other.

While this means that adding new objects is kinda difficult, it also means that adding new ATTRIBUTES is also relatively easier. And there is no real need to reassemble existing methods - adding a new attribute that existing methods don't care about will not rearrange the existing tables. In the standard way of arranging data, at the very least we need to change the length of each record/object, so our method (which handles an array of objects) must be reassembled to, at the very least, change the length of the object.

NOTE however that such a layout is more powerful if, and only if, we are using methods that work with lots of objects but only a few attributes at a time. Most things I can think of fall in here.

Like for instance, when, say, rendering game objects, I care about the position and color, but I do not need to display certain status things that I don't want the player to see. At rendering time, all I care about are the position and color. However, at another time, I will be considering the effects of those status, then I will look at them, but I won't be interested in the position or color. Out of sight, out of mind.
Posted on 2003-08-25 03:02:06 by AmkG
Hi, AmkG!
Very simple and powerful! Like all genious ;)

While this means that adding new objects is kinda difficult

Why not to greate all objects, and mark some of them "suspended"?

And there is no real need to reassemble existing methods - adding a new attribute that existing methods don't care about will not rearrange the existing tables. In the standard way of arranging data, at the very least we need to change the length of each record/object, so our method (which handles an array of objects) must be reassembled to, at the very least, change the length of the object.

This should be posted to "The Crusades", IMHO :alright:


bitRAKE, as I think, AtlonXP2+, is able to execute 9*12,5*133 =15G NOP/sec
Posted on 2003-08-25 23:49:48 by S.T.A.S.

bitRAKE
, as I think, AtlonXP2+, is able to execute 9*12,5*133 =15G NOP/sec
Posted on 2003-08-26 00:16:59 by bitRAKE

My processor is running at 12*166 ~ 2Ghz

Now I see... I was thinking you're talking about AthlonXP 2,0+ Palomino/Thoroughbred (really 1667Mhz)
Sorry for SPAM...

Eight bytes, but how many cycles?

I'm not able to calculate it "in mind" now :(
I'll take a look at AMD docs & CodeAnalyst at home later... :rolleyes:
Posted on 2003-08-26 00:44:31 by S.T.A.S.

Hi, AmkG!
Very simple and powerful! Like all genious ;)


Why not to greate all objects, and mark some of them "suspended"?


What I had in mind exactly ;)

And since you just showed your shortcut for marking unused objects....

Gonna integrate that in that game I'm making which prolly won't see the light of day anyway :rolleyes:



This should be posted to "The Crusades", IMHO :alright:



Why? :eek:
I got in there once and........ I do wanna go in there again!
Posted on 2003-08-26 04:11:16 by AmkG

Now the main idea:

If field contains a pointer (to some address), then address is always < 8000 0000h for user mode.
So we surelly know, that the 31st bit must be alwais zero.
Let's say, we need not to save this bit in to memory ;) But anyway we will use DWORD.
So I use this bit for other purposes (in my case this flag shows has my object custom destructor or uses standart one)
It allow me to have smalest object size = DWORD (if it uses only one metod)
And one more, this object can be "suspended" (objects I'm using are automatically executed during main loop)
For this purpose I use the 30th bit as a flag.

Of cource, this "economy" will not give us much space saving if we're using objects that are ~100DWORD size, but if we have 1000000 small objects...

Since I've never seen this technic before, what do you think about it?
It's not a crazy idea. Also, packing data into redundant areas is not new. The Pentium (and IEEE) floating point format is a prime example. In a nonzero number, the most significant bit is always 1, so the normalized format deletes the MSB, allowing an 8-bit exponent in 32-bit format with 24 significant bits + sign bit (33-bits altogether!) This packing style goes back, at the very least, to the venerable PDP-11 minicomputer.

I have worked out a GC (Garbage Collection) scheme for objects. Each object has an embedded size field because it's simpler and faster to GC than if the size is stored in a class or "type" table. Because I'm allocating in DWORDs, the size has two redundant bits (they are always 0). If I use MOVSD, I can eliminate the two bits by storing the DWORD count instead of the object size. And because the objects can only be allocated in user memory (the lower 2G), there is another redundant bit. I can use one bit to indicate that an object has been collected, and another bit to indicate whether the object contains pointers only or no pointers. The all-or-none pointer property of objects is also a way to simplify GC. The GC will probably be the only code that actually uses this size-and-attributes field.
Posted on 2003-10-01 17:30:59 by tenkey

Okay, here's another, crazier idea:
parallel tables.
Also not a crazy idea. Compilers have been written in HLLs using parallel arrays for symbol tables. The HLLs were languages that didn't have records/structures. These tables were fixed in size because dynamic memory management was not a common feature of HLLs (in those old days). A counter told you how many entries were valid. If you filled up the table and needed to fit in one more entry, you had to give up - lose information or abort the program, take your pick.
Posted on 2003-10-01 18:01:51 by tenkey
Originally posted by tenkey
packing data into redundant areas is not new

It's new for me :rolleyes:

If I use MOVSD, I can eliminate the two bits by storing the DWORD count instead of the object size. And because the objects can only be allocated in user memory (the lower 2G), there is another redundant bit

supposing that your objects should not be 2G in size, I think there are more than 3 bits free in your DWORDs :)

Thanks for sharing your ideas, tenkey :alright:
Posted on 2003-10-02 19:07:47 by S.T.A.S.