I just checked Iczelion's tutes, and tute #11 talks about GlobalAlloc and GlobalLock... I think that's basic enough for my needs atm.
Posted on 2003-04-24 21:23:24 by AnotherWay83
AnotherWay83,

You will find that GlobalAlloc() does a lot of normal stuff fine, and it is well suited to a range of allocation sizes if that's what you need. It's also very simple to use with the GMEM_FIXED flag, as the return value is then the starting address of the block of memory you allocate.

There are other memory allocation functions that can be tailored more precisely to different tasks, so there are plenty of options available when you need them.

Regards,

hutch@movsd.com
Posted on 2003-04-24 22:15:02 by hutch--
msmith, VirtualAlloc is a bad generic allocation routine. Even setting aside implementation details (that it's one of the slower functions - this isn't documented by the interface, but it is), a number of problems are apparent:
*) there's no VirtualReAlloc
*) you get very high alignment, which isn't suitable for a lot of small allocations - also, allocation sizes are rounded to page boundaries.

AnotherWay83, if you need something simple, consider coding your own wrapper around HeapAlloc (then it's not a big deal if you want to make all allocs go to another heap later on, change between zeroed or unzeroed memory, etc), or if you want it even simpler, CoTaskMemAlloc which only takes a single parameter. There's really no excuse for Global/LocalAlloc when it isn't required (ie, some of the clipboard functions say you must use GlobalAlloc memory - I don't know whether this is true or a documentation error, but you'd better stick by the rules).
Posted on 2003-04-25 02:31:34 by f0dder
I don't think I know enough to write my own wrapper around that function, or even quite what a wrapper means :grin:

But another question is, what sort of mnemonics get generated by functions like GlobalAlloc and other memory functions anyway? In 16-bit there were the low-level in, out, ins and outs inside the interrupts... what about in 32-bit asm?

fanks
Posted on 2003-04-25 13:19:38 by AnotherWay83
Well, GlobalAlloc and the like call system memory allocation functions. These do in turn update various system structures, and maybe fiddle around with page tables a little. There is a lot going on when you execute a memory allocation function. But these are just normal instructions - instructions that deal with memory, just like the instructions you use in your programs and it has nothing to do with whether the code uses 16-bit or 32-bit addresses and operands.
Posted on 2003-04-25 15:11:27 by Sephiroth3
Just thought I'd post my 2 cents worth here!

ALL memory allocation functions eventually call VirtualAlloc! Therefore VirtualAlloc is THE FASTEST! Damn ... had a picture from Microsoft to demonstrate this somewhere in MSDN but can't find it now ...

HeapAlloc and GlobalAlloc are now just wrappers (Since about Windows 95)! HeapAlloc and GlobalAlloc are just dinosaurs from old C/C++ DOS (Console) days with some 'ease of use'! GlobalReAlloc and HeapReAlloc call VirtualAlloc for the new size ... move all the data ... and then call VirtualFree on the original allocation! Since we are Assembler programmers and we like things to be as low level as possible ... use VirtualAlloc people :) So the notion of Allocating 200 bytes in HeapAlloc is not what really happens! HeapAlloc uses VirtualAlloc which will align the 200 bytes to the nearest 4K page boundary. So a 4K chunk will be allocated cause that's how Windows allocates memory! In 4K sizes! Even the stack is actually allocated in 4K pages. Do not think VirtualAlloc is slow ... it's the fastest!!!!!! NOTHING can allocate memory faster in Win32 cause they all use it in the end anyway!!!! Damn ... I once told you guys this but nobody caught on!

Sorry I can't give more technical details on the whole process ... but I'm tired and not interested to go info hunting now!

Ciao
Posted on 2003-04-25 19:03:00 by SubEvil
Golly, I feel vindicated now.

But remember, the original question was what is the Win32 equivalent of malloc. For better or worse the vanilla answer is (and remains) VirtualAlloc. The other functions all do something in addition to what malloc does.

As for the reallocate issue, my original example does exactly that. I thought low level was king here.
Posted on 2003-04-26 17:38:08 by msmith
msmith,

=============================
Golly, I feel vindicated now.
=============================

Grin, glad to see you were not misled. :tongue:

What you tend to get when you are working within an existing operating system is the way it's structured, which is not really adjustable internally, so you basically pick the technique that best suits what you require.

Allocating a block by whatever technique and managing it yourself is very efficient when it's tailored to exactly what you are after, and you often get very good performance as well, so it is a good way to go if that's what you need.

Regards,

hutch@movsd.com
Posted on 2003-04-26 21:07:49 by hutch--
In response to an earlier post by f0dder, here is an excerpt written by him (presumably) at:
-------------------------------------------------------------------
http://f0dder.didjitalyphrozen.com/memalloc.htm

Tests were done with 256meg memory, as 384meg was too big for the static test :). The test consisted of writing one byte to each 4096 bytes of the allocated memory. The idea was to test pagefault overhead of the memory allocation, not memory speed.



VirtualAlloc: 190ms
HeapAlloc: 200ms
mmapped: 230ms
static: 230ms
CoTaskMemAlloc: 200ms
GlobalAlloc: 200ms
-------------------------------------------------------------------
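For anyone who wants to try the idea of the quoted test, here is a minimal C sketch. It is not the original benchmark code (the excerpt doesn't show it): it uses plain malloc as a stand-in allocator and clock() for timing, and the names touch_pages, time_page_touch and TOUCH_STRIDE are made up for illustration. Swapping in VirtualAlloc, HeapAlloc and so on for the malloc call would be the obvious next step on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Assumption: 4096-byte pages, as in the quoted test description. */
#define TOUCH_STRIDE 4096u

/* Write one byte into every TOUCH_STRIDE-sized page of buf.
   Returns the number of pages touched. */
static size_t touch_pages(volatile unsigned char *buf, size_t size)
{
    assert(buf != NULL || size == 0);
    size_t touched = 0;
    for (size_t off = 0; off < size; off += TOUCH_STRIDE) {
        buf[off] = 1;   /* the first write to a page is what faults it in */
        touched++;
    }
    return touched;
}

/* Allocate size bytes with malloc (stand-in allocator), touch every page,
   and return the processor time spent touching, in milliseconds.
   Returns -1.0 if the allocation fails. */
static double time_page_touch(size_t size)
{
    unsigned char *buf = malloc(size);
    if (buf == NULL)
        return -1.0;

    clock_t start = clock();
    (void)touch_pages(buf, size);
    clock_t end = clock();

    free(buf);
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}
```

Note that this times the page touching only, which is what the excerpt describes; it says nothing about how long the allocation call itself takes.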

So much for VirtualAlloc being "one of the slower functions"

I appreciate the contributions of f0dder and others to this forum, but this issue seems to be divisive, causing people to say things that they even disagree with themselves.

Also, for whatever it's worth, fasm's asmwork program uses VirtualAlloc extensively and exclusively.
Posted on 2003-04-26 22:03:30 by msmith
Good information. If for no other reason, as mentioned above, Microsoft recommends not using GlobalAlloc and it is deprecated. That says a lot. Why use something that is going away?
Posted on 2003-04-27 08:02:28 by drhowarddrfine
SubEvil:

ALL memory allocation functions eventually call VirtualAlloc! Therefore VirtualAlloc is THE FASTEST! Damn ... had a picture from Microsoft to demonstrate this somewhere in MSDN but can't find it now ...

Actually, that is not true - I used to think the same, and I proved myself wrong :). if you time VirtualAlloc against Global/Heap/Whatever, you will see it has higher overhead. While the other functions will eventually have to rely on a low-level primitive (I dunno if they call VirtualAlloc, or go directly to a kernel mode equivalent), they can do a lot more managing in ring3 mode without having to transition to ring0. Dunno if it's the 3->0 transition of valloc that makes it slower, but slower it is. (Iirc, it's the same speed on 9x but slower on NT - but I will have to redo a bunch of testing before being able to say this with certainty).

As for the rest of your statements, they're wrong :). If you were right, do you think it would be possible to allocate 65536 16-byte blocks on a 256meg system? And allocating a new block + copying old stuff there? Ouch, that's an expensive way to resize a block if you have a free memory region beneath it.
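To put numbers on that rhetorical question, here is a small C sketch. It uses malloc as a stand-in for a heap allocator (an assumption; the thread is about HeapAlloc and friends), and the helper names are hypothetical. It allocates the 65536 16-byte blocks for real, and separately computes what they would cost if every block really consumed a whole page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Try to allocate count blocks of size bytes each, storing the pointers
   in blocks[]. Returns how many allocations succeeded. */
static size_t alloc_small_blocks(void **blocks, size_t count, size_t size)
{
    assert(blocks != NULL);
    size_t n = 0;
    while (n < count) {
        blocks[n] = malloc(size);
        if (blocks[n] == NULL)
            break;
        n++;
    }
    return n;
}

/* Release the first count blocks handed out by alloc_small_blocks. */
static void free_blocks(void **blocks, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(blocks[i]);
}

/* Bytes consumed if every one of count allocations took a full page. */
static unsigned long long page_per_block_cost(unsigned long long count,
                                              unsigned long long page_size)
{
    return count * page_size;
}
```

With 4096-byte pages, page_per_block_cost(65536, 4096) is 268435456 bytes, i.e. the entire 256meg of the machine in question, for a payload of only 65536 * 16 = 1 MB.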

msmith:

But remember, the original question was what is the Win32 equivalent of malloc. For better or worse the vanilla answer is (and remains) VirtualAlloc. The other functions all do something in addition to what malloc does.

Wrong - actually there's no _direct_ equivalent. VirtualAlloc is far off, and the routines that are closest are HeapAlloc, Global/LocalAlloc, CoTaskMemAlloc. Also, if I am not mistaken, the (outdated!) article at my site tests memory access speed, not allocation speed. There's quite a big difference. VirtualAlloc is fine for _large_ chunks of memory that will not be reallocated, and when you need the protection flags. Generic alloc routine? No!

drhowarddrfine:

Good information. If for no other reason, as mentioned above, Microsoft recommends not using GlobalAlloc and it is deprecated. That says a lot. Why use something that is going away?

Exactly. Most people are probably using it because of its simple syntax, and perhaps because a lot of people have been previously using it. As for the simple syntax: write a wrapper. The only time to use GlobalAlloc would be when the API says you have to (clipboard).

AnotherWay83:
A wrapper... well, instead of


invoke GetProcessHeap
invoke HeapAlloc, eax, HEAP_ZERO_MEMORY, 65536

You would write


invoke MyAlloc, 65536


MyAlloc would then "fill in the blanks" and call HeapAlloc. This is actually what Local/GlobalAlloc do on NT, but with a longer codepath than you can come up with on your own (because they do some flag translation etc).
Posted on 2003-04-27 13:50:42 by f0dder
I took the trouble to check out HeapAlloc.

First it needs a "HeapCreate", whose returned handle you must keep, unless of course you would rather do a 'GetProcessHeap' each time you use HeapAlloc.

You must specify an initial size and a maximum size for the heap. Then you must request memory from the heap (with its predefined maximum size) by calling HeapAlloc.

If you think that this set of operations is functionally equivalent to malloc(), I give up.

What do you do when you need to allocate more memory than max size of the HeapCreate? There are no issues like that with malloc or VirtualAlloc.

Back to the original question:

If you set the address parameter to NULL, and the flags correctly, VirtualAlloc IS the functional equivalent of malloc.

The HeapCreate/GetProcessHeap/HeapAlloc combination is NOT. This combo is not even the conceptual equivalent.

In the process of changing my compiler from c output to asm output, VirtualAlloc and VirtualFree were direct replacements for malloc and free. malloc has no concept of allocating a heap and then asking for chunks of it whatsoever.

Whether one is faster or not was not the question. Equivalence was.

I'm sure that you write great code using your methods. I'm equally sure I could learn a great deal from you and the other prominent contributors to this forum, and I intend to. But when you insist, as you have, on this malloc question... it makes me wonder if the other things you say are correct.
Posted on 2003-04-27 15:24:43 by msmith

If you think that this set of operations is functionally equivalent to malloc(), I give up.

I quote myself: "actually there's no _direct_ equivalent."
Funny thing is, many runtime libs (including the one from microsoft visual studio .net) rely on HeapAlloc. Also, what guarantees does malloc() give? It allocates memory from the heap. So does HeapAlloc (and global/localalloc, etc). VirtualAlloc doesn't, and is actually further away from malloc.

Next, you don't have to GetProcessHeap() in each and every call to HeapAlloc (well, there's actually no documentation about this in PlatformSDK that I've seen - either for or against.) For all I know, you can GetProcessHeap once at start and use that value throughout the application.


What do you do when you need to allocate more memory than max size of the HeapCreate? There are no issues like that with malloc or VirtualAlloc.

HeapCreate, PlatformSDK documentation: "If dwMaximumSize is zero, it specifies that the heap is growable".


If you set the address parameter to NULL, and the flags correctly, VirtualAlloc IS the functional equivalent of malloc.

CoTaskMemAlloc is closer - it only takes one parameter :tongue: .


The HeapCreate/GetProcessHeap/HeapAlloc combination is NOT. This combo is not even the conceptual equivalent.

It's actually a lot closer than VirtualAlloc. VirtualAlloc always aligns at page boundary (for NULL lpAddress, the boundary is even aligned to 64k!), and size is always aligned to page boundary (4k on normal x86). Furthermore, VirtualAlloc works (more or less) directly on the virtual address space, where malloc and HeapAlloc work on "heap memory".
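As a rough illustration of those two roundings, here is a C sketch that works out what a small request turns into. On Windows it asks GetSystemInfo for the page size and allocation granularity; elsewhere it falls back to 4096 and 65536, which are assumed typical x86 values and not queried. The vm_params type and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

typedef struct {
    size_t page_size;     /* unit that allocation sizes are rounded to */
    size_t granularity;   /* alignment of a base address chosen by the system */
} vm_params;

static vm_params query_vm_params(void)
{
#ifdef _WIN32
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return (vm_params){ si.dwPageSize, si.dwAllocationGranularity };
#else
    /* Assumed values for illustration only: 4k pages, 64k granularity. */
    return (vm_params){ 4096, 65536 };
#endif
}

/* Round n up to the next multiple of unit. */
static size_t round_up(size_t n, size_t unit)
{
    assert(unit != 0);
    return (n + unit - 1) / unit * unit;
}

/* Memory a request occupies once its size is rounded to whole pages. */
static size_t committed_size(size_t request, vm_params p)
{
    return round_up(request, p.page_size);
}

/* Address space a request effectively uses up, given that the next
   system-chosen base address must fall on a granularity boundary. */
static size_t address_span(size_t request, vm_params p)
{
    return round_up(request, p.granularity);
}
```

Under the fallback values, a 200-byte request comes out as 4096 bytes of memory and 65536 bytes of address space, which is the mismatch with small malloc-style allocations being described.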

Those observations are directly from PlatformSDK - ie, the public documented "interface" for these functions. Already now, it should be obvious that VirtualAlloc is a poor choice for generic allocations (it's mighty fine when you have special needs, sure... but it's horribly inefficient for small allocs or dynamic allocations, unless you build stuff on top of it).

If you look at the implementation, you will see that VirtualAlloc is also pretty slow, allocation wise (on NT at least). Another good reason to avoid it as a general allocation method.


In the process of changing my compiler from c output to asm output, VirtualAlloc and VirtualFree were direct replacements for malloc and free. malloc has no concept of allocating a heap and then asking for chunks of it whatsoever.

But malloc/free has concepts of page protection, preferred starting address, and allocation type? :rolleyes:


Whether one is faster or not was not the question. Equivalence was.

True. Still can't see how VirtualAlloc is more equivalent than HeapAlloc. Both require more parameters than malloc... and virtualalloc certainly has characteristics that make it less than optimal for generic allocation. Can't remember the speed difference, but it was noticeable - and this was for big blocks of memory, many smaller block allocs will be worse.


I'm sure that you write great code using your methods. I'm equally sure I could learn a great deal from you and the other prominent contributors to this forum, and I intend to. But when you insist, as you have, on this malloc question... it makes me wonder if the other things you say are correct.

I might write great stuff, I dunno. It's mostly private use. Never had complaints about the larger stuff I've written, though (plenty of complaints about my XCOM fixes, but that's mainly because the documentation sucks and lots of non-technical people are using it ;-)). Also, I am only human, and thus make mistakes.

However, insisting that VirtualAlloc is a poor replacement for malloc is _not_ a mistake. Insisting that HeapAlloc is a better replacement is not a mistake either. If you ignore the parameters required, the characteristics of HeapAlloc allocation are much closer to malloc than VirtualAlloc.

The most direct equivalent in win32, both interface and characteristic wise, is CoTaskMemAlloc. It takes a single parameter (bytesize) and returns a memory block. There's *Free and *Realloc too, and it all works on heap alloc (noted by studying implementation, not guaranteed by interface).

I still prefer HeapAlloc, since it gives more control. It gives you the option of using your own heap _if_ you want it, or use the default if you don't care. You get the choice of uninitialized or zero allocated memory, etc. And you're guaranteed that this will allocate from the heap, since it's part of the interface of this function.


When I initially looked at win32 allocation functions, I used VirtualAlloc/Free as malloc/free equivalents for a while - until I realized the characteristics that make this an absolutely wrong thing to do.
Posted on 2003-04-27 15:49:02 by f0dder
I think you may be on to something with the CoTaskMemAlloc/CoTaskMemFree stuff.

In fact the docs say "Allocates a block of task memory in the same way that IMalloc::Alloc does. " That pretty well nails it.

I will change my compiler later to test this out.

Is CoTaskMemAlloc faster than VirtualAlloc?

BTW, I told you I would learn something from you... Thanks!
Posted on 2003-04-27 15:59:52 by msmith
Yes, CoTaskMemAlloc is a much more direct equivalent (interface wise) than VirtualAlloc. I'd still advise a wrapper around HeapAlloc instead, since you get more flexibility that way. Furthermore, HeapAlloc is in kernel32.dll (on NT forwarded to NTDLL.DLL), while CoTaskMemAlloc is in ole32.dll. Ok, ole32.dll should always be in memory and DLL sharing will help, but there's probably still some DLL_PROCESS_ATTACH in ole32, plus some private memory pages etc. Also, CoTaskMemAlloc goes through COM methods, which means a vtable call - HeapAlloc probably has a bit shorter code path. Nothing you should be able to measure, but it still feels better using the "faster" method :-). IMalloc memory is allocated from the heap anyway (as far as I know - a heap memory spy should verify this). Whether it allocates this via HeapAlloc or some other method (probably NTDLL call, or "something ugly" on 9x) is another case though.

Yes, (on NT at least), CoTaskMemAlloc is faster in allocation than VirtualAlloc, and you don't have those nasty problems of 64k align and 4k allocsize roundup (those can of course be desirable attributes for some types of allocation, but for generic allocation _not_). Note that speed is allocation speed - access speed is the same. The only thing that differs in access speed (that I know of) are memory mapped files, because of the way they are handled internally - and when backed by the pagefile (-1 filehandle), the speed difference isn't very large (but it's there nevertheless - and memory mapped files are in general slower than working with normal files, there's even larger allocation speed overhead than VirtualAlloc, and on 9x there's the problem of allocating from the GLOBAL memory part).


BTW, I told you I would learn something from you... Thanks!

:-)
Sorry if the first posts seemed a bit cocky/arrogant/whatever. I usually (though not always ;-)) have some reason for stating what I am stating. I'm of course happy when people can correct me, though that requires some technical evidence ;-)

Again, I'd have a look at HeapAlloc if I were you. The wrapper to change it to malloc() interface is a single line, if you do GetProcessHeap() at program startup, and if you look at "what it does", I should think it's the one that has a description closest to that of malloc (basically, "allocate heap memory").
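A C sketch of what such a wrapper could look like, assuming the process heap handle is fetched once at startup. The names my_heap_init, my_malloc, my_zalloc, my_realloc and my_free are hypothetical, and the non-Windows branches exist only so the sketch builds on other systems; they are not part of the technique.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
static HANDLE g_heap;   /* process heap handle, fetched once at startup */
#else
#include <stdlib.h>     /* stand-in allocator for non-Windows builds */
#endif

/* Call once at program startup. Returns non-zero on success. */
static int my_heap_init(void)
{
#ifdef _WIN32
    g_heap = GetProcessHeap();
    return g_heap != NULL;
#else
    return 1;
#endif
}

/* malloc-style interface: the wrapper body really is a single line. */
static void *my_malloc(size_t size)
{
#ifdef _WIN32
    return HeapAlloc(g_heap, 0, size);
#else
    return malloc(size);
#endif
}

/* Same, but asks for zero-initialized memory. */
static void *my_zalloc(size_t size)
{
#ifdef _WIN32
    return HeapAlloc(g_heap, HEAP_ZERO_MEMORY, size);
#else
    return calloc(1, size);
#endif
}

static void *my_realloc(void *p, size_t size)
{
    /* Zero-size realloc is implementation-defined in C; this sketch
       simply disallows it. */
    assert(size != 0);
#ifdef _WIN32
    /* realloc(NULL, n) behaves like malloc(n); handle that case here
       explicitly rather than passing NULL to HeapReAlloc. */
    return p ? HeapReAlloc(g_heap, 0, p, size) : HeapAlloc(g_heap, 0, size);
#else
    return realloc(p, size);
#endif
}

static void my_free(void *p)
{
#ifdef _WIN32
    if (p != NULL)
        HeapFree(g_heap, 0, p);
#else
    free(p);
#endif
}
```

Keeping the heap handle in one place is what makes it cheap to switch all allocations to a private heap later: only my_heap_init would need to change.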
Posted on 2003-04-27 16:14:39 by f0dder
Also, I should add that I will (when time permits, hopefully soon) try to do a rather comprehensive article about win32 memory stuff. There's a lot to cover, like differences between 9x and NT, difference between small and large memory allocs (virtualalloc _might_ turn out faster for "huge" allocs, and HeapAlloc isn't for >256meg allocs anyway), etc. Furthermore, interesting topics such as NT's "zeromemory thread".
Posted on 2003-04-27 16:28:52 by f0dder
I was just trying CoTaskMemAlloc on my compiler, but found that it is in ole32.dll. fasm (at least my copy) has no ole32 include file.

I just assumed it was in kernel32.dll. Someone should teach Microsoft the meaning of 'orthogonal' :-)

I'll get or make one and then try later.

The fact that HeapAlloc is in different dll's on different systems does not suit me because (up 'til now) I don't have to do any checking to see what system I'm compiling/running on during compile.
Posted on 2003-04-27 16:34:57 by msmith

I just assumed it was in kernel32.dll. Someone should teach Microsoft the meaning of 'orthogonal' :-)

Yes, Win32 API is horribly messy, and shows that it has origins in win16. Still beats being limited to libc+posix, and/or always having to mess with IoCtl's on devices to do more spiffy stuff.


The fact that HeapAlloc is in different dll's on different systems does not suit me because (up 'til now) I don't have to do any checking to see what system I'm compiling/running on during compile.

Don't worry, for all you have to care it's located in kernel32.dll - on NT, a method known as export forwarding (sorry if my terminology is wrong) is used to automatically and transparently forward this to NTDLL.RtlAllocateHeap. Export forwarding is also a reason why some people's manual GetProcAddress funcs fail :)
Posted on 2003-04-27 16:38:51 by f0dder