Excerpt from the SDK docs suggesting use of the Heap functions instead of the Global memory functions:

From GlobalRealloc:
...
Note The global functions are slower than other memory management functions and do not provide as many features. Therefore, new applications should use the heap functions. However, the global functions are still used with DDE and the clipboard functions.
...


From IStream:
...
Note If you are creating a stream object that is larger than the heap in your machine's memory and you are using an HGLOBAL handle to a global memory object, the stream object calls the GlobalRealloc method internally whenever it needs more memory. Because GlobalRealloc always copies data from the source to the destination, increasing a stream object from 20 megabytes (MB) to 25 MB, for example, consumes immense amounts of time. This is due to the size of the increments copied and is worsened if there is less than 45 MB of memory on the machine because of disk swapping.

The preferred solution is to implement an IStream method that uses memory allocated by VirtualAlloc instead of GlobalAlloc. This can reserve a large chunk of virtual address space and then commit memory within that address space as required. No data copying occurs and memory is committed only as it is needed.
...
Posted on 2002-06-27 06:05:36 by gfalen
Nothing new, sorry :rolleyes:

Ask f0dder, he swears by HeapAlloc :tongue:
Posted on 2002-06-27 06:26:46 by bazik
Yep,

There are numerous solutions to memory access depending on what you want to do with it. If you really have to move large amounts around, one viable method is to use a memory mapped file and handle your own paging.

OLE string memory is fast and the CoTask## memory functions work well. VirtualAlloc is fine if you don't mind using virtual memory mixed with physical memory, but note that the older GlobalAlloc family of memory functions has finer granularity and, once the memory is allocated, no speed problems at all.

Regards,

hutch@movsd.com
Posted on 2002-06-27 06:47:33 by hutch--
On NT, Local/GlobalAlloc symbols refer to the exact same address. Ie, they
are 100% identical. It :) calls HeapAlloc internally, after some parameter
conversions. Dunno really why this should be much slower than HeapAlloc
directly, but sure there's a bit extra code. Might be a different matter on
9x, there's so much suckiness there.

Using memory mapped files for memory allocation is not a good idea; firstly,
you'll get exceptions 'every now and then'. Probably not on 4k boundaries
(the pagefault handler ought to be a bit smarter than that), but you *will*
get pagefaults, which slows things down. Furthermore, mmaps that aren't backed
by a file handle are always backed by the pagefile. I assume this means that
writes to the memory *will* go to the pagefile whether your app needs to
be swapped out or not - not tested, though.


OLE string memory is fast, the CoTask## memory functions work well, VirtualAlloc
if you don't mind using virtual memory mixed with physical memory but note that
the older GlobalAlloc family of memory functions have finer granularity and once
it is allocated, it has no speed problems at all.

All win32 memory allocation functions (which means ring3 - but I dunno of any
ring0 win32 api memory allocations ;)) use virtual memory. This doesn't
mean your stuff will necessarily go to the pagefile, except of course if your
app needs to be paged out - but you can't avoid that by using CoTask## or
whatever. The closest you can get to unswappable memory in ring3 (that I know
of) is VirtualLock, but be careful and only use it if you know
EXACTLY what you are doing and the implications for the system.

I recommend that people generally use HeapAlloc, as it is 'the preferred win32
way'. It's flexible, can be reallocated, etc. If you need large (multiples
of 4k) buffers that aren't going to be resized (like a system-memory video
backbuffer), go for VirtualAlloc - it's closer to the lowlevel memory allocation
primitives, offers control of page-level protection, and guarantees 4k alignment.
(Well, page alignment anyway, which shouldn't be anything but 4k on x86 win32 -
you can get the system page size with a call to GetSystemInfo if you're paranoid.)
Posted on 2002-06-27 13:10:28 by f0dder
One point to add to f0dder's reply: VirtualLock doesn't work in Win9x (it's documented somewhere).
Posted on 2002-06-28 03:52:38 by japheth
All versions of 32 bit Windows are identical in their GlobalAlloc/LocalAlloc behaviour, as the LOCAL distinction is no longer relevant in the 32 bit flat memory model.

VirtualAlloc is OK for large blocks but very inefficient for large numbers of small blocks in terms of average size/speed and flexibility. OLE string is fast, has fine granularity and is preallocated, so it is also fast in startup terms. The limit under win95b is about 260 meg, so it can handle most things with no problems.

I am surprised that you have had problems with memory mapped files. I have tested them at half a gig and they are reliable and fast to use when you need a large block of aligned memory. I did my testing on win95b, so I guess you must have done something unusual in how you used them.

Regards,

hutch@movsd.com
Posted on 2002-06-28 11:41:54 by hutch--

VirtualAlloc is OK for large block but very inefficient in large numbers
of small blocks, in terms of average size/speed and flexibility

A large number of small blocks would be stupid with VirtualAlloc, as all
allocations are aligned in 4k chunks - as I said, it's good for large blocks
that aren't resized. Slow speed? I doubt that very much ;).


OLE string is fast, has fine granularity and is preallocated
so it is also fast in startup terms.

"preallocated"? Do you mean the memory is touched-to-zero after
allocation, or that windows has a large preallocated pool it
gives memory from? I doubt the latter very much... when you
VirtualAlloc, you can specify whether the pages should be
committed right away, or you can commit pages as you want to.
Whatever flexibility you need.


I am surprised that you have had problems with memory mapped files,
I have tested them on a half a gig and they are reliable and fast
to use when you need a large block of aligned memory.

Then I doubt you have done very much testing. Memory mapped files
generate pagefaults, and that's a fact.
Posted on 2002-06-28 12:04:26 by f0dder
Actually, I've been doing work with memory-mapped files, and I was told it would be the most efficient way to handle large files. The application turned out to use large amounts of memory and became slow, so I changed the code to handle files in 4MB blocks and it got much better. :)
Just my two cents...
Posted on 2002-06-28 14:12:55 by comrade
My 2 cents

From WIN32.hlp, this seems to imply that HeapAlloc should be used when portability is an issue, but HeapAlloc also has an Achilles' heel.

Memory allocated by HeapAlloc is not movable. Because the system cannot compact a private heap, the heap can become fragmented.
A possible use for the heap functions is to create a private heap when a process starts up, specifying an initial size sufficient to satisfy the memory requirements of the process. If the call to the HeapCreate function fails, the process can terminate or notify the user of the memory shortage; if it succeeds, however, the process is assured of having the memory it needs.

:alright:
Posted on 2002-06-28 16:59:29 by IwasTitan
Any memory allocation in win32 can become fragmented. Since
local/globalalloc are the same, and the local/global *lock/*unlock
functions aren't used, these are as hurt by fragmentation as HeapAlloc
(and since Local/GlobalAlloc uses HeapAlloc internally, this is rather
obvious ;)). Yeah, VirtualAlloc can also get fragmented, but since there are
no "floating" allocations in win32 as there were in win16, there isn't
really any problem... unless you have very bad memory handling, this
will not be a problem. Only virtual per-process address space fragmentation
matters anyway; physical fragmentation is more or less nonexistent because
of the wonders of paging. Or rather, the problems with fragmentation are nonexistent.

If you don't *need* a private heap, don't allocate one. Cases where a
private heap can be necessary? Lotsa memory allocations that you need
to free all at once; this could, for instance, be a view of a file in
an editor. But it all depends on your program etc - just that in most programs
there's no need for anything except the default heap, unless you're a slopcoder ;).
Posted on 2002-06-28 21:14:07 by f0dder
f0dder

Thanx for enlightening me on the limitations of WIN32

(something usually not obvious to a newbie in any help file)

The salient point:

Since
local/globalalloc are the same, and the local/global *lock/*unlock
functions aren't used, these are as hurt by fragmentation as HeapAlloc
(and since Local/GlobalAlloc uses HeapAlloc internally, this is rather
obvious )

Wasn't obvious to me

thanx
:alright:
Posted on 2002-06-28 21:42:31 by IwasTitan
I've done some benchmarking tests on memory (read, write and move) using API memory allocation and also using static data allocation:
.data?
Buffer db 1024*1024*x dup (?)
But according to my tests, memory allocated by the API seems very slow
compared to static allocation (about 60% of the speed of static allocation). I've
tried all possible kinds of access (rep, MMX, Jmp, etc.).
My questions are:
1. Why does memory allocated by the API run slower?
2. Is there a way to improve the assembly time when using static
allocation? Anything over 500k takes MASM a very long time
to assemble.

Pradeepan.
Posted on 2002-06-28 22:14:55 by Pradeepan
I would think that any great giant such as Wintel would engineer their interface to suit their needs.

But according to tests, memory allocated by API seems very slow
compared to static allocation


Hey ..welcome to the high level language of wintel

I'm all for debug these days.

:alright:
Posted on 2002-06-28 23:42:00 by IwasTitan
hmmmm,

====
Then I doubt you have done very much testing. Memory mapped files generate pagefaults, and that's a fact.
====

This seems a strange thing to say; ANY memory can generate page faults depending on how you use it. I seriously doubt that the memory mapped file capability in 32 bit Windows was designed to "page fault", so unless there is an unknown bug in the OS code that provides it, the fault lies in the application code.

===
"preallocated"
===

Take it up with Microsoft; they use it for UNICODE string support in later OS versions, and it has been there since win95oem. For your reference, the

"SysAlloc....."
"SysReAlloc...."
"SysFree...."

family of functions is part of the OLE capability in all versions of 32 bit Windows.

Now it finally does not matter how each version supplies the memory to OLE, it is a published interface that does the job.

Regards,

hutch@movsd.com
Posted on 2002-06-28 23:44:17 by hutch--

This seems a strange thing to say, ANY memory can generate page
faults depending on how you use it. I seriously doubt that the
memory mapped file capacity in 32 bit windows was designed to
"page fault" so unless there is an unknown bug in the OS code
to provide it, the fault lies in the application code.

This shows you do not understand how memory mapped files are implemented :).
Obviously the pagefaults aren't "the app has performed an illegal operation
and will be terminated" type PFs, but stuff that is handled internally
by windows.

Let's take the example where an existing file is mapped for read, as this
is the simplest case. You end up with a pointer that gives access
to the file. This might seem like magic, but it is implemented through the
use of pagefaults. When you access a page that hasn't been accessed yet,
a pagefault is generated, and this is handled by the internal windows #PF
handler. This has a lot of checking to do. Is it an unhandled PF? Is it
a PF caused by writing to read-only memory? Is the memory area an MMF?
In the case of an MMF, pages are yanked in from the file, marked as present,
and the read operation is restarted. Luckily it seems that you don't get
a fault per page; rather, "regions" of the file are brought in - otherwise
the performance of MMFs would have been disastrous, rather than just "somewhat
slower than normal file access".

Write access is more complicated, and I haven't studied this in great
detail yet, so I'll be careful not to speak too much of it. But as far
as I've heard/seen/read, it seems that there's some "idle time writeback
thread" that handles updating the physical files.


Now it finally does not matter how each version supplies the memory to OLE,
it is a published interface that does the job.

You were using the words "fast" and "preallocated", and I questioned the "preallocated"
part. If you don't know what you mean by the word, don't use it, and avoid
confusing people.
Posted on 2002-06-29 03:56:32 by f0dder
You seem to be confused about the MMF mechanism and to be making mistakes using it; after having tested an MMF of about half a gig, I had no problems reading or writing to it at all.

Now it seems that you are referring to how it is USED and ALLOCATED in terms of access. Access the wrong address in the wrong way and it will not work, but that is a USER fault, not a fault with MMFs.

Preallocated.

This means JUST what it says; in other terminology, it's what you call a "STRING POOL", something that anyone who codes in BASIC understands well and uses to advantage. "Garbage collection" is another original BASIC concept from long before Windows, and it just keeps going.

When you use OLE string, you are using the STRING POOL that is already allocated by the operating system. Without it UNICODE would not work at all.

It is a mistake to assume that what you don't understand does not exist; this stuff has been around for years, so taking the ostrich approach to what you are not familiar with does not make sense.

Now your argument has been to date,

1. You get errors with MMFs so there is something wrong with MMFs.

2. You do not know how OLE string works so it does not work.

Surely you can do better than that.

Regards,

hutch@movsd.com
Posted on 2002-06-29 05:47:30 by hutch--

You seem to be confusing the MMF mechanism available and making mistakes
using it, after having tested about a half a gig MMF, I had no problems
reading or writing to it at all.

No, I am talking about how MMFs are implemented internally. As I think I
stated previously, the PFs that occur with MMFs are part of the design,
not program bugs. The PFs are totally transparent and handled by the kernel,
ie not the type that crashes your app. But there *are* pagefaults going
on when using MMFs.

1) You get me wrong, and you do not understand how MMFs work.
2) I have not messed with OLE string memory allocation, but I do not
doubt that it works. I doubt, however, that it is any more "preallocated"
than heap/virtualalloc/whatever memory, as that would be a waste of
precious system memory.

I'm in the process of doing "a good deal" of testing and writing better explanations
of the various methods; results should be up before long, when I've
finished writing the text and cleaned up the program source code.

By the way, note that VirtualLock doesn't mean your pages are 100% immune to
being discarded or swapped out - refer to "Inside Microsoft Windows 2000" for
more details.
Posted on 2002-06-29 05:54:57 by f0dder
Here you go.
Posted on 2002-06-29 06:11:17 by f0dder
Some other disadvantages of MMF compared to normal virtual memory are:

- MapViewOfFile will enlarge the paging file by the number of bytes mapped, regardless of how much free physical memory is in the system
- it will reduce the shared memory area on Win9x systems (which is "only" 1 GB)

So Hutch, I would suggest believing what f0dder teaches. As always, he knows what he is talking about.

japheth
Posted on 2002-06-29 07:06:48 by japheth
I just have a question.

If I want to access the data in a large file that is too big to load entirely into memory,

which is better in terms of speed: using an MMF, or implementing the paging myself?

thanks,
doby.
Posted on 2002-06-29 21:28:25 by doby