How do I write to a mmap file?

Simple problem using mmap:
Open mmfile for reading.
OpenNew mmfile1 for writing.
Read contents of mmfile and write them out to mmfile1.

When I try to open the file, I get a "File open failure" (HLA Exception (7)).

I can write out the contents of mmfile to a file using the fileio.write method, but not to the mmap opened one.

Cheers for any assistance.

/Bobby
Posted on 2005-01-21 00:54:44 by BobbyInOz
When you open a memory mapped file for writing, it is important to keep in mind that you cannot expand the file - you *must* open it with the largest size you expect it to grow to, and then truncate the file when you're done.

Also, if you're just doing linear read/write, consider whether normal file I/O isn't a better solution - there's a bit of overhead with MMF, and because of address space limitations there's a limit on how large a view you can have mapped (especially on Win9x!), etc.
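
If it helps, at the Win32 level the write pattern is roughly this (typed from memory, untested, no error checking - hFile, hMapping, pView, maxSize and bytesWritten are just placeholder names, and I'm assuming the usual w.* names for the constants):


w.CreateFileMapping( hFile, NULL, w.PAGE_READWRITE, 0, maxSize, NULL );
mov( eax, hMapping );

w.MapViewOfFile( hMapping, w.FILE_MAP_WRITE, 0, 0, 0 );
mov( eax, pView );

// ...write your output through pView, keeping track of how many bytes...

w.UnmapViewOfFile( pView );
w.CloseHandle( hMapping );

// Cut the file back down to the bytes actually written:
w.SetFilePointer( hFile, bytesWritten, NULL, w.FILE_BEGIN );
w.SetEndOfFile( hFile );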
Posted on 2005-01-21 01:15:02 by f0dder
I have tried opening it with plenty of room. The program never gets past the opening of the file.

Normal file IO is slow, too slow for my application.

Thanks for the response, though.

/b
Posted on 2005-01-21 02:22:58 by BobbyInOz
Hm, normal file I/O too slow? What are your access patterns? From my experience, MMAP I/O only really has an advantage when you have very random read/write access patterns with small amounts of data.

Btw, this might be a bit too obvious - but have you opened the underlying file for write access?
Posted on 2005-01-21 02:43:45 by f0dder
Here is an example from Randall's hlaexamples archive:



program mmap_example;
#include( "stdlib.hhf" )

// Quick program to demonstrate the use of a mmap object.

var
    mmfile: mmap;

begin mmap_example;

    // Initialize memory-mapped file object.

    mmfile.create();

    // Open the mmapEx.asm file.

    mmfile.open( "mmapEx.asm", fileio.r );

    // Get pointer to file's data.

    mov( mmfile.filePtr, edx );

    // Display the contents of the file:

    while( edx < mmfile.endFilePtr ) do

        stdout.putc( (type char [edx]) );
        inc( edx );

    endwhile;

    // Close the file:

    mmfile.close();

    // Clean up the associated object:

    mmfile.destroy();

end mmap_example;
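
For what you describe (copying one mmap file to another), the write side should look about the same. I haven't tried this myself and I'm guessing at openNew's size argument, so check memmap.hhf in the stdlib for the real parameter list - but the idea would be something like:


program mmap_copy_sketch;
#include( "stdlib.hhf" )

var
    src:  mmap;
    dest: mmap;

begin mmap_copy_sketch;

    src.create();
    dest.create();

    src.open( "input.dat", fileio.r );

    // Hypothetical call: you can't grow a mapped file later, so the new
    // mapping has to be opened with the largest size it may ever need
    // (the second argument here is a guess - see memmap.hhf).
    dest.openNew( "output.dat", 1024*1024 );

    // Copy every byte from the source view to the destination view.
    mov( src.filePtr, esi );
    mov( dest.filePtr, edi );
    while( esi < src.endFilePtr ) do

        mov( (type byte [esi]), al );
        mov( al, (type byte [edi]) );
        inc( esi );
        inc( edi );

    endwhile;

    // As f0dder said above, the output file will probably stay at the
    // maximum size unless you truncate it afterwards.
    src.close();
    dest.close();
    src.destroy();
    dest.destroy();

end mmap_copy_sketch;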
Posted on 2005-01-21 14:18:29 by Kain
Thanks for the replies.

I have opened the file for output, yes, or at least tried to. Opening the file for output, whether using openNew or just open, crashes.

I need to deal with the bytes of the file very quickly, changing their order and values. With normal file I/O it is slow, due to the disk access speed on writing.

Kain: Thanks for posting the example. I started down this path using that example. Randall only opens a file for reading and then outputs the contents to stdout. I need to successfully output to another file.

Cheers.

/Bobby
Posted on 2005-01-21 17:13:21 by BobbyInOz
Memory-mapped I/O still goes to disk, so you're still limited by "disk access speed on writing" - but of course, if you're only reading/writing small amounts of data at a time, that is your bottleneck.
Posted on 2005-01-21 17:18:39 by f0dder
You are right about the bottleneck.

There has to be a way, though. What I am doing is reading bytes from one file and then writing them out to multiple files. The problem is that it is TOOOOOO slow.

There has to be a way to speed it up. Look at programs like Ghost, for example. It is very quick in writing bytes to files.
Posted on 2005-01-21 17:24:38 by BobbyInOz
Cache up the bytes before writing, so you write large chunks - that will improve speed a lot. If you only need sequential writes, this should work very well.
Posted on 2005-01-21 17:30:16 by f0dder
Great idea, but I don't know how to do that with HLA.

Would you read in the whole file, or just a few bytes at a time? It seems reading is fairly quick; it's just the writing that is as slow as a wet week.
Posted on 2005-01-21 17:42:54 by BobbyInOz
Well, if there's nothing "strange" about your file access and it's linear (i.e., a file copy), you should process the file in chunks. The best chunk size might be a bit hard to determine, but you could start with 64KB chunks - that will be a massive improvement over 1-byte reads/writes, and won't use up too much memory.
Posted on 2005-01-21 17:52:23 by f0dder
Yeah, I'll take a look at that.

I wrote the program quite some time ago, but it is just too slow for the client.

What I did was open the file and read in three bytes, do my calculations on them, then write out some results to three other files....then loop back and do it again.

The method gave the exact resultant files I needed, but the efficiency was shocking, as you can imagine.

I thought a way to do it was to read the file into an array, then do the calcs, putting the results into three arrays. Then, just spit out the arrays to the files... but I don't know how to write out the data in chunks from an array rather than one byte element at a time.

The solution has to be staring me in the face, and I just don't see it.
Posted on 2005-01-21 18:00:32 by BobbyInOz
You process 3 bytes at a time? OK, read in <n>*3 bytes, process <n>*3 bytes, and write out <n>*3 bytes. I'm not familiar with the HLA API, but I guess it should have some "read" and "write" that let you read/write an arbitrary number of bytes?
Posted on 2005-01-21 18:12:15 by f0dder
Yeah, I can read and write n bytes at a time like:

fileio.read( inputHandle, (type byte RecordFromFile), 3 );   // reads 3 bytes
fileio.write( outputHandleB, RecordFromFile.w1, 1 );         // writes out 1 byte

What I don't know how to do is put a whole lot of 1-byte values into one variable, so that when I write them out they are not added together but rather just concatenated.
Posted on 2005-01-21 18:57:25 by BobbyInOz
If you can give an exact example of what you are trying to do, it would be easier for us to help you with a solution.
Posted on 2005-01-21 21:31:32 by Kain
Kain,

I need to do the following:

1) Read in a file in such a manner that I can deal with it on a byte-by-byte level.

2) Look at the bytes and make some calculations on them. (That part is not an issue)

3) Write the bytes back out to one of a few, let's say 3, files.

I have written a program that works a treat, but it is so slow it is unusable, because I am reading and writing one byte at a time. I'm pretty sure the answer is to read in a whole heap of bytes and write out a bunch of them at a time, but as I am new to HLA I am not sure how to do this.

/Bobby
Posted on 2005-01-23 16:19:01 by BobbyInOz
I'm not sure if it is related, but I have written a library that reads in
configuration files (.ini), scans them for certain headers and keywords,
makes changes in memory and writes the result back to disk. Tests show
that it is reasonably fast (much faster than the Windows API commands
that work on private profile strings).

This is the way I did things:




procedure LoadFile (var hFile:dword ; val Size:dword); // internal procedure

// loads hFile and copies it into a string of size Size
// returns: string pointer in esi
// caller is responsible for freeing string pointer
// caller is responsible for making sure there is no overflow
// this is meant as an internal procedure and as such,
// it is not very user-friendly: no checks and balances

begin LoadFile;

_start: fileio.size(hFile);
mov(eax,ebx);
inc(Size);
str.alloc(Size);
mov(eax,esi);
cmp(ebx,0);
je _nload;

_bgload:
fileio.read(hFile, [esi], ebx);
_nload: mov(0, (type byte[esi+ebx]));
mov(ebx, (type str.strRec[esi]).length);
mov(esi,source_start);
fileio.close(hFile);

end LoadFile;



What the above procedure does is read in the file passed through
a file handle in hFile; the 'Size' parameter is how much you want to
'grow' the file (it has to be at least the size of the file).
It creates an HLA string out of the disk file, meaning that you can
treat it in two ways:

1. As an array of bytes
2. As an actual HLA string... you can use the entire assortment of
the standard library string and pattern functions on it.

It returns the string pointer in esi and in a global variable 'source_start'

Take care that the procedure is 'specialized' in that it has no error-
checking and doesn't preserve registers. I make sure everything is dandy before calling this procedure.

To scan the file in memory:

Just start at esi:

mov( [esi], al );     // al now contains the first character


Now, if you don't need HLA's string processing power, you don't have
to load the file into a string; you can load it into a plain memory block
by using mem.alloc (instead of str.alloc) and removing the two lines after
the _nload label.
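
For instance, a byte-by-byte pass over the loaded block might look like
this (just a sketch - 'fileSize' here stands for whatever fileio.size()
returned before you loaded the file):


mov( source_start, esi );          // start of the file image in memory
mov( fileSize, ecx );              // byte count saved from fileio.size()
while( ecx > 0 ) do

    mov( (type byte [esi]), al );  // current byte in AL
    // ...inspect or change AL; write it back with mov( al, (type byte [esi]) )...
    inc( esi );
    dec( ecx );

endwhile;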

---------------------------------------------------------------------------------

After you make changes to the file, it's easy enough to save it back
to disk:




procedure SaveFile (var File:string); // internal procedure
// Saves an HLA string passed in source_start to a file called <File>


begin SaveFile;

#if(os.linux)
// bug in linux? openNew does not delete the original data on the file.
fileio.delete(File);
#endif

fileio.openNew(File);
mov(eax,hFile);
mov(source_start,esi);
mov((type str.strRec[esi]).length,ecx);
fileio.write(hFile,[esi],ecx);
fileio.close(hFile);

end SaveFile;


Again, this is for strings, but you can modify it to save a plain memory
block instead.
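
Putting the two together, a typical call sequence might look like this
(untested sketch - 'hInput', 'fSize' and the file names are made up, and
remember that Size must be at least the file's size). Declarations:


static
    hInput:  dword;
    fSize:   dword;
    outName: string := "output.dat";


and then in the code:


fileio.open( "input.dat", fileio.r );
mov( eax, hInput );

fileio.size( hInput );          // file size comes back in EAX
mov( eax, fSize );

LoadFile( hInput, fSize );      // string pointer in ESI / source_start
                                // (note that LoadFile also closes hInput)

// ...scan and modify the buffer starting at ESI...

SaveFile( outName );            // writes the whole buffer back to disk

// Remember to str.free the string when you're done with it - as noted
// in the comments above, the caller owns the pointer LoadFile hands back.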
PS. I think the linux bug is fixed in the latest version of HLA; I just
haven't had the time to change the code.

The full source code of this library is available in the hlaexamples
download on Webster, in the users folder. You may be interested
in it as it has various search/replace functions that deal with arrays of bytes.
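
PS2. If you'd rather stay with plain fileio calls and just process the file
in fixed-size chunks, the way f0dder described, the split into your three
output files could look something like this. It's an untested sketch - I'm
assuming fileio.read returns the number of bytes actually read in EAX, the
file names are made up, and it assumes the input length is a multiple of 3:


program split3;
#include( "stdlib.hhf" )

const
    RECS  := 16384;            // 3-byte records per chunk
    CHUNK := RECS * 3;         // bytes read per chunk (48KB)

static
    hIn:    dword;
    hOutA:  dword;
    hOutB:  dword;
    hOutC:  dword;
    inBuf:  byte[ CHUNK ];
    outA:   byte[ RECS ];
    outB:   byte[ RECS ];
    outC:   byte[ RECS ];

begin split3;

    fileio.open( "input.dat", fileio.r );
    mov( eax, hIn );
    fileio.openNew( "outA.dat" );
    mov( eax, hOutA );
    fileio.openNew( "outB.dat" );
    mov( eax, hOutB );
    fileio.openNew( "outC.dat" );
    mov( eax, hOutC );

    forever

        // Assumption: the byte count actually read comes back in EAX.
        fileio.read( hIn, (type byte inBuf), CHUNK );
        breakif( eax < 3 );        // no complete record left

        mov( eax, ecx );           // ecx = bytes in this chunk
        mov( &inBuf, esi );        // esi -> current record
        xor( edx, edx );           // edx = record index

        while( ecx >= 3 ) do

            // ...replace these plain copies with your real calculations...
            mov( (type byte [esi]), al );
            mov( al, outA[ edx ] );
            mov( (type byte [esi+1]), al );
            mov( al, outB[ edx ] );
            mov( (type byte [esi+2]), al );
            mov( al, outC[ edx ] );

            add( 3, esi );
            inc( edx );
            sub( 3, ecx );

        endwhile;

        // One write per output file per chunk instead of one per byte.
        fileio.write( hOutA, (type byte outA), edx );
        fileio.write( hOutB, (type byte outB), edx );
        fileio.write( hOutC, (type byte outC), edx );

    endfor;

    fileio.close( hIn );
    fileio.close( hOutA );
    fileio.close( hOutB );
    fileio.close( hOutC );

end split3;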
Posted on 2005-01-23 17:02:15 by Kain
Thanks for that, Kain.

I will try it as soon as I get back to the office.

I think it will get me headed in the right direction. Sometimes a bit more knowledge about other languages can be a frustrating thing: "I know how to do this in X language".

Will let you know.

Cheers.

/Bobby
Posted on 2005-01-23 18:48:37 by BobbyInOz
Hi All,

The original problem with the mmap file I/O routines has been fixed. The update will appear in HLA v1.75, whenever that appears. For those interested in a quicker fix, just grab the memmap.hla source files in the MISC directory of the HLA Standard Library source code. In openNew, you'll discover that I forgot to pass along the maxSize parameter in the following call:


w.CreateFileMapping
(
    handle,
    NULL,
    w.PAGE_READWRITE,   // openNew is always read/write
    0,
    maxSize,            // Was zero before!!!!!!
    NULL                // Don't provide an internal name.
);


BTW, my measurements suggest that memory-mapped file I/O on large blocks runs at roughly the same speed as stream file I/O. For random access through the file, mmap I/O does a bit better. Most importantly, however, memory-mapped I/O often lets you use better algorithms for accessing blocks of data (which is why mmapped I/O is typically faster). I've seen *very few* cases where mmapped I/O runs slower than stream I/O.

There is, of course, the problem that you're limited in file size (what is it? 256MB or something like that under Windows?)

Cheers,
Randy Hyde
Posted on 2005-01-25 22:43:34 by rhyde
Randy, on Win9x you're limited by the size of the "global data" address space, which is quite limited - I can't remember the size, but it's shared between all processes. On NT, mapped files are per-process and lie within the 2GB private space, which means you can map some 1+GB.
Posted on 2005-01-26 04:19:55 by f0dder