Suppose I have a 20 MB file and need to insert 1 byte at offset 10 in it. Is there a reasonable way to do this without reading the 20 MB tail into memory and rewriting it?
Posted on 2001-08-02 11:24:27 by vit
Sure,

instead of rewriting it in place, copy it to a new file (this may also prevent data loss). ;)
Posted on 2001-08-02 11:49:16 by japheth
You can use the ReadFile and WriteFile APIs.

Read in the first n bytes, insert your data, then read the remaining bytes.

This only works if you know the offset in the file for the inserted data BEFORE you load it. Otherwise you can insert the data while writing the file out, in a similar way.
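
A minimal sketch of that idea, written in Python rather than with the Win32 calls, and streaming into a new file so that only one small buffer is in memory at a time. The function name `insert_by_copy` and the 64 KB chunk size are assumptions made for illustration:

```python
import shutil


def insert_by_copy(src_path, dst_path, offset, data, chunk_size=64 * 1024):
    """Copy src_path to dst_path with `data` inserted at byte `offset`."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Copy the first `offset` bytes in bounded chunks.
        remaining = offset
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                raise ValueError("offset is past the end of the file")
            dst.write(chunk)
            remaining -= len(chunk)
        # Write the new data, then stream the rest of the source after it.
        dst.write(data)
        shutil.copyfileobj(src, dst, chunk_size)
```

The original file is left untouched, so a failure halfway through does not damage it. The whole file is still read and written once.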

Mirno
Posted on 2001-08-02 12:24:38 by Mirno
A-a-a... I don't understand. Where can I copy my data to?
And how?
In my program I often need to INSERT some data at a certain position (it is not a text editor, but something like one). While the file is not too large, I can simply call ReadFile from that position and read the tail of the file into memory. Then I write the new data at that position and write the rest of the file back from memory.
I use the WriteFile function. As far as I know, it only overwrites old data with new data (and advances the file pointer, of course).
But with an enormous file this method becomes... inefficient...


p.s. Sorry for my English.
Posted on 2001-08-02 12:51:08 by vit
The previous posting was for Japheth.
For Mirno:
I know ReadFile and WriteFile.
I use the same method (read n bytes... and so on). But I think the problem of writing files is as old as files themselves. And when the file becomes too large, this method is not efficient. Maybe a better way exists?
Posted on 2001-08-02 13:50:39 by vit
Sorry for giving a vague and imprecise answer, but have you checked the memory-mapped file APIs?
Maybe they can shed some light..
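
For what it is worth, here is a rough sketch of what an insert through a file mapping might look like, using Python's `mmap` module instead of the Win32 mapping calls. The helper name `insert_with_mmap` is an assumption. Note that the tail is still shifted; the mapping only spares you an explicit buffer for it:

```python
import mmap


def insert_with_mmap(path, offset, data):
    """Insert `data` at `offset` by growing the file and shifting the tail."""
    with open(path, "r+b") as f:
        f.seek(0, 2)
        old_size = f.tell()
        if not 0 <= offset <= old_size:
            raise ValueError("offset out of range")
        if not data:
            return
        # Grow the file first so the mapping covers the new length.
        f.truncate(old_size + len(data))
        with mmap.mmap(f.fileno(), 0) as mm:
            # Shift the tail up by len(data); overlapping ranges are handled.
            mm.move(offset + len(data), offset, old_size - offset)
            mm[offset:offset + len(data)] = data
            mm.flush()
```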

Latigo
Posted on 2001-08-02 14:09:37 by latigo
If you know in advance where you will need to insert the bytes, try adding padding bytes at those places in your file.
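
One possible way to organize such padding, sketched in Python. The block layout here (a 2-byte count of used bytes, then the payload, then zero padding up to a fixed capacity) is purely hypothetical, as are the names `make_block` and `insert_in_block`:

```python
import struct

# Hypothetical layout: 2-byte little-endian count of used payload bytes.
HEADER = struct.Struct("<H")


def make_block(payload, capacity):
    """Build a block: header, payload, then zero padding up to capacity."""
    if len(payload) > capacity:
        raise ValueError("payload larger than block capacity")
    return HEADER.pack(len(payload)) + payload + b"\x00" * (capacity - len(payload))


def insert_in_block(path, block_start, capacity, pos, data):
    """Insert `data` at `pos` inside one block; the rest of the file is untouched."""
    with open(path, "r+b") as f:
        f.seek(block_start)
        (used,) = HEADER.unpack(f.read(HEADER.size))
        payload = f.read(used)
        if not 0 <= pos <= used:
            raise ValueError("position outside the used part of the block")
        if used + len(data) > capacity:
            raise ValueError("padding exhausted; the block must be rebuilt")
        new_payload = payload[:pos] + data + payload[pos:]
        # Rewrite only the header and the used part; leftover padding stays as is.
        f.seek(block_start)
        f.write(HEADER.pack(len(new_payload)) + new_payload)
```

Only the one block is rewritten, so the cost depends on the block size, not on how much of the file follows it. Once the padding runs out, you are back to moving data.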
Posted on 2001-08-02 14:32:26 by Dr. Manhattan
Memory mapped files still require moving the data :(. I think the NTFS that win2k uses has some efficient way of doing this, but I'm not sure -- I only scanned (quickly) through an article talking about this. Something about files having "substreams" or something... btw, with most filesystems it would be pretty easy to implement this - at least if you need to insert <allocation size> amount of bytes. Haven't seen any OS allowing this, though :/. Except perhaps for win2k.
Posted on 2001-08-02 19:17:12 by f0dder
If you must insert a byte, you don't have a choice. You must read the tail of the file and write it back to disk, either to the same file or to a new file.
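
The "same file" variant does not need the whole tail in memory at once. A minimal Python sketch, assuming a hypothetical helper `insert_in_place` and a 64 KB buffer, that shifts the tail in chunks starting from the end of the file:

```python
def insert_in_place(path, offset, data, chunk_size=64 * 1024):
    """Insert `data` at `offset`, shifting the tail back to front in chunks."""
    with open(path, "r+b") as f:
        f.seek(0, 2)
        size = f.tell()
        if not 0 <= offset <= size:
            raise ValueError("offset out of range")
        shift = len(data)
        if shift == 0:
            return
        # Work from the end so no chunk is overwritten before it has been read.
        pos = size
        while pos > offset:
            start = max(offset, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            f.seek(start + shift)
            f.write(chunk)
            pos = start
        # The gap at `offset` now holds stale bytes; overwrite it with the data.
        f.seek(offset)
        f.write(data)
```

Memory use is bounded by the chunk size, but the amount of disk I/O is still proportional to the length of the tail, and an interruption partway through leaves the file in an inconsistent state.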

If you are creating a large updatable database, you need to investigate techniques that minimize data movement. In one technique, a smaller "index file" is used to find record data in the larger file. The index file can also define an ordering upon the unordered records stored in the larger file.
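
A rough Python sketch of the index-file idea. The on-disk format (12-byte entries holding an offset and a length) and all function names are assumptions for illustration. Records are only ever appended to the large data file; the small index file defines their logical order:

```python
import os
import struct

# Hypothetical index entry: 8-byte record offset, 4-byte record length.
ENTRY = struct.Struct("<QI")


def load_index(index_path):
    """Return the list of (offset, length) entries in logical order."""
    if not os.path.exists(index_path):
        return []
    with open(index_path, "rb") as f:
        raw = f.read()
    return [ENTRY.unpack_from(raw, i) for i in range(0, len(raw), ENTRY.size)]


def save_index(index_path, entries):
    with open(index_path, "wb") as f:
        for offset, length in entries:
            f.write(ENTRY.pack(offset, length))


def insert_record(data_path, index_path, position, record):
    """Append the record to the data file and splice it into the index."""
    with open(data_path, "ab") as f:
        f.seek(0, 2)
        offset = f.tell()
        f.write(record)
    entries = load_index(index_path)
    entries.insert(position, (offset, len(record)))
    save_index(index_path, entries)


def read_records(data_path, index_path):
    """Read the records back in the order the index defines."""
    records = []
    with open(data_path, "rb") as f:
        for offset, length in load_index(index_path):
            f.seek(offset)
            records.append(f.read(length))
    return records
```

An insert now rewrites only the index, which is far smaller than the data file. The data file itself never has bytes moved.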

There are many variations of the index file -- file systems (e.g., FAT, NTFS) use this basic theme to gain fast access and updating of large disks (the "index files" are called directories).
Posted on 2001-08-02 21:23:30 by tank