I need a procedure that lets me delete multiple garbage strings from a large file. How can I do that? I do not want to open the file every time.
Posted on 2002-04-12 01:12:22 by Zebio

Can you tell us a little bit more about what you are trying to do, so that someone can help you?

Posted on 2002-04-12 01:23:16 by hutch--
Forgot to say Hi All! Sorry.
The general purpose is cleaning garbage strings out of a large file:

- find the specified strings in a large file, delete these
strings from the file, and store the file.
I know that is easy when you want to delete one string by manipulating
the file pointer. But I need multiple searches and multiple string deletions without opening the file every time (in one pass).
Posted on 2002-04-12 01:43:22 by Zebio
you can try to map your file in memory...

steps are:

CreateFile (=filehandle)
CreateFileMapping (=maphandle)
MapViewOfFile (EAX = POINTER TO YOUR FILE (mapaddress))
delete some strings or whatever... note that
you have DIRECT pointer access to your file
like: mov byte ptr [eax],12 will change the first
byte of your file... the rest is up to you and your
algorithmic skills...
UnmapViewOfFile (mapaddress)
CloseHandle (maphandle)
SetFilePointer (new_end of your file)
SetEndOfFile (cut it!)
CloseHandle (filehandle)
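
For readers who want to try the sequence above without writing assembly, here is a minimal sketch in Python. It assumes Python's mmap module as a stand-in for CreateFileMapping/MapViewOfFile and file.truncate() as a stand-in for SetFilePointer + SetEndOfFile. The function name delete_strings_mapped and the demo file are made up for illustration, and the search loop is deliberately simple rather than fast.

```python
import mmap
import os
import tempfile


def delete_strings_mapped(path, garbage):
    """Remove every occurrence of the byte strings in `garbage`; return the new size."""
    garbage = [g for g in garbage if g]           # an empty pattern would never advance
    size = os.path.getsize(path)
    if size == 0 or not garbage:
        return size                               # an empty file cannot be mapped
    with open(path, "r+b") as f:                  # CreateFile
        with mmap.mmap(f.fileno(), 0) as view:    # CreateFileMapping + MapViewOfFile
            read = write = 0
            while read < size:
                # Find the earliest garbage string at or after `read`.
                hits = []
                for g in garbage:
                    pos = view.find(g, read)
                    if pos != -1:
                        hits.append((pos, len(g)))
                start, length = min(hits) if hits else (size, 0)
                keep = start - read
                if keep and read != write:
                    view.move(write, read, keep)  # slide the good data down
                write += keep
                read = start + length             # skip the garbage string
            view.flush()
        # Leaving the inner block unmaps the view and closes the mapping.
        f.truncate(write)                         # SetFilePointer + SetEndOfFile
    return write


# Small demo on a throwaway file (file name and contents are hypothetical).
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"keepJUNKthisXXand thatJUNK!")
    new_size = delete_strings_mapped(demo_path, [b"JUNK", b"XX"])
    with open(demo_path, "rb") as f:
        cleaned = f.read()
```

Note the order: the view is unmapped before the file is cut, matching the step list above.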

good luck
Posted on 2002-04-12 03:31:59 by mob
If you want to *remove* the strings from the file, and thus make
the file smaller, an approach would look something like...

open file
open tempfile
if not(unwantedstring) then copy data
else skip data
close file
close tempfile
delete file
rename tempfile file
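
The tempfile outline above could be sketched roughly like this in Python. The function names, the chunk size and the demo data are assumptions for illustration only. The one detail the outline leaves out is that a garbage string may straddle two reads, so the sketch holds back the last few bytes of each chunk until the next one arrives.

```python
import os
import tempfile


def earliest_hit(buf, garbage, start):
    """Return (position, length) of the first garbage string at or after `start`."""
    best = None
    for g in garbage:
        pos = buf.find(g, start)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, len(g))
    return best


def delete_strings_tempfile(path, garbage, chunk_size=64 * 1024):
    garbage = [g for g in garbage if g]
    if not garbage:
        return
    tail = max(len(g) for g in garbage) - 1       # bytes that might start a split match
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder)   # open tempfile
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            buf = b""
            eof = False
            while not eof:
                chunk = src.read(chunk_size)
                eof = not chunk
                buf += chunk
                # Only data before `limit` is safe to decide on right now.
                limit = len(buf) if eof else max(len(buf) - tail, 0)
                pos = 0
                while pos < limit:
                    hit = earliest_hit(buf, garbage, pos)
                    if hit is None or hit[0] >= limit:
                        dst.write(buf[pos:limit])  # copy data
                        pos = limit
                        break
                    dst.write(buf[pos:hit[0]])     # copy data before the string
                    pos = hit[0] + hit[1]          # skip data
                buf = buf[pos:]
        os.replace(tmp_path, path)                 # delete file + rename tempfile
    except BaseException:
        os.unlink(tmp_path)
        raise


# Demo with a tiny chunk size so strings really do straddle reads.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"aaJUNKbbbJUNKccXXdd")
    delete_strings_tempfile(demo_path, [b"JUNK", b"XX"], chunk_size=3)
    with open(demo_path, "rb") as f:
        cleaned = f.read()
```

If anything fails before the final rename, the original file is still untouched.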

yes, this is a slow process for large files, but it's probably the only
way to really do it (without specialized fileformats) unless you want
to manipulate the file at FileSystem level... which I cannot recommend :)
Posted on 2002-04-12 10:25:22 by f0dder

but it's probably the only way to really do it

huh? maybe you overlooked my post or you
never got experience with file mapping, but
you *CAN* do it with file mapping... and it would
be like 1000x faster and should not be too hard
to implement...

SetFilePointer (*NEW* (decreased) end of file)
SetEndOfFile (cut it!)
Posted on 2002-04-12 11:12:15 by mob
Filemapping will be slower than a readfile/writefile approach, because
file mapping generates a page fault (#PF) the first time each 4k page is touched. The readfile/writefile
method can also be used with a single file, but with large file operations
I generally tend to use a tempfile so nothing is lost in case of an error.
Note that I assume he wants to remove an arbitrary amount of strings from
an arbitrary position in an arbitrary file - there isn't any way to do
this (that I know of) that will not involve (possibly massive) data shuffling.

The smart thing to do is of course to only "copy down data" that is not
used... ie, don't "skip bad string, copy all remaining data, loop".
Rather, "skip bad string(s), copy non-bad string(s) down, loop".
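
To make the difference between the two strategies concrete, here is a small sketch in Python that works on an in-memory buffer and counts how many bytes each strategy has to move. It handles a single bad string for brevity, and the function names and test data are invented for this illustration.

```python
def remove_naive(data, bad):
    """Skip bad string, copy ALL remaining data down, loop."""
    if not bad:
        raise ValueError("bad string must not be empty")
    buf = bytearray(data)
    moved = 0
    pos = buf.find(bad)
    while pos != -1:
        moved += len(buf) - (pos + len(bad))   # everything after the match shifts
        del buf[pos:pos + len(bad)]
        pos = buf.find(bad, pos)
    return bytes(buf), moved


def remove_copy_down(data, bad):
    """Skip bad string, copy only the next run of good data down, loop."""
    if not bad:
        raise ValueError("bad string must not be empty")
    buf = bytearray(data)
    moved = 0
    read = write = 0
    while read < len(buf):
        pos = buf.find(bad, read)
        end = pos if pos != -1 else len(buf)
        keep = end - read
        if keep and read != write:
            buf[write:write + keep] = buf[read:end]   # each good byte moves once
            moved += keep
        write += keep
        read = end + (len(bad) if pos != -1 else 0)
    del buf[write:]                                   # cut off the leftover tail
    return bytes(buf), moved


sample = b"x" * 10 + b"BAD" + b"y" * 10 + b"BAD" + b"z" * 10
naive_result, naive_moved = remove_naive(sample, b"BAD")
smart_result, smart_moved = remove_copy_down(sample, b"BAD")
```

Both functions produce the same output; the second one never moves a good byte more than once, which is what matters once the file has many bad strings.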
Posted on 2002-04-12 11:24:57 by f0dder
Hmm, you could allocate a second buffer and do it like
you said... sure, data shuffling would be involved, but
I thought this would be faster... hmm, a mixture of
mapping, a second buffer and your method would be
the best, I think... but then... I've never done things like
that... have a nice weekend, bye...

open file
map file
allocate another buffer
scan file and copy all ok strings to the 2nd buffer
get the length of the 2nd buffer
overwrite the file buffer
set file pointer to the new decreased length
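
A rough Python sketch of the second-buffer idea above, assuming the mmap module for the mapping and one compiled regular expression to find all garbage strings in a single scan. The function name and demo data are hypothetical. The second buffer costs extra memory, up to the size of the file.

```python
import mmap
import os
import re
import tempfile


def delete_strings_second_buffer(path, garbage):
    """Copy the wanted data to a second buffer, write it back, cut the file."""
    garbage = [g for g in garbage if g]
    size = os.path.getsize(path)
    if size == 0 or not garbage:
        return size
    # One alternation pattern finds every garbage string in a single scan.
    pattern = re.compile(b"|".join(re.escape(g) for g in garbage))
    with open(path, "r+b") as f:                    # open file
        with mmap.mmap(f.fileno(), 0) as view:      # map file
            kept = bytearray()                      # allocate another buffer
            read = 0
            for match in pattern.finditer(view):    # scan file
                kept += view[read:match.start()]    # copy the ok data
                read = match.end()
            kept += view[read:]
            view[:len(kept)] = kept                 # overwrite the file buffer
            view.flush()
        f.truncate(len(kept))                       # new, decreased length
    return len(kept)


# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"one<bad>two<worse>three")
    new_size = delete_strings_second_buffer(demo_path, [b"<bad>", b"<worse>"])
    with open(demo_path, "rb") as f:
        cleaned = f.read()
```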
Posted on 2002-04-12 11:31:37 by mob
Well, filemapping *will* be slower :). Whether it matters depends on what
you are doing. The only real advantage mapping has is that you
can access the file directly, i.e. you don't have to redesign "linear" algorithms
to work on blocks. Also, you can't map much more
than a gig at a time, while readfile/writefile can work with *very*
large files (on NT+NTFS anyway).
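
One common way to live with the mapping size limit is to map the file one window at a time. The Python sketch below assumes the mmap module, whose offset argument must be a multiple of mmap.ALLOCATIONGRANULARITY. The function name and default window size are made up, and it only counts a single byte value, because a multi-byte search would also have to deal with strings that straddle two windows.

```python
import mmap
import os
import tempfile


def count_byte_windowed(path, needle, window=None):
    """Count a one-byte `needle` by mapping the file one window at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    if window is None:
        window = 16 * gran                          # arbitrary default for this sketch
    window = max(gran, (window // gran) * gran)     # offsets must stay aligned
    size = os.path.getsize(path)
    total = 0
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window, size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as view:
                pos = view.find(needle)
                while pos != -1:
                    total += 1
                    pos = view.find(needle, pos + 1)
            offset += length
    return total


# Demo: carriage returns placed at the start, on both sides of a window
# boundary, and at the very end of the file.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.bin")
    gran = mmap.ALLOCATIONGRANULARITY
    data = bytearray(b"a" * (3 * gran + 5))
    for index in (0, gran - 1, gran, 3 * gran + 4):
        data[index] = 0x0D
    with open(demo_path, "wb") as f:
        f.write(data)
    cr_count = count_byte_windowed(demo_path, b"\r", window=gran)
```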
Posted on 2002-04-12 11:56:44 by f0dder
Hmm, I made a program that converts text data files from one format to another. The biggest file I had to deal with was only 3 megs, so I had no need to do any file mapping. I could have just used a couple of buffers; I used memory allocation for practice. The one I made mostly adds stuff to the file, though it does take out any extraneous carriage returns for the heck of it. Anyway, it manipulates files, so to take out info (or add it) all you have to do is adjust the actual filtering. Very simple (I am just learning). I made it for a friend and for practice. Anyway, here's the source if you want to look at it. Like I have said before, I just started, so I don't really know how things are SUPPOSED to be done. I just do it, lol. (Specifically, I know I use the wrong registers for the wrong purposes.) Anyway, have a look, and if it's what you need you're welcome to it.

Also, reading here I found a few interesting things,
like SetEndOfFile. I didn't know that one when I wrote this, so that's another thing I did the wrong way, lol.
Posted on 2002-04-12 15:56:21 by dionysus
If you open a file and stick it in a buffer you're essentially just file mapping. /shrug
Posted on 2002-04-12 18:52:29 by iblis
No you're not iblis, there's quite a difference between reading the
whole file into a buffer and doing file mapping... read up on it :)
Posted on 2002-04-14 11:22:17 by f0dder
I meant on a very basic level. In both scenarios, you're shoving file content into memory.
Posted on 2002-04-14 11:41:13 by iblis
Thing is... if you read the whole file into a buffer, you're reading all
of it at once. Filemapping only reads in the 4k pages you touch.
Also, when you run low on memory, filemapping will discard clean
(unmodified) pages, while read_all_into_buffer has all pages dirty,
so they have to be paged out to disk (=slow). Also, when you modify
a byte in a filemap, it will be written automatically to the disk (of
course cached first), while the buffer method requires a writefile.

So while both methods let you manipulate a file in memory, there
are enough (important) differences that they shouldn't be directly compared.
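
The write-back difference is easy to see in a tiny experiment. This is only a sketch in Python, assuming the mmap module for the mapped case; the file name and contents are invented for the demo.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.bin")
    with open(path, "wb") as f:
        f.write(b"hello world")

    # Buffer method: the change lives only in our private copy until we
    # write it back ourselves.
    with open(path, "rb") as f:
        buf = bytearray(f.read())
    buf[0:1] = b"J"
    with open(path, "rb") as f:
        after_buffer_edit = f.read()     # the file on disk has not changed

    # Mapped method: the same store goes through the mapping to the file,
    # with no explicit write call.
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as view:
        view[0:1] = b"J"
        view.flush()                     # ask for the dirty page to be written now
    with open(path, "rb") as f:
        after_mapped_edit = f.read()
```

After the first edit the file still reads "hello world"; after the second it reads "Jello world".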
Posted on 2002-04-14 11:55:59 by f0dder
Sometimes I forget this is a Windows forum ;) You are right... that's how Windows does it. My mind wandered off and I focused more on the general definition of file mapping rather than the Windows-specific implementation of it.
Posted on 2002-04-14 12:12:57 by iblis