Should I read one byte at a time from a file until I find an end-of-line character, and depend on the Windows cache system?
This would be the simple way, but will it be fast enough?

Or should I create a structure that reads the file a chunk at a time, and route all reads and writes through special routines instead of calling the Windows ones directly? This would be much more complex.

Basically, I need to read a file one line at a time, without using C/C++ to do it.

Any comments would be appreciated.
Posted on 2004-01-07 21:50:16 by Xanatose

For reading text files, the following will work. It is actual code from my compiler to read in a byte of source code.

When this code completes, RawData will contain the data just read. You can then check this var for CR or LF or whatever.

InpFile is an array of file handles, which allows the "include" statement to read in files other than the main one.

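; LN:13213 STATUS=GET InpFile(InpFilPtr),RawData
mov esi,InpFile-(1*4)           ; base of the handle array, adjusted for 1-based indexing
mov eax, dword [InpFilPtr]      ; index of the current input file
shl eax,2                       ; scale by 4 bytes per handle
add esi,eax                     ; esi = address of InpFile(InpFilPtr)
mov [_TmpVec1],esi
mov edi,[_TmpVec1]
mov eax, dword [edi]            ; fetch the file handle
mov [_IOPthNum],eax
mov esi,[RawData]               ; destination for the byte
mov [_XferAddr],esi
mov [XferCount],1               ; ask for one byte
invoke ReadFile,[_IOPthNum],[_XferAddr],[XferCount],XferCount,0 ; XferCount receives the number of bytes read
mov [STATUS],eax                ; nonzero if the call succeeded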

If ReadFile succeeds (STATUS is nonzero) and XferCount comes back as 0, you have reached the end of the file.

You could determine the file size and then read in the whole thing if you prefer, and then scan through the buffer. That would be faster, but would require buffer management.
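
To make the "scan through the buffer" step concrete, here is a minimal sketch of the scanning loop. It is written in C only as compact notation for the logic; each step maps onto a few instructions in asm. The names scan_lines, line_cb, line_stats and stats_cb are made up for this illustration, and the buffer is assumed to already hold the whole file.

```c
#include <stddef.h>

/* Hypothetical callback type: receives a pointer to the start of a line
   and its length, with the line terminator excluded. */
typedef void (*line_cb)(const char *line, size_t len, void *user);

/* Walk buf[0..size) and call cb once per line.
   CR, LF and CR LF are all accepted as terminators.
   Returns the number of lines reported. */
static size_t scan_lines(const char *buf, size_t size, line_cb cb, void *user)
{
    size_t start = 0, count = 0;
    for (size_t i = 0; i < size; i++) {
        if (buf[i] == '\r' || buf[i] == '\n') {
            cb(buf + start, i - start, user);
            count++;
            if (buf[i] == '\r' && i + 1 < size && buf[i + 1] == '\n')
                i++;                    /* swallow the LF of a CR LF pair */
            start = i + 1;
        }
    }
    if (start < size) {                 /* last line has no terminator */
        cb(buf + start, size - start, user);
        count++;
    }
    return count;
}

/* Example callback (also hypothetical): track line count and longest line. */
struct line_stats { size_t lines, longest; };

static void stats_cb(const char *line, size_t len, void *user)
{
    struct line_stats *st = user;
    (void)line;                         /* the text itself is not needed here */
    st->lines++;
    if (len > st->longest)
        st->longest = len;
}
```

The lines are not null-terminated here; the callback gets a pointer and a length, so the buffer does not need to be modified or padded.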

If you are just reading in text to process, the text processing time will likely be long compared to the file stuff, so I just do it the way given in the example.
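
For anyone who does want the chunked approach from the first post, the structure can stay fairly small. The following is a rough sketch, again in C purely as notation. The read hook (read_fn) is an assumption standing in for a thin wrapper around ReadFile, mem_read is a stand-in source that reads from memory so the sketch runs anywhere, and the 4096-byte chunk size is arbitrary. Here LF ends a line and CR bytes are dropped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical read hook: copy up to cap bytes into dst and return the
   count, 0 at end of file. Under Win32 this would wrap ReadFile. */
typedef size_t (*read_fn)(void *src, char *dst, size_t cap);

enum { CHUNK = 4096 };                  /* arbitrary chunk size */

typedef struct {
    read_fn read;
    void   *src;
    size_t  pos, len;                   /* unread bytes are buf[pos..len) */
    char    buf[CHUNK];
} line_reader;

static void lr_init(line_reader *r, read_fn read, void *src)
{
    r->read = read;
    r->src  = src;
    r->pos  = r->len = 0;
}

/* Return the next byte, refilling the chunk when it runs dry; -1 at EOF. */
static int lr_getc(line_reader *r)
{
    if (r->pos == r->len) {
        r->len = r->read(r->src, r->buf, CHUNK);
        r->pos = 0;
        if (r->len == 0)
            return -1;
    }
    return (unsigned char)r->buf[r->pos++];
}

/* Copy the next line into out (null-terminated, at most cap-1 characters,
   any excess is discarded). cap must be at least 1.
   Returns the stored length, or -1 when there are no more lines. */
static long lr_line(line_reader *r, char *out, size_t cap)
{
    size_t n = 0;
    int c = lr_getc(r);
    if (c < 0)
        return -1;
    while (c >= 0 && c != '\n') {
        if (c != '\r' && n + 1 < cap)
            out[n++] = (char)c;
        c = lr_getc(r);
    }
    out[n] = '\0';
    return (long)n;
}

/* Stand-in source for the sketch: serves bytes from a memory block. */
typedef struct { const char *data; size_t size, off; } mem_src;

static size_t mem_read(void *p, char *dst, size_t cap)
{
    mem_src *m = p;
    size_t n = m->size - m->off;
    if (n > cap)
        n = cap;
    memcpy(dst, m->data + m->off, n);
    m->off += n;
    return n;
}
```

The caller only ever sees lr_line; the file is touched once per chunk instead of once per byte, which is the whole point of the extra structure.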

Obviously, if you're doing this in pure asm, you can dispense with most of the vars shown and just use registers.

Hope this helps.

Posted on 2004-01-11 18:06:46 by msmith
For myself, when dealing with files under 5 MB I use the following routine. It is written in GoAsm but should be easy enough to translate. It reads the entire file in one chunk, then passes it to a callback one line at a time. It is very fast (I think around 1.5 million lines/sec), but large files are verboten because it uses physical memory. In this case it will just count the number of lines in the text file.

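ReadFileLines FRAME hFile
    uses esi,edi
    LOCAL nBytes :D

    ; pMem and Lines are assumed to be global dword variables
    invoke GetFileSize,[hFile],ADDR nBytes
    mov edi,eax                 ; edi = file size
    inc eax                     ; one extra byte for a terminating null
    ; VirtualAlloc is assumed here; the buffer is released with VirtualFree below
    invoke VirtualAlloc,NULL,eax,MEM_COMMIT,PAGE_READWRITE
    mov esi,eax
    mov [pMem],eax
    or eax,eax
    jz >>.ERROR

    invoke ReadFile,[hFile],esi,edi,ADDR nBytes,NULL

    ; verify that everything went well
    or eax,eax                  ; ReadFile returns zero on failure
    jnz >L0
    invoke GetLastError
    jmp >>.ERROR
    L0:
    mov eax,[nBytes]
    or eax,eax                  ; nothing read: treat as an error
    jz >>.ERROR

    mov ecx,edi                 ; ecx = bytes left to scan
    mov edi,esi                 ; edi = scan pointer, esi = start of current line
    xor eax,eax
    jmp >L4
    L1:
    mov al,13                   ; lines are split on CR (CR LF text is expected)
    repne scasb
    dec ecx                     ; account for the LF skipped below
    push edi
    push ecx
    ; Handle the last character problem
    mov eax,edi
    sub eax,esi                 ; eax = line length, including the CR if one was found
    cmp B[edi-1],13
    jne >L2
    mov B[edi-1],0              ; replace the CR with a null
    dec eax
    jmp >L3
    L2:
    mov B[edi],0                ; no CR at end of file: terminate in the extra byte
    L3:
    ; Pass the line to the callback routine
    invoke FileCB ,esi,eax
    pop ecx
    pop edi
    inc edi                     ; skip the LF
    mov esi,edi
    L4:
    cmp ecx,0
    jg <L1

    invoke VirtualFree,[pMem],0,MEM_RELEASE
    xor eax,eax
    dec eax                     ; return TRUE (-1)
    RET

    .ERROR:
    push eax                    ; keep the error code across VirtualFree
    invoke VirtualFree,[pMem],0,MEM_RELEASE
    pop eax
    invoke SetLastError,eax

    xor eax,eax                 ; return FALSE; the code is available through GetLastError
    RET
ENDF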

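FileCB FRAME pszString,len
    ; pszString points to the null-terminated line, len is its length
    inc D[Lines]                ; this example only counts the lines
    RET
ENDF
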

Posted on 2004-01-11 20:02:46 by donkey
Posted on 2004-01-12 12:07:53 by Xanatose