Should I read one byte at a time from a file until I find an end-of-line character, and depend on the Windows cache system?
This would be the simple way, but will it be fast enough?

Or should I create a structure that reads the file in chunks, and route all reads and writes through my own routines instead of calling the Windows ones directly? This would be much more complex.

Basically, I need to read a file one line at a time, without using C/C++ to do it.

Any comments would be appreciated.
Posted on 2004-01-07 21:50:16 by Xanatose
Xanatose,

For reading text files, the following will work. It is actual code from my compiler to read in a byte of source code.

When this code completes, RawData will contain the data just read. You can then check this var for CR or LF or whatever.

InpFile is an array of file handles which allows for the "include" statement to read in other files than the main one.



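; LN:13213 STATUS=GET InpFile(InpFilPtr),RawData
mov esi,InpFile-(1*4)        ; base of the handle array, adjusted for a 1-based index
mov eax, dword [InpFilPtr]   ; index of the current input file
shl eax,2                    ; scale by 4 bytes per handle
add esi,eax                  ; esi -> InpFile(InpFilPtr)
mov [_TmpVec1],esi
mov edi,[_TmpVec1]
mov eax, dword [edi]         ; fetch the file handle
mov [_IOPthNum],eax
mov esi,[RawData]            ; destination address, taken from RawData
mov [_XferAddr],esi
mov [XferCount],1            ; request a single byte
invoke ReadFile,[_IOPthNum],[_XferAddr],[XferCount],XferCount,0 ; XferCount receives the number of bytes read
mov [STATUS],eax             ; nonzero on success, zero on failure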


If the read succeeds (STATUS is nonzero) and XferCount=0 afterwards, you have reached the end of the file.

You could determine the file size and then read in the whole thing if you prefer, and then scan through the buffer. That would be faster, but would require buffer management.
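A middle ground between the two is to refill a fixed-size buffer and hand out bytes from it, so ReadFile is called once per chunk rather than once per byte. The following is only a sketch of that idea, assuming FASM with the standard win32a.inc invoke macro and ReadFile imported from kernel32. The names hFile, Buffer, BufPos, BufLen, BUF_SIZE and GetByte are hypothetical and not part of my compiler.

```asm
BUF_SIZE = 4096                         ; hypothetical chunk size

; --- data (place in a writeable section) ---
hFile    dd ?                           ; handle from CreateFile, opened elsewhere
BufPos   dd 0                           ; index of the next unread byte in Buffer
BufLen   dd 0                           ; number of valid bytes currently in Buffer
Buffer   rb BUF_SIZE

; --- code ---
; GetByte: returns the next byte of the file in EAX with CF clear,
; or returns with CF set at end of file or on a read error.
; ECX and EDX are not preserved (ReadFile is stdcall and may change them).
GetByte:
        mov     eax,[BufPos]
        cmp     eax,[BufLen]
        jb      .have_data              ; unread bytes remain in Buffer
        invoke  ReadFile,[hFile],Buffer,BUF_SIZE,BufLen,0
        test    eax,eax
        jz      .no_more                ; ReadFile failed
        cmp     dword [BufLen],0
        je      .no_more                ; zero bytes read: end of file
        xor     eax,eax                 ; restart at the beginning of Buffer
.have_data:
        movzx   edx,byte [Buffer+eax]   ; fetch the byte
        inc     eax
        mov     [BufPos],eax            ; advance the read position
        mov     eax,edx
        clc
        ret
.no_more:
        stc
        ret
```

A line reader would call GetByte in a loop, stop at LF (10) and drop a preceding CR (13).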

If you are just reading in text to process, the text processing time will likely be long compared to the file stuff, so I just do it the way given in the example.

Obviously, if you're doing this in pure asm, you can dispense with most of the vars shown and just use registers.
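As a rough illustration of the register-only version, here is a sketch that assumes the same invoke macro and the same InpFile, InpFilPtr, RawData and XferCount variables as the listing above. The labels read_error and end_of_file are hypothetical and would be defined elsewhere.

```asm
; Sketch only: performs the same one-byte read without the temporary variables.
        mov     eax,[InpFilPtr]             ; 1-based index of the current input file
        mov     eax,[InpFile+eax*4-4]       ; file handle from the array
        invoke  ReadFile,eax,[RawData],1,XferCount,0
        test    eax,eax
        jz      read_error                  ; ReadFile returned zero: failure
        cmp     dword [XferCount],0
        je      end_of_file                 ; success with zero bytes read
```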

Hope this helps.

Mike
Posted on 2004-01-11 18:06:46 by msmith
For files under 5 MB I use the following routine. It is written in GoAsm but should be easy enough to translate. It reads the entire file in one chunk, then passes it to a callback one line at a time. It is very fast (I think around 1.5 million lines/sec with Windows.inc), but large files are verboten because the whole file is held in memory. In this example the callback just counts the number of lines in the text file.

ReadFileLines FRAME hFile

uses esi,edi
LOCAL nBytes :D
LOCAL pMem :D

invoke GetFileSize,[hFile],ADDR nBytes
mov edi,eax
inc eax
invoke VirtualAlloc,NULL,eax,MEM_COMMIT,PAGE_READWRITE
mov esi,eax
mov [pMem],eax
or eax,eax
jz >>.ERRORMEM

invoke ReadFile,[hFile],esi,edi,ADDR nBytes,NULL

; ReadFile returns zero on failure; only then is GetLastError meaningful
or eax,eax
jnz >L0
invoke GetLastError
jmp >>.ERROR
L0:

; an empty read is reported as ERROR_NO_DATA (mov leaves the flags set by "or" intact)
mov eax,[nBytes]
or eax,eax
mov eax,ERROR_NO_DATA
jz >>.ERROR

mov ecx,edi
mov edi,esi
xor eax,eax
cld
jmp >L4
L1:
mov al,13
repne scasb
dec ecx
push edi
push ecx
; Handle the last character problem
mov eax,edi
sub eax,esi
cmp B[edi-1],13
jne >L2
mov B[edi-1],0
dec eax
jmp >L3
L2:
mov B[edi],0
L3:
; Pass the line to the callback routine
invoke FileCB ,esi,eax
;
pop ecx
pop edi
inc edi
mov esi,edi
L4:
cmp ecx,0
jg <L1

invoke VirtualFree,[pMem],0,MEM_RELEASE
xor eax,eax
dec eax
RET

.ERROR
push eax
invoke VirtualFree,[pMem],0,MEM_RELEASE
pop eax
invoke SetLastError,eax

.ERRORMEM
; failure returns 0 so the caller can tell it apart from success, which returns -1
xor eax,eax
ret

ENDF

FileCB FRAME pszString,len

inc D[Lines]

ret

ENDF
Posted on 2004-01-11 20:02:46 by donkey
thanks.
Posted on 2004-01-12 12:07:53 by Xanatose