Is there a limit to how many lines of text a program can output as a console application?

I ask this because while working on my new assembler during the holidays, and since I had include files working, a devilish idea popped into my little head: give the HE_Game to this little assembler to crunch.

Of course I have only a very small number of instructions working, and all the others generated an error :grin:

This filled up my console screen with error messages at a very high rate...
Nevertheless I still wanted to know how "fast" my assembler is in this state.

I also wanted to know whether it would parse the nested includes OK,
and how long it would take: 1 hour or a few minutes :P ?

To my despair I get an error and the application is closed by Windows after a LOT of error messages are printed on the console screen.

I noticed that MASM and TASM abort after a number of errors are reported (a few hundred), and I reported at least 10,000 errors before the program was closed by Windows ...

Of course it might be an error in my code but...

Any ideas?
Posted on 2003-12-28 14:36:53 by BogdanOntanu

The limit on the number of messages reported seems to be a practical matter as I remember even the old C compilers of over 10 years ago used to limit the error reports to about 100.

In almost every case the only error that matters is the first one. Reporting the rest is supposed to be a compiler feature, but it is usually of little use because the later errors tend to follow from the first.

I think you can point data at the console almost forever, so any restriction on the number of lines sent to the console is in the program, not the console.

Maybe it's worth sending the output to a file in the development stages so you can see all of the problems.

Posted on 2003-12-28 19:28:57 by hutch--
Yes, I kind of solved it.

I have added an option in Sol_Asm to output the debug info/error messages to a file, and the crash was still there :( so it was not a problem with the massive console output.

Finally it did "assemble" HE game :) after I made 2 fixes:
1) deallocated the memory I allocated for each include file as I parsed it ... hmmm.
2) eliminated a lot of "./relative/path/" references in HE and changed them into "relative/path/" because they generated a "file not found" error in Sol_Asm.

I think it was the 2nd issue, because all the memory allocated by include files was under 10M.
I guess I should have stopped parsing at a "file not found" error :grin:

Anyway it is NOT a console problem; it was an error in my code.

I think I will limit the number of errors to approximately 100 as well.
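
A minimal sketch in C of what such a cap could look like. The names `MAX_ERRORS`, `error_count` and `report_error` are made up for illustration and are not taken from Sol_Asm; the output stream is a parameter so the same routine can write to the console or to a log file.

```c
#include <stdarg.h>
#include <stdio.h>

#define MAX_ERRORS 100          /* assumed cap, roughly the figure discussed above */

static int error_count = 0;     /* errors reported so far */

/* Print one error to `out`. Returns 1 while the caller may keep parsing,
   0 once the cap has been reached and the caller should stop. */
static int report_error(FILE *out, const char *file, int line,
                        const char *fmt, ...)
{
    va_list args;

    if (error_count >= MAX_ERRORS)
        return 0;               /* already over the cap: stay silent */

    error_count++;
    fprintf(out, "%s(%d): error: ", file, line);
    va_start(args, fmt);
    vfprintf(out, fmt, args);
    va_end(args);
    fputc('\n', out);

    if (error_count == MAX_ERRORS) {
        fprintf(out, "too many errors (%d), giving up\n", MAX_ERRORS);
        return 0;
    }
    return 1;
}
```

The parser loop would check the return value after every report and bail out on 0, which also covers the "file not found" case if that error is treated as fatal.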

:confused: and damn, it took about 2 minutes to assemble the whole HE game
Posted on 2003-12-28 19:50:21 by BogdanOntanu
I am sure you will get it faster once you have more of it done.

If you want to play with the word recognition technique I have automated, email me.

Posted on 2003-12-28 23:15:04 by hutch--
IMO 100 errors is overkill; 50 or 25 is more reasonable, even though IMO a max of 10 should be displayed.
Posted on 2003-12-29 06:13:27 by scientica
Hi Hutch,

thank you

I will surely email you when I get stuck somewhere.
Currently I am just making things work....

I got a speed increase when I resolved an issue with label recognition (I was generating too many wrong labels before).

I am not using strtok()-like functions to recognise words/tokens for now.
Instead I am using a state machine to do byte parsing :grin:
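
To illustrate the general idea (a rough sketch only, not the actual Sol_Asm parser): a three-state scanner that walks the source one byte at a time and records each word as a pointer plus a length. The states, the `;` comment rule and the set of word bytes are assumptions chosen for the example.

```c
#include <stddef.h>

enum scan_state { ST_SKIP, ST_WORD, ST_COMMENT };

/* One token: a slice of the source buffer, not NUL-terminated. */
struct token { const char *start; size_t len; };

static int is_word_byte(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_' || c == '.';
}

/* Scan `n` bytes of `src`, storing at most `max` word tokens in `out`.
   Returns the number of tokens stored. */
static size_t scan_words(const char *src, size_t n,
                         struct token *out, size_t max)
{
    enum scan_state st = ST_SKIP;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)src[i];
        switch (st) {
        case ST_SKIP:                       /* between tokens */
            if (c == ';') {
                st = ST_COMMENT;
            } else if (is_word_byte(c)) {
                if (count == max)
                    return count;           /* output array is full */
                out[count].start = src + i;
                out[count].len = 1;
                st = ST_WORD;
            }
            break;
        case ST_WORD:                       /* inside a word */
            if (is_word_byte(c)) {
                out[count].len++;
            } else {
                count++;                    /* word ended at this byte */
                st = (c == ';') ? ST_COMMENT : ST_SKIP;
            }
            break;
        case ST_COMMENT:                    /* skip to end of line */
            if (c == '\n')
                st = ST_SKIP;
            break;
        }
    }
    if (st == ST_WORD)
        count++;                            /* input ended inside a word */
    return count;
}
```

Unlike strtok(), this never writes into the source buffer, and each token already carries its length.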

Also my tokens are NOT null-terminated strings; they have the string length in front of the string so I can first check for a size match. This speeds up string compares a lot, especially when you are expecting many, many mismatches. Later on I will do a hash table.
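
Roughly like this in C (an illustrative sketch; the `lstr` layout with a separate length field and pointer is an assumption, the real tokens may store the length inline in front of the bytes):

```c
#include <stddef.h>
#include <string.h>

/* Length-prefixed string: the length is known up front, no NUL needed. */
struct lstr { size_t len; const char *text; };

/* Compare lengths first; only equal-length strings reach memcmp. */
static int lstr_equal(const struct lstr *a, const struct lstr *b)
{
    if (a->len != b->len)
        return 0;
    return memcmp(a->text, b->text, a->len) == 0;
}

/* Linear search through a table; returns the index or -1 if not found. */
static int lstr_find(const struct lstr *table, size_t count,
                     const struct lstr *key)
{
    for (size_t i = 0; i < count; i++)
        if (lstr_equal(&table[i], key))
            return (int)i;
    return -1;
}
```

With many mismatches most comparisons end at the length test and never touch the string bytes.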
Posted on 2003-12-29 14:11:39 by BogdanOntanu
From what I can gather after writing a number of test pieces, you can safely separate keywords and user-defined words between two different testing methods.

Keyword lookup seems to be the fastest with a known range of words, while I am open to results on a dynamically determined list of user-defined words.
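
One common way to exploit a fixed keyword set, shown here only as a sketch and not as the technique I have automated: keep the keywords in a sorted table and binary-search it with the standard bsearch(). The keyword list and the name `keyword_index` are placeholders.

```c
#include <stdlib.h>
#include <string.h>

/* Must stay sorted in strcmp() order for bsearch() to work. */
static const char *const keywords[] = {
    "add", "call", "jmp", "mov", "pop", "push", "ret", "sub"
};
#define KEYWORD_COUNT (sizeof keywords / sizeof keywords[0])

/* bsearch() passes the key first, then a pointer to a table element. */
static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns the keyword's index in the table, or -1 for a user-defined word. */
static int keyword_index(const char *word)
{
    const char *const *hit = bsearch(word, keywords, KEYWORD_COUNT,
                                     sizeof keywords[0], cmp_keyword);
    return hit ? (int)(hit - keywords) : -1;
}
```

Anything that returns -1 falls through to whatever structure holds the user-defined words.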

I have played with hash tables, as they are supposed to be faster, and have the collision rate down to about 6%, which is supposed to be OK. But a dynamic tree is probably more flexible, as it can be extended more or less forever, whereas a hash table needs to be preset in its member count.

I will certainly be interested in how you are going with this project. As an aside, an interesting idea is the capacity for user-defined preprocessing, as it opens up the possibility of many different front ends for languages that use the assembler back end to build the binary component.

We could end up with BogdanBasic, BogdanPascal and so on which would be a very interesting capacity.

Posted on 2003-12-29 22:02:34 by hutch--
Hi Hutch

Yes, labels, procedures and ASM tokens are currently separated in my design.
I am also thinking of separating "data types", macros and directives.

TASM uses a hash table that can be enlarged from the command line; in HE I have not crossed its 32768-symbol limit yet :grin:

When I have some interesting progress I will send you a first version demo to test
and/or I will post an executable here or in the MASM forums.

Let me know if you want just what I have now for testing (do not expect much).

About preprocessing I do not quite understand... do you mean:
1) a preprocessor / features in front of the assembler engine,
2) or the ability to change the parsing rules and the tokens via external files, and in doing so turn the whole engine into a BASIC or Pascal or whatever language compiler?

Let me know whatever you might need from an assembler/parser, and when I reach that point in development I will very seriously consider implementing it...
Posted on 2003-12-29 22:25:59 by BogdanOntanu
I wonder why the tasm guys didn't use auto hashtable resize? :-s
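
For what it's worth, a rough sketch of what I mean by auto resize: an open-addressing table that doubles itself once it gets about 75% full. Everything here (names, FNV-1a as the hash, the 75% threshold, storing key pointers without copying) is an assumption for illustration and says nothing about how TASM does it.

```c
#include <stdlib.h>
#include <string.h>

struct symtab {
    const char **keys;  /* NULL marks an empty slot; strings are not copied */
    size_t cap;         /* slot count, must be a power of two */
    size_t used;        /* number of stored keys */
};

static size_t hash_str(const char *s)           /* 32-bit FNV-1a */
{
    unsigned long h = 2166136261UL;
    while (*s) {
        h ^= (unsigned char)*s++;
        h = (h * 16777619UL) & 0xFFFFFFFFUL;
    }
    return (size_t)h;
}

static int symtab_init(struct symtab *t, size_t cap)
{
    t->keys = calloc(cap, sizeof *t->keys);
    t->cap = t->keys ? cap : 0;
    t->used = 0;
    return t->keys != NULL;
}

/* Linear probing; returns 1 if the key was stored, 0 if already present. */
static int place(const char **keys, size_t cap, const char *key)
{
    size_t i = hash_str(key) & (cap - 1);
    while (keys[i]) {
        if (strcmp(keys[i], key) == 0)
            return 0;
        i = (i + 1) & (cap - 1);
    }
    keys[i] = key;
    return 1;
}

/* Double the table and re-insert every key. Returns 0 if out of memory. */
static int symtab_grow(struct symtab *t)
{
    size_t ncap = t->cap * 2;
    const char **nkeys = calloc(ncap, sizeof *nkeys);
    if (!nkeys)
        return 0;
    for (size_t i = 0; i < t->cap; i++)
        if (t->keys[i])
            place(nkeys, ncap, t->keys[i]);
    free(t->keys);
    t->keys = nkeys;
    t->cap = ncap;
    return 1;
}

/* Returns 1 if added, 0 if already present, -1 if out of memory. */
static int symtab_add(struct symtab *t, const char *key)
{
    if ((t->used + 1) * 4 > t->cap * 3 && !symtab_grow(t))
        return -1;
    int added = place(t->keys, t->cap, key);
    t->used += (size_t)added;
    return added;
}

static int symtab_has(const struct symtab *t, const char *key)
{
    size_t i = hash_str(key) & (t->cap - 1);
    while (t->keys[i]) {
        if (strcmp(t->keys[i], key) == 0)
            return 1;
        i = (i + 1) & (t->cap - 1);
    }
    return 0;
}
```

The cost is a full rehash on each doubling, but since the size doubles every time the average cost per insert stays constant.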

Bogdan, btw, a cute trick to implement when you need to do string comparisons: inline an equality check on the first byte of the two strings at the call site, before branching off to a strcmp routine. It can give a nice speed boost.
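
In C it can be as small as this (a sketch; `str_equal` is just a name picked for the example):

```c
#include <string.h>

/* Inlined at the call site: most mismatches are rejected by the
   first-byte test and never pay for the strcmp() call. */
static inline int str_equal(const char *a, const char *b)
{
    return a[0] == b[0] && strcmp(a, b) == 0;
}
```

Reading the first byte is safe even for empty strings, since they still hold the terminating NUL.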
Posted on 2003-12-29 22:49:13 by f0dder
Thank you f0dder, very nice idea ;)
Posted on 2003-12-30 02:17:58 by BogdanOntanu

The piece of genius I had in mind is something that is a good idea to set up in the early design stage where you have the flexibility of layout.

It depends on the sequence you had in mind from high level code (mnemonics) down to opcodes, but if you can have an insertion point for different approaches for the front end, it would probably simplify emulating any x86 assembler you liked.

I have seen back ends that do nothing more than mnemonics and labels, and this approach means the front end must do all the conversions from structures and high level stuff like MASM's .IF, PROC/ENDP etc ...

The idea I had in mind was a bit higher up the scale, where you could implement your own macro expansion engine and do any of the high level constructions you liked: block IF, SWITCH, string handling, function call processing with nested function calls, perhaps on-the-fly locals. That would be very useful for intuitive code design while maintaining the entire range of assembler coding with conventional mnemonic usage.

Posted on 2003-12-30 02:24:26 by hutch--

Thank you f0dder, very nice idea

Indeed it is, if done right - it takes a bit more work in straight asm than in C/C++... but even before adding hashtable lookup for symbols, this trick gave a nice speed boost to the quake-c compiler for quake1.
Posted on 2004-01-02 14:21:13 by f0dder
Check the last post in the following thread for hash table strategies:
Posted on 2004-01-05 16:40:21 by tenkey