I'm currently learning some C++ and so far I've managed to create a simple dialog window with an executable size of 2.5kB, not too bad for a compiler :) Anyway, I want to include a HUGE string list in my program but I can't get it to work.

I tried something like this:

char* Names[]=

...but when I did this with several thousand names, the compiler complained about too many sections (?). How can I achieve this without using any extra string/vector classes (I want it to be at as low a level as possible)?
Posted on 2004-05-02 05:23:39 by Delight
Well, one thing that might work is to build the string list into a separate binary file, then use a bin2obj tool, and link the obj to the exe.

Another thing that could work is to place multiple such arrays next to each other and pretend it's one big array (if the compiler doesn't rearrange the order, it should work):

char* Names[] =
char* Names2[]=

Or, you could try another compiler, which can handle the large array without problems (which compiler did you use anyway?).
Posted on 2004-05-02 05:29:58 by Scali
I took a look at the assembly output (I'm using Visual C++ 7.1), and every item in the array has its own segment:

??_C@_06BEDDGHPA@abakus?$AA@ DB 'abakus', 00H ; `string'
; COMDAT ??_C@_07HDKOGCMM@abandon?$AA@
??_C@_07HDKOGCMM@abandon?$AA@ DB 'abandon', 00H ; `string'
; COMDAT ??_C@_04FPHJOJDL@abbe?$AA@
??_C@_04FPHJOJDL@abbe?$AA@ DB 'abbe', 00H ; `string'
; COMDAT ??_C@_09EPBJEOIM@abbedissa?$AA@
??_C@_09EPBJEOIM@abbedissa?$AA@ DB 'abbedissa', 00H ; `string'
; COMDAT ??_C@_07EKCCNIHJ@abborre?$AA@
??_C@_07EKCCNIHJ@abborre?$AA@ DB 'abborre', 00H ; `string'
; COMDAT ??_C@_0M@JDCCLGPD@abborrfiske?$AA@
??_C@_0M@JDCCLGPD@abborrfiske?$AA@ DB 'abborrfiske', 00H ; `string'


Is it possible to force the compiler to generate only one "CONST SEGMENT [...]" for the whole array?
Posted on 2004-05-02 05:37:46 by Delight
Posted on 2004-05-02 05:46:34 by Scali
Too bad :)
Posted on 2004-05-02 05:56:18 by Delight
Weird, Delight - a similar setup only creates a single CONST segment for me. Probably because I didn't enable the read-only string pooling (/GF) switch - turn it off for .cpp files with huge string tables.

There's probably a better way to handle your problem, though - a static char* array with thousands of elements doesn't sound too good :) - explain the situation a bit and perhaps somebody can come up with something?
Posted on 2004-05-02 08:26:57 by f0dder
I need it for a little game I'm doing, it's pretty much like Scrabble, and I need fast access to a large list of strings.
Posted on 2004-05-02 09:39:12 by Delight
Well, turn off string pooling for the file with the strings, and it will work. I would still suggest using some external format, though - you can still have the same lookup speed, and it makes editing of the wordfile simpler.
Posted on 2004-05-02 09:46:37 by f0dder
For fast lookup, you need a hashtable I suppose. Might as well feed the table from an external dictionary file?
Posted on 2004-05-02 09:57:03 by Scali
How do you turn that option off for one specific file?
Posted on 2004-05-02 10:03:10 by Delight
If you're using vs.net, I believe you can right-click that single file in the project view, and tune the compiler settings.
Posted on 2004-05-02 10:15:35 by f0dder
That didn't work either, there are too many items. I think I'll save all names as a resource and then add them to a dynamically created array.
Posted on 2004-05-02 12:03:43 by Delight
Add them to a hashtable
Posted on 2004-05-02 12:20:43 by Scali
Hm, what error did you get after turning off string pooling? Does the asm listing still have a segment per string?

If you're going to use a resource string table, you might as well do it in an external file instead, it will waste less memory.

old_member, doesn't sound like a hashtable would be much of a benefit here? If Delight is addressing the strings like
randomword = stringtable[i];
how would a hashtable help?
Posted on 2004-05-02 12:24:54 by f0dder
Since he said it was like Scrabble, I assumed that it was some kind of dictionary, and that he wanted to quickly check whether an input was a valid word.
So more like: if (hashtable.contains("validword")) { dosomething(); }
But perhaps I misunderstood.
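
A rough sketch of that kind of lookup, assuming the standard library containers are acceptable (which goes against the "lowest level" wish earlier in the thread) and a C++11 compiler, since `std::unordered_set` is not available in Visual C++ 7.1. The names `Dictionary`, `buildDictionary` and `isValidWord` are made up for illustration:

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Assumption: a hash set of std::string stands in for the "hashtable".
typedef std::unordered_set<std::string> Dictionary;

// Copies every word from a plain C string array into the hash set.
Dictionary buildDictionary(const char* const* words, std::size_t count)
{
    Dictionary dict;
    dict.reserve(count);   // avoid repeated rehashing while filling
    for (std::size_t i = 0; i < count; ++i)
        dict.insert(words[i]);
    return dict;
}

// Average constant-time membership test.
bool isValidWord(const Dictionary& dict, const std::string& word)
{
    return dict.find(word) != dict.end();
}
```

With this, the check becomes `if (isValidWord(dict, input)) { dosomething(); }`.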
Posted on 2004-05-02 12:43:29 by Scali
If that's the case, yes, a hashtable would beat a linear search :) - a binsearch wouldn't be bad either, though.

I guessed it was for the generation of the playing field, where it wouldn't really be advantageous. And when checking for correct input, you'd know the exact word to check for (well, two words - down and across).
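
For comparison, a binary search needs nothing beyond `strcmp` and a sorted pointer array. This is only a sketch: `containsWord` is a hypothetical name, and it assumes the table is already sorted in `strcmp` order (a word list in plain alphabetical order with only lowercase ASCII letters would qualify).

```cpp
#include <cstddef>
#include <cstring>

// Binary search over a table of C strings sorted in strcmp order.
// Returns true if key is present. Takes O(log n) string comparisons.
bool containsWord(const char* const* words, std::size_t count, const char* key)
{
    std::size_t lo = 0;
    std::size_t hi = count;            // search range is [lo, hi)
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        int cmp = std::strcmp(key, words[mid]);
        if (cmp == 0) return true;
        if (cmp < 0)
            hi = mid;                  // key sorts before words[mid]
        else
            lo = mid + 1;              // key sorts after words[mid]
    }
    return false;
}
```

For a list of several thousand words this takes a dozen or so comparisons per lookup and needs no extra memory beyond the table itself.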
Posted on 2004-05-02 12:57:03 by f0dder
A hash table seems like a good idea, thanks. However, I've just finished a function that finds all possible words that can be generated from a list of letters - it will still need to go through every word in the list, character by character...
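
One way such a scan is often written (a sketch only, not the function mentioned above) is to count the available letters once per word and then consume them character by character. The names `canSpell` and `findSpellable` are hypothetical, and the code assumes lowercase 'a'..'z' only, so a word list with other letters would need a wider table:

```cpp
#include <cstddef>

// True if `word` can be spelled from the letters in `rack`, using each rack
// letter at most once. Assumption: only lowercase 'a'..'z' are valid.
bool canSpell(const char* word, const char* rack)
{
    int available[26] = {0};
    for (const char* p = rack; *p; ++p)
        if (*p >= 'a' && *p <= 'z')
            ++available[*p - 'a'];
    for (const char* p = word; *p; ++p) {
        if (*p < 'a' || *p > 'z') return false;        // unsupported character
        if (--available[*p - 'a'] < 0) return false;   // letter used up
    }
    return true;
}

// Linear scan: stores pointers to all spellable words in out[] (at most
// maxOut of them) and returns how many were stored.
std::size_t findSpellable(const char* const* words, std::size_t count,
                          const char* rack,
                          const char** out, std::size_t maxOut)
{
    std::size_t found = 0;
    for (std::size_t i = 0; i < count && found < maxOut; ++i)
        if (canSpell(words[i], rack))
            out[found++] = words[i];
    return found;
}
```

This still visits every word, but each word is rejected at its first unavailable letter, so the scan over a few thousand entries stays cheap.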
Posted on 2004-05-02 13:30:17 by Delight
Perhaps declare the strings (static) const.
Posted on 2004-05-02 13:51:27 by death
Hmmm, perhaps you should use another data representation, then! If your "list of letters" has to be traversed sequentially (i.e., "korv" is different from "rovk"), as I would assume, you should probably use some form of search tree... where's Jibz when you need him? ;)
Posted on 2004-05-02 14:05:34 by f0dder
how about:

a binary search tree for finding if the words are valid (make it a red-black tree); then you could try all permutations of a set of letters and see which permutations are valid words.

or a tree where each node has 26 subtrees: you follow the 'a' subtree to match 'a', etc., and it builds a tree of all the words, but that might be huge. so to search "apple" follow the 'a' subtree, then the 'p' subtree, then the 'p' subtree, then the 'l' subtree, etc.

personally i used a red-black tree when i wrote a scrabble thing.
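
The 26-subtree idea is usually called a trie. Below is a minimal sketch of it, assuming lowercase 'a'..'z' only and a C++11 compiler; `TrieNode`, `trieInsert` and `trieContains` are illustrative names, not from any library. Each node holds 26 child pointers, which is why the structure can get large:

```cpp
#include <cstddef>

// One node per prefix; child[i] is the subtree for letter 'a' + i.
struct TrieNode {
    TrieNode* child[26];
    bool isWord;                       // true if a word ends at this node

    TrieNode() : isWord(false)
    {
        for (int i = 0; i < 26; ++i) child[i] = nullptr;
    }
    ~TrieNode()
    {
        for (int i = 0; i < 26; ++i) delete child[i];  // frees the subtree
    }
    // Copying would lead to double deletes, so it is disabled.
    TrieNode(const TrieNode&) = delete;
    TrieNode& operator=(const TrieNode&) = delete;
};

// Adds a word to the trie. Returns false (and changes nothing) if the word
// contains anything other than 'a'..'z'.
bool trieInsert(TrieNode* root, const char* word)
{
    for (const char* p = word; *p; ++p)
        if (*p < 'a' || *p > 'z') return false;
    TrieNode* node = root;
    for (const char* p = word; *p; ++p) {
        int idx = *p - 'a';
        if (!node->child[idx]) node->child[idx] = new TrieNode();
        node = node->child[idx];
    }
    node->isWord = true;
    return true;
}

// Follows one subtree per letter; true only if a complete word ends there.
bool trieContains(const TrieNode* root, const char* word)
{
    const TrieNode* node = root;
    for (const char* p = word; *p; ++p) {
        if (*p < 'a' || *p > 'z') return false;
        node = node->child[*p - 'a'];
        if (!node) return false;       // no word has this prefix
    }
    return node->isWord;
}
```

A side benefit for a letter-rack search is that a whole subtree can be skipped as soon as the next letter is not available, instead of testing every permutation.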
Posted on 2004-05-02 18:58:56 by stormix