How is it that they manage to index everything but still keep the 'index' file so small?
How do they search so fast in the index?

Copernic Desktop Search can display results while you are typing the letters in the search box. How does it manage to search SO fast??

What algorithms do they use?
Can anyone give me even a rough idea? I am really curious.
Posted on 2004-12-18 19:20:53 by clippy
Probably using XML plus a smart search algorithm and sorting.
As for searching while you type: once files are being indexed and the XML has started to build, you search that file while the app keeps updating it.
Posted on 2004-12-19 03:39:18 by wizzra
It is only the search algorithm that I want to know about.
Posted on 2004-12-19 09:29:07 by clippy
What about hashes? The point is to reduce the search space with each new character ;)
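
To illustrate the "reduce the search space with each new character" point, here is a minimal sketch. It is not a hash and not necessarily what any desktop search tool does: it assumes the index is just a sorted list of words and uses binary search to shrink a candidate window per keystroke. The names prefix_range and narrow are made up for the example, and the "\uffff" sentinel assumes no indexed word contains that character.

```python
import bisect

def prefix_range(sorted_words, prefix, lo=0, hi=None):
    """Return (start, end) of the slice inside [lo, hi) whose words start with prefix."""
    hi = len(sorted_words) if hi is None else hi
    start = bisect.bisect_left(sorted_words, prefix, lo, hi)
    # Assumption: "\uffff" sorts after every character used in the words.
    end = bisect.bisect_left(sorted_words, prefix + "\uffff", start, hi)
    return start, end

def narrow(sorted_words, typed):
    """Simulate typing: each new character searches only the previous window."""
    lo, hi = 0, len(sorted_words)
    prefix = ""
    for ch in typed:
        prefix += ch
        lo, hi = prefix_range(sorted_words, prefix, lo, hi)
    return sorted_words[lo:hi]

words = sorted(["copernic", "desktop", "eri", "erik", "erika",
                "eris", "index", "search"])
matches = narrow(words, "erik")
```

Each keystroke costs two binary searches over a window that can only get smaller, which is one way results could appear while you are still typing.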

I have a different idea in mind. I don't remember whether it has a name (or whether it even exists), but suppose you build something like a tree (it has nodes and so on), where each node is a word or a sequence of them. I would rather draw it ;) than try to explain it. Each node has a list of possible leaves or terminators, so each new character keeps lowering the number of possibilities. It would probably be somewhat difficult to program :), but in some ways easy to search. My explanation isn't good, lol, so perhaps an example:

erisd, eri, eris

e: eri, eris, erisd; from these you can construct something like: eri->s|sd. If you then want to insert erik into this structure, you only do: eri->s|sd|k

Now suppose you want to add eriktatu, erikko, erika. Because the structure above can already form the start (erik), you only need to build what is left (tatu, ko, a), and at the end you will have k->tatu|ko|a. This k-> acts like the start eri (as you can see, the definition is a little recursive).
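
A structure like this does exist: it is usually called a trie (prefix tree), or a radix tree when runs of characters are merged into a single edge. Below is a rough sketch of a radix tree in Python, only to show that the idea is programmable. The names Node, insert and words_with_prefix are invented for the example, and unlike the hand-drawn layout above this version also merges s and sd into s->d.

```python
class Node:
    def __init__(self):
        self.children = {}    # edge label (string fragment) -> Node
        self.is_word = False  # True if the path from the root spells a stored word

def common_prefix_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def insert(root, word):
    node = root
    while word:
        for label, child in list(node.children.items()):
            n = common_prefix_len(label, word)
            if n == 0:
                continue
            if n < len(label):
                # Split the edge: the shared part stays, the rest moves down.
                mid = Node()
                mid.children[label[n:]] = child
                del node.children[label]
                node.children[label[:n]] = mid
                child = mid
            node, word = child, word[n:]
            break
        else:
            # No edge shares a first character: hang the remainder as a new leaf.
            leaf = Node()
            leaf.is_word = True
            node.children[word] = leaf
            return
    node.is_word = True

def words_with_prefix(root, prefix):
    node, path, rest = root, "", prefix
    while rest:
        for label, child in node.children.items():
            n = common_prefix_len(label, rest)
            if n == 0:
                continue
            if n < len(label) and n < len(rest):
                return []  # the prefix diverges in the middle of an edge
            node, path, rest = child, path + label, rest[n:]
            break
        else:
            return []
    # Collect every stored word below the node we reached.
    found, stack = [], [(node, path)]
    while stack:
        node, path = stack.pop()
        if node.is_word:
            found.append(path)
        for label, child in node.children.items():
            stack.append((child, path + label))
    return sorted(found)

root = Node()
for w in ["erisd", "eri", "eris", "erik", "eriktatu", "erikko", "erika"]:
    insert(root, w)
```

With the words from this thread inserted, words_with_prefix(root, "erik") walks eri, then k, and only has to look at the three fragments hanging under k.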

I am not sure whether the space comes out smaller or larger. With a hash you store "near" repetitions of the same text, because you will have the keys eri, eris, erisd, erik, eriktatu, erikko, erika (35 bytes of characters and 7 pointers). Instead of those keys, here you will have eri->s|sd|(k->tatu|ko|a) (14 bytes of characters and 6 pointers, counting each -> or |).

I haven't invested much time in this, and I suppose such a construction already exists, so I don't need to investigate it :P :) Perhaps the space will be less than with a hash, but the access time will not be direct; it will depend on how many parts a word can be broken into so that a part can be reused as a terminator, a start, or the middle of another word.
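
The byte counts can be checked with a few lines of Python. This is only a sketch of the arithmetic: the nested dict mirrors the hand-drawn layout eri->s|sd|(k->...), it assumes every node marks the end of a stored word (true for this example), and it ignores whatever pointers or per-node overhead a real implementation would add.

```python
words = ["eri", "eris", "erisd", "erik", "eriktatu", "erikko", "erika"]

# The layout from the post, one dict level per "->".
layout = {"eri": {"s": {}, "sd": {}, "k": {"tatu": {}, "ko": {}, "a": {}}}}

def expand(tree, prefix=""):
    """Yield every word spelled by a path through the nested fragments."""
    for frag, sub in tree.items():
        yield prefix + frag
        yield from expand(sub, prefix + frag)

def stored_chars(tree):
    """Count the characters actually stored in the fragments."""
    return sum(len(frag) + stored_chars(sub) for frag, sub in tree.items())

flat_chars = sum(len(w) for w in words)   # characters if every key is stored whole
shared_chars = stored_chars(layout)       # characters stored in the shared layout
```

expand(layout) gives back the same seven words, so nothing is lost by sharing the common starts.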
Posted on 2004-12-19 12:29:51 by rea