Hi,
Has anyone seen an open-source program that automatically creates an abstract? By abstract I mean a set of sentences taken from the abstracted text that contain its key terms. In other words: I have a big file and the program must create an outline of it.
Posted on 2002-01-03 12:37:06 by Maestro
I can honestly say, from my experience on this board... no.

But then again, I don't really know what you're getting at :confused:

To me, an abstract is a short introduction to an essay, covering only the broadest topics...

You're asking for a program that broadly describes text documents??

Or am I off base here?

NaN
Posted on 2002-01-04 00:47:12 by NaN
If I understand you correctly:
Well, I know these programs exist, but they're probably not open source and probably written in C(++). Besides, I don't believe any of them are good (precise) enough to be used for anything other than making descriptions for search engines.

If I misunderstood you and you mean:
Has anyone seen an open-source program that automatically creates an index of a document? An index is a list of the keywords (headings and subheadings) in the indexed text.

Every good word processor can do that, and I believe Corel's suite is open source (but C++ of course).
Sorry if I'm insulting your language skills with this second option, but in the "New post" screen I couldn't see where you're from...
Edit: You didn't specify your location at all :grin:
Posted on 2002-01-04 02:03:10 by Qweerdy
No, I am sorry for my bad English, but I can't describe what I need in two words. When you read books or articles you see a great deal of information. Some of it you need, some you don't, but you have to read it all before you can decide which information you need. It would be great if a program could create a short description of what a paper is about (roughly one sentence for every six or seven).

About the Corel sources: where have you seen them?
Posted on 2002-01-05 04:54:08 by Maestro
You'd get more mileage out of studying (and practising) some good reading
techniques. I gave the Photoreading course by Paul Scheele a try and it was
quite useful (although not living up to its advertised hype of
sub-consciously reading at 25,000 words per minute). It gives some good
advice on how to mentally and physically prepare for a heavy reading session,
how to go about determining whether or not the literature you have is relevant
to the information you're seeking, how to structure the reading of lengthy
documents without getting fatigued and losing interest, etc. It's fantastic
if you're into dry, lengthy, technical material as I am.

It comes with a money back guarantee too. Check it out:
http://www.learningstrategies.com/PhotoReading/Home.html

As for doing it programmatically, you could maybe have a database of categories
and keywords, then a simple parser to run through the document, generate
tokens, and send them to an analyzer that generates statistics based on the query
you requested. I'm sure there's source code available that does something like that.
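Something along these lines, maybe (just a rough Python sketch to show the statistical idea; the stopword list, the frequency scoring, and the one-sentence-in-six ratio are assumptions of mine, not taken from any existing tool):

import re
from collections import Counter

# Assumed stopword list -- extend as needed.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "it", "that", "this", "for", "on", "with", "as", "are"}

def summarize(text, keep_ratio=1.0 / 6):
    # Very rough sentence and word splitting; a real tool would do better.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    words = re.findall(r'[a-z]+', text.lower())
    freq = Counter(w for w in words if w not in STOPWORDS)

    def score(sentence):
        # Score a sentence by the average frequency of its words.
        tokens = re.findall(r'[a-z]+', sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    # Keep roughly one sentence in six, in their original order.
    n_keep = max(1, int(len(sentences) * keep_ratio))
    top = sorted(sorted(sentences, key=score, reverse=True)[:n_keep],
                 key=sentences.index)
    return " ".join(top)

if __name__ == "__main__":
    # "document.txt" is just a placeholder file name.
    with open("document.txt") as f:
        print(summarize(f.read()))

It won't understand anything, of course, but picking the sentences that contain the most frequent non-trivial words gets you surprisingly far for a first pass.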

To go beyond that would be asking the machine to analyse not only the
grammar but also its semantics, and I've not seen anything that could do this in
the way you want. Think about how redundant English (and every other spoken
language) is, and the complex grammar production rules you would have to
create. My advice is to go for the less accurate but more practical
statistical approach.

Have a look at: http://www.antlr.org/

Cheers,
Boggy
Posted on 2002-01-05 06:40:09 by Boggy