[Natural Language Processing, Using Java] Resources
Started by lewis, Sep 28 2008 05:30 AM
9 replies to this topic
#1
Posted 28 September 2008 - 05:30 AM
Hi Everyone,
I don't think I'm an advanced programmer yet; an intermediate one, if I'm lucky.
Still, I thought the following question might be a bit more advanced than
a beginner's question, hence my putting it here.
I wanted to ask whether anybody knows of any books or resources that deal with
natural language processing, information extraction, etc. using Java?
So far, I haven't been able to find much on this.
I have found the following book:
===================================================
Mason O
Programming for Corpus Linguistics : How to do text analysis with Java
Edinburgh University Press
2000
===================================================
Does anyone know of any more literature or resources on this topic?
I also found the following website:
http://aclweb.org/aclwiki/index.php?title=..._NLP/CL_courses
which lists a number of courses that have taught NLP using Java,
but I haven't found many leads there so far.
Can anyone help?
#2
Posted 28 September 2008 - 05:41 AM
You might notice that the book I cited above is about "text analysis".
I'm not sure how much of a difference there is between
"natural language processing" and "text analysis".
I imagine they are similar.
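For illustration, here is a minimal sketch of what "text analysis" in the corpus-linguistics sense can look like in plain Java: counting how often each word occurs in a text. The class name, method name and sample sentence are made up for this example and are not taken from Mason's book or from any library mentioned in this thread.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class, invented for this illustration.
class WordFrequency {

    // Counts how often each word occurs, ignoring case and punctuation.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // TreeMap keeps words sorted
        // Split on any run of characters that is not a letter a-z or an apostrophe.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample = "The cat sat on the mat. The mat was flat.";
        // Prints one "word<TAB>count" line per distinct word.
        count(sample).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Counting surface words like this involves no grammar at all. Natural language processing usually goes further, for example by tagging parts of speech or parsing sentence structure.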
#3
Posted 28 September 2008 - 05:44 AM
Does anyone here work in natural language processing or text analysis using Java?
#4
Posted 28 September 2008 - 10:39 AM
I don't think you should focus on Java.
Look at general books and either put the ideas in them to work in Java or use the language they suggest.
#5
Posted 28 September 2008 - 03:21 PM
Hi,
Kraicheck may well have a point.
But if I could just play devil's advocate for a moment, I found the following
three websites, for example:
http://cache.spyfu.com/Default.aspx?d=2008...ocessing%20java
or
http://nlp.stanford.edu/software/
(The Stanford Natural Language Processing Group)
or
http://opennlp.sourceforge.net/
All of these seem to be examples of NLP implemented in Java.
I just thought that might be interesting.
#6
Posted 28 September 2008 - 06:47 PM
You're not really playing devil's advocate. It's better to learn the theory and understand the concepts of whatever you want to implement in a computer program, regardless of language. When you do properly understand the theory, then implementation in any language will be fairly trivial, assuming a good understanding of whatever language you end up using.
#7
Posted 28 September 2008 - 08:15 PM
What you're talking about is really advanced. NLP is more complicated than the analysis of formal languages (like those used in programming) by compilers. You could download ANTLRWorks, the graphical development environment for ANTLR, which helps you understand grammatical structures by showing them step by step. You need to write (or copy & paste) parser, lexer and a sample of code to analyse.
[Screenshots: Editor Window, Grammar Interpreter, Integrated Debugger, Ambiguous Path Visualization, Decision DFA Visualization]
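To make the lexer/parser distinction concrete: a lexer splits raw characters into tokens, and a parser then checks those tokens against grammar rules. The sketch below does both by hand for a toy arithmetic grammar in plain Java. It does not use ANTLR (which generates this kind of code from a grammar file), and the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy example; hand-written, does not use ANTLR.
class ToyParser {
    // Lexer rule: optional whitespace, then a run of digits or one operator/parenthesis.
    private static final Pattern TOKEN = Pattern.compile("\\s*(\\d+|[-+*()])");

    private final List<String> tokens;
    private int pos = 0;

    ToyParser(String input) {
        this.tokens = lex(input);
    }

    // Lexer: turns the character stream into a list of tokens.
    static List<String> lex(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int end = 0;
        while (m.lookingAt()) {              // match only at the start of the region
            out.add(m.group(1));
            end = m.end();
            m.region(end, input.length());   // continue after the last token
        }
        if (!input.substring(end).trim().isEmpty()) {
            throw new IllegalArgumentException("Unexpected character near position " + end);
        }
        return out;
    }

    private boolean at(String symbol) {
        return pos < tokens.size() && tokens.get(pos).equals(symbol);
    }

    // expr := term (('+' | '-') term)*
    int expr() {
        int value = term();
        while (at("+") || at("-")) {
            String op = tokens.get(pos++);
            int rhs = term();
            value = op.equals("+") ? value + rhs : value - rhs;
        }
        return value;
    }

    // term := factor ('*' factor)*
    int term() {
        int value = factor();
        while (at("*")) {
            pos++;
            value *= factor();
        }
        return value;
    }

    // factor := NUMBER | '(' expr ')'
    int factor() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        String t = tokens.get(pos++);
        if (t.equals("(")) {
            int value = expr();
            if (!at(")")) {
                throw new IllegalArgumentException("Expected ')'");
            }
            pos++;
            return value;
        }
        if (!Character.isDigit(t.charAt(0))) {
            throw new IllegalArgumentException("Unexpected token: " + t);
        }
        return Integer.parseInt(t);
    }

    // Parses and evaluates a whole expression, rejecting leftover tokens.
    static int evaluate(String input) {
        ToyParser p = new ToyParser(input);
        int value = p.expr();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.tokens.get(p.pos));
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(lex("2 + 3 * (4 - 1)"));      // the token list
        System.out.println(evaluate("2 + 3 * (4 - 1)")); // prints 11
    }
}
```

A grammar this small is unambiguous, which is why a few methods are enough. English sentences are often ambiguous in several ways at once, which is one reason NLP toolkits tend to rely on statistical models rather than hand-written rules like these.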
#8
Posted 29 September 2008 - 04:37 AM
But it seems you're showing ANTLR being used to analyse programming languages,
which I also find very interesting (maybe as a way to learn a new programming language?).
What I'm thinking about at this stage, though, is using it to analyse
English text.
So you're suggesting that ANTLR can be used for that too?
And that I "need to write (or copy & paste) parser, lexer and a sample of code to analyse"?
How do you copy and paste a parser and lexer?
I don't quite understand this sentence of Ami's.
And instead of writing my own parser and lexer, what about trying one that,
say, the Stanford NLP Group has put out, i.e. simply using a finished product
"off the shelf"?
I realise I'd learn more about NLP by writing my own code after learning
the concepts involved. It's just that it would take me a lot of time,
even though it would be good to do.
Are you suggesting ANTLR as one such "off the shelf" product?
And would I be able to have a look at the code for OpenNLP,
maybe to see how they do it, as a kind of "coding by example"?
Once again, I'm not arguing with the idea that, if I want to learn how to
program NLP, it would be good to learn the concepts first and the
programming language later.
#9
Posted 29 September 2008 - 08:15 AM
I am having a look at JTextPro, which you can find here:
http://jtextpro.sourceforge.net/
(What about OpenNLP as well?)
You'll have to download it, if you want to have a look at it.
It's about 23MB.
I'm trying to run the program and am a bit stuck.
(I hope it's OK to ask about this here?)
I had a look at the readme file, and it has a few instructions, including, I think,
some on makefiles. I'm only a beginner programmer, so this is a bit hard for me.
Basically, I would just like to run JTextPro, or a good alternative
natural language processing/text analysis tool, to see what it actually does
to some text and what I might be able to do with it.
Could someone help me get this, or a similar program, running?
I have tried to contact the creator of JTextPro, but he has not responded to me as yet.
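As a stopgap for seeing something happen to some text, the Java standard library alone can do one step that NLP toolkits commonly start with: splitting text into sentences. The sketch below uses java.text.BreakIterator. It is not how JTextPro or the Stanford tools work internally, and the class name and sample text are made up for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical example class; uses only the Java standard library.
class SentenceSplitter {

    // Splits text into sentences using the JDK's built-in boundary rules.
    static List<String> sentences(String text) {
        List<String> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        // Walk from boundary to boundary until the iterator reports DONE.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                out.add(sentence);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "I downloaded the tagger. The readme has instructions. Does it work yet?";
        for (String sentence : sentences(sample)) {
            System.out.println(sentence);
        }
    }
}
```

Saved as SentenceSplitter.java, this should compile with `javac SentenceSplitter.java` and run with `java SentenceSplitter`, with no extra libraries or makefiles needed.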
#10
Posted 29 September 2008 - 12:33 PM
I've also downloaded the Stanford Part-of-Speech Tagger, which is 40MB
and can be found here again:
http://nlp.stanford.edu/software/
(The Stanford Natural Language Processing Group)
It also contains a readme file with some instructions, but I haven't quite
got it to work yet.











