

[Natural Language Processing, Using Java] Resources


9 replies to this topic

#1 lewis

lewis

    Advanced Member

  • Members
  • PipPipPip
  • 41 posts

Posted 28 September 2008 - 05:30 AM


Hi Everyone,

I don't think I'm an advanced programmer yet; if I'm lucky, maybe an
intermediate one.
Still, I thought the following question might be a bit more advanced than a
beginner's question, which is why I'm posting it here.

Does anybody know of any books or resources that deal with natural language
processing, information extraction, etc., using Java?

So far, I haven't been able to find much on this.

I have found the following book:

===================================================
Mason O
Programming for Corpus Linguistics : How to do text analysis with Java
Edinburgh University Press
2000
===================================================

Does anyone know of any more literature or resources on this topic?

I also found the following website:

http://aclweb.org/aclwiki/index.php?title=..._NLP/CL_courses

It lists a number of courses that have taught NLP using Java,
but I haven't got many leads from it so far.

Can anyone help?



#2 lewis

Posted 28 September 2008 - 05:41 AM


You might notice that the book I cited above is about "text analysis".
I'm not sure how much of a difference there is between
"natural language processing" and "text analysis".
I imagine they are similar.
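
For a rough sense of what "text analysis" can mean at its simplest, here is a minimal sketch that counts word frequencies using only the standard JDK class java.text.BreakIterator. The class name WordFrequency and the sample sentence are made up for illustration. This is not taken from Mason's book or from any NLP library, and full NLP toolkits typically go well beyond this kind of counting.

```java
import java.text.BreakIterator;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name; counts how often each word occurs in a text.
class WordFrequency {

    // Splits the text at word boundaries and tallies the lower-cased words.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        BreakIterator words = BreakIterator.getWordInstance(Locale.ENGLISH);
        words.setText(text);
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String token = text.substring(start, end);
            // Skip whitespace and punctuation; keep tokens that start with a letter or digit.
            if (Character.isLetterOrDigit(token.charAt(0))) {
                counts.merge(token.toLowerCase(Locale.ENGLISH), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample sentence chosen only for illustration.
        String sample = "The cat sat on the mat. The mat was flat.";
        System.out.println(count(sample));
    }
}
```

Counting words like this needs no linguistic knowledge, which is one way to see where simple text analysis ends and heavier language processing begins.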





#3 lewis

Posted 28 September 2008 - 05:44 AM


Does anyone here do work in natural language processing or text analysis using Java?

#4 Kraicheck

Kraicheck

    Advanced Member

  • Members
  • PipPipPip
  • 884 posts
  • Gender:Male
  • Location:Belgium

Posted 28 September 2008 - 10:39 AM

I don't think you should focus on Java.
Look at general books and either put the ideas in them to work in Java or use the language they suggest.

#5 lewis

Posted 28 September 2008 - 03:21 PM


Hi,

Kraicheck may well have a point.

But if I could just play devil's advocate for a moment, I found the following
three websites, for example:

http://cache.spyfu.com/Default.aspx?d=2008...ocessing%20java

or

http://nlp.stanford.edu/software/
(The Stanford Natural Language Processing Group)

or

http://opennlp.sourceforge.net/


All of these seem to offer Java implementations of NLP.

I just thought that might be interesting.



#6 Captain Pierce

Captain Pierce

    Advanced Member

  • Moderator
  • PipPipPip
  • 877 posts
  • Gender:Male
  • Location:Georgia

Posted 28 September 2008 - 06:47 PM

You're not really playing devil's advocate. It's better to learn the theory and understand the concepts of whatever you want to implement in a computer program, regardless of language. When you do properly understand the theory, then implementation in any language will be fairly trivial, assuming a good understanding of whatever language you end up using.

#7 ami

ami

    Advanced Member

  • Members
  • PipPipPip
  • 86 posts
  • Gender:Male
  • Location:Poland

Posted 28 September 2008 - 08:15 PM

What you're talking about is really advanced. NLP is more complicated than the analysis of formal languages (like those used in programming) by compilers. You could download ANTLR; its graphical application helps you understand grammatical structures by showing them step by step. You need to write (or copy & paste) parser, lexer and a sample of code to analyse.

[Screenshots: Editor Window, Grammar Interpreter, Integrated Debugger, Ambiguous Path Visualization, Decision DFA Visualization]
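
To make "lexer" a little more concrete: a lexer turns a string of characters into a list of tokens, which a parser then arranges into a structure. The following is a hand-written sketch under that assumption, not ANTLR output and not ANTLR's API. The class name TinyLexer and the token labels NUMBER, IDENT and SYMBOL are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer; not ANTLR-generated code.
class TinyLexer {

    // Splits the input into number, identifier, and single-character symbol tokens.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but is not a token itself
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("NUMBER(" + input.substring(start, i) + ")");
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("IDENT(" + input.substring(start, i) + ")");
            } else {
                // Anything else is treated as a one-character symbol.
                tokens.add("SYMBOL(" + c + ")");
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample input chosen only for illustration.
        System.out.println(tokenize("total = price * 12"));
    }
}
```

A formal language can be split this cleanly because its rules are fixed in advance. English text has no such fixed rule set, which is part of why NLP is the harder problem.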



#8 lewis

Posted 29 September 2008 - 04:37 AM


But it seems that you're showing ANTLR being used to analyse programming languages,
which I also find very interesting, maybe as a way to learn a new programming language?

What I'm thinking about at this stage, though, is using it to analyse
English text.

So you're suggesting that ANTLR can be used for that too?

And that I "need to write (or copy & paste) parser, lexer and a sample of code to analyse"?
How do you copy and paste a parser and lexer?
I don't quite understand this sentence of ami's.

And what about, instead of writing my own parser and lexer, trying one
"off the shelf", say one that the Stanford NLP Group has put out?
That is, simply using a finished product?
I could learn about NLP by writing my own code after learning the concepts involved,
and that would be good to do, but it would take me a lot of time.
Are you suggesting ANTLR as one such "off the shelf" product?

And would I be able to have a look at the code for OpenNLP,
maybe to see how they do it?
Once again, I'm not arguing with the idea that, if I want to learn how to
program NLP, it would be good to learn the concepts first and the programming
language later.
I'm thinking of this more as a kind of "coding by example".


#9 lewis

Posted 29 September 2008 - 08:15 AM


I am having a look at JTextPro, which you can find here:

http://jtextpro.sourceforge.net/

(What about OpenNLP as well?)

You'll have to download it if you want to have a look at it.
It's about 23MB.

I'm trying to run the program, and I'm a bit stuck.

(I hope it's OK to ask about this here?)

I had a look at the readme file, and it had a few instructions, including some,
I think, on makefiles. I'm only a beginner programmer, so this is a bit hard for me.

Basically, I would just like to run JTextPro, or a good alternative
natural language processing/text analysis tool, to see what it actually does
to some text and what I might be able to do with it.

Could someone help me run this, or a similar
program?

I have tried to contact the creator of JTextPro, but he has not responded yet.

#10 lewis

Posted 29 September 2008 - 12:33 PM


I've also downloaded the Stanford Part-of-Speech Tagger, which is 40MB
and can again be found here:

http://nlp.stanford.edu/software/
(The Stanford Natural Language Processing Group)

It also contains a readme file with some instructions, but I haven't quite
got it to work yet.