Hi guys,
I've been wanting to implement this and turn it into a small lib.
Use case:
Given an article, I would like to do some basic context analysis on it to
generate some tags for it.
Thinking out loud:
- split the document into words
- run a stemmer over these words
- filter the list against a stop-word list
- rank the remaining keywords with some priority algorithm
- return the top 5 (or the top x keywords, based on a parameter)
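A rough sketch of the steps above, just to make the idea concrete. Everything named here is made up for illustration: `STOP_WORDS` is a tiny placeholder list, `naive_stem` is a crude suffix stripper standing in for a real stemmer, and plain word frequency is only a placeholder for the ranking step, which is the part I have not figured out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in",
              "on", "at", "for", "it", "this", "that", "with", "as", "by"}


def naive_stem(word):
    """Crude suffix stripper, a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_tags(text, top_n=5):
    """Return up to top_n stemmed keywords, ranked by raw frequency."""
    # Step 1: split the document into lowercase words.
    words = re.findall(r"[a-z]+", text.lower())
    # Step 2: stem every word.
    stems = [naive_stem(w) for w in words]
    # Step 3: drop stop words; the stop list is stemmed the same way
    # so that it still matches after step 2.
    stop_stems = {naive_stem(w) for w in STOP_WORDS}
    kept = [s for s in stems if s not in stop_stems]
    # Step 4: rank (placeholder: plain counts). Step 5: return the top_n.
    return [stem for stem, _ in Counter(kept).most_common(top_n)]
```

Calling `extract_tags(article_text, top_n=10)` would return the ten most frequent stems. Since stemming runs before the stop-word filter, the stop list has to be stemmed too, otherwise some stop words slip through.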
I can't use tf-idf because there is only one document, so I was wondering
what I should use to determine the priority of the keywords. :S
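To spell out the tf-idf problem: assuming the plain, unsmoothed definition idf = log(N / df), a corpus of one document gives N = 1 and df = 1 for every term, so every idf is log(1) = 0 and all the scores collapse to zero. A small sketch of that (the function name `tfidf_scores` is mine, not from any library):

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Plain tf-idf with idf = log(N / df), no smoothing (assumed variant).

    docs is a list of documents, each a non-empty list of tokens.
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


# With a single document every score is zero, so nothing can be ranked.
single = tfidf_scores([["cat", "cat", "dog"]])
```

Smoothed idf variants avoid the zero, but with one document they still give the same idf to every term, so the ranking reduces to plain term frequency either way.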