Generate tags for a given article

Hi guys,

I've been wanting to implement this, and would like to create a lib for it…

Given an article, I would like to do some basic context analysis on it to
generate some tags for it.

Some thinking out loud:

  • split the document into words
  • run a stemmer over these words
  • filter the list against a stop-word list
  • rank the remaining keywords with some priority algorithm
  • return the top 5 (or the top x keywords, based on a parameter)
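
The steps above could be sketched roughly like this, assuming Python and only the standard library. The names generate_tags, crude_stem and STOP_WORDS are made up for illustration. The suffix-stripper is a crude stand-in for a real stemmer, the stop-word list is deliberately tiny, and raw term frequency is only a placeholder for the ranking step, not a recommendation.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption: a real list would be far longer).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "in", "is", "it", "of", "on", "or", "that", "the", "this", "to",
    "was", "with",
}


def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Stemming happens before stop-word filtering (as in the list above),
# so the stop words are stemmed the same way before comparing.
STOP_STEMS = {crude_stem(w) for w in STOP_WORDS}


def generate_tags(text, top_n=5):
    """Return up to top_n candidate tags for a single document."""
    # 1. Convert the document into lowercase words.
    words = re.findall(r"[a-z]+", text.lower())
    # 2. Stem each word.
    stems = [crude_stem(w) for w in words]
    # 3. Drop stop words.
    kept = [s for s in stems if s not in STOP_STEMS]
    # 4. Rank by raw frequency (placeholder for a better priority function).
    counts = Counter(kept)
    # 5. Return the top_n keywords.
    return [stem for stem, _ in counts.most_common(top_n)]
```

Calling generate_tags(article_text, top_n=10) would return up to ten stems instead of five. The tags come back as stems rather than original words, so a real lib would probably want to map each stem back to its most common surface form.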

I can't use TF-IDF, because there is only one document… so I was wondering
what I should use to determine the priority of the keywords… :S