OCR project in GSoC


#1

Hi to all,

I plan to do a project in GSoC on OCR (Optical Character
Recognition)…

That is,

         if we scan a full page of text from a book, it will open
in OpenOffice as a word-processor document, so that we can edit the
text of the scanned page… I plan to convert scanned letters to words
for the Tamil and English languages… I will try to support a few more
languages as well… I plan to do this OCR project using RMagick, and I
am confident I can do it successfully.

         This is my idea. Please let me know if any of you can offer
suggestions or guide me in doing this…

Thanks,

Arulalan.


#2

Hi,

What about Google Tesseract???

http://code.google.com/p/tesseract-ocr/

Harold


#3

There are many ways to accomplish this, but none of them are easy…

There’s ai4r’s backpropagation neural network implementation, with a
simple OCR example at http://ai4r.rubyforge.org/neuralNetworks.html
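
To give a rough feel for the character classification step, here is a
minimal sketch in plain Ruby. It is not ai4r's API and not a neural
network: it swaps in nearest-template matching by Hamming distance, which
is about the simplest classifier there is. The 3x3 bitmaps and the names
TEMPLATES, hamming and classify are made up for illustration only.

```ruby
# Hypothetical 3x3 glyph templates (1 = ink, 0 = blank); not real font data.
TEMPLATES = {
  "I" => [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
  "L" => [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
  "T" => [1, 1, 1,
          0, 1, 0,
          0, 1, 0]
}.freeze

# Count the pixels where two equally sized bitmaps differ (Hamming distance).
def hamming(a, b)
  a.zip(b).count { |x, y| x != y }
end

# Return the label of the template closest to the given bitmap.
def classify(bitmap)
  TEMPLATES.min_by { |_label, template| hamming(bitmap, template) }.first
end

# A noisy "L": the last pixel of the bottom row is missing.
noisy_l = [1, 0, 0,
           1, 0, 0,
           1, 1, 0]

puts classify(noisy_l)
```

A real OCR pipeline would still need to binarize the scan and cut it into
individual glyphs before a step like this, and a trained network would
replace the fixed templates.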

There’s also GNU Ocrad, which I’ve never used:
http://www.gnu.org/software/ocrad/
I also just found http://gtamilocr.sourceforge.net/ which does OCR for
Tamil characters as well.

I’d be glad to hear other suggestions…