Language auto detection?

Dec 21, 2009 at 8:24 PM

According to the documentation, the user needs to specify the language of the text to be recognized. Is it possible to have the OCR engine detect the language automatically?

 

using (var pumaPage = new PumaPage("page001.jpg"))
{
    pumaPage.FileFormat = PumaFileFormat.RtfAnsi;
    pumaPage.EnableSpeller = false;
    pumaPage.Language = PumaLanguage.English;
    pumaPage.RecognizeToFile("page001.rtf");
}

 


Coordinator
Dec 23, 2009 at 1:30 PM

There is no such capability; the language of the image being recognized must be specified in advance. The recognition engine also supports one 'mixed' language, EnglishRussian, which makes it possible to get text from a document that contains both of these languages. There are no mixed modes for other languages.
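
For a page that mixes English and Russian, the mixed mode mentioned above could be selected roughly as follows. This is only a sketch based on the snippet in the question: the `Puma.Net` namespace and the exact enum member name `PumaLanguage.EnglishRussian` are assumptions taken from the wording of this thread, so check them against the `PumaLanguage` enum in your version of the library.

```csharp
using Puma.Net; // assumed namespace of the Puma.NET assembly

class MixedLanguageExample
{
    static void Main()
    {
        // Same flow as in the question; only the Language value differs.
        using (var pumaPage = new PumaPage("page001.jpg"))
        {
            pumaPage.FileFormat = PumaFileFormat.RtfAnsi;
            pumaPage.EnableSpeller = false;

            // Assumption: the mixed mode is exposed under this name,
            // as written in the reply above; verify it in the enum.
            pumaPage.Language = PumaLanguage.EnglishRussian;

            pumaPage.RecognizeToFile("page001.rtf");
        }
    }
}
```

For any other language, the value still has to be chosen by the caller before recognition starts.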