What is, in your experience, the best OCR software nowadays? Thread poster: Ivan Rocha, CT
Hello.
The company I work for is considering purchasing OCR software.
What is, in your personal experience, the best software in the field? And what would you recommend (the files we work with are usually PDFs with tons of tables and text, as well as some graphs)?
Thanks in advance for your input.
Regards,
Ivan

Natalie, Poland | Most likely Abbyy Fine Reader | May 12, 2011
I like it; however, I have not tried all the programs on the market, so I cannot be sure.
You can try a free demo (15 days, 50 pages) to see if it is good enough.
S

Abbyy for OCR | May 13, 2011
However, no PDF converter will convert complex files (tables in particular) flawlessly. PDF files were designed as a delivery mechanism, not as working files.
To see what to expect from Abbyy, do some tests here (20 free pages per month):
• https://www.ocrterminal.com
The reason I like Abbyy is that it stays away from floating text boxes and frames, which most other converters use heavily and which are a nightmare for the translator.
If your PDF files don't contain imaged text (i.e. they don't require OCR) and you just want to convert the PDF text into a Word file (actually RTF files, although they are named .doc), this one is extremely good at tables:
• http://www.pdftoword.com
And it's free. Go figure.
Bear in mind that sending files over the web raises confidentiality issues.
I have Nuance and it works pretty well. However, I haven't used ABBYY, so I can't compare.

Good experiences with ABBYY FineReader | May 13, 2011
Although it is far from being perfect!
It works well with scanned pages, and the Word documents it produces are generally OK. However, for complex documents it is sometimes better to scan the text and format it yourself, since the host of tiny boxes created by FineReader is really cumbersome to work with as a translator.

Peter Linton (X)
I use OmniPage 17 very successfully.
In a computer magazine test last year, OmniPage and ABBYY both came out on top.

InFix Pro - NOT an OCR software | May 13, 2011
If it's a "distilled" (i.e. not scanned) PDF, InFix is the way to go. It's a PDF editor with DTP-like resources. It lets you export tagged text to XML, translate it with your favorite tool, and then import back into the (equally tagged) PDF, preserving all formatting.
Of course, you'll have issues with partially-embedded fonts in the PDF and text swelling in translation. Yet the program lets you manage and solve them. My workflow is described in more detail here.
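To illustrate the export/translate/import idea in general terms, here is a minimal Python sketch. It does not use InFix or its real XML format, which is not shown in this thread: the `story`/`para`/`run` tags, the `font` attribute and the tiny glossary are all made up for the example. The only point is that the text inside each tag is replaced while the tags and attributes stay untouched, which is what lets formatting survive the round trip.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: the tag names and attributes are assumptions,
# not the actual schema produced by InFix.
EXPORTED = """<story>
  <para id="1"><run font="F1">Installation guide</run></para>
  <para id="2"><run font="F1">Press the </run><run font="F2">Start</run><run font="F1"> button.</run></para>
</story>"""

# Stand-in for the translation step; in practice a CAT tool does this.
GLOSSARY = {
    "Installation guide": "Guia de instalação",
    "Press the ": "Pressione o botão ",
    "Start": "Iniciar",
    " button.": ".",
}


def lookup(segment):
    """Return the glossary translation, or the source text if none exists."""
    return GLOSSARY.get(segment, segment)


def translate_tagged(xml_text, translate):
    """Translate the text of every element, leaving tags and attributes as-is.

    Only element text is handled; text that follows a closing tag
    (the "tail") is left untouched in this sketch.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text and elem.text.strip():
            elem.text = translate(elem.text)
    return ET.tostring(root, encoding="unicode")


translated = translate_tagged(EXPORTED, lookup)
print(translated)
```

Because the second run still carries its own `font` attribute after translation, the word set in a different font stays in that font. Text swelling and partially embedded fonts are not something a sketch like this can address.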
If it's a scanned PDF, I use an old but satisfactory version (14) of OmniPage and, after translation, I rebuild the whole publication using PageMaker, editing/adjusting the illustrations with PhotoImpact. Obviously I charge the client for the DTP work too.

Jo Macdonald, Spain
Been using Omnipage for years, quite happy with it, didn't cost much either.
Great with clean Pdfs and other electronic text, not so good with dirty scans/images. It will convert these but the results are often more time-consuming to work with than typing the translation from scratch.
Just tried Pdf-to-word with a dirty scan Pdf, took about 30 mins to receive a mail saying:
Failed to convert your document - Sorry, the result converted document is too large to be sent.
Omnipage took less than a minute to convert this file and the resulting Word doc was about 1.2 MB. I didn't actually end up using this file but typed the translation while reading the scan, imo less time-consuming than converting, correcting, processing in a CAT tool, checking against the scan, etc.
I've only had a few instances of files that made Omnipage crash.
No experience with Abbyy.

esperantisto
Ivan Rocha wrote:
And what would you recommend (the files we work with are usually PDFs with tons of tables and text, as well as some graphs)?
Avoid such clients. Or charge per hour for the OCR work. Whichever OCR program you choose, documents of the kind you describe will be a pain in the neck anyway.

Can't do it... | May 13, 2011
esperantisto wrote:
Ivan Rocha wrote:
And what would you recommend (the files we work with are usually PDFs with tons of tables and text, as well as some graphs)?
Avoid such clients. Or charge per hour for the OCR work. Whichever OCR program you choose, documents of the kind you describe will be a pain in the neck anyway.
I have an in-house position, so I can't (and wouldn't want to) "avoid" this client.
To everyone else who answered my question, thanks for your contributions.

OCR (especially tables) | May 13, 2011
I would also recommend ABBYY Finereader, but you got to be realistic about the results when you are talking about tables or just poor copies.
Tables are almost always a disaster and must be either extensively reformatted or simply retyped.
I currently live in the Philippines, and have a few Filipinos working for me when I need OCR documents cleaned up. I pay them a couple of euros a day (the average wage here is maybe 100 euros a month). They know English (an official language in the Philippines) and can type and do layout.
Feel free to message me if you are interested in outsourcing some of the retyping/layouting/reviewing work. OCRs are a major time drain and hideously expensive at Western wage levels, much better to get it done in the developing world.

ABBYY for conversion | May 13, 2011 |
ABBYY for PDF conversion. I find it handles tables quite nicely, and if it cannot get an accurate reconstruction of the table, you can always draw the borders yourself, which can be very helpful.

@Jo Macdonald | May 13, 2011
Like I said, PDFtoWord does not work with OCR.
Submit a PDF with accessible text and complex tables, and it will do a good job of exporting to Word.

ABBYY PDF Transformer | Sep 12, 2011
ABBYY PDF Transformer. Try it on the ABBYY site. It gives excellent results on any PDFs, including scanned ones, even in automatic mode (I mostly draw boxes by hand to get the ultimate control).
The current version is 3.0 though I personally liked 2.0 better. I have one legitimate ABBYY PDF Transformer 3.0 box I don't use - send me a personal message if you're interested in getting it for not much money (the license is transferrable).