Word count having 2 different languages in source document
Thread poster: Ana Lopez

Ana Lopez | Mexico | Local time: 14:03 | ProZ.com member since 2013 | English > Spanish + ...
Hello! I'm working on a PDF document that has German and English in two "columns", and I only have to translate the English part. Do you know any way I can count ONLY the English words? Trados has statistics, but I don't know if there is a tool to count by language. The only way I can think of is to count them manually. Do you know anything faster? Thank you.

Jack Doughty | United Kingdom | Local time: 21:03 | Russian > English + ... | In Memoriam
Convert to Word | Jun 4, 2014
You can convert it to Word using OCR. ABBYY FineReader and ABBYY PDF Converter come to mind.

Ana Lopez | Mexico | Local time: 14:03 | ProZ.com member since 2013 | English > Spanish + ... | TOPIC STARTER
Can Word count by language? | Jun 4, 2014
Thanks! I already converted it to Word; however, since the columns are mixed with images, I cannot just "select" the English column. That's why I'm asking whether there is any way other than marking it page by page. Maybe there isn't, just asking.

Tony M | France | Local time: 22:03 | ProZ.com member | French > English + ... | SITE LOCALIZER
Are languages set? | Jun 4, 2014
When you did the conversion using OCR, were you able to set the languages of the relevant bits?

If the text DOES have its 'language' attributes correctly set, then you can do an ordinary word count in Word; then search and replace all for 'any character' + language attribute = (say) German, replacing with nothing. Then do another word count, and this will be the EN words without the German ones. In fact, you don't even need to have done the preliminary word count; I was just thinking of subtracting the EN from the total, since TOTAL - EN = German, of course!

Naturally, if the language attribute was NOT correctly set in the first place, this won't work; but at least you'll know for next time.

BTW, you say that the images are stopping you from selecting all the EN column, but why? Are they in merged cells or something? You ought to be able to process your table in such a way as to unmerge all the cells, which will probably push all the images into the l/h column or something, but will leave you with two clean columns you can select properly.

You are SURE it is in a proper Word table? OCR conversions have a nasty habit of 'organizing' (well, that's not what I call it...) text into newspaper-style columns, in which case you'll have a harder job on your hands trying to sort it out. It might even be simpler to convert everything to single-column and remove all column breaks from the document, and then see what you have left...
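The language-attribute idea above can also be checked outside Word. The following is only a rough sketch, under these assumptions: the converted file is a .docx (Office Open XML), and the OCR tool wrote the language as a `w:lang` element with a `w:val` attribute on each run in `word/document.xml`. The function names `count_words_by_language` and `count_words_in_docx` are hypothetical, chosen for illustration. Runs with no explicit `w:lang` inherit their language from styles or document defaults; this sketch does not resolve that inheritance and reports them under `"unset"`. The totals will not match Word's own word count exactly, since the word-splitting rule here is a simple regular expression.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# WordprocessingML namespace, in ElementTree's {uri}tag notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
# A "word" here is a run of letters/digits, optionally joined by ' or -.
WORD_RE = re.compile(r"\w+(?:['’-]\w+)*")


def count_words_by_language(document_xml, default_lang="unset"):
    """Count words per primary language tag (e.g. 'en', 'de') set on runs."""
    buffers = defaultdict(list)  # language -> pieces of text in document order
    current = default_lang
    for el in ET.fromstring(document_xml).iter():
        if el.tag == W + "r":
            # Explicit language of this run, if any (w:rPr/w:lang/@w:val).
            lang_el = el.find(f"{W}rPr/{W}lang")
            tag = lang_el.get(W + "val") if lang_el is not None else None
            new = tag.split("-")[0].lower() if tag else default_lang
            if new != current:
                # Keep words apart when the language switches between runs.
                buffers[new].append(" ")
            current = new
        elif el.tag == W + "t":
            buffers[current].append(el.text or "")
        elif el.tag in (W + "p", W + "br", W + "tab"):
            # Paragraphs, line breaks and tabs separate words.
            for parts in buffers.values():
                parts.append(" ")
    return Counter(
        {lang: len(WORD_RE.findall("".join(parts))) for lang, parts in buffers.items()}
    )


def count_words_in_docx(path):
    """Read word/document.xml from a .docx file and count words per language."""
    with zipfile.ZipFile(path) as archive:
        return count_words_by_language(archive.read("word/document.xml"))
```

Calling `count_words_in_docx("converted.docx")` (a placeholder file name) returns a counter keyed by language. If nearly everything lands under `"unset"`, the language attributes were probably not set during conversion, and the search-and-replace approach in Word would not work either.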
Tony M | France | Local time: 22:03 | ProZ.com member | French > English + ... | SITE LOCALIZER
Failing that... | Jun 4, 2014
...if the original document really is organized neatly into two columns, why not just do another 'dummy' OCR run on it, selecting ONLY the EN column as you go through, so you'll actually have a document at the end of it that ONLY contains the EN you need to translate; you might even be able to use this for your translation, or at worst, it will be a useful intermediate stage for your word count.
[Edited at 2014-06-04 20:58 GMT]

Ana Lopez | Mexico | Local time: 14:03 | ProZ.com member since 2013 | English > Spanish + ... | TOPIC STARTER
I'll try the option | Jun 4, 2014
I'll try making a dummy OCR conversion in ABBYY, identifying only English as the language, and see how it goes with the find & replace. Thank you so much, Tony M.!!

Ümit Karahan | Turkey | Local time: 23:03 | English > Turkish + ...
Paste only text | Jun 5, 2014
Hi. Try copying everything with Ctrl+A, Ctrl+C, and then paste it as text only into a blank Word page. That way you can get rid of the images.
[Edited at 2014-06-05 01:14 GMT]