Trying to use > as a segment delimiter in memoQ
Thread poster: ALAN LAMBSON
ALAN LAMBSON United States Local time: 19:30 Member (2021) Spanish to English + ...
I must re-translate a JSON version of a file that was previously translated in InDesign. The JSON content is imported straightforwardly, but now has many insertions with angle brackets such as and and other, more complex but always ending in >, that prevent many segments from getting a good match with what I have in the legacy TM. For instance, I sometimes have several sentences "glued together" with these sequences. Each individual sentence has a match in the TM, but the new segment with all of these concatenated no longer matches.
All I want to do is segment the JSON text on input so that it uses ">" as a segment-end delimiter. In theory, this should break out most or all of these sequences as separate segments, which can be simply copied from source and ignored. Exporting back to JSON should put it all back together again.
At least, this is my theory. But when I add the ">" character to the list of segment-end delimiters in segmentation rule set, apply to the project, then import the JSON file, it doesn't work.
Any help from an experienced segmentation-rule-writer would be much appreciated!
Segmentation works... but needs a space | Jun 23, 2023
You will see that adding ">" to a custom set of segmentation rules for your project does work... but only if there is a space after the ">". What memoQ's help does not tell you is that you need not only one of the sentence-ending symbols, but also a space after it.
Is there any chance for you to replace ">" with "> " temporarily before you import the JSON file, translate that, and then remove the space with find/replace?
This would be the simplest approach I think.
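For anyone who would rather script the temporary-space workaround than do it with manual find/replace, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the file does not already contain "> " anywhere; otherwise the clean-up step could not tell an original space from an added one, so the sketch refuses to run in that case. Since ">" has no structural meaning in JSON, working on the raw file text should leave the JSON valid.

```python
def add_space_after_gt(text: str) -> str:
    """Insert a space after every '>' so a delimiter-plus-space rule can fire.

    Hypothetical helper, not part of memoQ. Refuses to run if '> ' already
    occurs, because the reverse step would then delete original spaces too.
    """
    if "> " in text:
        raise ValueError("file already contains '> '; the round trip would be lossy")
    return text.replace(">", "> ")


def remove_space_after_gt(text: str) -> str:
    """Undo add_space_after_gt on the exported file."""
    return text.replace("> ", ">")


# Illustrative sample only; the tags here are placeholders, not the real file.
sample = '{"body": "First.<br>Second.<p>Third."}'
spaced = add_space_after_gt(sample)
restored = remove_space_after_gt(spaced)
```

Note that the clean-up step also removes any "> " typed during translation, so it is worth checking the target text before running it.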
[Edited at 2023-06-23 06:14 GMT]
ALAN LAMBSON United States Local time: 19:30 Member (2021) Spanish to English + ... TOPIC STARTER
Still doesn't work | Jun 23, 2023
Hi Tomás,
After inserting a space after > in the source JSON, the segmentation still does not break segments after "> ". I then thought that perhaps memoQ is handling and in a special way, perhaps translating them into a carriage return/line feed. So I changed all > to @, after verifying that the @ character does not appear anywhere in the file. I then added @ to the list of segment-ending characters. It still does not break after "@ ". Could I be doing something else wrong?
Stepan Konev Russian Federation Local time: 04:30 English to Russian
Can you give an example of your text for translation? 3 to 4 sentences.
HTML embedded inside the JSON | Jun 23, 2023
Alan was kind enough to share a sample with me privately, and it quickly became apparent that the client had embedded HTML code inside the JSON file. In the end, the solution was to import with a cascading filter (JSON + HTML filter) after a little tweaking of some tags that were not standard HTML.
ALAN LAMBSON United States Local time: 19:30 Member (2021) Spanish to English + ... TOPIC STARTER
Tomás saves the day! | Jun 23, 2023
Thank you, Tomás, for your excellent and timely help! Thank you Stepan for offering to help, but I'll consider this posting closed now.
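For readers wondering what the cascading JSON + HTML import mentioned in this thread amounts to conceptually, here is a rough Python sketch using only the standard library. It is not how memoQ implements its filters; the function and class names are invented for illustration, and it assumes a flat JSON object whose string values contain HTML fragments. The first pass parses the JSON, and the second pass runs each string value through an HTML parser so that tags are set aside and only the text between them is left as candidate segments.

```python
import json
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text between tags; the tags themselves are skipped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def text_chunks(html_fragment: str) -> list:
    """Second pass: split one HTML fragment into its text runs."""
    parser = TextCollector()
    parser.feed(html_fragment)
    parser.close()  # flush any text that follows the last tag
    return parser.chunks


def extract_text(json_text: str) -> dict:
    """First pass: parse the JSON, then hand every string value to the HTML pass.

    Assumes a flat object (no nesting); non-string values are ignored.
    """
    data = json.loads(json_text)
    return {k: text_chunks(v) for k, v in data.items() if isinstance(v, str)}


# Illustrative sample only, not taken from the actual client file.
sample = json.dumps({"intro": 'First sentence.<br>Second sentence.<p class="x">Third.</p>', "count": 3})
result = extract_text(sample)
```

With the tags handled by the HTML pass, each sentence becomes its own unit again, which is why the individual sentences can match the legacy TM.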