Segmentation rule for soft line breaks in Excel when "\n" doesn't work Thread poster: XLTS
| XLTS Germany Local time: 08:56 Member (2011) English to German + ...
I would like to split the individual lines that are separated by soft line breaks in the cells of my (multilingual) Excel (2019) files into separate translation units. However, these lines don't seem to be separated by a "normal" LF/CR (hence it doesn't help to add a "\n" segmentation rule), but by a string which is displayed as "_x000D_" when I open the xlsx archive and look at the file "sharedStrings.xml" in the "xl" directory. When I add a segmentation rule to a translation memory (which I newly created for testing purposes) by entering "_x000D_" in the same advanced view where you would normally enter "\n", it has no effect on the segmentation. (Adding the "\n" segmentation rule BTW just makes the "_x000D_" string appear in the source column of the Studio editor view. I could change this into a tag, but this is not what I seek.) What do I have to do to successfully split these lines into individual TUs? I am using Studio 2021.
[Edited at 2023-04-01 02:24 GMT] | | | Jaime Oriard Mexico Local time: 00:56 Member (2005) English to Spanish + ... | Stepan Konev Russian Federation Local time: 09:56 English to Russian
Can you share a screenshot of your strings? You can use imgbb.com for that (copy the BB code generated by that site and paste it here). Also, after you add a new rule, do you make sure to remove the file from both source and target, so that Trados rebuilds both the source and target sdlxliff files from scratch with the new rule? | | | XLTS Germany Local time: 08:56 Member (2011) English to German + ... TOPIC STARTER Your suggestion looks logical, but... | Apr 1, 2023 |
Thank you, I have just tried all of these, but none gives a different result; the TU still looks like: This is sentence 1._x000D_This is sentence 2. BTW, when I reopen the language resources tab of the translation memory, the string \r\n appears as ".\r[\n]+". But no matter where I place the brackets, or even when I put both codes into brackets individually, the result remains the same. | |
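A small sketch of why the rule above cannot fire on this text. It uses Python's `re` module as a stand-in for Studio's regex engine, which is an assumption, and the helper name `rule_fires` is made up for illustration. The point is only that `\r` and `\n` in a pattern match real control characters, while the segment shown in the editor contains the seven visible characters `_x000D_` and no control character at all.

```python
import re

# The pattern exactly as Studio displays it in the language resources tab.
RULE = re.compile(r".\r[\n]+")

# What the editor shows: the literal characters _ x 0 0 0 D _
literal_text = "This is sentence 1._x000D_This is sentence 2."

# What the rule would need: a real carriage return followed by a line feed
real_break = "This is sentence 1.\r\nThis is sentence 2."


def rule_fires(text: str) -> bool:
    """Return True if the segmentation pattern matches anywhere in text."""
    return RULE.search(text) is not None


print(rule_fires(literal_text))
print(rule_fires(real_break))
```

The first call finds nothing to match and the second does, which is consistent with the behaviour described in this post.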
XLTS Germany Local time: 08:56 Member (2011) English to German + ... TOPIC STARTER Action taken between attempts | Apr 1, 2023 |
Stepan Konev wrote: Can you share a screenshot of your strings? Which strings do you refer to? The segmentation rule or the TUs to be split? Stepan Konev wrote: Also, do you make sure to remove the file after you add a new rule both from source and target for Trados to rebuild both source and target sdlxliff files from scratch with the new rule? I am using the "Translate single document" option, each time making sure to delete from my SSD the project file created on that occasion. I have even created several copies of the source file and closed Studio between attempts, to no avail. | | | Stepan Konev Russian Federation Local time: 09:56 English to Russian
XLTS wrote: Which strings do you refer to? The segmentation rule or the TUs to be split? The TUs to be split. But I can see an example from your previous reply, it's ok. | | | Stepan Konev Russian Federation Local time: 09:56 English to Russian
I assume that it is not actually _x000D_ but simply x000D. Is that right? If yes, follow these steps:
1. Go to File - Options - File Types - Microsoft Excel 2007-2019 - Embedded Content
2. Tick the 'Enable Embedded Content' box and click the 'Extract in defined document structures' radio button.
3. In the 'Tag definition rules' window, click Add...
4. In the 'Start Tag:' field type x000D
5. Click Advanced and select 'Exclude'.
6. Click OK as many times as necessary to close all windows and save the changes.
7. Open your single file for translation.
| | | XLTS Germany Local time: 08:56 Member (2011) English to German + ... TOPIC STARTER Underline character not for highlighting purposes :) | Apr 1, 2023 |
Stepan Konev wrote: I assume that it is not actually _x000D_ but simply x000D. Is that right? Unfortunately not: each "x000D" has a leading and a trailing underscore character, even when there are several of these strings in a row (e.g. Sentence1._x000D__x000D_Sentence2.) | |
Stepan Konev Russian Federation Local time: 09:56 English to Russian Ok, then try this | Apr 1, 2023 |
XLTS wrote: Unfortunately not: each "x000D" has a leading and a trailing underscore character, even when there are several of these strings in a row (e.g. Sentence1._x000D__x000D_Sentence2.) Then use the same procedure, but in step 4 type _x005F *Edit: removed the full stop char after _x005F to avoid ambiguity.
[Edited at 2023-04-01 14:22 GMT] | | | XLTS Germany Local time: 08:56 Member (2011) English to German + ... TOPIC STARTER The solution I have found | Apr 1, 2023 |
Stepan Konev wrote: Then use the same procedure, but in step 4 type _x005F Before I tried this, I chose to follow your initial recipe... 3. In the 'Tag definition rules' window, click Add... 4. In the 'Start Tag:' field type x000D 5. Click Advanced and select 'Exclude'. ..., except that (in "Bilingual Excel", as I am dealing with this file type) I entered "_x000D_" instead and ticked the check box "Line break after the tag" (trl?), and this actually seems to have done the trick.
For anyone who encounters the same problem in the future, here is a summary of what I did to eventually resolve it (I have at hand the German version of Studio 2019, so the wording of the commands may be approximate):
1. Go to File - Options - File Types - Bilingual Excel - Embedded Content
2. Tick the 'Enable Embedded Content' box and click the 'Extract in defined document structures' radio button.
3. In the 'Tag definition rules' window, click Add...
4. In the 'Start Tag:' field type _x000D_ (including the underscore characters)
5. Click Advanced, select 'Exclude' and also tick the 'Line break after the tag' check box
6. Click OK as many times as necessary to close all windows and save the changes.
I don't know whether the following is necessary, but now that I have found a solution that works for whatever reason (and with Trados Studio you have to be careful not to ruin a working solution once you have found one), I have kept the changes I made to the translation memory:
1. In the Translation memory view, right-click on the TM you would like to use, and select 'Settings' (or 'Properties'?).
2. In the 'Segmentation rules' windows of both the source and the target language, add a new sentence-based segmentation rule, using a suitable description.
3. Pick 'Anything' from the dropdown lists in both 'Before break' and 'After break', click on 'Advanced View', and type .\r[\n]+ into the 'Before break' window, leaving the 'After break' window empty.
4. Click OK as many times as necessary to close all windows and save the changes.
Now create a project from your bilingual Excel file and see if this has worked for you, too... Stepan: bol'shoe spasibo for your help!
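For readers who want to check their strings outside Studio, here is a minimal Python sketch of what the `_x000D_` marker stands for. As far as I know, Office Open XML writes characters that are awkward in XML as `_xHHHH_` (four hex digits), so `_x000D_` is a carriage return and `_x005F_` is an escaped underscore. The function names `decode_ooxml_escapes` and `split_cell_lines` are made up for this illustration and are not part of Studio or of any library.

```python
import re

# Matches one OOXML-style escape such as _x000D_ (carriage return).
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_ooxml_escapes(text: str) -> str:
    """Replace every _xHHHH_ escape with the character it encodes."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def split_cell_lines(text: str) -> list[str]:
    """Decode the escapes, then split the cell text at CR/LF line breaks.

    Empty pieces (for example between two consecutive breaks) are dropped.
    """
    decoded = decode_ooxml_escapes(text)
    return [part for part in re.split(r"\r\n|\r|\n", decoded) if part.strip()]


print(split_cell_lines("Sentence1._x000D__x000D_Sentence2."))
print(split_cell_lines("This is sentence 1._x000D_This is sentence 2."))
```

Each call prints the individual lines of the cell as a list, which mirrors the one-TU-per-line result that the tag rule with 'Line break after the tag' produces in Studio.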
[Edited at 2023-04-01 15:07 GMT]