New build: Support for scanned PDF documents using OCR is limited
Thread poster: Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Mar 15, 2023

Cumulative Update 6 for Trados Studio 2022 (Build 17.0.6.14902, Released on 13th of March, 2023)

Starting with this release, Trados Studio uses a new mechanism and underlying technology for processing PDF files in translation projects. Support for scanned PDF documents using OCR (optical character recognition) is limited out of the box.


shankar pichakkaran
 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Very clean XLIFF file Mar 16, 2023

I have tested the new converter with this pdf:

[Screenshot: Screen Shot 2023-03-16 at 12.06.55]

And I got a very clean XLIFF:

[Screenshot: Screen Shot 2023-03-16 at 12.01.49]

I wonder who the provider of the converter technology is.


[Edited at 2023-03-16 11:26 GMT]


Xenglee Xiaye
 
Stepan Konev  Identity Verified
Russian Federation
Local time: 11:33
English to Russian
Solid Mar 16, 2023

Hans Lenting wrote:
I wonder who the provider of the converter technology is.
Solid Documents, I believe, at least for Trados Studio 2021. You can check for 2022 by going to File Types > PDF > About.


Platary (X)
Xenglee Xiaye
 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Not there anymore Mar 17, 2023

[Screenshot: Screen Shot 2023-03-17 at 07.40.19]

Stepan Konev
 
Laurent Di Raimondo  Identity Verified
France
Local time: 10:33
English to French
SITE LOCALIZER
Nothing new under the sun of Trados... Mar 17, 2023

Having downloaded and installed Cumulative Update 6 for Trados Studio 2022 (Build 17.0.6.14902, released on 13 March 2023), I have found no significant change so far.

Trados still makes it possible to convert text PDF documents ("live PDFs") into very clean, flawless DOCX files. That's a good point, but it's not a revolution.

Apart from the fact that the Solid Documents logo has indeed disappeared, Trados still cannot convert image PDF documents ("dead PDFs") into editable Word documents. In this respect, I would like to underscore that Trados was never designed for or implemented with this functionality, since it does not behave like real OCR software, properly speaking.

For that reason, I purchased the genuine OCR software from Solid Documents (Trados's so-called partner) to convert all my image PDF documents into truly editable documents that can then be translated properly in Trados. It works like a charm and I can't complain so far. (For the paltry sum of €40.00 per owner licence, for the record.)

So nothing new under the sun with this new "release"...

It has been donkey's years since Trados made significant improvements to their prehistoric software, and it could well take centuries before they get up off the laurels they have been resting on and get a move on. A real f*****g move on this time!...

[Edited at 2023-03-17 20:17 GMT]


 
Hipyan Nopri  Identity Verified
Indonesia
Local time: 15:33
Member (2005)
English to Indonesian
Cannot Translate Scanned PDFs Mar 20, 2023

Hi Everyone,
Previously, I was able to translate scanned PDF documents without any problems using Trados Studio 2022.

After the latest update to version 17.0.6.14902, I can no longer translate scanned PDFs.

The editor is just blank, as in the attached screenshot.

Any help from fellow translators would be greatly appreciated.

[Screenshot]

[Edited at 2023-03-20 00:46 GMT]


 
Stepan Konev  Identity Verified
Russian Federation
Local time: 11:33
English to Russian
Have you read the original post? Mar 20, 2023

Hipyan Nopri wrote:
After the latest update to version 17.0.6.14902, I can no longer translate scanned PDFs.
As far as I understand from the original post: "Starting with this release... support for scanned PDF documents using OCR (optical character recognition) is limited out of the box." Trados is not an OCR program; it's a CAT tool. In my opinion, it is quite honest of RWS to remove that feature.
memoQ can't import scanned PDF files either: "memoQ doesn't extract text from scanned PDF files, where the pages are saved as images and not as text. To translate these documents, run them through a page reader program such as Nuance OmniPage or ABBYY FineReader (PDF Reader)."
Even with a free CAT tool such as Smartcat, you still have to pay for OCR.
It's OK, RWS just made it fair: CAT tools for translation, OCR tools for optical recognition.
OCR software developers make their living from selling their OCR services. Why on earth should Solid Documents work for free? Besides, there were so many issues with the OCR feature in Trados that they should have removed it a long time ago rather than keep explaining why it doesn't work as expected. This is how I see it.

[Edited at 2023-03-20 02:48 GMT]


Jorge Payan
 
Hipyan Nopri  Identity Verified
Indonesia
Local time: 15:33
Member (2005)
English to Indonesian
Yes, I have Mar 20, 2023

read the sentence, Stepan. However, it is 'limited' rather than 'terminated'.

Gurudutt Kamath
 
Stepan Konev  Identity Verified
Russian Federation
Local time: 11:33
English to Russian
Synonyms Mar 20, 2023

Hipyan Nopri wrote:
it is 'limited' rather than 'terminated'.
These words have the same meaning here: "Limit = 1. The point, edge, or line beyond which something ends". Also, you can still translate vector PDF files (as opposed to bitmap PDF files), can't you? That is why it is limited.


 
Hipyan Nopri  Identity Verified
Indonesia
Local time: 15:33
Member (2005)
English to Indonesian
Ambiguous Words Mar 20, 2023

Yes, companies often use ambiguous words indeed.

They say 'limited' when it means 'terminated'.

Companies often even use terms that contradict their actual meaning.

They use the word 'unlimited' (for example, 'unlimited usage' for Internet access) when in fact it means 'limited'.


Stepan Konev
 
Tom in London
United Kingdom
Local time: 09:33
Member (2008)
Italian to English
Impressive Mar 21, 2023

I've downloaded the trial version of Solid Converter for Mac and given it a go.

I'm impressed: it's fast and accurate and offers different settings depending on the extent to which you want to strictly maintain the formatting of the original.

I tried it with a PDF that included illustrations with captions, with the main text wrapping around the illustrations. It was pretty good and I could use it, were it not for the cost ($99.95, i.e. $100). I so rarely convert PDFs that it wouldn't be a good investment (I usually ask the client to provide a conversion).


Laurent Di Raimondo
 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:33
Member (2006)
English to Afrikaans
@Hans Mar 21, 2023

Hans Lenting wrote:
Cumulative Update 6 for Trados Studio 2022 (Build 17.0.6.14902, Released on 13th of March, 2023)
Starting with this release, Trados Studio uses a new mechanism and underlying technology for processing PDF files in translation projects. Support for scanned PDF documents using OCR (optical character recognition) is limited out of the box.

https://gateway.sdl.com/apex/communityknowledge?articleName=CUs-Studio2022
There is now a "PDF Assistant" add-on in the app store that uses Microsoft Word's OCR conversion feature. But yes, they basically say that Trados no longer offers OCR and that you should use a third-party OCR tool.


Laurent Di Raimondo
 

