An Introduction to Computer-Aided Translation Tools – Part 1
Imagine that you have a translation project full of frequently repeated phrases, clauses, and sentences. This is where CAT (computer-aided translation) tools come in. As their name suggests, these are software applications used by translators, project managers, and translation companies to facilitate the translation and localization processes. Yet to what extent are they helpful? And how do they differ from machine translation?
CAT Tools vs. Machine Translation
Machine translation, also referred to as automated translation or instant translation, is the translation of text by a computer without human involvement. The technology is frequently used to extract the main ideas of a text (gisting), to check for internationalization issues in the target languages before committing to professional translation (by means of pseudo-translation), and to support human translators, since modern CAT tools now let users translate source texts with the help of machine translation. Translators can then decide which suggestions to approve or reject during the translation process.
There are three types of machine translation systems: rule-based, statistical, and neural.
- Rule-based systems combine language and grammar rules with dictionaries of commonly used words. Specialized dictionaries can then be created to focus on particular industries and/or disciplines.
- Statistical systems do not rely on language rules. Instead, they “learn” to translate by analyzing large amounts of data for each language pair. They can also be “trained” for specific industries using additional data from the relevant sector.
- Neural machine translation is a newer approach in which machines learn to translate through one large neural network. Trained neural machine translation systems have recently started to show better translation performance than the other two types of system in a variety of language pairs.
In contrast, CAT tools rely on databases, known in the industry as translation memories, which store commonly used phrases and sentences together with their target translations. They help translators perform translation tasks, carry out monolingual and bilingual reviews, and maintain quality throughout a translation project while working as quickly as possible. When a project is processed through one of these tools, the file(s) are scanned to identify and count frequently used word combinations, and the tool gives the translator or project manager the weighted word count, along with all project details, at the click of a button.
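To make the idea of a translation memory more concrete, here is a minimal sketch in Python. It is not how any particular CAT tool is implemented: the sample entries, the function name best_match, and the 0.75 threshold are all assumptions made for illustration, and the similarity score comes from Python's standard difflib module rather than from a real CAT matching engine.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory translation memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "Click the Cancel button.": "Cliquez sur le bouton Annuler.",
}


def best_match(segment, memory, threshold=0.75):
    """Return (score, source, target) for the closest stored segment.

    Returns None when no stored segment reaches the (assumed) threshold.
    """
    best = None
    for source, target in memory.items():
        # Similarity between 0.0 and 1.0, ignoring case differences.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None


# An identical segment scores 1.0; a slightly different one is a "fuzzy" match.
exact = best_match("Click the Save button.", TRANSLATION_MEMORY)
fuzzy = best_match("Click the Save button now.", TRANSLATION_MEMORY)
```

In this sketch the translator would still review the fuzzy suggestion before confirming it, since the stored translation does not cover the extra word.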
Pros & Cons
One of the main assets of any professional translator, especially in the localization industry, is productivity. Yet productivity should not come at the expense of quality, whether through inconsistencies, typos, punctuation errors, numerical mismatches, or other issues. This is where computer-aided translation tools offer a combination of benefits.
- Error Reduction: CAT tools are intuitively designed to help translators and reviewers enhance the quality of their work and minimize potential errors. SDL Trados, one of the most commonly used CAT tools, has thorough quality assurance (QA) settings that can be easily customized to the clients’ requirements. The QA settings can check for missing translations, punctuation errors, inconsistencies, length violations, and number mismatches. You can also create your own word list so that SDL Trados notifies you of an error whenever you confirm a translation containing a word from that list.
- Collaboration: You may frequently work with a group of translators on the same project, which is divided and distributed so that it can be handled as quickly as possible. Handling it outside a CAT tool may put the consistency of the project at risk. However, the termbases that you can create with CAT tools help ensure that the final translation is consistent and uniform.
- Analysis & Review: You can filter any group of words, whether in the source text and/or the target translation, and use the find & replace feature. Combined with advanced tips and tricks for filtering and RegEx (to be discussed in a later article), this makes the review process a cinch.
- Cost Reduction for Clients: Instead of paying the full rate for every word of a 10K-word project with many repeated phrases, clauses, and sentences, clients can use CAT tools to analyze the weighted word count of a given project and pay a reasonable sum for the language service they need. Thanks to the analysis feature of CAT tools, a detailed sheet can be easily created and exported with the final word count of repetitions, new words, previously translated words (if applicable), and so on. As a result, the estimated budget for the project is very likely to decrease.
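The weighted word count mentioned in the last point can be illustrated with a small Python sketch. The category names, the sample word counts, and above all the payable percentages below are hypothetical: real rate grids differ between tools, vendors, and clients, so this only shows the arithmetic, not actual pricing.

```python
# Hypothetical payable share per match category. Real rate grids are
# negotiated per vendor and client and usually have more categories.
RATE_GRID = {
    "new": 1.00,         # words with no match in the translation memory
    "fuzzy": 0.50,       # words in segments that partially match the memory
    "exact": 0.25,       # words in previously translated segments
    "repetition": 0.25,  # words in segments repeated within the project
}


def weighted_word_count(breakdown, grid=RATE_GRID):
    """Multiply the word count of each category by its payable share and sum."""
    return sum(words * grid[category] for category, words in breakdown.items())


# Made-up analysis of a 10,000-word project.
analysis = {"new": 4000, "fuzzy": 2000, "exact": 1500, "repetition": 2500}

raw_total = sum(analysis.values())
weighted_total = weighted_word_count(analysis)
```

Under these made-up rates, the 10,000 raw words correspond to 6,000 weighted words, which is the figure the client would be quoted on.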
Despite all these benefits and more, CAT tools still have drawbacks that should be carefully considered. Below are some that can be a nightmare for translators, along with my suggested solutions.
- Pricing: Most commonly used CAT tools are intuitive and full of productive features for translators, project managers, and language service providers. Yet most of them are not affordable, especially for translators at the start of their careers in translation and localization. For example, the Freelancer, Freelancer Plus, and Professional versions of SDL Trados Studio 2019 cost about €695, €855, and €2495 respectively (at the time of writing this article). SDL Trados does, however, offer an affordable annual subscription with basic, scaled-down features for about €100. Similarly, memoQ, another leader in computer-aided translation, offers a translator pro version and a project manager version for €620 and €1500 respectively. It is worth noting that other CAT tools are available at affordable prices and/or on monthly or annual subscriptions, but with fewer features than SDL Trados Studio and memoQ. ProZ.com also offers a service called TGB (Translator Group Buy), through which translators can join together to buy CAT tools and other software for a fraction of the price. You can check the TGB page at https://www.proz.com/tgb and find more information about it at www.proz.com/faq/store_and_translator_group_buying_tgb. In addition, many translation companies have opted for their own translation management system, whether created by their localization engineering team in collaboration with the R&D department or licensed from cloud-based translation management portals such as Memsource and Smartling. Freelance translators can also overcome the pricing problem by using free tools such as OmegaT, Smartcat, and MateCat, which have the basic features that help translators, to a great extent, process their files in translatable formats.
- Poor Usability with Literary Genres: As mentioned earlier, translators can maximize their productivity with CAT tools when handling projects that contain frequently repeated phrases, clauses, and sentences. This is common in subject areas such as legal, medical, and financial translation. Using a CAT tool for literary genres, however, is likely to be ineffective and unproductive.
- Poor Segmentation: Some file formats, PDFs for example, can lead to poor segmentation when processed with a stand-alone CAT tool. This frequently happens with non-OCRed* PDFs, especially files loaded with many screenshots. In such cases, poor segmentation of phrases, clauses, and sentences is to be expected across the whole project. As a potential solution, such difficult-to-process files may first be retyped in a translatable format, such as MS Word, before the translation phase starts.
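One way the segmentation problem can show up is sketched below in Python. This is a simplified illustration, not the segmentation engine of any real CAT tool: it assumes a segmenter that ends a segment at sentence-final punctuation or at a line break, and it assumes that text pulled out of a PDF keeps the hard line breaks of the page layout. The function name segment and both sample strings are made up for the example.

```python
import re


def segment(text):
    """Split text into segments at sentence-ending punctuation or line breaks."""
    # Break after ".", "!" or "?" followed by whitespace, or at any line break.
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    return [part.strip() for part in parts if part.strip()]


# The same two sentences, first as clean text, then with the hard line breaks
# that a PDF extraction might leave in the middle of each sentence.
clean = "Open the settings menu. Select the language you need."
extracted = "Open the settings\nmenu. Select the\nlanguage you need."

clean_segments = segment(clean)          # two complete sentences
broken_segments = segment(extracted)     # four fragments
```

With the clean text, each segment is a full sentence that a translation memory can match. With the extracted text, the sentences are cut into fragments that are harder to translate in isolation and unlikely to match anything stored in the memory.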
In the next article, I will discuss some key terms in the CAT industry and which tool I recommend investing in. My choice may be biased, as it is based on my own experience. However, I will try to shed light on the pros and cons of the key CAT tools in the industry.
*OCR = Optical Character Recognition