Advanced Topics
Last updated: 13 August 2024
Translation Caching
JCOGS Auto-Translate uses a translation caching system that stores copies of translated text on the server. Wherever possible, JCOGS Auto-Translate re-uses a cached copy of translated text, which speeds up site loading and reduces the cost of content translation by avoiding repeated translation of the same text.
Fragments and the Unified Cache
During operation, JCOGS Auto-Translate breaks any text for translation into multiple small fragments, so a whole page of web output may be broken down into tens or hundreds of text fragments. Fragments typically represent sentences or clauses within the text. This fragmentation is done to improve translation performance (both speed and reliability) and to reduce translation costs. Each fragment is translated separately, and the translated text is then reassembled to generate the translated template output.
JCOGS Auto-Translate uses a ‘unified cache’ to save each of these text fragments to a language-specific cache. Before sending a text fragment to be translated, JCOGS Auto-Translate checks whether a translated version of the fragment already exists within its cache. If one is found, it is used and a translation transaction is avoided. Retrieving the translated text from the cache is usually much quicker than retrieving a translation from the chosen machine translation service, and it incurs no translation cost.
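The check-the-cache-first behaviour described above can be illustrated with a minimal Python sketch. This is not the add-on's own code: the sentence splitter, the hash-based cache key, the `UnifiedCache` class and the `fake_translate` stand-in are all hypothetical names chosen for illustration, and the real add-on may fragment and store text quite differently.

```python
import hashlib
import re


def split_into_fragments(text):
    """Naive stand-in splitter: break text after sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def fragment_key(fragment):
    """Hypothetical cache key: a hash of the source fragment."""
    return hashlib.sha256(fragment.encode("utf-8")).hexdigest()


class UnifiedCache:
    """Hypothetical in-memory cache holding one store per target language."""

    def __init__(self):
        self._store = {}  # {language: {fragment_key: translated_text}}

    def get(self, language, fragment):
        return self._store.get(language, {}).get(fragment_key(fragment))

    def put(self, language, fragment, translated):
        self._store.setdefault(language, {})[fragment_key(fragment)] = translated


def translate_text(text, language, cache, translate_fn):
    """Translate fragment by fragment, calling translate_fn only on cache misses.

    Returns the reassembled text and the number of translation calls made.
    """
    translated_parts = []
    calls = 0
    for fragment in split_into_fragments(text):
        translated = cache.get(language, fragment)
        if translated is None:
            translated = translate_fn(fragment, language)
            cache.put(language, fragment, translated)
            calls += 1
        translated_parts.append(translated)
    return " ".join(translated_parts), calls


def fake_translate(fragment, language):
    """Stand-in for a call to a machine translation service."""
    return f"[{language}] {fragment}"


cache = UnifiedCache()
page = "Hello world. How are you? Fine."
first_output, first_calls = translate_text(page, "de", cache, fake_translate)
second_output, second_calls = translate_text(page, "de", cache, fake_translate)
print(first_calls, second_calls)  # 3 0
```

In this sketch the second request for the same page is served entirely from the cache, so no translation calls are made.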
Handling changes
If the source text for a page is changed, some or all of the fragments it generates will also change. Any fragment that is unchanged still matches its cached copy, which remains valid and continues to be used. Where JCOGS Auto-Translate finds fragments that do not match any previously cached text, it triggers translations for just those new fragments. Small changes in text therefore do not trigger ‘whole page’ re-translations, which keeps the overall cost of translation low and speeds up the generation of translated pages.
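As a rough illustration of why a small edit leads to a small translation job, the following Python sketch compares the fragments of an edited page against a set of already-cached fragments. The splitter and the function name `fragments_needing_translation` are assumptions made for this example, not part of JCOGS Auto-Translate.

```python
import re


def split_into_fragments(text):
    """Naive stand-in splitter: break text after sentence-ending punctuation."""
    return [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]


def fragments_needing_translation(new_text, cached_fragments):
    """Return the fragments of new_text that are not cached, in order, once each."""
    seen = set()
    missing = []
    for fragment in split_into_fragments(new_text):
        if fragment not in cached_fragments and fragment not in seen:
            seen.add(fragment)
            missing.append(fragment)
    return missing


old_page = "Our shop opens at nine. We sell books. Visit us soon."
new_page = "Our shop opens at ten. We sell books. Visit us soon."

# Pretend the old page has already been translated and cached.
cached = set(split_into_fragments(old_page))

to_translate = fragments_needing_translation(new_page, cached)
print(to_translate)  # ['Our shop opens at ten.']
```

Only the edited sentence is sent for translation; the two unchanged sentences are found in the cache.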
Usage Notes
Use with other caching systems
JCOGS Auto-Translate's caching system only updates text in its cache when EE processes a template that contains JCOGS Auto-Translate tags and finds that the cache does not contain a previously completed translation of a text fragment. If you are using either quasi-static caching (e.g. EE's Template Caching) or a fully static caching system (such as Speedy or Rocket), EE may not notify JCOGS Auto-Translate that the source text has changed when a page held in that caching system is requested, and so required changes to the translated text may not be made.
Unfortunately, control of the caching strategies used by EE and third-party caching add-ons is outside the scope of JCOGS Auto-Translate. To keep the cached text up to date, you will need to find a way to ensure that your static cache copy of the site is refreshed periodically: JCOGS Auto-Translate will update its translations during such periodic refresh events.
Which Translation Service?
JCOGS Auto-Translate supports multiple translation services: currently you can choose between DeepL and MS Azure AI Translator. Which service should you choose? Based on several years of experience working with these services, there are three criteria you should consider in making a decision:
- Cost
- Language Support
- Translation Quality
Cost
Machine Translation Services charge for their services based on the number of characters submitted to their APIs for translation: a typical cost is €10 for 500,000 characters.
The character count used by the APIs includes any non-printing characters sent. This is one of the reasons that JCOGS Auto-Translate fragments the source text before sending it to the machine translation service: fragmentation can often remove large volumes of non-printing characters from the material sent to be translated.
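To make the arithmetic concrete, here is a small Python sketch that estimates cost from a character count using the typical rate quoted above (€10 per 500,000 characters), and shows how removing surplus whitespace shrinks the billable count. The function names are hypothetical, actual prices vary by service and plan, and collapsing whitespace is only a stand-in for whatever clean-up fragmentation actually performs.

```python
import re


def estimate_cost_eur(char_count, eur_per_block=10.0, block_size=500_000):
    """Estimate cost at an assumed rate of eur_per_block per block_size characters."""
    return char_count * eur_per_block / block_size


def collapse_whitespace(text):
    """Stand-in clean-up: squeeze runs of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


# Template output often carries indentation and blank lines that would be billed.
raw = "  Welcome to our site.\n\n\t\tWe sell books.   \n"
cleaned = collapse_whitespace(raw)

print(len(raw), len(cleaned))        # billable characters before and after clean-up
print(estimate_cost_eur(1_000_000))  # 20.0
```

At the assumed rate, one million characters would cost about €20, so any characters stripped before submission reduce the bill in direct proportion.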
During site development the site text may go through multiple changes, so many copies of the site text might be translated. JCOGS Auto-Translate's approach based on fragmentation and caching will help reduce this cost, but it is not uncommon for a moderately large site to incur translation costs of tens to hundreds of euros, so bear this in mind when planning your site translation strategy.
DeepL offers a “free” service that provides API translation of 500,000 characters per month at no cost. This is a useful option when you are doing initial evaluations of your approach.
Language Support
Naturally, a machine translation service is only useful if it can translate text from your source language to your destination language. According to one reliable estimate there are over 8,000 languages in use around the world, but the large majority of the world population can be reached via a much smaller number of languages: the 10 most widely spoken languages enable you to reach about ⅔ of the world population. Machine translation services typically offer between 30 and 200 languages, so whichever service you choose, it is highly likely that you will find the language pairs you require. Of the two services currently supported by JCOGS Auto-Translate, Microsoft's MS Azure AI Translator offers by far the most language options (185, compared to 32 for DeepL), though many of the MS language offerings are somewhat obscure (Klingon, Traditional Mongolian etc.).
Translation Quality
For some this is the most important criterion. Machine translation services use many different approaches, and the choice of approach is probably driven by the economics faced by the service provider. Unsurprisingly, some approaches work better than others, and in the past machine translation services garnered a somewhat poor reputation for translation quality. Changes since 2016 have improved the reputation of Google Translate (the most widely used machine translation service), and others, in particular DeepL, have gained respect for the quality of the translations they generate.
Of the two services currently supported by JCOGS Auto-Translate, DeepL generates what are generally viewed as far superior translations: within the language pairs it supports, it provides much more natural language phrasing and has a better grasp of the meaning of colloquialisms and idioms in the source and target materials.