For all their advantages, Statistical Machine Translation systems require massive quantities of data to build a reliable statistical model for translating text. The corporate data available for training is usually much smaller than that and is commonly augmented with non-corporate language. Because the technology is so sensitive to its training data, bilingual translation output may not align with the intended corporate language, branding and unique choice of words. This can result in costly post-editing work and limit the use of machine translation in corporate multilingual communications.
This “data dilution effect” has been one of the key areas of research for Safaba, leading to the introduction of its ground-breaking Language Optimization Technology™, a feature that distills the corporate language early in the translation process, overcoming the “data dilution effect” and faithfully reproducing unique corporate brand language, choice of words and terminology.