Rule-Based Machine Translation systems use large collections of rules, developed manually over time by human experts, that map structures from the source language to the target language. The human factor in rule-based systems helps deliver fairly good automated translations with predictable results. However, because of the significant manual labor involved, rule-based systems can be costly and time-consuming to implement and maintain. In addition, as rules are added and updated, they can begin to conflict with one another, introducing ambiguity and degrading translation quality over time.
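As a rough illustration of the rule-based idea described above, the following minimal sketch translates a few English phrases into Spanish using a hand-written lexicon and a single hand-written reordering rule (adjective before noun becomes noun before adjective). The lexicon entries, the tag names and the function name translate_rule_based are all assumptions made for this example; production systems rely on far larger rule sets and full morphological and syntactic analysis.

```python
# Toy rule-based translation, English -> Spanish (illustrative only).
# LEXICON and the single reordering rule are hypothetical, hand-written examples.

LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "big": ("grande", "ADJ"),
    "car": ("coche", "NOUN"),
    "dog": ("perro", "NOUN"),
}


def translate_rule_based(sentence):
    # Lexical transfer: look up each word; unknown words pass through unchanged.
    tagged = [LEXICON.get(word, (word, "UNK")) for word in sentence.lower().split()]

    # Structural transfer: apply the rule "ADJ NOUN -> NOUN ADJ".
    output = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (
            i + 1 < len(tagged)
            and tagged[i][1] == "ADJ"
            and tagged[i + 1][1] == "NOUN"
        )
        if is_adj_noun:
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(translate_rule_based("the red car"))  # el coche rojo
```

Every new word or construction needs another hand-written entry or rule, which is where the maintenance cost mentioned above comes from. Note that even this tiny lexicon ignores grammatical gender, something a real rule set would have to handle explicitly.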
Statistical Machine Translation systems use computer algorithms to select, from millions of candidate permutations, the translation that scores best statistically. Statistical models consist of words and phrases learned automatically from bilingual parallel sentences, creating a bilingual “database” of translations. The attractiveness of statistical systems comes from the level of automation in building new systems with their machine learning capabilities, which leads to rapid turnaround times, and from the low cost of the processing power required to construct and operate these models. However, the major downside of this type of engine is the “data-dilution effect” caused by the scarcity of suitable data for ‘training’ these data-driven systems.
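The statistical approach described above can be sketched in a heavily simplified form as follows. The example assumes the parallel data has already been aligned into word pairs (ALIGNED_PAIRS is invented for illustration), estimates translation probabilities by relative frequency, and picks the most probable target for each source word. The function names are hypothetical, and real systems also score multi-word phrases, reordering and target-language fluency when searching over candidates.

```python
from collections import Counter, defaultdict

# Hypothetical, already-aligned (source, target) pairs standing in for
# what would be extracted from bilingual parallel sentences.
ALIGNED_PAIRS = [
    ("house", "casa"), ("house", "casa"), ("house", "hogar"),
    ("the", "la"), ("the", "el"), ("the", "la"),
    ("white", "blanca"),
]


def build_phrase_table(pairs):
    # Count how often each source item was aligned to each target item.
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1

    # Convert counts to relative-frequency estimates of p(target | source).
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: c / total for t, c in targets.items()}
    return table


def translate_statistical(sentence, table):
    # Pick the most probable target for each word; unknown words pass through.
    output = []
    for word in sentence.lower().split():
        options = table.get(word)
        output.append(max(options, key=options.get) if options else word)
    return " ".join(output)


table = build_phrase_table(ALIGNED_PAIRS)
print(translate_statistical("the house", table))  # la casa
```

No rules are written by hand here: adding more aligned data changes the probabilities automatically. The flip side is that a word with few or no training examples, such as "white" above with a single observation, gets an unreliable estimate, which is one way to picture the data scarcity problem.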
Hybrid Machine Translation. To address quality and time-to-market limitations, many Rule-Based Machine Translation developers are augmenting their core technology with Statistical Machine Translation technology to create ‘Hybrid Machine Translation’ solutions. Hybrids provide some quality improvements; however, they keep the costs of Rule-Based systems high by adding the complexity of managing two systems side by side.
Next Generation Approaches. New “augmented” Machine Translation solutions are emerging that upgrade the capabilities, and overcome the limitations, of Statistical Machine Translation. By introducing sophisticated data pre-processing (Language Transformation), Language Optimization Technologies and terminology management solutions, these new Statistical MT solutions are achieving the same quality improvements introduced by Hybrid MT while dispensing with the need for legacy technology, delivering a new standard in multilingual communication solutions.