As seen in our previous post, Machine Translation (MT) has recently become extremely popular, and choosing among the available MT systems can be hard for both companies and freelancers.
According to experts, there are four types of MT: Rule-based, Statistical, Hybrid and Neural.
- Rule-based MT (RbMT) relies on a set of rules covering all aspects of the language, together with dictionaries and glossaries created for specific enterprises or specialisms. It analyses the source language by means of these rules and then creates the sentences in the target language;
- Statistical MT (SMT), also known as Data-driven Machine Translation, is not aware of language patterns, but it operates through algorithms which analyse huge amounts of data in the language pairs and work out the most probable matches.
RbMT often produces accurate translations and can be more effective for uncommon language pairs: when there is not enough data, SMT cannot carry out a reliable statistical analysis or provide sound results. On the other hand, RbMT is more expensive because dictionaries have to be created and the software customised, and it is less efficient than SMT at understanding metaphors and informal language.
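To make the idea of "most probable matches" concrete, here is a minimal sketch in Python. It is only a toy illustration, not how any real SMT engine is built: the word pairs in `ALIGNED_PAIRS` and the helper names `build_table` and `most_probable` are made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (source word, target word) pairs as they might be
# extracted from aligned English-Italian sentences.
ALIGNED_PAIRS = [
    ("house", "casa"), ("house", "casa"), ("house", "abitazione"),
    ("book", "libro"), ("book", "libro"), ("book", "prenotare"),
]


def build_table(pairs):
    """Count how often each source word was seen with each target word."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    return counts


def most_probable(table, word):
    """Return (best target word, relative frequency), or None if unseen."""
    candidates = table.get(word)
    if not candidates:
        return None  # no data for this word: nothing to estimate from
    target, count = candidates.most_common(1)[0]
    return target, count / sum(candidates.values())


table = build_table(ALIGNED_PAIRS)
print(most_probable(table, "house"))
print(most_probable(table, "cat"))
```

The second call returns `None` because the word never appears in the data, which is the same problem SMT faces on a larger scale with uncommon language pairs.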
- Hybrid MT (HMT) combines Rule-based and Statistical Machine Translation and can work in two different ways: either RbMT translates the text and SMT then reviews it, or SMT performs the actual translation and RbMT refines it with regard to linguistic features;
- Neural MT (NMT) is a new system which uses a neural network loosely modelled on the human brain: it first examines the sentence in the source language in depth, then builds a first representation in the target language, and lastly generates the actual translation.
Last year, the global search engine company Google Inc. launched its NMT in order to improve one of the most popular MT systems in the world: Google Translate.
But how does Google Translate actually work?
It's classified as a Statistical Machine Translation system and it normally follows one of two procedures:
- direct, when a text is translated directly from the source language to the target language;
- pivot, when the transfer from source to target is carried out via another language.
The first method is used if there is a sufficient amount of data to analyse in the language pairs and source and target are similar to each other, or when one of the two languages is English.
In cases where the source and target belong to different language families and there is not enough data, the content is translated via English.
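The choice between the two procedures can be sketched as a small routing function. This is a simplified illustration under stated assumptions, not Google's actual logic: the corpus sizes in `PARALLEL_DATA`, the `MIN_PAIRS` threshold and the function names are all hypothetical, and the similarity between languages is not modelled at all.

```python
# Hypothetical corpus sizes (number of aligned sentence pairs) per language pair.
PARALLEL_DATA = {
    ("it", "es"): 5_000_000,
    ("it", "en"): 9_000_000,
    ("en", "ja"): 8_000_000,
    ("it", "ja"): 20_000,
}

MIN_PAIRS = 1_000_000  # assumed threshold for "a sufficient amount of data"
PIVOT = "en"           # pivot language, as described in the text


def corpus_size(src, tgt):
    """Look up the amount of data for a pair, in either direction."""
    return PARALLEL_DATA.get((src, tgt), PARALLEL_DATA.get((tgt, src), 0))


def plan_route(src, tgt):
    """Return the list of translation steps: one (direct) or two (pivot)."""
    if PIVOT in (src, tgt) or corpus_size(src, tgt) >= MIN_PAIRS:
        return [(src, tgt)]               # direct translation
    return [(src, PIVOT), (PIVOT, tgt)]   # translate via English


print(plan_route("it", "es"))
print(plan_route("it", "ja"))
```

With this toy data, Italian to Spanish is translated directly, while Italian to Japanese goes through English because too little data is available for that pair.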
All of the data that Google Translate uses comes from the web.
According to experts, the main flaws of this MT system concern morphology (such as verb conjugations, noun declensions and articles), especially when the target language is morphologically richer than the source, as well as word order and a lack of vocabulary for specific domains.
The main issues that concern Google NMT at present are that it only works from and into English, it struggles with uncommon words, and it is quite expensive from a computational point of view. Despite these issues, its introduction has so far produced a general improvement in translation quality and accuracy.