THE PROBLEM OF CORRELATION BETWEEN MANUAL AND AUTOMATED ASSESSMENT OF MACHINE TRANSLATION

Authors

DOI:

https://doi.org/10.31494/2412-9208-2022-1-3-379-388

Keywords:

machine translation, target language, source language, improvement, contextual meaning, communication.

Abstract

The article outlines the problems of developing and assessing machine translation, which can greatly facilitate global communication despite the imperfect quality of the resulting text. Most often, the output of online tools requires post-editing and can be used effectively only by those who already speak the target language to some extent. The need for competent translation grows every year, and the search for an algorithm that delivers this quality of translation is one of the most important open questions in computer science and linguistics, which informs the scientific relevance of this work. The article analyzes different approaches to machine translation systems, their characteristics, their efficacy, and the quality of their output. The main problems we see in such translations stem from the fact that the systems depend on large amounts of high-quality data (i.e., corpora of texts for specific language pairs). The quality of these sets directly influences the quality of the output, that is, of the target-language text. This can be seen by comparing the average quality of translation between Google's and Microsoft's systems: the former makes fewer mistakes on average and has fewer problems identifying the contextual meaning of a polysemantic lexeme. The article underlines that this issue can be mitigated in one of two ways: hiring professional translators and linguists to compile parallel corpora, or creating a way for anyone to contribute to the process, even on a small scale. The first approach would be very time- and labor-consuming but would ultimately provide a higher-quality data set, which may lead to further improvements in MT. The second is already being deployed by all three major NMT systems but may lead to slower progress due to the lack of quality control and oversight. A promising direction for further research is to widen the subject area of the texts chosen, so that they reflect the variety of writing styles in use on the Internet today. Including texts from confessional, business, and other styles may reveal more lacunae in the neural network models and suggest further means of improvement.
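The correlation between manual and automated assessment that the title refers to can be sketched in code. The example below is illustrative only: the simplified single-reference BLEU, the toy sentence pairs, and the human adequacy scores are all invented for this sketch and are not the article's actual data or method.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(candidate, reference, max_n=2):
    # Simplified sentence-level BLEU: geometric mean of clipped
    # 1..max_n-gram precisions, times a brevity penalty.
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        log_prec += math.log(max(overlap, 1e-9) / total)
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(log_prec / max_n)

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented toy data: MT outputs, one reference translation each, and
# hypothetical human adequacy scores on a 1-5 scale.
refs = [
    "the cat sat on the mat",
    "he reads a book every day",
    "the weather is nice today",
    "she went to the market",
]
outputs = [
    "the cat sat on a mat",
    "he book reads day",
    "the weather is nice today",
    "market she went",
]
human = [4.5, 2.0, 5.0, 1.5]
auto = [sentence_bleu(o, r) for o, r in zip(outputs, refs)]
correlation = pearson(auto, human)
```

A high positive value of `correlation` would indicate that the automated metric tracks human judgment on this sample; in practice, established implementations (e.g., sacreBLEU) and far larger test sets would be used.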


Published

2022-12-06