Constructing a large-scale English-Persian parallel corpus
Material type: Continuing resource
Language: Persian
Series: Meta, Volume 54, no. 1, January 2009; v. 54, n. 1
Publication details: Montréal: Université de Montréal, January 2009
Description: p. 181-188, illus.
ISBN: 978-2-7606-2146-6
ISSN: 0026-0452
Item type | Current library | Collection | Call number | Status | Due date | Barcode |
---|---|---|---|---|---|---|
Articles/Analytics | Biblioteca Bartolomé Mitre | Digital Collection | H 23 | Available | | META-54-1_181-188 |
Articles/Analytics | Biblioteca Bartolomé Mitre | General Collection | H 23 | Available | | |
Includes references.
In this paper we present our work on constructing and using English-Persian parallel corpora to support research in fields such as English-Persian bilingual lexicography, developing translation memory software, English-Persian cross-language information retrieval, and statistically-based machine translation from English into Persian. We have tried to design a program to automatically align the corpus at sentence level, but here, our main concern is to introduce the procedures and techniques for developing an online parallel corpus of English and Persian texts in various domains. This corpus is extendable: more and more parallel sentences in the two languages may be added, and it will be provided free to those interested in language and translation matters, especially translation trainees. One of the main activities associated with building such a corpus is developing software for parallel concordancing, in which a user can enter a search string in one language and see all citations for that string in the search language as well as corresponding sentences in the target language. Aligned bilingual corpora have in fact proved useful in many tasks, including machine translation (Brown et al. 1990; Sadler 1989), sense disambiguation (Brown et al. 1991a; Dagan et al. 1991; Gale et al. 1991), cross-language information retrieval (Davis and Dunning 1995; Landauer and Littman 1990; Oard 1997) and bilingual lexicography (Klavans and Tzoukermann 1990; Warwick and Russell 1990).
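The parallel concordancing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software: the `SentencePair` record, the `concordance` function, and the three sample sentence pairs are all hypothetical, and matching is a plain case-insensitive substring test chosen for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentencePair:
    """One sentence-aligned pair (hypothetical record layout)."""
    english: str
    persian: str


def concordance(corpus, query, language="english"):
    """Return every aligned pair whose `language` side contains `query`.

    Assumption: a case-insensitive substring match stands in for the
    tokenized, indexed search a real concordancer would likely use.
    """
    if language not in ("english", "persian"):
        raise ValueError("language must be 'english' or 'persian'")
    needle = query.casefold()
    hits = []
    for pair in corpus:
        searched_side = getattr(pair, language)
        if needle in searched_side.casefold():
            hits.append(pair)
    return hits


# Toy data invented for illustration; not taken from the corpus in the paper.
corpus = [
    SentencePair("The book is on the table.", "کتاب روی میز است."),
    SentencePair("I read a book yesterday.", "دیروز یک کتاب خواندم."),
    SentencePair("The weather is nice today.", "امروز هوا خوب است."),
]

# Show each citation in the search language next to its aligned sentence.
for hit in concordance(corpus, "book"):
    print(hit.english, "|", hit.persian)
```

Because each hit carries both sides of the aligned pair, the same function serves searches in either direction, for example `concordance(corpus, "کتاب", language="persian")`.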