Thursday, June 19, 2008

Chinese-to-English Translation: A Field with Great Promise

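Last night, in a hotel on Chang'an Avenue just east of Tiananmen, I was searching the web and happened to discover that "English Google" (Google in English) had indexed the pages of "語國一方". Occupational habit, plus curiosity, led me to click the [Translate this page] link beside the result, and up came an English version of "語國一方". Ha, the machine translation was riddled with problems and truly painful to read! To the students and friends who aspire to work with Chinese and English: this field has great promise, and you might seriously consider making it part of your career plans.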

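Let's take the previous post as an example and see how this translation program performs. For the Chinese original, please scroll back to the previous post. The English is what the web's automatic translation produced, and the remarks in square brackets are my offhand comments: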

語國一方 Language of the Party [The language of a political party?]

奧運前北京掠影 (1) Glimpse of Beijing before the Olympic Games (1) [Not bad, though the plural "Glimpses" would probably be better]

Photo 1: Mengmeng ["朦朦" (hazy) is not a proper noun!] axis -- from Jingshan million Chunting [for 萬春亭 (Wanchun Pavilion)?], overlooking the classic Beijing

Photo 2: red axis -- Chairman Mao. Tiananmen Square. Side door [側門 (side gate) = 端門 (Duanmen)?]. Kenneth doors [肯尼士門 ("Kenneth Gate"): does anyone still say that nowadays?]

Photo 3: Lian Lian axis ["戀戀" (lingering fondness) is not a proper noun either!] -- from Zhengyang Men (front door) South Wang [南望, "looking south"?] Jian Lou

Photo 4: alert today, why was [What is this?] -- Chongwen outside the ruins of the city wall [You've got the meaning backwards, buddy!], demolished around 1960, after the successful Olympic bid to recover the people collecting the ancient city of brick [What kind of English is this!]

Photo 5: both traditional and modern -- Beijing Capital Airport has just completed on the 3rd terminal, look like the dragon coiled, internal significant modern atmosphere. [Surely nobody with second-year senior high school English would write it like this?]

Photo 6: No matter how far,统统two yuan [Miss, that is far too lazy: copying the original over without translating it!] -- near the opening of the Metro Line 5, the North East from the three flags of the North Tiantong Yuan, arrived in the South, the Southern Song Jiazhuang three-ring along the axis in the East about two kilometers north and south, runs through [If this were a translation question on a major entrance exam and I were the grader, I would definitely give it a big fat zero!]

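Photo 7: New Bus -- two of the body [What did you say?], let me recall 16 years ago when the bus ride to Taiji [You don't even understand "太擠" (too crowded)? That's ridiculous!], friends stuck Xia Bulai [Please, "下不來" (can't get off) is not a person's name!], no one is willing to let the car [Let the car? (讓車)], in front of my car The passenger vehicle burst Cukou, Hurricane obscenity ["爆粗口" (to blurt out swear words) = to burst the "Cukou"; "狂飆髒話" (to spew profanity) = an obscenity hurricane?]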

Photo 8: "North" Beijing Station -- Towards the Olympic Games, Beijing Railway Station workers are at the top, to the three big red characters System in New Look [I really don't understand what you're saying!], plus English

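Photo 9: Cangcang days, were the vast, where the family [My English is too poor, I can't follow this!] -- Beijing Railway Station Square, first-class train [Buddy, you got it wrong! It's "北京火車站廣場上/等火車" (in Beijing Railway Station Square / waiting for a train), not "北京火車站廣場/上等火車" (Beijing Railway Station Square / first-class train), OK?], sat on the floor of the elderly [Good grief! What kind of grammar is this?]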

Posted by Chinese yuan [Since when can my name, "曾泰元", be translated as "Chinese yuan"?]

4 comments:

eubin said...

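Teacher:

In fact, on Yahoo! Kimo Knowledge+ (奇摩知識+) a lot of people answer other people's Chinese-English translation questions this way, and you see it on some BBS boards too. These have practically become Flora's after-work collection of jokes!
As it happens, LFG (Lexical Functional Grammar), a syntactic theory in linguistics, was developed with machine translation as one of its goals. The interesting part is that this theory can be converted directly into program code: the translation process first performs a syntactic analysis and only then translates. I would think machine translation done this way should be much more accurate!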

曾泰元 said...

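Eubin, I don't know machine translation, but from what I've heard, approaching it from syntax has many blind spots that remain hard to overcome to this day. There is currently a school of scholars who focus on corpora, using large amounts of attested language data to fill the gaps left by syntax, and accuracy is said to have improved a great deal!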

I still can't figure out how a computer translation program turned "曾泰元" into Chinese yuan. What is "曾泰" = Chinese based on?

Ha, this isn't a brain teaser! Can anyone help clear this up for me?

eubin said...

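Teacher:

Here's my thinking: the translation system split your name into "曾泰" and "元". Since "元" has a direct equivalent, it was translated straight to yuan. The software then assumed "曾泰" was some kind of yuan. Finding no corresponding word to translate it with, and because of the "元", it guessed that this was some kind of Chinese money, hence Chinese Yuan.
Using a corpus as the basis for translation can indeed raise accuracy. One drawback, though, is that the creativity of language far outpaces the speed at which a corpus can be compiled. So relying on a corpus will still leave a lean period in between: when the sentence to be translated is missing from the corpus, the result may not be satisfactory either. Still, corpora are a good direction.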
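
The splitting guess above can be sketched in a few lines of Python. This is only a toy illustration: the lexicon, the function names `segment` and `gloss`, and the greedy longest-match strategy are all assumptions made up for the example, not a description of how the actual translation program works.

```python
# Toy lexicon: purely illustrative, not taken from any real translation system.
LEXICON = {"元": "yuan", "北京": "Beijing", "火車站": "railway station"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment(text):
    """Greedy forward longest-match segmentation.

    Characters that match no lexicon entry are merged into one unknown run.
    """
    tokens = []
    unknown = ""
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shorter ones.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in LEXICON:
                match = candidate
                break
        if match is None:
            unknown += text[i]  # keep collecting unknown characters
            i += 1
        else:
            if unknown:
                tokens.append(unknown)  # flush the unknown run before the match
                unknown = ""
            tokens.append(match)
            i += len(match)
    if unknown:
        tokens.append(unknown)
    return tokens


def gloss(tokens):
    """Look each token up; mark anything missing from the lexicon as unknown."""
    return [LEXICON.get(token, "<unknown:" + token + ">") for token in tokens]


print(segment("曾泰元"))         # ['曾泰', '元']
print(gloss(segment("曾泰元")))  # ['<unknown:曾泰>', 'yuan']
```

With this toy lexicon the name does fall apart into an unknown "曾泰" plus a known "元". How a real system would then guess a word for the unknown part (such as "Chinese") is not modelled here. The same sketch also shows the coverage problem: anything missing from the word list simply comes out as unknown.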

Anonymous said...

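I think what Teacher Eubin says is quite plausible... but why would the "曾泰" yuan be Chinese money rather than some other country's...?
Ha, to borrow a line from Teacher 關辰雄: "This is truly baffling!"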