This document is downloaded from CityU Institutional Repository, Run Run Shaw Library, City University of Hong Kong.

Title: A simple automatic Chinese terminology standardization builder
Author(s): Han, Xiaoyu (韩晓瑜)
Citation: Han, X. Y. (2011). A simple automatic Chinese terminology standardization builder (Outstanding Academic Papers by Students (OAPS)). Retrieved from City University of Hong Kong, CityU Institutional Repository.
Issue Date: 2011
URL: http://hdl.handle.net/2031/6451
Rights: This work is protected by copyright. Reproduction or distribution of the work in any format is prohibited without written permission of the copyright owner. Access is unrestricted.

CITY UNIVERSITY OF HONG KONG
Final Year Project Report

A simple automatic Chinese terminology standardization builder

Student Name: HAN Xiaoyu
Course Code: CTL4235
Supervisor: Dr. LUN, Suen Caesar
Due Date: 19/05/2011

Contents
0. Abstract
1. Introduction
2. Background
3. Method
3.1 Preparation
3.11 Lexicons
3.12 Database of Core and Translation Terms
3.13 Corpus of definitions
3.14 Rule Base
3.2 Process
3.21 Step 1
3.22 Step 2
3.23 Step 3
3.24 Step 4
4 Results
6 Discussion and Implication
7 Conclusion
8 References
9 Appendices
9.1 Appendix A: List of terms containing "data"
9.2 Appendix B: Basic terms
9.3 Appendix C: Analysis of samples

Tables and Figures
Table 1: The list of rules in the rule base
Table 2: The ordered rules used in Step 3
Table 3: Analysis of "data"
Table 4: Analysis of "data base"
Table 5: Analysis of "data bank"
Figure 1: The process chart of the Chinese terminology builder

0. Abstract

Terminology systems are not yet well developed, so terminology standardization is both necessary and urgently needed. Terminology standardization always involves a choice among competing terms. In this paper, two Chinese terminology systems, those of the Mainland and Taiwan, are used as samples, and a Chinese terminology builder is introduced from two aspects. On the one hand, the builder requires four resources to be prepared: lexicons, a database of core and translation terms, a corpus of definitions, and a rule base. On the other hand, a term passes through the builder step by step. First, a term is input into the builder; if the input is a compound word, it is broken into individual words. Second, each individual word is looked up in a technical dictionary to find its meanings in the particular field and the corresponding senses for each word. Third, the most appropriate term for each individual word is selected by applying the rules in the rule base; this step is the core of the process. Fourth, the terms chosen for the individual words are combined to yield the standardized term. Using the builder, several typical examples are generated and the two Chinese terminology systems are compared. Finally, some shortcomings of the builder and difficulties that may arise in the process of terminology standardization are discussed.

1. Introduction

Terminology standardization is one of the language-related tasks that linguistic knowledge can facilitate. With the rapid development of high technology, there has been growing interest in the study of technical terms, and the need for standardized computing terminology has increased. Within the computing domain, a number of defects need to be remedied, including vague definitions, inconsistency across regions, coexisting terms for one concept, and unreasonable terms. In the context of the Chinese language, the situation is rather complex because of the existence of two competing political entities and two scripts (traditional and simplified) (Lun, 1997). In order to achieve consistency in nomenclature, and thus transparency within the terminology, this paper proposes a simple automated Chinese terminology builder through which terms can be constructed systematically and mechanically.

2. Background

Terminology is an interdisciplinary undertaking that requires dedicated effort from experts in many fields, including linguists, lexicographers, lexicologists, computer scientists, cognitive psychologists and specialists in each subject field, and possibly many others (Lun, 1997). This paper builds on the results of previous research.

Terminology can be standardized in different ways. First, different alternatives can be invented by people working in various fields. After a period of contention, one of the alternatives is adopted, or different alternatives coexist by finding different usages. This is probably the most common method; it requires little coordination and is rather democratic, but it takes a long time and slows down the standardization process. Second, a national standardization institution may collect opinions and then determine which term is to be codified as the standard usage. Compared with the first method, the second is a top-down approach that requires a great deal of coordination and centralized effort; it is quite authoritative but expensive. Third, terminology can be coined deliberately and systematically by a panel of concerned parties. This method can pool the strengths of all those involved and lets the computer do some preliminary work, providing alternative terms according to linguistic findings (Lun, 1997). The automated Chinese terminology builder proposed in this paper could serve as an integral part of the third method.

3. Method
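As a rough orientation for this section, the four steps summarized in the abstract (split, look up, select, combine) might be sketched as follows. This is only an illustrative sketch, not the builder itself: the toy lexicon, all function names, the whitespace-based splitting, and the plain concatenation in the last step are assumptions made for the example, and the selection step is a stub where the rule base would be applied.

```python
# Hypothetical toy lexicon: English word -> candidate Chinese senses in the
# computing field. The entries are illustrative placeholders only.
TOY_LEXICON = {
    "data": ["数据", "资料"],
    "base": ["库", "基"],
    "bank": ["库", "银行"],
}


def split_compound(term):
    """Step 1: break a compound term into individual words (assumes spaces)."""
    return term.lower().split()


def look_up_senses(word, lexicon):
    """Step 2: find the candidate senses of a word in the field lexicon."""
    return lexicon.get(word, [])


def select_sense(candidates):
    """Step 3 (placeholder): pick one candidate.

    The paper applies the rule base here; this stub simply takes the
    first candidate so that the pipeline can run end to end.
    """
    return candidates[0] if candidates else None


def build_term(term, lexicon):
    """Step 4: combine the chosen sense of each word into one term.

    Plain concatenation in source word order is an assumption of this sketch.
    """
    chosen = []
    for word in split_compound(term):
        sense = select_sense(look_up_senses(word, lexicon))
        if sense is None:
            return None  # a word without any sense cannot be standardized
        chosen.append(sense)
    return "".join(chosen)


print(build_term("data base", TOY_LEXICON))
```

Keeping each step as a separate function means the stub in Step 3 can later be swapped for a rule-based selector without touching the other steps.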
As Kageura (2002) observed, a theoretical study of terminology has to start from a set of terms that is representative of the terminology of a domain rather than from individual terms. Appendix A lists 102 selected sample terms containing "data" in the computing domain.

3.1 Preparation

Before the automatic Chinese terminology builder can be set up, some preparation has to be done.

3.11 Lexicons

The lexicon is a combination of ordinary vocabulary and a technical dictionary for the field concerned. For instance, the basic senses of "element" in Chinese are (基本, basic) 要素、成份、部分、分子, and its senses vary across fields:

(化学, chemistry) 元素;
(数学, mathematics) 元、素;
(机械, machinery) 单元、单体;
(无线电, radio) 元件;
(植物学, botany) 原种;
(军事, military) 小队、分队;
等等 (etc.).

The lexicon stores all of this information, and terms are retrieved from it during terminology standardization. It can also be used to calculate the frequency of particular terms for comparison. A typical example is "block" in Appendix C, which is translated as 段 on the basis of its frequency in the database. The larger the lexicon, the more accurate the frequency counts will be.

3.12 Database of Core and Translation Terms

Besides the translations used in a terminology system or dictionary, direct translations and transliterations may also be sources of alternative terms. Moreover, filtering core words from the definition is another way to create possible terms. For example, based on the definition of "concentrator", 集散器 is created to combine its two major functions (Appendix C).

3.13 Corpus of definitions

This refers to the professional definitions of the vocabulary of a domain. The basic assumption is that there are ideal technical dictionaries in English, the universal language of the sciences, with well-written sense definitions. In reality, that may be far from the truth. In this paper, [...] and [...] were taken as the sample definitions for computer science.

3.14 Rule Base

The rule base consists of phrase structure rules for parsing the definition and for choosing which particular sense to use in a term. Subsets of rules can be invoked to distinguish synonyms and determine which alternative term is the best translation. The rules cover humanistic aspects, the size of the scope referred to, the accuracy of the information, and so on.

Table 1: The list of rules in the rule base

Rule | Chinese | Definition
Basic meaning | 基本意义 | Based on their definition, basic function, process, features, etc.
Size variousness | 大小 | Based on their size differences, such as scope, physical properties, space, etc.
Simplification | 简化 | The principle of being as simple as possible without any ambiguity.
Information accuracy | 信息准确性 | To maintain accuracy: no missing or surplus information.
Ambiguity | 歧义 | Having a meaning that overlaps with, or causes misunderstanding with, other terms.
Humanistic | 人文社会化色彩 | Social and humanistic aspects, including active & passive, optimistic & pessimistic, etc.
Internal ambiguity | 二义性 | Having different senses for an individual term.
Adjacent usage | 搭配使用 | Special usage with some particular adjacent words.
Etymological view | 词源角度 | Whether the word already exists or is newly created.
Terminological | 术语化 | Whether it has terminological features, such as oral or written form, formal or informal, etc.
Part of speech | 词性 | Part of speech, including n, adj, v, adv, etc.
Descriptive | 形象化 | Whether it can be described vividly or not.
Frequency | 使用频率 | The frequency with which the word occurs in the database.
Existence | (存在)抽象&具体 | The way of existing: abstract or concrete.
Internal relation | 内部关系 | Including the sequence, format, etc.
Semantic grammar property | 语义语法属性 | Following the basic principles of semantics and grammar.
Head retrieval | 中心词提取 | To retrieve the head, or most important, word.
Extension | 隐申义 | Meaning extended from the basic meaning.
Universality | 普遍性、广泛性 | Whether it is used universally and widely accepted or not.
Multiattribute | 多层定语 | The order of multiple attributes has to follow basic grammar and the relationship with the core words.
Focus | 语义重心 | Semantic focus.
Naming | 术语规定 | The rules of naming, such as no punctuation o........
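To make the role of the rule base more concrete, the sketch below shows one way ordered rules could narrow a set of competing candidate terms down to a single choice. It is a hedged illustration, not the paper's implementation: the `Candidate` class, the three rule functions, their scoring, the rule order, the alternative candidates 块 and 区块, and all frequency numbers are hypothetical. Only the rule names and the outcome for "block" (段, chosen on frequency) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate Chinese term with hypothetical, hand-assigned features."""
    text: str
    frequency: int   # occurrences in the database (made-up numbers)
    ambiguous: bool  # True if the term overlaps in meaning with others


def rule_ambiguity(c):
    """'Ambiguity' rule: prefer candidates that are not ambiguous."""
    return 0 if c.ambiguous else 1


def rule_simplification(c):
    """'Simplification' rule: prefer shorter candidates."""
    return -len(c.text)


def rule_frequency(c):
    """'Frequency' rule: prefer candidates seen more often."""
    return c.frequency


def apply_rules(candidates, ordered_rules):
    """Filter candidates rule by rule, keeping only the best scorers.

    Stops as soon as a single candidate remains; if several survive every
    rule, the first survivor is returned.
    """
    remaining = list(candidates)
    for rule in ordered_rules:
        if len(remaining) <= 1:
            break
        best = max(rule(c) for c in remaining)
        remaining = [c for c in remaining if rule(c) == best]
    return remaining[0] if remaining else None


# Hypothetical candidates for "block"; the frequencies are invented.
block_candidates = [
    Candidate("区块", frequency=3, ambiguous=False),
    Candidate("块", frequency=9, ambiguous=False),
    Candidate("段", frequency=12, ambiguous=False),
]
# The order below is an assumption of this sketch, not the paper's Table 2.
rules = [rule_ambiguity, rule_simplification, rule_frequency]

print(apply_rules(block_candidates, rules).text)
```

Treating each rule as a small scoring function keeps the order of application explicit, so a different ordering can be tried simply by rearranging the list.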
