Statistical semantics

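In linguistics, statistical semantics applies the methods of statistics to the problem of determining the meaning of words or phrases, ideally through unsupervised learning, to a degree of precision at least sufficient for the purpose of information retrieval.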

History

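The term statistical semantics was first used by Warren Weaver in his well-known paper on machine translation.[1] He argued that word sense disambiguation for machine translation should be based on the co-occurrence frequency of the context words near a given target word. The underlying assumption that "a word is characterized by the company it keeps" was advocated by J. R. Firth.[2] This assumption is known in linguistics as the distributional hypothesis.[3] Emile Delavenay defined statistical semantics as the "statistical study of the meanings of words and their frequency and order of recurrence".[4] Furnas et al. (1983) is frequently cited as a foundational contribution to statistical semantics.[5] An early success in the field was latent semantic analysis.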
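The co-occurrence idea behind Weaver's argument can be illustrated with a minimal Python sketch. It is not Weaver's own procedure: the function name `cooccurrence_counts`, the window size, and the toy sentence are assumptions chosen for illustration. The sketch counts which words appear within a fixed window around each occurrence of a target word.

```python
from collections import Counter


def cooccurrence_counts(tokens, target, window=2):
    """Count words found within `window` positions of each occurrence of `target`.

    Hypothetical helper for illustration; real systems add normalisation,
    stop-word handling, and association measures on top of raw counts.
    """
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the target occurrence itself
                counts[tokens[j]] += 1
    return counts


# Toy sentence (an assumption): "bank" occurs once in each of two senses.
tokens = "the river bank was muddy while the bank approved the loan".split()
counts = cooccurrence_counts(tokens, "bank", window=2)
print(counts.most_common())
```

In this toy sentence the neighbours of the two occurrences of "bank" differ ("river", "muddy" versus "approved"), which is the kind of contextual evidence the distributional hypothesis relies on. Raw counts over a single sentence are far too sparse for actual disambiguation; a large corpus would be needed in practice.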

References

Delavenay, Emile (1960). An Introduction to Machine Translation. New York, NY: Thames and Hudson. OCLC 1001646.
Firth, John R. (1957). "A synopsis of linguistic theory 1930-1955". Studies in Linguistic Analysis (Oxford: Philological Society): 1–32.
Reprinted in Palmer, F.R., ed. (1968). Selected Papers of J.R. Firth 1952-1959. London: Longman. OCLC 123573912.
Frank, Eibe; Paynter, Gordon W.; Witten, Ian H.; Gutwin, Carl; Nevill-Manning, Craig G. (1999). "Domain-specific keyphrase extraction". Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence. IJCAI-99. California: Morgan Kaufmann. pp. 668–673. CiteSeerX 10.1.1.148.3598. ISBN 1-55860-613-0.
Furnas, George W.; Landauer, T. K.; Gomez, L. M.; Dumais, S. T. (1983). "Statistical semantics: Analysis of the potential performance of keyword information systems" (PDF). Bell System Technical Journal 62 (6): 1753–1806. doi:10.1002/j.1538-7305.1983.tb03513.x. S2CID 22483184. Archived from the original (PDF) on 4 March 2016. Retrieved 12 July 2012.
Hearst, Marti A. (1992). "Automatic Acquisition of Hyponyms from Large Text Corpora" (PDF). Proceedings of the Fourteenth International Conference on Computational Linguistics. COLING '92. Nantes, France. pp. 539–545. CiteSeerX 10.1.1.36.701. doi:10.3115/992133.992154. Archived from the original (PDF) on 22 May 2012. Retrieved 12 July 2012.
Landauer, Thomas K.; Dumais, Susan T. (1997). "A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge". Psychological Review 104 (2): 211–240. CiteSeerX 10.1.1.184.4759. doi:10.1037/0033-295x.104.2.211. S2CID 1144461.
Lund, Kevin; Burgess, Curt; Atchley, Ruth Ann (1995). "Semantic and associative priming in high-dimensional semantic space" (PDF). Proceedings of the 17th Annual Conference of the Cognitive Science Society. Cognitive Science Society. pp. 660–665. [dead link]
McDonald, Scott; Ramscar, Michael (2001). "Testing the distributional hypothesis: The influence of context on judgements of semantic similarity". Proceedings of the 23rd Annual Conference of the Cognitive Science Society. pp. 611–616. CiteSeerX 10.1.1.104.7535.
Pantel, Patrick; Lin, Dekang (2002). "Discovering word senses from text". Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining. KDD '02. pp. 613–619. CiteSeerX 10.1.1.12.6771. doi:10.1145/775047.775138. ISBN 1-58113-567-X.
Sahlgren, Magnus (2008). "The Distributional Hypothesis" (PDF). Rivista di Linguistica 20 (1): 33–53. Archived from the original (PDF) on 15 March 2012. Retrieved 20 November 2012.
Sahlgren, Magnus; Karlgren, Jussi (2009). "Terminology mining in social media". CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management. doi:10.1145/1645953.1646006.
Terra, Egidio L.; Clarke, Charles L. A. (2003). "Frequency estimates for statistical word similarity measures" (PDF). Proceedings of the Human Language Technology and North American Chapter of Association of Computational Linguistics Conference 2003. HLT/NAACL 2003. pp. 244–251. CiteSeerX 10.1.1.12.9041. doi:10.3115/1073445.1073477. Archived from the original (PDF) on 3 November 2013. Retrieved 12 July 2012.
Turney, Peter D. (May 2000). "Learning algorithms for keyphrase extraction". Information Retrieval 2 (4): 303–336. arXiv:cs/0212020. CiteSeerX 10.1.1.11.1829. doi:10.1023/A:1009976227802. S2CID 7007323.
Turney, Peter D. (2001). "Answering subcognitive Turing Test questions: A reply to French". Journal of Experimental and Theoretical Artificial Intelligence 13 (4): 409–419. arXiv:cs/0212015. CiteSeerX 10.1.1.12.8734. doi:10.1080/09528130110100270. S2CID 59099.
Turney, Peter D. (2003). "Coherent keyphrase extraction via Web mining". Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence. IJCAI-03. Acapulco, Mexico. pp. 434–439. arXiv:cs/0308033. Bibcode:2003cs........8033T. CiteSeerX 10.1.1.100.3751.
Turney, Peter D. (2004). "Word sense disambiguation by Web mining for word co-occurrence probabilities". Proceedings of the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text. SENSEVAL-3. Barcelona, Spain. pp. 239–242. arXiv:cs/0407065. Bibcode:2004cs........7065T.
Turney, Peter D. (2006). "Similarity of semantic relations". Computational Linguistics 32 (3): 379–416. arXiv:cs/0608100. Bibcode:2006cs........8100T. CiteSeerX 10.1.1.75.8007. doi:10.1162/coli.2006.32.3.379. S2CID 2468783.
Turney, Peter D.; Littman, Michael L. (October 2003). "Measuring praise and criticism: Inference of semantic orientation from association". ACM Transactions on Information Systems 21 (4): 315–346. arXiv:cs/0309034. Bibcode:2003cs........9034T. CiteSeerX 10.1.1.9.6425. doi:10.1145/944012.944013. S2CID 2024.
Turney, Peter D.; Littman, Michael L. (2005). "Corpus-based Learning of Analogies and Semantic Relations". Machine Learning 60 (1–3): 251–278. arXiv:cs/0508103. Bibcode:2005cs........8103T. CiteSeerX 10.1.1.90.9819. doi:10.1007/s10994-005-0913-1. S2CID 9322367.
Turney, Peter D.; Littman, Michael L.; Bigham, Jeffrey; Shnayder, Victor (2003). "Combining Independent Modules to Solve Multiple-choice Synonym and Analogy Problems". Proceedings of the International Conference on Recent Advances in Natural Language Processing. RANLP-03. Borovets, Bulgaria. pp. 482–489. arXiv:cs/0309035. Bibcode:2003cs........9035T. CiteSeerX 10.1.1.5.2939.
Weaver, Warren (1955). "Translation" (PDF). In Locke, W.N.; Booth, D.A. (eds.). Machine Translation of Languages. Cambridge, Massachusetts: MIT Press. pp. 15–23. ISBN 0-8371-8434-7. Archived from the original (PDF) on 29 January 2019. Retrieved 12 July 2012.
Yarlett, Daniel G. (2008). Language Learning Through Similarity-Based Generalization (PDF) (Thesis). Stanford University. Archived from the original (PDF) on 19 April 2014.

[1] Weaver 1955

[2] Firth 1957

[3] Sahlgren 2008

[4] Delavenay 1960

[5] Furnas et al. 1983
