Global Trend Radar
Web: grokipedia.com US web_search 2026-05-06 14:08

WordNet


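Analysis results

Category
AI
Importance
60
Trend score
24
Summary
WordNet is a large lexical database of English that organizes nouns, verbs, adjectives, and adverbs into sets of cognitive synonyms.
Keywords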
WordNet, Grokipedia. Fact-checked by Grok 3 months ago.

WordNet is a large lexical database of English that organizes nouns, verbs, adjectives, and adverbs into sets of cognitive synonyms called synsets, each representing a distinct lexicalized concept, with synsets linked by conceptual-semantic and lexical relations such as hypernymy, hyponymy, meronymy, and antonymy. [1] Developed at Princeton University under the leadership of psychologist George A. Miller, it was designed for use in computational linguistics and natural language processing, providing a structured representation of word meanings inspired by psycholinguistic theories of human lexical memory. [2] Unlike traditional dictionaries or thesauri, WordNet distinguishes between word senses and explicitly labels semantic relations, making it suitable for program-controlled applications like machine translation and information retrieval. [2]

The project began in 1985 as part of Miller's research into the mental lexicon, evolving from earlier psycholinguistic experiments to create an online reference system that combines lexicographic information with computational efficiency. By the mid-1990s, WordNet had grown to include over 90,000 synsets, accompanied by glosses (definitions) and example sentences for many entries, covering a broad range of English vocabulary while emphasizing common words. [3] Its development continued through collaborative efforts involving linguists, psychologists, and computer scientists at Princeton until 2011, after which the core Princeton WordNet ceased active maintenance, though the database remains freely downloadable and widely used. [1]

WordNet has significantly influenced the field of natural language processing, serving as a foundational resource for tasks such as word sense disambiguation, semantic similarity measurement, and ontology construction, and inspiring multilingual extensions through projects like the Global WordNet Association.
[4] While Princeton ceased development after 2011, community efforts such as Open English WordNet continue to update and extend the resource. [5] [6] Despite its English-centric focus and limitations in handling polysemy or rare terms, its open availability and relational structure continue to support research and applications in artificial intelligence. [1]

Overview

Definition and Purpose

WordNet is a large lexical database for the English language, encompassing nouns, verbs, adjectives, and adverbs grouped into sets of synonyms known as synsets, where each synset represents a distinct concept or meaning. [7] These synsets form the basic units of the database, linking words that share similar meanings while emphasizing conceptual organization over alphabetical listing. The Princeton version of WordNet covers approximately 117,000 synsets and includes around 206,000 word-sense pairs, providing a comprehensive resource for semantic exploration. [7]

The primary purpose of WordNet is to offer a structured representation of word meanings and their interrelationships, drawing inspiration from psycholinguistic theories of human lexical memory to model how speakers organize and access vocabulary. [8] This design facilitates applications in computational linguistics by enabling machine-readable access to lexical knowledge, supporting tasks such as natural language processing and semantic analysis that mimic aspects of human language understanding. [2] Unlike traditional dictionaries, which prioritize definitions, pronunciations, and etymology, WordNet focuses exclusively on semantic relations between concepts, treating meanings as interconnected nodes in a network rather than isolated entries. [8]

Current Status

The original Princeton WordNet project ceased active development following the release of version 3.1 in 2011, though its database and associated tools continue to be freely available for download and use.
[9] In response, the Open English WordNet (OEWN) emerged as an active, community-driven fork, maintaining and extending the resource under an open-source model. The 2024 edition, released on November 1, 2024, incorporates significant updates, including a renovated verb hierarchy to improve semantic coherence, the addition of more gendered terms for better representation, and the removal of outdated gendered language to enhance inclusivity. [6] [10] [11]

These efforts are supported by the Global WordNet Association (GWA), which coordinates ongoing community contributions and hosts events such as the 13th International Global WordNet Conference (GWC2025), held in Pavia, Italy, from January 27 to 31, 2025. The OEWN 2024 edition contains approximately 120,630 synsets, reflecting enhancements aimed at inclusivity and semantic accuracy through crowdsourced edits and expert reviews. [12] [13] [14]

Looking ahead, OEWN's maintenance relies on open-source contributions via its GitHub repository, with a focus on format standardization, as evidenced by discussions and updates to Global WordNet Formats in recent GWA publications from 2025. [15]

History

Origins and Development

WordNet originated in 1985 at the Cognitive Science Laboratory of Princeton University, initiated by psychologist George A. Miller to create a machine-readable lexical database modeled on psycholinguistic theories of human memory for words. [8] The project received initial funding from the U.S. National Science Foundation, enabling the assembly of a team dedicated to constructing a semantic network of English vocabulary. [16] The development process relied on manual curation by interdisciplinary teams of linguists and psychologists, who began with nouns and progressively expanded coverage to verbs, adjectives, and adverbs.
[8] Early efforts focused on organizing nouns into synsets (groups of synonyms representing distinct concepts) and linking them via semantic relations, particularly hypernymy (is-a relations forming hierarchies). [8] A key challenge was constructing comprehensive hypernym hierarchies for nouns, which required resolving ambiguities and ensuring hierarchical consistency; this foundational work was largely completed by 1990. [8]

Major milestones marked the project's evolution, with the first public release, WordNet 1.0, occurring in 1995 and providing initial coverage primarily of nouns. Subsequent versions built on this foundation: version 1.5 in 1998 expanded verb coverage significantly, while version 2.0 in 2001 incorporated more adjectives and refined relations across parts of speech. [17] Version 3.0, released in 2006, represented the final major update from the Princeton team, consolidating enhancements in all lexical categories before development shifted to community efforts. [17]

Key Contributors

George A. Miller founded and directed the WordNet project at Princeton University's Cognitive Science Laboratory, establishing its psycholinguistic framework inspired by theories of human lexical memory. [8] He authored the seminal 1995 paper "WordNet: a lexical database for English," which outlined the database's structure and semantic relations among approximately 70,000 synsets at the time. [2]

Christiane Fellbaum served as project leader from the 1990s onward, overseeing the expansion of lexical content and the mapping of semantic relations across parts of speech. [1] She edited the 1998 book WordNet: An Electronic Lexical Database, which compiled foundational descriptions of the project's design, including contributions on nouns, verbs, adjectives, and applications. [4]

Other key figures included Katherine J. Miller, who handled administrative leadership and contributed to the organization of adjective synsets and overall database compilation.
[8] Claudia Leacock specialized in semantic relations, developing methods for sense identification using corpus statistics and WordNet links, as detailed in her work on building semantic concordances. [18]

The project relied on collaborative efforts at Princeton's Cognitive Science Laboratory, involving teams of undergraduate annotators and researchers who manually created and linked synsets for approximately 147,000 unique word forms by the early 2000s. [7] By 2007, the initiative had engaged numerous contributors focused on sense disambiguation and lexical expansion, reflecting its community-driven evolution. This community involvement has continued, with projects like Open English WordNet releasing updated editions annually, including the 2024 edition on November 1, 2024. [19] [6]

Structure and Content

Synsets and Word Senses

In WordNet, the core organizational unit is the synset, defined as a set of one or more synonymous words, known as lemmas, that together represent a single distinct concept or meaning. [20] For instance, the synset for the concept of a common pet includes the lemmas dog, domestic_dog, and Canis_familiaris, all of which can be used interchangeably in appropriate contexts to denote the same idea. [21] Synsets thus capture lexical synonyms while avoiding redundancy by grouping forms that share the same underlying semantics, providing a structured way to represent the mental lexicon. [20]

Many words in WordNet are polysemous, meaning a single word form can participate in multiple synsets to account for different senses depending on context. Each occurrence of a word in a synset corresponds to one of its senses, with senses ordered and numbered by estimated frequency of usage, derived from corpus-based tagging.
[20] For example, the noun bank has separate senses for a financial institution where money is deposited, for the sloping land beside a river, and for a general incline or mound. [21] This numbering facilitates sense disambiguation in computational tasks by prioritizing common interpretations. [20]

The sense inventory for synsets was compiled by linguists drawing from machine-readable dictionaries, including the Longman Dictionary of C
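
The synset and word-sense organization described in this section can be illustrated with a small, self-contained Python sketch. It is a toy model under stated assumptions, not WordNet's actual file format or any library's API: the class `Synset`, the helpers `senses` and `hypernym_path`, the synset identifiers, and the glosses are all hypothetical names and paraphrases introduced here. Only the dog lemmas (dog, domestic_dog, Canis_familiaris) and the idea of hypernym (is-a) links come from the text above, and the order of the two bank entries is arbitrary rather than WordNet's frequency ranking.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Synset:
    """One lexicalized concept: a set of lemmas, a gloss, and is-a links."""
    name: str                      # hypothetical identifier, not a WordNet key
    lemmas: List[str]              # synonymous word forms sharing this sense
    gloss: str                     # paraphrased definition, for illustration only
    hypernyms: List[str] = field(default_factory=list)  # names of broader synsets


# Toy inventory. The hierarchy is heavily simplified and the sense order of
# "bank" is arbitrary; real WordNet orders senses by estimated frequency.
SYNSETS = {s.name: s for s in [
    Synset("entity", ["entity"], "anything that exists"),
    Synset("animal", ["animal"], "a living creature", ["entity"]),
    Synset("canine", ["canine", "canid"], "a member of the dog family", ["animal"]),
    Synset("dog", ["dog", "domestic_dog", "Canis_familiaris"],
           "a common domesticated canine", ["canine"]),
    Synset("bank_finance", ["bank"], "a financial institution", ["entity"]),
    Synset("bank_slope", ["bank"], "sloping land beside water", ["entity"]),
]}


def senses(word: str) -> List[Synset]:
    """Return every synset that lists `word` as a lemma (one per sense)."""
    return [s for s in SYNSETS.values() if word in s.lemmas]


def hypernym_path(name: str) -> List[str]:
    """Follow the first hypernym link upward until a root synset is reached."""
    path = [name]
    while SYNSETS[path[-1]].hypernyms:
        path.append(SYNSETS[path[-1]].hypernyms[0])
    return path


if __name__ == "__main__":
    # A polysemous word form appears in more than one synset.
    print([s.name for s in senses("bank")])
    # Synonymous lemmas resolve to the same synset.
    print(senses("domestic_dog")[0].lemmas)
    # Hypernymy links form a chain from specific to general concepts.
    print(" -> ".join(hypernym_path("dog")))
```

Keeping lemmas inside synsets, rather than keying concepts by word form, is what lets one word form map to several concepts and several word forms map to one concept in this sketch.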
