Semantic similarity
From Wikipedia, the free encyclopedia
Semantic similarity is a concept whereby a set of documents, or terms within term lists, are assigned a metric based on the likeness of their meaning or semantic content.
According to some authors, semantic similarity differs from semantic relatedness because relatedness also covers relations such as antonymy and meronymy, while similarity does not [1]. However, much of the literature uses these terms interchangeably, along with terms like semantic distance. In essence, semantic similarity, semantic distance, and semantic relatedness all ask, "How much does term A have to do with term B?"
The answer to this question, as given by the many automatic measures of semantic similarity or relatedness, is usually a number, typically between -1 and 1 or between 0 and 1, where 1 signifies extremely high similarity or relatedness and 0 signifies little to none.
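One familiar way to obtain a score in the range -1 to 1 is the cosine of the angle between two term vectors. The sketch below is illustrative only: the source does not name a specific measure, and the function name `cosine_similarity` and the three example vectors are invented for this example.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term vectors, invented for illustration only.
cat = [2.0, 1.0, 0.0]
kitten = [4.0, 2.0, 0.0]  # points in the same direction as `cat`
car = [0.0, 0.0, 3.0]     # orthogonal to `cat`

print(cosine_similarity(cat, kitten))  # close to 1: very similar
print(cosine_similarity(cat, car))     # 0: little to no similarity
```

When all vector components are non-negative, as with raw counts, the score stays between 0 and 1; negative values can only appear when components may be negative.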
An intuitive way of displaying terms according to their semantic similarity is to group closely related terms together and space more distantly related ones farther apart. This is common, if sometimes subconscious, practice in mind maps and concept maps.
Concretely, this can be achieved, for instance, by defining a topological similarity, using ontologies to define a distance between words, or by using statistical means to correlate words and textual contexts from a suitable text corpus (co-occurrence). For the ontology-based approach, a naive metric for terms arranged as nodes in a directed acyclic graph, such as a hierarchy, would be the minimal distance (in separating edges) between the two term nodes.
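The naive edge-counting metric can be sketched with a breadth-first search. This is a minimal illustration under stated assumptions, not a reference implementation: the toy hierarchy, the names `HIERARCHY`, `build_undirected` and `edge_distance`, and the decision to ignore edge direction when counting separating edges are all choices made for this example.

```python
from collections import deque

# Hypothetical is-a hierarchy (child -> list of parents), invented for illustration.
HIERARCHY = {
    "animal": [],
    "mammal": ["animal"],
    "bird": ["animal"],
    "dog": ["mammal"],
    "cat": ["mammal"],
    "sparrow": ["bird"],
}


def build_undirected(hierarchy):
    """Turn the child -> parents mapping into an undirected adjacency map."""
    neighbours = {node: set() for node in hierarchy}
    for child, parents in hierarchy.items():
        for parent in parents:
            neighbours.setdefault(parent, set()).add(child)
            neighbours[child].add(parent)
    return neighbours


def edge_distance(hierarchy, a, b):
    """Return the minimal number of edges separating a and b, or None if unconnected."""
    neighbours = build_undirected(hierarchy)
    if a not in neighbours or b not in neighbours:
        raise KeyError("unknown term")
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


print(edge_distance(HIERARCHY, "dog", "cat"))      # siblings under "mammal"
print(edge_distance(HIERARCHY, "dog", "sparrow"))  # joined only via "animal"
```

A smaller distance means more closely related terms. Turning the distance into a similarity score between 0 and 1 requires a further choice, such as a decreasing function of the distance, which this sketch leaves open.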
References
- [1] Evgeniy Gabrilovich and Shaul Markovitch (2007). "Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis" (PDF). Retrieved on 2007-09-18.
External links
- List of related literature
- WordNet::Similarity (using WordNet as an ontology)

