- Term properties
  - rank [Integer]
  - isSingleWord, isSwt [Boolean]
  - documentFrequency, dFreq [Integer]
  - frequencyNorm, fNorm [Double]
  - generalFrequencyNorm, gfNorm [Double]
  - specificity, spec [Double]
  - frequency, freq [Integer]
  - OrthographicScore, ortho [Double]
  - IndependantFrequency, iFreq [Integer]
  - Independance, ind [Double]
  - pilot [String]
  - lemma [String]
  - tf-idf, tfIdf [Double]
  - spec-idf, specIdf [Double]
  - groupingKey, key [String]
  - pattern [String]
  - spottingRule, rule [String]
  - isFixedExpression, isFixedExp [Boolean]
  - SwtSize, swtSize [Integer]
  - Filtered, isFiltered [Boolean]
  - Depth, depth [Integer]
- Relation properties
  - VariationRank, vRank [Integer]
  - VariationRule, vRules [Set]
  - DerivationType, derivType [String]
  - GraphSimilarity, graphSim [Double]
  - Score, vScore [Double]
  - AffixGain, affGain [Double]
  - AffixSpec, affSpec [Double]
  - AffixRatio, affRatio [Double]
  - AffixScore, affScore [Double]
  - NormalizedAffixScore, nAffScore [Double]
  - AffixOrthographicScore, affOrtho [Double]
  - ExtensionScore, extScore [Double]
  - NormalizedExtensionScore, nExtScore [Double]
  - HasExtensionAffix, hasExtAffix [Boolean]
  - IsExtension, isExt [Boolean]
  - VariantBagFrequency, vBagFreq [Integer]
  - SourceGain, srcGain [Double]
  - NormalizedSourceGain, nSrcGain [Double]
  - IsInfered, isInfered [Boolean]
  - IsGraphical, isGraph [Boolean]
  - IsDerivation, isDeriv [Boolean]
  - IsPrefixation, isPref [Boolean]
  - IsSyntagmatic, isSyntag [Boolean]
  - IsMorphological, isMorph [Boolean]
  - IsSemantic, isSem [Boolean]
  - Distributional, isDistrib [Boolean]
  - SemanticSimilarity, semSim [Double]
  - Dico, isDico [Boolean]
  - SemanticScore, semScore [Double]
Term properties
rank
[Integer]
The rank of the term, assigned by the TermSuite post-processor engine.
isSingleWord
, isSwt
[Boolean]
Whether this term is a single-word term.
documentFrequency
, dFreq
[Integer]
The number of documents in the corpus in which the term occurs.
frequencyNorm
, fNorm
[Double]
The number of occurrences of the term in the corpus per 1000 words.
generalFrequencyNorm
, gfNorm
[Double]
The number of occurrences of the term in the general language corpus per 1000 words.
specificity
, spec
[Double]
The weirdness ratio, i.e. the specificity of the term in the corpus in comparison to general language.
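As a rough illustration of how frequencyNorm, generalFrequencyNorm and specificity relate, the sketch below assumes that specificity is the ratio of the two normalized frequencies, which is the usual definition of a weirdness ratio. The function names and the corpus sizes are hypothetical and are not part of the TermSuite API.

```python
def frequency_norm(occurrences: int, corpus_size_in_words: int) -> float:
    """Number of occurrences of a term per 1000 words of corpus."""
    return 1000.0 * occurrences / corpus_size_in_words


def weirdness_ratio(f_norm: float, gf_norm: float) -> float:
    """Assumed specificity: normalized frequency in the domain corpus
    divided by normalized frequency in the general language corpus."""
    return f_norm / gf_norm


# Hypothetical counts: 50 occurrences in a 100,000-word domain corpus,
# 20 occurrences in a 1,000,000-word general language corpus.
f_norm = frequency_norm(50, 100_000)
gf_norm = frequency_norm(20, 1_000_000)
spec = weirdness_ratio(f_norm, gf_norm)  # roughly 25: far more frequent in the domain corpus
```

A value well above 1 indicates a term that is much more frequent in the domain corpus than in general language.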
frequency
, freq
[Integer]
The number of occurrences of the term in the corpus.
OrthographicScore
, ortho
[Double]
The probability that the covered text of the term is an actual term, assigned by the TermSuite post-processor engine.
IndependantFrequency
, iFreq
[Integer]
The number of times a term occurs in the corpus as it is, i.e. not as any of its variant forms, assigned by the TermSuite post-processor engine.
Independance
, ind
[Double]
IndependantFrequency divided by frequency, assigned by the TermSuite post-processor engine.
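The ratio described above can be sketched in a few lines. The function name and the counts are hypothetical; only the division itself comes from the definition.

```python
def independence(independent_frequency: int, frequency: int) -> float:
    """Share of a term's occurrences in which it appears as itself,
    not as one of its variant forms."""
    return independent_frequency / frequency


# Hypothetical term: 40 occurrences in total, 30 of them in its own form.
ind = independence(30, 40)
```

A value close to 1 means the term mostly occurs in its own form; a value close to 0 means most of its occurrences come from variants.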
pilot
[String]
The most frequent form of the term.
lemma
[String]
The concatenation of the term’s word lemmas.
tf-idf
, tfIdf
[Double]
frequency divided by DOCUMENT_FREQUENCY.
spec-idf
, specIdf
[Double]
specificity divided by DOCUMENT_FREQUENCY.
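The tf-idf and spec-idf properties, as defined here, are plain divisions by the document frequency (no logarithm is mentioned). A minimal sketch with hypothetical function names and values:

```python
def tf_idf(frequency: int, document_frequency: int) -> float:
    """frequency divided by DOCUMENT_FREQUENCY, as defined above."""
    return frequency / document_frequency


def spec_idf(specificity: float, document_frequency: int) -> float:
    """specificity divided by DOCUMENT_FREQUENCY, as defined above."""
    return specificity / document_frequency


# Hypothetical term: 120 occurrences spread over 8 documents, specificity 25.0.
term_tf_idf = tf_idf(120, 8)
term_spec_idf = spec_idf(25.0, 8)
```

Both values increase when a term is concentrated in few documents.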
groupingKey
, key
[String]
The unique id of the term, built on its pattern and its lemma.
pattern
[String]
The pattern of the term, i.e. the concatenation of syntactic labels of its words.
spottingRule
, rule
[String]
The name of the UIMA Tokens Regex spotting rule that found the term in the corpus.
isFixedExpression
, isFixedExp
[Boolean]
Whether the term is a fixed expression.
SwtSize
, swtSize
[Integer]
The number of words composing the term that are single-words.
Filtered
, isFiltered
[Boolean]
Whether the term has been marked as filtered by the TermSuite post-processor engine. Such a term is usually not meant to be displayed.
Depth
, depth
[Integer]
The minimum level of extensions of the term starting from a single-word term.
Relation properties
VariationRank
, vRank
[Integer]
The rank of the variation among all variations starting from the same source term, when the relation is a variation.
VariationRule
, vRules
[Set]
The set of YAML variation rules that detected this pair of terms as a term variation, when the relation is a variation.
DerivationType
, derivType
[String]
The derivation type of the variation, when the relation is a variation.
GraphSimilarity
, graphSim
[Double]
The edit distance between the two terms of the relation.
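For readers unfamiliar with edit distance, the sketch below computes the classic Levenshtein distance between two strings. This is an assumption made for illustration: the exact measure TermSuite uses, and any normalization applied to obtain a [Double], are not specified here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


distance = edit_distance("colour", "color")
```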
Score
, vScore
[Double]
The global variation score of the relation, assigned by the TermSuite post-processor engine, when the relation is a variation.
AffixGain
, affGain
[Double]
When the relation is a variation of type “extension”, the FREQUENCY of the variant divided by the FREQUENCY of the affix term.
AffixSpec
, affSpec
[Double]
When the relation is a variation of type “extension”, the SPECIFICITY of the affix term.
AffixRatio
, affRatio
[Double]
When the relation is a variation of type “extension”, the FREQUENCY of the affix term divided by the FREQUENCY of the base term.
AffixScore
, affScore
[Double]
When the relation is a variation of type “extension”, the weighted average of AFFIX_GAIN and AFFIX_RATIO.
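The three affix properties above can be sketched together. The weights of the weighted average are not given in this document, so the `gain_weight` parameter and its default of 0.5 are assumptions, as are the function names and the frequencies.

```python
def affix_gain(variant_frequency: int, affix_frequency: int) -> float:
    """FREQUENCY of the variant divided by FREQUENCY of the affix term."""
    return variant_frequency / affix_frequency


def affix_ratio(affix_frequency: int, base_frequency: int) -> float:
    """FREQUENCY of the affix term divided by FREQUENCY of the base term."""
    return affix_frequency / base_frequency


def affix_score(gain: float, ratio: float, gain_weight: float = 0.5) -> float:
    """Weighted average of affix gain and affix ratio.
    gain_weight is a hypothetical parameter; the real weights are not documented here."""
    return gain_weight * gain + (1.0 - gain_weight) * ratio


# Hypothetical frequencies: variant 10, affix term 40, base term 80.
gain = affix_gain(10, 40)
ratio = affix_ratio(40, 80)
score = affix_score(gain, ratio)
```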
NormalizedAffixScore
, nAffScore
[Double]
When the relation is a variation of type “extension”, the min-max normalization of AffixScore.
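Min-max normalization rescales a set of scores to the interval [0, 1]. The sketch below shows the standard formula; which set of relations the minimum and maximum are taken over is not stated here, so the input list is an assumption.

```python
def min_max_normalize(values: list[float]) -> list[float]:
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are equal: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical affix scores of three extension relations.
normalized = min_max_normalize([2.0, 4.0, 6.0])
```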
AffixOrthographicScore
, affOrtho
[Double]
When the relation is a variation of type “extension”, the orthographic score of the extension affix term.
ExtensionScore
, extScore
[Double]
When the relation is a variation of type “extension”, the score of the extension affix term (combines AffixGain and AffixGain).
NormalizedExtensionScore
, nExtScore
[Double]
When the relation is a variation of type “extension”, the min-max normalization of ExtensionScore.
HasExtensionAffix
, hasExtAffix
[Boolean]
When the relation is a variation of type “extension”, whether there is an affix term.
IsExtension
, isExt
[Boolean]
Whether this relation is an extension.
VariantBagFrequency
, vBagFreq
[Integer]
When the relation is a variation, the total number of occurrences of the variant term and of the variant’s own variant terms (order-2 variants).
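The "bag" frequency described above can be sketched as a sum over the variant and its direct variants. The dictionaries, the term strings and the function name are hypothetical and only illustrate the arithmetic.

```python
def variant_bag_frequency(variant: str,
                          frequencies: dict[str, int],
                          variants_of: dict[str, list[str]]) -> int:
    """Occurrences of the variant plus occurrences of the variant's own variants."""
    order_2_variants = variants_of.get(variant, [])
    return frequencies[variant] + sum(frequencies[v] for v in order_2_variants)


# Hypothetical data for the relation "wind turbine" -> "offshore wind turbine".
frequencies = {
    "wind turbine": 40,
    "offshore wind turbine": 12,
    "floating offshore wind turbine": 3,
}
variants_of = {"offshore wind turbine": ["floating offshore wind turbine"]}

bag = variant_bag_frequency("offshore wind turbine", frequencies, variants_of)
```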
SourceGain
, srcGain
[Double]
When the relation is a variation, the log10 of VariantBagFrequency divided by the FREQUENCY of the base term.
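A minimal sketch of this computation, assuming the logarithm is applied to the whole ratio (the wording could also be read as dividing the logarithm by the frequency). The function name and the counts are hypothetical.

```python
import math


def source_gain(variant_bag_frequency: int, base_frequency: int) -> float:
    """log10 of (VariantBagFrequency / FREQUENCY of the base term).
    Assumes the log is taken over the ratio, not over the numerator alone."""
    return math.log10(variant_bag_frequency / base_frequency)


# Hypothetical counts: a variant bag of 500 occurrences for a base term seen 5 times.
gain = source_gain(500, 5)  # about 2: the bag is two orders of magnitude more frequent
```

Under this reading, the gain is negative whenever the variant bag is less frequent than the base term.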
NormalizedSourceGain
, nSrcGain
[Double]
When the relation is a variation of type “extension”, the linear normalization of SourceGain.
IsInfered
, isInfered
[Boolean]
When the relation is a variation, whether it has been inferred from two other base variations.
IsGraphical
, isGraph
[Boolean]
When the relation is a variation, whether there is a graphical similarity between the two terms.
IsDerivation
, isDeriv
[Boolean]
When the relation is a variation, whether one term is a derivation of the other.
IsPrefixation
, isPref
[Boolean]
When the relation is a variation, whether one term is the prefix of the other.
IsSyntagmatic
, isSyntag
[Boolean]
When the relation is a variation, whether it is a syntagmatic variation.
IsMorphological
, isMorph
[Boolean]
When the relation is a variation, whether the variation involves morphosyntactic variations.
IsSemantic
, isSem
[Boolean]
When the relation is a variation, whether there is a semantic similarity between the two terms.
Distributional
, isDistrib
[Boolean]
When the relation is a semantic relation, whether the relation is of type “distributional”, i.e. the variation has been found by context vector alignment.
SemanticSimilarity
, semSim
[Double]
When the relation is a semantic variation found by alignment, the similarity of the two context vectors of the two terms of the relation.
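The similarity measure applied to the context vectors is not named here. Cosine similarity is one common choice, and the sketch below uses it purely as an assumption; the sparse-dictionary vector representation and the co-occurring words are hypothetical too.

```python
import math


def cosine_similarity(vector_a: dict[str, float], vector_b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse context vectors
    (co-occurring word -> association weight)."""
    dot_product = sum(weight * vector_b.get(word, 0.0)
                      for word, weight in vector_a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in vector_a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in vector_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot_product / (norm_a * norm_b)


# Hypothetical context vectors of two candidate synonyms.
context_a = {"blade": 2.0, "rotor": 1.0, "offshore": 1.0}
context_b = {"blade": 1.0, "rotor": 2.0, "tower": 1.0}
similarity = cosine_similarity(context_a, context_b)
```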
Dico
, isDico
[Boolean]
When the relation is a semantic relation, whether the relation is of type “dictionary”, i.e. the variation has been found with a synonym dictionary.
SemanticScore
, semScore
[Double]
When the relation is a semantic variation, the relevance score of the variation. This property is set for all types of semantic variations, both dictionary-based and distributional.