- Term properties
  - rank [Integer]
  - isSingleWord, isSwt [Boolean]
  - documentFrequency, dFreq [Integer]
  - frequencyNorm, fNorm [Double]
  - generalFrequencyNorm, gfNorm [Double]
  - specificity, spec [Double]
  - frequency, freq [Integer]
  - OrthographicScore, ortho [Double]
  - IndependantFrequency, iFreq [Integer]
  - Independance, ind [Double]
  - pilot [String]
  - lemma [String]
  - tf-idf, tfIdf [Double]
  - spec-idf, specIdf [Double]
  - groupingKey, key [String]
  - pattern [String]
  - spottingRule, rule [String]
  - isFixedExpression, isFixedExp [Boolean]
  - SwtSize, swtSize [Integer]
  - Filtered, isFiltered [Boolean]
  - Depth, depth [Integer]
- Relation properties
  - VariationRank, vRank [Integer]
  - VariationRule, vRules [Set]
  - DerivationType, derivType [String]
  - GraphSimilarity, graphSim [Double]
  - Score, vScore [Double]
  - AffixGain, affGain [Double]
  - AffixSpec, affSpec [Double]
  - AffixRatio, affRatio [Double]
  - AffixScore, affScore [Double]
  - NormalizedAffixScore, nAffScore [Double]
  - AffixOrthographicScore, affOrtho [Double]
  - ExtensionScore, extScore [Double]
  - NormalizedExtensionScore, nExtScore [Double]
  - HasExtensionAffix, hasExtAffix [Boolean]
  - IsExtension, isExt [Boolean]
  - VariantBagFrequency, vBagFreq [Integer]
  - SourceGain, srcGain [Double]
  - NormalizedSourceGain, nSrcGain [Double]
  - IsInfered, isInfered [Boolean]
  - IsGraphical, isGraph [Boolean]
  - IsDerivation, isDeriv [Boolean]
  - IsPrefixation, isPref [Boolean]
  - IsSyntagmatic, isSyntag [Boolean]
  - IsMorphological, isMorph [Boolean]
  - IsSemantic, isSem [Boolean]
  - Distributional, isDistrib [Boolean]
  - SemanticSimilarity, semSim [Double]
  - Dico, isDico [Boolean]
  - SemanticScore, semScore [Double]
Term properties
rank [Integer]
The rank of the term, assigned by the TermSuite post-processor engine.
isSingleWord, isSwt [Boolean]
Whether this term is a single-word term.
documentFrequency, dFreq [Integer]
The number of documents in the corpus in which the term occurs.
frequencyNorm, fNorm [Double]
The number of occurrences of the term in the corpus per 1000 words.
generalFrequencyNorm, gfNorm [Double]
The number of occurrences of the term in the general language corpus per 1000 words.
specificity, spec [Double]
The weirdness ratio, i.e. the specificity of the term in the corpus in comparison to general language.
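As an illustration of how frequencyNorm, generalFrequencyNorm and specificity relate, here is a minimal sketch. It assumes specificity is the plain ratio of the two normalized frequencies, which is the usual definition of a weirdness ratio; TermSuite's actual computation may differ (for example by smoothing). The function names and the numbers are hypothetical.

```python
def frequency_norm(occurrences: int, corpus_size_in_words: int) -> float:
    """Number of occurrences per 1000 words of the given corpus."""
    return 1000.0 * occurrences / corpus_size_in_words


def weirdness_ratio(f_norm: float, gf_norm: float) -> float:
    """Assumed definition: domain-corpus rate divided by general-language rate."""
    return f_norm / gf_norm


# Made-up counts: 50 occurrences in a 100,000-word domain corpus,
# 20 occurrences in a 1,000,000-word general-language corpus.
f_norm = frequency_norm(50, 100_000)
gf_norm = frequency_norm(20, 1_000_000)
spec = weirdness_ratio(f_norm, gf_norm)
```

With these counts the term is about 25 times more frequent in the domain corpus than in general language.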
frequency, freq [Integer]
The number of occurrences of the term in the corpus.
OrthographicScore, ortho [Double]
The probability that the covered text of the term is an actual term, assigned by the TermSuite post-processor engine.
IndependantFrequency, iFreq [Integer]
The number of times a term occurs in the corpus as it is, i.e. not as any of its variant forms, assigned by the TermSuite post-processor engine.
Independance, ind [Double]
The IndependantFrequency divided by frequency, assigned by the TermSuite post-processor engine.
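The ratio described for Independance can be sketched as follows. The function name and the counts are hypothetical; only the division itself comes from the definition above.

```python
def independence(independent_frequency: int, frequency: int) -> float:
    """IndependantFrequency divided by frequency, as defined above."""
    return independent_frequency / frequency


# Made-up counts: 120 occurrences in total, 30 of them not as a variant form.
ind = independence(30, 120)  # 0.25
```

A value close to 1 means the term mostly occurs on its own rather than as one of its variant forms.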
pilot [String]
The most frequent form of the term.
lemma [String]
The concatenation of the term’s word lemmas.
tf-idf, tfIdf [Double]
frequency divided by DOCUMENT_FREQUENCY.
spec-idf, specIdf [Double]
specificity divided by DOCUMENT_FREQUENCY.
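The two document-frequency-weighted scores can be sketched by following the definitions above literally. Note that this is a plain division by the document frequency, not the textbook tf-idf with a logarithmic inverse document frequency. The function names and input values are hypothetical.

```python
def tf_idf(frequency: int, document_frequency: int) -> float:
    """frequency divided by the document frequency, as defined above."""
    return frequency / document_frequency


def spec_idf(specificity: float, document_frequency: int) -> float:
    """specificity divided by the document frequency, as defined above."""
    return specificity / document_frequency


# Made-up values: 120 occurrences spread over 8 documents, specificity 25.0.
tfidf = tf_idf(120, 8)        # 15.0
specidf = spec_idf(25.0, 8)   # 3.125
```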
groupingKey, key [String]
The unique id of the term, built on its pattern and its lemma.
pattern [String]
The pattern of the term, i.e. the concatenation of syntactic labels of its words.
spottingRule, rule [String]
The name of the UIMA Tokens Regex spotting rule that found the term in the corpus.
isFixedExpression, isFixedExp [Boolean]
Whether the term is a fixed expression.
SwtSize, swtSize [Integer]
The number of words composing the term that are single-words.
Filtered, isFiltered [Boolean]
Whether the term has been marked as filtered by the TermSuite post-processor engine. Usually, such a term is not meant to be displayed.
Depth, depth [Integer]
The minimum level of extensions of the term starting from a single-word term.
Relation properties
VariationRank, vRank [Integer]
The rank of the variation among all variations starting from the same source term, when the relation is a variation.
VariationRule, vRules [Set]
The set of YAML variation rules that detected this pair of terms as a term variation, when the relation is a variation.
DerivationType, derivType [String]
The derivation type of the variation, when the relation is a variation.
GraphSimilarity, graphSim [Double]
The edit distance between the two terms of the relation.
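For readers unfamiliar with edit distance, here is a minimal sketch of the classic Levenshtein distance with unit costs. This is an assumption for illustration: the text above does not say which edit distance TermSuite uses, nor whether the value is normalized into a similarity.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


distance = levenshtein("color", "colour")  # 1: a single inserted "u"
```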
Score, vScore [Double]
The global variation score of the relation assigned by the TermSuite post-processor engine, when the relation is a variation.
AffixGain, affGain [Double]
When the relation is a variation of type “extension”, the FREQUENCY of the variant divided by the FREQUENCY of the affix term.
AffixSpec, affSpec [Double]
When the relation is a variation of type “extension”, the SPECIFICITY of the affix term.
AffixRatio, affRatio [Double]
When the relation is a variation of type “extension”, the FREQUENCY of the affix term divided by the FREQUENCY of the base term.
AffixScore, affScore [Double]
When the relation is a variation of type “extension”, the weighted average of AFFIX_GAIN and AFFIX_RATIO.
NormalizedAffixScore, nAffScore [Double]
When the relation is a variation of type “extension”, the min-max normalization of AffixScore.
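The affix properties above can be chained together as in the sketch below. The weights of the weighted average are not given in this reference, so `gain_weight` is a hypothetical parameter, and normalizing over a plain list of scores is likewise an assumption about which population the min-max normalization covers.

```python
from typing import List


def affix_score(variant_freq: int, affix_freq: int, base_freq: int,
                gain_weight: float = 0.5) -> float:
    """Weighted average of AffixGain and AffixRatio (weight is hypothetical)."""
    affix_gain = variant_freq / affix_freq    # AffixGain
    affix_ratio = affix_freq / base_freq      # AffixRatio
    return gain_weight * affix_gain + (1.0 - gain_weight) * affix_ratio


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are equal: no spread to normalize over.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Made-up frequencies (variant, affix term, base term) for three relations.
scores = [
    affix_score(10, 40, 100),
    affix_score(30, 60, 100),
    affix_score(5, 50, 200),
]
normalized = min_max_normalize(scores)
```

After normalization the best-scoring relation gets 1.0 and the worst gets 0.0, which makes scores comparable across relations.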
AffixOrthographicScore, affOrtho [Double]
When the relation is a variation of type “extension”, the orthographic score of the extension affix term.
ExtensionScore, extScore [Double]
When the relation is a variation of type “extension”, the score of the extension affix term (combines AffixGain and AffixGain).
NormalizedExtensionScore, nExtScore [Double]
When the relation is a variation of type “extension”, the min-max normalization of ExtensionScore.
HasExtensionAffix, hasExtAffix [Boolean]
When the relation is a variation of type “extension”, whether there is an affix term.
IsExtension, isExt [Boolean]
Whether this relation is an extension.
VariantBagFrequency, vBagFreq [Integer]
When the relation is a variation, the total number of occurrences of the variant term and of the variant’s own variant terms (order-2 variants).
SourceGain, srcGain [Double]
When the relation is a variation, the log10 of VariantBagFrequency divided by the FREQUENCY of the base term.
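A minimal sketch of how VariantBagFrequency and SourceGain fit together, following the two definitions above. The function names and the counts are hypothetical.

```python
import math
from typing import Iterable


def variant_bag_frequency(variant_freq: int,
                          order2_variant_freqs: Iterable[int]) -> int:
    """Occurrences of the variant plus those of the variant's own variants."""
    return variant_freq + sum(order2_variant_freqs)


def source_gain(bag_frequency: int, base_freq: int) -> float:
    """log10 of the variant bag frequency divided by the base term frequency."""
    return math.log10(bag_frequency / base_freq)


# Made-up counts: the variant occurs 30 times, its two own variants 15 and 5
# times, and the base term 500 times.
bag = variant_bag_frequency(30, [15, 5])
gain = source_gain(bag, 500)
```

Here the bag frequency is 50, one tenth of the base term frequency, so the gain is about -1. A gain of 0 would mean the variant bag is exactly as frequent as the base term.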
NormalizedSourceGain, nSrcGain [Double]
When the relation is a variation of type “extension”, the linear normalization of SourceGain.
IsInfered, isInfered [Boolean]
When the relation is a variation, whether it has been inferred from two other base variations.
IsGraphical, isGraph [Boolean]
When the relation is a variation, whether there is a graphical similarity between the two terms.
IsDerivation, isDeriv [Boolean]
When the relation is a variation, whether one term is the derivation of the other.
IsPrefixation, isPref [Boolean]
When the relation is a variation, whether one term is the prefix of the other.
IsSyntagmatic, isSyntag [Boolean]
When the relation is a variation, whether it is a syntagmatic variation.
IsMorphological, isMorph [Boolean]
When the relation is a variation, whether the variation implies morphosyntactic variations.
IsSemantic, isSem [Boolean]
When the relation is a variation, whether there is a semantic similarity between the two terms.
Distributional, isDistrib [Boolean]
When the relation is a semantic relation, whether the relation is of type “distributional”, i.e. the variation has been found by context vector alignment.
SemanticSimilarity, semSim [Double]
When the relation is a semantic variation found by alignment, the similarity of the two context vectors of the two terms of the relation.
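To make the idea of comparing two context vectors concrete, here is a sketch using cosine similarity over sparse vectors. This is an assumption: the text above does not name the similarity measure TermSuite applies, and cosine is only one common choice. The vectors, their weights and the function name are hypothetical.

```python
import math
from typing import Dict


def cosine_similarity(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors keyed by co-occurring word."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        # An empty context vector shares nothing with any other vector.
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up context vectors: co-occurring words with association weights.
context_a = {"blade": 2.0, "rotor": 1.0, "power": 3.0}
context_b = {"blade": 1.0, "power": 2.0, "grid": 1.0}
similarity = cosine_similarity(context_a, context_b)
```

Two terms that occur in similar contexts get a value close to 1, while terms sharing no context words get 0.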
Dico, isDico [Boolean]
When the relation is a semantic relation, whether the relation is of type “dictionary”, i.e. the variation has been found with a synonym dictionary.
SemanticScore, semScore [Double]
When the relation is a semantic variation, the relevance score of the variation. This property is set for all types of semantic variations, both dictionary-based and distributional.
