analyzeText module

analyzeText.analyze_single_sentence(sentence)

Analyze a single sentence and return a one-dimensional matrix containing the scores for all attributes of that sentence.

Parameters

sentence – the single sentence to analyze

Returns

a score matrix 1 x m, where m is the number of attributes

analyzeText.calculate_nominal_form_score(sentence)

Calculate the Nominal Forms (NF) score. This combines the noun-to-verb ratio with the number of nominal forms. Nominal forms include gerunds, nominalized words, and nouns. Nominalized words are words ending in -ing, -ity, -ness, and similar suffixes.

Parameters

sentence – the single sentence to analyze

Returns

a value that represents the NF score of the sentence
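
The suffix rule described above can be illustrated with a minimal sketch. It is not the module's implementation: the function name, the suffix tuple, and the minimum-length guard are assumptions made for illustration. The noun-to-verb ratio is left out because it would need a part-of-speech tagger.

```python
import string

# Assumed suffix list; the documentation names these three "and similar".
NOMINAL_SUFFIXES = ("ing", "ity", "ness")


def count_nominalized_words(sentence, suffixes=NOMINAL_SUFFIXES):
    """Count tokens that end in one of the given suffixes (hypothetical helper)."""
    tokens = [w.strip(string.punctuation).lower() for w in sentence.split()]
    # The length guard (an assumption) skips short words such as "king" or "city".
    return sum(1 for t in tokens if len(t) > 4 and t.endswith(suffixes))


print(count_nominalized_words("The clarity of writing shows kindness."))  # 3
```

A pure suffix match over-counts words such as "thing" or "bring", so a real scorer would likely combine it with part-of-speech information.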

analyzeText.calculate_sentence_length_score(sentence)

Calculate the Sentence Length (SL). This is defined by the number of words in a sentence.

Parameters

sentence – the single sentence to analyze

Returns

a value that represents the SL score of the sentence
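
As a rough illustration of the word count behind SL, the sketch below splits on whitespace. The function name is hypothetical, and the module may tokenize differently or map the count to a score afterwards.

```python
def sentence_length(sentence):
    """Count whitespace-separated words (hypothetical helper, not the module's code)."""
    return len(sentence.split())


print(sentence_length("The quick brown fox jumps."))  # 5
```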

analyzeText.calculate_sentence_structure_score(sentence)

Calculate the Sentence Structure (SS) score. Complexity is measured by the branching in the sentence tree, and it increases when the sentence is interrupted by sub-sentences or parentheses.

Parameters

sentence – the single sentence to analyze

Returns

a value that represents the SS score of the sentence
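
One way to picture the idea is to count interrupting nodes in a parse tree. The sketch below assumes a tree given as nested (label, children) tuples and uses the Penn Treebank labels SBAR (subordinate clause) and PRN (parenthetical); the tree format, the label choice, and the function name are assumptions, since the documentation does not state which parser or measure the module uses.

```python
def count_interruptions(tree, labels=("SBAR", "PRN")):
    """Count nodes whose label marks a sub-sentence or parenthetical.

    A tree is either a string (a leaf word) or a (label, children) tuple.
    """
    if isinstance(tree, str):
        return 0
    label, children = tree
    own = 1 if label in labels else 0
    return own + sum(count_interruptions(child, labels) for child in children)


# Hand-built tree for "The dog (which barked) ran"; a parser would normally produce this.
example_tree = (
    "S",
    [
        ("NP", ["The", "dog"]),
        ("PRN", ["(", ("SBAR", ["which", "barked"]), ")"]),
        ("VP", ["ran"]),
    ],
)

print(count_interruptions(example_tree))  # 2
```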

analyzeText.calculate_vocabulary_complexity_score(sentence)

Calculate the Vocabulary Complexity (VC). This is the percentage of terms that are not contained in a list of the 1000 most frequent terms in the English language.

Parameters

sentence – the single sentence to analyze

Returns

a value that represents the VC score of the sentence
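
The percentage described above can be sketched as follows. The word set here is a tiny stand-in for the 1000-term frequency list, and the function name and tokenization are assumptions, not the module's code.

```python
import string

# Tiny stand-in for the 1000 most frequent English terms (assumption for the demo).
FREQUENT_WORDS = {"the", "a", "is", "on", "cat", "sat", "mat"}


def vocabulary_complexity(sentence, frequent_words=FREQUENT_WORDS):
    """Return the percentage of tokens not found in frequent_words."""
    tokens = [w.strip(string.punctuation).lower() for w in sentence.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    rare = sum(1 for t in tokens if t not in frequent_words)
    return 100.0 * rare / len(tokens)


print(vocabulary_complexity("The cat sat on the mat."))  # 0.0
```

With this stand-in list, "The cat sat on the ottoman." contains one rare term out of six, or roughly 16.7 percent.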

analyzeText.calculate_word_length_score(sentence)

Calculate the Word Length (WL). This is the average number of characters in a word.

Parameters

sentence – the single sentence to analyze

Returns

a value that represents the WL score of the sentence
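
A minimal sketch of the average behind WL, assuming punctuation is stripped from each word before counting characters. The function name and that stripping step are assumptions.

```python
import string


def average_word_length(sentence):
    """Return the mean number of characters per word (hypothetical helper)."""
    words = [w.strip(string.punctuation) for w in sentence.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


print(average_word_length("The cat sat."))  # 3.0
```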

analyzeText.map_to_score(value, min_limit, max_limit)

Map a value from the interval [min_limit, max_limit] to the interval [0, 1] and return a score value.

Parameters
  • value – the value to map

  • min_limit – the min limit of the value

  • max_limit – the max limit of the value

Returns

the mapped score value in the interval [0, 1]
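
A linear mapping of this kind might look like the sketch below. Clamping out-of-range values to the limits is an assumption, as is the error raised for an empty range; the documentation does not say how the module handles either case.

```python
def map_to_score_sketch(value, min_limit, max_limit):
    """Linearly map value from [min_limit, max_limit] to [0, 1] (illustrative only)."""
    if max_limit <= min_limit:
        raise ValueError("max_limit must be greater than min_limit")
    # Assumption: values outside the limits are clamped to the nearest limit.
    clamped = min(max(value, min_limit), max_limit)
    return (clamped - min_limit) / (max_limit - min_limit)


print(map_to_score_sketch(15, 10, 20))  # 0.5
```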

analyzeText.replace_punctuation(sentence)

Replace the punctuation in a sentence.

Parameters

sentence – the sentence to process

Returns

the processed sentence without spaces and without punctuation
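
Reading the return description literally, the result has neither punctuation nor spaces. The sketch below follows that reading with str.translate; the function name and the exact character sets (string.punctuation and string.whitespace) are assumptions, and the module may keep word boundaries in some other form.

```python
import string

# Translation table that deletes ASCII punctuation and whitespace characters.
_DROP = str.maketrans("", "", string.punctuation + string.whitespace)


def strip_punctuation_and_spaces(sentence):
    """Remove punctuation and whitespace from a sentence (illustrative only)."""
    return sentence.translate(_DROP)


print(strip_punctuation_and_spaces("Hello, world!"))  # Helloworld
```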

analyzeText.text_analysis(sentences)

Analyze multiple sentences, set an annotation text for each feature score that hits a predefined limit, and return a matrix with scores for all attributes.

Parameters

sentences – all sentences to analyze

Returns

a score matrix n x m, where n is the number of sentences and m is the number of attributes
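
The n x m layout and the annotation step can be sketched as below. The two scorers, the limit of 0.8, and the choice to return annotations as a separate list are all assumptions for illustration; the documentation does not state how the module stores its annotation text.

```python
def length_score(sentence):
    """Toy SL scorer: word count scaled to [0, 1], saturating at 10 words."""
    return min(len(sentence.split()) / 10, 1.0)


def word_length_score(sentence):
    """Toy WL scorer: mean token length scaled to [0, 1], saturating at 10 characters."""
    words = sentence.split()
    if not words:
        return 0.0
    return min(sum(len(w) for w in words) / len(words) / 10, 1.0)


def analyze_sentences(sentences, scorers, limit=0.8):
    """Return (matrix, annotations): one row of scores and one list of flags per sentence."""
    matrix, annotations = [], []
    for sentence in sentences:
        row = [scorer(sentence) for _, scorer in scorers]
        matrix.append(row)
        # Flag every attribute whose score reaches the limit.
        annotations.append([name for (name, _), s in zip(scorers, row) if s >= limit])
    return matrix, annotations


scorers = [("SL", length_score), ("WL", word_length_score)]
sentences = [
    "Short one.",
    "This considerably longer sentence contains ten whole words right here now.",
]
matrix, annotations = analyze_sentences(sentences, scorers)
print(len(matrix), len(matrix[0]))  # 2 2
print(annotations)  # [[], ['SL']]
```

Each row corresponds to one sentence and each column to one attribute, so the result has n rows and m columns.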