Dictionary Use

These methods count the number of occurrences of dictionary words across different speakers and/or segments. The function dictionaryStatistics() calculates statistics for dictionaries with multiple entries; dictionaryStatisticsSingle() does so for a single word list.

as.matrix() extracts the numeric part of a QDDictionaryStatistics table as a matrix.

dictionaryStatistics(drama,
  fields = DramaAnalysis::base_dictionary[fieldnames],
  fieldnames = c("Liebe"), segment = c("Drama", "Act", "Scene"),
  normalizeByCharacter = FALSE, normalizeByField = FALSE,
  byCharacter = TRUE, column = "Token.lemma", ci = TRUE)

dictionaryStatisticsSingle(drama, wordfield = c(), segment = c("Drama",
  "Act", "Scene"), normalizeByCharacter = FALSE,
  normalizeByField = FALSE, byCharacter = TRUE,
  fieldNormalizer = length(wordfield), column = "Token.lemma",
  ci = TRUE, colnames = NULL)

# S3 method for QDDictionaryStatistics
as.matrix(x, ...)

Arguments

drama	A QDDrama object.
fields	A list of lists that contains the words of each field. By default, the entries of `base_dictionary` selected by `fieldnames` are used.
fieldnames	A list of names for the dictionaries.
segment	The segment level that should be used. By default, the entire play will be used. Possible values are "Drama" (default), "Act" or "Scene".
normalizeByCharacter	Logical. Whether to normalize by character speech length.
normalizeByField	Logical. Whether to normalize by dictionary size. You usually want this.
byCharacter	Logical, defaults to TRUE. If false, values will be calculated for the entire segment (play, act, or scene), and not for individual characters.
column	The table column the dictionary is applied to. Should be either "Token.surface" or "Token.lemma"; the latter is the default.
ci	Whether to ignore case. Defaults to TRUE, i.e., case is ignored.
wordfield	A character vector containing the words or lemmas to be counted (only for `*Single` functions).
fieldNormalizer	Defaults to the length of the wordfield. If normalizeByField is TRUE, the absolute numbers are divided by this number.
colnames	The column names to be used in the output table.
x	An object of the type `QDDictionaryStatistics`, e.g., the output of `dictionaryStatistics`.
...	All other parameters are passed to `as.matrix.data.frame()`.
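
The interplay of `ci`, `byCharacter`, `normalizeByField` and `normalizeByCharacter` can be illustrated with a small base R sketch. It does not call the package and is not its implementation: the `tokens` table and the `wordfield` values are made up for illustration, and it assumes that "speech length" means the number of tokens a character utters.

```r
# Toy token table (hypothetical data): one row per token.
tokens <- data.frame(
  speaker = c("A", "A", "A", "B", "B"),
  lemma   = c("Liebe", "Herz", "Haus", "liebe", "Krieg"),
  stringsAsFactors = FALSE
)
wordfield <- c("liebe", "herz")  # hypothetical word list

# ci = TRUE: compare lemmas and dictionary words in lower case.
hit <- tolower(tokens$lemma) %in% tolower(wordfield)

# byCharacter = TRUE: one absolute count per speaker.
counts <- tapply(hit, tokens$speaker, sum)

# normalizeByField = TRUE: divide by fieldNormalizer,
# which defaults to length(wordfield).
by_field <- counts / length(wordfield)

# normalizeByCharacter = TRUE: divide by each speaker's token count
# (assumption: speech length is measured in tokens).
by_char <- counts / tapply(hit, tokens$speaker, length)
```

Normalizing by field makes dictionaries of different sizes comparable, while normalizing by character keeps speakers with many lines from dominating the counts.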

Value

A numeric matrix that contains the frequency with which a dictionary is present in a subset of tokens.

Examples

# Check multiple dictionary entries
data(rksp.0)
dstat <- dictionaryStatistics(rksp.0, fieldnames=c("Krieg","Familie"))
# Check a single dictionary entry
fstat <- dictionaryStatisticsSingle(rksp.0, wordfield=c("der"))
# Extract the numeric part of the statistics table as a matrix
mat <- as.matrix(dictionaryStatistics(rksp.0, fieldnames=c("Krieg","Familie")))
