This function filters the characters in any table that contains a character column and is of class QDHasCharacter.

filterCharacters(hasCharacter, drama, by = c("rank", "tokens", "name"),
  n = ifelse(by == "tokens", 500, ifelse(by == "rank", 10, c())))

Arguments

hasCharacter

The object we want to filter.

drama

The QDDrama object.

by

Character vector. Specifies the filter mechanism: one of "rank", "tokens", or "name".

n

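For by = "rank" or by = "tokens", the numeric threshold (the usage above defaults to 10 and 500, respectively). For by = "name", a vector of character names/ids to keep.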

Value

The filtered QDHasCharacter object.

Details

The function supports three filter mechanisms. With by = "rank", the characters are sorted by the number of tokens they speak and the top n characters are kept. With by = "tokens", all characters that speak n or more tokens are kept. With by = "name", the characters whose names are passed as a vector in n are kept.
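
The three mechanisms can be illustrated on a plain data frame in base R. This is only a sketch of the selection logic described above, not the package's implementation: the helper keep_characters() and the token counts are hypothetical and made up for illustration.

```r
# Toy per-character token counts (invented numbers, not taken from rjmw.0).
tokens <- data.frame(
  character = c("sara", "marwood", "mellefont", "norton", "betty"),
  tokens    = c(5000, 4000, 6000, 900, 300),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the names of the characters to keep.
keep_characters <- function(tokens, by = c("rank", "tokens", "name"), n) {
  by <- match.arg(by)
  switch(by,
    # Sort by token count (descending) and keep the top n characters.
    rank   = head(tokens$character[order(tokens$tokens, decreasing = TRUE)], n),
    # Keep every character that speaks n or more tokens.
    tokens = tokens$character[tokens$tokens >= n],
    # Keep the characters whose names appear in the vector n.
    name   = intersect(tokens$character, n)
  )
}

keep_characters(tokens, by = "rank", n = 2)
# "mellefont" "sara"
keep_characters(tokens, by = "tokens", n = 1000)
# "sara" "marwood" "mellefont"
keep_characters(tokens, by = "name", n = c("sara", "norton"))
# "sara" "norton"
```

Note that n changes meaning with by: it is a count for "rank", a minimum token count for "tokens", and a vector of names for "name".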

Examples

data(rjmw.0)
dstat <- dictionaryStatistics(rjmw.0)
filterCharacters(dstat, rjmw.0, by="tokens", n=1000)
#>    corpus  drama   character Liebe
#> 6    test rjmw.0     marwood    78
#> 7    test rjmw.0   mellefont    76
#> 8    test rjmw.0      norton     6
#> 9    test rjmw.0        sara   128
#> 10   test rjmw.0 sir_william    29
#> 11   test rjmw.0    waitwell    19