This function filters the characters in any table that contains a character column and is of the class QDHasCharacter.
filterCharacters(hasCharacter, drama, by = c("rank", "tokens", "name"), n = ifelse(by == "tokens", 500, ifelse(by == "rank", 10, c())))
Argument | Description |
---|---|
hasCharacter | The object we want to filter. |
drama | The QDDrama object. |
by | Character vector. Specifies the filter mechanism: "rank", "tokens", or "name". |
n | The threshold or a list of character names/ids to keep. |
The filtered QDHasCharacter object.
The function supports three filter mechanisms. The filter "rank" sorts the characters by the number of tokens they speak and keeps the top $n$ characters. The filter "tokens" keeps all characters that speak $n$ or more tokens. The filter "name" keeps the characters whose names are passed as a vector in n.
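The three mechanisms can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper `filter_sketch`, the toy data frames, and the column name `n_tokens` are hypothetical and chosen only to show the selection logic on a table with a character column.

```r
# Hypothetical toy data: tokens spoken per character, and a table to filter.
tokens <- data.frame(
  character = c("sara", "marwood", "mellefont", "norton"),
  n_tokens  = c(1200, 900, 1500, 300),
  stringsAsFactors = FALSE
)
stats <- data.frame(
  character = c("marwood", "mellefont", "norton", "sara"),
  Liebe     = c(7, 5, 1, 9),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not part of the package):
# selects the characters to keep, then subsets the table x.
filter_sketch <- function(x, tokens, by = c("rank", "tokens", "name"), n) {
  by <- match.arg(by)
  keep <- switch(by,
    # "rank": sort by tokens spoken (descending), keep the top n characters
    rank   = head(tokens$character[order(tokens$n_tokens, decreasing = TRUE)], n),
    # "tokens": keep characters speaking n or more tokens
    tokens = tokens$character[tokens$n_tokens >= n],
    # "name": n is itself the vector of names to keep
    name   = n
  )
  x[x$character %in% keep, , drop = FALSE]
}

by_rank   <- filter_sketch(stats, tokens, by = "rank",   n = 2)
by_tokens <- filter_sketch(stats, tokens, by = "tokens", n = 900)
by_name   <- filter_sketch(stats, tokens, by = "name",   n = c("sara", "norton"))
```

In this sketch the rows keep the order of the table being filtered, so the result is a subset of the input rather than a re-sorted copy. Whether the package behaves the same way is not stated here.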
data(rjmw.0)
dstat <- dictionaryStatistics(rjmw.0)
filterCharacters(dstat, rjmw.0, by="tokens", n=1000)
#>    corpus  drama   character Liebe
#> 6    test rjmw.0     marwood    78
#> 7    test rjmw.0   mellefont    76
#> 8    test rjmw.0      norton     6
#> 9    test rjmw.0        sara   128
#> 10   test rjmw.0 sir_william    29
#> 11   test rjmw.0    waitwell    19