8 Configuration
This section does not refer to the configuration of the R-package, but to the literary analysis concept configuration matrix (Pfister 1988).
8.1 Matrices
Configuration matrices can be extracted with the function configuration()
. As usual, entering ?configuration
provies more detailed information about the options the function provides. This function can only deal with a single play at a time, but using lappy()
, we can call it for many plays quickly (this the same we did with the function utteranceStatistics()
above).
configuration(rksp.0)
## corpus drama character 1 2 3 4 5
## 1 test rksp.0 der_prinz 2947 0 965 809 832
## 2 test rksp.0 der_kammerdiener 42 0 0 0 0
## 3 test rksp.0 conti 764 0 0 0 0
## 4 test rksp.0 marinelli 1062 527 1857 1377 837
## 5 test rksp.0 camillo_rota 106 0 0 0 0
## 6 test rksp.0 claudia_galotti 0 1264 641 232 0
## 7 test rksp.0 pirro 0 366 0 0 0
## 8 test rksp.0 odoardo 0 622 0 765 2011
## 9 test rksp.0 angelo 0 416 262 0 0
## 10 test rksp.0 emilia 0 1189 417 0 757
## 11 test rksp.0 appiani 0 1133 0 0 0
## 16 test rksp.0 battista 0 0 195 8 0
## 22 test rksp.0 orsina 0 0 0 2962 0
This creates a basic configuration matrix, but instead of just containing the presence or absence of a figure, it contains the number of spoken tokens for each act for each character. This information is in fact similar to what we can extract with characterStatistics(rksp.0, segment="Act")
, but in a different form and structure. The above table contains a lot of information that can be visualised.
We first need to extract the numeric content from the above table. This can easily be done with the function as.matrix()
(this will, in detail, be rerouted to the function as.matrix.QDConfiguration()
, which knows what part of the table needs to be in the matrix). A matrix containing only numbers can easily be plotted using the regular barplot()
function, as shown below.
<- configuration(rksp.0)
c
<- as.matrix(c)
mat
par(mar=c(2,2,2,10))
barplot(mat,
legend.text = c$character, # legend text
args.legend = list(x=ncol(mat)+3, # legend x position
y=max(colSums(mat)) # legend y pos
), col=qd.colors)
8.1.1 Filtering unimportant characters
This is informative, but doesn’t look very nice and some colors are difficult to associate with characters because colors are repeating. We will therefore use the function filterCharacters()
, which is very similar to characterNames()
: It can be applied to any object of the type QDHasCharacter
and removes rows according to certain criteria. In this case, we filter every character except the five most talkative ones. As usual, see ?filterCharacters
to see other options.
<- configuration(rksp.0) %>%
c filterCharacters(rksp.0, n = 5) %>%
characterNames(rksp.0)
<- as.matrix(c)
mat
par(mar=c(2,2,2,10))
barplot(mat,
names.arg = as.roman(1:ncol(mat)), # roman act numbers
legend.text = c$character, # legend text
args.legend = list(x=ncol(mat)+3, # legend x position
y=max(colSums(mat)) # legend y pos
), col=qd.colors)
Of course, the speech of characters that do not speak often is now removed, leaving only a portion of each act covered in the plot.
8.1.2 Normalization
Since each act has a different length, it is often useful to normalize each block, according to the total number of spoken tokens. This way, we can display the relative active presence of each character in each act. In combination with the filtering we did before, however, we need to be careful: If we scale with the filtered matrix, the resulting values do not reflect the portion of the speech in the scene, but only to the top 8 characters. This way, one would get the impression that the filtered characters’ speeches add up to 100% of a segment, which is not the case. The following snippet shows how to scale using the original matrix, but still only include the top eight characters into the matrix.
<- configuration(rksp.0) # getting the full configuration (with all characters)
c <- as.matrix(c) # extracting the matrix
mat_orig
<- c %>% filterCharacters(rksp.0,n = 8) # filtering, so that only the top 8 characters remain
c_filtered <- as.matrix(c_filtered) # extracting the filtered matrix
mat_filtered
# scaling, using values from the unfiltered matrix
<- scale(mat_filtered,
mat center=FALSE,
scale=colSums(mat_orig))
# plot, as above
par(mar=c(2,3,2,10))
barplot(mat,
names.arg = as.roman(1:ncol(mat)), # roman act numbers
# legend
legend.text=c_filtered$character, # set legend text
args.legend = list(x=ncol(mat)+3, # legend x position
y=max(colSums(mat)) # legend y pos
),# visual
las=1, # rotate labels on y axis
col=qd.colors # provide colors
)
8.2 Copresence
Configuration matrices are also often used to get an overview of who is copresent on stage. This can also be achieved using the function configuration()
. First, we create a configuration matrix that only represents presence or absence of a figure (and we switch to scenes). Obviously, the resulting matrix has many more columns, we include only the first 10 below.
configuration(rksp.0, onlyPresence = TRUE, segment="Scene")
## corpus drama character 1 2 3 4 5 6 7
## 1 test rksp.0 der_prinz TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## 2 test rksp.0 der_kammerdiener TRUE FALSE FALSE FALSE FALSE FALSE TRUE
## 4 test rksp.0 conti FALSE TRUE FALSE TRUE FALSE FALSE FALSE
## 9 test rksp.0 marinelli FALSE FALSE FALSE FALSE FALSE TRUE FALSE
## 14 test rksp.0 camillo_rota FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 15 test rksp.0 claudia_galotti FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 16 test rksp.0 pirro FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 17 test rksp.0 odoardo FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 20 test rksp.0 angelo FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 25 test rksp.0 emilia FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 27 test rksp.0 appiani FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 47 test rksp.0 battista FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 65 test rksp.0 orsina FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Creating a co-occurrence matrix is a simple matter of matrix multiplication, and we already know how to create a matrix.
# extract the configuration
<- configuration(rksp.0, onlyPresence = TRUE, segment="Scene")
c
# extract a matrix
<- as.matrix(c)
mat
# multiply the matrix with its inverse
# this creates the copresence matrix
<- mat %*% t(mat)
copresence
# add character names
rownames(copresence) <- c$character
colnames(copresence) <- c$character
copresence
## der_prinz der_kammerdiener conti marinelli camillo_rota
## der_prinz 17 2 2 9 1
## der_kammerdiener 2 2 0 0 0
## conti 2 0 2 0 0
## marinelli 9 0 0 19 0
## camillo_rota 1 0 0 0 1
## claudia_galotti 0 0 0 3 0
## pirro 0 0 0 1 0
## odoardo 2 0 0 4 0
## angelo 0 0 0 1 0
## emilia 2 0 0 4 0
## appiani 0 0 0 2 0
## battista 1 0 0 3 0
## orsina 1 0 0 3 0
## claudia_galotti pirro odoardo angelo emilia appiani battista
## der_prinz 0 0 2 0 2 0 1
## der_kammerdiener 0 0 0 0 0 0 0
## conti 0 0 0 0 0 0 0
## marinelli 3 1 4 1 4 2 3
## camillo_rota 0 0 0 0 0 0 0
## claudia_galotti 13 3 3 0 3 4 2
## pirro 3 4 1 1 0 1 0
## odoardo 3 1 12 0 2 0 0
## angelo 0 1 0 2 0 0 0
## emilia 3 0 2 0 7 1 1
## appiani 4 1 0 0 1 5 0
## battista 2 0 0 0 1 0 4
## orsina 1 0 3 0 0 0 0
## orsina
## der_prinz 1
## der_kammerdiener 0
## conti 0
## marinelli 3
## camillo_rota 0
## claudia_galotti 1
## pirro 0
## odoardo 3
## angelo 0
## emilia 0
## appiani 0
## battista 0
## orsina 6
The resulting copresence matrix shows the number of scenes in which two characters are both present. The diagonal shows the number of scenes in which a character is present in total (because each character is always copresent with itself, so to speak).
There are multiple ways to visualise this copresence. One option is a heat map, as shown below.
8.2.1 As Heatmap
The copresence can be visualised in a simple heat map. We first focus on the lower triangle and also remove the diagonal values. The actual plotting is a bit more complicated in this case, because we are just using the values in the copresence matrix as pixel intensities in the plot. Also, the axes need to be suppressed first, and can be added later with the proper names of the characters. If needed the code can also be used to include labels into the heat map.
<- configuration(rksp.0, onlyPresence = TRUE, segment="Scene") %>%
c filterCharacters(rksp.0, n = 7) %>%
characterNames(rksp.0)
# extract a matrix
<- as.matrix(c)
mat
# multiply the matrix with its inverse
# this creates the copresence matrix
<- mat %*% t(mat)
copresence
# add character names
rownames(copresence) <- c$character
colnames(copresence) <- c$character
# since it's a square matrix, we don't need the bottom right triangle
# and diagonales.
lower.tri(copresence,diag=TRUE)] <- NA
copresence[
par(mar=c(10,10,1,1)) # plot margins
image(copresence,
col = rgb(256,111,184, alpha=(seq(0,255)),
maxColorValue = 256),
xaxt= "n", # no x axis
yaxt= "n", # no y axis
frame=TRUE # print a frame around the heatmap
)
# include values as labels
text(y=(rep(1:ncol(copresence), each=nrow(copresence))-1)/(nrow(copresence)-1),
x=(1:nrow(copresence)-1)/(nrow(copresence)-1),
labels=as.vector(copresence))
# add the x axis
axis(1, at = seq(0,1,length.out = length(c$character)), labels = c$character, las=3)
# add the y axis
axis(2, at = seq(0,1,length.out = length(c$character)), labels = c$character, las=1)
Apparently, Marinelli and the prince have the most shared scenes. Marinelli also shares a scene with most other figures (sum of the vertical bar: 16).
8.2.2 As Network
The same information can also be visualized as a copresence network. For this, we employ the R-package igraph
. A nice introduction to igraph
can be found in (Arnold and Tilton 2015), particularly for literary networks.
Technically, the matrix we created before is an adjacency matrix. It is therefore simple to convert it to a graph, and igraph offers the function graph_from_adjacency_matrix()
for this.
<- configuration(rksp.0, onlyPresence = TRUE, segment="Scene") %>%
c filterCharacters(rksp.0, n = 7) %>%
characterNames(rksp.0)
# extract a matrix
<- as.matrix(c)
mat
# multiply the matrix with its inverse
# this creates the copresence matrix
<- mat %*% t(mat)
copresence
# add character names
rownames(copresence) <- c$character
colnames(copresence) <- c$character
# convert the adjacency matrix to a graph object
<- igraph::graph_from_adjacency_matrix(copresence,
g weighted=TRUE, # weighted graph
mode="undirected", # no direction
diag=FALSE # no looping edges
)
Now the variable g
holds the graph object. There are different things we can do with the graph. First, we can visualise it. igraph
uses the same mechanism of R that we have used before for plotting, specifying a plot()
function that can plot graph objects.
# Now we plot
plot(g,
layout=layout_with_gem, # how to lay out the graph
main="Copresence Network: Emilia Galotti", # title
vertex.label.cex=0.8, # label size
vertex.label.color="black", # font color
vertex.color=qd.colors[6], # vertex color
vertex.frame.color=NA, # no vertex border
edge.width=E(g)$weight # scale edges according to their weight
)
8.2.3 Network properties
igraph
offers a number of function to extract graph properties. We will show them below without explanation. The documentation of each function can be found here.
8.2.3.2 Average nearest neighbor degree
Function: knn()
knn(g)
## $knn
## Der Prinz Marinelli Claudia Odoardo Galotti Emilia
## 5.571429 4.360000 4.571429 4.928571 5.000000
## Appiani Orsina
## 5.285714 5.250000
##
## $knnk
## [1] NaN NaN 5.285714 5.410714 4.833333 4.360000
8.2.4 Graph Export
As a final step, one might want to further work on the graph using Gephi, or other tools. In order to do so, one can export the graph into an appropriate file:
::write_graph(g,
igraph"rksp.0.graphml", # specify the file name
format="graphml" # specify the file format
)
This results in a file called rksp.0.graphml
, that starts similarly as this:
<?xml version="1.0" encoding="UTF-8"?>
graphml xmlns="http://graphml.graphdrawing.org/xmlns"
< xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<!-- Created by igraph -->
key id="v_name" for="node" attr.name="name" attr.type="string"/>
<key id="e_weight" for="edge" attr.name="weight" attr.type="double"/>
<graph id="G" edgedefault="undirected">
<node id="n0">
<data key="v_name">DER KAMMERDIENER</data>
<node>
</node id="n1">
<data key="v_name">DER PRINZ</data>
<node>
</ ...
This file can be opened with Gephi.