2nd October 2016

Nils Reiter

Technology Overview

This post gives an overview of the current state of the technology we are using and how we plan to proceed in the future. All programs, scripts and components are available on our github page.

NLP Processing

The NLP processing components (e.g., PoS-Tagging and Lemmatization, but also speaker identification etc.) are grouped in the module DramaNLP.

Web service

To have a clear repository of the processed texts, we created a tomcat web app as a web service. The web service can be used to update certain kinds of meta data or re-run NLP analyses to a certain extent. Most importantly, the web services offers querying options to retrieve CSV-tables that can be processed with R. In fact, R directly access the web service using read.csv.

R Analysis

The R package (currently called DramaAnalysis, which is a bit unfortunate…) collects a number of often used functions. It will grow over time and be extended.

This web page / web app

We have experimented with an interactive web application to show some analyses. Some of the ideas are now re-used in the form of this web page. Basically, we export data from R into json and then javascript to make it interactive.