Calculating Readability in R

Fri 18th April 2014

I’ve finally got round to exploring an idea around readability, and was excited to find that the programming language R already has a library, koRpus, that calculates a number of readability metrics. That should save me the time of writing my own or using an API. Having installed the library with install.packages('koRpus'), I was hoping it would be as easy as calling the function and giving it some text: readability("hello my name is Rikki"). Of course it wasn’t going to be that easy. Here’s my guide to the minimum you have to do to get a readability score out of R. There are plenty of other options to explore, and feel free to ask questions in the comments below.

  1. Install the koRpus library: install.packages('koRpus')
  2. Install TreeTagger. There are installation steps on that site. There are too many (i.e. it could be simpler), but go with it.
    1. Choose somewhere sensible to put the directory (I put the files in /usr/bin/TreeTagger/ on my Mac).
    2. Download each of the files it tells you to: tagger package, tagging scripts, install-tagger.sh and a parameter file for the language of the text you will be analysing. I didn’t download the English chunker file yet (I’ll see if it’s necessary later).
    3. Don’t unzip the archives.
    4. chmod u+x install-tagger.sh
    5. ./install-tagger.sh
    6. Add the following lines to your ~/.profile or ~/.bash_profile, add $TAGGER_PATH to your PATH variable as well, then run source ~/.profile: export TAGGER_CMD=/usr/bin/TreeTagger/cmd; export TAGGER_BIN=/usr/bin/TreeTagger/bin; export TAGGER_PATH=$TAGGER_CMD:$TAGGER_BIN
    7. Test it from the TreeTagger directory: echo 'Hello world!' | cmd/tree-tagger-english
  3. Set up your TreeTagger and readability options in R: set.kRp.env(TT.cmd="/usr/bin/TreeTagger/cmd/tree-tagger-english", lang="en")
  4. Write your text to a file: tf = tempfile(); write(words, tf)
  5. Run the readability function on that file: rdb <- readability(tf)
  6. Get a value out: rdb@Flesch.Kincaid$grade
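
For anyone who would rather script this than type it, here is a rough sketch of steps 2.6 to 6 as a single shell script. Treat it as an illustration rather than a tested recipe: TREETAGGER_HOME and TEXT_FILE are hypothetical names, the sample sentence is a placeholder, the koRpus.lang.en line is an assumption about newer koRpus releases (which, as far as I know, ship English support as a separate package), and the readability() call assumes the function accepts a path to a text file. Every step that needs TreeTagger or R is skipped when the tool is missing, so the script should run harmlessly on a machine where nothing is installed yet.

```shell
#!/bin/sh
# Sketch of steps 2.6 to 6. TREETAGGER_HOME is a hypothetical helper variable;
# the path is the install location used in this post.
TREETAGGER_HOME=/usr/bin/TreeTagger

# Step 2.6: these lines would normally live in ~/.profile or ~/.bash_profile.
export TAGGER_CMD="$TREETAGGER_HOME/cmd"
export TAGGER_BIN="$TREETAGGER_HOME/bin"
export TAGGER_PATH="$TAGGER_CMD:$TAGGER_BIN"
export PATH="$PATH:$TAGGER_PATH"

# Step 2.7: only run the tagger test if the script is actually on the PATH.
if command -v tree-tagger-english >/dev/null 2>&1; then
    echo 'Hello world!' | tree-tagger-english
else
    echo "tree-tagger-english not found on PATH; skipping the tagger test" >&2
fi

# Step 4: write some placeholder text to a temporary file.
TEXT_FILE=$(mktemp)
echo 'Hello, my name is Rikki. I am trying to measure readability.' > "$TEXT_FILE"

# Steps 3, 5 and 6: run the R side non-interactively, but only if both
# Rscript and the tagger script exist. The koRpus.lang.en line is an
# assumption about newer koRpus releases and is skipped if it is absent.
if command -v Rscript >/dev/null 2>&1 && [ -x "$TAGGER_CMD/tree-tagger-english" ]; then
    Rscript -e '
        args <- commandArgs(trailingOnly = TRUE)
        library(koRpus)
        if (requireNamespace("koRpus.lang.en", quietly = TRUE)) library(koRpus.lang.en)
        set.kRp.env(TT.cmd = args[2], lang = "en")
        rdb <- readability(args[1])
        print(rdb@Flesch.Kincaid$grade)
    ' "$TEXT_FILE" "$TAGGER_CMD/tree-tagger-english"
else
    echo "Rscript or TreeTagger not available; skipping the readability step" >&2
fi

# Tidy up the temporary file.
rm -f "$TEXT_FILE"
```

Save it under any name you like and run it with sh. When everything is installed, the Flesch-Kincaid grade is printed at the end; otherwise you only get the "skipping" messages.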

There we go. Way more complicated than it needed to be, but that’s how you do it. Install an application that the R library interfaces with, write your words to a temporary file and then call the function. Any questions, pop them in the comments below!
