data mining - r : Why is findAssocs() not working?


Keywords:r 


Question: 

findAssocs() is not working, as shown below: it returns numeric(0) for every query, even though "lucid" and "dreaming" occur together quite often in the book.

> docs <- tm_map(docs, stemDocument)
> dtm <- DocumentTermMatrix(docs)
> freq <- colSums(as.matrix(dtm))
> ord <- order(freq)
> freq[tail(ord)]
   one experi   will    can  lucid  dream
   287    312    363    452   1018   2413
> freq[head(ord)]
  abbey abdomin    abdu abraham  absent    abus
      1       1       1       1       1       1
> findAssocs(dtm, "dream", corlimit=0.6)
$dream
numeric(0)
> findAssocs(dtm, "dream", corlimit=0.1)
$dream
numeric(0)
> findAssocs(dtm, "lucid", corlimit=0.1)
$lucid
numeric(0)
> findAssocs(dtm, "lucid", corlimit=0.6)
$lucid
numeric(0)
> 

The corpus is a single document, the text version of a book. Does this function require at least two documents? If so, and I split the book in half, will I get correlations for the book as a whole, or correlations that only reflect how the two halves compare to each other?


1 Answer: 

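It works at the document level: it counts the number of documents a term occurs in, ignoring duplicate occurrences within a document. With only one document in the corpus there is nothing to correlate across, which is why every call returns numeric(0).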

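Split the book into many smaller units, e.g. sentences or paragraphs, and treat each one as a separate document. Splitting it into just two halves would only tell you how terms vary between those two halves.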
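
A minimal sketch of the paragraph split, assuming paragraphs are separated by blank lines. The `book` string is a made-up stand-in for the real text, and the cleaning steps are only one plausible choice; `stemDocument` needs the SnowballC package to be installed.

```r
library(tm)

# Toy stand-in for the book (hypothetical text, not from the question):
# one string with blank-line-separated paragraphs.
book <- paste(
  "Lucid dreaming is a skill.",
  "",
  "A lucid dream can be vivid.",
  "",
  "The abbey was quiet.",
  "",
  "Lucid dream recall improves with practice.",
  sep = "\n"
)

# Split on blank lines so that each paragraph becomes its own document.
paragraphs <- unlist(strsplit(book, "\n\\s*\n"))
paragraphs <- paragraphs[nzchar(trimws(paragraphs))]

# Build a corpus with one document per paragraph, then clean and stem it.
docs <- VCorpus(VectorSource(paragraphs))
docs <- tm_map(docs, content_transformer(tolower))
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs, stemDocument)

# Rows are now paragraphs, so term columns can vary across documents.
dtm <- DocumentTermMatrix(docs)
assocs <- findAssocs(dtm, "dream", corlimit = 0.1)
print(assocs)
```

With several documents in the matrix, terms that tend to appear in the same paragraphs as "dream" should show up in the result. For a real book, replace `book` with the file contents, e.g. `paste(readLines(path), collapse = "\n")`, where `path` is a placeholder for your own file.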