r - Error in simple_triplet_matrix when creating bigrams with RTextTools and tm


I am trying to create bigrams from text, following different approaches proposed on the web. The simplest example is the following:

    library(RTextTools)
    texts <- c("this first document.", "this second file.", "this third text.")
    matrix <- create_matrix(texts, ngramLength=3)

which results in:

    Error in simple_triplet_matrix(i = i, j = j, v = as.numeric(v), nrow = length(allTerms),  :
      'i, j, v' different lengths
    In addition: Warning messages:
    1: In mclapply(unname(content(x)), termFreq, control) :
      scheduled cores encountered errors in user code
    2: In simple_triplet_matrix(i = i, j = j, v = as.numeric(v), nrow = length(allTerms),  :
      NAs introduced by coercion

This is my setup:

    R version 3.1.1 (2014-07-10)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
     [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
     [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] graphics  grDevices utils     datasets  stats     methods   base

    other attached packages:
    [1] tm_0.6-2         NLP_0.1-8        RTextTools_1.4.2 SparseM_1.6
    [5] SnowballC_0.5.1  reshape2_1.4.1   ggplot2_1.0.0    plyr_1.8.1

    loaded via a namespace (and not attached):
     [1] bitops_1.0-6        caTools_1.17.1      class_7.3-11
     [4] codetools_0.2-9     colorspace_1.2-4    digest_0.6.8
     [7] e1071_1.6-4         foreach_1.4.2       glmnet_2.0-2
    [10] grid_3.1.1          gtable_0.1.2        ipred_0.9-4
    [13] iterators_1.0.7     lattice_0.20-29     lava_1.4.1
    [16] MASS_7.3-34         Matrix_1.1-4        maxent_1.3.3.1
    [19] munsell_0.4.2       nnet_7.3-8          parallel_3.1.1
    [22] prodlim_1.5.1       proto_0.3-10        randomForest_4.6-10
    [25] Rcpp_0.11.3         rpart_4.1-8         scales_0.2.4
    [28] slam_0.1-32         splines_3.1.1       stringr_0.6.2
    [31] survival_2.37-7     tau_0.0-18          tcltk_3.1.1
    [34] tools_3.1.1         tree_1.0-36

Any hints on what is happening?

