{{counter}} R !!!R.package Rで便利なパッケージ {{outline}} !!パッケージのインストール *たとえば、gplots を使いたい場合 {{pre > install.packages("gplots", dependencies = T) > library(gplots) }} !!stringr !str_detect() *該当する文字列があるかどうか調べる str_detect(データ, "正規表現") *subset() と合わせて使うと便利 **データフレーム中の特定の列に「ある種の文字列」があるかどうかを調べて、その文字列を含む行だけを選び出す。 ***「ある種の文字列」の例：小文字の連続で書かれている「単語」が複数あるもの {{pre fragJBnozeroMW <- subset(fragJBnozero, str_detect(fragJBnozero[,3], "[:lower:]+ +[^a-z]*[:lower:]+"), select=c("total","MW")) > fragJBnozeroMW total MW 251 256 a [NN] of [NP] 278 253 do n't [VP] 285 252 the [NN] of [NP] 291 219 do not [VP] 306 235 want to [VP] 325 240 a lot 330 213 for example 341 210 a lot of [NP] }} ---- 参考サイト https://heavywatal.github.io/rstats/stringr.html !!psych 心理学系のパッケージ基本的な記述統計量は <>で出るが、もう少し詳しく見るには、このパッケージをインストールして <>を使う。 {{pre > describe(x) vars n mean sd median trimmed mad min max range skew kurtosis se X1 1 100 0.06 1.06 0.04 0.07 0.87 -2.9 2.51 5.41 -0.04 -0.32 0.11 }} *標準偏差、歪度、尖度、標準誤差なども出る。 !!tm https://www.rdocumentation.org/packages/tm/versions/0.7-3 *Boost_tokenizer(x) *MC_tokenizer(x) *removePunctuation(tmp) *removeNumbers(x) ---- {{pre tmp.v <- Vectorsource(tmp) tmp.c <- Corpus(tmp.v) tmpc.td <- TermDocumentMatrix(tmp.c) findFreqTerms(tmpc.td) findMostFreqTerms(tmpc.td) $`1` the said and computer its terminal 15 7 6 6 5 5 }} ---- !!koRpus https://reaktanz.de/R/pckg/koRpus/koRpus_vignette.html *MTLD > library(koRpus) > ns002 <- tokenize(choose.files(), lang="en") で、たとえば、NS002のテキストだけのファイルを読み込んで、 > MTLD(ns002) Language: "en" Total number of tokens: 463 Total number of types: 218 Measure of Textual Lexical Diversity MTLD: 87.62 Number of factors: 5.28 Factor size: 0.72 SD tokens/factor: 36.8 (all factors) 30.05 (complete factors only) Note: Analysis was conducted case insensitive. !!rpart >model1 = rpart(LMH ~ DD + SL + MDD, data = C3L2) > rpart.plot(model) !!gplots {{pre install.packages("gplots", dependencies = T) library(gplots) > head(meanMHD) Group MHD 1 C2 1.500000 2 C2 1.000000 3 C2 2.000000 4 C2 1.250000 5 C2 1.333333 6 C2 1.333333 attach(meanMHD) plotmeans(MHD ~ Group) detach(oneWayMHD) }} {{ref_image meanComparisonMHD.png}} !!orddom *psychパッケージがインストールしてあること orddom(x, y) *これだけで各種効果量を出してくれて、お好きなのをどうぞ、って感じ。 {{pre > orddom(dmu02shd$MDD, dmu03shd$MDD) ordinal metric var1_X "group 1 (x)" "group 1 (x)" var2_Y "group 2 (y)" "group 2 (y)" type_title "indep" "indep" n in X "592" "592" n in Y "547" "547" N #Y>X "176563" "176563" N #Y=X "13390" "13390" N #YY "0.413406665349078" "0.432442005161643" PS Y>X "0.545243712634023" "0.567557994838357" A X>Y "0.434081476357528" "0.434081476357528" A Y>X "0.565918523642472" "0.565918523642472" delta "0.131837047284944" "0.128573222406733" 1-alpha "95" "95" CI low "0.0647643614128363" "0.0665435144203836" CI high "0.197723891470872" "0.190602930393082" s delta "0.0339578629779545" "0.533045085206667" var delta "0.00115313645802954" "0.284137062862983" se delta NA "0.0316134060048762" z/t score "3.88237173141703" "4.0670474540738" H1 tails p/CI "2" "2" p "0.000109396219747593" "5.52499619534335e-05" Cohen's d "0.177125083672158" "0.241205155014015" d CI low "0.0839111293425067" "0.124543945177201" d CI high "0.275869162430133" "0.357866364850828" var d.i "0.291361840460969" "0.251891629550177" var dj. "0.362041171144457" "0.319040086833437" var dij "0.941272277686609" "0.569924729477023" df "1137" "1094.88514066995" NNT "7.58512133420789" "5.70864949792907" }}