R
!!!R.package
Useful packages for R
{{outline}}
!!Installing a package
*For example, to use gplots:
{{pre
> install.packages("gplots", dependencies = TRUE)
> library(gplots)
}}
!Which libraries are available to load
<>
!Which libraries are already loaded and ready to use
<>
!Getting an overview of a package
<>
!!!Frequently used packages
{{pre
library(openxlsx)
library(tidyverse)
}}
* tidyverse includes the following packages:
{{pre
✔ ggplot2 3.3.6     ✔ purrr   0.3.4
✔ tibble  3.1.7     ✔ dplyr   1.0.9
✔ tidyr   1.2.0     ✔ stringr 1.4.0
✔ readr   2.1.2     ✔ forcats 0.5.1
}}
!!!A-G
!!corpus
!!dagitty
!!easystats
!!eyetrackingR
!!ggplot2
!!ggstatsplot
!!!H-P
!!ngram
!!!Q-U
!!quanteda
!!tidyverse
!!tm
!!!V-Z
----
!!!List of Packages
!!retimes
https://cran.r-project.org/web/packages/retimes/index.html
!!stringi
!!stringr
!!psych
A package for psychological research.

Basic descriptive statistics are available from <>, but for a more detailed summary, install this package and use <>.
{{pre
> describe(x)
   vars   n mean   sd median trimmed  mad  min  max range  skew kurtosis   se
X1    1 100 0.06 1.06   0.04    0.07 0.87 -2.9 2.51  5.41 -0.04    -0.32 0.11
}}
*The output also includes the standard deviation, skewness, kurtosis, standard error, and more.
!!sjmisc
!find_var
Searches for and selects the matching data columns.
 find_var(data, pattern = "PATTERN", out = OUTPUT_FORMAT)
 find_var(data, pattern = "score", out = "df")
*Selects the columns whose names contain "score" and returns them as a data frame.
!!tm
https://www.rdocumentation.org/packages/tm/versions/0.7-3
*Boost_tokenizer(x)
*MC_tokenizer(x)
*removePunctuation(tmp)
*removeNumbers(x)
----
{{pre
library(tm)
tmp.v <- VectorSource(tmp)            # wrap the character vector as a source
tmp.c <- Corpus(tmp.v)                # build a corpus from that source
tmpc.td <- TermDocumentMatrix(tmp.c)  # term-by-document frequency matrix
findFreqTerms(tmpc.td)
findMostFreqTerms(tmpc.td)
$`1`
     the     said      and computer      its terminal
      15        7        6        6        5        5
}}
----
!!koRpus
https://reaktanz.de/R/pckg/koRpus/koRpus_vignette.html
!tokenize()
 install.packages("koRpus.lang.en")
 library(koRpus.lang.en)
 # choose.files() is Windows-only; file.choose() works on other platforms
 temp <- tokenize(choose.files(), lang="en")
This reads in a text file of your choice, for example the Grimm fairy tale "The Golden Bird" from Project Gutenberg:
{{pre
A certain king had a beautiful garden, and in the garden stood a tree which bore golden apples. These apples were always counted, and about the time when they began to grow ripe it was found that every night one of them was gone.
The king became very angry at this, and ordered the gardener to keep watch all night under the tree. The gardener set his eldest son to watch; but about twelve o’clock he fell asleep, and in
}}
{{pre
> temp
     doc_id     token      tag lemma lttr   wclass desc stop stem  idx sntc
1                   A word.kRp          1     word                   1    1
2             certain word.kRp          7     word                   2    1
3                king word.kRp          4     word                   3    1
4                 had word.kRp          3     word                   4    1
5                   a word.kRp          1     word                   5    1
6           beautiful word.kRp          9     word                   6    1
[...]
2948                a word.kRp          1     word                2948  140
2949            great word.kRp          5     word                2949  140
2950             many word.kRp          4     word                2950  140
2951             many word.kRp          4     word                2951  140
2952            years word.kRp          5     word                2952  140
2953                .     .kRp          1 fullstop                2953  140
}}
!lex.div()
*Computes a range of lexical diversity indices.
 lex.div(temp)
!MTLD()
 > library(koRpus)
 > ns002 <- tokenize(choose.files(), lang="en")
This reads in, for example, a file that contains only the text of NS002. Then:
 > MTLD(ns002)
 Language: "en"
 Total number of tokens: 463
 Total number of types:  218
 Measure of Textual Lexical Diversity
               MTLD: 87.62
  Number of factors: 5.28
        Factor size: 0.72
   SD tokens/factor: 36.8 (all factors)
                     30.05 (complete factors only)
 Note: Analysis was conducted case insensitive.
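As a rough illustration of what the MTLD figures above mean, the following base-R sketch computes a type-token ratio and a simplified MTLD without koRpus. The function names `mtld_one_way` and `mtld_sketch`, the sample word vector, and the choice to close a factor when the running TTR drops strictly below the factor size are all assumptions made for illustration. This is not koRpus's implementation, and its results may differ from those of `MTLD()`.

```r
# Simplified MTLD sketch in base R (an illustration, not the koRpus code).
# One pass: count how many times the running TTR falls below factor_size.
mtld_one_way <- function(tokens, factor_size = 0.72) {
  factors <- 0
  seen <- character(0)  # types seen in the current segment
  n_seg <- 0            # tokens in the current segment
  for (tok in tokens) {
    n_seg <- n_seg + 1
    if (!(tok %in% seen)) seen <- c(seen, tok)
    ttr <- length(seen) / n_seg
    if (ttr < factor_size) {  # assumption: strict "<" closes a factor
      factors <- factors + 1
      seen <- character(0)
      n_seg <- 0
    }
  }
  # The leftover segment counts as a partial factor.
  if (n_seg > 0) {
    ttr <- length(seen) / n_seg
    factors <- factors + (1 - ttr) / (1 - factor_size)
  }
  if (factors == 0) return(NA_real_)  # TTR never dropped below 1
  length(tokens) / factors
}

# Average of the forward and the reversed pass, case-insensitive.
mtld_sketch <- function(tokens, factor_size = 0.72) {
  tokens <- tolower(tokens)
  mean(c(mtld_one_way(tokens, factor_size),
         mtld_one_way(rev(tokens), factor_size)))
}

# Hypothetical sample: 12 tokens, 9 types.
words <- c("the", "king", "had", "a", "garden", "and",
           "in", "the", "garden", "stood", "a", "tree")
ttr_all <- length(unique(words)) / length(words)
print(ttr_all)             # 0.75
print(mtld_sketch(words))
```

In the output above, 463 tokens over roughly 5.28 factors gives about 87.7, in line with the reported MTLD of 87.62.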
!MATTR()
{{pre
> MATTR(temp)

Language: "en"

Total number of tokens: 606
Total number of types:  261

Moving-Average Type-Token Ratio
              MATTR: 0.69
         SD of TTRs: 0.05
        Window size: 100
}}
!!rpart
 > library(rpart)
 > library(rpart.plot)  # rpart.plot() comes from the separate rpart.plot package
 > model1 <- rpart(LMH ~ DD + SL + MDD, data = C3L2)
 > rpart.plot(model1)
!!gplots
{{pre
install.packages("gplots", dependencies = TRUE)
library(gplots)
> head(meanMHD)
  Group      MHD
1    C2 1.500000
2    C2 1.000000
3    C2 2.000000
4    C2 1.250000
5    C2 1.333333
6    C2 1.333333
attach(meanMHD)
plotmeans(MHD ~ Group)  # plot the group means with confidence intervals
detach(meanMHD)         # detach the same data frame that was attached
}}
{{ref_image meanComparisonMHD.png}}
!!orddom
Computes effect sizes.
*Requires the psych package to be installed.
 orddom(x, y)
*This single call reports a whole range of effect sizes, so you can pick whichever one you need.
{{pre
> orddom(dmu02shd$MDD, dmu03shd$MDD)
              ordinal                 metric
var1_X        "group 1 (x)"           "group 1 (x)"
var2_Y        "group 2 (y)"           "group 2 (y)"
type_title    "indep"                 "indep"
n in X        "592"                   "592"
n in Y        "547"                   "547"
N #Y>X        "176563"                "176563"
N #Y=X        "13390"                 "13390"
N #Y<X        [...]
PS X>Y        "0.413406665349078"     "0.432442005161643"
PS Y>X        "0.545243712634023"     "0.567557994838357"
A X>Y         "0.434081476357528"     "0.434081476357528"
A Y>X         "0.565918523642472"     "0.565918523642472"
delta         "0.131837047284944"     "0.128573222406733"
1-alpha       "95"                    "95"
CI low        "0.0647643614128363"    "0.0665435144203836"
CI high       "0.197723891470872"     "0.190602930393082"
s delta       "0.0339578629779545"    "0.533045085206667"
var delta     "0.00115313645802954"   "0.284137062862983"
se delta      NA                      "0.0316134060048762"
z/t score     "3.88237173141703"      "4.0670474540738"
H1 tails p/CI "2"                     "2"
p             "0.000109396219747593"  "5.52499619534335e-05"
Cohen's d     "0.177125083672158"     "0.241205155014015"
d CI low      "0.0839111293425067"    "0.124543945177201"
d CI high     "0.275869162430133"     "0.357866364850828"
var d.i       "0.291361840460969"     "0.251891629550177"
var dj.       "0.362041171144457"     "0.319040086833437"
var dij       "0.941272277686609"     "0.569924729477023"
df            "1137"                  "1094.88514066995"
NNT           "7.58512133420789"      "5.70864949792907"
}}
For nonparametric data, look at Cliff's delta and judge its magnitude against the following thresholds:

https://www.rdocumentation.org/packages/effsize/versions/0.7.4/topics/cliff.delta

The magnitude is assessed using the thresholds provided in (Romano 2006), i.e. |d|<0.147 "negligible", |d|<0.33 "small", |d|<0.474 "medium", otherwise "large"
!!effsize
Computes effect sizes.
*Example: checking the nonparametric Cliff's delta
{{pre
install.packages("effsize")
library(effsize)
> cliff.delta(dmu02shd$MDD, dmu03shd$MDD)

Cliff's Delta

delta estimate: -0.131837 (negligible)
95 percent confidence interval:
      lower       upper
-0.19765420 -0.06483658
}}
*It also appends a verbal rating of the magnitude, here "(negligible)".
!!lawstat
Includes the Brunner-Munzel test.
{{pre
> brunner.munzel.test(dmu02shd$MDD, dmu03shd$MDD)

	Brunner-Munzel Test

data:  dmu02shd$MDD and dmu03shd$MDD
Brunner-Munzel Test Statistic = 3.8809, df = 1098.7, p-value = 0.0001103
95 percent confidence interval:
 0.5325908 0.5992463
sample estimates:
P(X
> brunnermunzel.test(dmu02shd$MDD, dmu03shd$MDD)

	Brunner-Munzel Test

data:  dmu02shd$MDD and dmu03shd$MDD
Brunner-Munzel Test Statistic = 3.8809, df = 1098.7, p-value = 0.0001103
95 percent confidence interval:
 0.5325908 0.5992463
sample estimates:
P(X