{{counter}} R
!!!R.package
Useful packages for R
{{outline}}
!!Installing packages
*For example, to use gplots:
{{pre
> install.packages("gplots", dependencies = TRUE)
> library(gplots)
}}
!!stringr
!str_detect()
*Checks whether a matching string is present: str_detect(data, "regular expression")
*Handy in combination with subset()
**Check whether a particular column of a data frame contains "a certain kind of string", and select only the rows that contain it.
***Example of "a certain kind of string": entries with two or more "words" written as runs of lowercase letters
{{pre
> library(stringr)
> fragJBnozeroMW <- subset(fragJBnozero,
+     str_detect(fragJBnozero[,3], "[:lower:]+ +[^a-z]*[:lower:]+"),
+     select=c("total","MW"))
> fragJBnozeroMW
    total               MW
251   256   a [NN] of [NP]
278   253      do n't [VP]
285   252 the [NN] of [NP]
291   219      do not [VP]
306   235     want to [VP]
325   240            a lot
330   213      for example
341   210    a lot of [NP]
}}
!!psych
A package for psychology. Basic descriptive statistics are available from <>, but for a more detailed look, install this package and use describe().
{{pre
> library(psych)
> describe(x)
   vars   n mean   sd median trimmed  mad  min  max range  skew kurtosis   se
X1    1 100 0.06 1.06   0.04    0.07 0.87 -2.9 2.51  5.41 -0.04    -0.32 0.11
}}
*The output also includes the standard deviation, skewness, kurtosis, standard error, and so on.
!!tm
https://www.rdocumentation.org/packages/tm/versions/0.7-3
*Boost_tokenizer(x)
*MC_tokenizer(x)
*removePunctuation(tmp)
*removeNumbers(x)
----
{{pre
> library(tm)
> tmp.v <- VectorSource(tmp)
> tmp.c <- Corpus(tmp.v)
> tmpc.td <- TermDocumentMatrix(tmp.c)
> findFreqTerms(tmpc.td)
> findMostFreqTerms(tmpc.td)
$`1`
     the     said      and computer      its terminal
      15        7        6        6        5        5
}}
----
!!koRpus
https://reaktanz.de/R/pckg/koRpus/koRpus_vignette.html
*MTLD
{{pre
> library(koRpus)
> ns002 <- tokenize(choose.files(), lang="en")
}}
This reads in, for example, a file that contains only the text of NS002. Then:
{{pre
> MTLD(ns002)
Language: "en"

Total number of tokens: 463
Total number of types:  218

Measure of Textual Lexical Diversity
              MTLD: 87.62
 Number of factors: 5.28
       Factor size: 0.72
  SD tokens/factor: 36.8 (all factors)
                    30.05 (complete factors only)

Note: Analysis was conducted case insensitive.
}}
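The token and type counts at the top of the MTLD() output can be approximated in base R, which helps when reading the result. This is only a minimal sketch: the sample sentence is hypothetical (it is not the NS002 text), and the letters-only split is a crude stand-in for koRpus's tokenize(), so counts on real text may differ. The type-token ratio (TTR) is shown because the "Factor size: 0.72" in the output is a TTR threshold.
{{pre
# Hypothetical sample text; not the NS002 file used above.
text <- "The cat sat on the mat. The dog sat too."

# Lower-case first so that "The" and "the" count as one type
# (the MTLD output notes a case-insensitive analysis).
# Splitting on non-letters is a simplification, not koRpus's tokenizer.
tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
tokens <- tokens[tokens != ""]

n_tokens <- length(tokens)          # running words
n_types  <- length(unique(tokens))  # distinct words
ttr      <- n_types / n_tokens      # type-token ratio

cat("tokens:", n_tokens, " types:", n_types, " TTR:", ttr, "\n")
}}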
!!rpart
{{pre
> library(rpart)
> library(rpart.plot)
> model1 <- rpart(LMH ~ DD + SL + MDD, data = C3L2)
> rpart.plot(model1)
}}
!!gplots
{{pre
> install.packages("gplots", dependencies = TRUE)
> library(gplots)
> head(meanMHD)
  Group      MHD
1    C2 1.500000
2    C2 1.000000
3    C2 2.000000
4    C2 1.250000
5    C2 1.333333
6    C2 1.333333
> attach(meanMHD)
> plotmeans(MHD ~ Group)
> detach(meanMHD)
}}
{{ref_image meanComparisonMHD.png}}
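To check the numbers behind a plot like this, the group means and their confidence intervals can be computed in base R. This is a rough sketch, assuming plotmeans() is left at what I understand to be its defaults (95% intervals based on the t distribution). The six C2 values come from head(meanMHD) above; the C3 values are invented purely for illustration.
{{pre
# Hypothetical data: C2 values are from head(meanMHD) above,
# C3 values are made up for this example.
meanMHD <- data.frame(
  Group = rep(c("C2", "C3"), each = 6),
  MHD   = c(1.5, 1.0, 2.0, 1.25, 1.333333, 1.333333,
            2.0, 1.5, 2.5, 1.75, 2.0, 2.25)
)

# Per-group size, mean, and standard deviation.
n <- tapply(meanMHD$MHD, meanMHD$Group, length)
m <- tapply(meanMHD$MHD, meanMHD$Group, mean)
s <- tapply(meanMHD$MHD, meanMHD$Group, sd)

# Half-width of a t-based 95% confidence interval for each group mean.
half <- qt(0.975, df = n - 1) * s / sqrt(n)

ci <- data.frame(n     = as.vector(n),
                 mean  = as.vector(m),
                 lower = as.vector(m - half),
                 upper = as.vector(m + half),
                 row.names = names(m))
print(round(ci, 3))
}}
Working with meanMHD$MHD directly, rather than attach(), avoids having to remember a matching detach().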