
NICE.tips

Small tips for analyses using the learner corpora NICE, NICER, and NICEST

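nice.body() is a script that takes data in the CHILDES CHAT format (a text file) and:
- extracts only the body of the data (the header lines are dropped),
- removes the speaker tier at the start of each line (*JPN...:\t or *NS...:\t),
- converts everything to lowercase,
- removes punctuation and spaces (leaving only alphanumeric characters), and
- returns the sequence of words.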

nice.body.R

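nice.body <- function(){
  # Pick a CHAT-format text file interactively and read it line by line.
  # (choose.files() is Windows-only; file.choose() also works on macOS/Linux.)
  lines.tmp <- scan(file.choose(), what="char", sep="\n")
  # Keep only utterance lines, i.e. lines that start with a speaker tier.
  data.tmp <- grep("^\\*(JPN|NS)...:\t", lines.tmp, value=TRUE)
  # Strip the speaker tier from the start of each line.
  body.tmp <- sub("^\\*(JPN|NS)...:\t", "", data.tmp)
  body.tmp <- body.tmp[body.tmp != ""] # drop empty lines
  body.tmp <- tolower(body.tmp)
  # Split on runs of non-word characters (punctuation, spaces).
  word.tmp <- unlist(strsplit(body.tmp, "\\W+"))
  word.tmp <- word.tmp[word.tmp != ""] # drop empty strings left by leading punctuation
  return(word.tmp)
}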

> nice.body()
Read 120 items
  [1] "education"  "of"         "yotori"     "there"      "was"        "the"        "education"  "system"     "that"      
 [10] "called"     "yotori"     "in"         "japan"      "several"    "years"      "ago"        "i"          "heard"     
 [19] "that"       "system"     "was"        "made"       "by"         "the"        "people"     "who"        "thought"   
 [28] "japanese"   "education"  "system"     "should"     "be"         "more"       "free"       "for"        "children"  
 [37] "i"          "was"        "born"       "in"         "1993"       "so"         "i"          "was"        "student"   

> jpn502 <- nice.body()
Read 120 items
> head(jpn502)
[1] "education" "of"        "yotori"    "there"     "was"       "the"
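
Because nice.body() opens a file dialog, it is awkward to call from a batch script or to try on a small example. The following is a minimal sketch of the same steps applied to lines that are already in memory. The name nice.body.lines() and the sample lines are hypothetical illustrations, not part of nice.body.R or of the corpus.

```r
# Hypothetical helper (not in the original nice.body.R): applies the same
# steps as nice.body() to a character vector of CHAT-format lines.
nice.body.lines <- function(lines) {
  # Keep only utterance lines starting with a speaker tier (*JPN...: or *NS...:).
  data.tmp <- grep("^\\*(JPN|NS)...:\t", lines, value = TRUE)
  # Remove the speaker tier, then lowercase the text.
  body.tmp <- tolower(sub("^\\*(JPN|NS)...:\t", "", data.tmp))
  # Split on runs of non-word characters and drop empty strings.
  word.tmp <- unlist(strsplit(body.tmp, "\\W+"))
  word.tmp[word.tmp != ""]
}

# Made-up sample lines in CHAT style (headers, two utterances, one dependent tier).
sample.lines <- c(
  "@Begin",
  "@Participants:\tJPN502 Student",
  "*JPN502:\tI was born in 1993.",
  "%com:\tthis dependent tier is ignored",
  "*JPN502:\tSo, I was a student.",
  "@End"
)

words <- nice.body.lines(sample.lines)
print(words)

# Word frequencies, most frequent first.
freq <- sort(table(words), decreasing = TRUE)
print(freq)
```

To process a file without the dialog, the lines could be read with readLines() on a known path and passed to the helper. The result is an ordinary character vector, so table() gives word frequencies directly.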

https://sugiura-ken.org/wiki/
