http://sgr.gsid.nagoya-u.ac.jp/wordpress/?page_id=1301
NICER1_3.zip
Right-click the file and choose "Open" (Extract All).
File / folder | Description |
---|---|
NICER1_3readme_2020-01-16.txt | Overview |
Learner_Instructions.pdf | Instructions for learners |
Learner_Profile_List.xls | List of learner profiles |
Learner_Questionnaire.pdf | Questionnaire for learners |
Native_Instructions.pdf | Instructions for native speakers |
Native_Profile_List.xls | List of native-speaker profiles |
Native_Questionnaire.pdf | Questionnaire for native speakers |
NICER_NNS/ | Learner corpus data |
NICER_NS/ | Native-speaker corpus data |
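(Note) Everything after # is a comment.
+ # addition
- # subtraction
* # multiplication
/ # division
^ # exponentiation
sqrt(x) # square root (power of 1/2)
log(x) # natural logarithm of x (base e)
log2(x) # base-2 logarithm of x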
1 + 1
## [1] 2
8 - 5
## [1] 3
2 * 2
## [1] 4
15 / 3
## [1] 5
2^3
## [1] 8
sqrt(9)
## [1] 3
log(2.7)
## [1] 0.9932518
log2(8)
## [1] 3
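The logarithm functions above differ only in their base. As a small sketch with illustrative values, `log()` also accepts a `base` argument, and `exp()` reverses the natural logarithm:

```r
# log() uses base e by default; the base argument selects another base
log(8, base = 2)      # same value as log2(8)
log10(1000)           # base-10 logarithm
exp(log(5))           # exp() undoes the natural logarithm

# The square root is the same as raising to the power 1/2
same_root <- isTRUE(all.equal(sqrt(9), 9^(1/2)))
same_root
```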
abc <- 3 # create a variable named abc and assign 3 to it
abc # display the contents of abc
## [1] 3
abc <- 300 # the value of abc is overwritten with 300
abc # display the contents of abc
## [1] 300
efj <- 6
jke <- 2987
ls() # list the variables created so far
## [1] "abc" "efj" "jke"
rm(jke) # delete the variable jke
ls()
## [1] "abc" "efj"
namae <- "sugiura" # namae now holds the character string "sugiura"
class(abc) # class(variable name) shows the "class" of a variable
## [1] "numeric"
class(namae)
## [1] "character"
kazu <- c(2,4,6,8) # create a vector named kazu holding 2, 4, 6, 8
kazu
## [1] 2 4 6 8
names <- c("I","you","he") # a vector can also hold character strings
names
## [1] "I" "you" "he"
length(kazu)
## [1] 4
length(names)
## [1] 3
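Individual elements of a vector are picked out with square brackets, which is the same mechanism used later to pull matching lines out of a text. A minimal sketch using the `kazu` vector from above (the names `doubled` and `total` are only for illustration):

```r
kazu <- c(2, 4, 6, 8)

kazu[1]              # first element (R counts from 1)
kazu[c(2, 4)]        # second and fourth elements
doubled <- kazu * 2  # arithmetic is applied to every element
doubled
total <- sum(kazu)   # sum of all elements
total
class(kazu)          # a vector of numbers has class "numeric"
```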
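Saving the workspace keeps a record of your work.
If you have to stop partway, save both the workspace and the history: the variables you created and the commands you ran are then preserved.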
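RStudio offers this through its menus, but the same thing can be done with commands. The following is a sketch only: the file name `lesson1.RData` is hypothetical, and it is written to a temporary folder so that nothing in your own folders is touched.

```r
abc <- 300

# Hypothetical file name, placed in a temporary folder for this sketch
ws_file <- file.path(tempdir(), "lesson1.RData")

save.image(file = ws_file)   # save all variables in the workspace
saved <- file.exists(ws_file)
saved

# In a later session, load() restores the saved variables
load(ws_file)

# The command history can be saved in an interactive session:
# savehistory(file = "lesson1.Rhistory")
```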
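scan(file="location and name of the file", what="char")
# Which file in which folder (the working directory is explained later)
# The data are text, so specify what="char"
scan(choose.files(), what="char") # choose.files() opens a file dialog (Windows only; file.choose() works on any platform)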
(Sample output)
Read 853 items
[1] "@Begin" "@Participants:"
[3] "NS501" "@PID:"
[5] "PIDNS501" "@Age:"
[7] "27" "@Sex:"
[9] "M" "@L1:"
[11] "AmE" "@FatherL1:"
[13] "none" "@MotherL1:"
(remainder omitted)
scan(choose.files(), what="char", sep="\n")
(Sample output)
Read 104 items
[1] "@Begin"
[2] "@Participants:\tNS501"
[3] "@PID:\tPIDNS501"
[4] "@Age:\t27"
[5] "@Sex:\tM"
[6] "@L1:\tAmE"
[7] "@FatherL1:\tnone"
[8] "@MotherL1:\tnone"
[9] "@AcademicBackground:\tM1"
(remainder omitted)
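The two outputs above differ because `scan()` splits on any whitespace by default, while `sep="\n"` makes it split on line breaks only. The sketch below reproduces the difference on a small made-up file (the content is invented for illustration, not taken from NICER); `quiet = TRUE` only suppresses the "Read ... items" message.

```r
# Write a few made-up CHAT-style lines to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("@Begin",
             "@Participants:\tNS501",
             "@Age:\t27",
             "*NS501:\tHowever this is a test."),
           tmp)

# Default: split on any whitespace, so every word is one item
words <- scan(tmp, what = "char", quiet = TRUE)
length(words)

# sep = "\n": split on line breaks only, so every line is one item
line_items <- scan(tmp, what = "char", sep = "\n", quiet = TRUE)
length(line_items)
line_items[2]   # the tab inside the line is kept and printed as \t
```

For concordance-style searches, one item per line is usually the more convenient form, because each match then returns a whole sentence.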
Read the native-speaker data file NS501.txt and store it in R as a variable (a vector).
We will name the vector "ns501".
ns501 <- scan(choose.files(), what="char", sep ="\n")
getwd()
## [1] "C:/Users/sugiura/Dropbox/ed/2020/2020後期/金曜2 第二特論/Rstudio-text/LCR"
list.files()
(Sample output)
[1] "JPN501.txt" "JPN502.txt" "JPN503.txt" "JPN504.txt"
[5] "JPN505.txt" "JPN506.txt" "JPN507.txt" "JPN508.txt"
[9] "JPN509.txt" "JPN510.txt" "JPN511.txt" "JPN512.txt"
[13] "JPN513.txt" "JPN514.txt" "JPN515.txt" "JPN516.txt"
setwd("NICER1_3") # note that the folder name must be in quotation marks
setwd("NICER_NNS")
# setwd("..") # move up one level
setwd("../NICER_NS") # go up one level, then down into NICER_NS
getwd()
## [1] "C:/Users/sugiura/Dropbox/ed/2020/2020後期/金曜2 第二特論/Rstudio-text/LCR/NICER1_3/NICER_NS"
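The moves above can be tried safely on a throwaway folder tree. This sketch builds folders with the same names under a temporary directory (the `LCR_demo` folder and the variable names are assumptions for illustration) and returns to the starting folder at the end.

```r
old_wd <- getwd()                      # remember where we started

# Build a throwaway folder tree under a temporary directory
base <- file.path(tempdir(), "LCR_demo")
dir.create(file.path(base, "NICER1_3", "NICER_NNS"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(base, "NICER1_3", "NICER_NS"), showWarnings = FALSE)

setwd(base)
setwd("NICER1_3/NICER_NNS")            # two levels down in one step
first_stop <- basename(getwd())
setwd("../NICER_NS")                   # up one level, then into NICER_NS
second_stop <- basename(getwd())
list.files("..")                       # lists the sibling folders

setwd(old_wd)                          # go back to the original folder
```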
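★ Instead of "choose.files()", give the file name directly
setwd("NICER1_3/NICER_NNS") # starting from the folder that contains NICER1_3
scan("JPN501.txt", what="char")
scan("JPN501.txt", what="char", sep="\n")
setwd("NICER1_3/NICER_NS") # again starting from the folder that contains NICER1_3
ns501 <- scan("NS501.txt", what="char", sep="\n")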
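Once file names can be given directly, several files can be read in one go. The following is a sketch under stated assumptions: the folder `NICER_demo` and the two tiny files are made up for illustration, and storing the result in a list (`texts`) is one possible design, not something the course material prescribes.

```r
# Throwaway folder with two tiny made-up files (not real NICER data)
demo_dir <- file.path(tempdir(), "NICER_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("@Begin", "*JPN501:\tThis is a test.", "@End"),
           file.path(demo_dir, "JPN501.txt"))
writeLines(c("@Begin", "*JPN502:\tAnother test.", "@End"),
           file.path(demo_dir, "JPN502.txt"))

# full.names = TRUE returns paths that work from any working directory
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read every file, one line per item; each file becomes one list element
texts <- lapply(files, scan, what = "char", sep = "\n", quiet = TRUE)
names(texts) <- basename(files)

names(texts)
texts[["JPN501.txt"]]
```

Because the full path is passed to `scan()`, no `setwd()` call is needed here.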
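The current NICER data are in CHAT format.
Reference: "CHAT Transcription Manual"
『今日から使える発話データベースCHILDES入門』, edited by Susanne Miyata, supervised by Brian MacWhinney, Hituzi Syobo, November 2004
In CHAT format, a single file contains both "header information" and "body text".
Header lines begin with @.
Body lines begin with *.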
@Begin
@Languages: en
@Participants: CHI Ross Child, FAT Brian Father
@ID: en|macwhinney|CHI|2;10.10||||Target_Child||
@ID: en|macwhinney|FAT|35;2.||||Target_Child||
*ROS: why isn't Mommy coming?
%com: Mother usually picks Ross up around 4 PM.
*FAT: don't worry.
*FAT: she'll be here soon.
*CHI: good.
@End
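Because the first character tells header and body apart, the two parts can be separated inside R. A minimal sketch on made-up lines (the vector `chat` and the names `header_lines` and `body_lines` are illustrative; `startsWith()` compares plain text, so no regular expression is involved):

```r
# A few made-up lines in CHAT style
chat <- c("@Begin",
          "@Participants:\tNS501",
          "*NS501:\tThis is the first sentence.",
          "*NS501:\tThis is the second sentence.",
          "@End")

header_lines <- chat[startsWith(chat, "@")]   # lines beginning with @
body_lines   <- chat[startsWith(chat, "*")]   # lines beginning with *

length(header_lines)
length(body_lines)
body_lines
```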
grep("However", ns501)
## [1] 45 54 84
grep("However", ns501, value=T)
## [1] "*NS501:\tHowever in the French educational system instead of a head or a body there is a thesis and an anti-thesis or point and counter point in which the writer must oppose his or her original statements."
## [2] "*NS501:\tHowever what the French lose in logical flow they gain in critical thinking."
## [3] "*NS501:\tHowever, sadly with the continuous failings of the American educational system, these lofty dreams yet remain dreams for a generation of potential Newtons and Einsteins."
grep("however", ns501, value=T) # lowercase h
## [1] "*NS501:\tThis makes the facts easy to access, however, it does not force the writer to challenge his or her own logic in the process, leaving the ideas themselves rigid."
ns501[grep("However", ns501)]
## [1] "*NS501:\tHowever in the French educational system instead of a head or a body there is a thesis and an anti-thesis or point and counter point in which the writer must oppose his or her original statements."
## [2] "*NS501:\tHowever what the French lose in logical flow they gain in critical thinking."
## [3] "*NS501:\tHowever, sadly with the continuous failings of the American educational system, these lofty dreams yet remain dreams for a generation of potential Newtons and Einsteins."
(Note that in R, escaping the regular-expression character "*" requires a doubled backslash, "\\".)
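As a sketch of what the doubled backslash looks like in practice, the example below searches made-up lines (not NICER data) for a literal asterisk at the start of a line, and shows `fixed = TRUE` as an alternative that switches regular expressions off:

```r
x <- c("@Begin", "*NS501:\tHowever, this is a test.", "@End")

# "\\*" in R code is the two-character regex \* (a literal asterisk)
grep("^\\*", x, value = TRUE)

# fixed = TRUE matches the text as-is, so * needs no escaping
grep("*NS501", x, value = TRUE, fixed = TRUE)

star_lines <- grep("^\\*", x)   # positions of the lines starting with *
star_lines
```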
grep("[hH]owever", ns501, value=T)
## [1] "*NS501:\tHowever in the French educational system instead of a head or a body there is a thesis and an anti-thesis or point and counter point in which the writer must oppose his or her original statements."
## [2] "*NS501:\tThis makes the facts easy to access, however, it does not force the writer to challenge his or her own logic in the process, leaving the ideas themselves rigid."
## [3] "*NS501:\tHowever what the French lose in logical flow they gain in critical thinking."
## [4] "*NS501:\tHowever, sadly with the continuous failings of the American educational system, these lofty dreams yet remain dreams for a generation of potential Newtons and Einsteins."
grep("however", ns501, value=T, ignore.case=T)
## [1] "*NS501:\tHowever in the French educational system instead of a head or a body there is a thesis and an anti-thesis or point and counter point in which the writer must oppose his or her original statements."
## [2] "*NS501:\tThis makes the facts easy to access, however, it does not force the writer to challenge his or her own logic in the process, leaving the ideas themselves rigid."
## [3] "*NS501:\tHowever what the French lose in logical flow they gain in critical thinking."
## [4] "*NS501:\tHowever, sadly with the continuous failings of the American educational system, these lofty dreams yet remain dreams for a generation of potential Newtons and Einsteins."
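Since `grep()` returns the positions of the matching lines, counting them is a matter of taking the length of the result. A small sketch on made-up sentences (the vector `x` and the names `n_lines` and `hits` are illustrative):

```r
x <- c("*NS501:\tHowever, this works.",
       "*NS501:\tIt works, however, only sometimes.",
       "*NS501:\tNothing to see here.")

# grep() returns positions, so length() gives the number of matching lines
n_lines <- length(grep("however", x, ignore.case = TRUE))
n_lines

# grepl() returns TRUE/FALSE for every element instead of positions
hits <- grepl("however", x, ignore.case = TRUE)
hits
sum(hits)   # TRUE counts as 1, so this gives the same count
```

Note that this counts lines containing the word, not occurrences: a line with two matches is still counted once.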