!!!Correspondence Analysis
{{outline}}
----
!Package
{{pre
install.packages("ca", dependencies=TRUE)
library(ca)
}}
!Sample data
*Ten files from NICER
{{pre
> setwd("NICERsample/")
> list.files()
 [1] "JPN501.txt" "JPN502.txt" "JPN503.txt" "JPN504.txt" "JPN505.txt" "JPN506.txt" "JPN507.txt" "JPN508.txt" "JPN509.txt"
[10] "JPN510.txt"
}}
*Compute the basic linguistic indices with myIndices4.R
{{pre
> myIndices4()
Read 123 items
Read 120 items
Read 70 items
Read 114 items
Read 106 items
Read 93 items
Read 111 items
Read 93 items
Read 91 items
Read 72 items
> nicerSampleIndices <- read.table(choose.files())
> nicerSampleIndices
           V1 V2  V3  V4 V5        V6       V7        V8       V9      V10
1  JPN501.txt  4 319 135 30 0.4231975 7.558549 0.5921317 4.304075 10.63333
2  JPN502.txt  4 356 161 29 0.4522472 8.532983 0.6649157 4.233146 12.27586
3  JPN503.txt  3 201 121 13 0.6019900 8.534682 0.7170149 4.746269 15.46154
4  JPN504.txt  4 260 140 27 0.5384615 8.682431 0.6877692 4.761538  9.62963
5  JPN505.txt  4 420 175 25 0.4166667 8.539126 0.6341905 3.995238 16.80000
6  JPN506.txt  3 261 124 20 0.4750958 7.675407 0.6390038 4.072797 13.05000
7  JPN507.txt  4 362 151 26 0.4171271 7.936384 0.6485083 4.292818 13.92308
8  JPN508.txt  3 198  98 20 0.4949495 6.964557 0.6174747 4.545455  9.90000
9  JPN509.txt  4 263 104 19 0.3954373 6.412915 0.5726616 4.034221 13.84211
10 JPN510.txt  3 183  99 14 0.5409836 7.318291 0.6577049 4.387978 13.07143
> names(nicerSampleIndices) <- c("file", "Score", "Token", "Type", "NoS", "TTR", "GI", "MATTR", "AWL", "ASL")
> nicerSampleIndices
         file Score Token Type NoS       TTR       GI     MATTR      AWL      ASL
1  JPN501.txt     4   319  135  30 0.4231975 7.558549 0.5921317 4.304075 10.63333
2  JPN502.txt     4   356  161  29 0.4522472 8.532983 0.6649157 4.233146 12.27586
3  JPN503.txt     3   201  121  13 0.6019900 8.534682 0.7170149 4.746269 15.46154
4  JPN504.txt     4   260  140  27 0.5384615 8.682431 0.6877692 4.761538  9.62963
5  JPN505.txt     4   420  175  25 0.4166667 8.539126 0.6341905 3.995238 16.80000
6  JPN506.txt     3   261  124  20 0.4750958 7.675407 0.6390038 4.072797 13.05000
7  JPN507.txt     4   362  151  26 0.4171271 7.936384 0.6485083 4.292818 13.92308
8  JPN508.txt     3   198   98  20 0.4949495 6.964557 0.6174747 4.545455  9.90000
9  JPN509.txt     4   263  104  19 0.3954373 6.412915 0.5726616 4.034221 13.84211
10 JPN510.txt     3   183   99  14 0.5409836 7.318291 0.6577049 4.387978 13.07143
}}
!Preparing for correspondence analysis
*Use the file names, rather than numbers, as row names
{{pre
> rownames(nicerSampleIndices) <- nicerSampleIndices$file
> nicerSampleIndices
                 file Score Token Type NoS       TTR       GI     MATTR      AWL      ASL
JPN501.txt JPN501.txt     4   319  135  30 0.4231975 7.558549 0.5921317 4.304075 10.63333
JPN502.txt JPN502.txt     4   356  161  29 0.4522472 8.532983 0.6649157 4.233146 12.27586
JPN503.txt JPN503.txt     3   201  121  13 0.6019900 8.534682 0.7170149 4.746269 15.46154
JPN504.txt JPN504.txt     4   260  140  27 0.5384615 8.682431 0.6877692 4.761538  9.62963
JPN505.txt JPN505.txt     4   420  175  25 0.4166667 8.539126 0.6341905 3.995238 16.80000
JPN506.txt JPN506.txt     3   261  124  20 0.4750958 7.675407 0.6390038 4.072797 13.05000
JPN507.txt JPN507.txt     4   362  151  26 0.4171271 7.936384 0.6485083 4.292818 13.92308
JPN508.txt JPN508.txt     3   198   98  20 0.4949495 6.964557 0.6174747 4.545455  9.90000
JPN509.txt JPN509.txt     4   263  104  19 0.3954373 6.412915 0.5726616 4.034221 13.84211
JPN510.txt JPN510.txt     3   183   99  14 0.5409836 7.318291 0.6577049 4.387978 13.07143
}}
*The file name now appears twice at the left edge:
**the leftmost one is the row name
**the second one is the value of the column file
!Correspondence analysis
*The file name is not part of the analysis, so [ , -1] drops the first column before the correspondence analysis is run
{{pre
> nicerSample.ca <- ca(nicerSampleIndices[, -1])
> nicerSample.ca

 Principal inertias (eigenvalues):
                  1        2        3       4     5     6  7  8
Value      0.005225 0.001995 0.000701 3.3e-05 9e-06 1e-06  0  0
Percentage   65.61%   25.05%     8.8%   0.41% 0.11% 0.01% 0% 0%

 Rows:
        JPN501.txt JPN502.txt JPN503.txt JPN504.txt JPN505.txt JPN506.txt JPN507.txt JPN508.txt JPN509.txt JPN510.txt
Mass      0.109972   0.123870   0.079131   0.097887   0.140689   0.093288   0.122593   0.073425   0.089278   0.069868
ChiDist   0.077607   0.051114   0.171263   0.097029   0.085809   0.023531   0.061386   0.081665   0.079874   0.127017
Inertia   0.000662   0.000324   0.002321   0.000922   0.001036   0.000052   0.000462   0.000490   0.000570   0.001127
Dim. 1    0.828663   0.569189  -2.257347  -0.556775   0.882663  -0.241572   0.812881  -0.698749   0.447961  -1.695976
Dim. 2   -1.033676  -0.453632   1.034698  -1.902720   1.155776   0.248239   0.382980  -1.160287   0.974881   0.568028

 Columns:
            Score    Token      Type       NoS       TTR        GI     MATTR       AWL       ASL
Mass     0.007740 0.606927  0.281212  0.047944  0.001023  0.016803  0.001383  0.009325  0.027645
ChiDist  0.144788 0.051699  0.076812  0.165881  0.335743  0.198663  0.243273  0.261869  0.248121
Inertia  0.000162 0.001622  0.001659  0.001319  0.000115  0.000663  0.000082  0.000639  0.001702
Dim. 1  -1.053972 0.683722 -0.953127  0.381575 -4.493261 -2.699578 -3.205499 -3.243154 -2.620518
Dim. 2  -0.891426 0.329033 -0.324783 -3.513723 -0.783634 -0.639111 -0.435240 -1.080868  3.227127
}}
!Plotting
{{pre
> plot(nicerSample.ca)
}}
{{ref_image caSample.png}}
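
!Cross-checking the output in base R
*The steps on this page (name the columns, turn the file names into row names, drop the file column, run the analysis) can be cross-checked without the ca package. The following is a minimal sketch, not the package's own code. It assumes the usual textbook formulation of correspondence analysis, in which the principal inertias are the squared singular values of the standardized residual matrix. The variable names (indices_text, row_mass and so on) and the use of read.table(text = ...) in place of the file chooser are choices made for this illustration. The data are the ten rows printed on this page.

```r
# The ten rows of indices printed on this page.
indices_text <- "
JPN501.txt 4 319 135 30 0.4231975 7.558549 0.5921317 4.304075 10.63333
JPN502.txt 4 356 161 29 0.4522472 8.532983 0.6649157 4.233146 12.27586
JPN503.txt 3 201 121 13 0.6019900 8.534682 0.7170149 4.746269 15.46154
JPN504.txt 4 260 140 27 0.5384615 8.682431 0.6877692 4.761538 9.62963
JPN505.txt 4 420 175 25 0.4166667 8.539126 0.6341905 3.995238 16.80000
JPN506.txt 3 261 124 20 0.4750958 7.675407 0.6390038 4.072797 13.05000
JPN507.txt 4 362 151 26 0.4171271 7.936384 0.6485083 4.292818 13.92308
JPN508.txt 3 198 98 20 0.4949495 6.964557 0.6174747 4.545455 9.90000
JPN509.txt 4 263 104 19 0.3954373 6.412915 0.5726616 4.034221 13.84211
JPN510.txt 3 183 99 14 0.5409836 7.318291 0.6577049 4.387978 13.07143
"
cols <- c("file", "Score", "Token", "Type", "NoS",
          "TTR", "GI", "MATTR", "AWL", "ASL")

# row.names = 1 uses the first column as row names and removes it from the
# data, so no separate rownames() assignment or [, -1] is needed afterwards.
indices <- read.table(text = indices_text, col.names = cols, row.names = 1)

N <- as.matrix(indices)
P <- N / sum(N)                      # correspondence matrix (proportions)
row_mass <- rowSums(P)               # the "Mass" line under Rows
col_mass <- colSums(P)               # the "Mass" line under Columns
E <- outer(row_mass, col_mass)       # proportions expected under independence
S <- (P - E) / sqrt(E)               # standardized residuals

dec <- svd(S)
inertia <- dec$d^2                   # principal inertias (eigenvalues)
percent <- 100 * inertia / sum(inertia)

# Standard coordinates; the sign of each dimension is arbitrary in an SVD,
# so whole columns may come out mirrored relative to the ca() printout.
row_coord <- dec$u / sqrt(row_mass)
col_coord <- dec$v / sqrt(col_mass)
rownames(row_coord) <- rownames(N)
rownames(col_coord) <- colnames(N)

# Compare these with the Value and Percentage lines printed by ca().
print(round(inertia[1:8], 6))
print(round(percent[1:8], 2))
print(round(row_coord[, 1:2], 6))
```

*The masses, the principal inertias and the absolute values of the coordinates can be compared line by line with the printout of nicerSample.ca. With nine columns and ten rows there are at most eight non-trivial dimensions, which is why only the first eight values are printed.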