!!!NICER1.1SampleData

*Extract linguistic features with myTextIndexTopic.R

[myTextIndexTopic.R|https://sugiura-ken.org/wiki/wiki.cgi/exp?page=myTextIndex%2ER]

{{pre
> head(JPNindexTopic)
        file     Topic Score Token Type NoS       TTR       GI     MATTR      AWL      ASL
1 JPN501.txt    sports     4   994  260 123 0.2615694 8.246699 0.4919115 3.888330 8.081301
2 JPN502.txt education     4   997  283 120 0.2838516 8.962700 0.5160281 3.925777 8.308333
3 JPN503.txt education     3   561  207  70 0.3689840 8.739547 0.5566488 4.247772 8.014286
4 JPN504.txt    sports     4   817  245 114 0.2998776 8.571465 0.4998409 4.057528 7.166667
5 JPN505.txt    sports     4  1024  274 106 0.2675781 8.562500 0.5182812 3.676758 9.660377
6 JPN506.txt     money     3   638  197  93 0.3087774 7.799305 0.5082132 3.578370 6.860215

> anyNA(JPNindexTopic)
[1] TRUE

> str(JPNindexTopic)
'data.frame': 349 obs. of 11 variables:
 $ file : Factor w/ 349 levels "JPN501.txt","JPN502.txt",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Topic: Factor w/ 3 levels "education","money",..: 3 1 1 3 3 2 1 3 3 1 ...
 $ Score: int 4 4 3 4 4 3 4 3 4 3 ...
 $ Token: int 994 997 561 817 1024 638 1033 635 734 575 ...
 $ Type : int 260 283 207 245 274 197 274 185 203 201 ...
 $ NoS  : int 123 120 70 114 106 93 111 93 91 72 ...
 $ TTR  : num 0.262 0.284 0.369 0.3 0.268 ...
 $ GI   : num 8.25 8.96 8.74 8.57 8.56 ...
 $ MATTR: num 0.492 0.516 0.557 0.5 0.518 ...
 $ AWL  : num 3.89 3.93 4.25 4.06 3.68 ...
 $ ASL  : num 8.08 8.31 8.01 7.17 9.66 ...
}}

*Remove rows with missing values: 349 -> 347 observations

{{pre
> anyNA(JPNindexTopic)
[1] TRUE

> JPNindexTopic.b <- na.omit(JPNindexTopic)

> str(JPNindexTopic.b)
'data.frame': 347 obs. of 11 variables:
 $ file : Factor w/ 349 levels "JPN501.txt","JPN502.txt",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Topic: Factor w/ 3 levels "education","money",..: 3 1 1 3 3 2 1 3 3 1 ...
 $ Score: int 4 4 3 4 4 3 4 3 4 3 ...
 $ Token: int 994 997 561 817 1024 638 1033 635 734 575 ...
 $ Type : int 260 283 207 245 274 197 274 185 203 201 ...
 $ NoS  : int 123 120 70 114 106 93 111 93 91 72 ...
 $ TTR  : num 0.262 0.284 0.369 0.3 0.268 ...
 $ GI   : num 8.25 8.96 8.74 8.57 8.56 ...
 $ MATTR: num 0.492 0.516 0.557 0.5 0.518 ...
 $ AWL  : num 3.89 3.93 4.25 4.06 3.68 ...
 $ ASL  : num 8.08 8.31 8.01 7.17 9.66 ...
 - attr(*, "na.action")= 'omit' Named int 83 159
  ..- attr(*, "names")= chr "83" "159"
}}
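
*Sketch: locating the incomplete rows and dropping unused factor levels

The str() output above shows that na.omit() removed rows 83 and 159, while the file factor still reports 349 levels for 347 observations. The following is a minimal sketch of how those rows could be located before removal and how the leftover levels could be dropped afterwards. The data frame toy and its values are hypothetical stand-ins, not NICER data; only base R functions are used.

{{pre
# Toy stand-in for JPNindexTopic (hypothetical values, not the NICER data)
toy <- data.frame(
  file  = factor(c("a.txt", "b.txt", "c.txt", "d.txt")),
  Token = c(994L, 997L, NA, 817L),
  MATTR = c(0.49, NA, 0.56, 0.50)
)

# Row numbers that contain at least one NA
na_rows <- which(!complete.cases(toy))

# Number of NAs in each column
na_per_col <- colSums(is.na(toy))

# Drop the incomplete rows; the removed row numbers are kept
# in the "na.action" attribute of the result
toy.b <- na.omit(toy)

# na.omit() keeps unused factor levels; droplevels() removes them
toy.c <- droplevels(toy.b)

nrow(toy.b)          # 2
nlevels(toy.b$file)  # 4 (levels of the removed rows are still present)
nlevels(toy.c$file)  # 2
}}

Applied to the data above, the same calls would be which(!complete.cases(JPNindexTopic)) and droplevels(JPNindexTopic.b). Whether the unused levels matter depends on the later analysis; they mainly affect functions that tabulate or plot by factor level.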