randomForest (random forest)
library(randomForest)
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
Data (first 6 rows shown):
   Score Token  Type   NoS       TTR       GI     MATTR      AWL      ASL
  <fctr> <int> <int> <int>     <dbl>    <dbl>     <dbl>    <dbl>    <dbl>
1      4   319   135    30 0.4231975 7.558549 0.5921317 4.304075 10.63333
2      4   356   161    29 0.4522472 8.532983 0.6649157 4.233146 12.27586
3      3   201   121    13 0.6019900 8.534682 0.7170149 4.746269 15.46154
4      4   260   140    27 0.5384615 8.682431 0.6877692 4.761538  9.62963
5      4   420   175    25 0.4166667 8.539126 0.6341905 3.995238 16.80000
6      3   261   124    20 0.4750958 7.675407 0.6390038 4.072797 13.05000
jpn.RFmodel
Call:
randomForest(formula = Score ~ ., data = jpn.5c)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 2
OOB estimate of error rate: 17.19%
Confusion matrix:
  1 2   3   4  5 class.error
1 1 1   0   0  0   0.5000000
2 0 2   7   0  0   0.7777778
3 0 3 100  12  0   0.1304348
4 0 0  15 105  4   0.1532258
5 0 0   0   7 28   0.2000000
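"OOB estimate of error rate: 17.19%" is the proportion of wrong predictions: the share of training cases that are misclassified when each case is predicted only by the trees that did not use it in their bootstrap sample (out-of-bag).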
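The 17.19% figure can be checked by hand from the confusion matrix printed above. The following is a minimal sketch in base R; the matrix is typed in from the output (rows are the actual class, columns the predicted class), and the names `conf`, `oob_error`, and `class_error` are chosen here for illustration only.

```r
# Confusion matrix copied from the printed model
# (rows = actual class, columns = predicted class)
conf <- matrix(
  c(1, 1,   0,   0,  0,
    0, 2,   7,   0,  0,
    0, 3, 100,  12,  0,
    0, 0,  15, 105,  4,
    0, 0,   0,   7, 28),
  nrow = 5, byrow = TRUE,
  dimnames = list(actual = as.character(1:5),
                  predicted = as.character(1:5))
)

# Overall OOB error: off-diagonal cases divided by all cases
oob_error <- 1 - sum(diag(conf)) / sum(conf)

# Per-class error: share of each actual class that was misclassified
class_error <- 1 - diag(conf) / rowSums(conf)

round(100 * oob_error, 2)   # 17.19
round(class_error, 7)       # matches the class.error column
```

Of the 285 cases, 49 fall off the diagonal, which gives 49 / 285 = 17.19%. The per-class values show that the overall rate hides much weaker performance on the rare classes 1 and 2.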
importance(jpn.RFmodel)
      MeanDecreaseGini
Token        52.734429
Type         38.862079
NoS          14.514875
TTR          16.444470
GI           16.044448
MATTR         9.509776
AWL          15.182801
ASL          16.387543
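This shows how strongly each predictor influences the model, measured as the mean decrease in Gini impurity (MeanDecreaseGini); a larger value means a more influential predictor.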
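To make the ranking easier to read, the importance values can be sorted and expressed as shares of the total. This is a small base R sketch using the numbers from the output above; the vector `gini` and the column name `percent` are introduced here for illustration. With the fitted model at hand, the same values should be available as `importance(jpn.RFmodel)[, "MeanDecreaseGini"]`.

```r
# MeanDecreaseGini values copied from the importance() output
gini <- c(Token = 52.734429, Type = 38.862079, NoS = 14.514875,
          TTR = 16.444470, GI = 16.044448, MATTR = 9.509776,
          AWL = 15.182801, ASL = 16.387543)

# Sort from most to least important
ranked <- sort(gini, decreasing = TRUE)

# Express each value as a percentage of the summed importance
share <- round(100 * ranked / sum(ranked), 1)

data.frame(MeanDecreaseGini = ranked, percent = share)
```

Token and Type come out well ahead of the other six predictors, which sit close together. The package also provides `varImpPlot()` for a graphical view of the same ranking.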
Next, try predicting on test data (newdata is a data frame with the same predictor columns as the training data).
jpn.RFpredict = predict(jpn.RFmodel, newdata)
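The test data are not shown on this page, so the following is only a sketch of the whole predict-and-evaluate step under stated assumptions: the built-in `iris` data stand in for the essay data, `Species` stands in for `Score`, and the 30-row hold-out split and the names `train`, `test`, `model`, `pred`, `result`, and `accuracy` are illustrative choices, not the original workflow.

```r
library(randomForest)

set.seed(1)  # make the split and the forest reproducible

# Stand-in data (assumption): iris replaces the essay data,
# Species replaces Score
test_rows <- sample(nrow(iris), size = 30)
train <- iris[-test_rows, ]
test  <- iris[test_rows, ]

# Fit on the training rows only
model <- randomForest(Species ~ ., data = train)

# Predicted class for every held-out row
pred <- predict(model, newdata = test)

# Cross-tabulate actual vs. predicted and compute accuracy
result <- table(actual = test$Species, predicted = pred)
accuracy <- sum(diag(result)) / sum(result)

result
accuracy
```

Comparing this held-out accuracy with the OOB estimate is a useful sanity check: the two should be broadly similar when the test data come from the same population as the training data.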
https://sugiura-ken.org/wiki/