skimr

skimr: display a readable overview of a data frame

  • Use it in place of summary() or str().
  • Note that the package name ends in r (skimr) but the function does not: the command is skim().
> skim(jpn4.b)
-- Data Summary ------------------------
                           Values
Name                       jpn4.b
Number of rows             285   
Number of columns          10    
_______________________          
Column type frequency:           
  factor                   1     
  numeric                  9     
________________________         
Group variables            None  

-- Variable type: factor -------------------------------------------------------
  skim_variable n_missing complete_rate ordered n_unique top_counts                    
1 file                  0             1 FALSE        285 JPN: 1, JPN: 1, JPN: 1, JPN: 1

-- Variable type: numeric ------------------------------------------------------
  skim_variable n_missing complete_rate    mean      sd     p0     p25     p50     p75    p100 hist 
1 Score                 0             1   3.64   0.765   1       3       4       4       5     ▁▁▇▇▂
2 Token                 0             1 293.    98.8    90     221     275     337     736     ▃▇▃▁▁
3 Type                  0             1 131.    33.1    50     107     127     151     252     ▂▇▆▂▁
4 NoS                   0             1  22.9    7.19    7      18      21      27      51     ▂▇▃▁▁
5 TTR                   0             1   0.461  0.0660  0.252   0.417   0.460   0.509   0.658 ▁▃▇▅▁
6 GI                    0             1   7.67   0.976   4.82    7.03    7.64    8.41   10.5   ▁▅▇▅▁
7 MATTR                 0             1   0.635  0.0455  0.436   0.607   0.639   0.664   0.759 ▁▁▆▇▁
8 AWL                   0             1   4.37   0.333   3.37    4.12    4.37    4.58    5.39  ▁▅▇▃▁
9 ASL                   0             1  13.0    2.90    7.04   11.0    12.5    14.8    24.3   ▃▇▃▁▁
  • For comparison, summary() shows the same data like this:
> summary(jpn4.b)
         file         Score           Token            Type            NoS             TTR               GI        
 JPN501.txt:  1   Min.   :1.000   Min.   : 90.0   Min.   : 50.0   Min.   : 7.00   Min.   :0.2519   Min.   : 4.820  
 JPN502.txt:  1   1st Qu.:3.000   1st Qu.:221.0   1st Qu.:107.0   1st Qu.:18.00   1st Qu.:0.4167   1st Qu.: 7.028  
 JPN503.txt:  1   Median :4.000   Median :275.0   Median :127.0   Median :21.00   Median :0.4602   Median : 7.638  
 JPN504.txt:  1   Mean   :3.635   Mean   :292.7   Mean   :130.8   Mean   :22.94   Mean   :0.4612   Mean   : 7.668  
 JPN505.txt:  1   3rd Qu.:4.000   3rd Qu.:337.0   3rd Qu.:151.0   3rd Qu.:27.00   3rd Qu.:0.5090   3rd Qu.: 8.408  
 JPN506.txt:  1   Max.   :5.000   Max.   :736.0   Max.   :252.0   Max.   :51.00   Max.   :0.6581   Max.   :10.453  
 (Other)   :279                                                                                                    
     MATTR             AWL             ASL       
 Min.   :0.4360   Min.   :3.368   Min.   : 7.04  
 1st Qu.:0.6072   1st Qu.:4.119   1st Qu.:11.04  
 Median :0.6385   Median :4.370   Median :12.49  
 Mean   :0.6346   Mean   :4.366   Mean   :12.99  
 3rd Qu.:0.6643   3rd Qu.:4.583   3rd Qu.:14.82  
 Max.   :0.7593   Max.   :5.388   Max.   :24.26  
  • And str() shows it like this:
> str(jpn4.b)
'data.frame':	285 obs. of  10 variables:
 $ file : Factor w/ 287 levels "JPN501.txt","JPN502.txt",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Score: int  4 4 3 4 4 3 4 3 4 3 ...
 $ Token: int  319 356 201 260 420 261 362 198 263 183 ...
 $ Type : int  135 161 121 140 175 124 151 98 104 99 ...
 $ NoS  : int  30 29 13 27 25 20 26 20 19 14 ...
 $ TTR  : num  0.423 0.452 0.602 0.538 0.417 ...
 $ GI   : num  7.56 8.53 8.53 8.68 8.54 ...
 $ MATTR: num  0.592 0.665 0.717 0.688 0.634 ...
 $ AWL  : num  4.3 4.23 4.75 4.76 4 ...
 $ ASL  : num  10.63 12.28 15.46 9.63 16.8 ...
 - attr(*, "na.action")= 'omit' Named int  83 159
  ..- attr(*, "names")= chr  "83" "159"
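
The comparison above can be reproduced with a short script. This is a minimal sketch, not the original session: the data frame jpn4.b is not available here, so the built-in iris data set is used as a stand-in, and the variable name sk is only for illustration. It assumes the skimr package is already installed.

```r
# install.packages("skimr")  # run once if skimr is not installed yet

library(skimr)     # the package is called skimr ...

# iris (built into R) stands in for jpn4.b, which is not available here
sk <- skim(iris)   # ... but the function is skim(), without the trailing r
print(sk)          # one row per variable, grouped by variable type

# Base R views of the same data frame, for comparison
summary(iris)
str(iris)
```

skim() returns one row per column of the data frame, so the result can be stored and inspected like the sk object here, instead of only being printed.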