!!!sum contrasts (deviation coding)
{{outline}}
----

!Aliases
* sum coding
* deviation coding
* effect coding

!!options()
* options() is the command that gets and sets R's global options.
* Use it to check how contrasts is currently set:
{{pre
 options()$contrasts

        unordered           ordered 
"contr.treatment"      "contr.poly" 
}}
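* The contrasts option has two entries: one for unordered factors and one for ordered factors.

* The option can be changed globally with options(contrasts = ...), but the new setting then applies to every model fitted afterwards in the session. Restore the default (contr.treatment / contr.poly) when finished, or later analyses will silently use the wrong coding.
* Contrasts assigned to an individual factor are a separate setting. To remove them so that the factor falls back to the global option (flevels stands for the factor):
 contrasts(flevels) <- NULL

* Passing contrasts as an argument in the model call affects only that model, so it is the safer choice.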
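
The two approaches above can be sketched as follows. This is a minimal illustration that assumes the built-in warpbreaks data set as a stand-in; the data and the variable names fit_global and fit_local are not from the original analysis.

```r
# Built-in data set; wool is a two-level factor
data(warpbreaks)

# Approach 1: change the global option, fit, then restore it
old <- options(contrasts = c("contr.sum", "contr.poly"))  # returns the previous setting
fit_global <- lm(breaks ~ wool, data = warpbreaks)
options(old)                                               # put the previous setting back

# Approach 2: set the contrasts for this one model only
fit_local <- lm(breaks ~ wool, data = warpbreaks,
                contrasts = list(wool = "contr.sum"))

# Both fits use sum contrasts, and the global option is unchanged afterwards
all.equal(coef(fit_global), coef(fit_local))
getOption("contrasts")
```

Approach 2 leaves nothing to undo, which is why it is usually the less error-prone of the two.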


!! contr.sum
* Example from "Understanding Sum Contrasts for Regression Models: A Demonstration" (listed under References):
 lm(RT~Gender*Strategy, data = df, contrasts = list(Gender = "contr.sum", Strategy = "contr.sum"))


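* To avoid multicollinearity, one level is left out of the coding; by default this is the last level.
** Base R's contr.sum() always drops the last level and has no omit argument. The omit option appears to refer to contr_code_sum() in the faux package (see the vignette under References), which lets you name the level to drop explicitly.
 omit = "level to drop"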
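
In base R the same effect can be approximated by moving the chosen level to the end before calling contr.sum(). The helper below, sum_contrast_omit, is a hypothetical name introduced for illustration only; it is not part of base R or of faux.

```r
# Hypothetical helper: sum contrasts with an explicitly chosen omitted level
sum_contrast_omit <- function(lv, omit) {
  stopifnot(omit %in% lv)
  kept <- setdiff(lv, omit)
  m <- contr.sum(c(kept, omit))  # contr.sum() gives the last entry the row of -1s
  colnames(m) <- kept            # label each column with the level it codes
  m[lv, , drop = FALSE]          # put the rows back in the original level order
}

# Drop P2 instead of the last level
sum_contrast_omit(c("P1", "P2", "P3"), omit = "P2")
```

The returned matrix can be assigned with contrasts(f) <- ... in the same way as the output of contr.sum().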

!Step-by-step procedure

+ Check how the contrasts are currently set
{{pre
contrasts(NP.dat.jp2$Prompt)

   P2 P3 P4 P5 P6 P7 P8
P1  0  0  0  0  0  0  0
P2  1  0  0  0  0  0  0
P3  0  1  0  0  0  0  0
P4  0  0  1  0  0  0  0
P5  0  0  0  1  0  0  0
P6  0  0  0  0  1  0  0
P7  0  0  0  0  0  1  0
P8  0  0  0  0  0  0  1
}}
** The first level, P1, is 0 in every column (it is the baseline under treatment coding).
+ Display the levels
{{pre
levels(NP.dat.jp2$Prompt)

[1] "P1" "P2" "P3" "P4" "P5" "P6" "P7" "P8"
}}
+ Apply contr.sum() to these levels
{{pre
contr.sum(levels(NP.dat.jp2$Prompt))

   [,1] [,2] [,3] [,4] [,5] [,6] [,7]
P1    1    0    0    0    0    0    0
P2    0    1    0    0    0    0    0
P3    0    0    1    0    0    0    0
P4    0    0    0    1    0    0    0
P5    0    0    0    0    1    0    0
P6    0    0    0    0    0    1    0
P7    0    0    0    0    0    0    1
P8   -1   -1   -1   -1   -1   -1   -1
}}
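** The last level, P8, is -1 in every column.
** Each column therefore has a single 1 (for one of P1 to P7) and a -1 for P8, so every column sums to 0.
+ Replace the original contrast matrix with the sum-contrast matrix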
{{pre
contrasts(NP.dat.jp2$Prompt) <- contr.sum(levels(NP.dat.jp2$Prompt))
}}
+ Confirm that the contrasts have been replaced
{{pre
contrasts(NP.dat.jp2$Prompt)

   [,1] [,2] [,3] [,4] [,5] [,6] [,7]
P1    1    0    0    0    0    0    0
P2    0    1    0    0    0    0    0
P3    0    0    1    0    0    0    0
P4    0    0    0    1    0    0    0
P5    0    0    0    0    1    0    0
P6    0    0    0    0    0    1    0
P7    0    0    0    0    0    0    1
P8   -1   -1   -1   -1   -1   -1   -1
}}
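
Because NP.dat.jp2 is not available here, the procedure above can be tried on a toy factor. The object prompt below is hypothetical stand-in data with the same eight level names, used only to make the steps reproducible.

```r
# Toy stand-in for NP.dat.jp2$Prompt (hypothetical data, same 8 level names)
prompt <- factor(rep(paste0("P", 1:8), times = 3))

contrasts(prompt)                               # default treatment coding: P1 is all 0
contrasts(prompt) <- contr.sum(levels(prompt))  # switch to sum contrasts
cm <- contrasts(prompt)

colSums(cm)    # every column sums to 0
cm["P8", ]     # the last level is coded -1 in every column
```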


!!References
* [Understanding Sum Contrasts for Regression Models: A Demonstration|https://rpubs.com/monajhzhu/608609]
* https://www.rcps.jp/doku.php?id=%E3%83%A1%E3%83%A2:%E3%83%9E%E3%83%AB%E3%83%81%E3%83%AC%E3%83%99%E3%83%AB%E3%83%A2%E3%83%87%E3%83%AB:%E4%B8%80%E8%88%AC%E7%B7%9A%E5%BD%A2%E3%83%A2%E3%83%87%E3%83%AB
* https://stats.oarc.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/#DEVIATION
* https://cran.r-project.org/web/packages/faux/vignettes/contrasts.html

!Brehm and Alday (2022) Contrast coding in a decade of mixed models
Journal of Memory and Language 125: 1-13

https://osf.io/jkpxt/

[AMLaP 2020|https://mediaup.uni-potsdam.de/Play/Chapter/223]

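! Sum coding with three or more levels
* The intercept is the grand mean (the unweighted mean of the means of all levels of the variable).
* Each fixed effect shows whether that level differs significantly from the grand mean.
* One of the levels serves as the reference level and does not appear in the output.
* Choose the level closest to the grand mean as the reference level.
** Because that level is closest to the grand mean, it is the least informative one when the question is whether a level differs from the mean (it differs the least).
* Coded this way, the model shows how far each individual level lies from the overall mean.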
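
The interpretation above can be checked numerically. The sketch below uses simulated data with unequal group sizes (the data frame d and its columns are assumptions for illustration) and compares the lm() coefficients with the level means.

```r
# Hypothetical data: three levels with unequal group sizes
set.seed(1)
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), times = c(5, 8, 11))),
  y = rnorm(24)
)
fit <- lm(y ~ g, data = d, contrasts = list(g = "contr.sum"))

level_means <- tapply(d$y, d$g, mean)
grand_mean  <- mean(level_means)   # mean of the level means, not mean(d$y)

# Intercept minus grand mean: zero up to rounding error
coef(fit)[["(Intercept)"]] - grand_mean

# g1 and g2 are the deviations of levels a and b from the grand mean
coef(fit)[c("g1", "g2")] - (level_means[c("a", "b")] - grand_mean)

# The level that is not shown (c) is recovered as minus the sum of the others
-sum(coef(fit)[c("g1", "g2")])
```

With unbalanced data the grand mean differs from the overall mean of y, which is why the intercept is compared with the mean of the level means.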

! Guidelines
* Whenever a categorical variable is used, state explicitly which contrasts were applied.
* State this even when the default treatment coding was used.
** Example: "Factor A (magenta, green) was treatment coded" or "the three levels of Factor B, coffee, tea, and cocoa, were coded with two contrasts: (.25, .25, -.5) and (.5, -.5, 0)"
* Then restate in plain words what is being compared with what.
** Example 1: "The model intercept therefore reflects the reference level of factor A, magenta"
** Example 2: "The first contrast tests caffeinated versus non-caffeinated beverages, and the second tests coffee versus tea"
* If the contrasts are set up appropriately, post-hoc tests are not required.
** Run them with emmeans only when they are needed.
*** In that case, say so explicitly. Example: "An additional set of pairwise comparisons was performed to directly compare tea versus cocoa using the R package emmeans."
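
As an illustration of the Factor B example quoted above, the two contrasts can be set by hand in base R. The data frame d, the column names caffeine and coffee_tea, and the simulated response are assumptions made for this sketch only.

```r
# Hypothetical data for the quoted coffee / tea / cocoa example
set.seed(2)
d <- data.frame(
  drink = factor(rep(c("coffee", "tea", "cocoa"), each = 10),
                 levels = c("coffee", "tea", "cocoa")),  # fix the level order explicitly
  y = rnorm(30)
)

# Rows follow the level order coffee, tea, cocoa
cm <- cbind(caffeine   = c(.25, .25, -.5),   # caffeinated vs. non-caffeinated
            coffee_tea = c(.5, -.5, 0))      # coffee vs. tea
contrasts(d$drink) <- cm

fit <- lm(y ~ drink, data = d)
m   <- tapply(d$y, d$drink, mean)
b   <- unname(coef(fit))                     # intercept, contrast 1, contrast 2

# Contrast 2 equals the coffee minus tea difference in means
b[3] - (m[["coffee"]] - m[["tea"]])

# Contrast 1 is proportional (factor 4/3) to mean(coffee, tea) minus cocoa
b[2] - (4 / 3) * ((m[["coffee"]] + m[["tea"]]) / 2 - m[["cocoa"]])
```

Both differences are zero up to rounding error. The scaling of a contrast changes the size of the estimate but not the hypothesis being tested, which is one more reason to report the exact coding used.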