Interestingly, removing only ingroup positive sentences leads to a reduction in both ingroup solidarity and outgroup hostility.
5/
We found that nearly all base models, and some instruction- and preference-tuned models, showed clear signs of ingroup favoritism and outgroup derogation.
3/