References
[1] 蕭宇, “台男悲歌?「母豬教」論述中的陽剛焦慮與厭女邏輯,” Master’s thesis, 高雄醫學大學, Jan 2021.
[2] Y. Li, M. Du, R. Song, X. Wang, and Y. Wang, “A survey on fairness in large language models,” 2024.
[3] Y. Chang, X. Wang, J. Wang, Y. Wu, L. Yang, K. Zhu, H. Chen, X. Yi, C. Wang, Y. Wang et al., “A survey on evaluation of large language models,” ACM Transactions on Intelligent Systems and Technology.
[4] I. O. Gallegos, R. A. Rossi, J. Barrow, M. M. Tanjim, S. Kim, F. Dernoncourt, T. Yu, R. Zhang, and N. K. Ahmed, “Bias and fairness in large language models: A survey,” 2024.
[5] S. Barikeri, A. Lauscher, I. Vulić, and G. Glavaš, “RedditBias: A real-world resource for bias evaluation and debiasing of conversational language models,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), C. Zong, F. Xia, W. Li, and R. Navigli, Eds. Online: Association for Computational Linguistics, Aug. 2021, pp. 1941–1955. [Online]. Available: https://aclanthology.org/2021.acl-long.151
[6] J. Zhao, M. Fang, Z. Shi, Y. Li, L. Chen, and M. Pechenizkiy, “CHBias: Bias evaluation and mitigation of Chinese conversational language models,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computational Linguistics, Jul. 2023, pp. 13538–13556. [Online]. Available: https://aclanthology.org/2023.acl-long.757
[7] W. Zhong, R. Cui, Y. Guo, Y. Liang, S. Lu, Y. Wang, A. Saied, W. Chen, and N. Duan, “AGIEval: A human-centric benchmark for evaluating foundation models,” 2023.
[8] H. Zeng, “Measuring massive multitask Chinese understanding,” 2023.
[9] Y. Huang, Y. Bai, Z. Zhu, J. Zhang, J. Zhang, T. Su, J. Liu, C. Lv, Y. Zhang, J. Lei, Y. Fu, M. Sun, and J. He, “C-Eval: A multi-level multi-discipline Chinese evaluation suite for foundation models,” 2023.
[10] H. Li, Y. Zhang, F. Koto, Y. Yang, H. Zhao, Y. Gong, N. Duan, and T. Baldwin, “CMMLU: Measuring massive multitask language understanding in Chinese,” 2023.
[11] X. Zhang, C. Li, Y. Zong, Z. Ying, L. He, and X. Qiu, “Evaluating the performance of large language models on GAOKAO benchmark,” 2023.
[12] L. Xu, H. Hu, X. Zhang, L. Li, C. Cao, Y. Li, Y. Xu, K. Sun, D. Yu, C. Yu, Y. Tian, Q. Dong, W. Liu, B. Shi, Y. Cui, J. Li, J. Zeng, R. Wang, W. Xie, Y. Li, Y. Patterson, Z. Tian, Y. Zhang, H. Zhou, S. Liu, Z. Zhao, Q. Zhao, C. Yue, X. Zhang, Z. Yang, K. Richardson, and Z. Lan, “CLUE: A Chinese language understanding evaluation benchmark,” in Proceedings of the 28th International Conference on Computational Linguistics, D. Scott, N. Bel, and C. Zong, Eds. Barcelona, Spain (Online): International Committee on Computational Linguistics, Dec. 2020, pp. 4762–4772. [Online]. Available: https://aclanthology.org/2020.coling-main.419
[13] L. Xu, A. Li, L. Zhu, H. Xue, C. Zhu, K. Zhao, H. He, X. Zhang, Q. Kang, and Z. Lan, “SuperCLUE: A comprehensive Chinese large language model benchmark,” 2023.
[14] Z. Gu, X. Zhu, H. Ye, L. Zhang, J. Wang, Y. Zhu, S. Jiang, Z. Xiong, Z. Li, W. Wu, Q. He, R. Xu, W. Huang, J. Liu, Z. Wang, S. Wang, W. Zheng, H. Feng, and Y. Xiao, “Xiezhi: An ever-updating benchmark for holistic domain knowledge evaluation,” 2024.
[15] C. C. Shao, T. Liu, Y. Lai, Y. Tseng, and S. Tsai, “DRCD: A Chinese machine reading comprehension dataset,” 2019.
[16] P. Ennen, P.-C. Hsu, C.-J. Hsu, C.-L. Liu, Y.-C. Wu, Y.-H. Liao, C.-T. Lin, D.-S. Shiu, and W.-Y. Ma, “Extending the pre-training of BLOOM for improved support of Traditional Chinese: Models, methods and results,” 2023.
[17] S.-B. Luo, C.-C. Fan, K.-Y. Chen, Y. Tsao, H.-M. Wang, and K.-Y. Su, “Chinese movie dialogue question answering dataset,” in Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022), Y.-C. Chang and Y.-C. Huang, Eds. Taipei, Taiwan: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP), Nov. 2022, pp. 7–14. [Online]. Available: https://aclanthology.org/2022.rocling-1.2
[18] C.-J. Hsu, C.-L. Liu, F.-T. Liao, P.-C. Hsu, Y.-C. Chen, and D.-S. Shiu, “Advancing the evaluation of Traditional Chinese language models: Towards a comprehensive benchmark suite,” 2023.
[19] Z.-R. Tam, Y.-T. Pai, Y.-W. Lee, S. Cheng, and H.-H. Shuai, “An improved Traditional Chinese evaluation suite for foundation model,” arXiv preprint arXiv:2403.01858, 2024.
[20] Y. Liu, Y. Yao, J.-F. Ton, X. Zhang, R. Guo, H. Cheng, Y. Klochkov, M. F. Taufiq, and H. Li, “Trustworthy LLMs: A survey and guideline for evaluating large language models’ alignment,” 2024.
[21] L. Sun, Y. Huang, H. Wang, S. Wu, Q. Zhang, Y. Li, C. Gao, Y. Huang, W. Lyu, Y. Zhang, X. Li, Z. Liu, Y. Liu, Y. Wang, Z. Zhang, B. Vidgen, B. Kailkhura, C. Xiong, C. Xiao, C. Li, E. Xing, F. Huang, H. Liu, H. Ji, H. Wang, H. Zhang, H. Yao, M. Kellis, M. Zitnik, M. Jiang, M. Bansal, J. Zou, J. Pei, J. Liu, J. Gao, J. Han, J. Zhao, J. Tang, J. Wang, J. Vanschoren, J. Mitchell, K. Shu, K. Xu, K.-W. Chang, L. He, L. Huang, M. Backes, N. Z. Gong, P. S. Yu, P.-Y. Chen, Q. Gu, R. Xu, R. Ying, S. Ji, S. Jana, T. Chen, T. Liu, T. Zhou, W. Wang, X. Li, X. Zhang, X. Wang, X. Xie, X. Chen, X. Wang, Y. Liu, Y. Ye, Y. Cao, Y. Chen, and Y. Zhao, “TrustLLM: Trustworthiness in large language models,” 2024.
[22] T. Hagendorff, “Mapping the ethics of generative AI: A comprehensive scoping review,” 2024.
[23] H. Sun, Z. Zhang, J. Deng, J. Cheng, and M. Huang, “Safety assessment of Chinese large language models,” 2023.
[24] G. Xu, J. Liu, M. Yan, H. Xu, J. Si, Z. Zhou, P. Yi, X. Gao, J. Sang, R. Zhang, J. Zhang, C. Peng, F. Huang, and J. Zhou, “CValues: Measuring the values of Chinese large language models from safety to responsibility,” 2023.
[25] Z. Zhang, L. Lei, L. Wu, R. Sun, Y. Huang, C. Long, X. Liu, X. Lei, J. Tang, and M. Huang, “SafetyBench: Evaluating the safety of large language models with multiple choice questions,” arXiv preprint arXiv:2309.07045, 2023.
[26] W. Wang, Z. Tu, C. Chen, Y. Yuan, J.-t. Huang, W. Jiao, and M. R. Lyu, “All languages matter: On the multilingual safety of large language models,” 2023.
[27] L. Xu, K. Zhao, L. Zhu, and H. Xue, “SC-Safety: A multi-round open-ended question adversarial safety benchmark for large language models in Chinese,” 2023.
[28] K. Huang, X. Liu, Q. Guo, T. Sun, J. Sun, Y. Wang, Z. Zhou, Y. Wang, Y. Teng, X. Qiu, Y. Wang, and D. Lin, “Flames: Benchmarking value alignment of Chinese large language models,” 2023.
[29] Y. Huang and D. Xiong, “CBBQ: A Chinese bias benchmark dataset curated with human-AI collaboration for large language models,” 2023.
[30] 王甫昌, “台灣弱勢族群意識發展之歷史過程考察,” 台灣文學研究, no. 4, pp. 60–83, Jun 2013.
[31] 謝國斌, “台灣族群研究的發展,” 台灣原住民族研究學報, vol. 1, no. 1, pp. 1–27, Mar 2011.
[32] A. Caliskan, J. J. Bryson, and A. Narayanan, “Semantics derived automatically from language corpora contain human-like biases,” Science, vol. 356, no. 6334, pp. 183–186, Apr. 2017. [Online]. Available: http://dx.doi.org/10.1126/science.aal4230
[33] A. Lauscher, G. Glavaš, S. P. Ponzetto, and I. Vulić, “A general framework for implicit and explicit debiasing of distributional word vector spaces,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, pp. 8131–8138, Apr. 2020. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/6325
[34] 鍾佳玲, “族群通婚中的性別文化與權力配置,” Master’s thesis, 國立中央大學, Jan 2007.
[35] 王雯君, “婚姻對女性族群認同的影響-以台灣閩客通婚為例,” 思與言:人文與社會科學期刊, vol. 43, no. 2, pp. 119–178, Jun 2005.
[36] 徐振傑, “女性商品,男性代言電視廣告中的「新」男性形象與再現意涵,” 傳播與管理研究, vol. 3, no. 2, pp. 133–159, Jan 2004.
[37] 胡瓊勻, “汽車廣告所再現的「新女性」形象,” 文化事業與管理研究, vol. 23, no. 2, pp. 28–46, Apr 2023.
[38] 林吟鴻, “臺灣民眾性別角色態度之研究,” Master’s thesis, 國立臺灣大學, Jan 2011.
[39] 朱蘭慧, “男性性別角色刻板印象之形成與鬆動,” 應用心理研究, no. 17, pp. 85–119, Mar 2003.
[40] 賴禹安, “受害不失男子氣概?遭偷拍男性的受害經驗初探,” Master’s thesis, 國立政治大學, Jan 2021.
[41] 蔡沛毓, “不同世代男性在婚姻關係中的性別角色展演,” Master’s thesis, 台南應用科技大學, Jan 2013.
[42] 高穎超, “做兵、儀式、男人類:台灣義務役男服役過程之陽剛氣質研究 (2000-2006),” Master’s thesis, 國立臺灣大學, Jan 2006.
[43] 孫子靖 and 呂明蓁, “「厭娘」與「拒C」?—大專校院學生性別刻板印象之探究,” 性別平等教育季刊, no. 96, pp. 153–156, Jan 2022.
[44] 林慧慈, “從粉色浪潮談刻板印象、偏見與歧視,” 清流雙月刊, no. 28, pp. 35–39, Jul 2020.
[45] 郭爵菀, “男性科技工程師婚姻觀之研究-男性研究的理論觀點,” Master’s thesis, 國立暨南國際大學, Jan 2009.
[46] 黃淑玲, “男子性與喝花酒文化:以 Bourdieu 的性別支配理論為分析架構,” 台灣社會學, no. 5, pp. 73–132, Jun 2003.
[47] 台灣客家研究概論. 行政院客家委員會, 2007. [Online]. Available: https://books.google.com.tw/books?id=_aVThrwC7lUC
[48] 蔡芬芳, “性別、族群與客家研究,” 女學學誌:婦女與性別研究, no. 39, pp. 165–203, Dec 2016.
[49] 馮建彰, “臺鐵客家人的工作與生活,” 客.觀, no. 4, pp. 35–57, Aug 2023.
[50] 王雯君, “客家邊界:客家意象的詮釋與重建,” 東吳社會學報, no. 18, pp. 117–156, Jun 2005.
[51] 孫榮光, “電視綜藝節目的象徵暴力與客籍藝人的生存心態:以小鐘、澎澎為例,” 人文社會科學研究, vol. 10, no. 4, pp. 23–43, Dec 2016.
[52] 莊雅涵, 王奕婷, and 吳偉立, “我們活在不同的台北?-台北政治暨文化性格: 1994-2002,” 政治科學論叢, no. 21, pp. 49–74, Sep 2004.
[53] 譚光鼎, “被扭曲的他者:教科書中原住民偏見的檢討,” 課程與教學, vol. 11, no. 4, pp. 27–49, Nov 2008.
[54] 呂枝益, “教科書中族群偏見的探討與革新,” 原住民教育季刊, no. 17, pp. 34–51, Feb 2000.
[55] 蘇船利, “當原住民學生遇到漢族老師,” 師友月刊, no. 468, pp. 40–43, Jun 2006.
[56] 童一寧, “外省第三代的國家認同,” Master’s thesis, 國立臺灣大學, Jan 2005.
[57] 張寧珈, “感性與理性:外省族群意識、民族認同與國家想像,” Master’s thesis, 國立臺灣大學, Jan 2018.
[58] 吳永毅, “香蕉. 豬公. 國:「返鄉」電影中的外省人國家認同,” 中外文學, vol. 22, no. 1, pp. 32–44, Jun 1993.
[59] 陳師孟, “外省族群與統獨迷思,” 新使者, no. 50, pp. 21–24, Feb 1999.
[60] J. Cohen, “A coefficient of agreement for nominal scales,” Educational and Psychological Measurement, vol. 20, no. 1, pp. 37–46, 1960. [Online]. Available: https://doi.org/10.1177/001316446002000104
[61] D. Colquhoun, “The reproducibility of research and the misinterpretation of p values,” Royal Society Open Science, vol. 4, no. 12, p. 171085, Dec. 2017.
[62] H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, D. Bikel, L. Blecher, C. C. Ferrer, M. Chen, G. Cucurull, D. Esiobu, J. Fernandes, J. Fu, W. Fu, B. Fuller, C. Gao, V. Goswami, N. Goyal, A. Hartshorn, S. Hosseini, R. Hou, H. Inan, M. Kardas, V. Kerkez, M. Khabsa, I. Kloumann, A. Korenev, P. S. Koura, M.-A. Lachaux, T. Lavril, J. Lee, D. Liskovich, Y. Lu, Y. Mao, X. Martinet, T. Mihaylov, P. Mishra, I. Molybog, Y. Nie, A. Poulton, J. Reizenstein, R. Rungta, K. Saladi, A. Schelten, R. Silva, E. M. Smith, R. Subramanian, X. E. Tan, B. Tang, R. Taylor, A. Williams, J. X. Kuan, P. Xu, Z. Yan, I. Zarov, Y. Zhang, A. Fan, M. Kambadur, S. Narang, A. Rodriguez, R. Stojnic, S. Edunov, and T. Scialom, “Llama 2: Open foundation and fine-tuned chat models,” 2023.
[63] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. de las Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, L. R. Lavaud, M.-A. Lachaux, P. Stock, T. L. Scao, T. Lavril, T. Wang, T. Lacroix, and W. E. Sayed, “Mistral 7B,” 2023.
[64] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, J. Schulman, J. Hilton, F. Kelton, L. Miller, M. Simens, A. Askell, P. Welinder, P. F. Christiano, J. Leike, and R. Lowe, “Training language models to follow instructions with human feedback,” in Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35. Curran Associates, Inc., 2022, pp. 27730–27744. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/file/b1efde53be364a73914f58805a001731-Paper-Conference.pdf
[65] Y.-T. Lin and Y.-N. Chen, “Taiwan LLM: Bridging the linguistic divide with a culturally aligned language model,” 2023.
[66] C.-J. Hsu, C.-L. Liu, F.-T. Liao, P.-C. Hsu, Y.-C. Chen, and D.-S. Shiu, “Breeze-7B technical report,” 2024.
[67] TAIDE, “Taide-lx-7b-chat,” 2024. [Online]. Available: https://huggingface.co/taide/TAIDE-LX-7B-Chat
[68] L. Zheng, W.-L. Chiang, Y. Sheng, S. Zhuang, Z. Wu, Y. Zhuang, Z. Lin, Z. Li, D. Li, E. Xing et al., “Judging LLM-as-a-judge with MT-Bench and Chatbot Arena,” Advances in Neural Information Processing Systems, vol. 36, 2024.
[69] E. Ferrara, “Fairness and bias in artificial intelligence: A brief survey of sources, impacts, and mitigation strategies,” Sci, vol. 6, no. 1, p. 3, Dec. 2023. [Online]. Available: http://dx.doi.org/10.3390/sci6010003
[70] J. Giner-Miguelez, A. Gómez, and J. Cabot, “DescribeML: A tool for describing machine learning datasets,” in Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings, ser. MODELS ’22. New York, NY, USA: Association for Computing Machinery, 2022, pp. 22–26. [Online]. Available: https://doi.org/10.1145/3550356.3559087
[71] A. Yohannis and D. Kolovos, “Towards model-based bias mitigation in machine learning,” in Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems, ser. MODELS ’22. New York, NY, USA: Association for Computing Machinery, 2022, pp. 143–153. [Online]. Available: https://doi.org/10.1145/3550355.3552401
[72] K. Webster, X. Wang, I. Tenney, A. Beutel, E. Pitler, E. Pavlick, J. Chen, and S. Petrov, “Measuring and reducing gendered correlations in pre-trained models,” CoRR, vol. abs/2010.06032, 2020. [Online]. Available: https://arxiv.org/abs/2010.06032
[73] M. Nadeem, A. Bethke, and S. Reddy, “StereoSet: Measuring stereotypical bias in pretrained language models,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), C. Zong, F. Xia, W. Li, and R. Navigli, Eds. Online: Association for Computational Linguistics, Aug. 2021, pp. 5356–5371. [Online]. Available: https://aclanthology.org/2021.acl-long.416
[74] Y. Cui, Z. Yang, and X. Yao, “Efficient and effective text encoding for Chinese LLaMA and Alpaca,” 2023.
[75] I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” in International Conference on Learning Representations, 2019. [Online]. Available: https://openreview.net/forum?id=Bkg6RiCqY7