Deep Learning Approaches for Detecting Text Generated by Artificial Intelligence

Authors

  • David BIRIS, Faculty of Mathematics and Computer Science, Babeș-Bolyai University, Cluj-Napoca, Romania. Email: david.biris1@stud.ubbcluj.ro

DOI:

https://doi.org/10.24193/subbi.2024.2.03

Keywords:

machine learning, chatbot, AIGT, detection

Abstract

Large language models have been a prominent topic of discussion and research for several years, and they have spread into many industries, especially education. Their popularity among students stems from their ability to give quick and reliable answers to questions on any topic. The use of these models to generate schoolwork poses a challenge to academic integrity. We investigate the development of AI capable of detecting AI-generated text and train different types of deep learning models on a mixed dataset containing human-written and AI-generated essays, as well as movie reviews and books. We experiment with LSTM (long short-term memory) networks and with fine-tuned transformer-based models. We achieve results close to the state of the art and, in some cases, surpass a few state-of-the-art models. For instance, in two different experiments, one of our models surpasses a state-of-the-art model on a set of student-written and generated essays by up to 5% in accuracy and by up to 4% in F1 score. Another of our models surpasses a state-of-the-art model on a set of essays, but only in precision, and only by 1%. These results indicate the potential of properly fine-tuned transformer-based models, as well as the importance of a well-prepared dataset.

Received by editors: 31 July 2024

2010 Mathematics Subject Classification. 68P15, 94A12

1998 CR Categories and Descriptors. I.2.7 [Artificial Intelligence]: Natural Language Processing – Text Analysis; I.2.6 [Artificial Intelligence]: Learning – Deep Learning; H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing – Content Analysis and Feature Selection

Published

2025-03-09

How to Cite

BIRIS, D. (2025). Deep Learning Approaches for Detecting Text Generated by Artificial Intelligence. Studia Universitatis Babeș-Bolyai Informatica, 69(2), 39–58. https://doi.org/10.24193/subbi.2024.2.03

Section

Articles
