Event extraction based on self-data augmentation with large language models

References

  1. Gao J, Zhao H, Yu C, Xu R (2023) Exploring the feasibility of ChatGPT for event extraction. arXiv preprint arXiv:2303.03836
  2. Feng SY, Gangal V, Wei J, Chandar S, Vosoughi S, Mitamura T, Hovy E (2021) A survey of data augmentation approaches for NLP. arXiv preprint arXiv:2105.03075
  3. Mikołajczyk A, Grochowski M (2018) Data augmentation for improving deep learning in image classification problem. In: 2018 international interdisciplinary PhD workshop (IIPhDW), pp 117–122. IEEE
  4. Wei J, Zou K (2019) EDA: easy data augmentation techniques for boosting performance on text classification tasks. arXiv preprint arXiv:1901.11196
  5. Zhang M, Jiang G, Liu S, Chen J, Zhang M (2024) LLM-assisted data augmentation for Chinese dialogue-level dependency parsing. Comput Linguist 1–24
  6. Møller AG, Pera A, Dalsgaard J, Aiello L (2024) The parrot dilemma: human-labeled vs. LLM-augmented data in classification tasks. In: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers), pp 179–192
  7. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
  8. Coulombe C (2018) Text data augmentation made simple by leveraging NLP cloud APIs. arXiv preprint arXiv:1812.04718
  9. Şahin GG, Steedman M (2019) Data augmentation via dependency tree morphing for low-resource languages. arXiv preprint arXiv:1903.09460
  10. Kruspe A, Kersten J, Wiegmann M, Stein B, Klan F (2018) Classification of incident-related tweets: tackling imbalanced training data using hybrid CNNs and translation-based data augmentation. Notebook papers of TREC
  11. Chen H, Yao X (2010) Multiobjective neural network ensembles based on regularized negative correlation learning. IEEE Trans Knowl Data Eng 22(12):1738–1751
  12. Wang X, Lyu S, Yang L, Zhan Y, Chen H (2024) A dual-module framework for counterfactual estimation over time. In: The 41st International Conference on Machine Learning. JMLR
  13. Sennrich R, Haddow B, Birch A (2015) Improving neural machine translation models with monolingual data. arXiv preprint arXiv:1511.06709
  14. Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2017) mixup: beyond empirical risk minimization. arXiv preprint arXiv:1710.09412
  15. Rizos G, Hemker K, Schuller B (2019) Augment to prevent: short-text data augmentation in deep learning for hate-speech classification. In: Proceedings of the 28th ACM international conference on information and knowledge management, pp 991–1000
  16. Chen H, Zhang L, Wang Z, Chang H, Xie X, Liangzheng F, Zhang Y, Quan F (2019) Resveratrol improved the developmental potential of oocytes after vitrification by modifying the epigenetics. Mol Reprod Dev 86(7):862–870
  17. Wang X, Ban T, Chen L, Usman M, Tianhao W, Chen Q, Chen H (2023) A distribution-based representation of knowledge quality. Knowl-Based Syst 281:111054
  18. Sun X, He J (2020) A novel approach to generate a large scale of supervised data for short text sentiment analysis. Multimed Tools Appl 79(9):5439–5459
  19. Qiu S, Binxia X, Zhang J, Wang Y, Shen X, De Melo G, Long C, Li X (2020) EasyAug: an automatic textual data augmentation platform for classification tasks. In: Companion proceedings of the web conference 2020, pp 249–252
  20. Chen H, Tang F, Tino P, Yao X (2013) Model-based kernel for efficient time series analysis. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 392–400
  21. Wang X, Chen L, Ban T, Usman M, Guan Y, Liu S, Tianhao W, Chen H (2021) Knowledge graph quality control: a survey. Fund Res 1(5):607–626
  22. Chen H, Tiňo P, Rodan A, Yao X (2013) Learning in the model space for cognitive fault diagnosis. IEEE Trans Neural Netw Learn Syst 25(1):124–136
  23. Wang X, Ban T, Chen L, Usman M, Guan Y, Lyu D, Cheng J, Chen H, Leung C, Miao C (2023) Decentralised knowledge graph evolution via blockchain. IEEE Trans Serv Comput
  24. Chen H, Tang F, Tino P, Cohn AG, Yao X (2015) Model metric co-learning for time series classification. In: Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, pp 3387–3394. AAAI Press
  25. Wang X, Chen L, Ban T, Lyu D, Guan Y, Xingyu W, Zhou X, Chen H (2023) Accurate label refinement from multiannotator of remote sensing data. IEEE Trans Geosci Remote Sens 61:1–13
  26. Walker M, Ji H, Stent A (eds) (2018) Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
  27. Chen H, Zhang H, Tang W, Xi Q, Xiaoyun L, Yu D, Chao L (2013) Thyroid function and morphology in overweight and obese children and adolescents in a Chinese population. J Pediatr Endocrinol Metab 26(5–6):489–496
  28. Wang X, Li Y, Ban T, Zhu J, Chen L, Usman M, Wang X, Chen H, Chen X, Leung C et al (2022) Dynamic link prediction for discovery of new impactful covid-19 research approaches. IEEE J Biomed Health Inform 26(12):5883–5894
  29. Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
  30. Anaby-Tavor A, Carmeli B, Goldbraich E, Kantor A, Kour G, Shlomov S, Tepper N, Zwerdling N (2020) Do not have enough data? Deep learning to the rescue! In: Proceedings of the AAAI conference on artificial intelligence 34:7383–7390
  31. Chen H, Tiňo P, Yao X (2014) Cognitive fault diagnosis in Tennessee Eastman process using learning in the model space. Comput Chem Eng 67:33–42
  32. Wang X, Lyu S, Xingyu W, Tianhao W, Chen H (2022) Generalization bounds for estimating causal effects of continuous treatments. Adv Neural Inform Process Syst 35:8605–8617
  33. Lee K, Guu K, He L, Dozat T, Chung HW (2021) Neural data augmentation via example extrapolation. arXiv preprint arXiv:2102.01335
  34. Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al (2020) Language models are few-shot learners. Adv Neural Inform Process Syst 33:1877–1901
  35. Yoo KM, Park D, Kang J, Lee S-W, Park W (2021) GPT3Mix: leveraging large-scale language models for text augmentation. arXiv preprint arXiv:2104.08826
  36. Kalpakchi D, Boye J (2023) Quasi: a synthetic question-answering dataset in Swedish using GPT-3 and zero-shot learning. In: The 24th Nordic Conference on Computational Linguistics (NoDaLiDa 2023), 22–24 May 2023, Tórshavn, Faroe Islands, pp 477–491
  37. Samuel V, Aynaou H, Chowdhury AG, Ramanan KV, Chadha A (2023) Can LLMs augment low-resource reading comprehension datasets? Opportunities and challenges. arXiv preprint arXiv:2309.12426
  38. Bai J, Bai S, Chu Y, Cui Z, Dang K, Deng X, Fan Y, Ge W, Han Y, Huang F, Hui B, Ji L, Li M, Lin J, Lin R, Liu D, Liu G, Lu C, Lu K, Ma J, Men R, Ren X, Ren X, Tan C, Tan S, Tu J, Wang P, Wang S, Wang W, Wu S, Xu B, Xu J, Yang A, Yang H, Yang J, Yang S, Yao Y, Yu B, Yuan H, Yuan Z, Zhang J, Zhang X, Zhang Y, Zhang Z, Zhou C, Zhou J, Zhou X, Zhu T (2023) Qwen technical report. arXiv preprint arXiv:2309.16609
  39. Zhao X, Li M, Lu W, Weber C, Lee JH, Chu K, Wermter S (2023) Enhancing zero-shot chain-of-thought reasoning in large language models through logic. arXiv preprint arXiv:2309.13339
