Developing Simplified Chinese Psychological Linguistic Analysis Dictionary for Microblog

Abstract

The words people use can reveal their emotional states, intentions, thinking styles, and individual differences. LIWC (Linguistic Inquiry and Word Count) has been widely used for psychological text analysis, and its dictionary is the core of the tool. A Traditional Chinese version of the LIWC dictionary, translated from the English LIWC dictionary, has already been released. However, Simplified Chinese, the world’s most widely used language, differs subtly from Traditional Chinese. Furthermore, both the English and the Traditional Chinese LIWC dictionaries were developed for relatively formal text. Microblogging has become increasingly popular in China, yet the original LIWC dictionaries give little consideration to words popular on microblogs, which makes them less applicable to microblog text analysis. In this study, a Simplified Chinese LIWC dictionary is established according to the LIWC categories. After the Traditional Chinese dictionary was translated into Simplified Chinese, the five thousand words most frequently used on microblogs were added to it. Four graduate students of psychology rated whether each word belonged in a category, and the same four judges tested the reliability and validity of the Simplified Chinese LIWC dictionary. The new dictionary could support future text analysis of microblogs.
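
The rating and counting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the 3-of-4 acceptance threshold, the helper names accept_by_vote and category_proportions, the sample words, and the category labels are all hypothetical, and the input text is assumed to be already segmented into words.

```python
from collections import Counter


def accept_by_vote(ratings, min_votes=3):
    """Keep (word, category) pairs endorsed by at least `min_votes` judges.

    `ratings` maps (word, category) -> list of 0/1 votes, one per judge.
    The 3-of-4 threshold is an assumption, not a rule stated in the paper.
    """
    dictionary = {}
    for (word, category), votes in ratings.items():
        if sum(votes) >= min_votes:
            dictionary.setdefault(word, set()).add(category)
    return dictionary


def category_proportions(tokens, dictionary):
    """Return the share of tokens that fall in each category (LIWC-style)."""
    counts = Counter()
    for token in tokens:
        for category in dictionary.get(token, ()):
            counts[category] += 1
    total = len(tokens)
    return {c: n / total for c, n in counts.items()} if total else {}


# Hypothetical votes from four judges; words and labels are illustrative only.
ratings = {
    ("开心", "posemo"): [1, 1, 1, 1],
    ("给力", "posemo"): [1, 1, 1, 0],
    ("郁闷", "negemo"): [1, 1, 1, 1],
    ("给力", "negemo"): [0, 0, 1, 0],
}
liwc = accept_by_vote(ratings)

tokens = ["今天", "真", "给力", "开心"]  # assumed to be already word-segmented
print(category_proportions(tokens, liwc))
```

Scoring each category as a proportion of all tokens, rather than as a raw count, keeps short microblog posts comparable with longer ones.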

Author information

Authors and Affiliations

  1. Institute of Psychology, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, 100190, P.R. China
    Rui Gao, Bibo Hao, Yusong Gao & Tingshao Zhu
  2. National Computer System Engineering Research Institute of China, Beijing, 100083, P.R. China
    He Li

Editor information

Editors and Affiliations

  1. Department of Systems Life Engineering, Maebashi Institute of Technology, 460-1 Kamisadori-cho, 371-0816, Maebashi, Gunma, Japan
    Kazuyuki Imamura
  2. Electronics-Inspired Interdisciplinary Research Institute, Toyohashi University of Technology, 1-1 Hibarigaoka Tenpaku-cho, 441-8580, Toyohashi, Aichi, Japan
    Shiro Usui
  3. Department of Neurobiology and Behavior, Gunma University Graduate School of Medicine, 39-22 Showamachi 3-chome, 371-8511, Maebashi, Gunma, Japan
    Tomoaki Shirao
  4. The Smith Kettlewell Eye Research Institute, 2318 Fillmore Street, 94115, San Francisco, CA, USA
    Takuji Kasamatsu
  5. University of Rostock, 18051, Rostock, Germany
    Lars Schwabe
  6. Department of Life Science and Informatics, Maebashi Institute of Technology, 460-1 Kamisadori-cho, 371-0816, Maebashi, Gunma, Japan
    Ning Zhong

Rights and permissions

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Gao, R., Hao, B., Li, H., Gao, Y., Zhu, T. (2013). Developing Simplified Chinese Psychological Linguistic Analysis Dictionary for Microblog. In: Imamura, K., Usui, S., Shirao, T., Kasamatsu, T., Schwabe, L., Zhong, N. (eds) Brain and Health Informatics. BHI 2013. Lecture Notes in Computer Science, vol 8211. Springer, Cham. https://doi.org/10.1007/978-3-319-02753-1_36
