CRIE: A tool for analyzing Chinese text characteristics
Date
2012-11-15
Authors
Chen, J. L.
Cha, J. H.
Chang, T. H.
Sung, Y. T.
Hsieh, K. S.
Abstract
Language analysis of alphabetic writing systems has been studied for more than 40 years. Abundant research outcomes and practical needs have made automated text analysis tools possible; however, such tools for the Chinese writing system remain scarce. Following the recent trend toward multi-level analysis of text features (Graesser et al., 2004), we developed a tool called the Chinese Readability Indices Explorer (CRIE), which extracts 90 features based on the characteristics of Chinese characters, words, syntax, and cohesion. The modules used in CRIE include lexicons, segmentation, syntactic parsers, corpora, latent semantic analysis, and other components widely used in computational linguistics. In addition to providing multi-level linguistic feature analyses, CRIE can handle literary Chinese and domain-specific texts. CRIE outputs measures of individual linguistic features and also provides formulas for different text domains and age groups.
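To make the idea of character-level and word-level features concrete, here is a minimal Python sketch. It is not CRIE's code and does not reproduce any of its 90 features: the function name `text_features`, the feature names, and the sample sentence are all hypothetical. It assumes the text has already been segmented into words by some upstream step.

```python
from collections import Counter


def text_features(words):
    """Compute a few illustrative character- and word-level features.

    `words` is a list of already-segmented Chinese words. Segmentation is
    assumed to happen upstream; this sketch does not perform it.
    These features are illustrative and are NOT CRIE's actual feature set.
    """
    chars = [ch for w in words for ch in w]
    if not words or not chars:
        raise ValueError("words must contain at least one non-empty word")
    word_types = Counter(words)
    return {
        "char_count": len(chars),                          # character level
        "unique_char_ratio": len(set(chars)) / len(chars),  # character level
        "word_count": len(words),                          # word level
        "mean_word_length": len(chars) / len(words),       # word level
        "type_token_ratio": len(word_types) / len(words),  # word level
    }


# Hypothetical pre-segmented sample: "I like reading, I like writing"
sample = ["我", "喜歡", "閱讀", "我", "喜歡", "寫作"]
features = text_features(sample)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Syntax-level and cohesion-level features would need a parser and a semantic model respectively, so they are left out of this sketch.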