Hidden Markov Model for Text Analysis

Student: Tun Tao Tsai
Advisor: Dr. Mark Stamp
Committee Member: Dr. Jeff Smith
Committee Member: Dr. Chris Pollett
Department of Computer Science, San Jose State University
Email: joetsai@

Contents

Abstract
1. Introduction
2. The basic probability theory
2. The Cave and Neuwirth experiment
3. Hidden Markov model
   The Markov property
   The Hidden Markov model definition
   Finding the probability of the observed sequence
   The forward-backward algorithm
   The forward recursion
   The backward recursion
   Choosing the best state sequence
   Parameter re-estimation
4. Chinese information processing
5. Phonology
   Chinese phonemic transcription
   English phoneme transcription
6. Experiment design and the software
   Number of iterations
   Number of states
   Chinese corpus
   The software
7. Experiment results
   English alphabet experiment results
   English phoneme results
   English phoneme experiment using two states
   English phoneme experiment with more than two states
   Chinese characters experiment results
   Zhuyin experiment results
   Entropy
8. Summary and conclusions
9. Future work
Reference
Appendix 1: Experiment results
Appendix 2: Entropy experiment results
Appendix 3: Brown University corpus
Appendix 4: Chinese character encoding
Appendix 5: Zhuyin – Pinyin conversion table
Appendix 6: CMU pronouncing dictionary phoneme chart
Appendix 7: Trial experiment results to determine the number of iterations to use

Abstract

In the field of natural language processing, the Hidden Markov Model (hereafter HMM) method has proven useful for finding patterns in sequences of data. In this study, we apply the HMM technique to the Brown corpus [1], the Brown corpus in i