Tuesday, December 9, 2008

Good HTK Tutorial in Chinese

I found a good resource for an HTK demo.
All of the materials belong to the author of the blog and the HTK team.
The author is Su Tonghua (苏统华), Artificial Intelligence Research Laboratory, Harbin Institute of Technology, October 30, 2006.

Following Chapter 3 of the HTKBook, the 4 URLs above go through the steps below.

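Section 1 is Data Preparation
  • Step 1 - the Task Grammar (the grammar data used by the recognition model, gram -> wdnet)
  • Step 2 - the Dictionary (the dictionary data used by the recognition model; the dictionary translates the words in wlist into phone strings, e.g. one -> w ah n)
  • Step 3 - Recording the Data (record the speech files used for recognition; generate a script and record *.wav files from it)
  • Step 4 - Creating the Transcription Files (build the transcription files by translating the experiment data with the dictionary)
  • Step 5 - Coding the Data (encode the speech files, *.wav -> *.mfcc)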
Section 2 is Creating Monophone HMMs
  • Step 6 - Creating Flat Start Monophones
  • Step 7 - Fixing the Silence Models
  • Step 8 - Realigning the Training Data
Section 3 is Creating Tied-State Triphones
  • Step 9 - Making Triphones from Monophones
  • Step 10 - Making Tied-State Triphones
Section 4 is Recogniser Evaluation
  • Step 11 - Recognising the Test Data
... and so on.
The tutorial is very useful. I really appreciate the author for saving me a lot of time.
I plan to write a new one that follows the tutorial and adds my discoveries, my questions, and my opinions.
I am new to HTK. Please feel free to give me comments or suggestions.
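
As a small illustration of what Step 2 and Step 4 above are doing, here is a minimal sketch in plain Python. It is not HTK code and not taken from the tutorial. The toy dictionary and the function name `words_to_phones` are made up for this example; only the entry one -> w ah n comes from the step list above.

```python
# Hypothetical toy pronunciation dictionary (word -> list of phones).
# Only "one" comes from the post; the other entries are assumed for illustration.
TOY_DICT = {
    "one": ["w", "ah", "n"],
    "two": ["t", "uw"],
    "three": ["th", "r", "iy"],
}


def words_to_phones(words, dictionary):
    """Expand a word-level transcription into a flat list of phones."""
    phones = []
    for word in words:
        key = word.lower()
        if key not in dictionary:
            # A missing word is reported instead of being skipped silently.
            raise KeyError(f"word not in dictionary: {word}")
        phones.extend(dictionary[key])
    return phones


if __name__ == "__main__":
    # Word-level transcription -> phone-level transcription.
    print(" ".join(words_to_phones(["one", "two"], TOY_DICT)))
```

The same lookup idea is why the dictionary has to cover every word in the word list: a single missing word means the phone-level transcription cannot be built.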

6 comments:

Unknown said...

hi Nice to meet u, Howard^^
My name is Lip..i am a master research student from Malaysia..now i am doing the research about the speech therapy training for speech disorders...i like to apply htk engine in my system building..i tried to find various kind of source..actually really appreciate your sharing of this blog..haha learning a lot from your blog...now i try to run htk book chapter 3..i have implemented successfully until the step 10 which is Making Tied-State Triphones ...
but i still can't find a solution which is how to display recognition result in phoneme level?what i got the result all are shown in word-based level..i like to know which phoneme is recognize wrongly in phoneme level..is it possible to do it..?
Howard, i really appreciate your sharing and waiting for your favorable reply^^thanks..

Unknown said...

Hi, Lip. I am happy that the information is helpful to you. Unfortunately, I have not touched HTK for a long time, and almost all of my research material and books are at another place. I am sure that you can get results at the phoneme level, but you will have to write a script or some code to get them. Sorry about my late reply.

Unknown said...

Happy to see your reply..thanks Howard ^^
However...if got any related information when you get it about this ..could you please send them to me..? we can contact each others through email..my email and msn is weelipchin@yahoo.com..happy to know you yup..

Regards From Lip

Unknown said...
This comment has been removed by the author.
Unknown said...

Hi Howard ...what you mean.. is it that i need to code the perl script or code myself(add code)?Or htk has this function in engine already..what i need to do is find out this script?
(Using perl script such as mkclscript in htktutorial directory?)

Unknown said...

Hello, Lip.
I can't remember whether the demo code includes any processing at the phoneme level. I think it does not.
That is what I meant: you may need to write code to show the results and calculate the related information.

Everything I have is from the Internet. You will find more than I have, because I have not focused on this topic for a long while. :P
