The Grand Janitor's Blog (Arthur Chan): comments feed, updated 2023-09-13T05:43:54.943-07:00, showing comments 1 to 25 of 64

2016-04-29T02:51:10.936-07:00
What you are perhaps thinking of is adding a duration model to the HMM. That basically means that instead of the geometric distribution of length that a standard HMM implies, you want some other distribution of length.

The easiest way, though not the best, is perhaps N-best rescoring with your duration model. Say you first dump the 100-best list out of your decoder, then try to use your duration model to score ...
Arthur Chan

2016-04-06T18:18:38.701-07:00
How can a phoneme length parameter be introduced into a standard HMM?
Knowing the length of a specific phoneme can be used to limit the length of the HMM chain that represents it, which in turn increases the speed of recognition and its accuracy. So how can a phoneme length parameter be introduced into the standard HMM model to produce a new one for building an accurate ASR system with HTK?
thanks
Anonymous

2014-12-23T02:38:46.439-08:00
This comment has been removed by a blog administrator.
Anonymous

2013-11-16T16:35:00.802-08:00
Hint: Write one yourself.

The algorithm is in the literature, and if you look at the tools, it is quite unlikely you want to replicate what HTK or Sphinx already do.

Arthur
Arthur Chan

2013-11-16T16:33:03.716-08:00
Hey Richard,

Both HTK and Sphinx are great toolkits. If you are into ASR, read the source code of both; I believe you will gain great insights (and discover some idiosyncrasies) in the process.

Of course, don't forget Kaldi!

Arthur
Arthur Chan

2013-11-09T16:15:35.485-08:00
This comment has been removed by a blog administrator.
Ổn Áp Biến Áp Standa Chính Hãng

2013-11-08T16:43:02.794-08:00
This comment has been removed by a blog administrator.
Anonymous

2013-11-05T20:24:03.606-08:00
Thanks. Your post is a lot of fun. I will look at it much more carefully today. I have used sphinx - perhaps only the pre-compiled demonstrations.
Anyway, seeing your blog has inspired me to have another look at Sphinx.

Richard Mullins
telfer cronos

2013-07-06T12:51:35.941-07:00
Hey Peter,

Thanks for the note! I have updated the link. Come back from time to time!

Arthur
Arthur Chan

2013-07-06T04:49:04.959-07:00
Hi Arthur,

Simon developer here. Thanks for the kind words!

Simon now has a new homepage at http://simon.kde.org
It's also just called "Simon" ("Simon listens" was the NPO originally created to support development).
Please update the link in the side bar.

Thanks!

Best regards,
Peter
Anonymous

2013-06-27T16:46:51.919-07:00
Curious. Is there any reason why we are choosing SWIG instead of Cython?
Arthur Chan

2013-06-27T14:20:18.043-07:00
> Cython is probably the best

Cython is being dropped from pocketsphinx right now. The common bindings for Java, Python, and Ruby will be implemented through SWIG.
Nickolay Shmyrev

2013-05-04T15:59:00.539-07:00
City, no problem. Check out this blog from time to time!

Arthur
Arthur Chan

2013-04-01T07:34:15.648-07:00
Hey Pranav,

Thanks for your comment. Let me check out your link and see if I can dig up something.

Arthur
Arthur Chan

2013-04-01T07:32:52.557-07:00
Hey Jigar,

My two cents...

The determinant term in a multivariate Gaussian distribution serves to normalize the exponential term so that the density integrates to 1. Most of the systems I know preserve this mathematical nicety.

So what if you deviate from it? Say you have a Baum-Welch algorithm and trained a correct HMM with corresponding GMMs, but in ...
Arthur Chan

2013-04-01T04:09:57.162-07:00
Hey,

While computing the log likelihood, the denominator has a determinant of variance term (Mahalanobis distance).
Each of the variance values is on average 0.01, and multiplying 39 such terms would lead to 10^(-78), which is a very small number.
The log of the reciprocal of that number will be very high, which will totally overwhelm the values of (x - mu)^2.
My ...
Anonymous

2013-03-27T19:30:54.339-07:00
There was an EU project on interactive TV. See http://dicit.fbk.eu and
http://www.lms.lnt.de/research/projects/interactive_tv_frontend/

I wonder what happened to the technology once the project ended.
Pranav Jawale

2013-03-26T02:37:26.590-07:00
Hey Pranav,

Thanks for your comments. I noticed that your earlier comments weren't shown, so I decided to revert to an earlier template. Sorry for the missing comments.

Arthur
Arthur Chan

2013-03-26T02:14:05.721-07:00
This comment has been removed by the author.
Arthur Chan

2013-03-26T01:57:45.491-07:00
+1 for 'verify the correctness of an experiment before an experiment starts'.

Another important thing is to pay attention to the warnings (wrt SphinxTrain). At times there could be warnings like "senone never found" in the context-independent stage. Although training doesn't halt, it is better to modify the phoneset (by merging two or more phones into one) if one finds such ...
Pranav Jawale

2013-03-23T13:14:08.695-07:00
Hey Pranav,

Thanks for your comment!

I am certainly not an expert on DNN, though I would recommend reading Microsoft's paper on the topic, because I think they are probably the first group to work out the whole thing with more than 3000 hours of data.

If you want to know the theory, another good place to start is Prof. Hinton's Coursera ...
Arthur Chan

2013-03-23T13:06:59.312-07:00
Wow, you've got great enthusiasm about the field! Congratulations on the century...

Do you have some reading recommendations about DNN in ASR?
Pranav Jawale

2013-03-22T08:45:42.795-07:00
The short answer is that their internal structures are completely different, so you might expect a lot of differences between the two.

But then again, even once you specify the same input, there are a lot of nuances in the algorithm. So I might write about it soon.

Arthur Chan

2013-03-21T23:25:18.349-07:00
Hello,

Here's another article request:

In what ways does sphinx3_align differ from sphinx3_decode with a constrained finite state graph? For example, I can align "I'm a girl" with an audio file and can also decode the same audio with an FSG "I'm -> a -> girl"

Are there any algorithmic/practical differences?
Anonymous
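
One way to picture what the question above is comparing: at a conceptual level, forced alignment of a known word sequence and decoding with a strictly linear graph such as "I'm -> a -> girl" can both be viewed as a Viterbi search over a left-to-right chain. The sketch below is a toy illustration of that idea only. It is not how sphinx3_align or sphinx3_decode are implemented, and the function name `force_align`, the made-up per-frame score table, and the one-state-per-word simplification are all assumptions for the example.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def force_align(frame_scores: List[List[float]]) -> Tuple[float, List[int]]:
    """Viterbi over a linear chain of units (hypothetical toy model).

    frame_scores[t][k] is the log-likelihood of frame t under unit k.
    Units must be visited in order 0..K-1, each covering at least one frame.
    Returns the best total log score and the unit index chosen per frame.
    """
    num_frames = len(frame_scores)
    num_units = len(frame_scores[0])
    if num_frames < num_units:
        raise ValueError("need at least one frame per unit")

    best = [[NEG_INF] * num_units for _ in range(num_frames)]
    back = [[0] * num_units for _ in range(num_frames)]
    best[0][0] = frame_scores[0][0]  # the path must start in the first unit

    for t in range(1, num_frames):
        for k in range(num_units):
            stay = best[t - 1][k]                           # remain in unit k
            move = best[t - 1][k - 1] if k > 0 else NEG_INF  # advance from k-1
            if move > stay:
                best[t][k] = move + frame_scores[t][k]
                back[t][k] = k - 1
            else:
                best[t][k] = stay + frame_scores[t][k]
                back[t][k] = k

    # Backtrace from the last unit at the last frame.
    path = [num_units - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[num_frames - 1][num_units - 1], path


# Made-up scores for 5 frames and the 3 words of "I'm a girl".
words = ["I'm", "a", "girl"]
frame_scores = [
    [-1.0, -5.0, -5.0],
    [-1.0, -5.0, -5.0],
    [-5.0, -1.0, -5.0],
    [-5.0, -5.0, -1.0],
    [-5.0, -5.0, -1.0],
]
total, path = force_align(frame_scores)
print(total, [words[k] for k in path])
```

In this toy setting the aligner and the linear-graph decoder would return the same segmentation. Real tools may differ in what surrounds the search, for example optional silence or filler words, pronunciation variants, pruning, and the form of the output, none of which this sketch models.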