I have been crazily busy, so blogging has been rather slow for me. That said, I have a growing feeling that my understanding is getting closer to the state of the art of speech recognition. And when we talk about the state of the art of speech recognition right now, we have to talk about the whole deep neural network trend.
There is nothing conceptually new in the hybrid HMM-DBN-DNN approach; it was proposed under the name HMM-ANN in the past. What is new is an algorithm that allows fast training of multi-layered neural networks. It stems mainly from Hinton's breakthrough in 2006, which suggests that a DBN-DNN can first be initialized with a stack of pretrained RBMs.
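To make the pretraining idea concrete, here is a minimal numpy sketch of greedy layer-wise initialization with CD-1 (one-step contrastive divergence) on Bernoulli RBMs. The layer sizes, learning rate, and toy data are made up for illustration; a real recipe would train far longer and on real features before fine-tuning the resulting DNN.

```python
# Hypothetical sketch: greedy layer-wise RBM pretraining via CD-1.
# Sizes, data, and hyperparameters are illustrative, not a real recipe.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, b, c, v0, lr=0.1):
    """One contrastive-divergence (CD-1) update for a Bernoulli RBM.
    W: (n_visible, n_hidden) weights, b: visible bias, c: hidden bias."""
    # Positive phase: hidden activations given the data.
    h0 = sigmoid(v0 @ W + c)
    h0_sample = (rng.random(h0.shape) < h0).astype(float)
    # Negative phase: one Gibbs step back to visible, then hidden again.
    v1 = sigmoid(h0_sample @ W.T + b)
    h1 = sigmoid(v1 @ W + c)
    # CD-1 gradient: data statistics minus one-step reconstruction statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ h0 - v1.T @ h1) / n
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (h0 - h1).mean(axis=0)
    return h0  # hidden probabilities become the next layer's input

# Greedy stacking: each RBM is trained on the previous layer's output.
data = (rng.random((64, 20)) < 0.5).astype(float)
layer_sizes = [20, 15, 10]
weights = []
v = data
for n_vis, n_hid in zip(layer_sizes[:-1], layer_sizes[1:]):
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    b = np.zeros(n_vis)
    c = np.zeros(n_hid)
    for _ in range(50):
        h = cd1_step(W, b, c, v)
    weights.append(W)  # these weights initialize the corresponding DNN layer
    v = h
```

After this loop, `weights` would seed the layers of a feed-forward network, which is then fine-tuned discriminatively (e.g. with backpropagation) on the actual task.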
I am naturally very interested in this new trend. IBM's, Microsoft's and Google's results show that the DBN-DNN is not the toy model we saw in the last two decades.
Well, that's all for my excitement about DBN; I still have tons of things to learn. Back to the "Grand Janitor Blog": I tried to improve the blog layout four months ago, and I have to say I felt very frustrated by Blogger, so I finally decided to move to WordPress.
I hope to move within the next month or so. I will write a more proper announcement later on.
There are some questions on LinkedIn about the whereabouts of this blog. As you may have noticed, I haven't posted any updates for a while. I was crazily busy with work at Voci (good!) and, like everyone, with many of life's challenges. I am having a lot of fun with programming, as I am working with two of my favorite languages, C and Python. Life is not bad at all.
My apologies to all readers; it can be tough to blog sometimes. Hopefully this situation will change later this year.
I don't know what to make of the lawsuit; I only feel a bit sad. Dragon has been home to many elite speech programmers/developers/researchers. Many old-timers of speech were there, and most of them sigh about the whole L&H fiasco. If I were them, I would feel the same. In fact, once you know a bit of ASR history, you notice that the fall of L&H gave rise to one you-know-its-name player of today. So, in a way, the fates of two generations of ASR people were altered.
As for the MS piece, it follows another trend these days: the emergence of the DBN. Is it surprising? Probably not; it is rather easy to speed up neural network computation. (Training is harder, but that is where the DBN is strong compared to previous NN approaches.)
On Sphinx, I will point out one recent fix contributed by Ricky Chan, which addresses a problem in bw's MMIE training. I have yet to try it, but I believe Nick has already incorporated it into the open-source code base.
Another item Nick has been stressing lately is using Python, instead of Perl, as the scripting language of SphinxTrain. I think that is a good trend. I like Perl and use one-liners and map/grep-type constructs a lot. Generally, though, it is hard to find a concrete coding standard for Perl, whereas Python seems cleaner and naturally leads to OOP. This is an important issue: Perl programmers and Perl programming styles seem to be spawned from many different types of languages. The original (bad) C programmer would fondly use globals and write functions with 10 arguments. The original C++ programmer might expect language support for OOP but find that "it is just a hash". These style differences can make Perl training scripts hard to maintain.
That's why I like Python more. Even a very bad script seems to convert itself into a more maintainable one. There is also a good pathway for connecting Python and C (Cython is probably the best).
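Cython needs a compile step, so a fully self-contained illustration of the Python/C pathway is easier with the standard library's ctypes module instead; this is a hedged sketch of the same idea (calling C functions from Python), not the Cython workflow itself. On POSIX systems, loading the current process gives access to libc symbols such as `strlen`:

```python
# Sketch of Python-to-C bridging with the stdlib ctypes module.
# Assumes a POSIX system where libc symbols are visible in the process.
import ctypes

# Load symbols from the current process (pulls in libc on POSIX).
libc = ctypes.CDLL(None)

# Declare the C signature so ctypes marshals arguments correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

n = libc.strlen(b"sphinxtrain")  # calls the C strlen directly
```

With Cython the same bridge is declared at compile time (`cdef extern from "string.h"`), which buys C-level speed inside loops; ctypes trades that speed for zero build steps, which is often the right call in training scripts.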
In any case, that's all I have this time. I owe all of you many articles. Let's see if I can write some in the near future.