Monday, November 21, 2011

The Grand Janitor After CMU Sphinx

I have been away from the development of CMU Sphinx for around 6 years.  Geez.  Talk about changes.  During that time, I went to work for one startup and one defense contractor, and started numerous non-speech-related blogs.

I certainly had fun but felt adrift at the same time - both companies I worked for are extraordinary, but their causes are not mine.  As you know, life without a cause is a tough life.

And now I am looking at Sphinx and open source speech recognition again.  Wow, there are tons of changes.  The awareness of the need for open source speech recognition has never been so acute.  The performance of open source speech recognition still requires a lot of work, but it is no longer unthinkable to deploy an open source speech recognizer in a real application.

There are more resources for learning how to use a speech recognizer, thanks to dedicated Sphinx developers such as David Huggins-Daines and Nickolay Shmyrev.  Many more people are learning how to use Sphinx properly, and there is more documentation around.

There are also more resources for building a speech recognizer.  One notable effort is VoxForge, led by Ken MacLean, which is dedicated to accumulating clean, transcribed data over time.  Though I don't know how large it is, I admire Ken's dedication.  Someone should have started such a project a long, long time ago.  Now that it has started, there is a chance that open source data will become an important source of speech data in the future.

For the last 6 years, I could only be a bystander to Sphinx development.  I changed jobs again recently and will work with a company that is close to Sphinx.  I don't know how much *real* work I will do.  But I am glad that Sphinx and I have crossed paths again.  At the very least, I hope to contribute ideas to the community and help this great project grow.

The Grand Janitor