Some thoughts after Interspeech 2017
I've just returned from the Interspeech 2017 conference in Stockholm. I had a great time there, getting to know other researchers and engineers working on different aspects of speech. Since many tracks ran in parallel, I'm pretty sure I missed many interesting presentations, but I tried my best to move around quickly and absorb as much knowledge as I could. Here are some of the papers, posters, and presentations that I especially liked.
My presentation about the range-v3 C++ library
This post is going to be short. A few weeks ago I gave a presentation at a local C++ User Group meetup about the Ranges TS (technical specification), which is most likely going to enter the C++20 standard. An existing implementation of the Ranges TS is the range-v3 library by Eric Niebler, which introduces some interesting new features beyond the STL algorithms, such as lazily evaluated range adaptors. You can find the slides from my talk at slides.com, and the source code for the examples on GitHub.
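To give a flavour of what those range adaptors look like, here is a minimal sketch (not taken from the talk) of a lazily evaluated pipeline built with range-v3; it assumes a recent release where the adaptors live under ranges::views and the catch-all range/v3/all.hpp header is available:

```cpp
#include <iostream>
#include <vector>
#include <range/v3/all.hpp>

int main() {
    std::vector<int> xs = {1, 2, 3, 4, 5, 6, 7, 8};

    // A lazily evaluated pipeline: no work is done when the view is built;
    // the filter and transform run only while we iterate over the result.
    auto evens_squared = xs
        | ranges::views::filter([](int x) { return x % 2 == 0; })
        | ranges::views::transform([](int x) { return x * x; });

    for (int v : evens_squared)   // evaluation happens here, one element at a time
        std::cout << v << ' ';    // prints: 4 16 36 64
    std::cout << '\n';
}
```

The composed view is just a cheap object describing the computation, so you can pass it around and only pay for the elements you actually consume.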
Speed up ML model development with GNU Parallel for fun and profit
Recently I feel like I've found the Holy Grail of parallel/distributed computation: the GNU Parallel program. It lets you run any command, script, pipeline or program in parallel with different arguments, and can even distribute the jobs across several nodes. In contrast to frameworks such as Hadoop, it requires practically no setup and has very little overhead (it's probably not as robust, though, and may scale worse for big data). I believe that any machine learning dev who works on "medium data", has access to a few Linux machines (e.g. in academia or a small startup/company) and would like to find those tricky hyperparameters more efficiently can benefit from this program. That's especially true if you're working with a model that isn't easy to parallelize, or that for whatever reason runs only on a single core.
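To make this concrete, here's roughly what a hyperparameter sweep looks like with GNU Parallel; train.py, the flag names and the parameter values below are made-up placeholders, not from any real project:

```bash
# Run every combination of learning rate and batch size, at most 4 jobs at once.
# {1} and {2} are substituted with values from the two ::: argument lists,
# and --results stores each job's stdout/stderr under results/.
parallel --jobs 4 --results results/ \
    python train.py --lr {1} --batch-size {2} \
    ::: 0.001 0.01 0.1 \
    ::: 32 64 128

# The same sweep spread over the local machine (":") and two remote hosts
# reachable over ssh; the script and data are assumed to already be present
# on every node.
parallel --sshlogin :,user@node1,user@node2 --jobs 4 \
    python train.py --lr {1} --batch-size {2} \
    ::: 0.001 0.01 0.1 \
    ::: 32 64 128
```

Multiple ::: lists give you the full Cartesian product of arguments, so a grid search is a single command line.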