Contact info:
Siebel Center for Computer Science
201 N. Goodwin Ave.
Urbana, IL, 61801, USA (map)
I’m a professor at the CS and ECE depts. at the University of Illinois at Urbana-Champaign. My primary research interests revolve around making machines that can listen. I’ve done plenty of work on signal processing, machine learning and statistics as they relate to artificial perception, and in particular computational audition. I also love working on anything related to audio! The bulk of my work on audio is on source separation, and various machine learning approaches to traditional signal processing problems.

I am fortunate to have been associated with some amazing research labs. I completed my master's, Ph.D., and a postdoc at the Machine Listening Group at the MIT Media Lab under the supervision of Barry Vercoe. I work with Adobe Systems’ Advanced Technology Labs, used to be at MERL, and have spent some time at Interval Research and Starlab. I was also a visiting scientist at MIT’s McGovern Institute for Brain Research. In 2006 I was selected by MIT’s Technology Review as one of the year’s top young technology innovators. I'm an IEEE Fellow, was an IEEE Distinguished Lecturer for 2016–2017, and I am currently on the Board of Governors of the IEEE Signal Processing Society.

I’m a descendant of a long musical lineage dating to the early 1600s. My Erdős number is 4. You can get my academic stats here, and some of my (US) patents here.

Selected Recent Offerings (complete list here)

Tzinis, E., S. Venkataramani, and P. Smaragdis. 2019. Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures Using Spatial Information, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, May 2019. [PDF]

Casebeer, J., B. Luc, and P. Smaragdis. 2018. Multi-View Networks for Denoising of Arbitrary Numbers of Channels, in IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), Tokyo, Japan, September 2018. [PDF]

Kim, M. and P. Smaragdis. 2017. Bitwise Neural Networks for Efficient Single-Channel Source Separation, in Workshop on Audio Signal Processing, NIPS 2017. [PDF]

Venkataramani, S., J. Casebeer, and P. Smaragdis. 2017. Adaptive Front-ends for End-to-end Source Separation, in Workshop on Audio Signal Processing, NIPS 2017. [PDF]

Venkataramani, S., Y.C. Sübakan, and P. Smaragdis. 2017. A Neural Network Alternative to Convolutive Audio Models for Source Separation, in IEEE Workshop on Machine Learning for Signal Processing (MLSP), Tokyo, Japan, September 2017. [PDF]

Correa Carvalho, R.G., and P. Smaragdis. 2017. Towards End-to-end Polyphonic Music Transcription: Transforming Music Audio Directly to a Score, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, October 2017. [PDF]

Huang, P.-S., M. Kim, M. Hasegawa-Johnson, and P. Smaragdis. 2015. Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation, in IEEE Transactions on Audio, Speech and Language Processing, to appear. [PDF]

Sübakan, Y.C., J. Traa and P. Smaragdis. 2014. Spectral Learning of Mixture of Hidden Markov Models, in Neural Information Processing Systems (NIPS) 2014. Montreal, Canada. [PDF]

Liu, D., P. Smaragdis, M. Kim. 2014. Experiments on Deep Learning for Speech Denoising, in Proceedings of the annual conference of the International Speech Communication Association (INTERSPEECH), Singapore. 2014 [PDF]

Huang, P-S., M. Kim, M. Hasegawa-Johnson, P. Smaragdis. 2014. Deep Learning for Monaural Speech Separation, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Florence, Italy. 2014 [PDF]

Smaragdis, P., C. Fevotte, G. Mysore, N. Mohammadiha, and M. Hoffman. 2014. Static and Dynamic Source Separation Using Nonnegative Factorizations: A Unified View, in IEEE Signal Processing Magazine, May 2014. [PDF]
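For readers curious what the nonnegative-factorization view of source separation looks like in practice, here is a minimal sketch: a KL-divergence NMF with multiplicative updates, run on a toy "spectrogram" built from two spectral templates. The function name, toy data, and parameter choices are my own illustration, not code from the article above.

```python
import numpy as np

def nmf_kl(V, rank, iters=500, seed=0):
    """Factor a nonnegative matrix V ~ W @ H using multiplicative
    updates that minimize the (generalized) KL divergence."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 1e-3   # spectral templates (columns)
    H = rng.random((rank, m)) + 1e-3   # their activations over time
    eps = 1e-9
    for _ in range(iters):
        WH = W @ H + eps
        # H <- H * (W^T (V / WH)) / (W^T 1)
        H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + eps)
        WH = W @ H + eps
        # W <- W * ((V / WH) H^T) / (1 H^T)
        W *= ((V / WH) @ H.T) / (np.ones_like(V) @ H.T + eps)
    return W, H

# Toy mixture: two templates, each active during a different half.
W_true = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
H_true = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])
V = W_true @ H_true
W, H = nmf_kl(V, rank=2)
print("max reconstruction error:", np.abs(V - W @ H).max())
```

Once W and H are learned, each source's contribution can be reconstructed by keeping only that source's templates and activations (e.g. `W[:, :1] @ H[:1, :]`), which is the basic mechanism behind the NMF separators discussed in the article.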

Virtanen, T., J. Gemmeke, B. Raj and P. Smaragdis. 2014. Compositional models for audio processing, in IEEE Signal Processing Magazine, accepted [PDF]

Smaragdis, P. 2013. Keynote slides from WASPAA 2013 [PDF] (includes embedded audio files when opened with Acrobat Reader)

Smaragdis, P. and M. Kim. 2013. Non-Negative Matrix Factorization for Irregularly-Spaced Transforms, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, October 2013. [PDF]

Kim, M. and P. Smaragdis. 2013. Manifold Preserving Hierarchical Topic Models for Quantization and Approximation, in International Conference on Machine Learning, Atlanta, GA. June 2013. [PDF]

Kim, M. and P. Smaragdis. 2013. Collaborative Audio Enhancement Using Probabilistic Latent Component Sharing, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, Canada. 2013. [PDF]

M.A. Pathak, B. Raj, S. Rane, P. Smaragdis. 2013. Privacy Preserving Speech Processing [Draft PDF][Final PDF]

Smaragdis, P. 2012. Keynote slides from LVA/ICA 2012 [PDF] [PPT]

Smaragdis, P. 2011. Approximate nearest-subspace representations for sound mixtures. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011. [PDF] Invited paper

Mysore, G., P. Smaragdis, and B. Raj. 2010. Non-negative hidden Markov modeling of audio with application to source separation. In 9th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), St. Malo, France, September 2010. [PDF] Best paper award

Smaragdis, P. and B. Raj. 2010. The Markov selection model for concurrent speech recognition. In IEEE international workshop on Machine Learning for Signal Processing (MLSP). Kittilä, Finland. August 2010 [PDF]

Smaragdis, P., M. Shashanka, and B. Raj. 2009. A sparse non-parametric approach for single channel separation of known sounds. In Neural Information Processing Systems, Vancouver, BC, Canada, December 2009. [PDF]

Smaragdis, P. 2009. User guided audio selection from complex sound mixtures. In the 22nd ACM Symposium on User Interface Software and Technology (UIST 09), Victoria, BC, Canada, October 2009. [PDF]

Smaragdis, P. 2009. Dynamic Range Extension Using Interleaved Gains, in IEEE Transactions on Audio, Speech and Language Processing, July 2009. [PDF]

Smaragdis, P. 2009. Relative Pitch Tracking of Multiple Arbitrary Sounds. In Journal of the Acoustical Society of America, Volume 125, Issue 5, pp. 3406-3413 (May 2009) [PDF]

Shashanka, M.V., B. Raj and P. Smaragdis, 2008. Probabilistic Latent Variable Models as Non-Negative Factorizations. In special issue on Advances in Non-negative Matrix and Tensor Factorization, Computational Intelligence and Neuroscience Journal. May 2008. [PDF]

Shashanka, M.V., B. Raj, P. Smaragdis, 2007. Sparse Overcomplete Latent Variable Decomposition of Counts Data. In Neural Information Processing Systems (NIPS), Vancouver, BC, Canada. December 2007. Paper: [PDF], technical supplement: [PDF]

Smaragdis, P. and M.V. Shashanka, 2007. A Framework for Secure Speech Recognition. In IEEE Transactions on Audio, Speech and Language Processing. May 2007. Paper: [PDF]

Smaragdis, P. 2007. Convolutive Speech Bases and their Application to Speech Separation. In IEEE Transactions on Speech and Audio Processing. January 2007. Paper: [PDF]

Smaragdis, P. and P. Boufounos, 2007. Position and Trajectory Learning for Microphone Arrays, In IEEE Transactions on Speech and Audio Processing. January 2007. [PDF]

Smaragdis, P., B. Raj, and M.V. Shashanka. 2006. A probabilistic latent variable model for acoustic modeling, in Advances in Models for Acoustic Processing Workshop, NIPS 2006. Paper: [PDF], Presentation: [PPT]

Smaragdis, P. 2007. Component based techniques for monophonic speech separation and recognition, in S. Makino, T.-W. Lee and H. Sawada (eds.), Blind Speech Separation, Springer. [Book Link]