Welcome to my home page. I'm a fifth-year PhD candidate in CBCL at the Massachusetts Institute of Technology, working under Prof. Tomaso Poggio. My research interests include computational neuroscience (visual attention and object recognition in particular), computer vision, machine learning, and biometrics.
What and where: A Bayesian inference theory of visual attention
In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information, while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process that solves the problem of recognizing and localizing objects, especially in difficult recognition tasks such as those posed by cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena – including bottom-up pop-out effects, multiplicative modulation of neuronal tuning curves, and shifts in contrast responses – emerge naturally as predictions of the model. We also show that the Bayesian model predicts human eye fixations (considered as a proxy for shifts of attention) in natural scenes well. Finally, we demonstrate that the same model, used to modulate information in an existing feedforward model of the ventral stream, improves its object recognition performance in clutter.
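The two roles of attention described above can be illustrated with a toy calculation. The sketch below is not the model from the paper: it assumes a single made-up likelihood table `lik[o][l]` (how well the image is explained by object `o` at location `l`), and the names `posterior` and `entropy` are hypothetical helpers introduced only for this example. It shows that a sharper prior over location reduces uncertainty about identity, and a sharper prior over identity reduces uncertainty about location.

```python
import math

def posterior(lik, prior_obj, prior_loc):
    """Return marginal posteriors P(object | image) and P(location | image).

    Toy assumption: P(o, l | image) is proportional to
    lik[o][l] * prior_obj[o] * prior_loc[l].
    """
    joint = [[lik[o][l] * prior_obj[o] * prior_loc[l]
              for l in range(len(prior_loc))]
             for o in range(len(prior_obj))]
    z = sum(sum(row) for row in joint)                 # normalizing constant
    joint = [[v / z for v in row] for row in joint]
    p_obj = [sum(row) for row in joint]                # marginalize over locations
    p_loc = [sum(col) for col in zip(*joint)]          # marginalize over objects
    return p_obj, p_loc

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(v * math.log2(v) for v in p if v > 0)

# Made-up scene: object 0 sits at location 0, object 1 at location 2.
lik = [[0.8, 0.1, 0.1],
       [0.1, 0.1, 0.8]]
uniform_obj = [0.5, 0.5]
uniform_loc = [1 / 3, 1 / 3, 1 / 3]

# No attention: both priors are flat.
what_free, where_free = posterior(lik, uniform_obj, uniform_loc)

# Spatial attention: a prior concentrated on location 0 sharpens "what".
what_spatial, _ = posterior(lik, uniform_obj, [0.8, 0.1, 0.1])

# Feature-based attention: a prior favoring object 1 sharpens "where".
_, where_feature = posterior(lik, [0.1, 0.9], uniform_loc)

print("H(what):  free %.3f -> spatial attention %.3f"
      % (entropy(what_free), entropy(what_spatial)))
print("H(where): free %.3f -> feature attention %.3f"
      % (entropy(where_free), entropy(where_feature)))
```

With these invented numbers, both entropies decrease under attention, and the most probable location under the feature prior is the one that holds the attended object.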
- S. Chikkerur, T. Serre, C. Tan, and T. Poggio, "What and Where: A Bayesian inference theory of visual attention", Vision Research, 2010 (to appear).
- S. Chikkerur, T. Serre, and T. Poggio, "A Bayesian inference theory of attention: neuroscience and algorithms", MIT-CSAIL-TR-2009-047 / CBCL-280, Massachusetts Institute of Technology, Cambridge, MA, October 3, 2009.
- S. Chikkerur, T. Serre, and T. Poggio, "Attentive processing improves object recognition", MIT-CSAIL-TR-2009-046 / CBCL-279, Massachusetts Institute of Technology, Cambridge, MA, October 2, 2009.
- S. Chikkerur, C. Tan, T. Serre, and T. Poggio, "An integrated model of visual attention using shape-based features", MIT-CSAIL-TR-2009-029 / CBCL-278, Massachusetts Institute of Technology, Cambridge, MA, June 20, 2009.
- Chikkerur, S., Serre, T., Tan, C., Poggio, T. The role of top-down feature-based and contextual guidance mechanisms in complex natural visual search. Society for Neuroscience annual meeting, Washington, DC, Nov. 2008.
- Tan, C., Serre, T., Chikkerur, S., & Poggio, T. Feature-based and contextual guidance mechanisms in complex natural visual search [Abstract]. Journal of Vision, 9(8):1189, 1189a, http://journalofvision.org/9/8/1189/, doi:10.1167/9.8.1189.
- Chikkerur, S., Serre, T., Tan, C., Poggio, T. Bayesian inference theory predicts the physiological properties of attention. Workshop on normative electrophysiology, NIPS 2009.
- Chikkerur, S., Serre, T., Tan, C., Poggio, T. Bayesian inference theory predicts the physiological properties of attention. Workshop on bounded-rational analysis of human cognition, NIPS 2009.
Biologically Inspired Speech Processing
- Rifkin, R., J. Bouvrie, K. Schutte, S. Chikkerur, M. Kouh, T. Ezzat and T. Poggio. Phonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures, CBCL Paper #267/AI Technical Report #2007-019, Massachusetts Institute of Technology, Cambridge, MA, March, 2007.
Computer vision techniques for leaf identification
Abstract/Poster presentations
- Peter Wilf, Sharat Chikkerur, Stefan Little, Scott Wing, Thomas Serre. Leaf Identification Automated Using a Computational Model of the Primate Vision System. Botany and Mycology conference, 2009
Fingerprint Image enhancement
- Sharat Chikkerur, Alexander N. Cartwright, Venu Govindaraju: Fingerprint enhancement using STFT analysis. Pattern Recognition 40(1): 198-211 (2007)
- Sharat Chikkerur, Venu Govindaraju, Alexander N. Cartwright: Fingerprint Image Enhancement Using STFT Analysis. ICAPR (2) 2005: 20-29
- Sharat Chikkerur, Alexander N. Cartwright, Venu Govindaraju: K-plet and Coupled BFS: A Graph Based Fingerprint Representation and Matching Algorithm. ICB 2006: 309-315
- Sharat Chikkerur, Sharath Pankanti, Alan Jea, Nalini K. Ratha, Ruud M. Bolle: Fingerprint Representation Using Localized Texture Features. ICPR (4) 2006: 521-524
- Sharat Chikkerur, Nalini K. Ratha: Impact of Singular Point Detection on Fingerprint Matching Performance. AutoID 2005: 207-212
- Sharat Chikkerur, Chaohang Wu, Venu Govindaraju: A Systematic Approach for Feature Extraction in Fingerprint Images. ICBA 2004: 344-350
- Karthik Sridharan, Sankalp Nayak, Sharat Chikkerur, Venu Govindaraju: A Probabilistic Approach to Semantic Face Retrieval System. AVBPA 2005: 977-986
- Amit Mhatre, Sharat Chikkerur, Venu Govindaraju: Indexing Biometric Databases Using Pyramid Technique. AVBPA 2005: 841-849
- Shamalee Deshpande, Sharat Chikkerur, Venu Govindaraju: Accent Classification in Speech. AutoID 2005: 139-143
Attention and Object recognition in Videos
Recognition and localization of multiple objects in cluttered visual scenes is a difficult problem for biological as well as machine vision systems. Computer vision techniques rely upon scanning the image at all positions and scales to detect objects in large scenes. In contrast, the biological visual system deals with scene complexity with the help of visual attention, the ability to focus on ('attend to') behaviorally relevant stimuli while ignoring clutter.
In this work, we provide experimental results demonstrating that a biological approach that does not require scanning is a practical solution. The system consists of a general-purpose, biologically motivated bottom-up attentional front-end (Itti et al. '01) together with a hierarchical feed-forward recognition algorithm (Serre et al. '07).
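As a rough illustration of how an attentional front-end can replace exhaustive scanning, the sketch below picks a few fixation points from a saliency map using winner-take-all selection with inhibition of return. It is a simplified stand-in, not the system described above: the saliency values are invented, `fixations` is a hypothetical helper, and the recognition stage is left out (in a full pipeline, a classifier would be run only on patches around the returned points).

```python
def fixations(saliency, n, radius=1):
    """Return up to n (row, col) fixation points from a 2D saliency map.

    Winner-take-all: repeatedly pick the most salient cell, then suppress
    its neighborhood (inhibition of return) so the next pick moves elsewhere.
    """
    s = [row[:] for row in saliency]          # work on a copy
    h, w = len(s), len(s[0])
    points = []
    for _ in range(min(n, h * w)):
        r, c = max(((i, j) for i in range(h) for j in range(w)),
                   key=lambda p: s[p[0]][p[1]])
        if s[r][c] == float("-inf"):          # everything already suppressed
            break
        points.append((r, c))
        for i in range(max(0, r - radius), min(h, r + radius + 1)):
            for j in range(max(0, c - radius), min(w, c + radius + 1)):
                s[i][j] = float("-inf")       # inhibition of return
    return points

# Invented 4x5 saliency map with two conspicuous regions.
saliency = [
    [0.1, 0.2, 0.1, 0.0, 0.1],
    [0.2, 0.9, 0.3, 0.1, 0.0],
    [0.1, 0.3, 0.1, 0.2, 0.7],
    [0.0, 0.1, 0.0, 0.3, 0.2],
]

attended = fixations(saliency, n=2)
print(attended)  # two candidate locations instead of all 20 cells
```

Only the attended locations would be passed to the recognition stage, so the cost grows with the number of fixations rather than with the number of positions and scales in the image.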
Speaker Recognition and Identification
Recipes for Gist Computation
Optical Character Recognition using Wavelets and Neural Networks