TECH TALK @ SHUTTERSTOCK

Modeling visual clutter using proto-objects

Presenter: Chen-Ping Yu, PhD Candidate
Research Advisors: Dr. Dimitris Samaras (computer science), Dr. Greg Zelinsky (psychology)
Department of Computer Science, Stony Brook University
February 5, 2014

AGENDA
• Visual search
• Examples
• Visual clutter
• Models
• Proto-objects
• Parametric proto-object segmentation
• Superpixels
• Graph and clustering
• Data
• Experiment and results
• Conclusion

VISUAL SEARCH
• Visual search
  • Ubiquitous; happens every day.
  • Finding your car in a parking lot, finding your keys on a cluttered desk, etc.
• Modeling visual search performance
  • Can we predict how easy or hard a search task is?
  • Helps in advertisement design and item placement (e.g. shelf organization for supermarkets and electronics stores).
• Attributes that affect visual search performance
  • The similarity of the target to the distractor items (Wolfe, 1994, 1998).
  • The similarity of the distractors to one another (Duncan & Humphreys, 1989).
  • Set size: the number of items in an image (Wolfe, 1998).

VISUAL SEARCH
• Example: find the target patch in the query image
[Figures: target patch and query image. Source: M. Asher, D. Tolhurst, T. Troscianko and I. Gilchrist, “Regional effects of clutter on human target detection performance”, Journal of Vision, 2013.]

VISUAL SEARCH
• Another example
[Figures: target patch and query image.]

VISUAL SEARCH
• Set size effect example
[Figures: target patch and query images. Source: M. Neider and G. Zelinsky, “Cutting through the clutter: searching for targets in evolving complex scenes”, Journal of Vision, 2011.]

VISUAL CLUTTER
• Visual clutter
  • In general, it is a “confused collection” or a “crowded disorderly state”.
  • Alternatively, it is the state in which excess items, or their representation or organization, lead to a degradation of performance at some task (Rosenholtz et al. 2007).
VISUAL CLUTTER
• Set size effect
  • Set size: the number of items/objects in an image.
  • Visual search performance degrades as more objects are added to the display, e.g. looking for a particular building in a rural setting vs. an urban setting (Neider et al. 2008, 2011).
  • The number of objects is proportional to the level of clutter.
• Set size in the real world
  • However, most “objects” in real-world scenes are not visually countable: grass, rocks, patches of texture, shadows, etc.
• Alternative approach
  • Analysis in the feature space

VISUAL CLUTTER
[Figure: example scenes.]
Both contain 24 objects! What are the objects in these scenes? How do the scenes rank in clutter?

CLUTTER MODELS
Segmenting objects is difficult, therefore:
• Edge density model (Mack et al. 2004)
  • Counts the edge pixels in a Canny edge-detected image (r = 0.83).
  • The result is very sensitive to the Canny edge detector's settings, e.g. smoothing and thresholding.
[Figure: top row, input images; bottom row, edge density.]

CLUTTER MODELS
• Feature congestion model (Rosenholtz et al. 2007)
  • Computes the feature variances of color, luminance, and orientation.
  • Builds a 3D ellipsoid from the feature variances; the volume of the ellipsoid is the clutter measure for that image.
  • State of the art, widely used as the gold standard for comparison (r = 0.75).
[Figure: left, input; right, feature variance ellipses. 25 weather and US map dataset.]

CLUTTER MODELS
• Power law model (Bravo et al. 2008)
  • Uses Felzenszwalb's graph-based method to segment the input image (r = 0.62).

PROTO-OBJECTS
• Direct modeling of set size: proto-objects
  • Low-level information is processed before the focus of attention. Proto-objects are groupings of similar, nearby low-level features formed by the visual neurons; the focus of attention then acts as a “hand” that grabs related proto-objects together to form a true, stable object (Rensink 1997, 2000).
  • Directly related to set size.
  • A better representation of set size than “objects”.
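To make the edge density model from the CLUTTER MODELS slides more concrete, here is a minimal sketch in plain Python. It is not the implementation from Mack et al.: a simple gradient-magnitude threshold stands in for the Canny detector, and the function name `edge_density`, the `threshold` parameter, and the toy images are assumptions made for illustration only.

```python
def edge_density(gray, threshold):
    """Fraction of pixels whose gradient magnitude exceeds `threshold`.

    `gray` is a list of rows of grayscale values. A plain forward-difference
    gradient is used here as a stand-in for Canny edge detection.
    """
    height, width = len(gray), len(gray[0])
    edge_pixels = 0
    for y in range(height):
        for x in range(width):
            # Forward differences, clamped at the image border.
            dx = gray[y][min(x + 1, width - 1)] - gray[y][x]
            dy = gray[min(y + 1, height - 1)][x] - gray[y][x]
            if (dx * dx + dy * dy) ** 0.5 > threshold:
                edge_pixels += 1
    return edge_pixels / (height * width)


# Toy inputs: a uniform image and an image with one vertical step edge.
flat = [[0.5] * 4 for _ in range(4)]
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

print(edge_density(flat, 0.5))  # 0.0
print(edge_density(step, 0.5))  # 0.25
print(edge_density(step, 1.5))  # 0.0
```

The last two calls give different answers for the same image, which is the sensitivity to detector settings noted on the slide: the measure depends directly on the threshold chosen.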
[Figure: proto-objects as color blobs. Left images: from Wischnewski et al. 2010. Right image: from Bravo et al. 2008 (24 objects).]

PROTO-OBJECT SEGMENTATION
• Our clutter model
  • Quantify set size using the number of proto-objects instead of objects.
  • Segment proto-objects by performing superpixel clustering.
[Figure: input image, superpixels, proto-objects.]

SUPERPIXEL SEGMENTATION
• Superpixel segmentation
  • Over-segments an image into regions of similar pixels while preserving boundaries.
  • As a pre-processing step, it can reduce the need to find boundaries.
  • Can provide region statistics.
[Figure, left to right: input image, mean-shift, graph-based, turbopixel, normalized-cut.]

PROTO-OBJECT SEGMENTATION
• Superpixel graph
  • Neighboring superpixels are connected to form a graph structure.
[Figure: input image; SLIC superpixels (k = 1000); superpixel graph.]

PROTO-OBJECT SEGMENTATION
[Figure: superpixel graph with weighted edges (weights from 0.04 to 0.93), shown before and after edge removal. Legend: within-cluster edge; between-cluster edge (identify, then remove).]
• Compute the similarity threshold and remove the edges whose weights are higher than the threshold.
• Merge the connected clusters; the merged clusters are the proto-objects.

PARAMETRIC PROTO-OBJECT SEGMENTATION
[Figure panels: intensity, color, orientation.]
• Weibull Mixture Model (WMM): [equation]
• Similarity threshold: the crossing point between the two components: [equation]

PARAMETRIC PROTO-OBJECT SEGMENTATION
• Clutter model
  • Count the resulting number of proto-objects.
  • Divide the count by the initial number of superpixels to obtain a scale-invariant, normalized clutter measure.
  • The clutter measure lies between 0 and 1; the larger the value, the more cluttered the image.

DATA
• 90 images from the SUN dataset
• 800x600
• Real-world images
• 6 groups with 15 images each (total = 90 images).
  • Group 1: 1~10 objects
  • Group 2: 11~20 objects
  • …
  • Group 6: 51~60 objects
• Rated by 15 human subjects, aged 18~30, who ranked the images from least to most cluttered.
• Average correlation over all pairs of subjects: R = 0.6919 (p<0.001)
• The median ranked position of each image is used as the ground truth.

RESULTS
• Results
  • Achieved R = 0.7557, p<0.001 against the human-rated ground truth ordering by clutter.
  • 10-fold cross validation with an average test set correlation of R = 0.6808.
** Latest results:

RESULTS
[Figures: example images with clutter measures 0.1713, 0.2612, 0.3725, 0.5038, and 0.6750.]

CONCLUSION
• Applications
  • Image-level feature for image retrieval.
  • Image-to-painting style transformation.
  • Quantified analysis of advertisements, user interfaces, and item organization.
• Next steps
  • Apply our clutter model to target search task performance.
  • Further explore proto-objects for automatic object formation and detection.
  • Eye-movement related projects.

CONCLUSION
• Related papers
  • Chen-Ping Yu, Wen-Yu Hua, Dimitris Samaras, and Gregory Zelinsky, “Modeling clutter perception using parametric proto-object partitioning.” Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, Dec 2013.
  • Chen-Ping Yu, Dimitris Samaras, and Gregory Zelinsky, “Modeling visual clutter perception using proto-object segmentation.” Journal of Vision (to appear), 2014.
• For more information, please visit my project webpage: http://mysbfiles.stonybrook.edu/~cheyu/projects/proto-objects.html
• For full citation information of this presentation, please refer to the NIPS 2013 paper.
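
As a rough illustration of the counting steps described on the PROTO-OBJECT SEGMENTATION slides (remove graph edges above the threshold, merge what stays connected, divide by the number of superpixels), here is a minimal sketch in plain Python. It is not the authors' implementation: the threshold is passed in by hand rather than derived from the Weibull mixture model, and the names `count_proto_objects`, `clutter_measure`, and the toy graph are assumptions made for this example.

```python
def count_proto_objects(num_superpixels, edges, threshold):
    """Count connected groups of superpixels after edge removal.

    `edges` is a list of (i, j, weight) tuples between neighboring
    superpixels. Edges with weight above `threshold` are treated as
    between-cluster edges and ignored; the rest merge their endpoints.
    """
    parent = list(range(num_superpixels))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, weight in edges:
        if weight <= threshold:  # within-cluster edge: merge
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_i] = root_j

    return len({find(x) for x in range(num_superpixels)})


def clutter_measure(num_superpixels, edges, threshold):
    """Proto-object count normalized by the initial superpixel count."""
    return count_proto_objects(num_superpixels, edges, threshold) / num_superpixels


# Hypothetical toy graph: 6 superpixels arranged in a ring.
toy_edges = [
    (0, 1, 0.15), (1, 2, 0.77), (2, 3, 0.11),
    (3, 4, 0.86), (4, 5, 0.12), (0, 5, 0.93),
]

print(count_proto_objects(6, toy_edges, 0.5))  # 3
print(clutter_measure(6, toy_edges, 0.5))      # 0.5
```

In this toy graph the three low-weight edges survive, so the six superpixels merge into three proto-objects and the normalized measure is 3 / 6 = 0.5. Union-find is just one convenient way to merge connected clusters; any connected-components routine would serve the same purpose.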