Image quality is an important practical challenge that is often overlooked in the design of machine vision systems. Commonly, machine vision systems are trained and tested on high-quality image datasets, yet in practical applications the input images cannot be assumed to be of high quality. Modern deep neural networks (DNNs) have been shown to perform poorly on images affected by blur or noise distortions. In this work we investigate whether human subjects also perform poorly on distorted stimuli, and provide a direct comparison with the performance of deep neural networks. Specifically, we study the effect of Gaussian blur and additive Gaussian noise on human and DNN classification performance. We performed two experiments: a crowd-sourced experiment with unlimited stimulus display time, and a lab experiment with 100 ms display time. In both cases we found that humans outperform neural networks on distorted stimuli, even when the networks are retrained with distorted data.
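The two distortions studied here are standard image operations. A minimal sketch of how such distorted stimuli can be generated follows; the blur width and noise level used below are illustrative assumptions, not the levels tested in the experiments.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_blur(image, sigma):
    """Blur an image by convolving it with a Gaussian kernel of width sigma."""
    return gaussian_filter(image, sigma=sigma)

def additive_gaussian_noise(image, std, rng=None):
    """Add zero-mean Gaussian noise and clip back to the valid [0, 1] range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image + rng.normal(0.0, std, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Illustrative usage on a synthetic grayscale image with values in [0, 1]
img = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
blurred = gaussian_blur(img, sigma=2.0)          # hypothetical blur level
noisy = additive_gaussian_noise(img, std=0.1,    # hypothetical noise level
                                rng=np.random.default_rng(0))
```

In practice a range of `sigma` and `std` values would be swept to trace out performance as a function of distortion severity.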
While text-to-speech software has largely made textual information accessible in the digital space, analogous access to graphics remains an unsolved problem. Because of their portability and ubiquity, several studies have pointed to touchscreens as a potential platform for such access, yet there is still a gap in our understanding of multimodal information transfer in the context of graphics. The current research demonstrates the feasibility of following lines, a fundamental graphical concept, via vibrations and sounds on commercial touchscreens. The first study examined the presentation of straight, uniform lines using several line representations: vibration-only, auditory-only, and bordered lines. The results of this study demonstrated that bordered lines were optimal for fine tracing, although both vibration-only and auditory-only lines were also sufficient for tracking, with minimal deviations. The second study examined the presentation of curving, non-uniform lines. Conditions differed in the number of auditory reference points presented at the inflection and deflection points. Participants showed minimal deviation from the lines during tracing, performing nearly equally in the one- and three-point conditions. From these studies, we demonstrate that line following via multimodal feedback is possible on touchscreens, and we present guidelines for the presentation of such non-visual graphical concepts.
We investigate how the perceived abstraction quality of computer-generated illustrations is related to the number of primitives (points and small lines) used to create them. Since it is difficult to find objective functions that quantify the visual quality of such illustrations, we gather comparative data from a crowdsourcing user study and employ a paired comparison model to deduce absolute quality values. Based on this study, we show that the perceived quality of stippled representations can be modeled from the properties of an input image, and we demonstrate the generalizability of our approach by comparing models for different stippling methods. We give guidance on the number of stipple points that is typically sufficient to represent an input image well. By showing that our proposed approach also works for small lines, we demonstrate its applicability for quantifying other kinds of visual stimuli. Our results relate to the Weber-Fechner law from psychophysics and indicate a logarithmic relation between the number of rendering primitives and perceived abstraction quality.
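A logarithmic relation of this kind can be written as quality ≈ a·log(n) + b and fitted by least squares. The sketch below does this on made-up (count, quality) pairs; the numbers are hypothetical placeholders, not the study's measurements.

```python
import numpy as np

# Hypothetical (primitive count, perceived quality) data illustrating a
# logarithmic relation; the actual values would come from the
# paired-comparison scaling of the crowdsourced judgments.
counts = np.array([500, 1000, 2000, 4000, 8000, 16000], dtype=float)
quality = np.array([0.9, 1.6, 2.4, 3.0, 3.7, 4.4])  # made-up scale values

# Fit the Weber-Fechner-style model: quality = a * log(n) + b
a, b = np.polyfit(np.log(counts), quality, deg=1)

def predicted_quality(n):
    """Predicted abstraction quality for n rendering primitives."""
    return a * np.log(n) + b
```

A positive slope `a` with a good linear fit in log-space is the signature of the logarithmic relation the abstract describes.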
The individual shape of the human body, including the geometry of its articulated structure and the distribution of weight over that structure, influences the kinematics of a person's movements. How sensitive is the visual system to inconsistencies between shape and motion introduced by retargeting motion from one person onto the shape of another? We used optical motion capture to record five pairs of male performers with large differences in body weight while they pushed, lifted, and threw objects. From these data, we estimated both the kinematics of the actions and the performer's individual body shape. To obtain consistent and inconsistent stimuli, we created animated avatars by combining the shape and motion estimates from either a single performer or from different performers. Using these stimuli we conducted three experiments in an immersive virtual reality environment. First, a group of participants detected which of two stimuli was inconsistent. Performance was very low and the results were only marginally significant. Next, a second group of participants rated the perceived attractiveness, eeriness, and humanness of consistent and inconsistent stimuli, but these judgements of animation characteristics were not affected by the consistency of the stimuli. Finally, a third group of participants rated properties of the objects rather than of the performers. Here, we found strong influences of shape-motion inconsistency on the perceived weight and thrown distance of objects. This suggests that the visual system relies on its knowledge of shape and motion and that these components are assimilated into an altered perception of the action outcome. We propose that the visual system attempts to resist inconsistent interpretations of human animations. Actions involving object manipulations present an opportunity for the visual system to reinterpret the introduced inconsistencies as a change in the dynamics of an object, rather than as an unexpected combination of body shape and body motion.
Material appearance is often represented by a bidirectional reflectance distribution function (BRDF). Although the concept of the BRDF is widely used in computer graphics and related applications, the number of actually captured BRDFs is limited due to a time- and resource-intensive measurement process. Several BRDF databases have already been made publicly available, yet the subjective properties of the underlying captured material samples, apart from single photographs, remain unavailable to users. In this paper we analyzed the material samples used in the creation of the UTIA BRDF database in a psychophysical study with ten subjects and assessed twelve visual, tactile, and subjective attributes of each sample. Further, we evaluated the relationship between these attributes and six material categories. We consider the presented perceptual analysis valuable, complementary information to the database; therefore, it has been made publicly available.
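For readers unfamiliar with the representation: a BRDF gives the ratio of reflected radiance to incoming irradiance for each pair of incoming and outgoing directions. The simplest case, an ideal diffuse (Lambertian) surface, has a constant BRDF. The sketch below illustrates only this textbook case and is unrelated to the measured UTIA data.

```python
import numpy as np

def lambertian_brdf(albedo):
    """Constant BRDF of an ideal diffuse surface: rho / pi,
    independent of the incoming and outgoing directions."""
    return albedo / np.pi

def reflected_radiance(albedo, light_radiance, n_dot_l):
    """Single-directional-light term of the rendering equation:
    L_o = f_r * L_i * cos(theta_i), with backfacing light clamped to zero."""
    return lambertian_brdf(albedo) * light_radiance * max(n_dot_l, 0.0)
```

Measured BRDFs such as those in the UTIA database replace the constant `lambertian_brdf` with a four-dimensional table of captured reflectance values.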
Visual salience can increase search efficiency in complex displays, but does that influence persist when a user is searching for a specific target? In two experiments, participants were asked to search web pages for the prices of specific products. Those products were located in areas of either higher or lower visual salience. In Experiment 1, participants were read the name of the product before searching; in Experiment 2, participants were shown an image of the exact product before searching. In both cases participants completed their search more quickly in the high-salience condition, even when there was no ambiguity about the visual characteristics of the product. Our findings suggest that salience guides users through complex displays even under realistic, goal-driven task conditions. Designers can use this knowledge to create interfaces that are easier to search by aligning salience with task-critical elements.