EYEPHONE TECHNOLOGY
ABSTRACT
EyePhone is a hands-free interfacing system for activating a mobile phone with the eyes, allowing the phone's functions to be driven easily. Phone functions are activated by blinking, and navigation-key functions are performed through eye movement. The principle behind EyePhone is eye tracking: no additional device needs to be placed on the eye, since tracking is done by following the movement of the pupil. Conventional implementations use the front camera to sense eye movement, while some modern phones use dedicated sensors to track the eyes. The details, design, and working of EyePhone are presented in the report below.
TABLE OF CONTENTS
NUMBER TITLE
1. INTRODUCTION
2. EyePhone Technology
3. Human-Phone Interaction
What is eye tracking?
Eye tracking technology
4. EyePhone Design
An eye detection phase
An open eye template creation phase
An eye tracking phase
A blink detection phase
5. Evaluation
Daylight Exposure Analysis for a Stationary Subject
Artificial Light Exposure for a Stationary Subject
6. Applications
7. Conclusion
8. References
INTRODUCTION
As smartphones evolve, researchers are studying new techniques to ease the human-mobile interaction. We propose EyePhone, a novel "hand-free" interfacing system capable of driving mobile applications/functions using only the user's eye movements and actions (e.g., a wink). EyePhone tracks the user's eye movement across the phone's display using the camera mounted on the front of the phone; more specifically, machine learning algorithms are used to: i) track the eye and infer its position on the mobile phone display as a user views a particular application; and ii) detect eye blinks that emulate mouse clicks to activate the target application under view. We present a prototype implementation of EyePhone on a Nokia N810, which is capable of tracking the position of the eye on the display and mapping this position to an application that is activated by a wink. At no time does the user have to physically touch the phone display.
Eye phone Technology
EyePhone is a novel "hand-free" interfacing system capable of driving mobile applications/functions using only the user's eye movements and actions (e.g., a wink). It tracks the user's eye movement across the phone's display using the camera mounted on the front of the phone: it infers the eye's position on the display as the user views a particular application, and it detects eye blinks that emulate mouse clicks to activate the target application under view. EyePhone is the first system capable of tracking a user's eye and mapping its current position on the display to a function/application on the phone using the phone's front-facing camera. It allows the user to activate an application by simply "blinking at the app", emulating a mouse click, so at no time does the user have to physically touch the phone display. While other interfaces, such as voice recognition, could also be used in a hand-free manner, we focus on exploiting the eye as a driver of the HPI. We believe EyePhone technology is an important alternative to voice activation systems based on voice recognition, since the performance of a voice recognition system tends to degrade in noisy environments.
The front camera is the only requirement in EyePhone. Most smartphones today are equipped with a front camera, and we expect many more will be introduced in the future (e.g., Apple iPhone 4G) in support of video conferencing on the phone. The EyePhone system uses machine learning techniques that, after detecting the eye, create a template of the open eye and use template matching for eye tracking. Correlation matching is exploited for eye wink detection. We implement EyePhone on the Nokia N810 tablet and present experimental results in different settings. These initial results demonstrate that EyePhone is capable of driving the mobile phone.
Human-Computer Interaction (HCI)
“HCI (human-computer interaction) is the study of how people interact with computers and to what extent computers are or are not developed for successful interaction with human beings”. Most HCI technology addresses the interaction between people and computers in “ideal” environments, i.e., where people sit in front of a desktop machine with specialized sensors and cameras centered on them.
Human-Phone Interaction (HPI)
Human-Computer Interaction (HCI) researchers and phone vendors are continuously searching for new approaches to reduce the effort users exert when accessing applications on limited form factor devices such as mobile phones.
Human-phone interaction (HPI) introduces challenges not typically found in HCI research, more specifically related to the phone and how we use it. In order to address these goals, HPI technology should be less intrusive; that is,
• it should not rely on any external devices other than the mobile phone itself;
• it should be readily usable with minimum user dependency as possible;
• it should be fast in the inference phase;
• it should be lightweight in terms of computation;
• it should preserve the phone user experience, e.g., it should not deplete the phone battery over normal operations.
HUMAN-PHONE INTERACTION
Human-Phone Interaction represents an extension of the field of HCI since HPI presents new challenges that need to be addressed specifically driven by issues of mobility, the form factor of the phone, and its resource limitations (e.g., energy and computation). More specifically, the distinguishing factors of the mobile phone environment are mobility and the lack of sophisticated hardware support, i.e., specialized headsets, overhead cameras, and dedicated sensors, that are often required to realize HCI applications. In what follows, we discuss these issues.
Mobility Challenges:
One of the immediate products of mobility is that a mobile phone is moved around through unpredicted contexts, i.e., situations and scenarios that are hard to foresee during the design phase of an HPI application. A mobile phone is subject to uncontrolled movement, i.e., people interact with their mobile phones while stationary, on the move, etc. It is almost impossible to predict how and where people are going to use their mobile phones. An HPI application should be able to operate reliably in any encountered condition. Consider the following examples:
Consider two HPI applications, one using the accelerometer, the other relying on the phone's camera. Imagine exploiting the accelerometer to infer some simple gestures a person can perform with the phone in their hands, e.g., shake the phone to initiate a phone call, or tap the phone to reject a phone call. What is challenging is being able to distinguish between the gesture itself and any other action the person might be performing. For example, if a person is running, or if a user tosses their phone down on a sofa, a sudden shake of the phone could produce signatures that are easily confused with a gesture. There are many examples where a classifier could be confused in this way, and in response an erroneous action could be triggered on the phone. Similarly, if the phone's camera is used to infer a user action, it becomes important to make the inference algorithm operating on the video captured by the camera robust against lighting conditions, which can vary from place to place. In addition, video frames blur due to phone movement. Because HPI application developers cannot assume any optimal operating conditions (i.e., users operating in some idealized manner, such as requiring a user to stop walking or running before initiating a phone call by a shaking movement), the effects of mobility must be taken into account in order for the HPI application to be reliable and scalable.
Hardware Challenges:
As opposed to HCI applications, an HPI implementation should not rely on any external hardware. Asking people to carry or wear additional hardware in order to use their phone might reduce the penetration of the technology. Moreover, state-of-the-art HCI hardware, such as glass-mounted cameras or dedicated helmets, is not yet small enough to be comfortably worn for long periods of time. Any HPI application should rely as much as possible on just the phone's on-board sensors. Although modern smartphones are becoming more computationally capable, they are still limited when running complex machine learning algorithms. HPI solutions should adopt lightweight machine learning techniques to run properly and energy efficiently on mobile phones.
What is eye tracking?
Eye tracking refers simply to recording eye movements whilst a participant examines a visual stimulus (Collewijn, 1991). Accurate eye tracking must account for both the position of the head and the position of the eyes relative to the head. The earliest eye trackers were mechanical devices that must have caused participants great discomfort due to their size and invasiveness (see Collewijn, 1991, and Hayhoe & Ballard, 2005, for descriptions of various techniques). The technology has progressed significantly since then, such that systems are now available that do not require headgear or any physical attachments to be worn by the participant.
Two important aspects of eye tracking are calibrating the system to specific participants and managing eye drift, via drift correction. Calibration normally involves participants looking at an image (e.g., a dot or a fixation cross) in a known location. The eye tracking system compares the true location of the image to where it detects the participant's gaze on the screen, and applies a suitable correction for future fixations. Drift correction measures how much the difference between a participant's gaze and a central point “drifts” over a short time period. Drift can occur because of factors such as fatigue and changes in body (head) position. Moreover, the longer a viewing session, the more drift that occurs, and thus the less precise the gaze recording.
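The drift-correction idea above can be sketched in a few lines: the participant fixates a known central point, the offset between the measured gaze and that point is taken as the drift, and that offset is subtracted from subsequent fixations. The centre point and coordinates below are illustrative, not from any particular tracker.

```python
# Sketch of drift correction (illustrative values).
CENTRE = (400, 300)  # known on-screen location of the central fixation point

def drift_offset(measured_gaze):
    """Drift = measured gaze minus the true central point."""
    return (measured_gaze[0] - CENTRE[0], measured_gaze[1] - CENTRE[1])

def correct(fixation, offset):
    """Subtract the drift offset from a subsequently recorded fixation."""
    return (fixation[0] - offset[0], fixation[1] - offset[1])

off = drift_offset((410, 288))   # gaze has drifted +10, -12 pixels
print(correct((210, 512), off))  # corrected fixation: (200, 524)
```

In practice, drift correction is repeated periodically, since the text notes that drift accumulates over the course of a viewing session.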
Fixation duration, frequency, location and sequencing are the primary measures of visual behaviour used to study face processing. Fixation duration provides an index of the speed with which information is processed. Increasing fixation duration is associated with tasks that require more detailed visual analysis (e.g., Xiaohua, C., & Liren, 2007). Frequency of fixations often serves as a measure of sampling quantity. Fixation location and sequencing provide information regarding the regions of the face to which participants are attending and information about the order in which stimulus properties are sampled.
EYE TRACKING TECHNOLOGY
Eye tracking is a technique whereby an individual's eye movements are measured so that the researcher knows both where a person is looking at any given time and the sequence in which their eyes are shifting from one location to another. Tracking people's eye movements can help HCI researchers understand visual and display-based information processing and the factors that may impact upon the usability of system interfaces. In this way, eye movement recordings can provide an objective source of interface-evaluation data that can inform the design of improved interfaces. Eye movements can also be captured and used as control signals to enable people to interact with interfaces directly without the need for mouse or keyboard input, which can be a major advantage for certain populations of users such as disabled individuals. We begin this section with an overview of eye-tracking technology, and progress toward a detailed discussion of the use of eye tracking in HCI and usability research. A key element of this discussion is to provide a practical guide to inform researchers of the various eye-movement measures that can be taken, and the way in which these metrics can address questions about system usability. We conclude by considering the future prospects for eye-tracking research in HCI and usability testing.
Most commercial eye-tracking systems available today measure point-of-regard by the “corneal-reflection/pupil centre” method. These kinds of trackers usually consist of a standard desktop computer with an infrared camera mounted beneath (or next to) a display monitor, with image processing software to locate and identify the features of the eye used for tracking. In operation, infrared light from an LED embedded in the infrared camera is first directed into the eye to create strong reflections in target eye features to make them easier to track (infrared light is used to avoid dazzling the user with visible light). The light enters the retina and a large proportion of it is reflected back, making the pupil appear as a bright, well-defined disc (known as the “bright pupil” effect). The corneal reflection (or first Purkinje image) is also generated by the infrared light, appearing as a small, but sharp, glint (see Figure 1).
Figure 1. Corneal reflection and bright pupil as seen in the infrared camera image.
Once the image processing software has identified the centre of the pupil and the location of the corneal reflection, the vector between them is measured, and, with further trigonometric calculations, point-of-regard can be found. Although it is possible to determine approximate point-of-regard by the corneal reflection alone (as shown in Figure 2), by tracking both features, eye movements can, critically, be disassociated from head movements.
Figure 2. Corneal reflection position changing according to point of regard (cf. Redline & Lankford, 2001).
Video-based eye trackers need to be fine-tuned to the particularities of each person's eye movements by a “calibration” process. This calibration works by displaying a dot on the screen; if the eye fixates for longer than a certain threshold time and within a certain area, the system records that pupil-centre/corneal-reflection relationship as corresponding to a specific x,y coordinate on the screen.
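As a rough illustration of this calibration step, the sketch below fits an independent linear mapping per screen axis, from pupil-centre/corneal-reflection vectors to screen coordinates, using a few calibration dots. Real trackers typically fit richer (often polynomial) models per eye; the function names and all numbers here are made up for demonstration.

```python
# Sketch of eye-tracker calibration as a per-axis least-squares linear fit.
# Each calibration sample pairs a pupil-centre/corneal-reflection vector
# (vx, vy) with the known on-screen dot position (sx, sy).

def fit_axis(vs, ss):
    """Least-squares fit of s = a*v + b for one screen axis."""
    n = len(vs)
    mean_v = sum(vs) / n
    mean_s = sum(ss) / n
    var = sum((v - mean_v) ** 2 for v in vs)
    cov = sum((v - mean_v) * (s - mean_s) for v, s in zip(vs, ss))
    a = cov / var
    b = mean_s - a * mean_v
    return a, b

def calibrate(samples):
    """samples: list of ((vx, vy), (sx, sy)) calibration pairs."""
    ax, bx = fit_axis([v[0] for v, _ in samples], [s[0] for _, s in samples])
    ay, by = fit_axis([v[1] for v, _ in samples], [s[1] for _, s in samples])
    def gaze_to_screen(v):
        return (ax * v[0] + bx, ay * v[1] + by)
    return gaze_to_screen

# Example: calibrate on three dots, then map any new gaze vector to the screen.
samples = [((0.0, 0.0), (400, 300)),   # centre dot
           ((-2.0, 0.0), (100, 300)),  # left dot
           ((2.0, 1.5), (700, 600))]   # bottom-right dot
to_screen = calibrate(samples)
```

After calibration, `to_screen` converts every subsequent gaze vector into an estimated screen coordinate, which is exactly the x,y relationship the calibration procedure records.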
EyePhone Design
The EyePhone design consists of four phases:
o An eye detection phase
o An open eye template creation phase
o An eye tracking phase
o A blink detection phase
Eye Detection
This phase finds the contour of the eyes by applying a motion analysis technique that operates on consecutive frames. The eye pair is identified by the left and right eye contours. The original algorithm identifies the eye pair with almost no error when running on a desktop computer with a fixed camera (see the left image in Figure 1).
Figure 1: Left figure: example of eye contour pair returned by the original algorithm running on a desktop with a USB camera. The two white clusters identify the eye pair. Right figure: example of number of contours returned by EyePhone on the Nokia N810. The smaller dots are erroneously interpreted as eye contours.
We obtain errors when the algorithm is implemented on the phone due to the quality of the N810 camera compared to the one on the desktop and the unavoidable movement of the phone while in a person’s hand (refer to the right image in Figure 1). Based on these experimental observations, we modify the original algorithm by:
i) reducing the image resolution, which, according to the original authors, reduces the eye detection error rate, and
ii) adding two more criteria to the original heuristics to filter out false eye contours. In particular, we keep only the contours whose width and height in pixels satisfy widthmin ≤ width ≤ widthmax and heightmin ≤ height ≤ heightmax, and filter out the rest. The widthmin, widthmax, heightmin, and heightmax thresholds, which identify the possible sizes of a true eye contour, are determined under various experimental conditions (e.g., bright, dark, moving, not moving) and with different people.
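The size-based filtering criterion can be sketched as follows. The threshold values below are placeholders, since the text says they are determined experimentally, and the contour representation is simplified to bounding boxes.

```python
# Sketch of the extra contour-filtering heuristic: keep only contours whose
# bounding-box size is plausible for a real eye.
WIDTH_MIN, WIDTH_MAX = 20, 80     # pixels (illustrative thresholds)
HEIGHT_MIN, HEIGHT_MAX = 10, 50   # pixels (illustrative thresholds)

def plausible_eye_contours(contours):
    """contours: list of (x, y, width, height) bounding boxes."""
    return [c for c in contours
            if WIDTH_MIN <= c[2] <= WIDTH_MAX
            and HEIGHT_MIN <= c[3] <= HEIGHT_MAX]

boxes = [(5, 5, 3, 2),       # tiny noise blob (like the small dots in Figure 1): rejected
         (40, 60, 45, 25),   # left-eye-sized contour: kept
         (120, 58, 50, 22)]  # right-eye-sized contour: kept
print(plausible_eye_contours(boxes))  # keeps only the two eye-sized boxes
```

This is exactly the kind of cheap, per-contour check that removes the spurious small contours shown in the right image of Figure 1 without adding meaningful computation.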
Open Eye Template Creation
While the original algorithm adopts an online open eye template creation, extracting the template every time the eye pair is lost (which can happen because of lighting changes or movement in the case of a mobile device), EyePhone does not rely on the same strategy. The reduced computation speed compared to a desktop machine and the restricted battery budget of the N810 dictate a different approach. EyePhone creates a template of a user's open eye once, at the beginning, when a person uses the system for the first time, using the eye detection algorithm described above. The template is saved in the persistent memory of the device and fetched when EyePhone is invoked. By taking this simple approach, we drastically reduce the runtime inference delay of EyePhone, the application memory footprint, and the battery drain. The downside of this off-line template creation approach is that a template created in certain lighting conditions might not be perfectly suitable for other environments. We intend to address this problem as part of our future work. In the current implementation the system is trained individually, i.e., the eye template is created by each user when the application is used for the first time. In the future, we will investigate eye template training that relies on pre-collected data from multiple individuals. With this supervised learning approach, users could readily use EyePhone without going through the initial eye template creation phase.
Eye Tracking:
The eye tracking algorithm is based on template matching.
Figure 2: Eye capture using the Nokia N810 front camera running the EyePhone system. The inner white box surrounding the right eye is used to discriminate the nine positions of the eye on the phone’s display. The outer box encloses the template matching region.
The template matching function calculates a correlation score between the open eye template, created the first time the application is used, and a search window. In order to reduce the computation time of the template matching function and save resources, the search window is limited to a region twice the size of a box enclosing the eye. These regions are shown in Figure 2, where the outer box around the eye encloses the region where the correlation score is calculated. The correlation coefficient we rely on is the normalized correlation coefficient, which is often used in template matching problems and ranges between -1 and 1. From our experiments, this coefficient guarantees better performance than the one used in the original algorithm. If the normalized correlation coefficient exceeds 0.4, we conclude that there is an eye in the search window. This threshold has been verified to be accurate by means of multiple experiments under different conditions.
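For illustration, the normalized correlation coefficient can be computed over two equally sized pixel patches as below (OpenCV's `matchTemplate` with `TM_CCOEFF_NORMED` computes the same score over a sliding window). The toy patches are invented; only the 0.4 threshold comes from the text.

```python
import math

# Sketch of the normalized correlation coefficient used for template matching.
def norm_corr(template, window):
    """Normalized correlation between two equally sized flat pixel lists."""
    n = len(template)
    mt = sum(template) / n
    mw = sum(window) / n
    num = sum((t - mt) * (w - mw) for t, w in zip(template, window))
    den = math.sqrt(sum((t - mt) ** 2 for t in template) *
                    sum((w - mw) ** 2 for w in window))
    return num / den  # ranges between -1 and 1

EYE_THRESHOLD = 0.4  # from the text: above this, an eye is deemed present

template = [10, 200, 200, 10]  # toy open-eye patch (dark-bright-bright-dark)
good = [12, 190, 205, 8]       # similar patch: score close to 1
bad = [200, 10, 10, 200]       # inverted patch: negative score
print(norm_corr(template, good) > EYE_THRESHOLD)   # True
print(norm_corr(template, bad) > EYE_THRESHOLD)    # False
```

Because the coefficient is normalized by the patch variances, it is insensitive to uniform brightness changes, which is one reason this score is preferred for matching under varying lighting.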
Blink Detection
To detect blinks we apply a thresholding technique to the normalized correlation coefficient returned by the template matching function, as suggested in prior work. However, our algorithm differs from the one originally proposed, where a single threshold T is introduced: the eye is deemed open if the correlation score is greater than T, and closed otherwise. In the EyePhone system, we have two additional situations to deal with: the quality of the phone's camera is not the same as that of a good USB camera, and the phone's camera is generally closer to the person's face than in the desktop-and-USB-camera case. Because of this latter situation, the camera can pick up iris movements, i.e., movements of the interior of the eye due to eyeball rotation.
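A minimal sketch of the single-threshold variant described above (EyePhone's refinement for the phone camera is not detailed here) counts a blink as an open-to-closed transition in the stream of per-frame correlation scores. The threshold value and the score sequence are illustrative.

```python
# Sketch of single-threshold blink detection over per-frame matching scores.
T = 0.6  # correlation above T -> eye considered open (illustrative value)

def count_blinks(scores):
    """Count open->closed transitions in a sequence of correlation scores."""
    blinks = 0
    prev_open = True
    for s in scores:
        is_open = s > T
        if prev_open and not is_open:  # eye just closed: start of a blink
            blinks += 1
        prev_open = is_open
    return blinks

scores = [0.9, 0.85, 0.3, 0.25, 0.88, 0.9, 0.2, 0.87]  # two dips below T
print(count_blinks(scores))  # 2
```

The iris-movement problem the text describes shows up here as scores that dip without the eye actually closing, which is why a single fixed T is fragile on a phone held close to the face.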
Table 1: EyePhone average eye tracking accuracy for different positions of the eye in different lighting and movement conditions and blink detection average accuracy. Legend: DS = eye tracking accuracy measured in daylight exposure and being steady; AS = eye tracking accuracy measured in artificial light exposure and being steady; DM = eye tracking accuracy measured in daylight exposure and walking; BD = blink detection accuracy in daylight exposure
Figure 3: Eye tracking accuracy for the middlecenter position as a function of different distances between the phone and the eyes when the person is steady and walking.
EVALUATION
In this section, we discuss initial results from the evaluation of the EyePhone prototype. We implement EyePhone on the Nokia N810 [19]. The N810 is equipped with a 400 MHz processor and 128 MB of RAM. The N810 operating system is Maemo 4.1, a Unix-based platform on which we can install both the C OpenCV (Open Source Computer Vision) library [20] and our EyePhone algorithms, which are cross-compiled on the Maemo scratchbox. To intercept the video frames from the camera we rely on GStreamer [21], the main multimedia framework on Maemo platforms. In what follows, we first present results relating to average accuracy for eye tracking and blink detection under different lighting and user movement conditions, to show the performance of EyePhone under different experimental conditions. We also report system measurements, such as CPU and memory usage, battery consumption, and computation time when running EyePhone on the N810. All experiments are repeated five times and average results are shown.
Daylight Exposure Analysis for a Stationary Subject
The first experiment shows the performance of EyePhone when the person is exposed to bright daylight, i.e., in a bright environment, and is stationary. The eye tracking results are shown in Figure 2. The inner white box in each picture, which is a frame taken from the front camera when the person is looking at the N810 display while holding the device in their hand, represents the eye position on the phone display. It is evident that nine different positions for the eye are identified. These nine positions of the eye can be mapped to nine different functions and applications, as shown in Figure 4. Once the eye locks onto a position (i.e., the person is looking at one of the nine buttons on the display), a blink, acting as a mouse click, launches the application corresponding to the button. The accuracy of the eye tracking and blink detection algorithms is reported in Table 1. The results show we obtain good tracking accuracy of the user's eye. However, the blink detection algorithm's accuracy oscillates between 67% and 84%. We are studying further improvements in blink detection as part of future work.
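The nine-position-to-application mapping just described can be sketched as a 3x3 lookup gated by blink detection. The application names below are invented for illustration; the source only states that nine positions map to nine functions or applications.

```python
# Sketch of mapping the nine detected eye positions to a 3x3 grid of
# applications, launched on a blink (app names are made up for illustration).
APPS = [["Phone", "Messages", "Camera"],
        ["Email", "Browser", "Maps"],
        ["Music", "Clock", "Settings"]]

def app_under_gaze(row, col):
    """The application at the grid cell the eye is locked onto."""
    return APPS[row][col]

def on_event(row, col, blinked):
    """Return the app to launch, or None if the user only looked."""
    return app_under_gaze(row, col) if blinked else None

print(on_event(1, 1, blinked=False))  # None: gaze alone only highlights
print(on_event(1, 1, blinked=True))   # "Browser": blink acts as a mouse click
```

Separating "gaze selects" from "blink activates" is what prevents the interface from launching an application every time the eye merely passes over a button.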
Impact of Distance Between Eye and Tablet.
Since in the current implementation the open eye template is created once at a fixed distance, we evaluate the eye tracking performance when the distance between the eye and the tablet varies while using EyePhone. We carry out the measurements for the middle-center position on the display (similar results are obtained for the remaining eight positions) when the person is steady and walking. The results are shown in Figure 3. As expected, the accuracy degrades for distances larger than 18-20 cm (the distance between the eye and the N810 we currently use during the eye template training phase). The accuracy drop becomes severe when the distance is made larger (e.g., about 45 cm). These results indicate that research is needed to design eye template training techniques that are robust against distance variations between the eyes and the phone.
System Measurements.
In Table 2 we report the average CPU usage, RAM usage, battery consumption, and computation time of the EyePhone system when processing one video frame (the N810 camera is able to produce up to 15 frames per second). EyePhone is quite lightweight in terms of CPU and RAM needs. The computation takes 100 msec/frame, which is the delay between two consecutive inference results. In addition, EyePhone runs only when the eye pair is detected, implying that the phone's resources are used only when a person is looking at the phone's display and remain free otherwise. The battery drain of the N810 when running EyePhone continuously for three hours is shown in the 4th column of Table 3. Although this is not a realistic use case, since a person does not usually interact with their phone for three continuous hours, the result indicates that the EyePhone algorithms need to be further optimized to extend the battery life as much as possible.
Artificial Light Exposure For A Stationary Subject
In this experiment, the person is again not moving, but is in an artificially lit environment (i.e., a room with very low daylight penetration from the windows). We want to verify whether different lighting conditions impact the system's performance. The results, shown in Table 1, are comparable to the daylight scenario in a number of cases. However, the accuracy drops: given the poorer lighting conditions, the eye tracking algorithm fails to locate the eyes more frequently.
Daylight Exposure for a Person Walking
We carried out an experiment where a person walks outdoors in a bright environment to quantify the impact of the phone's natural movement, that is, the shaking of the phone in the hand induced by the person's gait. We anticipate a drop in the accuracy of the eye tracking algorithm because of the phone movement. This is confirmed by the results shown in Table 1, column 4. Further research is required to make the eye tracking algorithm more robust when a person is using the system on the move.
ADVANTAGES
· Simple to use
· Hands-free interfacing system
DISADVANTAGES
o Light dependent
o Lower accuracy at night
APPLICATIONS
EyeMenu:
An example of an EyePhone application is EyeMenu, as shown in the figure below. EyeMenu is a way to shortcut access to some of the phone's functions. The set of applications in the menu can be customized by the user. The idea is the following: the position of a person's eye is mapped to one of nine buttons. A button is highlighted when EyePhone detects the eye in the position mapped to that button. If the user blinks, the application associated with the button is launched. Driving the mobile phone user interface with the eyes can be used as a way to facilitate interaction with mobile phones, or in support of people with disabilities.
Car Driver Safety:
EyePhone could also be used to detect driver drowsiness and distraction in cars. While car manufacturers are developing technology to improve driver safety by detecting drowsiness and distraction using dedicated sensors and cameras, EyePhone could readily be used for the same purpose even in low-end cars, simply by clipping the phone onto the car dashboard.
Phone face detection (Smart Stay):
The Smart Stay feature in Samsung Galaxy series phones uses the front camera to detect when you are looking at your device, so that the screen stays on regardless of the screen timeout setting, and it auto-adjusts the screen for more comfortable use. Smart Stay works on the same principle as EyePhone: by detecting whether you are looking at the phone, it keeps the screen on while you are using it and turns it off when you are not.
Fig: Activating smart stay function in Mobile phones.
Smart Stay is a feature on the Samsung Galaxy S4 and S5 series phones.
Conclusion
In this paper, we have focused on developing an HPI technology using solely one of the phone's growing number of on-board sensors, i.e., the front-facing camera. We presented the implementation and evaluation of the EyePhone prototype. EyePhone relies on eye tracking and blink detection to drive a mobile phone user interface and activate different applications or functions on the phone. Although preliminary, our results indicate that EyePhone is a promising approach to driving mobile applications in a hand-free manner.