No case, for now
I recently purchased two commercial eye trackers: the Tobii EyeX and the Eye Tribe tracker, both $99. I'm excited about the possibilities for eye tracking as an input method and for telepresence, but the reality of these trackers is disappointing. The accuracy is so bad that they can't possibly be used as mouse replacements. To be large enough to hit reliably, a click target would have to be about 1/4 of the screen width and height; anything smaller would be too hard to hit.
These trackers are trying to solve a harder problem than they need to, though. They're several feet away from the eyes, and they have to deal with arbitrary head motion and variable lighting. For virtual reality, we can fix all of these problems by building eye tracking into a head-mounted display. The camera can be fixed to the head and mere inches from the user's eye, in a completely controlled lighting environment. Accuracy can be much better!
Having eye tracking in a head-mounted display is attractive for several other reasons. It helps solve the VR input problem by adding a built-in input method roughly equivalent to a mouse pointer. There are a lot of possibilities for interesting game mechanics driven by eye tracking. It also helps solve the face-covering problem: when you're wearing a head-mounted display, your facial expressions are hidden. Eye tracking records the hidden part of your face, and combined with external cameras can recover your entire facial expression so that you can have natural conversations with other people in VR.
So, after the disappointing performance of the commercial eye trackers, I decided to try making a VR eye tracker of my own. Here's how I did it:
This diagram shows the setup. The gray area is the light coming from the Rift screen toward the eye. An infrared camera is mounted on top of the Rift and looks down toward a mirror placed inside the Rift between the screen and the lens. This is a "hot" mirror, which is transparent to the eye but reflects infrared light. With this setup, the camera gets a perfect image of the eye through the Rift lens, but the camera is invisible to the eye. It's as if the camera is looking through the screen directly into the eye.
Here again is what the camera looks like installed on the Rift. The mirror is mounted inside, invisible.
And here's what the camera sees! Check out some video here. Remember that the camera is totally invisible to the eye while it is recording this; you see only the Rift screen. This image is nearly perfect for pupil tracking (the bright reflections seen here can be problematic, but I've fixed that in a newer version by moving the illumination off to the side).
You can make this yourself too! Here are all the parts I used:
- PS3 Eye camera
- M12 lens mount for PS3 Eye
- 8mm M12 mount lens
- Infrared filter plastic
- "hot mirror" (reflects infrared light, but transparent to visible light)
- Glass cutter
- Plastic snips
- Mounting putty
- Infrared LED
- Battery box
- Soldering equipment
Whew, that's a lot of stuff! Once you have all that, you can follow these steps:
- Disassemble the PS3 Eye.
- Replace lens mount.
- Cut a piece of the infrared filter to fit in the lens mount.
- Install 8mm lens.
- Disassemble Rift. You need to expose the screen, but you don't need to remove it.
- Cut a hole in the inner shell above the left lens.
- Cut a hole in the outer shell above the hole in the inner shell.
- Use the glass cutter to cut the hot mirror to fit in the Rift at a 45-degree angle.
- Use the mounting putty to mount the mirror inside the Rift and close up the inner shell (clean the screen first!).
- Use the mounting putty to mount the PS3 Eye on top of the Rift looking down through the hole at the mirror.
- Solder the infrared LED to the leads from the battery box.
- Mount the battery box to the Rift.
- Insert the LED through the air holes and mount it on the outside of the left lens, pointing at the eye.
- Fine-tune the mounting positions of the LED and camera, and the lens focus.
To go with this hardware I've written custom eye tracking software based on dark-pupil tracking. I'm using OpenCV for the debugging UI and for reading video files, but I've written the image processing from scratch in Halide and C. The PS3 Eye gives a 640x480 image at 75 FPS, and I'm currently extracting the pupil location accurate to 1/4 pixel in about 6ms per frame (which could definitely be improved).
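If you want a feel for what dark-pupil tracking involves, here's a rough sketch in plain Python. To be clear, this is not the Halide/C code described above. It just thresholds the dark pixels and takes a weighted centroid, which is one simple way to get a sub-pixel estimate. The function names, the threshold of 50, and the synthetic test frame are all made up for illustration.

```python
def pupil_centroid(image, threshold=50):
    """Estimate the pupil center in a grayscale frame (list of rows, 0-255).

    Pixels darker than `threshold` are treated as pupil (an assumed value,
    not one taken from the tracker described in the post). Darker pixels
    get more weight, so the result can fall between pixel centers.
    Returns (x, y) or None if no pixel is dark enough.
    """
    total_weight = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value < threshold:
                weight = threshold - value  # darker pixels count more
                total_weight += weight
                sum_x += weight * x
                sum_y += weight * y
    if total_weight == 0:
        return None
    return (sum_x / total_weight, sum_y / total_weight)


def make_test_frame(width, height, cx, cy, radius):
    """Hypothetical synthetic frame: bright background with a dark disc."""
    return [
        [10 if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 200
         for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    frame = make_test_frame(64, 48, cx=30, cy=20, radius=6)
    print(pupil_centroid(frame))
```

A real frame also contains eyelashes, shadows, and LED reflections, so a bare threshold like this would need extra filtering before it could be trusted.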
After all that, how does it perform? I'm happy to say that accuracy is pretty good: With careful calibration I can get ±2 pixel accuracy on the Rift display. However, there's a big caveat: the calibration is extremely sensitive to the exact position of the Rift. Any abrupt head motion, or even changes in facial expression like squinting, will jostle the Rift enough to mess up the calibration and put the tracking tens to hundreds of pixels off.
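To make the calibration step concrete, here's a minimal sketch of one common approach: show a few targets at known screen positions, record the pupil position for each, and fit an affine map from pupil coordinates to screen coordinates by least squares. This is an assumption about how a calibration could work, not a description of my actual calibration code, and every name and number in it is hypothetical.

```python
def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("calibration points are degenerate (collinear?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def fit_affine(pupil_points, screen_points):
    """Least-squares fit of screen = a*px + b*py + c, one row per screen axis."""
    rows = [(px, py, 1.0) for px, py in pupil_points]
    # Normal equations: (R^T R) coeffs = R^T targets
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in range(2):
        atb = [sum(r[i] * s[axis] for r, s in zip(rows, screen_points))
               for i in range(3)]
        coeffs.append(solve_3x3(ata, atb))
    return coeffs


def pupil_to_screen(coeffs, pupil_point):
    """Apply the fitted map to one pupil position (camera pixels)."""
    px, py = pupil_point
    return tuple(a * px + b * py + c for a, b, c in coeffs)


if __name__ == "__main__":
    # Made-up calibration samples: pupil position -> target position.
    pupil = [(300, 200), (340, 200), (300, 240), (340, 240), (320, 220)]
    screen = [(8 * x - 2100, 8 * y - 1200) for x, y in pupil]
    coeffs = fit_affine(pupil, screen)
    print(pupil_to_screen(coeffs, (310, 230)))
```

A map like this bakes in the camera's position relative to the eye. If the headset slips, every pupil coordinate shifts and the fitted gains scale that shift up on screen, which is one way to see why small slips can produce large errors.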
Can the calibration issues be fixed? I'm not sure yet. The obvious solution would be to add some tracking of head motion so it can be subtracted from eye motion. However, there's a lack of stable markers to use for tracking. The eyelids obviously move around a lot, but even the surrounding skin can move quite a bit as you change your facial expression. The inside corner of the eye next to the nose may be the best tracking point, but it's difficult to see that far to the side through the Rift lens, and it will be a lot more challenging to track accurately than the pupil.
Another possible solution is retina tracking. In my current setup the LED is far from the camera axis, producing a relatively normal-looking "dark pupil" image. If a light source is placed very close to the camera axis, the retina reflects light directly back at the camera, producing a "bright pupil" image. This is the cause of the well-known "red eye" phenomenon in flash pictures. It may be possible to exploit this by focusing the camera on the retina instead of the iris, and tracking the motion of the retina directly. I'm not sure how well this would work, but how cool would it be to take pictures of your own retina?
Yet another possible solution is corneal reflection tracking. The LED produces not just one, but several reflections off the surfaces of the cornea and lens of the eye, which appear as bright spots in the image. For pupil tracking these reflections just get in the way, but with corneal reflection tracking they can be exploited. By tracking the position of the reflections relative to each other and the pupil, the location and orientation of the eye can be found. This is how most commercial eye trackers work, and it may ultimately be the best option.
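Here's a rough sketch of the simplest version of that idea: find the pupil and a single bright reflection (a "glint"), and use the vector between them as the tracking feature instead of the raw pupil position. If the camera shifts a little relative to the eye, both points move together in the image, so the difference stays roughly the same. This is a toy illustration under heavy assumptions (one glint, clean thresholds, made-up names and values), not how any particular commercial tracker is implemented.

```python
def blob_centroid(image, is_member):
    """Unweighted centroid of all pixels for which is_member(value) is true."""
    count = 0
    sum_x = 0
    sum_y = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if is_member(value):
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def pupil_glint_vector(image, dark_threshold=50, bright_threshold=240):
    """Return pupil center minus glint center, or None if either is missing.

    Both thresholds are assumed values chosen for the synthetic frame below.
    """
    pupil = blob_centroid(image, lambda v: v < dark_threshold)
    glint = blob_centroid(image, lambda v: v > bright_threshold)
    if pupil is None or glint is None:
        return None
    return (pupil[0] - glint[0], pupil[1] - glint[1])


def make_frame(width, height, pupil_xy, glint_xy, pupil_radius=6):
    """Hypothetical frame: gray background, dark pupil disc, 3x3 bright glint."""
    frame = [[128] * width for _ in range(height)]
    px, py = pupil_xy
    for y in range(height):
        for x in range(width):
            if (x - px) ** 2 + (y - py) ** 2 <= pupil_radius ** 2:
                frame[y][x] = 10
    gx, gy = glint_xy
    for y in range(gy - 1, gy + 2):
        for x in range(gx - 1, gx + 2):
            frame[y][x] = 255
    return frame


if __name__ == "__main__":
    before = make_frame(64, 48, pupil_xy=(30, 20), glint_xy=(40, 20))
    # Same eye pose, but the whole image shifted by (5, 3) pixels,
    # as if the headset had slipped slightly.
    after = make_frame(64, 48, pupil_xy=(35, 23), glint_xy=(45, 23))
    print(pupil_glint_vector(before), pupil_glint_vector(after))
```

In real images there are several reflections plus noise, so picking out and matching individual glints is the hard part that this sketch skips entirely.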
I've had a lot of fun working on this project, and I intend to continue trying out new ideas. If you're interested in working on eye tracking for VR too, get in touch at firstname.lastname@example.org!