Eye-tracking is a missing input device for VR experiences

(mlajtos.mu)

134 points | by mlajtos 1007 days ago

33 comments

Findecanor 1005 days ago
I once tried a few examples of eye-tracking as an input device at a visit to Tobii some years ago. One of these was an eye-tracking shooting game. It felt completely wrong and unnatural.
My eyes are my input devices. If I have to constrain my gaze and look only at the object to control then I would have to strain myself to perceive the blurry objects outside the centre of my vision just to be able to see those objects at all.
IMHO, eye tracking should only be applied to make the experience feel more natural, not the opposite. For example: in combination with gesture controls to find the object that the gesture is intended for. Or in VR headsets to render at highest resolution only where the user is watching, to keep the FPS rate up.
[-]
- psychstudio 1005 days ago
  I used to work in EEG and eye tracking software (for neuro-motor rehabilitation and control of robotic arms). Using gaze for fine control is very difficult and terribly unnatural due to saccades and general scene scanning. One easy way to demonstrate how difficult it is is to have a cursor appear at the point of gaze. Your vision "chases" the cursor away from your target like it does a floater. Controlling anything directly with gaze is very very hard.
  [-]
  - jcims 1005 days ago
    I always find it odd how smoothly the eye can track something but voluntary movement is herky jerky.
    [-]
    - formerly_proven 1005 days ago
      Ah you see these are different “hardware features”
      https://en.m.wikipedia.org/wiki/Smooth_pursuit
  - MichaelCollins 1005 days ago
    > One easy way to demonstrate how difficult it is is to have a cursor appear at the point of gaze. Your vision "chases" the cursor away from your target like it does a floater.
    That sounds like a slight miscalibration of the eye tracker. Assuming perfect eye control, if the tracker estimates your gaze slightly incorrectly then the cursor will always move away when you try to look at it.
    [-]
    - brtkdotse 1005 days ago
      > Assuming perfect eye control
      No such thing. The eye does constant micro movements and the brain filters this out. You can compensate for this in software but it lowers the target accuracy.
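A minimal sketch of the trade-off described above, assuming the software compensation is a simple exponential moving average (one common choice, not necessarily what any particular tracker does). `GazeSmoother` and `alpha` are hypothetical names for illustration. A smaller `alpha` suppresses more jitter, but the smoothed point then lags behind real gaze shifts, which is one way accuracy on the target suffers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class GazeSmoother:
    """Exponential moving average over 2D gaze samples (hypothetical helper)."""

    alpha: float  # in (0, 1]; smaller values are steadier but lag more
    x: Optional[float] = None
    y: Optional[float] = None

    def update(self, sample_x: float, sample_y: float) -> Tuple[float, float]:
        if self.x is None or self.y is None:
            # First sample: nothing to average with yet.
            self.x, self.y = float(sample_x), float(sample_y)
        else:
            # Move a fraction `alpha` of the way toward the new sample.
            self.x += self.alpha * (sample_x - self.x)
            self.y += self.alpha * (sample_y - self.y)
        return self.x, self.y


smoother = GazeSmoother(alpha=0.5)
smoother.update(0, 0)
# After a sudden gaze shift the estimate only moves part of the way there.
print(smoother.update(10, 0))
```

With `alpha=0.5` the estimate covers half the remaining distance per sample, so a real jump in gaze takes several samples to register fully.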
  - dopu 1005 days ago
    It’s no wonder that it takes us many months of training to get monkeys to perform simple saccade-based tasks.
- atoav 1005 days ago
  I mean ideally VR would use eye tracking for:
  - setting the focus plane of the virtual cameras, this way when you look at a close virtual object the far plane blurs as it would naturally. This is of course dependent on the brightness of the screen and picture. But any blurring will already feel better than no blurring at all
  - interactive stories can make use of knowing where you looked at and for how long. This must not necessarily be very obvious
  - virtual NPCs can react to you gazing at or past them
- modeless 1005 days ago
  Tobii's UI stuff is lame. But it is possible to make good UI for eye tracking. It requires rethinking how UI should work.
  What everyone has been doing so far is like slapping a touchscreen on a desktop operating system. Doesn't work! You need to redesign all the interactions for touch, like the iPhone did. Change how scrolling works, how selection works, how menus work. Redo the foundations. Eye tracking requires the same treatment.
  Touchscreens sucked before the iPhone. Eye tracking can have an iPhone moment too, but someone needs to spend years rethinking everything first.
  [-]
  - roberttod 1005 days ago
    Is that conjecture, or are there UIs you have in mind that work?
    I made an eye tracking UI for my masters thesis, with a few different components. Although they all worked, having my eyes tracked to control the UI felt very intense, it caused a lot of mental and physical strain.
    I'm not saying that my thesis is evidence it absolutely can't be done, but it was surprisingly uncomfortable even for what seemed like basic functionality, for example autofocusing the input boxes you are looking at. But maybe there are tweaks that can make it work. I think the main issue is the screen giving feedback on where you happen to be looking, which gives this unnatural feeling that where you look is causing side effects.
    [-]
    - modeless 1005 days ago
      I'd be interested to see your masters thesis! I made my own rudimentary eye tracker many years ago [1] which got me a job at a startup called Eyefluence, where I made a much better eye tracker. The deep learning revolution was very new at the time, and this was possibly the first deep learning based eye tracker.
      We did some really cool experimentation with eye controlled UI, in some ways more advanced than anything I've seen since. We were acquired by Google and I believe the technology is now shelved. But I still think that there is a path to a good eye controlled UI. Comfort is a matter of knowing the constraints of the eye and designing to them.
      https://james.darpinian.com/blog/eye-tracker
      [-]
      - roberttod 1004 days ago
        This is really cool, the hardware for this is far in advance of what I was using. I was using the classic webcam type setup, with the 1/4 inch accuracy you alluded to in your post.
        My thesis focused more on the UI components than the hardware. I had my gaze set up to appear as a cursor input, and used some CSS to hide the cursor on a webpage. Then I used hover effects to open menus, focus inputs etc.
        My question was, could it replace the mouse on a desktop? And I wanted to build something for anyone to use, not as an accessibility input. I used eye gaze with the spacebar on the keyboard as a primary input action.
        The components had large targets, so the accuracy didn't matter too much. I used some basic components to build an email UI, which worked purely through gaze and the keyboard.
        The UI was perfectly functional, but it really drained me/others to use it. Possible it could be due to accuracy, or some of the UI component design. My gut feeling was that the UI reacting to my eyes was the real problem though, there was a strange feeling knowing that you needed to look at certain components to use them. The way your eye works, it wants to jump around to whatever is interesting, and having a UI that needs the eye to look at a certain place isn't pleasant.
        I think any UI that wants to utilize the eye would have to be very subtle, and designed not to feel restrictive to your focus. I'm not convinced that the mouse/your fingers could be replaced by eye tracking but for rendering higher res in VR goggles and that sort of thing makes a lot of sense.
      - mlajtos 1005 days ago
        Really cool solution with the hot mirror! I always wondered if we will end up with cameras behind microdisplays. Some phones are already doing this, but doing that for VR is probably years away.
        The point you made about rethinking all UI interactions for eye-tracking: bull’s eye! Have you done some work along these lines in Eyefluence? I think the fruit company will gladly introduce a new UI paradigm. They did some impressive stuff on improving eye-tracking accuracy via refining synthetic images to real-looking ones with GANs. [0]
        [0] https://machinelearning.apple.com/research/gan
        [-]
        modeless 1005 days ago
        Yes, rethinking UI for eye tracking is what Eyefluence was working on. And in fact we showed Apple all of our stuff before the acquisition. I believe we were in acquisition talks with them as well as Google. I spent a lot of time on the Apple campus tweaking our neural nets to their liking. This was prior to their publishing of that eye tracking work.
        [-]
        mlajtos 1005 days ago
        Is there any public record of what Eyefluence had done? Some demos maybe? I would really like to see that.
        [-]
        modeless 1005 days ago
        There is actually an older demo still on YouTube: https://www.youtube.com/watch?v=TYcrQswVcnA&t=10s (full presentation: https://www.youtube.com/watch?v=iCZLll1l92g)
        The technique called "dual gaze" in the article has some similarity to some of the stuff we were doing. This was long before that paper was published, and I think there were several aspects of our design that were better than the one in that paper.
        [-]
        mlajtos 1004 days ago
        Holy shit, that looks like magic! The slide to confirm interaction utilizing smooth pursuit was really nice touch. I added the extended demo to the article.
        You say the dual gaze worked similarly to your implementation, however I don't see any confirmation flags. I really can't figure out how this works. :) Is the tutorial available somewhere please?
        [-]
        modeless 1004 days ago
        Thanks! I should say I didn't have much to do with that specific demo; I mostly worked on reimplementing everything from the ground up for VR. I don't think there's any good video of that system, but there are some accounts in the press from journalists trying it. (Any journalist that tried it was using my VR system. We didn't let journalists try the AR version because the calibration wasn't reliable enough on new people, but the deep learning based tracking of the VR version was more reliable).
        As far as exactly how it works, I probably shouldn't go around disclosing all the secret sauce. After all, Google bought it fair and square. AFAIK there's nobody left in Google VR that knows much about it, but I haven't worked there for many years so I don't know the current state of things.
        [-]
        mlajtos 1003 days ago
        Ok, I read 20+ articles describing the VR demo you worked on. Some of them explicitly state that they can’t disclose how the interaction works. One of them mentioned that an extra saccade is used to “click”, but it isn’t very revealing. Some demos, e.g. 360 virtual displays, have an explicit gaze target to trigger the action, but the AR demo lacks them, so my take is that the 2min tutorial teaches the person that there is an invisible, but standardized target (say upper-right corner) that is the trigger. No idea about scrolling. But damn, everybody that tried the demo was super-convinced that this is the way to go and that in 2017 there will be headsets with such tech. Here we are, 6 years later, and I am still waiting.
        Google should start open sourcing their canned and dead projects.
        Thank you for pointing me towards this research and for your work. :) Very cool!
        [-]
        modeless 1003 days ago
        Yeah I wish Google would do something with it too. The good news is headsets with good eye tracking are finally about to become generally available and I expect that within a few years people will be using it in very creative ways. Wide availability will trigger more experimentation than we could ever do as a single startup.
    - towaway15463 1005 days ago
      Did you try having your gaze put elements into easy reach of a cursor or pointer instead of using the gaze as a pointer?
      Gaze is often an indicator of intent so rather than have it do something it would be more natural for it to make something else easier to do.
      An example, using a laser pointer style interface for VR controllers can sometimes be awkward because small motions can move the pointer a great distance making it hard to pick out a precise target. Gaze could be used as an aim mode where motion of the pointer would be limited/mapped to a smaller area surrounding the spot being looked at.
      It could also be used as a meta key to change the function of buttons, look to the right of a window and a button click fast forwards, look left and the same button rewinds.
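A rough sketch of the "aim mode" idea above, under the assumption that the controller's raw pointer offset is scaled down and clamped to a small disc around the gaze point. The function name, `radius`, and `gain` are made up for illustration and are not taken from any real VR SDK.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def aim_mode_cursor(gaze: Point, raw_offset: Point,
                    radius: float = 50.0, gain: float = 0.25) -> Point:
    """Map a raw controller offset into a disc of `radius` around the gaze point.

    `gain` < 1 shrinks controller motion so that small hand tremors move
    the cursor less; the clamp keeps the cursor near what is being looked at.
    """
    dx, dy = raw_offset[0] * gain, raw_offset[1] * gain
    dist = math.hypot(dx, dy)
    if dist > radius:
        # Pull the offset back onto the edge of the allowed disc.
        scale = radius / dist
        dx, dy = dx * scale, dy * scale
    return (gaze[0] + dx, gaze[1] + dy)


# Gaze picks the coarse region, the controller does the fine adjustment.
print(aim_mode_cursor(gaze=(100.0, 100.0), raw_offset=(40.0, 0.0)))
```

Here gaze never moves the cursor directly in response to a glance elsewhere on its own; it only decides which neighbourhood the controller is working in.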
    - nfw2 1005 days ago
      I would also be interested in reading that thesis. I was thinking a while ago that going to grad school to work on something like that would be cool.
      One thing that occurred to me is that shifting focus may work better as a gradual activation than an immediate one. In other words, focus would be an average of eye position over time, and after reaching a certain threshold on a certain element, the focus would shift.
      Not sure it would work in practice, but perhaps it could ameliorate the chaotic effects you were mentioning.
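One possible reading of the gradual-activation idea above, sketched as a leaky per-element dwell score: looking at an element adds time to its score, looking away drains it, and focus only shifts once a score reaches a threshold. `GradualFocus`, `threshold`, and `decay` are hypothetical names and values, not something from the thesis or the article.

```python
from typing import Dict, Optional


class GradualFocus:
    """Shift focus only after gaze has accumulated on one element."""

    def __init__(self, threshold: float = 0.375, decay: float = 1.0) -> None:
        self.threshold = threshold  # seconds of accumulated gaze needed
        self.decay = decay          # seconds drained per second of looking away
        self.scores: Dict[str, float] = {}
        self.focused: Optional[str] = None

    def update(self, element: Optional[str], dt: float) -> Optional[str]:
        """Feed the element under the gaze (or None) and the elapsed time."""
        # Elements that are not being looked at slowly lose their score.
        for key in list(self.scores):
            if key != element:
                self.scores[key] = max(0.0, self.scores[key] - self.decay * dt)
        if element is not None:
            # Cap the score so a long stare does not take forever to drain.
            score = min(self.threshold, self.scores.get(element, 0.0) + dt)
            self.scores[element] = score
            if element != self.focused and score >= self.threshold:
                self.focused = element
        return self.focused


focus = GradualFocus()
for target in ["inbox", "inbox", "inbox", "sidebar", "inbox"]:
    # A single stray glance at "sidebar" is too short to steal focus.
    print(target, "->", focus.update(target, dt=0.125))
```

A brief saccade to something interesting then costs nothing, which may soften the feeling that every glance has side effects.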
- acomjean 1005 days ago
  Canon tried to use eye tracking cameras. E.g. “look in the viewfinder where you want to focus”. It didn’t take initially (90s). But it’s back.
  https://m.dpreview.com/articles/6931257631/is-the-canon-eos-...
  [-]
  - ghaff 1005 days ago
    Interesting. I'll have to ask my friend who covers Canon to some degree about this. Yeah, they dropped it after trying it out in one or two film camera models in the 90s. They were always a bit cagey about why although, in my experience, it didn't work all that well.
    [-]
    - CarVac 1005 days ago
      In the R3 it's used only briefly to select an initial autofocus point, after which it hands over control to the subject-tracking algorithms.
      From what I've read, for some it can be startlingly effective but outright refuses to work on other people's eyes.
- mlajtos 1005 days ago
  Yes, what you describe is the Golden Gaze problem (related to the golden touch of King Midas). There are several techniques mentioned in the article to mitigate the problem. A combination of gaze and hand gestures (and voice) seems like the most efficient and natural.
  Foveated rendering is also interesting, but that is pretty straightforward application of gaze contingency.
  [-]
  - macrolime 1005 days ago
    I think a combination of gaze and controllers (like the Quest controllers) seems more natural, unless it's for AR and you're out and about.
  - brobdingnagians 1005 days ago
    Talon allows using the Tobii with other triggers so that gaze does nothing without a trigger (a keyword or noise), which then zooms the area looked at, then allows using another trigger to perform the click.
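The trigger, zoom, trigger flow described above can be sketched as a two-state machine. This does not use Talon's actual API; `TriggeredGazeClick` and its methods are hypothetical, and the sketch assumes the zoomed view is magnified around the first gaze point so that gaze error in the zoomed view shrinks by the zoom factor when mapped back to the screen.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


class TriggeredGazeClick:
    """Gaze alone does nothing; a trigger zooms, a second trigger clicks."""

    def __init__(self, zoom: float = 4.0) -> None:
        self.zoom = zoom
        self.gaze: Point = (0.0, 0.0)
        self.zoom_center: Optional[Point] = None  # None means not zoomed

    def on_gaze(self, point: Point) -> None:
        # Only remember the latest gaze point; never act on it directly.
        self.gaze = point

    def on_trigger(self) -> Tuple[str, Point]:
        if self.zoom_center is None:
            # First trigger: zoom into the area being looked at.
            self.zoom_center = self.gaze
            return ("zoom", self.zoom_center)
        # Second trigger: map the gaze in the zoomed view back to the screen.
        cx, cy = self.zoom_center
        gx, gy = self.gaze
        target = (cx + (gx - cx) / self.zoom, cy + (gy - cy) / self.zoom)
        self.zoom_center = None
        return ("click", target)


ui = TriggeredGazeClick(zoom=4.0)
ui.on_gaze((200.0, 200.0))
print(ui.on_trigger())
ui.on_gaze((240.0, 180.0))
print(ui.on_trigger())
```

The zoom step is what makes an imprecise tracker usable here: an offset of 40 pixels in the zoomed view corresponds to only 10 pixels on the original screen.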
- jayd16 1005 days ago
  Yeah, agreed. Using eye tracking as input actually removes the ability to look and interact with different things. I really do not like it.
  It's pretty good for things like AI characters making eye contact or adding eye movement to avatars though.
- cma 1005 days ago
  > My eyes are my input devices.
  Eyes are very important in non-verbal communication so I don't think this is the full story and they are also used for output with deliberate control, but as far as something like being bad as a fine-grained cursor type thing it may hold.
oblak 1005 days ago
> Eye as a Pointing Device
As a long time gamer and VR (Valve Index) owner, this is where I stopped reading. No, no, no. As others have said, we could use gaze tracking for plenty, just not pointing devices. People have been trying this approach for decades and it's always been the wrong one.
It could work for disabled people/animals, but it should not be the main pointing device. Not my kind of article, that's for sure
[-]
- glaslong 1005 days ago
  This exactly. It’s terrible as a precision pointer, but great as an attention heuristic. A strong secondary intent signal.
  [-]
  - pwython 1005 days ago
    What came to mind first was sniping in games like Counter Strike. Your eye is generally within the crosshair aiming at a certain spot, but suddenly someone comes in your peripheral and you have to make a quick flick shot, which requires more muscle memory and not even focusing on the target.
- sli 1004 days ago
  I only want eye tracking on my Index for VRChat integration so my avatar's eyes move with mine. I just cannot imagine using my eyes as some kind of input device, that sounds immensely awkward for someone that has more traditional options (i.e. no physical disabilities). I'm not using VR for productivity in any case, anyway.
- seejayseesjays 1005 days ago
  Exactly. I've got nystagmus, so I reckon that if the eye tracking is particularly accurate, it'd be like trying to control a mouse that moves uncontrollably for me.
barbariangrunge 1005 days ago
Speaking for myself, I don’t really want companies studying my eye movements, and I generally enjoy being able to look at things without them reacting to my glances.
I don’t want to see targeted ads because of where my eyes wander, or to be profiled based on it
The only thing that I want eye tracking for is maybe foveated rendering, and even then, hardware is catching up and we eventually won’t need that.
Head tilt or body orientation is more useful for determining if a character in a game is being ”addressed” or not
Speaking for myself at least
[-]
- GuB-42 1005 days ago
  > The only thing that I want eye tracking for is maybe foveated rendering, and even then, hardware is catching up and we eventually won’t need that.
  I don't like the idea of wasting a tremendous amount of power on things you don't see. It may be a thing now because eye tracking with foveated rendering is more expensive than rendering the full frame at full resolution, which is currently less than ideal. Nailing down foveated rendering would allow us to greatly improve quality where it matters, while at the same time providing a wide field of view on a limited power budget.
  Adaptive focus is another area where eye tracking could be important. Even more important than foveated rendering. For now it is almost impossible to read a book in VR as you would do in real life, that's the problem. Unless you have severe presbyopia (yes, it is an advantage in VR!), everything close will appear blurry. Varifocal lens with eye tracking is a way to solve that problem (see Oculus "half dome")
lagrange77 1005 days ago
While thinking about it, I realized that our vision apparatus has an interesting feature: When you move your head, your field of sight moves with that motion. But when you hold your head still and only move your eyeballs, the field of sight stays constant, while the focus point (gaze point?) changes, scanning the still standing field of sight. Imagine replacing the eyeballs along with their mounting and motion with cameras. Every motion would translate the field of sight as a whole. Does anyone know what this mechanism is called?
[-]
- boloust 1005 days ago
  If you move just your eyes, they can only rotate in effectively discontinuous jumps (saccades), during which you are basically blind. This limitation is present no matter how slowly you move your eyes.
  However, your eyes are perfectly capable of smooth motion if you stare at a fixed point and move your head or body.
  It seems the two systems were created by different teams that didn't communicate much.
- numpad0 1005 days ago
  Mind blown. Indeed:
  - when I move my head to (+30, +45), my brain reports viewport.LeftTop =(0,0), and,
  - when I move my eyes to (+30, +15), my brain reports viewport.LeftTop = (30, -15).
  But our eyes are built in such way that the whole lens and sensor assembly rotates, rather than having sensor fixed to the head and lens rotating, so it makes no sense. Framebuffer should always span from (0,0) to (size.x, size.y).
  Perhaps the framebuffer always covers the full viewing range, and DMA start addresses are dynamically determined by current eye positions, or perhaps even by feature matching?
  [-]
  - MereInterest 1005 days ago
    It gets even weirder when you consider blind spots. The hardware has a pretty big design flaw, that the signal wires are routed in front of the active sensors instead of behind. As a result, there's a big bundle that needs to pass through the sensor array, resulting in a blind spot. This is fixed in software, by interpolating nearby inputs before presenting it to the user (i.e. your conscious mind).
    Saccades also present a hardware limitation. During rapid movements, the eye can't focus correctly. To improve the overall UX, we throw away those frames entirely, reconstruct them with a linear interpolation of the next few frames, and call it a day. It does introduce some lag, but the user-level process is usually lagging behind anyways, and doesn't notice the input lag so long as we backdate the timestamps.
    [-]
    - yojo 1005 days ago
      The time stamp thing is especially crazy, and you can see it break down if you saccade to a ticking clock.
      This twitter thread made the rounds a while back, which lays it out in a nice incredulous rant: https://mobile.twitter.com/foone/status/1014267515696922624
    - psd1 1005 days ago
      The hardware is designed completely iteratively, so it's - predictably - a mass of ossified hacks. The product team has no strategic vision: it's generally conservative, but sometimes a completely random idea gets into production. They do A/B testing but completely half-ass it, often they never deprecate the old version, and deciphering serial numbers requires a degree in microbiology.
      You'd think the environment would be perfect for disruption, but this actually is the new arrival and it's repeating all the mistakes of the incumbents! How the fuck this scales to billions I have no idea. The sysads must be getting overtime.
      [-]
      - lagrange77 1004 days ago
        Right. And after the CEO exit scammed without any documentation, we can't even ditch the whole codebase and rewrite it from scratch, as would be mandatory.
    - Someone 1005 days ago
      > To improve the overall UX, we throw away those frames entirely, reconstruct them with a linear interpolation of the next few frames
      Do we know that interpolation is linear? If so, in what coordinate system? Head? Single eye? One including gaze direction? One including what one is attending to (whatever that is)?
      I find it hard to believe it would be linear. Wouldn’t that mean that it would seem a rolling ball stops rolling during saccades?
      I wonder what kind of experiment could be used to research that.
- baxtr 1005 days ago
  Damn. Now that I did that experiment I was reminded of my eye floaters. Thanks for that... My day is ruined.
  /s
  https://en.wikipedia.org/wiki/Floater
  [-]
  - lagrange77 1005 days ago
    Good ol' faithful friends
- alanbernstein 1005 days ago
  Not an expert, but I think part of what you’re describing is an illusion. Our brains do a lot of work to trick us into thinking we see everything that we expect to be in our FOV. Blind spots demonstrate this.
  For me, when I turn my head, my apparent FOV rotates, while when I move my eyes, it does not. However, when my eyes are looking far right, I can no longer actually detect motion in the far left of my FOV. even though that part of the scene still feels like it’s visible, it isn’t.
  Anyway, I think the technology equivalent is just called stabilization. Again, our brains are really good at tricking us into not noticing it happening. But if you could view the “raw feed” from the optic nerves, it wouldn’t look anywhere near as clean as your description suggests.
- danparsonson 1005 days ago
  Not any kind of relevant expert but I think what you're describing there is the combination of two things - the field of vision of your eyes, coupled with your proprioception, your awareness of your body and its position in space.
  [-]
  - lagrange77 1005 days ago
    Right, and some part of the brain seems to do some sort of Kalman filtering with those pieces of perception.
- GistNoesis 1005 days ago
  In the radio domain, this is called beam-forming.
- xeonmc 1005 days ago
  gimbal? And the reflex for keeping the gaze fixed is oculovestibular reflex
  [-]
  - lagrange77 1005 days ago
    As a layman, i would say the gimbal mechanism merely lets you lock focus on some global coordinates, while moving your head around, but maybe i'm wrong.
tzs 1005 days ago
Heck, I'd like eye-tracking for my desktop computer experiences.
In particular, I'd like eye-tracking combined with a system setting that makes closing windows with ⌘-W on my Mac put up a confirmation dialog if I'm not looking at the window that would be closed.
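A small sketch of that close-window rule, assuming the system can report the gaze point in the same screen coordinates as the window bounds. `Window` and `needs_close_confirmation` are hypothetical names, not macOS APIs. Losing tracking (no gaze point) is treated here as "not looking", which is itself just one possible design choice.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Window:
    title: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        """True if the point lies inside the window's bounds."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def needs_close_confirmation(window: Window,
                             gaze: Optional[Tuple[float, float]]) -> bool:
    """Ask before closing unless the user is looking at the window."""
    if gaze is None:
        # No gaze data: err on the side of asking.
        return True
    return not window.contains(*gaze)


editor = Window("Editor", x=100, y=100, width=400, height=300)
print(needs_close_confirmation(editor, (150, 150)))  # looking at the window
print(needs_close_confirmation(editor, (50, 50)))    # looking elsewhere
```

This uses gaze purely as a passive safety check, in line with the "attention heuristic" role suggested elsewhere in the thread.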
emkoemko 1005 days ago
PSVR 2 has eye tracking. You have to calibrate it when you first try it, but once it's done it works really well: characters look at your eyes when they talk to you, you can see what other people are looking at, etc. The biggest advantage of eye tracking, though, is foveated rendering, where the game engine renders at higher resolution only where your eyes are looking, which improves performance and graphics.
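To make the foveated rendering idea concrete, here is a toy sketch that picks a resolution scale from the angular distance between a pixel and the gaze point. The function name, the three tiers, and all the numbers (`pixels_per_degree`, `fovea_deg`, `mid_deg`) are illustrative assumptions, not values from PSVR 2 or any real engine.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def shading_scale(pixel: Point, gaze: Point,
                  pixels_per_degree: float = 20.0,
                  fovea_deg: float = 5.0,
                  mid_deg: float = 15.0) -> float:
    """Return the fraction of full resolution to render at for `pixel`."""
    # Approximate eccentricity: screen distance converted to visual degrees.
    eccentricity = math.hypot(pixel[0] - gaze[0],
                              pixel[1] - gaze[1]) / pixels_per_degree
    if eccentricity <= fovea_deg:
        return 1.0   # full detail where the eye is actually looking
    if eccentricity <= mid_deg:
        return 0.5   # reduced detail in the near periphery
    return 0.25      # coarse detail in the far periphery


gaze = (100.0, 100.0)
for pixel in [(100.0, 100.0), (300.0, 100.0), (500.0, 100.0)]:
    print(pixel, shading_scale(pixel, gaze))
```

Real implementations work on tiles or variable-rate shading regions rather than single pixels, but the gaze-contingent falloff is the same basic principle.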
m00dy 1005 days ago
I think a comfortable vr headset is the missing device for VR experiences.
[-]
- PretzelPirate 1005 days ago
  I find the HP Reverb G2 to be incredibly comfortable and can wear it for hours. I do have an air conditioning unit in my VR room which keeps me cool while moving during that time.
  The Quest 2 is the least comfortable thing I've ever put on my face and, due to its popularity, seems to skew perspectives around how comfortable VR can be.
- zmmmmm 1005 days ago
  Wait a couple of weeks and see how the Meta Quest Pro/Cambria version looks. It's a significantly updated design using pancake lenses (thin/light) and all the weight of battery is moved to the back for better balance. I think this will make it massively more comfortable for longer term use. Of course, it'll be $800+ but that's still a bargain in the VR space (for what it is, anyway).
- mlajtos 1005 days ago
  Agree 100%. I am curious what the fruit company has up their sleeve.
  [-]
  - ohgodplsno 1005 days ago
    An overpriced device that doesn't run any games nor has developers for it ?
    The Valve Index showed that as long as your VR headset is $1000 and above, barely anyone will buy it. And that's without taking into account the hardware you need.
    If anyone has a chance to succeed this generation, it's Sony with their PSVR. And even then, it's $500, and you need another $150 for their headset.
    [-]
    - mlajtos 1005 days ago
      Standalone headsets (like the Meta Quest line) are much more interesting than PCVR, which requires a beefy PC, base stations and a lot of cables and setting up.
      Paying $1000+ for a standalone headset might sound too ridiculous but when you understand that it is the whole computer, it is much more acceptable.
      [-]
      - ohgodplsno 1005 days ago
        And you expect Apple to put out a $300 device like the Quest ? Because this is the main subject.
        Paying $1000+ for a standalone headset, even if you can play Half Life Alyx in 4K is still not something that will reach any proper market adoption. Unlike a phone, which you can use at all times, a VR device requires a ton of space, requires buy-in from developers (Apple's current list of games and apps that support VR is... right, zero. Planning to run your game on Rosetta + MoltenVK ?), etc.
        It's not the whole computer. It's a computer that you can't really bring outside, that is heavily limited because haptics are still terrible (and will stay terrible). Do you expect people to bring their VR headset to starbucks ?
        [-]
        intrasight 1005 days ago
        I do actually. We will all migrate from a 5" glass rectangle to smartglasses in perhaps 5 years.
        [-]
        jayd16 1005 days ago
        Way too ambitious but maybe they'll be as big as a semi popular game console by then.
        [-]
        towaway15463 1005 days ago
        I don’t think it’s that ambitious. Cambria is standalone, has colour pass through for mixed reality and has balanced the weight distribution better by using pancake lenses and putting the battery on the back. It will certainly be usable while sitting or standing still anywhere, although people will look at you weird if you do it in a coffee shop. It took Meta 3 years to get there from Quest 1 so I’d say given 5 years and some decent sales they could get it down to something that wouldn’t look too strange and would be comfortable enough to wear all day and would work while walking although I imagine that would pose similar risks as walking while using your phone. Instead of black rectangles we’ll be seeing ski-goggles everywhere.
        [-]
        intrasight 1005 days ago
        >similar risks as walking while using your phone
        Much less risky than walking while looking at a phone. This will be a heads-up display after all.
        >we’ll be seeing ski-goggles everywhere
        Ski-goggle? That won't fly. I expect it'll be a slimmed down version of the Magic Leap 2.
        [-]
        towaway15463 1005 days ago
        Maybe it’s just me but the magic leap 2 looks way dorkier than a good pair of ski goggles. You could probably make them look more like shield sunglasses if you prefer that vibe.
        [-]
        intrasight 1005 days ago
        It's not intended as a fashion accessory. But neither were smartphones.
        A co-worker had a Newton in 1994. We all thought it was the dorkiest thing we ever saw - him holding and looking at that thing. Little did we know.
        [-]
        towaway15463 1004 days ago
        I’d argue that if you wear it, especially in a highly visible location like over your eyes, then it is a fashion accessory. Like a hat or glasses it may serve a purpose but that doesn’t negate the need for it to look good.
        Fashion will be vastly more important for mixed reality than it is for smart phones. A smart phone is more akin to a watch, fashion wise. You usually only see it when it’s being used. Mixed reality devices will be visible all the time and they will also mediate our social interactions with others.
        [-]
        intrasight 1004 days ago
        I expect that there will be only half a dozen different models. They will all be bland and black. But to your point, they will support fashion accessories. Sort of like fashion smartphone cases. Some people will be fine with bland and black.
        [-]
        towaway15463 1003 days ago
        Don’t forget that reverse pass through may be a thing so you might have a screen on the outside which opens up all kinds of possibilities.
        jayd16 1005 days ago
        If Apple put out a headset, surely you'd see a lot of support from Unity and Unreal devs that 'just' have to do some platform porting. No small task but it's not starting from exactly zero.
        [-]
        ohgodplsno 1005 days ago
        >Unreal devs
        The same Unreal that Apple attempted to revoke the license for and that currently don't bother because the userbase is too small (and the VR userbase will be even smaller) ?
        Optimistic.
jrm4 1005 days ago
Speaking of VR, it really feels like the motion sickness part is still woefully underacknowledged. I've enjoyed quite a few VR games (I have access to most of the systems at my job) but there's no way I can spend more than e.g. 30 min a week "in there." The body REALLY REALLY REALLY hates when movement and what you see don't match up, and very few experiences deal with this well and also are really compelling.
[-]
- fknorangesite 1005 days ago
  > underacknowledged
  Because it doesn't affect everyone to the same degree. What you describe sounds really severe (and I don't doubt it) - but for example I'm personally way at the other end of the spectrum: I can easily spend hours in VR, and I can think of only one game that ever made me even slightly motion sick.
- apike 1005 days ago
  I’m susceptible to this too, but I believe the two reasons VR motion sickness is much less discussed nowadays are:
  1. Often it’s triggered by frame rate hitches, which seems solvable with the usual advance of software and hardware over time.
  2. Most folks can train their mind and body out of the motion sickness with practice.
- zmmmmm 1005 days ago
  Without doubting what you say (and some considerable empathy), I actually think the motion sickness part is over emphasised.
  I've given probably 30 people a try of my headset. I think two of them had problems with disorientation and nausea. The vast majority were just fine with all the stationary / teleport experiences (and there are plenty of those). Smooth motion would be more of a problem for them but a lot can be done with 6DoF and teleporting.
  It's an awkward fraction because if 10% of people can't tolerate VR at all it's definitely going to get in the way of VR becoming a ubiquitous tool where it plays an essential role in work or shared recreational experiences. But I don't think it's nearly as large as people think; many of those perceptions were formed 5-10 years ago, when headsets were vastly inferior to what they are now in terms of latency, tracking and other factors.
  [-]
  - jrm4 1004 days ago
    Exactly -- the STATIONARY/TELEPORT experiences.
    In other words, we can correctly assume the converse; Any application that involves the user moving in VR is presently a non-starter for the most part.
    Now, how fun / interesting / useful are those S/T experiences, especially as compared to the hype of VR? I'd say a solid-but-small chunk of gaming at best. Right now, I feel like the MAX impact VR could have on the world would be roughly equivalent to the Wii's.
    I'm not saying that nothing could flip the switch in the future, and I do acknowledge those "physical" careers that would benefit greatly, e.g. architecture. But I see nothing remotely close on the horizon that puts VR headsets in households on the level of iPhones or even Echos.
- status200 1005 days ago
  I had the opposite experience, my family has a history of motion sickness, but myself and other family members that have tried it have felt really natural in VR. Even my sister could use smooth movement and "advanced" settings, which I never would have predicted.
  Our best guess is that the 6DoF system at least matches head movement, which might be the difference that makes our brain(s) not freak out compared to reading in the car, etc.
  [-]
  - ShamelessC 1005 days ago
    The truth is that it's highly subjective to each individual, and simply because it has gotten better for some doesn't make it a solved issue (and it is still a non-starter for people affected).
wslh 1005 days ago
Mark Zuckerberg says in a recent podcast [1] that the next VR headset will scan gestures and perform eye contact. He says that it will be available in October.
[1] https://open.spotify.com/episode/51gxrAActH18RGhKNza598
ohgodplsno 1005 days ago
As said in other comments, eye tracking to aim at things would be absolutely horrendous. Try using a mouse on maximum sensitivity, and you'll see the kind of accuracy you'll need.
However, eye tracking does give you one thing that is already getting into VR devices today: foveated rendering, i.e. increasing resolution where needed and lowering it elsewhere. Couple this with various techniques to turn on fewer pixels [https://www.immersivecomputinglab.org/publication/color-perc...] and save even more on performance/energy usage, and you can have pleasant, high refresh rate experiences.
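A minimal sketch of the "higher resolution where you look, lower elsewhere" idea, assuming a simple three-zone falloff around the gaze direction. The function names, the 5 and 20 degree zone radii, and the scale factors are hypothetical values picked for illustration, not figures from any headset or SDK.

```python
import math

def eccentricity_deg(gaze_dir, pixel_dir):
    """Angle in degrees between the gaze direction and the direction to a pixel.

    Both arguments are 3D vectors (x, y, z); they need not be normalized.
    """
    dot = sum(g * p for g, p in zip(gaze_dir, pixel_dir))
    norm = math.sqrt(sum(g * g for g in gaze_dir)) * math.sqrt(sum(p * p for p in pixel_dir))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

def resolution_scale(ecc_deg, inner_deg=5.0, outer_deg=20.0):
    """Hypothetical three-zone falloff: full, half, then quarter resolution."""
    if ecc_deg <= inner_deg:
        return 1.0
    if ecc_deg <= outer_deg:
        return 0.5
    return 0.25

gaze = (0.0, 0.0, -1.0)  # looking straight ahead
for pixel in [(0.0, 0.0, -1.0), (0.2, 0.0, -1.0), (1.0, 0.0, -1.0)]:
    ecc = eccentricity_deg(gaze, pixel)
    print(f"eccentricity {ecc:5.1f} deg -> scale {resolution_scale(ecc)}")
```

A real renderer would apply something like this per tile rather than per pixel, and would likely blend between zones instead of switching abruptly.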
[-]
- thedorkknight 1005 days ago
  The article talks about foveated rendering
  [-]
  - ohgodplsno 1005 days ago
    The article barely mentions foveated rendering, in the middle of a sea of "use your gaze to click it'll be great I promise".
    [-]
    - mlajtos 1005 days ago
      Foveated rendering should be an invisible technology — the image you see should be perceptually indistinguishable from the fully rendered image. How to get there is an interesting engineering problem, but for me that isn’t an interesting domain.
      You summed up the article quite well, but I don’t claim it will be great. I just think that gaze as an input device is an underresearched topic. Ken Pfeuffer’s research [0] is beyond awesome, but we need masses of tinkerers to explore the domain more. The new wave of VR headsets will enable it.
      [0] https://kenpfeuffer.com/publications-2/
oneoff786 1005 days ago
Rye tracking would be great. But you’ll never have ready player one quality experiences with a headset at all
[-]
- solardev 1005 days ago
  The gluten-free movement has gone too far.
fnordpiglet 1005 days ago
I think eye tracking and facial cameras inside the goggles are key, but not as input. We will never be able to render a realistic avatar if it’s not picking up facial movements and eye focus. With a photo of the face and the ability to track muscle movements in the face around the eyes you can reproduce a realistic personal face with expressed expressions reflected in VR. As an input it feels like a limited utility thing - perhaps bring up information on the object focused on, but mental intent and eye focus are often not in line with each other - I might intend to do something with X but am monitoring Y and Z in the environment with my eyes while keeping X in partial focus.
status200 1005 days ago
Fairly certain that eye tracking will be standard on the next generation of VR devices (at least from the major players like the Quest Pro [0] and Pico 4 [1]), which is exciting and terrifying at the same time.
[0] https://www.roadtovr.com/oculus-quest-pro-eye-face-tracking/
[1] https://www.roadtovr.com/pico-4-pro-enterprise-eye-tracking-...
[-]
- jimmySixDOF 1005 days ago
  According to some just leaked Quest 3 CAD files [1] it will not have eye tracking to reduce the camera costs. Project Cambria has both eyes and upper and lower facial cameras which all need real time processing.
  [1] https://youtu.be/tq57TPTsBQQ SadleyItsBradley leak of Quest 3
- shafyy 1005 days ago
  Yes, pretty certain the Meta Cambria (or Quest Pro), which will be announced on Oct 11, will have eye tracking. Main use is to enable foveated rendering for better performance and quality, but yeah, it's kind of terrifying what other uses there could be.
  Edit: Just saw you edited and updated your post mentioning the Quest Pro after posting this.
- squeaky-clean 1005 days ago
  Playstation VR 2 as well
Daub 1005 days ago
I worked on an eye tracking project that compared the difference between the way artists and non-artists look at images. Very interesting. Non-artists tend to dwell on faces, hands and other evidently interesting things. Artists find interest in less obvious things… odd angular protrusions, unexpected colors etc.
It also showed me how active and (frankly) jittery all human vision is: never resting, scanning. Also... unexpectedly, we seem to evaluate objects according to their features (points, blobs, angles).
Philip-J-Fry 1005 days ago
PSVR 2 has eye tracking. So, we'll be seeing more usage of it.
joshruby16 1005 days ago
Most comments here are about how painful eye-tracking as a mouse replacement is.
What if instead it was just used for the 1-2 most often used buttons on a site? For example, here on HN - to go back to the main page. Or, on Gmail - to press compose. Once gaze in the general area is detected, the affordance can be highlighted, at which point pressing Enter would cause navigation.
I feel that <4 buttons are the ones that are pressed 80% of the time on most sites.
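A rough sketch of that interaction, assuming gaze only highlights and an explicit key press does the activating. The `Button` and `GazeConfirm` names, the pixel coordinates, and the 40-pixel margin are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Button:
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, gx, gy, margin):
        """True if the gaze point is within the button grown by `margin` pixels."""
        return (self.x - margin <= gx <= self.x + self.width + margin
                and self.y - margin <= gy <= self.y + self.height + margin)

class GazeConfirm:
    """Highlights the button near the gaze; only an explicit key press activates it."""

    def __init__(self, buttons, margin=40.0):
        self.buttons = buttons
        self.margin = margin  # generous, since gaze estimates are noisy
        self.highlighted = None

    def on_gaze(self, gx, gy):
        self.highlighted = next(
            (b for b in self.buttons if b.contains(gx, gy, self.margin)), None)
        return self.highlighted

    def on_key(self, key):
        """Return the name of the activated button, or None."""
        if key == "Enter" and self.highlighted is not None:
            return self.highlighted.name
        return None

ui = GazeConfirm([Button("home", 0, 0, 100, 30), Button("compose", 0, 200, 100, 30)])
ui.on_gaze(60, 215)          # gaze lands in the general area of "compose"
print(ui.on_key("Enter"))    # compose
ui.on_gaze(500, 500)         # gaze wanders off; nothing is highlighted
print(ui.on_key("Enter"))    # None
```

Because nothing happens until Enter is pressed, just glancing around the page never triggers navigation.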
wellthisisgreat 1005 days ago
Aren't there VR headsets that do actually perform eye-tracking?
Also, is projection onto the retina a thing? I thought there were prototypes at least that did that. I heard of such a thing from people at MIT, or maybe I misunderstood what they meant back then (some 5-6 years ago?)
I did some very basic work with pupil dilation tracking using IR and it's gotta be RGB cameras or you'll just burn your eyes out (or it will feel that way anyways)
ElCheapo 1005 days ago
*look down to pay respects*
atemerev 1005 days ago
I have thought so before, but in my experience, wide field of vision works fine so you can just look at different spots. However, there is no focus/accommodation tracking in VR, which makes your experience hyperreal (everything is always in focus). Incidentally, it is very close to lucid dreaming, where you also see everything at once.
[-]
- mlajtos 1005 days ago
  I have never realized the similarity between VR and lucid dreams is because of missing depth. Thank you for this. :)
presentation 1005 days ago
Sounds tiring to me, I already get fatigued looking down to make the iPhone’s Face ID happy.
[-]
- xattt 1005 days ago
  I beg to differ. Canon marketed several film and video cameras with eye-tracking for focus. I remember it being a very intuitive interface.
  Considering this through a modern perspective of inclusivity, there are probably issues around accessibility for people with eye-mobility issues (amblyopia, stroke-related impairments, etc).
  (1) https://m.dpreview.com/articles/6531126959/looking-back-cano...
  [-]
  - ghaff 1005 days ago
    They appeared during the late days of film EOS models but went away and never returned. I've never heard a good explanation. I even asked a friend of mine who sometimes covers camera tech for one of the big online pubs and Canon would never give an explicit reason.
    [-]
    - AuryGlenz 1005 days ago
      They’re back now, actually.
      [-]
      - ghaff 1005 days ago
        Yeah. Someone else mentioned that you can apparently use eye tracking to set an initial focus point in the R3. And it apparently works well for some and not at all for others.
janoc 1005 days ago
Sorry, as someone who has worked in VR, including with eye tracking, for over 20 years, no. Eye or gaze tracking is definitely *not* an input device and should never be used as one.
At least not in the sense of using it as some sort of conscious, user controlled input. Using it to *passively monitor* where and what the user is looking at is fine and useful, though, but maybe less so than many think.
The reason for this is simple - humans don't consciously control eye movement to the same degree as e.g. their limbs. A lot of eye movement is completely automatic.
And even the little that we can consciously control would get extremely fatiguing fast - imagine not being able to move your gaze even a millimeter while driving a car without risking a crash! We ran such an experiment around 2004, where the user was asked to guide a character walking on a projection screen (not HMD) by gaze. Most people got a horrible headache within a few minutes and it was very uncomfortable.
Another aspect is that eye tracking isn't magic. There is literally an infrared camera looking at your eye from a close distance. This camera has a finite resolution and the eyeball/iris moves relatively little.
With the target the user is actually looking at being several meters away (whether in reality or VR makes no difference here), you could at best identify large objects or some sort of zones where the user is likely to be looking. Beyond 2-3 meters the errors grow fast because the same eye movement corresponds to a much larger displacement the farther the target is (think shining a torch at a distant wall - a little hand movement makes it move a lot).
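The torch analogy can be put into numbers with basic trigonometry: a fixed angular error becomes a lateral miss of distance times tan(error). A small sketch, where the 1 degree error is an assumed value for illustration and not a spec of any tracker:

```python
import math

def target_error_m(distance_m, angular_error_deg):
    """Lateral miss at the target for a given angular gaze error."""
    return distance_m * math.tan(math.radians(angular_error_deg))

# Assumed 1 degree of tracking error; the real figure depends on the tracker.
for distance in (0.5, 1.0, 3.0, 10.0):
    print(f"{distance:5.1f} m -> {target_error_m(distance, 1.0) * 100:5.1f} cm")
```

With that assumption the miss is under 2 cm at 1 m but roughly 17 cm at 10 m. For a fixed angular error it scales linearly with distance, which is enough to rule out picking small, distant objects.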
Oh and it still totally ignores the last detail - human eye is not a camera or gun that you aim at something and what the red crosshair aligns with is what the user is looking at. That only means that the thing is in the field of view but the user could be focused on something totally different in the same FOV.
Ever seen those experiments where you are asked to follow some ball or card or count how many objects of some kind are in the scene - and at the end you get asked whether you have noticed the gorilla walking through the stage. Most people won't, yet it was directly in their field of view. Our brain is a large part of the visual perception and yet we have no way to know what it is really "looking at".
So unless we are talking special cases like accessibility solutions for people with disabilities, eye tracking certainly isn't a good candidate for an input device.
And even for things like foveated rendering its utility is disputable - the complexity, added latency and computational requirements may often outweigh the benefits. Foveated rendering can be done with a fixed, untracked FOV - not ideal but often good enough.
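To make the latency point concrete, here is a back-of-the-envelope sketch of how far the eye can travel between the gaze sample and the frame reaching the display. The 300 deg/s eye velocity, the 5 degree high-resolution radius, and the latencies are illustrative assumptions, not measurements of any device.

```python
def gaze_drift_deg(eye_velocity_deg_per_s, latency_ms):
    """Degrees the eye can travel between the gaze sample and the displayed frame."""
    return eye_velocity_deg_per_s * latency_ms / 1000.0

def covered(foveal_radius_deg, eye_velocity_deg_per_s, latency_ms):
    """True if the high-resolution region still contains the gaze point when the frame is shown."""
    return gaze_drift_deg(eye_velocity_deg_per_s, latency_ms) <= foveal_radius_deg

# Illustrative values only: 300 deg/s stands in for a fast eye movement.
for latency in (10, 20, 50):
    drift = gaze_drift_deg(300.0, latency)
    print(f"{latency} ms -> {drift:.1f} deg drift, covered by 5 deg region: {covered(5.0, 300.0, latency)}")
```

Widening the high-resolution region to absorb the drift eats into the savings, which pushes a tracked system back toward the fixed, untracked approach.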
Where I could see it being genuinely useful for VR is eye contact between avatars. That's a big problem that this could solve. However, whether it would justify the extra cost of the headset, the required extra user discomfort (calibration, extra setup required for good results, etc.) I doubt.
Then there are obviously research and professional uses (e.g. tracking attention of a trainee during an instruction session) but those are niche application cases.
And, well, let's not get into the "big brother" stuff where one will be tracked, profiled and evaluated on whether or not they looked at ads, signs or even a cute girl/boy passing by. Companies like Facebook (err Meta) being involved with this should give anyone pause.
gkfasdfasdf 1005 days ago
Looking forward to the pop up ads right in the center of my gaze
/s
hoseja 1005 days ago
This would be worth it for foveated rendering alone.
[-]
- moron4hire 1005 days ago
  Turns out foveated rendering is not the panacea it was imagined to be. Eye saccades are so fast that there's just too much motion-to-photon latency to do it well.
  https://twitter.com/ID_AA_Carmack/status/1391869530327052291
k__ 1005 days ago
Could be used for games like Silent Hill or Amnesia for effects that only happen if you don't look directly at them.
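A minimal sketch of such a gaze-contingent check, assuming the game has the eye position and a gaze direction vector from the tracker. The `may_move` name and the 30 degree "safe" angle are hypothetical, and the object is assumed not to sit exactly at the eye position.

```python
import math

def angle_between_deg(a, b):
    """Angle in degrees between two 3D vectors (need not be normalized)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def may_move(gaze_dir, eye_pos, object_pos, safe_angle_deg=30.0):
    """True if the object is far enough from the gaze direction to change unseen."""
    to_object = tuple(o - e for o, e in zip(object_pos, eye_pos))
    return angle_between_deg(gaze_dir, to_object) > safe_angle_deg

eye = (0.0, 1.7, 0.0)
statue = (0.0, 1.7, -3.0)
print(may_move((0.0, 0.0, -1.0), eye, statue))  # False: looking straight at it
print(may_move((1.0, 0.0, 0.0), eye, statue))   # True: looking 90 degrees away
```

Since peripheral vision is sensitive to motion, a game would probably want a generous angle or an extra trigger such as a blink before changing anything.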
[-]
- mlajtos 1005 days ago
  Weeping angels from Doctor Who
augasur 1005 days ago
One of the features I am keen to try for eye-tracking is radial menu selection in action games.
epakai 1005 days ago
Fove [0] built a headset with eye tracking features. I had a senior project trying to use it for visual field testing in 2017.
[0] https://www.kickstarter.com/projects/fove/fove-the-worlds-fi...
jvanderbot 1005 days ago
Gaze tracking for adaptive resolution and frame rate makes sense, though.
daxfohl 1005 days ago
I keep thinking something about spotting squirrels climbing trees
sys_64738 1005 days ago
What if I have a lazy eye? Isn't that discrimination?
[-]
- lm28469 1005 days ago
  Everything is a discrimination to someone if you start with such crazy takes
- pessimizer 1005 days ago
  What if I'm blind? Is having any visual component to VR bigotry?
  [-]
  - sys_64738 1005 days ago
    This is why you have Americans with disabilities acts forcing accessibility for those deprived of all sensory and motor function.
- thedorkknight 1005 days ago
  At worst you're just back to a regular VR display in that case, so unless current non-eye-tracking headsets are discriminatory, no. Plus I would think it would still be able to use your non-lazy eye
- micromacrofoot 1005 days ago
  no
blululu 1005 days ago
I’ve heard a lot of these claims before and seen them fall flat when you actually start building it. Things like foveated rendering that many (the author included) claim will lead to better graphics and lower power just don’t add up. You need at least two additional cameras running a pretty compute-heavy ML algorithm to track eyes, and this costs more watts than just rendering the full screen as your GPU intended. As an interface it can be quick, but don’t underestimate how really fucking annoying it is to get targeting feedback on everything you look at (which sucks for dwell-oriented tasks like reading). Finally there is the elephant in the room: this tech works somewhat well for people from Northern Europe, but gets super shaky for a wider swath of the global population. Maybe the author has seen some demos that were better than what I’ve seen, but this all reads like the project pitches I’ve seen for why we need to explore this area, and not the final report.
[-]
- LordHeini 1005 days ago
  That is completely wrong.
  For foveated rendering one could easily build dedicated hardware into the device which would require almost no additional power. And those algorithms are not that expensive to begin with. Every camera has face tracking built in (which basically is the same problem) and that does not drain the battery either.
  And why should northern Europeans eyes work differently?
  Even if that would be true one could add a bunch of settings for personalisation.
  Eye distance for example is a common setting on today's devices...
  [-]
  - blululu 1005 days ago
    Eye trackers typically require 2-4 additional cameras trained on the eyes. The algorithms for eye tracking are typically pretty involved, but maybe you know some very low power algorithm to track the eye at 120 Hz? I'm not sure how dedicated hardware will reduce the power consumption for 2 cameras and an ML pipeline to almost nothing.
    Northern Europeans have very light irises. It makes it easier to track the pupil from a simple signal-to-noise perspective. But there are a ton of weird ethnic challenges for eye tracking. Things like reflectivity and moisture of eyes vary a lot across different ethnic groups. Personalization adds complexity and will only take you so far when you are dealing with physical sensing challenges.
- moron4hire 1005 days ago
  Eye-tracking uses simple IR reflection sensors to follow the black dot of the pupil. It has no dependence on skin or iris color. It's been done for decades in psychological studies and doesn't require complex ML algorithms to do. Eye tracking is not a particularly hard thing to implement in hardware, it's just an added expense for dubious--very dubious--utility.
  Probably the "best" use of eye tracking for modern VR systems is pretty much just reflecting the eye motion in avatars during VR teleconferencing sessions to give a more natural "face-to-face conversation" feel. Beyond that, things like eye-tracking as input and foveated rendering turn out to only be great on paper.
  [-]
  - charcircuit 1005 days ago
    >Eye-tracking uses simple IR reflection sensors
    Every VR headset with eye tracking uses actual cameras.
    [-]
    - sgtnoodle 1005 days ago
      These days, wouldn't a camera sensor be considered a simple sensor? Optical mice use actual cameras too, they're just optimized for a very shallow focus, very high temporal resolution rather than spatial resolution, and sensitivity to IR. Do eye trackers need especially high spatial resolution, or RGB color?
      [-]
      - charcircuit 1005 days ago
        I was mainly referring to the power requirements.
- terafo 1005 days ago
  You are underestimating the cost of rendering and overestimating the cost of machine learning models. And forgetting the fact that cameras are already used in modern headsets for position tracking and more (there is a Quest 2 demo where you cast spells using your hands, no controllers), and that is a much, much more complicated task than eye tracking.
  [-]
  - blululu 1005 days ago
    If you have some numbers please share.
    I've seen NVIDIA and Tobii (highly motivated parties) claim that you might get a 10% power reduction, but those studies were based on very constrained operating conditions and conveniently brush certain details under the rug. All told, it's a lot of work and a lot of complexity to chase something that will at most provide modest gains.
    The cameras on headsets face the world. So you need 2-4 additional cameras trained on the eyes. These cameras each require power just to get images. For a decent UX you need tracking at ~120 Hz (60 Hz eye tracking is slow). Exact figures are hard to come by on a Sunday morning, but off the top of my head, I've seen figures to the tune of 250mW per camera (which ignores ISP and the compute to move the data around). Computer vision algorithms running at 120 Hz are typically not cheap. Pupil Labs cites 50% CPU (on an i5) per eye [1]. Tobii doesn't publish their power consumption but it is also pretty steep. I'm sure there are optimizations here but all the specs I've seen blush at the power consumption required by a general purpose eye tracker.
    If you have figures that suggest otherwise I would be happy to see them, but every quantitative analysis I've seen comes up with a muddled response. If you already commit to running eye tracking for other reasons then it makes sense to run this rendering optimization, but I'm not sure if it really pays for itself.
    [1] https://pupil-labs.com/products/vr-ar/tech-specs/
  - jayd16 1005 days ago
    I would think it's actually harder to do eye tracked foveated rendering than motion tracking. They're not identical problems. Motion tracking is a feature, foveated rendering is an optimization that needs to significantly beat the naive approach of fixed foveated rendering.
    Part of the problem is you can throw everything you have at motion tracking. You can over render and then use the end of frame head position to space/time warp to adjust the render the display gets, maybe throwing away some of the edges.
    For foveated rendering, you need to know what to render at the START of the frame before you render it, there can't be any waste, and we're only talking about a thin band of rendering gain IF the user is not looking forward. I would assume the focal point can change faster because you're at least dealing with the head movement + eye movement, so it's even more work that needs to catch up at the end of the frame.
    I really do not think foveated rendering is easier than head tracking. If it was, we would see more of it. Dynamic foveation is really only just now coming to headsets.
  - saltcured 1005 days ago
    While we lack spatial resolution in the periphery, we can be very sensitive to movement or changes there. So I think you would need a pretty accurate and stable low resolution rendering method to avoid inducing unintended perceptions of movement.
    I assume saving power means skipping a lot of processing, which sounds to me like undersampling i.e. the opposite of the supersampling used to get good renderings today. But, it is hard for me to imagine how a generic 3D modeled scene can be undersampled for the periphery without introducing horribly unnatural aliasing artifacts. In other words, you still need to determine proportional occlusion and lighting of all the elements in the scene to understand which blurry colors to show. You can't just cast a few rays and stretch them to fill the periphery. Otherwise, slight camera or object movements might lead to chaotic flashing of peripheral colors as the reduced samples stochastically observe different scene elements while ignoring the rest.
    I know of things like multi-resolution texture pyramids which can precompute a blurred version to allow undersampled texture rendering. And I know many games switch between different low/high polygonal representations of objects (or even cull objects) based on distance to the camera, which often produces obvious "popping" artifacts as the user moves through the 3D world. Are there general scene modeling techniques that can work as smoothly as multi-resolution textures to allow rapid or simultaneous use of different levels of geometric detail in one rendering?
    I think those popping artifacts would be much more distracting if tied to every eye movement rather than only with navigation across a 3D world. Crude, culling methods would be inadequate if they just drop peripheral objects that might otherwise produce the kind of peripheral image that would cause the user to change their gaze...
  - charcircuit 1005 days ago
    >And forgetting the fact that cameras are already used in modern headsets for position tracking and more
    Just because you already are running cameras + ML models that doesn't mean adding even more cameras and ML models is free.
- lam 1005 days ago
  Why was this down-voted? S/he seems to be making valid comments.