
Microsoft Kinect


Asura

The biggest problem, as with all motion controls: who wants to hold their hand(s) out for x minutes whilst they're pointing a gun or holding a steering wheel? I'd give that five minutes before I ache. That's why I think motion control is so limiting. Bit annoyed that MS and Sony have thrown themselves at it just because the Wii is selling well.

They were always going to, to be honest; it would be a stupid business decision not to.


What people don't seem to be getting about the voice recognition is that the scope for the conversation can be hugely narrowed, and this is something gamers and games have been used to for ages.

If you are in an FPS, for example, and you were to say "and the anthropologist found that the gorillas liked cheese", you'd be mad to think that the game should adequately respond.

However, if you were to say "enemies 6 o'clock" or "I've got your back" I'd fully expect the AI to understand this information.

Voice recognition software is only duff because the English language contains 70000 words and an almost infinite number of valid combinations. You could easily refine a game to understand 100 key phrases.
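
To make the "100 key phrases" idea concrete, here is a minimal sketch in Python. It assumes the speech engine has already produced a text transcript, and it uses plain string similarity from the standard library as a stand-in for whatever matching a real game would use. The phrase table, the action names and the `match_command` helper are all made up for illustration.

```python
import difflib

# Hypothetical command table: phrase -> game action (names are illustrative only).
KEY_PHRASES = {
    "enemies 6 o'clock": "ALERT_REAR",
    "i've got your back": "COVER_PLAYER",
    "hold position": "HOLD",
}


def match_command(transcript, cutoff=0.75):
    """Return the action for the closest key phrase, or None if nothing is close."""
    text = transcript.lower().strip()
    # get_close_matches ranks candidates by SequenceMatcher similarity
    # and drops anything scoring below the cutoff.
    best = difflib.get_close_matches(text, list(KEY_PHRASES), n=1, cutoff=cutoff)
    return KEY_PHRASES[best[0]] if best else None


print(match_command("enemy 6 o'clock"))
print(match_command("and the anthropologist found that the gorillas liked cheese"))
```

A slightly misheard command still lands on the right phrase, while the gorillas sentence matches nothing and gets ignored, which is the behaviour described above.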


However, if you were to say "enemies 6 o'clock" or "I've got your back" I'd fully expect the AI to understand this information.

Ever played EndWar? Voice recognition works perfectly for that but it's still just a gimmick because using the pad is quicker.


Ever played EndWar? Voice recognition works perfectly for that but it's still just a gimmick because using the pad is quicker.

How can it be quicker to use a pad to say "enemies at 3 o'clock"? I'm sorry, but it's just not. I've not played EndWar, but I get the impression you play the role of someone orchestrating the battlefield. That's not what I'm talking about; I'm talking about comms that you would use in a normal co-operative game being implemented into AI.

Stuff like "rocks at sniper tower" in Halo 3. Easy to recognise, and it's very easy to see how this kind of thing alone can take AI to the next level, without the motion tracking.


All of those things are happening. Vista alone has a very advanced speech recognition system. A number of systems where you can talk to an AI agent about a very specific subject are in place or are being developed.

In the cosmic scheme of things, it's not that good. And even if it can transcribe word by word, parsing it properly is a whole different problem.

And these would be even easier in a Natal-run videogame as 1) the subjects are likely to be even more specific, e.g. the location of the magic crystal, and 2) Natal has the ability to recognise speech tone (confirmed by journalists who've used it) which would narrow down what's being said even further.

The problem isn't that they're overpromising, the problem is that you don't properly understand what's currently possible.

I understand perfectly well what is currently possible, and I know that if they attempt it, it is going to be either severely limited - like, say, Fable 2 in comparison to what was promised - or it will be incredibly frustrating. A few phrases that can be asked of every NPC, a few very specific ones someone tells you to say, a few Easter Eggs - sure. It may work, or it may get tedious and pad input would be quicker - god knows I skip most speech in adventure games because I can read the text quicker. What no one is going to do is "have a conversation" with a computer in a game. You appear to not understand the complexity of the underlying problems, the amount of time and money invested in trying to solve them, and the generally poor results.

Voice recognition software is only duff because the English language contains 70000 words and an almost infinite number of valid combinations. You could easily refine a game to understand 100 key phrases.

Actually, the number of potential combinations is much less important than the fact that a hell of a lot is inherently ambiguous. Computers are good at crunching through large numbers of things. But how does it know how to parse "Fruit flies like bananas"? Flies can be both a verb and noun.
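
A rough way to see the ambiguity is to count how many parse trees a toy grammar gives that sentence. The sketch below is only an illustration: the four-word lexicon and the handful of grammar rules are invented for this example and are nowhere near a real English grammar, and `count_parses` is a hypothetical helper.

```python
from functools import lru_cache

# Toy lexicon (assumed for illustration): each word lists the roles it can play.
LEXICON = {
    "fruit": {"N"},
    "flies": {"N", "V"},
    "like": {"V", "P"},
    "bananas": {"N"},
}

# Toy grammar (assumed): parent -> possible (left, right) children.
BINARY_RULES = {
    "S": [("NP", "VP")],
    "NP": [("N", "N")],
    "VP": [("V", "NP"), ("V", "PP")],
    "PP": [("P", "NP")],
}
# A bare noun can also stand as a noun phrase.
UNARY_RULES = {"NP": ["N"]}


def count_parses(sentence):
    """Count the distinct parse trees the toy grammar allows for the sentence."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def count(symbol, start, end):
        total = 0
        # A single word counts if the lexicon allows it to play this role.
        if end - start == 1 and symbol in LEXICON.get(words[start], ()):
            total += 1
        for child in UNARY_RULES.get(symbol, []):
            total += count(child, start, end)
        # Try every split point for each two-part rule.
        for left, right in BINARY_RULES.get(symbol, []):
            for mid in range(start + 1, end):
                total += count(left, start, mid) * count(right, mid, end)
        return total

    return count("S", 0, len(words))


print(count_parses("fruit flies like bananas"))  # prints 2
```

The two trees are "(fruit flies) (like bananas)" and "(fruit) (flies (like bananas))". Counting them is the easy part; picking the one the speaker meant is where the trouble starts.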


Hrm, Natal might be closer to the Eyetoy than everyone's assuming. All of Microsoft's releases on the system refer to "a camera", singular, with a "depth sensor" and some microphones. The "stereo cameras" on the device show up as purple dots on some of the press imagery (e.g. the Lionhead video) so I think they're actually just IR LEDs for a system that works like Sony's old patent. Microsoft recently bought a company that developed such a system. Sony was linked to a similar startup several years ago.

More on ZCam, the PC peripheral behind Microsoft's Natal.


Actually, the number of potential combinations is much less important than the fact that a hell of a lot is inherently ambiguous. Computers are good at crunching through large numbers of things. But how does it know how to parse "Fruit flies like bananas"? Flies can be both a verb and noun.

Why the f**k would it need to? Please tell me a gaming implementation where the difference between the verb and noun of flies could cause a problem.


Why the f**k would it need to? Please tell me a gaming implementation where the difference between the verb and noun of flies could cause a problem.

It's just a classic example of the ambiguity of language. God knows even text-based parsers had a nightmare of it in the adventure game days.


If voice control was useful we'd have been doing it on our PCs for years. Even with Vista well trained on my voice it was still largely pointless.

No we wouldn't, because PCs have a very well developed, finely honed input system and a very, very complicated operating system.

In your living room all you've got is the remote control and the joypad, incredibly clumsy for anything other than playing games or pressing play.

Forget the PC OS comparisons, they're really not useful. If you could tell your TV to change channels you'd use it all the time...


It's just a classic example of the ambiguity of language. God knows even text-based parsers had a nightmare of it in the adventure game days.

Maybe so, but context is going to reduce that ambiguity to virtually zero.

Sure, you're going to be able to trip the system up, just like telling the AI assistant on your local bus company's website to go rape herself, but is that a reason not to bother with it?


No we wouldn't, because PCs have a very well developed, finely honed input system and a very, very complicated operating system.

In your living room all you've got is the remote control and the joypad, incredibly clumsy for anything other than playing games or pressing play.

Forget the PC OS comparisons, they're really not useful. If you could tell your TV to change channels you'd use it all the time...

Isn't the video game controller *more* complicated than the mouse input device we use with PCs? They certainly have a lot more buttons, as well as joysticks. If you take the Wii Controller, that has the incredibly useful pointer as well. Surely that has more options than a mouse with two buttons and a roller?


No we wouldn't, because PCs have a very well developed, finely honed input system and a very, very complicated operating system.

In your living room all you've got is the remote control and the joypad, incredibly clumsy for anything other than playing games or pressing play.

Forget the PC OS comparisons, they're really not useful. If you could tell your TV to change channels you'd use it all the time...

No you wouldn't; it's far quicker to just press the button on the remote than to say even the channel number, and the remote has far better false positive and false negative rates.

PC OSes have a pointer-based system and a comprehensive method of inputting text or button commands. The last of these is handled perfectly well by remotes and game controllers and the other two are hardly going to be elegantly performed by voice commands. The best pointer-based system is probably the Wii Remote at this stage, and the best text input system is just to use a keyboard. The only reason why we don't have these in our living rooms already is because we didn't need them.


Two problems with voice control.

Do tell

It doesn't work

We'll see. I'm pretty sure if MS are spending so much time and money on it, it does.

and I don't want to do it.

No-one cares. Diddums.

Also it's only suitable for one person in a relatively silent environment.

That's a fair point, I hope they've thought of it. Again, if they've spent this much time and money on it, I'm sure they will have considered that at some stage.

It's mostly useless for games.

Has been so far, but why give up trying? If it works half as well as it does in the video, it'll certainly be worth having for quiz games and dashboard control. Who's it going to hurt?


Voice recognition is just a gimmick, is it not? It's something you use for a week and then forget all about as it's totally pointless. I've got it on my PC and I've got it in my car, and I've used neither of them since the first week of purchase.


it'll certainly be worth having for quiz games

Only if it's a game where people answer one at a time. And the dashboard? You'd rather say "command dashboard scroll left" than move your thumb 2mm on the d-pad?


No I wouldn't.

Even if it worked.

It would be very cumbersome and a lot slower than using a remote.

Not on my Virgin box it wouldn't.

Saying "Current TV" would be a lot quicker than going to the cumbersome guide to find out which channel number it is, then having to scroll down loads of channels to find it.

I have hundreds of channels, and I think the remote is a pretty bad way of navigating around them. Fine if you only like a few channels and set them to favourites or if you memorise loads of channel numbers. Not fine if you want to find a channel you don't usually watch.


We'll see. I'm pretty sure if MS are spending so much time and money on it, it does.

How much time? How much money? Nobody has ever wasted time and money on flawed tech? Have MS never made mistakes?

These things already exist in a wider world than just games and they don't work very well. Practically everything MS showed was either very obviously mocked up or very flakey (the live stuff). If they had it working they would have shown it.

Sony's wand was proof of working tech. Natal was just pie in the sky stuff which will be massively disappointing when it comes out.


OK, tell me which of these things is easier.

On Virgin media, E4 is channel 143. I only know that because I've just checked it.

Now which is easier. Learning the numbers of all the channels you watch (or even setting them as favourites and scrolling through them) or saying "E4" out loud when you want to change channel?

Typing in the channel number is only easier if you've already learned how to do it. In the future we won't need to.

Whoever gave the example of voice controlling last.fm on the 360 had it about right. I'd love to be able to do that shit, it'd make life much easier.

Assuming it works.
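
For what it's worth, the lookup side of this is trivial once the name has been recognised. A minimal sketch, assuming the recogniser hands over the channel name as text: only the E4 entry comes from this thread, and the table and both helper functions are hypothetical.

```python
# Hypothetical name-to-number table; only the E4 entry is taken from the thread.
CHANNELS = {"e4": 143}


def channel_for(spoken_name):
    """Return the channel number for a recognised name, or None if unknown."""
    return CHANNELS.get(spoken_name.strip().lower())


def remote_key_presses(channel_number):
    """Digit presses needed to key the number in directly on a remote."""
    return len(str(channel_number))


print(channel_for("E4"))
print(remote_key_presses(143))
```

The hard part is the recognition step in front of this, not the table.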


Saying "Current TV" would be a lot quicker than going to the cumbersome guide to find out which channel number it is, then having to scroll down loads of channels to find it.

How does "Current TV" contain any more information for the box to work with than pressing a single button? Saying two words takes longer, especially as you'll likely have to repeat it a few times.


But what if someone on your TV told it to change channels? That'd be quite irritating. It'd be even worse than when the travel news cuts in over the CD in the car

I'd imagine that could be easily solved with the voice and face recognition.

Again, assuming it works.


Now which is easier. Learning the numbers of all the channels you watch (or even setting them as favourites and scrolling through them) or saying "E4" out loud when you want to change channel?

The easiest way is to bring up a guide and scroll through it.


Maybe so, but context is going to reduce that ambiguity to virtually zero.

Sure, you're going to be able to trip the system up, just like telling the AI assistant on your local bus company's website to go rape herself, but is that a reason not to bother with it?

It really isn't. Human languages are a bollocks for computers to parse and there are structural as well as semantic problems. And that's once it's got past the problems with differing accents. If you have stock phrases, or keywords, fine. I imagine there are uses, or cleverly done it could add to the immersion. But it's basically absolutely nothing like what people are suggesting here.

If anything, for something like Fallout or Oblivion it'd be a step back. Those offer conversation trees that expand into questions a real person could ask in too many different ways for the computer to cope with. So you fall back to simple stock phrases.


How does "Current TV" contain any more information for the box to work with than pressing a single button? Saying two words takes longer, especially as you'll likely have to repeat it a few times.

Perhaps you'd have been right 10 years ago, if you were talking about terrestrial, 5 channel TV.

We have hundreds of channels in the future, join us. And if you can find a way to change from channel 106 to channel 143 with one button press, I'll give you a shiny pound.


But what if someone on your TV told it to change channels?

I'm pretty sure they could filter out the TV audio.

More problematic is your 3 year old seeing a horse in an advert and going "HORSE" which the TV interprets as "PAUSE".

This is a totally unneeded feature that nobody would use in practice.
