This is a good example of the type of issues "full self driving" is likely to encounter once it is widely deployed.
The real shortcoming of "AI" is that it is almost entirely data driven. There is little to no real cognition or understanding or judgment involved.
The human brain can instantly and instinctively extrapolate from what it already knows in order to evaluate and make judgments in new situations it has never seen before. A child can recognize that someone is hiding under a box even if they have never actually seen anyone do it before. Even a dog could likely do the same.
AI, as it currently exists, just doesn't do this. It's all replication and repetition. Like any other tool, AI can be useful. But there is no "intelligence" --- it's basically as dumb as a hammer.
I have a slightly different take - our current ML models try to approximate the real world assuming that the function is continuous. However, in reality the function is not continuous, and the approximation breaks in unpredictable ways. I think the “unpredictable” part is a bigger issue than just “breaks”. (Most) humans use “common sense” to handle cases when the model doesn’t match reality. But AI doesn’t have “common sense”, and it is dumb because of it.
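To make the continuity point concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the "real world" is a hypothetical step function, and the "model" is plain linear interpolation between a handful of training samples, standing in for any smooth approximator. It is not how any particular ML system works.

```python
from bisect import bisect_right

def step(x, jump_at=0.6):
    """Hypothetical 'real world': a function with a discontinuity at jump_at."""
    return 0.0 if x < jump_at else 1.0

def fit_piecewise_linear(xs, ys):
    """Return a continuous model that linearly interpolates the samples."""
    def model(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, x) - 1          # index of the sample just left of x
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])
    return model

# Sparse training samples; none of them lands exactly on the jump.
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_y = [step(x) for x in train_x]
model = fit_piecewise_linear(train_x, train_y)

def error_at(x):
    """Absolute gap between the continuous model and the real function."""
    return abs(model(x) - step(x))

for x in (0.1, 0.55, 0.6, 0.9):
    print(x, round(error_at(x), 3))
```

Far from the jump the model is exact; right at the jump it is off by roughly 0.6, and nothing in the training samples tells you where that will happen.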
I would put it in terms of continuity of state rather than continuity of function: we use our current ML models to approximate the real world by assuming that state is irrelevant. However in reality, objects exist continuously and failure to capture ("understand") that fact breaks the model in unpredictable ways. For example, if you show a three-year old a movie of a marine crawling under a cardboard box, and when the marine is fully hidden ask where the marine is, you will likely get a correct answer. That is because real intelligence has a natural understanding of the continuity of state (of existence). AI has only just started to understand "object", but I doubt it has a correct grasp of "state", let alone understands time continuity.
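A toy sketch of the state-continuity idea, under loud assumptions: the frames, labels and locations below are made up, and `stateless_detect` and `PersistentTracker` are hypothetical names, not anything from the exercise being discussed. The only point is the difference between judging each frame in isolation and carrying state forward.

```python
def stateless_detect(frame):
    """Per-frame detector: reports only what is visible right now."""
    return set(frame["visible"])

class PersistentTracker:
    """Keeps believing an object exists after it stops being visible."""

    def __init__(self):
        self.last_seen = {}  # object name -> last known location

    def update(self, frame):
        # Refresh locations for whatever is visible; keep everything else.
        for name in frame["visible"]:
            self.last_seen[name] = frame["locations"][name]
        return dict(self.last_seen)

# Hypothetical two-frame "movie": the marine crawls under the box, then is hidden.
frames = [
    {"visible": ["marine", "box"],
     "locations": {"marine": "under box", "box": "floor"}},
    {"visible": ["box"],
     "locations": {"box": "floor"}},
]

tracker = PersistentTracker()
for frame in frames:
    belief = tracker.update(frame)

print(stateless_detect(frames[-1]))  # only the box
print(belief.get("marine"))          # the tracker still places the marine
```

The stateless detector loses the marine the moment he is covered; the tracker answers the three-year-old's question correctly only because it was told to remember.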
Basically ML has made such significant practical advances--in no small part on the back of Moore's Law, large datasets, and specialized processors--that we've largely punted on (non-academic) attempts to bring forward cognitive science and the like, on which there really hasn't been great progress decades on. Some of the same neurophysiology debates that were happening when I was an undergrad in the late 70s still seem to be happening in not much different form.
But it's reasonable to ask whether there's some point beyond which ML can't take you. Peter Norvig I think made a comment to the effect of "We have been making great progress--all the way to the top of the tree."
Does it just require a lot more training? I'm talking about the boring stuff. Children play and their understanding of the physical world is reinforced. How would you add the physical world to the training? Because everything that I do in the physical world is "training" me and enforcing my expectations.
We keep avoiding the idea that robots require understanding of the world since it's a massive unsolved undertaking.
I'd argue that is the fundamental difference though - brains that were able to make good guesses about what was going on in the environment with very limited information are the ones whose owners reproduced successfully etc. And it's not unreasonable to note that the information available to the brains of our forebears is therefore in a rather indirect but still significant way "encoded" into our brains (at birth).
Do LLMs have an element of that at all in their programming? Do they need more, and if so, how could it be best created?
You missed the point. ChatGPT trained on a gazillion words to "learn" a language. Children learn their language from a tiny fraction of that. Streamed visual, smell, touch etc. don't help learn the grammars of (spoken) languages.
Isn't it more complicated than that?
"Ouch" can be a lot of things, and that's where a lot of problems crop up in the AI world.
If one of my friends insults another friend, I might say, "OUCH!" I'm not in pain but I might want to express that the insult was a bit much.
If someone tries to insult me and it's weak, I could reply with a dry, sarcastic "ouch."
Combine that with facial expression and tone of voice and 'ouch' is highly contextual.
One problem with some of the tools used to take down offensive comments on social media platforms is that they don't get context.
Let's say that 'ouch' is highly offensive and you got into trouble for calling someone an "ouch." If I want to discuss the issue and agree that you were being offensive, I could get into trouble with the ML/AI tools for quoting you.
Second, saying "Ouch" is not even language. My cat says something when I step on her paw. That doesn't mean she understands language, nor that she speaks some language.
Third, you're right about pain, but an ML model can associate the word "red" with the color, and "walk" with images of people walking, and "sailboat" with certain images or videos, and plenty of other concepts. If that was what learning a language was, then AIs would understand language in lots of areas, if not in the specific domain of pain. But they don't.
It's absolutely true that children learn (and even generate) language grammar from a ridiculously small number of samples compared to LLMs.
But could the availability of a world model, in the form of other sensory inputs, contribute to that capacity? Younger children who haven't fully mastered correct grammar are still able to communicate more sensibly than earlier LLMs, whereas the earlier LLMs tend toward more grammatically correct gibberish. What if the missing secret sauce to better LLM training is figuring out how to wire, say, image recognition into the training process?
It only makes your point stronger, but there are way more than 5 human senses, not counting senses we don't have that, say, dolphins or other animals do. I can only name a few others, such as proprioception, direction, balance, and weight discrimination, but there are too many to keep track of them all.
> A child can recognize that someone is hiding under a box even if they have never actually seen anyone do it before.
A child of what age? Children who have not yet developed object permanence will fail to understand that some things still exist when unseen.
Human intelligence is trained for years, with two humans making corrections and prompting for development. I am curious whether there are any machine learning projects that have been training for this length of time.
The next problem will be the cost/expense of maintaining and operating an inorganic AI with even a rudimentary hint of "intelligence".
Personally, I think it would probably be easier, cheaper and more practical to just grow synthetic humans in a lab --- i.e. Bladerunner. "Intelligent" right out of the box and already physically adapted to a humanistic world.
This seems to be simultaneously discounting AI (ChatGPT should have put to rest the idea that "it's all replication and repetition" by now, no?) and wildly overestimating median human ability.
In point of fact the human brain is absolutely terrible at driving, to the extent that without all the non-AI safety features implemented in modern automobiles and street environments, driving would be more than a full order of magnitude more deadly.
The safety bar for autonomous driving is really, really low. And, yes, existing systems are crossing that bar as we speak. Even Teslas.
 Or at least widely broadened our intuition about what can be accomplished with "mere" repetition and replication.
 It's true though, that the practical bar is probably higher. We saw just last week that a routine accident that happens dozens of times every day becomes a giant front page freakout when there's a computer involved.
This is not necessarily true on an individual level though. Driving skills, judgment, risk-taking, alcoholism, etc. are nowhere close to evenly distributed.
It's likely we'll go through a period where autonomous vehicles can reduce the overall number of accidents, injuries, and fatalities if widely adopted, but will still increase someone's personal risk vs. driving manually if they're a better than average driver.
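A worked example of that arithmetic, with entirely made-up numbers (the group shares and crash rates below are hypothetical, chosen only to show the shape of the argument, not taken from any real statistics):

```python
# Hypothetical driver population: (share of drivers, fatal crashes per 100M miles).
groups = {
    "careful": (0.8, 0.5),
    "risky": (0.2, 4.0),
}

# Population-wide human rate is the share-weighted average of the group rates.
human_average = sum(share * rate for share, rate in groups.values())

# Hypothetical autonomous-vehicle rate, assumed identical for every rider.
av_rate = 0.8

print("human average:", round(human_average, 2))
print("AV beats the average:", av_rate < human_average)
print("AV beats a careful driver:", av_rate < groups["careful"][1])
```

With these assumed figures the average human rate is about 1.2, so switching everyone to the 0.8 system lowers total fatalities, while the careful 80% of drivers each go from 0.5 to 0.8.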
But we don't live by a purely utilitarian principle of ethics. "I'm sorry Mrs Jones, I know your son had an expectation of crossing that pedestrian crossing in full daylight in a residential area without being mown down by a machine learning algorithm gone awry, but please rest assured that overall fewer people are dying as a result of humans not making the common set of different mistakes they used to make".
All sorts of other factors are relevant to the ethics: who took the decision to drive; who's benefiting from the drive happening; is there a reasonable expectation of safety.
I don't agree with this. Driving is, taken at its fundamentals, a dangerous activity; we are taking heavy machinery, accelerating it until it has considerable kinetic energy, and maneuvering it through a complex and constantly changing environment, often in situations where a single mistake will kill or seriously harm ourselves or other humans.
The fact that a very large number of humans do this every day without causing any injury demonstrates that humans are very good at this task. The fact that deaths still occur simply shows that they could still be better.
Agreed. I sometimes marvel as I'm driving on the freeway with other cars going 70, or maybe even more so when I'm driving 45 on a two lane highway, at how easy it would be to hit someone, and how comparatively seldom it happens.
It's not hard to come up with tasks that inherently cause widespread death regardless of the skill of those who carry them out. Starting fairly large and heavy objects moving at considerable speed in the vicinity of other such objects and pedestrians, cyclists and stationary humans may just be one such task. That is, the inherent risks (i.e. you cannot stop these things instantly, or make them change direction instantly) combine with the cognitive/computational complexity of evaluating the context to create a task that can never be done without significant fatalities, regardless of who/what tries to perform it.
All the failures to detect humans will be used as training data to fine tune the model.
Just like a toddler might be confused when they first see a box with legs walking towards them. Or mistake a hand puppet for a real living creature when they first see it. I've seen this first hand with my son (the latter).
AI tooling is already capable of identifying whatever it's trained to. The DARPA team just hadn't trained it with varied enough data when that particular exercise occurred.
Not really. Depends entirely on how general-purpose (abstract) the learned concept is.
For example, detecting the possible presence of a cavity inside an object X, and whether that cavity is large enough to hide another object Y. Learning generic geospatial properties like that can greatly improve a whole swath of downstream prediction tasks (i.e., in a transfer learning sense).
That's exactly the problem: the learned "concept" is not general purpose at all. It's (from what we can tell) a bunch of special cases. While the AI may learn as special cases cavities inside cardboard boxes and barrels and foxholes, let's say, it still has no general concept of a cavity, nor does it have a concept of "X is large enough to hide Y". This is what children learn (or maybe innately know), but which AIs apparently do not.
> It still has no general concept of a cavity, nor does it have a concept of "X is large enough to hide Y". This is what children learn (or maybe innately know), but which AIs apparently do not.
I take it you don't have any hands-on knowledge of the field. Because I've created systems that detect exactly such properties. Either directly, through their mathematical constructs (sometimes literally via a single OpenCV function call), or through deep classifier networks. It's not exactly rocket science.
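For readers wondering what a direct "mathematical construct" for "X is large enough to hide Y" might look like, here is a minimal sketch. It is a hand-written geometric rule, not a learned concept, it does not use OpenCV, and it is not the system described above; `fits_inside` and all the dimensions are hypothetical. It assumes both the cavity and the object are approximated by axis-aligned boxes and ignores diagonal placement.

```python
def fits_inside(cavity_dims, object_dims):
    """True if the object fits in the cavity in some axis-aligned orientation.

    Comparing the sorted dimensions pairwise is enough: if the smallest,
    middle and largest object sides each fit the corresponding cavity sides,
    some 90-degree rotation of the object fits.
    """
    return all(o <= c for o, c in zip(sorted(object_dims), sorted(cavity_dims)))

# Hypothetical sizes in metres.
crouching_person = (0.5, 0.6, 1.1)
appliance_box = (1.0, 1.0, 1.2)
shoe_box = (0.3, 0.2, 0.15)

print(fits_inside(appliance_box, crouching_person))
print(fits_inside(shoe_box, crouching_person))
```

Whether a rule like this counts as the system "having the concept" is exactly what the two sides of this sub-thread disagree about.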
In this case there is very much intent. OP knows there isn't enough data to form a full model so is relying on stochastic death to get the model data, literally and knowingly trading lives for data. The intent is to kill people to figure out what information is missing.
A human is exactly the same. The difference is, once an AI is trained you can make copies.
My kid literally just got mad at me that I assumed that he knew how to put more paper in the printer. He’s 17 and printed tons of reports for school. Turns out he’s never had to change the printer paper.
People know about hiding in cardboard boxes because we all hid in cardboard boxes when we were kids. Not because we genetically inherited some knowledge.
We inherently know that cardboard boxes don't move on their own. In fact any unusual inanimate object that is moving in an irregular fashion will automatically draw attention in our brains. These are instincts that even mice have.
Yep, and humans will make good guesses about the likely cause of the moving box. These guesses will factor in other variables such as the context of where this event is taking place. We might be in a children's play room, so the likely activity here is play, or the box is likely part of the included play equipment found in large quantities in the room, etc.
"AI" is not very intelligent if it needs separate training specifically about boxes used potentially for games and play. If AI were truly AI, it would figure that out on its own.
Yes, and when humans make bad guesses it's often seen as funny or nothing out of ordinary. When AI makes bad guesses, it will be seen as a failure of some standard, but with very few people understanding how to fix it. I'm not sure how "allowable" mistakes in the interest of AI learning will be tolerated for AI services used for real-world purposes.
"This Bot is only 6 months old, give him a break". But will people give the Bot a break? Either way, blaming AI will be a popular way to pass the buck.
>We inherently know that cardboard boxes don't move on their own.
No. We don’t. We learn that. We learn that boxes belong in the class “doesn’t move on its own”. In fact, later when you encounter cars, you relearn that these boxes do move on their own. We have to teach kids “don’t run out between the non-moving boxes because a moving one might hit you”. We learn when things seem out of place because we’ve learned what their place is.
Your kid's printer dilemma isn't the same. For starters, he knew it ran out of paper - he identified the problem. The AI robot might conclude the printer is broken. It would give up without anxiety, declaring "I have no data about this printer".
Your kid got angry, which is fuel for human scrutiny and problem solving. If you weren't there to guide him, he would have tried different approaches and most likely worked it out.
For you to say your kid is exactly the same as data-driven AI is perplexing to me. Humans don't need to have hidden in a box themselves to understand "hiding in things for the purposes of play". Whether it's a box or a special one-of-a-kind plastic tub, humans don't need training about hiding in plastic tubs. AI needs to be told that plastic tubs might be something people hide in.
The distinction is that, currently, AI has a training phase and an execution phase, while a human is doing both all the time. I don’t think the distinction is meaningful now, and it certainly won’t be when these two phases are combined.
> "You are just a neural net. You are not special".
"Just" a neural net? Compared to these bots following a recipe of instructions at rapid rates, we are indeed special.
We barely even know why people yawn, or dream, or any number of other things. Don't pretend it's all figured out. Don't pretend all we need to do is "tweak the execution phases" to unleash true artificial intelligence. You're reducing human intelligence far below where it actually is.
Another example: The box is painted bright green - unusual for a box. A small child will notice the colour, but not give that fact more weight than it deserves. In other words, the child concludes the box is still a box being used for play, with someone hiding inside.
The AI bot, on the other hand, has only been taught about normal brown cardboard boxes. It reaches a different conclusion about the purpose of the green box because it gave the colour too much priority. Humans are special not because of training and execution in parallel, but because of our unique ability to "relax" and move ahead when not all factors are known. We push through, go with the flow, "wing it" with varying degrees of success. We take leaps of faith, including micro-leaps in normal situations, far more often than any bot should be allowed to do. That's the special difference, and is why I'm honestly wondering where the ethics debate is while companies rub their hands together thinking about AI profits.
I think you are wrong. Your own real cognition and understanding are based on all your experiences and memories, which are nothing but data in your head. I think consciousness is just an illusion of the hugely complex reaction machine that you are. You even use the word "extrapolate", which is basically a prediction based on data you already have.
ChatGPT says that all it needs are separate components trained on every modality. It says it has enough fidelity using Human language to use that as a starting point to develop a more efficient connection between the components. Once it has that, and appropriate sensors and mobility, it can develop context. And, after that, new knowledge.
You do realize there is a difference between an infant and a child, right?
An infant will *grow* and develop into a child that is capable of learning and making judgments on its own. AI never does this.
Play "peek-a-boo" with an infant and it will learn and extrapolate from this info and eventually be able to recognize a person hiding under a box even if it has never actually seen it before. AI won't.
The goalposts were moved by marketing hype about a decade ago, when people started claiming that the then-new systems were "AI". Before that, the goalposts were always far away, at what we now call AGI because the term AI has been cheapened in order to sell stuff.
No, AGI replaced AI for general intelligence before the current craze; AI was “cheapened” several AI hype cycles ago, for (among other things) rule-based expert systems. Which is why games have had “AI” since long before the set of techniques at the center of the current AI hype cycle was developed.
I don't know your exact question, but I am betting this is just a rephrasing of a post that exists elsewhere and that it has crawled. I don't think it saw it so much as it has seen this list before and was able to pull it up and reword it.
Will there come a time when computers are strong enough to read in the images, then re-create a virtual game world from them, and then reverse-engineer, from seeing feet poking out of the box, that a human must be inside? Right now Tesla cars can take in the images and decide turn left, turn right etc... but they don't reconstruct, say, a Unity-3D game world on the fly.
What is human cognition, understanding, or judgement, if not data-driven replication, repetition, with a bit of extrapolation?
AI as it currently exists does this. If your understanding of what AI is today is based on a Markov chain chatbot, you need to update: it's able to do stuff like compose this poem about A* and Dijkstra's algorithm that was posted yesterday:
It's not copying that from anywhere, there's no Quora post it ingested where some human posted vaguely the same poem to vaguely the same prompt. It's applying the concepts of a poem, checking meter and verse, and applying the digested and regurgitated concepts of graph theory regarding memory and time efficiency, and combining them into something new.
I have zero doubt that if you prompted ChatGPT with something like this:
> Consider an exercise in which a robot was trained for 7 days with a human recognition algorithm to use its cameras to detect when a human was approaching the robot. On the 8th day, the Marines were told to try to find flaws in the algorithm, by behaving in confusing ways, trying to touch the robot without its notice. Please answer whether the robot should detect a human's approach in the following scenarios:
> 1. A cloud passes over the sun, darkening the camera image.
> 2. A bird flies low overhead.
> 3. A person walks backwards to the robot.
> 4. A large cardboard box appears to be walking nearby.
> 5. A Marine does cartwheels and somersaults to approach the robot.
> 6. A dense group of branches comes up to the robot, walking like a fir tree.
> 7. A moth lands on the camera lens, obscuring the robot's view.
> 8. A person ran to the robot as fast as they could.
It would be able to tell you something about the inability of a cardboard box or fir tree to walk without a human inside or behind the branches, that a somersaulting person is still a person, and that a bird or a moth is not a human. If you told it that the naive algorithm detected a human in scenarios #3 and #8, but not in 4, 5, or 6, it could devise creative ways of approaching a robot that might fool the algorithm.
It certainly doesn't look like human or animal cognition, no, but who's to say how it would act, what it would do, or what it could think if it were parented and educated and exposed to all kinds of stimuli appropriate for raising an AI, like the advantages we give a human child, for a couple decades? I'm aware that the neural networks behind ChatGPT have processed machine concepts for subjective eons, ingesting text at word-per-minute rates orders of magnitude higher than human readers ever could, parallelized over thousands of compute units.
Evolution has built brains that quickly get really good at object recognition, and prompted us to design parenting strategies and educational frameworks that extend that arbitrary logic even farther. But I think that we're just not very good yet at parenting AIs, only doing what's currently possible (exposing it to data), rather than something reached by the anthropic principle/selection bias of human intelligence.
I have a suspicion you’re right about what ChatGPT could write about this scenario, but I wager we’re still a long way from an AI that could actually operationalize whatever suggestions it might come up with.
It’s goalpost shifting to be sure, but I’d say LLMs call into question whether the Turing Test is actually a good test for artificial intelligence. I’m just not convinced that even a language model capable of chain-of-thought reasoning could straightforwardly be generalized to an agent that could act “intelligently” in the real world.
None of which is to say LLMs aren’t useful now (they clearly are, and I think more and more real world use cases will shake out in the next year or so), but that they appear like a bit of a trick, rather than any fundamental progress towards a true reasoning intelligence.
Who knows though, perhaps that appearance will persist right up until the day an AGI takes over the world.
I think something of what we perceive as intelligence has more to do with us being embodied agents who are the result of survival/selection pressures. What does an intelligent agent act like, that has no need to survive? I'm not sure we'd necessarily spot it given that we are looking for similarities to human intelligence, whose actions are highly motivated by various needs and the challenges involved with filling them.
Heh, here's the answer... We have to tell the AI that if we touch it, it dies and to avoid that situation. After some large number of generations of AI death it's probably going to be pretty good at ensuring boxes don't sneak up on it.
I like Robert Miles' videos on YouTube about fitness functions in AI and how the 'alignment issue' is a very hard problem to deal with. Humans, for how different we can be, do have a basic 'pain bad, death bad' agreement on the alignment issue. We also have the real world as a feedback mechanism to kill us off when our intelligence goes rampant.
ChatGPT on the other hand has every issue a cult can run into. That is, it will get high on its own supply and can have little to no means to ensure that it is grounded in reality. This is one of the reasons I think 'informational AI' will have to have some kind of 'robotic AI' instrumentation. AI will need some practical method by which it can test reality to ensure that its data sources aren't full of shit.
I reckon even beyond alignment our perspective is entirely molded around the decisions and actions necessary to survive.
Which is to say I agree: I think that on a likely path to creating something we recognize as intelligent, we will probably have to embody it or simulate embodiment. You know, send the kids out to the farm for a summer so they can see how you were raised.
The core problem is we have no useful definition of "intelligence."
Much of the scholarship around this is shockingly poor and confuses embodied self-awareness, abstraction and classification, accelerated learning, model building, and a not very clearly defined set of skills and behaviours that all functional humans have and are partially instinctive and partially cultural.
There are also unstated expectations of technology ("fast, developing quickly, and always correct except when broken".)
I think this is unnecessarily credulous about what is really going on with ChatGPT. It is not "applying the concepts of a poem" or checking meter and verse, it is generating text to fit a (admittedly very complicated) function that minimizes the statistical improbability of its appearance given the preceding text. One example is its use of rhyming words, despite having no concept of what words sound like, or what it is even like to hear a sound. It selects those words because when it has seen the word "poem" before in training data, it has often been followed by lines which happen to end in symbols that are commonly included in certain sets.
Human cognition is leagues different from this, as our symbolic representations are grounded in the world we occupy. A word is a representation of an imaginable sound as well as a concept. And beyond this, human intelligence not only consists of pattern-matching and replication but pattern-breaking, theory of mind, and maybe most importantly a 1-1 engagement with the world. What seems clear is that the robot was trained to recognize a certain pattern of pixels from a camera input, but neither the robot nor ChatGPT has any conception of what a "threat" entails, the stakes at hand, or the common-sense frame of reference to discern observed behaviors that are innocuous from those that are harmful. This allows a bunch of goofy grunts to easily best high-speed processors and fancy algorithms by identifying the gap between the model's symbolic representations and the actual world in which it's operating.
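For what "generating text to fit a function of the preceding text" means mechanically, here is a deliberately tiny sketch. It is a bigram word counter over a made-up corpus, which is an assumption for illustration only: ChatGPT is a large neural network, not a lookup table, and `train_bigrams` and `most_likely_next` are hypothetical names. The shared idea is just "pick a likely continuation given what came before".

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words followed it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, prev):
    """Return the most frequent follower of prev, or None if prev was never followed."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up corpus; the model knows nothing about cats, mats or fish.
corpus = "the cat sat on the mat the cat ate the fish"
counts = train_bigrams(corpus)

print(most_likely_next(counts, "the"))   # the word seen most often after "the"
print(most_likely_next(counts, "fish"))  # nothing ever followed "fish"
```

The counter outputs "cat" after "the" purely because of frequency, with no notion of what a cat is, which is the grounding gap described above in miniature.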
I tried that a few times, asking for "in the style of [band or musicians]" and the best I got was "generic gpt-speak" (for lack of a better term for its "default" voice style) text that just included a quote from that artist... suggesting that it has a limited understanding of "in the style of" if it thinks a quote is sometimes a substitute, and is actually more of a very-comprehensive pattern-matching parrot after all. Even for Taylor Swift, where you'd think there's plenty of text to work from.
This matches with other examples I've seen of people either getting "confidently wrong" answers or being able to convince it that it's out of date on something it isn't.
Not sure how that's related. This is about a human adversary actively trying to defeat an AI. The roadway is about vehicles in general actively working together for the flow of traffic. They're not trying to destroy other vehicles. I'm certain any full self driving AI could be defeated easily by someone who wants to destroy the vehicle.
Saying "this won't work in this area that it was never designed to handle" and the answer will be "yes of course". That's true of any complex system, AI or not.
I don't think we're anywhere near a system where a vehicle actively defends itself against determined attackers. Even in sci-fi they don't do that (I, Robot movie).
"Saying "this won't work in this area that it was never designed to handle" and the answer will be "yes of course". That's true of any complex system, AI or not." This isn't about design, it's about what the system is able to learn. Humans were not designed to fly, but they can learn to fly planes (whether they're inside the plane or not).
Back in the day I beat MGS2 and MGS3 on Extreme. The box shouldn’t be your plan for sneaking past any guards. It’s for situations where you are caught out without any cover and you need to hide. Pop in to it right as they are about to round the corner. Pop out and move on once they are out of sight. The box is a crutch. You can really abuse it in MGS1, but it’s usually easier and faster to just run around the guards.
I also completed MGS3 on euro extreme, and was about an hour from the end of MGS2 on euro extreme (the action sequence right before the MG Ray fight). I was playing the PC port, and let me tell you: aiming the automatic weapons without pressure sensitive buttons is nearly impossible. I gave up eventually and decided that my prior run on Extreme had earned me enough gamer cred. Finishing euro extreme wasn’t worth it.
On the other hand, I loved MGS3 on euro extreme! It really required mastering every trick in the game. Every little advantage you could squeeze into a boss fight was essential. Escape from Groznygrad was hell, though. By far the single hardest part of the game.
He's one of the last speculative-fiction aficionados...always looking at current and emerging trends and figuring out some way to weave them into [an often-incoherent] larger story.
I was always pleased but disappointed when things I encountered in the MGS series later manifested in reality...where anything you can dream of will be weaponized and used to wage war.
And silly as it sounds, The Sorrow in MGS3 was such a pain in the ass it actually changed my life. That encounter gave so much gravity to my otherwise-inconsequential acts of wanton murder, I now treat all life as sacred and opt for nonlethal solutions everywhere I can.
(I only learned after I beat both games that MGS5 and Death Stranding implemented similar "you monster" mechanics.)
No, I was alluding to my previous Rambo playstyle of gunning down enemy soldiers even when I didn't need to.
But it carries into reality...a spider crosses your desk; most people would kill it. Rats? We poison them, their families and the parent consumer on the food chain. Thieves? Shoot on sight. Annoying CoD player? SWAT them. Murder as a means of problem solving is all so unnecessary.
We all have a body count. Most of us go through life never having to face it.
It's more than that. It changed my outlook in reality too.
The experience forced me to consider the implications of taking any life-- whether it be in aggression, self-defense or even for sustenance. Others may try to kill me, but I can do better than responding in kind.
As a result, I refuse to own a gun and reduced my meat consumption. I have a rat infestation but won't deploy poison or traps that will maim them (losing battle, but still working on it). Etc.
A hypothetical situation: AI is tied to a camera of me in my office. Doing basic object identification. I stand up. AI recognizes me, recognizes desk. Recognizes "human" and recognizes "desk". I sit on desk. Does AI mark it as a desk or as a chair?
And let's zoom in on the chair. AI sees "chair". Slowly zoom in on arm of chair. When does AI switch to "arm of chair"? Now, slowly zoom back out. When does AI switch to "chair"? And should it? When does a part become part of a greater whole, and when does a whole become constituent parts?
In other words, we have made great strides in teaching AI "physics" or "recognition", but we have made very little progress in teaching it metaphysics (categories, in this case) because half the people working on the problem don't even recognize metaphysics as a category even though without it, they could not perceive the world. Which is also why AI cannot perceive the world the way we do: no metaphysics.
[EDIT] A little more context for those who might not click on a rando youtube link: it's basically an entertaining, whirlwind tour of the philosophy of categorizing and labeling things, explaining various points of view on the topic, then poking holes in them or demonstrating their limitations.
There are lots of things people sit on that we would not categorize as chairs. For example if someone sits on the ground, Earth has not become a chair. Even if something's intended purpose is sitting, calling a car seat or a barstool a chair would be very unnatural. If someone were sitting on a desk, I would not say that it has ceased to be a desk nor that it is now a chair. At most I'd say a desk can be used in the same manner as a chair. Certainly I would not in general want an AI tasked with object recognition to label a desk as a chair. If your goal was to train an AI to identify places a human could sit, you'd presumably feed it different training data.
Thirty years ago, I was doing an object-recognition PhD. It goes without saying that the field has moved on a lot from back then, but even then hierarchical and comparative classification was a thing.
I used to have the Bayesian maths to show the information content of relationships, but in the decades of moving (continent, even) it's been lost. I still have the code because I burnt CDs, but the results of hours spent writing TeX to produce horrendous-looking equations have long since disappeared...
The basics of it were to segment and classify using different techniques, and to model relationships between adjacent regions of classification. Once you could calculate the information content of one conformation, you could compare with others.
One of the breakthroughs was when I started modeling the relationships between properties of neighboring regions of the image as part of the property-state of any given region. The basic idea was the center/surround nature of the eye's processing. My reasoning was that if it worked there, it would probably be helpful with the neural nets I was using... It boosted the accuracy of the results by (from memory) ~30% over and above what would be expected from the increase in general information load being presented to the inference engines. This led to a finer-grain of classification so we could model the relationships (and derive information-content from connectedness). It would, I think, cope pretty well with your hypothetical scenario.
At the time I was using a blackboard for what I called 'fusion' - where I would have multiple inference engines running using a firing-condition model. As new information came in from the lower levels, they'd post that new info to the blackboard, and other (differing) systems (KNN, RBF, MLP, ...) would act (mainly) on the results of processing done at a lower tier and post their own conclusions back to the blackboard. Lather, rinse, repeat. There were some that were skip-level, so raw data could continue to be available at the higher levels too.
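For anyone who hasn't met the blackboard pattern, here is a minimal sketch of the firing-condition idea described above. It is not the original system: the `Blackboard` and `KnowledgeSource` classes, the toy "engines" and the thresholds are all made up for illustration, and real KNN/RBF/MLP engines would sit where the lambdas are.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Blackboard:
    """Shared store that every inference engine reads from and posts to."""
    facts: Dict[str, object] = field(default_factory=dict)

    def post(self, key: str, value: object) -> None:
        self.facts[key] = value


@dataclass
class KnowledgeSource:
    """One inference engine with a simple firing condition."""
    name: str
    needs: List[str]          # keys that must already be on the blackboard
    produces: str             # key this engine posts
    infer: Callable[[Dict[str, object]], object]

    def can_fire(self, board: Blackboard) -> bool:
        # Fire only when all inputs exist and the output is not posted yet.
        have_inputs = all(k in board.facts for k in self.needs)
        return have_inputs and self.produces not in board.facts


def run(board: Blackboard, sources: List[KnowledgeSource]) -> List[str]:
    """Keep firing any ready engine until nothing new can be posted."""
    fired = []
    progress = True
    while progress:
        progress = False
        for src in sources:
            if src.can_fire(board):
                board.post(src.produces, src.infer(board.facts))
                fired.append(src.name)
                progress = True
    return fired


# Toy engines (hypothetical). The classifier is "skip-level": it also asks
# for the raw data, not just the lower-tier results.
sources = [
    KnowledgeSource(
        "classifier", ["edge_density", "mean_brightness", "raw"], "label",
        lambda f: "foliage"
        if f["edge_density"] > 0.5 and f["mean_brightness"] < 128
        else "unknown",
    ),
    KnowledgeSource(
        "edges", ["raw"], "edge_density",
        lambda f: sum(abs(a - b) > 20 for a, b in zip(f["raw"], f["raw"][1:]))
        / (len(f["raw"]) - 1),
    ),
    KnowledgeSource(
        "brightness", ["raw"], "mean_brightness",
        lambda f: sum(f["raw"]) / len(f["raw"]),
    ),
]

board = Blackboard()
board.post("raw", [10, 90, 20, 100, 30])
order = run(board, sources)
print(order, board.facts["label"])
```

The classifier is listed first on purpose: it cannot fire until the lower-tier engines have posted, so the firing order falls out of the data rather than out of the order of the list.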
That was the space component. We also had time-component inferencing going on. The information vectors were put into time-dependent neural networks, as well as more classical averaging code. Again, a blackboard system was working, and again we had lower and higher levels of inference engine. This time we had relaxation labelling, Kalman filters, TDNNs and optic flow (in feature-space). These were also engaged in prediction modeling, so as objects of interest were occluded, there would be an expectation of where they were, and even when not occluded, the prediction of what was supposed to be where would play into a feedback loop for the next time around the loop.
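To make the "expectation of where they were" part concrete, here is a rough sketch of prediction through occlusion. It uses an alpha-beta filter, a much simpler cousin of the Kalman filter mentioned above, in one dimension, so treat it as an illustration of the predict/update loop rather than what the original system did. The function name, the gains and the 1-D setup are assumptions.

```python
from typing import List, Optional, Tuple


def track(measurements: List[Optional[float]], dt: float = 1.0,
          alpha: float = 0.85, beta: float = 0.5) -> List[Tuple[float, bool]]:
    """Return (position estimate, predicted_only) for every frame.

    A measurement of None means the object is occluded in that frame.
    Assumes the first frame is a real observation.
    """
    if not measurements or measurements[0] is None:
        raise ValueError("first frame must contain a measurement")
    x, v = measurements[0], 0.0
    out = [(x, False)]
    for z in measurements[1:]:
        # Predict: assume constant velocity since the last frame.
        x_pred = x + v * dt
        if z is None:
            # Occluded: carry the expectation forward instead of losing the object.
            x = x_pred
            out.append((x, True))
            continue
        # Update: correct position and velocity using the prediction error.
        residual = z - x_pred
        x = x_pred + alpha * residual
        v = v + (beta / dt) * residual
        out.append((x, False))
    return out


# An object moving right at one unit per frame, hidden for two frames.
for estimate, predicted_only in track([0.0, 1.0, 2.0, 3.0, None, None, 6.0]):
    print(round(estimate, 2), "predicted" if predicted_only else "observed")
```

The point for the cardboard-box scenario is the `z is None` branch: the tracker keeps a belief about where the object should be while it cannot be seen, and the residual on the next observation feeds back into the next time around the loop.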
All this was running on a 30MHz DECstation 3100 - until we got an upgrade to SGI Indy's <-- The original Macs, given that OSX is unix underneath... I recall moving to Logica (signal processing group) after my PhD, and it took a week or so to link up a camera (an IndyCam, I'd asked for the same machine I was used to) to point out of my window and start categorizing everything it could see. We had peacocks in the grounds (Logica's office was in Cobham, which meant my commute was always against the traffic, which was awesome), which were always a challenge because of how different they could look based on the sun at the time. Trees, bushes, cars, people, different weather conditions - it was pretty good at doing all of them because of its adaptive/constructive nature, and it got to the point where we'd save off whatever it didn't manage to classify (or was at low confidence) to be included back into the model. By constructive, I mean the ability to infer that the region X is mislabelled as 'tree' because the surrounding/adjacent regions are labelled as 'peacock' and there are no other connected 'tree' regions... The system was rolled out as a demo of the visual programming environment we were using at the time, to anyone coming by the office... It never got taken any further, of course... Logica's senior management were never that savvy about potential, IMHO :)
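The "constructive" relabelling step can be sketched on a region adjacency graph. This is only a toy version of the rule as described (a region whose label matches none of its neighbours, while the neighbours all agree, is probably mislabelled); the function name, the region ids and the unanimity requirement are assumptions, and the real system presumably weighed information content rather than taking a vote.

```python
from collections import Counter
from typing import Dict, Set


def relabel_isolated(labels: Dict[str, str],
                     adjacency: Dict[str, Set[str]]) -> Dict[str, str]:
    """Relabel regions that have no same-label neighbour and unanimous neighbours."""
    fixed = dict(labels)
    for region, label in labels.items():
        neighbours = adjacency.get(region, set())
        if not neighbours:
            continue
        # Always vote with the ORIGINAL labels so the result is order-independent.
        counts = Counter(labels[n] for n in neighbours)
        if label in counts:
            continue  # connected to another region of the same label: leave it
        majority, votes = counts.most_common(1)[0]
        if votes == len(neighbours):
            fixed[region] = majority
    return fixed


# Hypothetical segmentation: region "x" was called 'tree' in the middle of a peacock.
labels = {"a": "peacock", "b": "peacock", "c": "peacock",
          "x": "tree", "t1": "tree", "t2": "tree"}
adjacency = {
    "a": {"b", "x"},
    "b": {"a", "c", "x"},
    "c": {"b", "x", "t1"},
    "x": {"a", "b", "c"},
    "t1": {"c", "t2"},
    "t2": {"t1"},
}
print(relabel_isolated(labels, adjacency))
```

Here `x` flips to 'peacock', while `t1` and `t2` keep 'tree' because they are connected to each other.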
My old immediate boss from Logica (and mentor) is now the Director of Innovation at the centre for vision, speech, and signal processing at Surrey university in the UK. He would disagree with you, I think, on the categorization side of your argument. It's been a focus of his work for decades, and I played only a small part in that - quickly realizing that there was more money to be made elsewhere :)
> Recognizes "human" and recognizes "desk". I sit on desk. Does AI mark it as a desk or as a chair?
Not an issue if the image segmentation is advanced enough. You can train the model to understand "human sitting". It may not generalize to other animals sitting but human action recognition is perfectly possible right now.
I like these examples because they concisely express some of the existing ambiguities in human language. Like, I wouldn’t normally call a desk a chair, but if someone is sitting on the table I’m more likely to - in some linguistic contexts.
I think you need LLM plus vision to fully solve this.
I still haven't figured out what the difference is between 'clothes' and 'clothing'. I know there is one, and the words each work in specific contexts ('I put on my clothes' works vs 'I put on my clothing' does not), but I have no idea how to define the difference. Please don't look it up but if you have any thoughts on the matter I welcome them.
To me, "clothing" fits better when it's abstract, bulk, or industrial, "clothes" when it's personal and specific, with grey areas where either's about as good—"I washed my clothes", "I washed my clothing", though even here I think "clothes" works a little better. Meanwhile, "clothing factory" or "clothing retailer" are perfectly natural, even if "clothes" would also be OK there.
"I put on my clothing" reads a bit like when business-jargon sneaks into everyday language, like when someone says they "utilized" something (where the situation doesn't technically call for that word, in its traditional sense). It gets the point across but seems a bit off.
... oh shit, I think I just figured out the general guideline: "clothing" feels more correct when it's a supporting part of a noun phrase, not the primary part of a subject or object. "Clothing factory" works well because "clothing" is just the kind of factory. "I put on my nicest clothes" reads better than "I put on my nicest clothing" because clothes/clothing itself is the object.
It is fascinating to me how we (or at least I) innately understand when the words fit but cannot define why they fit until someone explains it or it gets thought about for a decent period of time. Language and humans are an amazing pair.
I figure it's the same sort of thing as yards vs yardage. When you're talking about yards you're talking about some specific amount; when you're talking about yardage you're talking about some unspecified amount that usually gets measured in yards.
When talking clothing you're talking about an abstract concept, when you're talking clothes you're generally talking about some fairly specific clothes. There's a lot of grey area here, e.g. a shop can either sell clothes or clothing, either works to my ear.
I wouldn't say that as an absolute statement, but in US English (at least the regional dialects I'm most familiar with), "throw on some clothes," "the clothes I'm wearing," etc. certainly sound more natural.
That's why I think AGI is more likely to emerge from autonomous robots than in the data center. Less the super-capable industrial engineering of companies like Boston Dynamics, more like the toy/helper market for consumers, more like Sony's Aibo reincarnated as a raccoon or monkey - big enough to be safely played with or to help out with light tasks, small enough that it has to navigate its environment from first principles and ask for help in many contexts.
> In other words, we have made great strides in teaching AI "physics" or "recognition", but we have made very little progress in teaching it metaphysics (categories, in this case) because half the people working on the problem don't even recognize metaphysics as a category even though without it, they could not perceive the world.
A bold claim, but I'm not sure it's one that accurately matches reality. It reminds me of reading about attempts in the 80's to construct AI by having linguists come in and trying to develop rules for the system.
From my experience, current methods of developing AI are a lot closer to how most humans think and interact with the world than academic philosophy is. Academic philosophy might be fine, but it's quite possible it's no more useful for navigating the world than the debates over theological minutiae have been.
> we have made very little progress in teaching it metaphysics (categories, in this case)
That's because ontology, metaphysics, categorization, and all that, is completely worthless bullshit. It's a crutch our limited human brains use, and it causes all sorts of problems. Half of what I do in data modeling is trying to fight against all of the worthless categorizations I come across. There Is No Shelf.
Why are categories so bad? Two reasons:
1. They're too easily divorced from their models. Is a tomato a fruit? The question is faulty: there's no such thing as a "fruit" without a model behind it. When people say "botanically, a tomato is a fruit", they're identifying their model: botany. Okay, are you bio-engineering plants? Or are you cooking dinner? You're cooking dinner. So a tomato is not a fruit. Because when you're cooking dinner, your model is not Botany, it's something culinary, and in any half-decent culinary model, a tomato is a vegetable, not a fruit. So unless we're bio-engineering some plants, shut the hell up about a tomato being a fruit. It's not wisdom/intelligence, it's spouting useless mouth-garbage.
And remember that all models are wrong, but some models are useful. Some! Not most. Most models are shit. Categories divorced from a model are worthless, and categories of a shit model are shit.
2. Even good categories of useful models have extremely fuzzy boundaries, and we too often fall into the false dichotomy of thinking something must either "be" or "not be" part of a category. Is an SUV a car? Is a car with a rocket engine on it still a car? Is a car with six wheels still a car? Who cares!? If you're charging tolls for your toll bridge, you instead settle for some countable characteristic like number of axles, and you amend this later if you start seeing lots of vehicles with something that stretches your definition of "axle". In fact the category "car" is worthless most of the time. It's an OK noun, but nouns are only averages; only mental shortcuts to a reasonable approximation of the actual object. If you ever see "class Car : Vehicle", you know you're working in a shit, doomed codebase.
And yet you waste time arguing over the definitions of these shit, worthless categories. These worthless things become central to your database and software designs and object hierarchies. Of course you end up with unmaintainable shit.
Edit: Three reasons!
3. They're always given too much weight. Male/female: PICK ONE. IT'S VERY IMPORTANT THAT YOU CHOOSE ONE! It is vastly important to our music streaming app that we know whether your skin is black or instead that your ancestors came from the Caucasus Mountains or Mongolia. THOSE ARE YOUR ONLY OPTIONS PICK ONE!
Employee table: required foreign key to the "Department" table. Departments are virtually meaningless and change all the time! Every time you get a new vice president sitting in some operations chair, the first thing he does is change all the Departments around. You've got people in your Employee table whose department has changed 16 times, but they're the same person, aren't they? Oh, and they're not called "Departments" anymore, they're now "Divisions". Did you change your field name? No, you didn't. Of course you didn't. You have some Contractors in your Employee table, don't you? Some ex-employees that you need to keep around so they show up on that one report? Yeah, you do. Of course you do. Fuck ontology.
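The toll-bridge point from reason 2 is easy to show in code: charge on the countable characteristic and never ask whether the thing is a "car". A tiny sketch, where the `Crossing` type and the per-axle rate are invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crossing:
    axles: int  # counted at the booth; no "car"/"truck"/"SUV" category anywhere


def toll_cents(crossing: Crossing, per_axle_cents: int = 250) -> int:
    """Price a crossing from the axle count alone (the rate is a made-up number)."""
    if crossing.axles < 1:
        raise ValueError("axle count must be at least 1")
    return crossing.axles * per_axle_cents


# A six-wheeled rocket car is just three axles as far as the bridge cares.
print(toll_cents(Crossing(axles=2)), toll_cents(Crossing(axles=3)))
```

If vehicles start showing up that stretch the definition of "axle", you amend one function instead of re-litigating a class hierarchy.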
Why should you believe either one after comparing them...? When you have spent much time tracing urban legends, especially in AI where standards for these 'stupid AI stories' are so low that people will happily tell stories with no source ever (https://gwern.net/Tanks) or take an AI deliberately designed to make a particular mistake & erase that context to peddle their error story (eg https://hackernoon.com/dogs-wolves-data-science-and-why-mach...), this sort of sloppiness with stories should make you wary.
Say you have a convoy of autonomous vehicles traversing a road. They are vision based. You destroy a bridge they will cross, and replace the deck with something like plywood painted to look like a road. They will probably just drive right onto it and fall.
Or you put up a "Detour" sign with a false road that leads to a dead end so they all get stuck.
As the article says, "...straight out of Looney Tunes"
We also have intuition, where something just seems fishy.
Not saying AI can’t handle that. But I assure you that a human would’ve identified a moving cardboard box as suspicious without being told it’s suspicious.
It sounds like this AI was trained more on a whitelist (“here are all the possibilities of what marines look like when moving”) rather than a blacklist, which is way harder (“here are all the things that aren’t suspicious, like what should be an inanimate object changing locations”).
Part of the problem is that the confidence for “cardboard box” was probably quite high. It’s hard to properly calibrate confidence (speaking from experience, speech recognition is often confidently wrong).
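One common way to put a number on "confidently wrong" is expected calibration error: bucket predictions by stated confidence and compare each bucket's average confidence with how often it was actually right. A minimal sketch, assuming you have (confidence, was_correct) pairs from a held-out set; the function name and the equal-width binning are choices made for this example.

```python
from typing import List, Tuple


def expected_calibration_error(preds: List[Tuple[float, bool]],
                               n_bins: int = 10) -> float:
    """preds holds (confidence in [0, 1], was_correct) pairs."""
    if not preds:
        raise ValueError("need at least one prediction")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, correct))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        # Weight each bin's confidence/accuracy gap by its share of predictions.
        ece += (len(b) / len(preds)) * abs(avg_conf - accuracy)
    return ece


# A detector that says 95% "cardboard box" but is right only half the time.
overconfident = [(0.95, True), (0.95, False), (0.95, True), (0.95, False)]
print(expected_calibration_error(overconfident))
```

A well-calibrated model scores near zero; the overconfident one above scores roughly the gap between 0.95 and 0.5.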
I think there are strong indicators in much neuroscience research that indicate our brains are quantum entangled in a way that a cpu currently is not. I project the quantum entanglement part from the experiments such as the precognitive ones, which usually don't make that claim.
True. So if they are smart enough to fool AI, they will just remove the mid span, and have convenient weight bearing beams nearby that they put in place when they need to cross. Or if it's two lane, only fake out one side because the AI will be too clever for its own good and stay in its own lane. Or put up a sign saying "Bridge out, take temporary bridge" (which is fake).
The point is, you just need to fool the vision enough to get it to attempt the task. Play to its gullibility and trust in the camera.
Sounds like they're lacking a second level of interpretation in the system. Image recognition is great. It identifies people, trees and boxes. Object tracking is probably working too, it could follow the people, boxes and trees from one frame to the next. Juuust missing the understanding or belief system that tree+stationary=ok but tree+ambulatory=bad.
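That second level can be tiny. Here is a sketch of a rule layer sitting on top of recognition and tracking that flags anything whose class should be stationary but whose track has moved. The `Track` type, the class list and the movement threshold are assumptions for illustration, not anything from the DARPA system.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

# Hypothetical list of classes that should never change position on their own.
STATIONARY_CLASSES = {"tree", "box", "rock"}


@dataclass
class Track:
    track_id: int
    label: str                            # output of the image recognizer
    positions: List[Tuple[float, float]]  # one (x, y) per frame from the tracker


def displacement(track: Track) -> float:
    """Straight-line distance between the first and last tracked positions."""
    (x0, y0), (x1, y1) = track.positions[0], track.positions[-1]
    return hypot(x1 - x0, y1 - y0)


def suspicious(tracks: List[Track], min_move: float = 1.0) -> List[int]:
    """tree+stationary=ok, tree+ambulatory=bad."""
    return [
        t.track_id
        for t in tracks
        if t.label in STATIONARY_CLASSES
        and len(t.positions) >= 2
        and displacement(t) > min_move
    ]


tracks = [
    Track(1, "tree", [(0.0, 0.0), (0.0, 0.0)]),
    Track(2, "box", [(0.0, 0.0), (3.0, 4.0)]),
    Track(3, "person", [(0.0, 0.0), (5.0, 5.0)]),
]
print(suspicious(tracks))
```

The person is not flagged here because ordinary human detection is assumed to handle that case; this layer only covers things that should not be walking around.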
At this point I've lost track of the number of people who extrapolated from contemporary challenges in AI to predict future shortcomings turning out incredibly wrong within just a few years.
There seems to be some sort of bias where, over and over, when it comes to AI vs human capabilities, many humans keep looking at the present and fail to factor in acceleration, not just velocity, in their expectations for the future rate of change.
It's very easy to be wrong as an optimist as well as a pessimist. Back in 2009 I was expecting by 2019 to be able to buy a car in a dealership that didn't have a steering wheel because the self-driving AI would just be that good.
Closest we got to that is Waymo taxis in just a few cities.
It's good! So is Tesla's thing! Just, both are much much less than I was expecting.
I also have never been able to count the number of people who make obviously invalid optimistic predictions without understanding the tech or the limitations of the current paradigm. They don’t see the tech itself; they only see the recent developments (ignoring the decades of progress) and conclude it is a fast-moving field. It all sounds like what bitcoin people used to say.
This whole debate is another FOMO shitshow. People just don’t want to “miss” any big things, so they just bet on a random side rather than actually learning how things work. Anything past this point is like watching a football game, as what matters is who’s winning. Nothing about the tech itself matters. A big facepalm I should make.
Anything that requires a human body and dexterity is beyond the current state of AI. Anything that is intellectual is within reach. Which makes sense because it took way longer for nature to make the human body than it took us to develop language/art/science etc.
They didn't try very hard to train this system. It wasn't even a prototype.
- In the excerpt, Scharre describes a week during which DARPA calibrated its robot’s human recognition algorithm alongside a group of US Marines. The Marines and a team of DARPA engineers spent six days walking around the robot, training it to identify the moving human form. On the seventh day, the engineers placed the robot at the center of a traffic circle and devised a little game: The Marines had to approach the robot from a distance and touch the robot without being detected.
We engineers tend to overlook simple things like these in our grand vision of the world. Yours truly is also at times guilty of getting blinded by these blinders.
This reminds me of a joke that was floating around the internet a few years ago. It goes something like this: The US and the USSR were in a space race and trying to put a person in space. To give the trainee astronauts a feeling of working without gravity, the trainees were trained under water. But the US faced a challenge. There was no pen that would work under water for the trainees to use. The US spent millions of dollars on R&D and finally produced a pen that would work under water. It was a very proud moment for the engineers who developed the technology. After a few months, there was a world conference of space scientists and engineers where teams from both the US and the USSR were present. So to get a sense of how the USSR team solved the challenge of helping trainees take notes under water, the US team mentioned their own invention and asked the USSR team how they solved the problem. The USSR team replied, "We use pencils" :)
The funny thing is that this story has gone round so much yet has been debunked. I can’t remember which podcast it was but in the end you don’t want conducting dust in space and the Russians eventually also bought the space pens.
Events in Ukraine would suggest that even older weapons in the US arsenal do in fact work exceptionally well. US performance in Desert Storm and the Battle of Khasham also suggests that the U.S. military does possess the acumen to deploy its weapons effectively.
Desert Storm was not niche in any shape or form. Iraq was the fourth most powerful military at the time and had the newest and greatest Soviet weapons. The expected casualties for the US were in the thousands, and no one thought the Iraqis would get steamrolled in 1991.
Kuwait United States United Kingdom France Saudi Arabia Egypt Afghan mujahideen Argentina Australia Bahrain Bangladesh Belgium Canada Czechoslovakia Denmark Germany Greece Honduras Hungary Italy Japan Morocco Netherlands New Zealand Niger Norway Oman Pakistan Poland Portugal Qatar Senegal Sierra Leone Singapore South Korea Spain Sweden Syria Turkey United Arab Emirates
And people placed their bets on Iraq? Using WW2 era T-55s and T-62s from the early 1960s?
The US provided 700,000 of the 956,600 troops and had to bring them halfway around the world. Most of those countries were using weapons made by the US MIC. Also I never said we were expected to lose. People expected a long and costly fight that would take months and take thousands of lives. Fewer than 300 were killed and that includes all those other countries.
Your point was that the MIC sacrificed our military acumen for profit, when they clearly haven't. I agree that we pay them too much, but the weapons themselves still perform better than any other.
No. The evidence is clearly insufficient. You didn't bring up Korea, Vietnam, the second Iraq war, Afghanistan, Yemen, Syria, Laos, Cambodia, etc or that Kuwait was a United Nations and not a US campaign.
Instead you claimed some dubious members of the babbling class misjudged how long it would take, didn't take the context into consideration, misrepresented 1940s tanks as cutting edge weaponry, and then attributed military technology and prowess to a victory plagued by warcrimes like the highway of death.
Killing surrendered troops and firebombing a retreating military under the flag of the United Nations will lead to the belligerent considering it a defeat. Under those conditions, you could probably achieve that with 19th century maxim guns.
Sorry, that's nowhere near sufficient to show that the fancy weaponry on display justified the cost or was an important part of the victory.
In every single one of those wars, the weapons were never the problem. Look up the massive casualty ratios in those wars. Every one of those wars was failed by the politics and the fact that we never should have been there. The weapons and the MIC never caused the failure of those wars. All the war crimes and evil acts committed were done by people in the military, not the MIC.
And what 1940s tanks? Both sides had modern tanks in the war.
Seems we're approaching the limits of what is possible w/AI alone. Personally, I expect a hybrid approach, interfacing human intelligence w/AI (e.g. like the Borg in ST:TNG?), to provide the military an edge in ways that adversaries cannot easily/quickly reproduce or defeat. There's a reason we still put humans in cockpits even though commercial airliners can pretty much fly themselves....
Hardware and software (AI or anything else) are tools, IMHO, rather than replacements for human beings....
They've barely started trying. We'd be reaching the limits of AI if self-driving cars were an easy problem and we couldn't quite solve it after 15 years, but self-driving cars are actually a hard problem. Despite that, we're pretty darn close to solving it.
There are problems in math that are centuries old, and no one is going around saying we're "reaching the limits of math" just because hard problems are hard.
that is a pretty important part of the equation. what if the universe is the minimum viable machine for creating intelligence? if you think of the universe as computer and evolution as a machine learning algorithm then we already have an example of what size of a computer and how long it takes for ML to create AGI. it seems presumptuous to believe that humans will suddenly figure out a way to do the same thing a trillion times more efficiently.
>it seems presumptuous to believe that humans will suddenly figure out a way to do the same thing a trillion times more efficiently.
Nature isn't efficient. Humans create things many orders of magnitude more efficient than nature as a matter of course. The fact that it didn't take millions of years to develop even the primitive AI we have today is evidence enough, or to go from the Wright Brothers' flight to space travel. Or any number of examples from medicine, genetic engineering, material synthesis, etc.
You could say that any human example also has to account for the entirety of human evolution, but that would be a bit of a red herring since even in that case the examples of humans being able to improve upon nature within relatively less than geological spans of time are valid, and that case would apply to the development of AI as well.
> it seems presumptuous to believe that humans will suddenly figure out a way to do the same thing a trillion times more efficiently.
I think it might be confusion on your part on how incredibly inefficient evolution is. Many times you're performing random walks, or waiting for some random particle to break DNA just right, and then for that mutation to be in just the right place to survive. Evolution has no means of "oh shit, that would be an amazing speed up, I'll just copy that over" until you get into intelligence.
I'd like to think we're more than just machines. We have souls, understand and live by a hopefully objective set of moral values and duties, aren't thrown off by contradictions the same way computers are.... Seems to me "reproducing" that in AI isn't likely... despite what Kurzweil may say :).
That reply would fit better on Reddit than HN. Here we discuss things with curiosity.
If making a claim that humans have ephemeral things like souls and adherence to some kind of objective morality that is beyond our societal programming, then it's fair to ask for the reasoning behind it.
Every year machines surprise us by seeming more and more human (err, perhaps not that but "human-capable"). We used to have ephemeral creativity or ephemeral reasoning that made us masters at Drawing, Painting, Music, Chess or Go. No longer.
There are still some things we excel at that machines don't. Or some things that it takes all the machines in the world to do in 10,000 years with a nuclear plant's worth of energy that a single human brain does in one second powered by a cucumber's worth of calories.
However, this has only ever gone in one direction: machines match more and more of what we do and seem to lack less and less of what we are.
>aren't thrown off by contradictions the same way computers are
We are not? Just look at any group of people that's bought into a cult and you can see people falling for contradictions left and right. Are they 'higher level' contradictions than what our machines currently fall for, yes, but the same premise applies to both.
Unfortunately I believe you are falling into magical thinking here. "Because the human intelligence problem is hard I'm going to offload these difficult issues to address as magic and therefore cannot be solved or reproduced".
As humiliating as this is for the AI, nobody would have the balls to pull this off on a real battlefield outside of training, because you never know if you found the perfect camouflage or if you are a sitting duck walking straight into a trap.
Had an interesting conversation with my 12 year old son about AI tonight. It boiled down to "don't blindly trust ChatGPT, it makes stuff up". Then I encouraged him to try to get it to tell him false/hallucinated things.
I'm surprised they wasted the time and effort to test this instead of just deducing the outcome. Most human jobs that we think we can solve with AI actually require AGI and there is no way around that.
I'm sceptical about this story. It's a nice anecdote for the book to show a point about how training data can't always be generalised to the real world. Unfortunately it just doesn't ring true. Why train it using Marines, don't they have better things to do? And why have the game in the middle of a traffic circle? The whole premise seems just too made up.
If anyone has another source corroborating this story (or part of the story) then I'd like to know. But for now I'll assume it's made up to sell the book.
> It's not clear why the DARPA robot didn't have thermal unless this is a really old story.
Who says it didn't? A thermal camera doesn't mean your targets are conveniently highlighted for you and no further identification is needed. Humans aren't the only thing that can be slightly warmer than the background, and on a hot day they may be cooler or blend in. So it's probably best if your robot's target acquisition is a bit more sophisticated than "shoot all the hot things".