Imagine you have a robot with a great set of mechanical eyes. These eyes can see and interpret things in the physical world. One of the things this robot can see and interpret is software user interfaces. By looking at a screen, the robot can tell where the submit button is, where to hit like, and where to swipe left or right.

Now imagine this robot’s eyes can just as easily train on a 72-inch screen on a wall in your living room. And these eyes are so good, the robot can see every pixel. Now imagine that robot wants to know Wikipedia. I mean it wants to know ALL of Wikipedia. The robot could put Wikipedia up on that 72-inch screen and read (very quickly) all the content, methodically clicking on every single link and turning over every single rock. This could work. The robot would learn all the information. But it may not be the most efficient way to transfer the data. Still, let’s say it goes ahead and reads Wikipedia this way. How would it store the data? In what structure? Now imagine it can store the data by performing entity extraction on ingestion, “as it reads it,” and then forming a massive entity graph of people, places, things, and concepts.
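
To make that entity graph a little more concrete, here is a minimal sketch in Python of what ingestion-time entity extraction might look like, using spaCy for named-entity recognition and networkx for the graph. The sample text and the “link entities that appear together” edge rule are illustrative assumptions, not part of the thought experiment itself.

```python
import spacy                   # named-entity recognition
import networkx as nx          # in-memory graph of people, places, things
from itertools import combinations

# Hypothetical passage standing in for a page the robot "reads" off the screen.
text = (
    "Ada Lovelace worked with Charles Babbage in London on the Analytical Engine, "
    "an early design for a general-purpose computer."
)

nlp = spacy.load("en_core_web_sm")   # assumes the small English model is installed
graph = nx.Graph()

doc = nlp(text)
entities = [(ent.text, ent.label_) for ent in doc.ents]

# Each extracted entity becomes a node, tagged with its type (PERSON, GPE, etc.).
for name, label in entities:
    graph.add_node(name, label=label)

# Illustrative edge rule: entities mentioned in the same passage get linked.
for (a, _), (b, _) in combinations(entities, 2):
    graph.add_edge(a, b, relation="co-occurs")

print(graph.nodes(data=True))
print(graph.edges(data=True))
```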

Okay, so I hope that created an interesting picture in your mind. I hope you imagined a robot standing in front of a giant TV, reading all of Wikipedia at superhuman speed, and storing all the information in a beautiful graph, much like the human brain does.

With that picture still in your mind, let’s keep the cool graph brain but redo the way the robot ingests the data. Imagine instead we took all the Wikipedia data and turned it into a bitmap across a 72-inch TV. Now imagine that bitmap changed every 0.1 seconds. Imagine how much faster the robot could ingest all that data. After staring at the screen for a few minutes, the robot could ingest all of Wikipedia. If the robot just hooked up to Wi-Fi, couldn’t we zap all that data straight into its brain? Sure. That could work, as long as the robot could find and accept the data feed. But think about how humans ingest information: through our senses. If we want to make Turing-like AI that closely resembles humans and can think with the same sophistication, shouldn’t we try to mimic the way a human ingests information? Maybe the data rate of a feed over Wi-Fi is faster. But seeing data through eyes and hearing data through ears is certainly more ubiquitous. And isn’t it really just a matter of time until we figure out how to pass information faster through computer vision than we can today through wireless data?
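
Is “a few minutes” plausible? Here’s a back-of-the-envelope calculation in Python. The screen resolution, color depth, and the ~25 GB figure for a compressed English Wikipedia text dump are all illustrative assumptions, not numbers from the thought experiment.

```python
# Back-of-the-envelope throughput for the "bitmap on a TV" channel.
# Assumptions: a 4K panel, 24-bit color, a fresh frame every 0.1 seconds,
# and roughly 25 GB for a compressed English Wikipedia text dump.

width, height = 3840, 2160        # 4K resolution, in pixels
bytes_per_pixel = 3               # 24-bit color
frame_interval_s = 0.1            # one new bitmap every tenth of a second

frame_bytes = width * height * bytes_per_pixel
throughput_bytes_per_s = frame_bytes / frame_interval_s

wikipedia_bytes = 25e9            # assumed dump size
transfer_seconds = wikipedia_bytes / throughput_bytes_per_s

print(f"Per frame: {frame_bytes / 1e6:.1f} MB")
print(f"Raw channel rate: {throughput_bytes_per_s * 8 / 1e9:.2f} Gbit/s")
print(f"Time to show all of Wikipedia: {transfer_seconds / 60:.1f} minutes")
```

Under those assumptions the screen carries about 25 MB per frame, roughly 2 Gbit/s, and all of Wikipedia fits in under two minutes of staring, assuming the robot’s eyes could actually capture and decode every frame without loss.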

What if two robots wanted to communicate with each other? Most people today would conclude they would use some wireless protocol with an authentication handshake to link up and pass data back and forth. Imagine if they could communicate more ubiquitously through the visual spectrum.
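
As a toy sketch of what that could look like, here is one way a robot might pack a message into pixels for another robot to “see” and decode. The one-byte-per-pixel grayscale scheme and the two-byte length header are my own illustrative assumptions; a real visual channel would also need synchronization, calibration, and error correction.

```python
import numpy as np

def encode_frame(message: bytes, width: int = 64, height: int = 64) -> np.ndarray:
    """Pack a message into a grayscale frame, one byte per pixel (toy scheme)."""
    frame = np.zeros(width * height, dtype=np.uint8)
    frame[0] = len(message) >> 8              # 2-byte length header
    frame[1] = len(message) & 0xFF
    frame[2:2 + len(message)] = np.frombuffer(message, dtype=np.uint8)
    return frame.reshape(height, width)

def decode_frame(frame: np.ndarray) -> bytes:
    """Read the length header, then recover the payload from the pixels."""
    flat = frame.reshape(-1)
    length = (int(flat[0]) << 8) | int(flat[1])
    return flat[2:2 + length].tobytes()

# Robot A "displays" a frame; robot B "sees" it and decodes the message.
sent = encode_frame(b"hello from robot A")
print(decode_frame(sent))   # b'hello from robot A'
```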