Source: Empower Labs
How much progress has robotics made recently?
Recently, research on intelligent robots has been in full swing, with new demonstrations emerging one after another.
Tesla released the second-generation Optimus in mid-December. It is a pure prototype rather than an industrial product, but it is well finished. In demonstrations, the astronaut-like Optimus showed off refined, precise movement. Musk said it was designed with a human size and shape so that it can seamlessly replace human labor and do whatever humans are unwilling to do.
Tesla's robot has a strong sci-fi industrial style and looks expensive to build. Perhaps this image is why people take whatever it can do for granted. Tesla also did not show many application scenarios, so the general reaction was little more than an "oh." The two robots announced one after another in January, however, drew a genuinely surprised "Huh?" from many people.
The first to appear was the Mobile Aloha project, announced by a Stanford University research team. It may have aroused such widespread interest because the scenarios it finds for robots are more practical: cooking, playing with cats, and doing laundry. The project's main innovation is that it uses low-cost hardware (more than 30,000 US dollars, which is still very expensive for home use) to build an autonomous, mobile, two-armed robot (although it does not look very human) that can learn human skills. The learning process looks a bit clumsy. Take cooking as an example: you first operate the robot through the task yourself, and it remembers the general motions. At that point it cannot yet hold the pot steadily, but the wonderful part is that it then runs dozens of rounds of autonomous training, using the cameras on its arms, until it really can hold the pot steadily.
Next, the company Figure released a video of its humanoid robot, Figure 01, making coffee. The robot hears the spoken command "Make me a cup of coffee" and skillfully uses a capsule coffee machine to make one. Figure calls this achievement "the ChatGPT moment of humanoid robots," not because the robot uses a large language model to understand voice commands, but because it learned the coffee-making skill simply by observing human movements, an achievement Figure considers as striking as ChatGPT. Figure 01 built an understanding of the task by watching humans use a coffee machine, and then mastered the skill through several rounds of autonomous training and error correction. This points to the broad prospects of AI-driven general-purpose humanoid robots.
Bill Gates' "A robot in every home"
The first 2007 issue of "Scientific American" magazine carried an article signed by Bill Gates. As I remember, it was the cover story. The title of the article is "A robot in every home."
In the article, Bill Gates expressed great excitement about the opportunities in the robotics industry, because the situation looked very similar to the one he saw when he founded Microsoft 30 years earlier: breakthrough technologies had emerged, but professional-grade business machines were still monopolized by a few large companies. Startups and hobbyists kept creating interesting things, but their work was so fragmented that there were no common standards or development tools. Bill Gates therefore boldly predicted that once this problem was solved, robots would surely enter ordinary households everywhere.
So Microsoft decisively invested in making this happen. It established a Robotics department and launched Microsoft Robotics Studio, preparing to replicate the success it had achieved in the PC era.
In the article, Bill Gates cited the classic 2004 DARPA Grand Challenge. Yes, this is the legendary DARPA that invented the Internet. The competition's goal was for fully autonomous vehicles to cross more than 140 miles of the Mojave Desert. In the first year, the best competitor struggled to cover just 7 miles. In the second year, 5 vehicles completed the course, driving all the way to the finish. The contrast shows how quickly robotics technology was evolving, and it is where Bill Gates' confidence came from.
Microsoft's efforts at that time were at the development tool level. The capabilities of sensors, motors, servos and other hardware were improving rapidly while their prices fell. At the development level, however, you had to write a dedicated program for each piece of hardware in order to drive it. Getting the weak processors of the time to handle data from multiple sensors in real time was also a big challenge. Microsoft's solution was to establish standards for drivers and to provide multi-threading capabilities. Microsoft even launched the .NET Micro Framework. Readers who know .NET can imagine that bringing such a heavyweight into robot development tools put it in a different league from the alternatives. Robot developers would not even need to worry about memory management or thread scheduling; they could just write the logic directly.
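The idea in the paragraph above, a common driver standard plus built-in concurrency so that the developer only writes logic, can be illustrated with a small sketch. This is not Microsoft Robotics Studio's API and it is written in Python rather than .NET; the names `SensorDriver`, `FakeRangeFinder`, `FakeBumper`, `poll_all` and `decide` are hypothetical, and the sensors return made-up constant values.

```python
import threading
import time
from abc import ABC, abstractmethod


class SensorDriver(ABC):
    """Hypothetical uniform driver interface: every device exposes read()."""

    name: str

    @abstractmethod
    def read(self) -> float:
        ...


class FakeRangeFinder(SensorDriver):
    name = "range_finder"

    def read(self) -> float:
        time.sleep(0.01)  # stand-in for slow hardware I/O
        return 1.5        # made-up distance in meters


class FakeBumper(SensorDriver):
    name = "bumper"

    def read(self) -> float:
        time.sleep(0.01)
        return 0.0        # made-up value: 0.0 means "not pressed"


def poll_all(drivers):
    """Read every sensor on its own thread and collect results by name."""
    readings = {}
    lock = threading.Lock()

    def worker(driver):
        value = driver.read()
        with lock:  # protect the shared dict while writing
            readings[driver.name] = value

    threads = [threading.Thread(target=worker, args=(d,)) for d in drivers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return readings


def decide(readings):
    """Robot logic sees plain readings only, never hardware details."""
    if readings["bumper"] > 0 or readings["range_finder"] < 0.3:
        return "stop"
    return "forward"
```

With this split, adding a new sensor means writing one more `SensorDriver` subclass, while `decide` stays unchanged as long as it does not need the new reading.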
But we already know how it turned out: Microsoft's efforts in robotics were not successful, and the entire Robotics division was disbanded in a reorganization in 2014. From the author's own intermittent observations, the main reasons were probably cost and application. After all, even today it costs a lot of money to have a robotic arm at home, and we still don't know what to do with one.
ChatGPT Moment for robots?
Back to the present. Both Mobile Aloha and Figure 01 demonstrate the same ability: learning an action through sensors (whether a camera or remote operation of the joints), and then truly mastering it through feedback from autonomous training. Not only that, the resulting set of actions can be packaged as a skill that is then invoked through natural conversation. Such skills can be copied to similar robots at any time without programming.
It seems that robots' capabilities really have reached a new level. This has led many people to ask in unison: "Have robots reached a disruptive moment like ChatGPT's?"
Compared with when Bill Gates made his prediction more than ten years ago, today's robots have made several new strides:
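The learn, invoke, and copy cycle described above can be pictured as a small data structure. The sketch below is a toy illustration only and is not how Mobile Aloha or Figure 01 are implemented: the `Skill` and `Robot` classes are hypothetical, the trajectory is made-up data, and simple keyword matching stands in for the language model that would interpret a spoken command.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    trigger: str       # keyword that invokes the skill (stand-in for an LLM)
    trajectory: list   # joint targets recorded from a demonstration


@dataclass
class Robot:
    label: str
    skills: dict = field(default_factory=dict)

    def learn(self, skill):
        """Store a skill after demonstration and autonomous practice."""
        self.skills[skill.name] = skill

    def handle_command(self, utterance):
        """Return the name of the first skill whose trigger appears in the command."""
        text = utterance.lower()
        for skill in self.skills.values():
            if skill.trigger in text:
                return skill.name
        return None

    def copy_skills_to(self, other):
        """Copy every skill to a similar robot; no reprogramming involved."""
        for skill in self.skills.values():
            other.learn(copy.deepcopy(skill))


# Usage: one robot learns a skill, a second robot receives a copy.
teacher = Robot("unit-a")
teacher.learn(Skill("make_coffee", "coffee", [[0.0, 0.1], [0.2, 0.3]]))
student = Robot("unit-b")
teacher.copy_skills_to(student)
chosen = student.handle_command("Make me a cup of coffee")
```

Here `chosen` ends up naming the copied skill even though `student` was never taught it directly, which is the property the paragraph above highlights.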
1. More versatile. In Bill Gates' view, a robot could take any shape as long as it could complete a particular task. When the author sneaked into a Robotics group meeting back then, the robots in their demonstrations could do little more than move around and crawl. Today's robots can already acquire skills for household scenarios, and these skills can be copied and spread. The robots themselves are also increasingly humanoid, designed to perform a variety of general tasks on people's behalf.
2. Natural interaction. With the support of multi-modal LLMs, current robots can understand human voice commands and learn from inputs such as cameras. This is major progress in machine learning, and it significantly lowers the barrier to both development and use.
3. Costs are further reduced. Although the announced hardware cost of Mobile Aloha is still more than 30,000 US dollars, that figure includes a mobile base. Counting only the robotic arms, it could just about pass for a high-end home appliance. The mobile base may be one of the next hot topics. For example, the logic behind some recent investments in Tesla is "Don't treat it as an electric car, but treat it as the next generation universal mobile base."
Jim Fan is one of the most prominent KOLs in this field. He is a senior scientist at NVIDIA and was the first intern at OpenAI. In a recent tweet, he explained why he believes robots will be the biggest hot topic in 2024.
But even in this enthusiastic tweet, Jim said that "universal physical AI robots" are still about three years away.
On this, the author is cautiously optimistic. The optimism comes from seeing such great progress, and the caution comes from the lessons of Microsoft's experience.
But one thing is for sure: it is an exciting time.