The Future of Robotics: Chatbots that Can Take Action!

Covariant, a robotics startup, is testing a ChatGPT-style chatbot that can operate a robotic arm in order to develop machines that can be more useful in the physical world.


Robots that can effectively communicate and interact with humans have always been a tantalizing vision for the future. However, programming robots to perform complex tasks beyond a limited set of chores has proven to be a monumental challenge. Until now… 👾

Peter Chen, the CEO of Covariant, a renowned robot software company, recently showcased a groundbreaking chatbot that can manipulate physical objects. Ask it about the tote in front of it, and it not only describes the items inside but also picks them up and moves them around with a robot arm. Cool, right? 🤖

This hands-on chatbot is an innovative step toward giving robots the kind of flexible capabilities that we’ve only seen in programs like ChatGPT. It represents the promising future of AI-powered robots that can do more than just mundane tasks. “Foundation models are the future of robotics,” declares Chen, referring to large-scale, general-purpose machine-learning models adapted for specific domains. Covariant’s impressive chatbot, powered by their Robot Foundation Model (RFM-1), is no exception.

RFM-1 has been trained not only with vast amounts of text, like its counterparts ChatGPT and Google’s Gemini, but also with video and hardware control data sourced from tens of millions of examples of robot movements in the physical world. This combination allows RFM-1 to connect language with action seamlessly. 😮
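The article doesn't describe how RFM-1 actually represents this mixed data internally, but the general idea behind multimodal foundation models is to flatten every modality into one token stream and train on next-token prediction. Here's a rough, hypothetical sketch of that interleaving step; the token format, function names, and example values are all invented for illustration:

```python
# Hypothetical sketch (NOT Covariant's actual code): how a multimodal
# robotics model might flatten text, video frames, and robot actions
# into a single token sequence for next-token training.

def tokenize_text(instruction):
    # One token per word, tagged with its modality.
    return [("text", word) for word in instruction.split()]

def tokenize_frames(num_frames):
    # Stand-in for video: one placeholder token per frame.
    return [("frame", i) for i in range(num_frames)]

def tokenize_actions(joint_angles):
    # Stand-in for hardware control data: discretized joint angles.
    return [("action", round(angle, 2)) for angle in joint_angles]

def build_training_example(instruction, num_frames, joint_angles):
    """Interleave all modalities into one flat sequence."""
    return (tokenize_text(instruction)
            + tokenize_frames(num_frames)
            + tokenize_actions(joint_angles))

example = build_training_example("pick the red tote", 3, [0.12, 1.57, -0.33])
print(len(example))  # 4 text + 3 frame + 3 action tokens = 10
```

Once everything lives in one sequence, the same transformer machinery that predicts the next word in a sentence can, in principle, predict the next robot action, which is how language and action get connected in a single model.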

This model goes above and beyond language proficiency. It can generate videos demonstrating robots completing a wide range of tasks. For instance, it can show you how a robot should grab an object from a cluttered bin. “It’s a little bit mind-blowing,” says Chen. And he’s right! 🤯

But that’s not all. RFM-1 has even showcased the ability to control hardware that it hasn’t been specifically trained on. In the future, this could mean that a single general model could operate a humanoid robot. Pieter Abbeel, the co-founder and chief scientist of Covariant, believes that with further training, RFM-1 could enable robots to perform an array of tasks with fluency, similar to how Tesla uses data from its cars to train self-driving algorithms. The possibilities are endless! 🚀

Covariant, founded in 2017, currently sells software that allows robots to pick items out of bins in warehouses. However, with models like RFM-1, they can potentially expand the capabilities of their robots to adapt to new tasks more smoothly. Many roboticists, including Abbeel and his colleagues, see the language models behind ChatGPT and similar programs as a potential catalyst for a robotics revolution 🤖.

Of course, such ambitious projects come with challenges. One major hurdle is the lack of readily available data for training robots in the same way that text and images can be easily accessed on the internet. Pulkit Agrawal, an MIT professor specializing in AI and robotics, emphasizes the need to generate training data for robots. This can involve collecting videos of humans performing tasks or creating simulations featuring robots. Google DeepMind, for example, has developed AI models (RT-2) and datasets (RT-X) for robots using this approach.
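To make the simulation route concrete, here's a toy sketch of generating synthetic robot-motion records, the kind of data the article says doesn't exist ready-made on the web. The fields and the two-dimensional "physics" are made up purely for illustration:

```python
# Illustrative only: generating synthetic pick-attempt records in a toy
# simulator. Real robot simulation (e.g., physics engines) is far richer;
# this just shows the data-generation pattern.
import math
import random

ARM_REACH = 1.2  # toy arm length, in arbitrary units

def simulate_pick(seed):
    """Simulate one pick attempt at a random 2D target position."""
    rng = random.Random(seed)
    target = (rng.uniform(-0.5, 0.5), rng.uniform(0.2, 1.2))
    distance = math.hypot(*target)
    # The attempt "succeeds" if the target is within the arm's reach.
    return {"target": target,
            "distance": distance,
            "success": distance <= ARM_REACH}

# Each simulated episode becomes one training record.
dataset = [simulate_pick(i) for i in range(1000)]
print(len(dataset))  # 1000 records, generated on demand
```

The appeal of this approach is scale: unlike web text, robot trajectories can be manufactured cheaply and in any quantity, which is exactly why labs like Google DeepMind lean on simulation alongside real-robot datasets.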

Covariant’s impressive robot arm data, acquired from its deployments with customers, is undeniably useful. However, it’s currently limited to specific warehouse tasks. To achieve more general robot capabilities, a broader range of data must be collected. How much data is required and how to gather it efficiently are open questions that researchers are grappling with.

One fascinating aspect of Covariant's work is how it grounds AI models in the physical world. Abbeel notes that, compared with OpenAI's photorealistic video model Sora, RFM-1 shows a better grasp of what is and isn't physically possible. It's a significant step toward bridging the gap between virtual simulations and physical actions.


🛠️ Exciting possibilities lie ahead as we continue to explore the integration of language models and robotics. Let’s embrace the future and see where it takes us! Share your thoughts and predictions with us in the comments below. And remember to spread the word by sharing this article on your favorite social media platforms. Together, we can shape the future of robotics! 🌟