In a blog post today, OpenAI says it “trained a neural network to play Minecraft with Video PreTraining (VPT) on vast, unlabeled video data sets of human Minecraft play, using only small amounts of labeled contractor data.” With fine-tuning, the model is said to be able to learn to craft diamond tools, a task that usually takes a proficient human more than 20 minutes (24,000 actions). From the post:
To take advantage of the wealth of unlabeled video data available on the Internet, we introduce a new and simple semi-supervised imitation learning method, Video PreTraining (VPT). We start by collecting a small data set from contractors, recording not only their video but also the actions they take (in our case, key presses and mouse movements). We use this data to train an inverse dynamics model (IDM) that predicts the action being taken at each step of the video. Importantly, the IDM can use both past and future information to guess the action at each step. This task is much easier, and therefore requires far less data, than the behavioral cloning task of predicting actions given only past video frames. We can then use the trained IDM to label a much larger set of online videos and learn to act through behavioral cloning.
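The three stages described above (train an IDM on a small labeled set, pseudo-label a large unlabeled set, then train a policy by behavioral cloning) can be illustrated with a deliberately tiny toy. This is only a sketch under loud assumptions and not OpenAI's implementation: the "video" is a list of integer positions on a line, the "demonstrator" simply walks toward 0, and both models are lookup tables rather than neural networks. All names (`expert_action`, `fit_idm`, `fit_policy`, and so on) are hypothetical.

```python
from collections import Counter, defaultdict

def expert_action(pos):
    """Toy demonstrator (assumption): step toward position 0 with -1, 0, or +1."""
    return (pos < 0) - (pos > 0)

def rollout(start, steps):
    """Return (frames, actions); frames has one more entry than actions."""
    frames, actions = [start], []
    for _ in range(steps):
        action = expert_action(frames[-1])
        actions.append(action)
        frames.append(frames[-1] + action)
    return frames, actions

def majority_table(pairs):
    """Map each key to the value seen most often alongside it."""
    counts = defaultdict(Counter)
    for key, value in pairs:
        counts[key][value] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}

def fit_idm(labeled):
    """Toy IDM: infer the action at step t from frame t AND frame t+1."""
    pairs = []
    for frames, actions in labeled:
        for t, action in enumerate(actions):
            pairs.append((frames[t + 1] - frames[t], action))
    return majority_table(pairs)

def pseudo_label(idm, frames):
    """Label an action-free video with the IDM's guesses."""
    return [idm[frames[t + 1] - frames[t]] for t in range(len(frames) - 1)]

def fit_policy(videos, idm):
    """Behavioral cloning: predict the action from the current frame only."""
    pairs = []
    for frames in videos:
        # zip stops at the shorter list, pairing frame t with action t.
        pairs.extend(zip(frames, pseudo_label(idm, frames)))
    return majority_table(pairs)

# Small labeled "contractor" set: frames plus recorded actions.
contractor_data = [rollout(start, 4) for start in (-2, 2)]
# Much larger unlabeled "online video" set: frames only, actions discarded.
web_videos = [rollout(start, 12)[0] for start in range(-10, 11)]

idm = fit_idm(contractor_data)
policy = fit_policy(web_videos, idm)
print(policy[7], policy[-4], policy[0])
```

The point of the toy is the asymmetry: the IDM sees the next frame, so two short labeled clips are enough to learn it, while the policy sees only the current frame and needs the broader pseudo-labeled data to cover positions such as 7 that never appear in the contractor set.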
We chose to validate our method in Minecraft because it (1) is one of the most actively played video games in the world, and so has a wealth of freely available video data, and (2) resembles real-world applications such as computer use. Unlike previous work in Minecraft, which used simplified action spaces to make exploration easier, our AI uses the native human interface: mouse and keyboard at a 20Hz frame rate, which is much more generally applicable but also much more difficult.
Our behavioral cloning model (the “VPT foundation model”), trained on 70,000 hours of IDM-labeled online video, accomplishes tasks in Minecraft that are nearly impossible to achieve with reinforcement learning from scratch. It learns to chop down trees to collect logs, craft those logs into planks, and then craft those planks into a crafting table. This sequence takes a person proficient in Minecraft about 50 seconds, or 1,000 consecutive game actions. The model also performs other complex skills that humans often use in the game, such as swimming, hunting animals, and eating food. It has also learned “pillar jumping,” a common technique in Minecraft in which you repeatedly jump and place a block beneath yourself to gain height.
For more information, see the OpenAI paper (PDF) about the project.
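The action counts quoted in the post line up with the 20Hz interface rate if one assumes a single action per frame. That assumption is ours, made only for this quick check, and `actions_for` is a hypothetical helper:

```python
ACTIONS_PER_SECOND = 20  # the 20Hz rate quoted in the post; one action per frame is assumed

def actions_for(seconds):
    """Number of actions issued over the given duration at 20 actions per second."""
    return seconds * ACTIONS_PER_SECOND

print(actions_for(50))       # crafting-table sequence, about 50 seconds -> 1000
print(actions_for(20 * 60))  # diamond tools, about 20 minutes -> 24000
```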
OpenAI has trained a neural network to play Minecraft proficiently.