ERRA: An Embodied Representation and Reasoning Architecture for Long-horizon Language-conditioned Manipulation Tasks
Code
Code is available here.
Video
Examples of Language Instruction in Testing
Short-horizon Tasks | Long-horizon Tasks |
---|---|
Grasp an object (i.e., trash) | Please clean the table |
Place the object into the bin | Please put sth into the drawer (cosmetic, can, clip) |
Open the drawer | Please cut sth (eggplant, banana, apple) |
Close the drawer | Please put all the round objects from the table to the box |
Place sth into the drawer (cosmetic, can, clip) | |
Grasp the knife | Hybird Tasks |
Cut sth (eggplant, banana, apple) | Please close the drawer and then grasp the knife |
Grasp a round object | Please put sth into the drawer and then clean the table. |
Place the round object into the bin | Please clean the table and then cut sth (e.g., banana) |
Grasp sth (cosmetic, can, clip) |
Unseen Verb and Noun
Verb | Noun |
---|---|
put → move, place, pick | can → jar, cola |
cut → chop, slice | cosmetic → makeup |
clean → empty, clear | apple, banana→fruit |
close → shut | eggplant → vegetable |
grasp → grip, catch | table → tableland, stage |
open → unclose, unlock | object → thing, item |
place → put, set | bin → box, dustbin |
Action Language Set
Grasp an object | Grasp the + “object name” |
Place the object into the bin | Cut the + “object name” |
Open the drawer | Place the + “object name” into the drawer |
Close the drawer | Grasp a round object |
Done | Place the round object into the bin |