Using machine learning to teach robots to get dressed

In the SIGGRAPH Asia 2018 paper "Learning to Dress: Synthesizing Human Dressing Motion via Deep Reinforcement Learning," a Georgia Institute of Technology/Google Brain research team describes how they taught body-shame to an AI, leaving it with an unstoppable compulsion to clothe itself before the frowning mien of God.


The team uses machine learning tools to "automatically discover robust dressing techniques," and manages to train a reliable get-dressed model despite the high computational expense of simulating cloth.


The paper's fascinating: the secret to getting an AI to get dressed is haptics, a sense of touch that is used to dynamically retune the AI's coordination to adjust to the rippling, slithering, treacherous textiles. The model also incorporates the ripping point of the cloth and penalizes AIs that rend their garments asunder while clothing themselves.


In this task, a t-shirt is initialized on the character’s shoulders with
the character’s neck contained within the collar. To randomize the
initial garment state, we apply a random impulse force with fixed
magnitude to all the garment vertices at the beginning of the
simulation. We allow the garment to settle for 1s before the character
begins to move.

The first control policy completes the task of moving the right
end effector into gripping range of the specified grip feature. The
policy attempts to match a given position and orientation target
in the garment feature space. Once the error threshold is reached,
control transitions to an alignment policy designed to “tuck” the
left end effector and forearm under the waist feature of the garment
in preparation for dressing the arm. This policy attempts to contain
the arm within the triangle formed by the gripping hand and the
shoulders. This heuristic approximates the opening of the garment
waist feature. In addition, this policy is rewarded for contact with
the garment interior and penalized for geodesic distance from a
selected point on the interior of the garment. Once interior contact
is detected, and the arm is within the heuristic triangle, control is
transitioned to the left sleeve dressing controller which attempts to
minimize the end effector contact geodesic distance from the end
feature of the sleeve and maximize the containment depth of the arm
within the sleeve entrance feature. A task vector is provided which
indicates the direction the end effector should move to decrease its
contact geodesic distance (or points to the garment feature if not in
contact). Once the limb has passed a threshold distance through
the sleeve, the re-grip controller directs the hands together into
position to exchange grip from right hand to left. Once the left
hand is within a threshold distance of its gripping target, the grip
exchange is triggered and control is transitioned to the second “tuck”
control policy with the same purpose and transition criteria as the
first. The second sleeve policy is then run to pass the right arm
through the right sleeve. At this point, the seventh and final policy
is used to guide the character back to its start pose while avoiding
garment tearing.
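
The hand-offs described in that passage amount to a small state machine: seven policies run in a fixed order, and each one passes control to the next when its transition test fires. Here's a minimal sketch of just that sequencing logic, not the authors' code. The observation keys, the threshold values, and the completion tests for the second sleeve and the final return-to-start policy are all assumptions made for illustration; the excerpt doesn't give them.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Mapping, Tuple

# Hypothetical observation format: a mapping of named measurements.
Obs = Mapping[str, Any]


@dataclass
class Stage:
    name: str
    done: Callable[[Obs], bool]  # transition criterion for this stage


# Placeholder thresholds; none of these numbers come from the paper.
GRIP_ERROR_MAX = 0.05
SLEEVE_DEPTH_MIN = 0.30
REGRIP_DIST_MAX = 0.05
POSE_ERROR_MAX = 0.10


def tucked(obs: Obs) -> bool:
    # Interior contact detected AND arm inside the heuristic triangle.
    return bool(obs["interior_contact"]) and bool(obs["in_triangle"])


def through_sleeve(obs: Obs) -> bool:
    # Limb has passed a threshold distance through the sleeve.
    return obs["sleeve_depth"] > SLEEVE_DEPTH_MIN


STAGES: List[Stage] = [
    Stage("grip", lambda o: o["grip_error"] < GRIP_ERROR_MAX),
    Stage("tuck_left", tucked),
    Stage("sleeve_left", through_sleeve),
    Stage("regrip", lambda o: o["hand_to_grip_target"] < REGRIP_DIST_MAX),
    Stage("tuck_right", tucked),
    # Assumed: same criterion as the first sleeve.
    Stage("sleeve_right", through_sleeve),
    # Assumed: the final stage ends when the pose is close to the start pose.
    Stage("return_to_start", lambda o: o["pose_error"] < POSE_ERROR_MAX),
]


def advance(index: int, obs: Obs) -> int:
    """Return the index of the stage that controls the next step."""
    if index < len(STAGES) and STAGES[index].done(obs):
        return index + 1
    return index


def run(observations: List[Obs]) -> Tuple[List[str], bool]:
    """Replay observations; return the active stage per step and a finished flag."""
    index = 0
    trace: List[str] = []
    for obs in observations:
        if index >= len(STAGES):
            break
        trace.append(STAGES[index].name)
        index = advance(index, obs)
    return trace, index == len(STAGES)
```

In a real system each stage would also carry its trained policy network and be stepped inside the cloth simulator; the sketch only shows when control changes hands.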


Learning to Dress: Synthesizing Human Dressing Motion via Deep Reinforcement Learning
[Alexander Clegg, Wenhao Yu, Jie Tan, C. Karen Liu and Greg Turk/ACM Transactions on Graphics, Vol. 37, No. 6]

(via JWZ)