This release derives from BONES-SEED (Skeletal Everyday Embodiment
Dataset), a large multi-actor motion-capture corpus collected for studying
humanoid robot teleoperation. Roughly 522 operators contribute ~142K
motion clips sampled at 120 fps, covering locomotion, communication, dance,
everyday actions, sport, gaming, and interactions. Each operator is annotated
with biometric attributes (height, weight, age, gender) plus per-segment bone
lengths.
In BONES-SEED, every clip ships in three parallel formats so the same motion
can be studied at different levels of body-shape disclosure:
- SOMA Proportional BVH: original mocap on each operator's true bone
  lengths.
- SOMA Uniform BVH: the same motion retargeted onto a single shared
  skeleton (body shape stripped).
- G1 CSV: the same motion retargeted again to the Unitree G1 humanoid
  robot, as 29-DOF joint angles.
Clip filenames follow {motion_name}__A{actor_uid}[__M].csv, where the
optional __M suffix denotes the mirrored variant. Clips are bucketed by
capture date (YYMMDD).
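The naming convention above can be parsed with a short regular expression. The sketch below is illustrative only: `ClipName` and `parse_clip_name` are hypothetical names, and it assumes that `actor_uid` contains no underscores or dots (the motion name may contain underscores).

```python
import re
from typing import NamedTuple


class ClipName(NamedTuple):
    motion: str      # {motion_name}
    actor: str       # {actor_uid}, kept as a string
    mirrored: bool   # True when the optional __M suffix is present


# Assumption: actor_uid has no "_" or "." characters.
_CLIP_RE = re.compile(r"(?P<motion>.+?)__A(?P<actor>[^_.]+)(?P<mirror>__M)?\.csv")


def parse_clip_name(filename: str) -> ClipName:
    """Split a clip filename into motion name, actor uid, and mirror flag."""
    match = _CLIP_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a clip filename: {filename!r}")
    return ClipName(
        motion=match["motion"],
        actor=match["actor"],
        mirrored=match["mirror"] is not None,
    )
```

For example, a made-up name such as `walk_forward__A012__M.csv` would parse to motion `walk_forward`, actor `012`, and `mirrored=True`.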
UNVEIL's central finding is that even after both retargeting steps strip
away the operator's body shape, the G1 joint-angle stream still carries
enough operator-specific dynamics (velocity profiles, ranges of motion,
coordination rhythms) to re-identify the original operator and recover
their height, weight, age, and gender. Our paper proposes an operator-aware
anonymizer that closes this leak; this repository is the result of applying
it to every G1 clip.
What's in this repository
For every clip in BONES-SEED, this repository ships the G1-retargeted CSV
after anonymization. The folder layout mirrors the source's
g1/csv/{date}/{motion}__A{actor}[__M].csv convention, so paths line up
one-to-one with the BONES-SEED G1 split.
csv/
└── {date}/
    └── {motion}__A{actor}[__M].csv
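Because the layout mirrors the source, a BONES-SEED G1 path can be mapped to its anonymized counterpart by swapping the root prefix. This is a minimal sketch: `source_to_release_path` is a hypothetical helper, and it assumes source paths are given relative to the BONES-SEED root, with the `g1/csv/` prefix and six-digit YYMMDD folders described above.

```python
from pathlib import PurePosixPath


def source_to_release_path(source_path: str) -> PurePosixPath:
    """Map g1/csv/{date}/{file} in BONES-SEED to csv/{date}/{file} here."""
    parts = PurePosixPath(source_path).parts
    if len(parts) != 4 or parts[:2] != ("g1", "csv"):
        raise ValueError(f"expected g1/csv/<date>/<file>, got {source_path!r}")
    date, filename = parts[2], parts[3]
    # Capture-date buckets are YYMMDD, i.e. exactly six digits.
    if len(date) != 6 or not date.isdigit():
        raise ValueError(f"expected a YYMMDD folder, got {date!r}")
    return PurePosixPath("csv", date, filename)
```

The date and filename are passed through unchanged, which is what makes the correspondence one-to-one.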
CSV columns: Frame, root_translate{X,Y,Z}, root_rotate{X,Y,Z}, <29 joint DOFs>,
with translations in centimetres and angles in degrees.
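A clip can be read with the standard library and converted to SI units along the way. The sketch below rests on several assumptions: that the header spells out the root columns as `root_translateX` ... `root_rotateZ`, that every remaining column is a joint DOF, and that translations are in centimetres while all angles are in degrees. `read_g1_clip` is a hypothetical helper, not part of this repository.

```python
import csv
import math

# Assumed expansion of root_translate{X,Y,Z} and root_rotate{X,Y,Z}.
ROOT_TRANSLATE = ["root_translateX", "root_translateY", "root_translateZ"]
ROOT_ROTATE = ["root_rotateX", "root_rotateY", "root_rotateZ"]


def read_g1_clip(handle):
    """Read one clip from an open text file; return (joint_names, frames)."""
    reader = csv.DictReader(handle)
    fixed = {"Frame", *ROOT_TRANSLATE, *ROOT_ROTATE}
    # Every column that is not Frame or a root column is treated as a joint DOF.
    joint_names = [name for name in reader.fieldnames if name not in fixed]
    frames = []
    for row in reader:
        frames.append({
            "frame": int(float(row["Frame"])),
            # centimetres -> metres
            "root_pos_m": [float(row[c]) / 100.0 for c in ROOT_TRANSLATE],
            # degrees -> radians
            "root_rot_rad": [math.radians(float(row[c])) for c in ROOT_ROTATE],
            "joints_rad": [math.radians(float(row[c])) for c in joint_names],
        })
    return joint_names, frames
```

Usage would look like `with open(path, newline="") as f: names, frames = read_g1_clip(f)`. For a real clip, `len(names)` should come out as 29 if the column assumptions hold.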