tactile sensing · RGB-D · SLAM · 3D Gaussian Splatting · in-hand manipulation

GaussianFeels: real-time multimodal 3D reconstruction with tactile-enhanced Gaussian splatting.

An online, object-centric Gaussian map for contact-rich manipulation — updated under hand-induced occlusion, tracked when pose supervision is removed, and exported to manipulation from frame zero.

Krishi Attri · Dept. of Mechanical Engineering · SNU
Advisor: Prof. Yong-Lae Park · Soft Robotics & Bionics Lab
M.S. THESIS · 2024-24243
gaussianfeels
companion site

① Problem

The geometry that matters is what the camera can't see.

Surfaces hidden by hand and fingertips during in-hand manipulation are exactly what a policy needs. RGB-D covers exposed surfaces; tactile covers a few cm² at contact.

② Research claim

An explicit object-centric Gaussian map is a suitable shared state for contact-rich manipulation: updated online from synchronized multimodal observations, tracked under partial visibility, exposed to manipulation as a progressively improving model.

③ Inputs · synchronized frame $F_t$

  • $I_t$, $D_t$, $M_t$ · RGB, depth, mask
  • $K$, $T_t^{WC}$ · intrinsics + camera pose
  • $P_t^{\mathrm{cam}}$ · camera point cloud
  • $P_t^{\mathrm{tac}}$, $N_t^{\mathrm{tac}}$ · tactile contacts + normals
  • $h_t$ · hand state · grasp center FK
  • $T_t^{WO,\mathrm{GT}}$ · GT pose (optional)

④ Frame-0 seeding · 24-candidate PCA

Translation seed from FK grasp center or depth-cloud centroid + 2 cm view offset. Rotation seed enumerates all 24 axis-aligned orientations of the PCA frame and minimizes:

$\phi(R_k) = s_{\mathrm{var}} + 2\,s_{\mathrm{cam}} + 0.5\,s_{\mathrm{up}} + 10\,s_{\mathrm{vis}} + 100\,s_{\mathrm{in}}$ (4.3)

The 100× weight on $s_{\mathrm{in}}$ ensures tactile normals dominate when contacts exist.
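The candidate enumeration and weighted selection can be sketched as follows. This is a minimal illustration, not the thesis implementation: the 24 candidates are generated as signed axis permutations with determinant +1 (expressed relative to the PCA frame), the weights are copied from eq. (4.3), and `score_terms` is a hypothetical callable, since the poster does not define how the five terms are computed.

```python
from itertools import permutations, product

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def axis_aligned_rotations():
    """All signed axis permutations with determinant +1 (24 in total)."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = [[0, 0, 0] for _ in range(3)]
            for row, (col, sign) in enumerate(zip(perm, signs)):
                m[row][col] = sign
            if det3(m) == 1:  # discard the 24 reflections
                rotations.append(m)
    return rotations

# Weights copied from eq. (4.3); keys follow the poster's subscripts.
WEIGHTS = {"var": 1.0, "cam": 2.0, "up": 0.5, "vis": 10.0, "in": 100.0}

def seed_score(terms):
    """Weighted sum of the five score terms for one candidate."""
    return sum(WEIGHTS[name] * terms[name] for name in WEIGHTS)

def pick_rotation(candidates, score_terms):
    """Return the candidate with the lowest weighted score.

    `score_terms` is a hypothetical callable that maps a rotation to a
    dict with keys var, cam, up, vis, in.
    """
    return min(candidates, key=lambda r: seed_score(score_terms(r)))
```

Because the weight on the `in` term is 100, a candidate that violates the tactile-normal term by even a small amount is penalized more heavily than one that does poorly on all the other terms combined, which matches the dominance described above.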

⑤ Frozen-map signed distance

Smooth point-cloud SDF analogue used by the tracker:

$d_M(p) = (p - \tilde{q}(p))^\top \tilde{n}(p)$ (4.7)

The frozen map changes only through accepted recent observations; the tracker aligns against a stable reference rather than against splats being optimized.
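A minimal sketch of evaluating eq. (4.7) against a fixed point set is given below. The poster does not say how the smoothed point $\tilde{q}(p)$ and normal $\tilde{n}(p)$ are obtained; this sketch assumes Gaussian-weighted averages of the frozen map's points and normals around the query, with a hypothetical bandwidth `sigma`.

```python
import math

def frozen_map_sdf(p, points, normals, sigma=0.01):
    """Evaluate d_M(p) = (p - q~(p))^T n~(p) against a fixed point set.

    Assumption: q~ and n~ are Gaussian-weighted averages of the map points
    and normals around p; `sigma` (metres) is a hypothetical bandwidth.
    """
    d2s = [sum((pi - qi) ** 2 for pi, qi in zip(p, q)) for q in points]
    d2_min = min(d2s)  # subtract the minimum so the nearest weight is 1
    weights = [math.exp(-(d2 - d2_min) / (2.0 * sigma ** 2)) for d2 in d2s]
    total = sum(weights)
    q_s = [sum(w * q[k] for w, q in zip(weights, points)) / total
           for k in range(3)]
    n_s = [sum(w * n[k] for w, n in zip(weights, normals)) / total
           for k in range(3)]
    norm = math.sqrt(sum(c * c for c in n_s))
    n_s = [c / norm for c in n_s]  # renormalize the blended normal
    return sum((pi - qi) * ni for pi, qi, ni in zip(p, q_s, n_s))
```

The point set is passed in and never modified, which mirrors the frozen-map idea: the tracker queries a stable reference while the splats themselves continue to be optimized elsewhere.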

⑥ System diagram · per-frame data flow
[Figure: system diagram of the per-frame data flow. Panels, in order:]

  • Sensors · RGB-D + mask ($I_t$, $D_t$, $M_t$); tactile (DIGIT) ($P_t^{\mathrm{tac}}$, $N_t^{\mathrm{tac}}$); hand state $h_t$ · FK grasp
  • Sync · frame $F_t$, eq. (4.1)
  • Pose tracker · PCA × 24 (frame 0); ICP gate · 4 thresholds; Theseus 2-stage SE(3); frozen-map SDF → $T_t^{WO}$
  • Object map · $\mu_i^O$, $q_i^O$, $s_i^O$, $\alpha_i$, $c_i$; contact-aware spawn; density boost · freeze; tactile-region boost, eqs. (4.9–4.12); ⑦ occlusion-aware reweight, eqs. (4.13–4.17)
  • Manipulation · GPU 1 · ⑧ frame-0 prior (Hunyuan3D-2-mini, orientation-variant search); ⑨ progressive replacement (measured overwrites generated, provenance preserved); ⑩ policy export (point cloud + flags, PLY, $T_t^{WO}$ trajectory)
  • Pose mode · GT, GT-init, SLAM: same loop, different pose source
  • ⑪ Online behavior

⑦ Online training objective

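$L_{\mathrm{core}} = \lambda'_{\mathrm{rgb}} L_{\mathrm{rgb}} + \lambda'_{\mathrm{depth}} L_{\mathrm{depth}} + \lambda'_{\mathrm{tactile}} L_{\mathrm{tactile}} + \lambda_{\mathrm{surface}} L_{\mathrm{surface}}$ (4.13)
$s_{\mathrm{vis}} = 1 - \rho_{\mathrm{occ}}(1 - w_{\min})$, $\lambda'_{\mathrm{tactile}} = (1 + \beta \rho_{\mathrm{occ}})\,\lambda_{\mathrm{tactile}}$ (4.14–4.17)

With $w_{\min} = 0.2$ and $\beta = 2$, tactile supervision overtakes the unoccluded visual baseline at $\rho_{\mathrm{occ}} \approx 0.4$.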
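The two scale factors can be tabulated with a short sketch. The poster does not list the base weights, so the crossover helper below takes them as inputs; under the reading that "overtakes" means the boosted tactile weight equals the unscaled visual weight, a base ratio of visual to tactile of about 1.8 is what places the crossover at an occlusion ratio of 0.4. That ratio is an inference from the stated numbers, not a value given on the poster.

```python
def visual_scale(rho_occ, w_min=0.2):
    """s_vis = 1 - rho_occ * (1 - w_min), from eqs. (4.14-4.17)."""
    return 1.0 - rho_occ * (1.0 - w_min)

def tactile_weight(rho_occ, lam_tactile, beta=2.0):
    """lambda'_tactile = (1 + beta * rho_occ) * lambda_tactile."""
    return (1.0 + beta * rho_occ) * lam_tactile

def crossover_occlusion(lam_visual, lam_tactile, beta=2.0):
    """Occlusion ratio at which the boosted tactile weight equals lam_visual.

    Assumption: the 'unoccluded visual baseline' is the unscaled visual
    weight lam_visual; both base weights are hypothetical inputs.
    """
    return (lam_visual / lam_tactile - 1.0) / beta

if __name__ == "__main__":
    for rho in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        print(f"rho_occ={rho:.1f}  s_vis={visual_scale(rho):.2f}  "
              f"tactile multiplier={tactile_weight(rho, 1.0):.2f}")
```

At full occlusion the visual scale bottoms out at `w_min` rather than zero, so visual terms are down-weighted but never switched off.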

⑧ Contributions

  • Object-centric Gaussian map — object frame; world only on render
  • Tracker in the loop — Theseus + frozen-map SDF + ICP prior
  • Contact-aware spawn — insertion + tactile-region boost
  • Occlusion-aware reweighting — visual ↔ tactile trade-off
  • Dual-process interface — SLAM ‖ manipulation, separate GPUs
  • Frame-0 image-to-3D bootstrap — Hunyuan3D-2-mini + search
  • Provenance-preserving update — measured/generated flags

⑨ Benchmark · frame-0 priors

Method           | F@5 | Cuboid   | Used
Hunyuan3D-2-mini |     |          | ✓ deployed
FastSAM3D        |     |          | baseline
TripoSR          | — ↑ | blob     | baseline
Geco             |     | collapse | baseline
RGB2Point        |     | collapse | baseline

F-score alone fails — TripoSR can outscore Hunyuan3D on a cuboid while rounding it to a blob. Category-faithful win rate reported alongside F@5.
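For reference, a point-set F-score at a distance threshold can be sketched as below. This assumes that F@5 denotes the F-score at a threshold of 5 (the poster does not state the unit), and it uses a brute-force nearest-neighbor search that is only practical for small point sets.

```python
import math

def _nearest(p, cloud):
    """Distance from point p to its nearest neighbor in cloud."""
    return min(math.dist(p, q) for q in cloud)

def f_score(pred, gt, tau):
    """F-score at distance threshold tau between two point sets."""
    # Precision: fraction of predicted points within tau of the ground truth.
    precision = sum(_nearest(p, gt) <= tau for p in pred) / len(pred)
    # Recall: fraction of ground-truth points within tau of the prediction.
    recall = sum(_nearest(g, pred) <= tau for g in gt) / len(gt)
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

The score only counts points that fall within the tolerance, so a rounded shape that stays close to a cuboid's faces can score well while losing its edges and corners. That is the gap a category-faithful win rate is meant to cover.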

⑩ Reconstruction quality (reserved)

  • FeelSight-Sim · F@5
  • FeelSight-Real · F@5
  • Occlusion · ADD-S (mm ↓)
  • Runtime (FPS ↑)
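As a reminder of what the occlusion row measures, ADD-S in its commonly used form averages, over the model points under the ground-truth pose, the distance to the closest model point under the estimated pose. The sketch below follows that standard definition; whether the thesis uses exactly this variant is an assumption, and the brute-force search is only meant for small models.

```python
import math

def transform(R, t, p):
    """Apply a rigid transform (3x3 rotation R, translation t) to point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i]
                 for i in range(3))

def add_s(model, R_est, t_est, R_gt, t_gt):
    """Mean closest-point distance between the model under two poses.

    The result is in the same unit as the model points and translations.
    """
    est = [transform(R_est, t_est, p) for p in model]
    gt = [transform(R_gt, t_gt, p) for p in model]
    # For each ground-truth point, take the nearest estimated point.
    return sum(min(math.dist(g, e) for e in est) for g in gt) / len(gt)
```

Because the nearest point is used rather than the corresponding one, a pose that differs from the ground truth only by a symmetry of the object scores zero, which is the usual reason for preferring ADD-S on symmetric objects.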

⑪ Failure modes

  • F1 · thin objects — splats biased outside GT; spawn-scale floor
  • F2 · large motion — ICP gate (Δt > 50 mm); pose lags 1–2 frames
  • F3 · DIGIT-glow mask drift — median-area + 60 px filter mitigate

⑫ Take-away

An explicit object-centric Gaussian map supports online reconstruction, pose tracking under occlusion, and manipulation export from one shared state — no re-encode between roles.

GaussianFeels · M.S. Thesis · Spring 2026 | Department of Mechanical Engineering, Seoul National University
measured / camera tactile / generated Gaussian state