[D] Solving the "Liquid-Solid Interface" Problem: 116 High-Fidelity Datasets of Coastal Physics (Waves, Saturated Sand, Light Transport)

Reddit r/MachineLearning / 3/22/2026


Key Points

  • The author has compiled 116 high-fidelity datasets of coastal physics (waves, saturated sand, light transport) collected from the Arabian Sea to document phenomena that are currently poorly understood by AI.
  • The datasets emphasize technical integrity, featuring zero motion blur at 1/4000s shutter, ultra-clean data through professional sensor/optics decontamination, and high-bitrate ProRes 422 HQ for challenging high-glare conditions.
  • Full metadata and labeling accompany each dataset, including ISO, shutter, GPS, and comprehensive tagging for precise training and evaluation.
  • Access options include a light 6.6 GB sample via Google Drive and full sets (60+ GB each) available upon request for researchers and developers.
  • The author invites feedback from the ML/CV community on how clean and complete these datasets are for current training pipelines and whether they can reduce flickering and geometric artifacts in fluid-surface generation.

Modern generative video models (Sora, Runway, Kling) still struggle with the complex physics of the shoreline. I’ve spent months capturing 116 datasets from the Arabian Sea to document phenomena that these models currently reproduce poorly:

  • Wave-Object Interaction: Real-world flow around obstacles and backwash dynamics.
  • Phase Transitions: The precise moment of water receding and sand drying (albedo/specular decay).
  • Multi-Layer Light Transport: Transparency and subsurface scattering in varying water depths and lighting angles.
  • Complex Reflectivity: Concurrent reflections on moving waves, foam, and water-saturated sand mirrors.
  • Fluid-on-Fluid Dynamics: Standing waves and counter-flows at river mouths during various tidal stages.

Technical Integrity:

  • Zero Motion Blur: Shot at 1/4000s shutter speed. Every bubble and solar sparkle is a sharp geometric reference point.
  • Ultra-Clean Sensor: Professional sensor/optics decontamination, so there are no sensor or lens artifacts in the frame, just clean data for segmentation.
  • High-Bitrate: ProRes 422 HQ, preserving 10-bit tonal richness in extreme high-glare (contre-jour) environments.
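
As a rough sanity check on the motion-blur point above: the streak a moving feature leaves in a frame is approximately its speed across the image plane multiplied by the exposure time. The sketch below assumes a feature speed and an image scale (pixels per metre) purely for illustration; neither value comes from the post, and the function name is hypothetical.

```python
def motion_blur_px(speed_m_per_s: float, exposure_s: float, px_per_m: float) -> float:
    """Approximate blur streak length in pixels for motion parallel to the image plane."""
    return speed_m_per_s * exposure_s * px_per_m


# Hypothetical values, not from the post: a 5 m/s splash imaged at 500 px per metre.
SPEED = 5.0
SCALE = 500.0

blur_fast = motion_blur_px(SPEED, 1 / 4000, SCALE)  # shutter speed stated in the post
blur_slow = motion_blur_px(SPEED, 1 / 50, SCALE)    # 180-degree shutter at 25 fps, for comparison

print(f"1/4000 s: {blur_fast:.3f} px, 1/50 s: {blur_slow:.1f} px")
```

Under these assumed numbers the blur at 1/4000s stays below one pixel, so "zero" is best read as "sub-pixel" for features at that speed and scale. Faster spray or tighter framing would change the result.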

Full Metadata & Labeling: Each set includes precise technical specs (ISO, Shutter, GPS) and comprehensive labeling.
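
For anyone wiring this kind of metadata into a training pipeline, a per-clip record with basic validation might look like the sketch below. The post does not publish a schema, so the class name, field names, and example values are all hypothetical placeholders.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ClipMetadata:
    """Hypothetical per-clip record; field names are illustrative, not the author's schema."""
    clip_id: str
    iso: int
    shutter_s: Fraction          # exposure time in seconds, e.g. Fraction(1, 4000)
    latitude: float              # decimal degrees
    longitude: float             # decimal degrees
    tags: tuple[str, ...] = ()

    def problems(self) -> list[str]:
        """Return human-readable validation problems (empty list if none)."""
        issues = []
        if self.iso <= 0:
            issues.append("iso must be positive")
        if self.shutter_s <= 0:
            issues.append("shutter_s must be positive")
        if not -90.0 <= self.latitude <= 90.0:
            issues.append("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            issues.append("longitude out of range")
        if not self.tags:
            issues.append("no tags")
        return issues


# Placeholder values for illustration only; they do not describe a real clip.
clip = ClipMetadata("clip_001", iso=100, shutter_s=Fraction(1, 4000),
                    latitude=15.0, longitude=73.0, tags=("backwash", "wet_sand"))
print(clip.problems())  # prints []
```

Storing the shutter as a `Fraction` keeps values such as 1/4000 exact, which makes it easy to filter clips by exposure without floating-point comparisons.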

I’m looking for professional feedback from the ML/CV community: How "clean" and "complete" are these datasets for your current training pipelines?

Access for Evaluation:

  • Light Sample (6.6 GB): Link to Google Drive
  • Full Sets (60+ GB each): Available upon request for researchers and developers.

I am interested in whether this level of physical "ground truth" can significantly reduce flickering and geometric artifacts in fluid-surface generation.

submitted by /u/Artistic_Monk_8334