Resources


This presentation introduces an innovative approach that combines Large Language Models (LLMs) and differentiable rendering to automate the construction of digital twins. We employ LLMs to guide and optimize the placement of objects in digital twin scenarios by integrating them with differentiable rendering, a method traditionally used in computer graphics to optimize object positions from an image pixel loss. Our technique enhances this process by incorporating a second modality, Lidar data, which yields faster convergence and improved accuracy. This fusion of sensor inputs is especially valuable for applications such as autonomous vehicles, where establishing the precise location of multiple actors in a scene is crucial.

Our methodology involves five key steps:

1. Generate a point cloud of the scene via ray casting.
2. Extract lightweight geometry from the point cloud using PlaneSLAM.
3. Create potential camera paths through the scene.
4. Select the most suitable camera path by leveraging the LLM in conjunction with image segmentation and classification.
5. Render the camera flight path from its origin to the final destination.

The technical backbone of this system is Mitsuba for ray tracing, powered by Intel's Embree ray tracing library. This setup covers Lidar simulation, image rendering, and a final differentiable rendering step for precise camera positioning. Future iterations may incorporate Intel OSPRay for enhanced Lidar-like ray casting and image rendering, with a possible integration of Mitsuba for differentiable-rendering-based camera positioning.

The machine learning inference chain uses a pre-trained LLM from OpenAI accessed via LangChain, coupled with GroundingDINO for zero-shot image segmentation and classification within PyTorch. The entire workflow is optimized for performance on the latest generation of Intel CPUs.

This presentation will delve into the technical details of this approach, demonstrating its efficacy in automating digital twin construction and its potential applications across industries, particularly in autonomous vehicle navigation and scene understanding.
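To make the pipeline more concrete, the following is a minimal sketch of the Lidar-like ray casting step using Mitsuba 3's Python API, which dispatches to Intel Embree on CPU variants. The scene file name, sensor origin, and scan resolution are illustrative assumptions, not details of the presented system.

    import numpy as np
    import mitsuba as mi
    import drjit as dr

    mi.set_variant("llvm_ad_rgb")  # CPU variant; ray tracing is backed by Intel Embree

    # Hypothetical scene description standing in for the digital-twin environment.
    scene = mi.load_file("digital_twin_scene.xml")

    # Sweep azimuth/elevation from a fixed origin, mimicking a spinning Lidar unit.
    azimuth = np.linspace(0.0, 2.0 * np.pi, 900, endpoint=False)
    elevation = np.radians(np.linspace(-15.0, 15.0, 32))
    az, el = np.meshgrid(azimuth, elevation)

    directions = mi.Vector3f(
        (np.cos(el) * np.cos(az)).ravel(),
        np.sin(el).ravel(),
        (np.cos(el) * np.sin(az)).ravel(),
    )
    rays = mi.Ray3f(mi.Point3f(0.0, 1.5, 0.0), directions)

    # Embree-accelerated intersection; si.p holds the hit points that form the point cloud.
    si = scene.ray_intersect(rays)
    print(f"{int(dr.count(si.is_valid()))} of {dr.width(directions)} rays hit geometry")

The final positioning step can be sketched in the same spirit with Mitsuba 3's differentiable rendering: render the scene, compare against a reference image with a pixel loss, and update a pose parameter by gradient descent. The parameter key ("actor.vertex_positions"), learning rate, and iteration count below are assumptions chosen for illustration; the presented system additionally folds Lidar information into the optimization.

    import mitsuba as mi
    import drjit as dr

    mi.set_variant("llvm_ad_rgb")
    scene = mi.load_file("digital_twin_scene.xml")    # hypothetical scene file
    params = mi.traverse(scene)

    key = "actor.vertex_positions"                    # hypothetical shape id in the scene
    ref_image = mi.render(scene, spp=64)              # stands in for a captured reference view
    initial_positions = dr.unravel(mi.Point3f, params[key])

    opt = mi.ad.Adam(lr=0.02)
    opt["translation"] = mi.Point3f(0.3, 0.0, 0.2)    # deliberately wrong initial pose guess

    for it in range(50):
        # Re-pose the object from the current optimizer state.
        trafo = mi.Transform4f.translate(opt["translation"])
        params[key] = dr.ravel(trafo @ initial_positions)
        params.update(opt)

        # Differentiable render, image pixel loss, and gradient step.
        img = mi.render(scene, params, spp=16, seed=it)
        loss = dr.mean((img - ref_image) ** 2)
        dr.backward(loss)
        opt.step()
        print(f"iteration {it:02d}, loss {loss}")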

Event Name

IXPUG Webinar Series

Keywords

LLMs, differentiable rendering, digital twins, ray tracing, Lidar simulation, autonomous vehicles, in situ visualization