publications
(* Equal Contribution)
2026
-
TranSplat: Instant Object Relighting in Gaussian Splatting via Spherical Harmonic Radiance Transfer
Boyang Tony Yu, Yanlin Jin, Yun He, Akshat Dave, Ravi Ramamoorthi, and Guha Balakrishnan
In submission. arXiv preprint arXiv:2503.22676, 2026
We present TranSplat, a method for instant, accurate object relighting within the Gaussian Splatting (GS) framework. Rather than relying on costly inverse rendering routines, we propose a BRDF-free radiance transfer strategy that analytically modulates the spherical harmonic (SH) appearance coefficients of an object’s 2D Gaussian surfels using per-normal irradiance ratios derived from source and target environment maps. To handle view-dependent and glossy appearances without explicit material estimation, we introduce a specularity-aware dual-path SH transfer strategy that adapts higher-order SH bands in the reflection domain. Additionally, we propose a lightweight SH-domain self-shadowing module to ensure physically realistic occlusion without explicit mesh raycasting. Operating as a post-processing step, TranSplat requires no additional GS retraining for a pair of source and target scenes. Evaluations on synthetic and real-world objects demonstrate state-of-the-art accuracy, outperforming recent inverse-rendering and diffusion-based GS relighting methods across most conditions, all while completing relighting operations in under one second. Although bounded by radially symmetric BRDF approximations and the low-pass nature of the SH basis, TranSplat produces perceptually realistic renderings even for glossy, complex materials, establishing a valuable, lightweight path forward for GS relighting.
2025
-
ORB-Guided Self-supervised Visual Odometry with Selective Online Adaptation
Yanlin Jin, Rui-Yang Ju, Haojun Liu, and Yuzhong Zhong
IEEE International Conference on Robotics and Automation (ICRA), 2025
Deep visual odometry, despite extensive research, still faces limitations in accuracy and generalizability that prevent its broader application. To address these challenges, we propose ORB-SfMLearner, an Oriented FAST and Rotated BRIEF (ORB)-guided visual odometry method with selective online adaptation. We present a novel use of ORB features for learning-based ego-motion estimation, leading to more robust and accurate results. We also introduce a cross-attention mechanism to enhance the explainability of PoseNet and show that the driving direction of the vehicle can be explained through attention weights. To improve generalizability, our selective online adaptation allows the network to rapidly and selectively adjust to the optimal parameters across different domains. Experimental results on the KITTI and vKITTI datasets show that our method outperforms previous state-of-the-art deep visual odometry methods in terms of ego-motion accuracy and generalizability.
2024
-
ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation
Zhenhua Wu*, Yanlin Jin*, Liangdong Qiu*, Xiaoguang Han, Xiang Wan, and Guanbin Li
Technical Report. arXiv preprint arXiv:2407.16508, 2024
Visualizing colonoscopy is crucial for medical auxiliary diagnosis to prevent undetected polyps in areas that are not fully observed. Traditional feature-based and depth-based reconstruction approaches usually produce undesirable results due to incorrect point matching or imprecise depth estimation in realistic colonoscopy videos. Modern deep learning-based methods often require a sufficient number of ground truth samples, which are generally hard to obtain in optical colonoscopy. To address this issue, self-supervised and domain adaptation methods have been explored. However, these methods neglect geometry constraints and exhibit lower accuracy in predicting detailed depth. We thus propose a novel reconstruction pipeline with a bi-directional adaptation architecture named ToDER to obtain precise depth estimations. Furthermore, we carefully design a TNet module in our adaptation architecture to yield geometry constraints and obtain better depth quality. The estimated depth is finally used to reconstruct a reliable colon model for visualization. Experimental results demonstrate that our approach predicts depth maps more precisely than other self-supervised and domain adaptation methods in both realistic and synthetic colonoscopy videos. On realistic colonoscopy, our method also shows great potential for visualizing unobserved regions and preventing misdiagnoses.
-
Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks
Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, and Jen-Shiun Chiang
Knowledge-Based Systems (IF 7.2), 2024
The efficient segmentation of foreground text information from the background in degraded color document images is a critical challenge in the preservation of ancient manuscripts. The imperfect preservation of ancient manuscripts has led to various types of degradation over time, such as staining, yellowing, and ink seepage, significantly affecting image binarization results. This work proposes a three-stage method using generative adversarial networks (GANs) for binarizing degraded color document images. Stage-1 applies the discrete wavelet transform (DWT) and retains the low-low (LL) subband images for image enhancement. In Stage-2, the original input image is split into three single-channel images (red, green, and blue) and one grayscale image, and an independent adversarial network is trained on each image to extract color foreground information. In Stage-3, the output image from Stage-2 and the resized input image are used to train independent adversarial networks for document binarization, enabling the integration of global and local features. The experimental results demonstrate that our proposed method outperforms other traditional and state-of-the-art (SOTA) methods on the Document Image Binarization Contest (DIBCO) datasets.
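As a rough illustration of two preprocessing ideas named in the abstract above (retaining the LL subband of a DWT, and splitting an image into red, green, blue, and grayscale channels), here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not name the wavelet or the grayscale conversion, so the single-level Haar wavelet and the BT.601 luma weights are assumptions, and the function names `haar_ll` and `split_channels` are hypothetical. The GAN training in the three stages is not shown.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumptions: single-level orthonormal Haar wavelet, BT.601 luma weights.


def haar_ll(channel):
    """Return the low-low (LL) subband of a single-level 2D Haar DWT.

    `channel` is a list of rows (lists of numbers) with even height and
    width. With the orthonormal Haar filter, each LL value is the sum of
    a 2x2 block divided by 2.
    """
    height, width = len(channel), len(channel[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    return [
        [
            (channel[r][c] + channel[r][c + 1]
             + channel[r + 1][c] + channel[r + 1][c + 1]) / 2.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def split_channels(image):
    """Split an RGB image (rows of (r, g, b) tuples) into R, G, B, and gray."""
    red = [[px[0] for px in row] for row in image]
    green = [[px[1] for px in row] for row in image]
    blue = [[px[2] for px in row] for row in image]
    # Assumed grayscale conversion: weighted sum with BT.601 luma weights.
    gray = [
        [0.299 * px[0] + 0.587 * px[1] + 0.114 * px[2] for px in row]
        for row in image
    ]
    return red, green, blue, gray
```

Under these assumptions, the LL subband has half the height and half the width of its input, and a constant channel of value v maps to an LL subband of value 2v.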
2022
-
A self-attention-embedded deep learning model for phasor measurement unit-based post-fault transient stability prediction
Xiaoxuan Han, Yanlin Jin, Ge Wu, Sixin Guo, and Tingjian Liu
In 2022 Asian Conference on Frontiers of Power and Energy (ACFPE), 2022
Although deep learning-based predictors have achieved high accuracy in phasor measurement unit (PMU)-based post-fault transient stability assessment (TSA), most of these “black-box” models are not interpretable, making it difficult for operators to select proper countermeasures for instability prevention. To address this problem, we first propose a novel deep learning model embedded with self-attention layers for TSA. We then adopt a transfer learning strategy to develop a set of predictors aimed at identifying unstable generators. A case study on the New England 10-machine 39-bus system shows that, compared with other baseline models, the proposed self-attention-embedded model achieves better performance in transient stability classification. Moreover, together with the embedded attention module, the predictors generated by transfer learning can be used to inform operators about the cluster of unstable generators in the disturbed power system.