arXiv Papers of RGB-Thermal Vision
Authors:Asiegbu Miracle Kanu-Asiegbu, Nitin Jotwani, Xiaoxiao Du
Abstract:
Pedestrian detection is a critical task in robot perception. Multispectral modalities (visible light and thermal) can boost pedestrian detection performance by providing complementary visual information. Several gaps remain with multispectral pedestrian detection methods. First, existing approaches primarily focus on spatial fusion and often neglect temporal information. Second, RGB and thermal image pairs in multispectral benchmarks may not always be perfectly aligned. Pedestrians are also challenging to detect due to varying lighting conditions, occlusion, etc. This work proposes Strip-Fusion, a spatial-temporal fusion network that is robust to misalignment in input images, as well as varying lighting conditions and heavy occlusions. The Strip-Fusion pipeline integrates temporally adaptive convolutions to dynamically weigh spatial-temporal features, enabling our model to better capture pedestrian motion and context over time. A novel Kullback-Leibler divergence loss was designed to mitigate modality imbalance between visible and thermal inputs, guiding feature alignment toward the more informative modality during training. Furthermore, a novel post-processing algorithm was developed to reduce false positives. Extensive experimental results show that our method performs competitively for both the KAIST and the CVC-14 benchmarks. We also observed significant improvements compared to previous state-of-the-art on challenging conditions such as heavy occlusion and misalignment.
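As a rough illustration of the Kullback-Leibler divergence term mentioned in the abstract above, the sketch below computes KL(p || q) between two discrete distributions. It is not the authors' loss: the softmax over per-modality scores, the direction of the divergence, and the names `rgb_scores` and `thermal_scores` are assumptions made for this example only.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical per-class scores from each modality branch.
rgb_scores = [2.0, 0.5, -1.0]
thermal_scores = [1.0, 1.0, 0.0]

p_rgb = softmax(rgb_scores)
p_thermal = softmax(thermal_scores)

# Assumption: the thermal branch is treated as the reference distribution.
loss = kl_divergence(p_thermal, p_rgb)
print(f"KL(thermal || rgb) = {loss:.4f}")
```

The divergence is zero when both branches agree and grows as their predictions drift apart, which is the property a modality-balancing term would rely on.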
Authors:Yajvan Ravan, Aref Malek, Chester Dolph, Nikhil Behari
Abstract:
High-altitude, multi-spectral, aerial imagery is scarce and expensive to acquire, yet it is necessary for algorithmic advances and application of machine learning models to high-impact problems such as wildfire detection. We introduce a human-annotated dataset from the NASA Autonomous Modular Sensor (AMS) using 12-channel, medium to high altitude (3-50 km) aerial wildfire images similar to those used in current US wildfire missions. Our dataset combines spectral data from 12 different channels, including infrared (IR), short-wave IR (SWIR), and thermal. We take imagery from 20 wildfire missions and randomly sample small patches to generate over 4000 images with high variability, including occlusions by smoke/clouds, easily-confused false positives, and nighttime imagery. We demonstrate results from a deep-learning model to automate the human-intensive process of fire perimeter determination. We train two deep neural networks, one for image classification and the other for pixel-level segmentation. The networks are combined into a unique real-time segmentation model to efficiently localize active wildfire on an incoming image feed. Our model achieves 96% classification accuracy, 74% Intersection-over-Union (IoU), and 84% recall, surpassing past methods, including models trained on satellite data and classical color-rule algorithms. By leveraging a multi-spectral dataset, our model is able to detect active wildfire at nighttime and behind clouds, while distinguishing between false positives. We find that data from the SWIR, IR, and thermal bands is the most important to distinguish fire perimeters. Our code and dataset can be found here: https://github.com/nasa/Autonomous-Modular-Sensor-Wildfire-Segmentation/tree/main and https://drive.google.com/drive/folders/1-u4vs9rqwkwgdeeeoUhftCxrfe_4QPTn?=usp=drive_link
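For readers unfamiliar with the segmentation metrics quoted above, the following minimal sketch shows how Intersection-over-Union and recall can be computed for a pair of binary masks. The function name `iou_and_recall` and the toy masks are illustrative assumptions; the authors' evaluation code may differ in detail.

```python
def iou_and_recall(pred, truth):
    """Compute IoU and recall for two binary masks given as nested lists."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # fire pixel correctly predicted
            elif p and not t:
                fp += 1          # predicted fire where there is none
            elif t and not p:
                fn += 1          # missed fire pixel
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return iou, recall

# Toy 2x3 masks (1 = active fire), purely illustrative.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]

iou, recall = iou_and_recall(pred_mask, true_mask)
print(f"IoU = {iou:.2f}, recall = {recall:.2f}")
```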
Authors:Gang Liu, Sobin Alosious, Subhamoy Mahajan, Eric Inae, Yihan Zhu, Yuhan Liu, Renzheng Zhang, Jiaxin Xu, Addison Howard, Ying Li, Tengfei Luo, Meng Jiang
Abstract:
Machine learning (ML) offers a powerful path toward discovering sustainable polymer materials, but progress has been limited by the lack of large, high-quality, and openly accessible polymer datasets. The Open Polymer Challenge (OPC) addresses this gap by releasing the first community-developed benchmark for polymer informatics, featuring a dataset with 10K polymers and 5 properties: thermal conductivity, radius of gyration, density, fractional free volume, and glass transition temperature. The challenge centers on multi-task polymer property prediction, a core step in virtual screening pipelines for materials discovery. Participants developed models under realistic constraints that include small data, label imbalance, and heterogeneous simulation sources, using techniques such as feature-based augmentation, transfer learning, self-supervised pretraining, and targeted ensemble strategies. The competition also revealed important lessons about data preparation, distribution shifts, and cross-group simulation consistency, informing best practices for future large-scale polymer datasets. The resulting models, analysis, and released data create a new foundation for molecular AI in polymer science and are expected to accelerate the development of sustainable and energy-efficient materials. Along with the competition, we release the test dataset at https://www.kaggle.com/datasets/alexliu99/neurips-open-polymer-prediction-2025-test-data. We also release the data generation pipeline at https://github.com/sobinalosious/ADEPT, which simulates more than 25 properties, including thermal conductivity, radius of gyration, and density.
Authors:Weiran Li, Yeqiang Liu, Yijie Wei, Mina Han, Qiannan Guo, Zhenbo Li
Abstract:
Multi-object tracking (MOT) is a fundamental task in computer vision with critical applications in autonomous driving and robotics. Multimodal MOT that integrates visible light and thermal infrared information is particularly essential for robust autonomous driving systems. However, effectively fusing these heterogeneous modalities is challenging. Simple strategies like concatenation or addition often fail to bridge the significant non-linear distribution gap between their feature representations, which can lead to modality conflicts and degrade tracking accuracy. Drawing inspiration from the connection between multimodal MOT and the iterative refinement in diffusion models, this paper proposes DM$^3$T, a novel framework that reformulates multimodal fusion as an iterative feature alignment process to generate accurate and temporally coherent object trajectories. Our approach performs iterative cross-modal harmonization through a proposed Cross-Modal Diffusion Fusion (C-MDF) module. In this process, features from both modalities provide mutual guidance, iteratively projecting them onto a shared, consistent feature manifold. This enables the learning of complementary information and achieves deeper fusion compared to conventional methods. Additionally, we introduce a plug-and-play Diffusion Refiner (DR) to enhance and refine the unified feature representation. To further improve tracking robustness, we design a Hierarchical Tracker that adaptively handles confidence estimation. DM$^3$T unifies object detection, state estimation, and data association into a comprehensive online tracking framework without complex post-processing. Extensive experiments on the VT-MOT benchmark demonstrate that our method achieves 41.7 HOTA, representing a 1.54% relative improvement over existing state-of-the-art methods. The code and models are available at https://vranlee.github.io/DM-3-T/.
Authors:Minchong Chen, Xiaoyun Yuan, Junzhe Wan, Jianing Zhang, Jun Zhang
Abstract:
The miniaturization of thermal sensors for mobile platforms inherently limits their spatial resolution and textural fidelity, leading to blurry and less informative images. Existing thermal super-resolution (SR) methods can be grouped into single-image and RGB-guided approaches: the former struggles to recover fine structures from limited information, while the latter relies on accurate and laborious cross-camera calibration, which hinders practical deployment and robustness. Here, we propose 3M-TI, a calibration-free Multi-camera cross-Modality diffusion framework for Mobile Thermal Imaging. At its core, 3M-TI integrates a cross-modal self-attention module (CSM) into the diffusion UNet, replacing the original self-attention layers to adaptively align thermal and RGB features throughout the denoising process, without requiring explicit camera calibration. This design enables the diffusion network to leverage its generative prior to enhance spatial resolution, structural fidelity, and texture detail in the super-resolved thermal images. Extensive evaluations on real-world mobile thermal cameras and public benchmarks validate our superior performance, achieving state-of-the-art results in both visual quality and quantitative metrics. More importantly, the thermal images enhanced by 3M-TI lead to substantial gains in critical downstream tasks like object detection and segmentation, underscoring its practical value for robust mobile thermal perception systems. More materials: https://github.com/work-submit/3MTI.
Authors:Hao Li, Yuhao Wang, Xiantao Hu, Wenning Hao, Pingping Zhang, Dong Wang, Huchuan Lu
Abstract:
RGB-Thermal (RGBT) tracking aims to exploit visible and thermal infrared modalities for robust all-weather object tracking. However, existing RGBT trackers struggle to resolve modality discrepancies, which poses great challenges for robust feature representation. This limitation hinders effective cross-modal information propagation and fusion, which significantly reduces the tracking accuracy. To address this limitation, we propose a novel Contextual Aggregation with Deformable Alignment framework called CADTrack for RGBT Tracking. To be specific, we first deploy the Mamba-based Feature Interaction (MFI) that establishes efficient feature interaction via state space models. This interaction module can operate with linear complexity, reducing computational cost and improving feature discrimination. Then, we propose the Contextual Aggregation Module (CAM) that dynamically activates backbone layers through sparse gating based on the Mixture-of-Experts (MoE). This module can encode complementary contextual information from cross-layer features. Finally, we propose the Deformable Alignment Module (DAM) to integrate deformable sampling and temporal propagation, mitigating spatial misalignment and localization drift. With the above components, our CADTrack achieves robust and accurate tracking in complex scenarios. Extensive experiments on five RGBT tracking benchmarks verify the effectiveness of our proposed method. The source code is released at https://github.com/IdolLab/CADTrack.
Authors:Yanpeng Gong, Sishuai Li, Fei Qin, Yue Mei, Xiaoying Zhuang, Timon Rabczuk
Abstract:
This study presents a finite element and virtual element (FE-VE) coupled method for thermomechanical analysis in electronic packaging structures. The approach partitions computational domains strategically, employing FEM for regular geometries to maximize computational efficiency and VEM for complex shapes to enhance geometric flexibility. Interface compatibility is maintained through coincident nodal correspondence, ensuring solution continuity across domain boundaries while reducing meshing complexity and computational overhead. Validation through electronic packaging applications demonstrates reasonable agreement with reference solutions and acceptable convergence characteristics across varying mesh densities. The method effectively captures thermal distributions and stress concentrations in multi-material systems, establishing a practical computational framework for electronic packaging analysis involving complex geometries. Source codes are available at https://github.com/yanpeng-gong/FeVeCoupled-ElectronicPackaging.
Authors:Suyang Li, Fernando Fajardo-Rojas, Diego Gomez-Gualdron, Remco Chang, Mingwei Li
Abstract:
Designing multi-functional alloys requires exploring high-dimensional composition-structure-property spaces, yet current tools are limited to low-dimensional projections and offer limited support for sensitivity or multi-objective tradeoff reasoning. We introduce AlloyLens, an interactive visual analytics system combining a coordinated scatterplot matrix (SPLOM), dynamic parameter sliders, gradient-based sensitivity curves, and nearest neighbor recommendations. This integrated approach reveals latent structure in simulation data, exposes the local impact of compositional changes, and highlights tradeoffs when exact matches are absent. We validate the system through case studies co-developed with domain experts spanning structural, thermal, and electrical alloy design.
Authors:Akshay Sai Banderwaar, Abhishek Gupta
Abstract:
Eigenvalue problems have a distinctive forward-inverse structure and are fundamental to characterizing a system's thermal response, stability, and natural modes. Physics-Informed Neural Networks (PINNs) offer a mesh-free alternative for solving such problems but are often orders of magnitude slower than classical numerical schemes. In this paper, we introduce a reformulated PINN approach that casts the search for eigenpairs as a biconvex optimization problem, enabling fast and provably convergent alternating convex search (ACS) over eigenvalues and eigenfunctions using analytically optimal updates. Numerical experiments show that PINN-ACS attains high accuracy with convergence speeds up to 500$\times$ faster than gradient-based PINN training. We release our codes at https://github.com/NeurIPS-ML4PS-2025/PINN_ACS_CODES.
Authors:Karthikeyan Chandra Sekaran, Markus Geisler, Dominik Rößle, Adithya Mohan, Daniel Cremers, Wolfgang Utschick, Michael Botsch, Werner Huber, Torsten Schön
Abstract:
Recent cooperative perception datasets have played a crucial role in advancing smart mobility applications by enabling information exchange between intelligent agents, helping to overcome challenges such as occlusions and improving overall scene understanding. While some existing real-world datasets incorporate both vehicle-to-vehicle and vehicle-to-infrastructure interactions, they are typically limited to a single intersection or a single vehicle. A comprehensive perception dataset featuring multiple connected vehicles and infrastructure sensors across several intersections remains unavailable, limiting the benchmarking of algorithms in diverse traffic environments. Consequently, overfitting can occur, and models may demonstrate misleadingly high performance due to similar intersection layouts and traffic participant behavior. To address this gap, we introduce UrbanIng-V2X, the first large-scale, multi-modal dataset supporting cooperative perception involving vehicles and infrastructure sensors deployed across three urban intersections in Ingolstadt, Germany. UrbanIng-V2X consists of 34 temporally aligned and spatially calibrated sensor sequences, each lasting 20 seconds. All sequences contain recordings from one of three intersections, involving two vehicles and up to three infrastructure-mounted sensor poles operating in coordinated scenarios. In total, UrbanIng-V2X provides data from 12 vehicle-mounted RGB cameras, 2 vehicle LiDARs, 17 infrastructure thermal cameras, and 12 infrastructure LiDARs. All sequences are annotated at a frequency of 10 Hz with 3D bounding boxes spanning 13 object classes, resulting in approximately 712k annotated instances across the dataset. We provide comprehensive evaluations using state-of-the-art cooperative perception methods and publicly release the codebase, dataset, HD map, and a digital twin of the complete data collection environment.
Authors:Jinyuan Liu, Zihang Chen, Zhu Liu, Zhiying Jiang, Long Ma, Xin Fan, Risheng Liu
Abstract:
We address the relatively underexplored task of thermal infrared image enhancement. Existing infrared image enhancement methods primarily focus on tackling individual degradations, such as noise, contrast, and blurring, making it difficult to handle coupled degradations. Meanwhile, all-in-one enhancement methods, commonly applied to RGB sensors, often demonstrate limited effectiveness due to the significant differences in imaging models. In light of this, we first revisit the imaging mechanism and introduce a Progressive Prompt Fusion Network (PPFN). Specifically, the PPFN initially establishes prompt pairs based on the thermal imaging process. For each type of degradation, we fuse the corresponding prompt pairs to modulate the model's features, providing adaptive guidance that enables the model to better address specific degradations under single or multiple conditions. In addition, a Selective Progressive Training (SPT) mechanism is introduced to gradually refine the model's handling of composite cases to align the enhancement process, which not only allows the model to remove camera noise and retain key structural details, but also enhances the overall contrast of the thermal image. Furthermore, we introduce a high-quality, multi-scenario infrared benchmark covering a wide range of scenes. Extensive experiments substantiate that our approach not only delivers promising visual results under specific degradation but also significantly improves performance on complex degradation scenes, achieving a notable 8.76% improvement. Code is available at https://github.com/Zihang-Chen/HM-TIR.
Authors:Jiuhong Xiao, Roshan Nayak, Ning Zhang, Daniel Tortei, Giuseppe Loianno
Abstract:
Paired RGB-thermal data is crucial for visual-thermal sensor fusion and cross-modality tasks, including important applications such as multi-modal image alignment and retrieval. However, the scarcity of synchronized and calibrated RGB-thermal image pairs presents a major obstacle to progress in these areas. To overcome this challenge, RGB-to-Thermal (RGB-T) image translation has emerged as a promising solution, enabling the synthesis of thermal images from abundant RGB datasets for training purposes. In this study, we propose ThermalGen, an adaptive flow-based generative model for RGB-T image translation, incorporating an RGB image conditioning architecture and a style-disentangled mechanism. To support large-scale training, we curated eight public satellite-aerial, aerial, and ground RGB-T paired datasets, and introduced three new large-scale satellite-aerial RGB-T datasets--DJI-day, Bosonplus-day, and Bosonplus-night--captured across diverse times, sensor types, and geographic regions. Extensive evaluations across multiple RGB-T benchmarks demonstrate that ThermalGen achieves comparable or superior translation performance compared to existing GAN-based and diffusion-based methods. To our knowledge, ThermalGen is the first RGB-T image translation model capable of synthesizing thermal images that reflect significant variations in viewpoints, sensor characteristics, and environmental conditions. Project page: http://xjh19971.github.io/ThermalGen
Authors:Ruichao Hou, Xingyuan Li, Tongwei Ren, Dongming Zhou, Gangshan Wu, Jinde Cao
Abstract:
RGB-thermal salient object detection (RGB-T SOD) aims to identify prominent objects by integrating complementary information from RGB and thermal modalities. However, learning the precise boundaries and complete objects remains challenging due to the intrinsic insufficient feature fusion and the extrinsic limitations of data scarcity. In this paper, we propose a novel hybrid prompt-driven segment anything model (HyPSAM), which leverages the zero-shot generalization capabilities of the segment anything model (SAM) for RGB-T SOD. Specifically, we first propose a dynamic fusion network (DFNet) that generates high-quality initial saliency maps as visual prompts. DFNet employs dynamic convolution and multi-branch decoding to facilitate adaptive cross-modality interaction, overcoming the limitations of fixed-parameter kernels and enhancing multi-modal feature representation. Moreover, we propose a plug-and-play refinement network (P2RNet), which serves as a general optimization strategy to guide SAM in refining saliency maps by using hybrid prompts. The text prompt ensures reliable modality input, while the mask and box prompts enable precise salient object localization. Extensive experiments on three public datasets demonstrate that our method achieves state-of-the-art performance. Notably, HyPSAM has remarkable versatility, seamlessly integrating with different RGB-T SOD methods to achieve significant performance gains, thereby highlighting the potential of prompt engineering in this field. The code and results of our method are available at: https://github.com/milotic233/HyPSAM.
Authors:Yuhong Feng, Hongtao Chen, Qi Zhang, Jie Chen, Zhaoxi He, Mingzhe Liu, Jianghai Liao
Abstract:
Accurate RGB-Thermal (RGB-T) crowd counting is crucial for public safety in challenging conditions. While recent Transformer-based methods excel at capturing global context, their inherent lack of spatial inductive bias causes attention to spread to irrelevant background regions, compromising crowd localization precision. Furthermore, effectively bridging the gap between these distinct modalities remains a major hurdle. To tackle this, we propose the Dual Modulation Framework, comprising two modules: Spatially Modulated Attention (SMA), which improves crowd localization by using a learnable Spatial Decay Mask to penalize attention between distant tokens and prevent focus from spreading to the background; and Adaptive Fusion Modulation (AFM), which implements a dynamic gating mechanism to prioritize the most reliable modality for adaptive cross-modal fusion. Extensive experiments on RGB-T crowd counting datasets demonstrate the superior performance of our method compared to previous works. Code available at https://github.com/Cht2924/RGBT-Crowd-Counting.
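To make the idea of penalizing attention between distant tokens more concrete, here is a small sketch of an additive spatial decay mask applied before the softmax. It is only an approximation under stated assumptions: the abstract describes a learnable mask, whereas this example uses a fixed scalar `decay` and Manhattan distance on a token grid, and all function names are hypothetical.

```python
import math

def spatial_decay_mask(height, width, decay):
    """Additive mask of shape (N, N), N = height * width.

    Entry (i, j) is -decay * Manhattan distance between tokens i and j.
    """
    coords = [(r, c) for r in range(height) for c in range(width)]
    return [[-decay * (abs(r1 - r2) + abs(c1 - c2)) for (r2, c2) in coords]
            for (r1, c1) in coords]

def masked_attention_weights(logits, mask):
    """Row-wise softmax of (logits + mask)."""
    weights = []
    for row_logits, row_mask in zip(logits, mask):
        shifted = [l + m for l, m in zip(row_logits, row_mask)]
        mx = max(shifted)
        exps = [math.exp(s - mx) for s in shifted]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# 2x2 token grid with uniform attention logits, so only the mask matters.
mask = spatial_decay_mask(2, 2, decay=1.0)
uniform_logits = [[0.0] * 4 for _ in range(4)]
weights = masked_attention_weights(uniform_logits, mask)
print([round(w, 3) for w in weights[0]])
```

With uniform logits, each token attends most to itself and least to the diagonally opposite token, which is the locality bias the mask is meant to inject.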
Authors:Seokjin Go, Joongun Park, Spandan More, Hanjiang Wu, Irene Wang, Aaron Jezghani, Tushar Krishna, Divya Mahajan
Abstract:
The rapid scaling of Large Language Models (LLMs) has pushed training workloads far beyond the limits of single-node analysis, demanding a deeper understanding of how these models behave across large-scale, multi-GPU systems. In this paper, we present a comprehensive characterization of LLM training across diverse real-world workloads and hardware platforms, including NVIDIA H100/H200 and AMD MI250 GPUs. We analyze dense and sparse models under various parallelism strategies -- tensor, pipeline, data, and expert -- and evaluate their effects on hardware utilization, power consumption, and thermal behavior. We further evaluate the effectiveness of optimizations such as activation recomputation and compute-communication overlap. Our findings show that performance is not determined solely by scaling hardware capacity. Scale-up systems with fewer, higher-memory GPUs can outperform scale-out systems in communication-bound regimes, but only under carefully tuned configurations; in other cases, scale-out deployments achieve superior throughput. We also show that certain parallelism combinations, such as tensor with pipeline, lead to bandwidth underutilization due to inefficient data chunking, while increasing microbatch sizes beyond a certain point induces bursty execution and peak power excursions that worsen thermal throttling. These insights reveal how training performance is shaped by complex interactions between hardware, system topology, and model execution. We conclude by offering recommendations for system and hardware design to improve the scalability and reliability of future LLM systems and workloads. The source code of this project is available at https://github.com/sitar-lab/CharLLM-PPT.
Authors:Xiaodong Guo, Tong Liu, Yike Li, Zi'ang Lin, Zhihong Deng
Abstract:
RGB-thermal (RGB-T) semantic segmentation improves the environmental perception of autonomous platforms in challenging conditions. Prevailing models employ encoders pre-trained on RGB images to extract features from both RGB and infrared inputs, and design additional modules to achieve cross-modal feature fusion. This results in limited thermal feature extraction and suboptimal cross-modal fusion, while the redundant encoders further compromise the model's real-time efficiency. To address the above issues, we propose TUNI, with an RGB-T encoder consisting of multiple stacked blocks that simultaneously perform multi-modal feature extraction and cross-modal fusion. By leveraging large-scale pre-training with RGB and pseudo-thermal data, the RGB-T encoder learns to integrate feature extraction and fusion in a unified manner. By slimming down the thermal branch, the encoder achieves a more compact architecture. Moreover, we introduce an RGB-T local module to strengthen the encoder's capacity for cross-modal local feature fusion. The RGB-T local module employs adaptive cosine similarity to selectively emphasize salient consistent and distinct local features across RGB-T modalities. Experimental results show that TUNI achieves competitive performance with state-of-the-art models on FMB, PST900 and CART, with fewer parameters and lower computational cost. Meanwhile, it achieves an inference speed of 27 FPS on a Jetson Orin NX, demonstrating its real-time capability in deployment. Codes are available at https://github.com/xiaodonguo/TUNI.
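The cosine-similarity idea behind the RGB-T local module can be sketched as follows. This is a simplified, non-adaptive version: the mapping from similarity to a "consistency" weight in [0, 1] and the names `rgb_feat` and `thermal_feat` are assumptions for illustration, not the module described in the paper.

```python
import math

def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def consistency_weight(a, b):
    """Map similarity from [-1, 1] to [0, 1]; 1 means the modalities agree."""
    return 0.5 * (1.0 + cosine_similarity(a, b))

# Hypothetical local feature vectors at one spatial location.
rgb_feat = [0.9, 0.1, 0.3]
thermal_feat = [0.8, 0.2, 0.4]

w_consistent = consistency_weight(rgb_feat, thermal_feat)
w_distinct = 1.0 - w_consistent
print(f"consistent = {w_consistent:.3f}, distinct = {w_distinct:.3f}")
```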
Authors:Zhipeng Weng, Xiaopeng Liu, Ce Liu, Xingyuan Guo, Yukai Shi, Liang Lin
Abstract:
Although large-scale models achieve significant improvements in performance, overfitting still frequently undermines their generalization ability. In image super-resolution tasks, diffusion models, as representative generative models, typically adopt large-scale architectures. However, few-shot drone-captured infrared training data frequently induces severe overfitting in large-scale architectures. To address this key challenge, we propose a new Gaussian quantization representation learning method oriented to diffusion models that alleviates overfitting and enhances robustness. At the same time, an effective monitoring mechanism tracks large-scale architectures during training to detect signs of overfitting. By introducing Gaussian quantization representation learning, our method effectively reduces overfitting while maintaining architecture complexity. On this basis, we construct a multi-source drone-based infrared image benchmark dataset for detection and use it to emphasize the overfitting issues of large-scale architectures in few-sample, diverse drone-based image reconstruction scenarios. To verify the efficacy of the method in mitigating overfitting, experiments are conducted on the constructed benchmark. Experimental results demonstrate that our method outperforms existing super-resolution approaches and significantly mitigates overfitting of large-scale architectures under complex conditions. The code and DroneSR dataset will be available at: https://github.com/wengzp1/GARLSR.
Authors:Yanpeng Gong, Sishuai Li, Fei Qin, Bingbing Xu
Abstract:
This paper presents two approaches: the virtual element method (VEM) and the stabilization-free virtual element method (SFVEM) for analyzing thermomechanical behavior in electronic packaging structures with geometric multi-scale features. Since the virtual element method allows the use of arbitrary polygonal elements, the inherent mesh flexibility of VEM allows localized mesh modifications without affecting global mesh structure, making it particularly effective for the analysis of electronic packaging reliability involving complex geometries and multiple geometric scales. The approach implements a novel non-matching mesh generation strategy that strategically combines polygonal meshes for complex small-scale regions with regular quadrilateral meshes for larger domains. The VEM formulation addresses both heat conduction and thermomechanical coupling problems, with comprehensive verification through analytical benchmarks and practical electronic packaging case studies, including Through-Silicon Via (TSV), Ball Grid Array (BGA), and Plastic Ball Grid Array (PBGA) structures. Results demonstrate that the method accurately captures stress concentrations at material interfaces and provides reliable thermal and mechanical response predictions. Some MATLAB codes for the numerical examples are provided at https://github.com/yanpeng-gong/VEM-electronic-packaging and on the VEMhub website (www.vemhub.com).
Authors:Sofiane Bouaziz, Adel Hafiane, Raphael Canals, Rachid Nedjai
Abstract:
Urbanization, climate change, and agricultural stress are increasing the demand for precise and timely environmental monitoring. Land Surface Temperature (LST) is a key variable in this context and is retrieved from remote sensing satellites. However, these systems face a trade-off between spatial and temporal resolution. While spatio-temporal fusion methods offer promising solutions, few have addressed the estimation of daily LST at 10 m resolution. In this study, we present WGAST, a Weakly-Supervised Generative Network for Daily 10 m LST Estimation via Spatio-Temporal Fusion of Terra MODIS, Landsat 8, and Sentinel-2. WGAST is the first end-to-end deep learning framework designed for this task. It adopts a conditional generative adversarial architecture, with a generator composed of four stages: feature extraction, fusion, LST reconstruction, and noise suppression. The first stage employs a set of encoders to extract multi-level latent representations from the inputs, which are then fused in the second stage using cosine similarity, normalization, and temporal attention mechanisms. The third stage decodes the fused features into high-resolution LST, followed by a Gaussian filter to suppress high-frequency noise. Training follows a weakly supervised strategy based on physical averaging principles and reinforced by a PatchGAN discriminator. Experiments demonstrate that WGAST outperforms existing methods in both quantitative and qualitative evaluations. Compared to the best-performing baseline, on average, WGAST reduces RMSE by 17.18% and improves SSIM by 11.00%. Furthermore, WGAST is robust to cloud-induced LST and effectively captures fine-scale thermal patterns, as validated against 33 ground-based sensors. The code is available at https://github.com/Sofianebouaziz1/WGAST.git.
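One plausible reading of the "physical averaging" principle used for weak supervision is that the high-resolution prediction, once averaged over each coarse pixel footprint, should match the coarse observation. The sketch below illustrates that consistency check together with the RMSE metric quoted above. The interpretation, the function names, and the toy temperature values (in kelvin) are assumptions, not the authors' training code.

```python
import math

def block_average(fine, factor):
    """Average non-overlapping factor x factor blocks of a 2D grid."""
    rows = len(fine) // factor
    cols = len(fine[0]) // factor
    coarse = []
    for i in range(rows):
        row = []
        for j in range(cols):
            block = [fine[i * factor + di][j * factor + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse

def rmse(a, b):
    """Root-mean-square error between two 2D grids of the same shape."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(sq) / len(sq))

# Toy 4x4 fine-resolution LST prediction and a 2x2 coarse observation.
fine_pred = [[300, 302, 310, 310],
             [304, 302, 310, 310],
             [290, 290, 295, 297],
             [290, 290, 299, 297]]
coarse_obs = [[303, 310],
              [290, 297]]

aggregated = block_average(fine_pred, factor=2)
print(aggregated, rmse(aggregated, coarse_obs))
```

A training signal built this way needs no high-resolution ground truth: only the aggregated prediction is compared against what the coarse sensor actually measured.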
Authors:Xiaoyang Zhang, Jinjiang Li, Guodong Fan, Yakun Ju, Linwei Fan, Jun Liu, Alex C. Kot
Abstract:
Infrared and visible image fusion (IVIF) aims to combine the thermal radiation information from infrared images with the rich texture details from visible images to enhance perceptual capabilities for downstream visual tasks. However, existing methods often fail to preserve key targets due to a lack of deep semantic understanding of the scene, while the fusion process itself can also introduce artifacts and detail loss, severely compromising both image quality and task performance. To address these issues, this paper proposes SGDFuse, a conditional diffusion model guided by the Segment Anything Model (SAM), to achieve high-fidelity and semantically-aware image fusion. The core of our method is to utilize high-quality semantic masks generated by SAM as explicit priors to guide the optimization of the fusion process via a conditional diffusion model. Specifically, the framework operates in a two-stage process: it first performs a preliminary fusion of multi-modal features, and then utilizes the semantic masks from SAM jointly with the preliminary fused image as a condition to drive the diffusion model's coarse-to-fine denoising generation. This ensures the fusion process not only has explicit semantic directionality but also guarantees the high fidelity of the final result. Extensive experiments demonstrate that SGDFuse achieves state-of-the-art performance in both subjective and objective evaluations, as well as in its adaptability to downstream tasks, providing a powerful solution to the core challenges in image fusion. The code of SGDFuse is available at https://github.com/boshizhang123/SGDFuse.
Authors:Xiao Wang, Zikang Yan, Hao Si, Zhendong Yang, Qingquan Yang, Dengdi Sun, Wanli Lyu, Jin Tang
Abstract:
Estimating heat flux in the nuclear fusion device EAST is a critically important task. Traditional scientific computing methods typically model this process using the Finite Element Method (FEM). However, FEM relies on grid-based sampling for computation, which is computationally inefficient and makes real-time simulation during actual experiments difficult. Inspired by artificial intelligence-powered scientific computing, this paper proposes a novel Physics-Informed Neural Network (PINN) to address this challenge, significantly accelerating the heat conduction estimation process while maintaining high accuracy. Specifically, given inputs of different materials, we first feed spatial coordinates and time stamps into the neural network, and compute boundary loss, initial condition loss, and physical loss based on the heat conduction equation. Additionally, we sample a small number of data points in a data-driven manner to better fit the specific heat conduction scenario, further enhancing the model's predictive capability. We conduct experiments under both uniform and non-uniform heating conditions on the top surface. Experimental results show that the proposed thermal conduction physics-informed neural network achieves accuracy comparable to the finite element method, while achieving a 40$\times$ acceleration in computational efficiency. The dataset and source code will be released on https://github.com/Event-AHU/OpenFusion.
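The loss terms named in this abstract (boundary, initial condition, physical, and a small data term) are commonly combined as a weighted sum. The form below is a hedged sketch: the weights $\lambda$, the source-free heat conduction equation, and the symbols $T_\theta$ (network output), $\rho$, $c_p$, and $k$ are assumptions for illustration, not taken from the paper.

```latex
\begin{aligned}
\mathcal{L} &= \lambda_{\mathrm{pde}}\,\mathcal{L}_{\mathrm{pde}}
             + \lambda_{\mathrm{bc}}\,\mathcal{L}_{\mathrm{bc}}
             + \lambda_{\mathrm{ic}}\,\mathcal{L}_{\mathrm{ic}}
             + \lambda_{\mathrm{data}}\,\mathcal{L}_{\mathrm{data}}, \\
\mathcal{L}_{\mathrm{pde}} &= \frac{1}{N}\sum_{i=1}^{N}
  \left( \rho c_p \,\frac{\partial T_\theta}{\partial t}(\mathbf{x}_i, t_i)
       - \nabla \cdot \big( k \,\nabla T_\theta(\mathbf{x}_i, t_i) \big) \right)^{2}.
\end{aligned}
```

Here $(\mathbf{x}_i, t_i)$ are collocation points sampled in space and time, and the derivatives of $T_\theta$ are obtained by automatic differentiation rather than on a mesh, which is what removes the grid-based sampling that FEM depends on.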
Authors:Yaozong Zheng, Bineng Zhong, Qihua Liang, Shengping Zhang, Guorong Li, Xianxian Li, Rongrong Ji
Abstract:
We propose a universal video-level modality-awareness tracking model with online dense temporal token learning (called {\modaltracker}). It is designed to support various tracking tasks, including RGB, RGB+Thermal, RGB+Depth, and RGB+Event, utilizing the same model architecture and parameters. Specifically, our model is designed with three core goals: \textbf{Video-level Sampling}. We expand the model's inputs to a video sequence level, aiming to see a richer video context from a near-global perspective. \textbf{Video-level Association}. Furthermore, we introduce two simple yet effective online dense temporal token association mechanisms to propagate the appearance and motion trajectory information of the target in a video-stream manner. \textbf{Modality Scalable}. We propose two novel gated perceivers that adaptively learn cross-modal representations via a gated attention mechanism, and subsequently compress them into the same set of model parameters through one-shot training for multi-task inference. This new solution brings the following benefits: (i) The purified token sequences can serve as temporal prompts for inference in subsequent video frames, whereby previous information is leveraged to guide future inference. (ii) Unlike multi-modal trackers that require independent training, our one-shot training scheme not only alleviates the training burden, but also improves model representation. Extensive experiments on visible and multi-modal benchmarks show that our {\modaltracker} achieves new \textit{SOTA} performance. The code will be available at https://github.com/GXNU-ZhongLab/ODTrack.
Authors:Yongsong Huang, Tomo Miyazaki, Xiaofeng Liu, Shinichiro Omachi
Abstract:
Infrared Image Super-Resolution (IRSR) is challenged by the low contrast and sparse textures of infrared data, requiring robust long-range modeling to maintain global coherence. While State-Space Models like Mamba offer proficiency in modeling long-range dependencies for this task, their inherent 1D causal scanning mechanism fragments the global context of 2D images, hindering fine-detail restoration. To address this, we propose Global Phase and Spectral Prompt-guided Mamba (GPSMamba), a framework that synergizes architectural guidance with non-causal supervision. First, our Adaptive Semantic-Frequency State Space Module (ASF-SSM) injects a fused semantic-frequency prompt directly into the Mamba block, integrating non-local context to guide reconstruction. Then, a novel Thermal-Spectral Attention and Phase Consistency Loss provides explicit, non-causal supervision to enforce global structural and spectral fidelity. By combining these two innovations, our work presents a systematic strategy to mitigate the limitations of causal modeling. Extensive experiments demonstrate that GPSMamba achieves state-of-the-art performance, validating our approach as a powerful new paradigm for infrared image restoration. Code is available at https://github.com/yongsongH/GPSMamba.
Authors:Kailai Zhou, Fuqiang Yang, Shixian Wang, Bihan Wen, Chongde Zi, Linsen Chen, Qiu Shen, Xun Cao
Abstract:
RGB-Thermal (RGBT) multispectral vision is essential for robust perception in complex environments. Most RGBT tasks follow a case-by-case research paradigm, relying on manually customized models to learn task-oriented representations. Nevertheless, this paradigm is inherently constrained by artificial inductive bias, modality bias, and data bottleneck. To address these limitations, we make the initial attempt to build a Generalized RGBT MultiSpectral foundation model (M-SpecGene), which aims to learn modality-invariant representations from large-scale broad data in a self-supervised manner. M-SpecGene provides new insights into multispectral fusion and integrates prior case-by-case studies into a unified paradigm. Considering the unique characteristic of information imbalance in RGBT data, we introduce the Cross-Modality Structural Sparsity (CMSS) metric to quantify the information density across two modalities. Then we develop the GMM-CMSS progressive masking strategy to facilitate a flexible, easy-to-hard, and object-centric pre-training process. Comprehensive experiments validate M-SpecGene's generalizability across eleven datasets for four RGBT downstream tasks. The code will be available at https://github.com/CalayZhou/M-SpecGene.
Authors:Jifeng Shen, Haibo Zhan, Shaohua Dong, Xin Zuo, Wankou Yang, Haibin Ling
Abstract:
Modern multispectral feature fusion for object detection faces two critical limitations: (1) Excessive preference for local complementary features over cross-modal shared semantics adversely affects generalization performance; and (2) The trade-off between the receptive field size and computational complexity presents a critical bottleneck for scalable feature modeling. Addressing these issues, a novel Multispectral State-Space Feature Fusion framework, dubbed MS2Fusion, is proposed based on the state space model (SSM), achieving efficient and effective fusion through a dual-path parametric interaction mechanism. More specifically, the first cross-parameter interaction branch inherits the advantage of cross-attention in mining complementary information with cross-modal hidden state decoding in SSM. The second shared-parameter branch explores cross-modal alignment with joint embedding to obtain cross-modal similar semantic features and structures through parameter sharing in SSM. Finally, these two paths are jointly optimized with SSM for fusing multispectral features in a unified framework, allowing our MS2Fusion to enjoy both functional complementarity and shared semantic space. In our extensive experiments on mainstream benchmarks including FLIR, M3FD and LLVIP, our MS2Fusion significantly outperforms other state-of-the-art multispectral object detection methods, evidencing its superiority. Moreover, MS2Fusion is general and applicable to other multispectral perception tasks. We show that, even without specific design, MS2Fusion achieves state-of-the-art results on RGB-T semantic segmentation and RGBT salient object detection, showing its generality. The source code will be available at https://github.com/61s61min/MS2Fusion.git.
Authors:Antonella Barisic Kulas, Andreja Jurasovic, Stjepan Bogdan
Abstract:
Thermal imaging from unmanned aerial vehicles (UAVs) holds significant potential for applications in search and rescue, wildlife monitoring, and emergency response, especially under low-light or obscured conditions. However, the scarcity of large-scale, diverse thermal aerial datasets limits the advancement of deep learning models in this domain, primarily due to the high cost and logistical challenges of collecting thermal data. In this work, we introduce a novel procedural pipeline for generating synthetic thermal images from an aerial perspective. Our method integrates arbitrary object classes into existing thermal backgrounds by providing control over the position, scale, and orientation of the new objects, while aligning them with the viewpoints of the background. We enhance existing thermal datasets by introducing new object categories, specifically adding a drone class in urban environments to the HIT-UAV dataset and an animal category to the MONET dataset. In evaluating these datasets on the object detection task, we showcase strong performance across both new and existing classes, validating the successful expansion into new applications. Through comparative analysis, we show that thermal detectors outperform their visible-light-trained counterparts and highlight the importance of replicating aerial viewing angles. Project page: https://github.com/larics/thermal_aerial_synthetic.
Authors:Boyue Xu, Ruichao Hou, Tongwei Ren, Gangshan Wu
Abstract:
Prompt-learning-based multi-modal trackers have achieved promising progress by employing lightweight visual adapters to incorporate auxiliary modality features into frozen foundation models. However, existing approaches often struggle to learn reliable prompts due to limited exploitation of critical cues across frequency and temporal domains. In this paper, we propose a novel visual and memory dual adapter (VMDA) to construct more robust and discriminative representations for multi-modal tracking. Specifically, we develop a simple but effective visual adapter that adaptively transfers discriminative cues from the auxiliary modality to the dominant modality by jointly modeling the frequency, spatial, and channel-wise features. Additionally, we design the memory adapter inspired by the human memory mechanism, which stores global temporal cues and performs dynamic update and retrieval operations to ensure the consistent propagation of reliable temporal information across video sequences. Extensive experiments demonstrate that our method achieves state-of-the-art performance on various multi-modal tracking tasks, including RGB-Thermal, RGB-Depth, and RGB-Event tracking. Code and models are available at https://github.com/xuboyue1999/mmtrack.git.
Authors:Xiaodong Guo, Zi'ang Lin, Luwen Hu, Zhihong Deng, Tong Liu, Wujie Zhou
Abstract:
The integration of RGB and thermal data can significantly improve semantic segmentation performance in wild environments for field robots. Nevertheless, multi-source data processing (e.g. Transformer-based approaches) imposes significant computational overhead, presenting challenges for resource-constrained systems. To resolve this critical limitation, we introduced CM-SSM, an efficient RGB-thermal semantic segmentation architecture leveraging a cross-modal state space modeling (SSM) approach. Our framework comprises two key components. First, we introduced a cross-modal 2D-selective-scan (CM-SS2D) module to establish SSM between RGB and thermal modalities, which constructs cross-modal visual sequences and derives hidden state representations of one modality from the other. Second, we developed a cross-modal state space association (CM-SSA) module that effectively integrates global associations from CM-SS2D with local spatial features extracted through convolutional operations. In contrast with Transformer-based approaches, CM-SSM achieves linear computational complexity with respect to image resolution. Experimental results show that CM-SSM achieves state-of-the-art performance on the CART dataset with fewer parameters and lower computational cost. Further experiments on the PST900 dataset demonstrate its generalizability. Codes are available at https://github.com/xiaodonguo/CMSSM.
Authors:Abhay Negi, Omey M. Manyar, Satyandra K. Gupta
Abstract:
Robotic manipulation in space is essential for emerging applications such as debris removal and in-space servicing, assembly, and manufacturing (ISAM). A key requirement for these tasks is the ability to perform precise, contact-rich manipulation under significant uncertainty. In particular, thermal-induced deformation of manipulator links and temperature-dependent encoder bias introduce kinematic parameter errors that significantly degrade end-effector accuracy. Traditional calibration techniques rely on external sensors or dedicated calibration procedures, which can be infeasible or risky in dynamic, space-based operational scenarios.
This paper proposes a novel method for kinematic parameter estimation that only requires encoder measurements and binary contact detection. The approach focuses on estimating link thermal deformation strain and joint encoder biases by leveraging information of the contact manifold - the set of relative SE(3) poses at which contact between the manipulator and environment occurs. We present two core contributions: (1) a differentiable, learning-based model of the contact manifold, and (2) an optimization-based algorithm for estimating kinematic parameters from encoder measurements at contact instances. By enabling parameter estimation using only encoder measurements and contact detection, this method provides a robust, interpretable, and data-efficient solution for safe and accurate manipulation in the challenging conditions of space.
Authors:Kaiyuan Chen, Zhengjie Hu, Shaolin Zhang, Yuanqing Xia, Wannian Liang, Shuo Wang
Abstract:
The rapid detection of abnormal body temperatures in urban populations is essential for managing public health risks, especially during outbreaks of infectious diseases. Multi-drone thermal screening systems offer promising solutions for fast, large-scale, and non-intrusive human temperature monitoring. However, trajectory planning for multiple drones in complex urban environments poses significant challenges, including collision avoidance, coverage efficiency, and constrained flight environments. In this study, we propose an enhanced trust region sequential convex optimization (TR-SCO) algorithm for optimal trajectory planning of multiple drones performing thermal screening tasks. Our improved algorithm integrates a refined convex optimization formulation within a trust region framework, effectively balancing trajectory smoothness, obstacle avoidance, altitude constraints, and maximum screening coverage. Simulation results demonstrate that our approach significantly improves trajectory optimality and computational efficiency compared to conventional convex optimization methods. This research provides critical insights and practical contributions toward deploying efficient multi-drone systems for real-time thermal screening in urban areas. For readers interested in our research, we release our source code at https://github.com/Cherry0302/Enhanced-TR-SCO.
Authors:Ayush Shrivastava, Andrew Owens
Abstract:
We present a method for finding cross-modal space-time correspondences. Given two images from different visual modalities, such as an RGB image and a depth map, our model identifies which pairs of pixels correspond to the same physical points in the scene. To solve this problem, we extend the contrastive random walk framework to simultaneously learn cycle-consistent feature representations for both cross-modal and intra-modal matching. The resulting model is simple and has no explicit photo-consistency assumptions. It can be trained entirely using unlabeled data, without the need for any spatially aligned multimodal image pairs. We evaluate our method on both geometric and semantic correspondence tasks. For geometric matching, we consider challenging tasks such as RGB-to-depth and RGB-to-thermal matching (and vice versa); for semantic matching, we evaluate on photo-sketch and cross-style image alignment. Our method achieves strong performance across all benchmarks.
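The cycle-consistency idea behind a contrastive random walk can be sketched in a few lines. The example walks from the nodes of one image to the nodes of another and back, then penalizes walkers that do not return to their starting node. The function names, the temperature value, and the two-step walk are illustrative assumptions; the paper's model additionally handles intra-modal matching and is not reproduced here.

```python
import math

def softmax(row, temperature):
    """Numerically stable softmax over one row of similarities."""
    scaled = [v / temperature for v in row]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def transition(feats_a, feats_b, temperature=0.1):
    """Row-stochastic walk probabilities from each node in A to each node in B."""
    return [
        softmax([sum(x * y for x, y in zip(fa, fb)) for fb in feats_b], temperature)
        for fa in feats_a
    ]

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]

def cycle_loss(feats_a, feats_b, temperature=0.1):
    """Negative log-probability that a walk A -> B -> A returns to its start node."""
    round_trip = matmul(
        transition(feats_a, feats_b, temperature),
        transition(feats_b, feats_a, temperature),
    )
    n = len(round_trip)
    return -sum(math.log(round_trip[i][i]) for i in range(n)) / n

# Usage: features that match across "modalities" (in any order) give a small loss.
rgb_nodes = [[1.0, 0.0], [0.0, 1.0]]
other_nodes = [[0.0, 1.0], [1.0, 0.0]]  # same points, permuted
matched = cycle_loss(rgb_nodes, other_nodes)
```

Because the target is only "return to where you started", no pixel-level correspondence labels or spatially aligned pairs are needed, which matches the abstract's claim of training on unlabeled data.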
Authors:Raman Jha, Adithya Lenka, Mani Ramanagopal, Aswin Sankaranarayanan, Kaushik Mitra
Abstract:
In nighttime conditions, high noise levels and bright illumination sources degrade image quality, making low-light image enhancement challenging. Thermal images provide complementary information, offering richer textures and structural details. We propose RT-X Net, a cross-attention network that fuses RGB and thermal images for nighttime image enhancement. We leverage self-attention networks for feature extraction and a cross-attention mechanism for fusion to effectively integrate information from both modalities. To support research in this domain, we introduce the Visible-Thermal Image Enhancement Evaluation (V-TIEE) dataset, comprising 50 co-located visible and thermal images captured under diverse nighttime conditions. Extensive evaluations on the publicly available LLVIP dataset and our V-TIEE dataset demonstrate that RT-X Net outperforms state-of-the-art methods in low-light image enhancement. The code and the V-TIEE can be found here https://github.com/jhakrraman/rt-xnet.
Authors:X. Feng, D. Zhang, S. Hu, X. Li, M. Wu, J. Zhang, X. Chen, K. Huang
Abstract:
Effectively modeling and utilizing spatiotemporal features from RGB and other modalities (e.g., depth, thermal, and event data, denoted as X) is the core of RGB-X tracker design. Existing methods often employ two parallel branches to separately process the RGB and X input streams, requiring the model to simultaneously handle two dispersed feature spaces, which complicates both the model structure and computation process. More critically, intra-modality spatial modeling within each dispersed space incurs substantial computational overhead, limiting resources for inter-modality spatial modeling and temporal modeling. To address this, we propose a novel tracker, CSTrack, which focuses on modeling Compact Spatiotemporal features to achieve simple yet effective tracking. Specifically, we first introduce an innovative Spatial Compact Module that integrates the RGB-X dual input streams into a compact spatial feature, enabling thorough intra- and inter-modality spatial modeling. Additionally, we design an efficient Temporal Compact Module that compactly represents temporal features by constructing the refined target distribution heatmap. Extensive experiments validate the effectiveness of our compact spatiotemporal modeling method, with CSTrack achieving new SOTA results on mainstream RGB-X benchmarks. The code and models will be released at: https://github.com/XiaokunFeng/CSTrack.
Authors:Ze Wang, Jingang Qu, Zhenyu Gao, Pascal Morin
Abstract:
This work demonstrates an airflow-inertial odometry system based on multi-sensor data fusion of a thermal anemometer, IMU, ESC, and barometer. This goal is challenging because low-cost IMUs and barometers have significant bias, and anemometer measurements are very susceptible to interference from spinning propellers and ground effects. We employ a GRU-based deep neural network to estimate relative air speed from noisy and disturbed anemometer measurements, and an observer with a bias model to fuse the sensor data and thus estimate the state of the aerial vehicle. A complete flight dataset, including takeoff and landing on the ground, shows that the approach is able to decouple the downwash-induced wind speed caused by the propellers and the ground effect, and accurately estimate the flight speed in a wind-free indoor environment. The IMU and barometer biases are effectively estimated, which significantly reduces the position integration drift to only 5.7 m over a 203 s manual random flight. The open-source code is available at https://github.com/SyRoCo-ISIR/Flight-Speed-Estimation-Airflow.
Authors:Xiao Wang, Yu Jin, Lan Chen, Bo Jiang, Lin Zhu, Yonghong Tian, Jin Tang, Bin Luo
Abstract:
Event-based Vision Sensors (EVS) have demonstrated significant advantages over traditional RGB frame-based cameras in low-light conditions, high-speed motion capture, and low latency. Consequently, object detection based on EVS has attracted increasing attention from researchers. Current event stream object detection algorithms are typically built upon Convolutional Neural Networks (CNNs) or Transformers, which either capture limited local features using convolutional filters or incur high computational costs due to the utilization of self-attention. Recently proposed vision heat conduction backbone networks have shown a good balance between efficiency and accuracy; however, these models are not specifically designed for event stream data. They exhibit weak capability in modeling object contour information and fail to exploit the benefits of multi-scale features. To address these issues, this paper proposes a novel dynamic graph induced contour-aware heat conduction network for event stream based object detection, termed CvHeat-DET. The proposed model effectively leverages the clear contour information inherent in event streams to predict the thermal diffusivity coefficients within the heat conduction model, and integrates hierarchical structural graph features to enhance feature learning across multiple scales. Extensive experiments on three benchmark datasets for event stream-based object detection fully validated the effectiveness of the proposed model. The source code of this paper will be released on https://github.com/Event-AHU/OpenEvDET.
Authors:Xiao Ni, Carsten Kuehnel, Xiaoyi Jiang
Abstract:
Rapid advances in deep learning for computer vision have driven the adoption of RGB camera-based adaptive traffic light systems to improve traffic safety and pedestrian comfort. However, these systems often overlook the needs of people with mobility restrictions. Moreover, the use of RGB cameras presents significant challenges, including limited detection performance under adverse weather or low-visibility conditions, as well as heightened privacy concerns. To address these issues, we propose a fully automated, thermal detector-based traffic light system that dynamically adjusts signal durations for individuals with walking impairments or mobility burden and triggers the auditory signal for visually impaired individuals, thereby advancing towards barrier-free intersections for all users. To this end, we build the thermal dataset for people with mobility restrictions (TD4PWMR), designed to capture diverse pedestrian scenarios, particularly focusing on individuals with mobility aids or mobility burden under varying environmental conditions, such as different lighting, weather, and crowded urban settings. While thermal imaging offers advantages in terms of privacy and robustness to adverse conditions, it also introduces inherent hurdles for object detection due to the lack of color and fine texture details and the generally lower resolution of thermal images. To overcome these limitations, we develop YOLO-Thermal, a novel variant of the YOLO architecture that integrates advanced feature extraction and attention mechanisms for enhanced detection accuracy and robustness in thermal imaging. Experiments demonstrate that the proposed thermal detector outperforms existing detectors, while the proposed traffic light system effectively enhances barrier-free intersections. The source codes and dataset are available at https://github.com/leon2014dresden/YOLO-THERMAL.
Authors:Jitesh Joshi, Youngjun Cho
Abstract:
Remote physiological sensing using camera-based technologies offers transformative potential for non-invasive vital sign monitoring across healthcare and human-computer interaction domains. Although deep learning approaches have advanced the extraction of physiological signals from video data, existing methods have not been sufficiently assessed for their robustness to domain shifts. These shifts in remote physiological sensing include variations in ambient conditions, camera specifications, head movements, facial poses, and physiological states which often impact real-world performance significantly. Cross-dataset evaluation provides an objective measure to assess generalization capabilities across these domain shifts. We introduce Target Signal Constrained Factorization module (TSFM), a novel multidimensional attention mechanism that explicitly incorporates physiological signal characteristics as factorization constraints, allowing more precise feature extraction. Building on this innovation, we present MMRPhys, an efficient dual-branch 3D-CNN architecture designed for simultaneous multitask estimation of photoplethysmography (rPPG) and respiratory (rRSP) signals from multimodal RGB and thermal video inputs. Through comprehensive cross-dataset evaluation on five benchmark datasets, we demonstrate that MMRPhys with TSFM significantly outperforms state-of-the-art methods in generalization across domain shifts for rPPG and rRSP estimation, while maintaining a minimal inference latency suitable for real-time applications. Our approach establishes new benchmarks for robust multitask and multimodal physiological sensing and offers a computationally efficient framework for practical deployment in unconstrained environments. The web browser-based application featuring on-device real-time inference of MMRPhys model is available at https://physiologicailab.github.io/mmrphys-live
Authors:Shenglan Li, Rui Yao, Yong Zhou, Hancheng Zhu, Kunyang Sun, Bing Liu, Zhiwen Shao, Jiaqi Zhao
Abstract:
To reduce the reliance on large-scale annotations, self-supervised RGB-T tracking approaches have garnered significant attention. However, the omission of the object region by erroneous pseudo-labels or the introduction of background noise affects the efficiency of modality fusion, while pseudo-label noise triggered by similar objects can further affect tracking performance. In this paper, we propose GDSTrack, a novel approach that introduces dynamic graph fusion and temporal diffusion to address the above challenges in self-supervised RGB-T tracking. GDSTrack dynamically fuses the modalities of neighboring frames, treats them as distractor noise, and leverages the denoising capability of a generative model. Specifically, by constructing an adjacency matrix via an Adjacency Matrix Generator (AMG), the proposed Modality-guided Dynamic Graph Fusion (MDGF) module uses a dynamic adjacency matrix to guide graph attention, focusing on and fusing the object's coherent regions. Temporal Graph-Informed Diffusion (TGID) models MDGF features from neighboring frames as interference, thereby improving robustness against similar-object noise. Extensive experiments conducted on four public RGB-T tracking datasets demonstrate that GDSTrack outperforms the existing state-of-the-art methods. The source code is available at https://github.com/LiShenglana/GDSTrack.
Authors:Stefanos Gkikas, Raul Fernandez Rojas, Manolis Tsiknakis
Abstract:
Pain is a manifold condition that impacts a significant percentage of the population. Accurate and reliable pain evaluation for the people suffering is crucial to developing effective and advanced pain management protocols. Automatic pain assessment systems provide continuous monitoring and support decision-making processes, ultimately aiming to alleviate distress and prevent functionality decline. This study introduces PainFormer, a vision foundation model based on multi-task learning principles trained simultaneously on 14 tasks/datasets with a total of 10.9 million samples. Functioning as an embedding extractor for various input modalities, the foundation model provides feature representations to the Embedding-Mixer, a transformer-based module that performs the final pain assessment. Extensive experiments employing behavioral modalities - including RGB, synthetic thermal, and estimated depth videos - and physiological modalities such as ECG, EMG, GSR, and fNIRS revealed that PainFormer effectively extracts high-quality embeddings from diverse input modalities. The proposed framework is evaluated on two pain datasets, BioVid and AI4Pain, and directly compared to 75 different methodologies documented in the literature. Experiments conducted in unimodal and multimodal settings demonstrate state-of-the-art performances across modalities and pave the way toward general-purpose models for automatic pain assessment. The foundation model's architecture (code) and weights are available at: https://github.com/GkikasStefanos/PainFormer.
Authors:Jonas Frey, Turcan Tuna, Lanke Frank Tarimo Fu, Cedric Weibel, Katharine Patterson, Benjamin Krummenacher, Matthias Müller, Julian Nubert, Maurice Fallon, Cesar Cadena, Marco Hutter
Abstract:
Achieving robust autonomy in mobile robots operating in complex and unstructured environments requires a multimodal sensor suite capable of capturing diverse and complementary information. However, designing such a sensor suite involves multiple critical design decisions, such as sensor selection, component placement, thermal and power limitations, compute requirements, networking, synchronization, and calibration. While the importance of these key aspects is widely recognized, they are often overlooked in academia or retained as proprietary knowledge within large corporations. To improve this situation, we present Boxi, a tightly integrated sensor payload that enables robust autonomy of robots in the wild. This paper discusses the impact of payload design decisions made to optimize algorithmic performance for downstream tasks, specifically focusing on state estimation and mapping. Boxi is equipped with a variety of sensors: two LiDARs, 10 RGB cameras including high-dynamic range, global shutter, and rolling shutter models, an RGB-D camera, 7 inertial measurement units (IMUs) of varying precision, and a dual antenna RTK GNSS system. Our analysis shows that time synchronization, calibration, and sensor modality have a crucial impact on the state estimation performance. We frame this analysis in the context of cost considerations and environment-specific challenges. We also present a mobile sensor suite `cookbook` to serve as a comprehensive guideline, highlighting generalizable key design considerations and lessons learned during the development of Boxi. Finally, we demonstrate the versatility of Boxi being used in a variety of applications in real-world scenarios, contributing to robust autonomy. More details and code: https://github.com/leggedrobotics/grand_tour_box
Authors:Xingxing Zuo, Nikhil Ranganathan, Connor Lee, Georgia Gkioxari, Soon-Jo Chung
Abstract:
Monocular depth estimation (MDE) from thermal images is a crucial technology for robotic systems operating in challenging conditions such as fog, smoke, and low light. The limited availability of labeled thermal data constrains the generalization capabilities of thermal MDE models compared to foundational RGB MDE models, which benefit from datasets of millions of images across diverse scenarios. To address this challenge, we introduce a novel pipeline that enhances thermal MDE through knowledge distillation from a versatile RGB MDE model. Our approach features a confidence-aware distillation method that utilizes the predicted confidence of the RGB MDE to selectively strengthen the thermal MDE model, capitalizing on the strengths of the RGB model while mitigating its weaknesses. Our method significantly improves the accuracy of the thermal MDE, independent of the availability of labeled depth supervision, and greatly expands its applicability to new scenarios. In our experiments on new scenarios without labeled depth, the proposed confidence-aware distillation method reduces the absolute relative error of thermal MDE by 22.88% compared to the baseline without distillation.
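One simple way to read "confidence-aware distillation" is a per-pixel loss in which the teacher's confidence scales how strongly the student is pulled toward the teacher's depth. The sketch below is an assumption-laden illustration: the function name, the L1 distance, the flat pixel lists, and the optional `min_conf` cutoff are all hypothetical and are not the paper's formulation.

```python
def confidence_weighted_l1(student_depth, teacher_depth, teacher_conf, min_conf=0.0):
    """Confidence-weighted mean absolute difference between student and teacher depths.

    All three inputs are flat sequences with one value per pixel. Pixels whose
    teacher confidence is below min_conf are ignored entirely.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for s, t, c in zip(student_depth, teacher_depth, teacher_conf):
        if c < min_conf:
            continue
        weighted_sum += c * abs(s - t)
        weight_total += c
    # With no usable pixels there is nothing to distill from.
    return weighted_sum / weight_total if weight_total > 0 else 0.0

# Usage: the second pixel disagrees strongly, but the teacher is not confident
# there, so it contributes nothing to the loss.
loss = confidence_weighted_l1(
    student_depth=[1.0, 5.0],
    teacher_depth=[1.0, 1.0],
    teacher_conf=[1.0, 0.0],
)
```

Weighting of this kind lets the student inherit the RGB model's reliable predictions while down-weighting regions where the RGB model is likely to be wrong, such as low-light areas.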
Authors:Manjunath D, Aniruddh Sikdar, Prajwal Gurunath, Sumanth Udupa, Suresh Sundaram
Abstract:
Domain-adaptive thermal object detection plays a key role in facilitating visible (RGB)-to-thermal (IR) adaptation by reducing the need for co-registered image pairs and minimizing reliance on large annotated IR datasets. However, inherent limitations of IR images, such as the lack of color and texture cues, pose challenges for RGB-trained models, leading to increased false positives and poor-quality pseudo-labels. To address this, we propose Semantic-Aware Gray color Augmentation (SAGA), a novel strategy for mitigating color bias and bridging the domain gap by extracting object-level features relevant to IR images. Additionally, to validate the proposed SAGA for drone imagery, we introduce IndraEye, a multi-sensor (RGB-IR) dataset designed for diverse applications. The dataset contains 5,612 images with 145,666 instances, captured from diverse angles, altitudes, backgrounds, and times of day, offering valuable opportunities for multimodal learning, domain adaptation for object detection and segmentation, and exploration of sensor-specific strengths and weaknesses. IndraEye aims to enhance the development of more robust and accurate aerial perception systems, especially in challenging environments. Experimental results show that SAGA significantly improves RGB-to-IR adaptation on the autonomous driving and IndraEye datasets, achieving consistent performance gains of +0.4% to +7.6% (mAP) when integrated with state-of-the-art domain adaptation techniques. The dataset and codes are available at https://github.com/airliisc/IndraEye.
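The abstract does not spell out how the gray color augmentation is applied. The sketch below shows one plausible reading, in which only the pixels inside annotated object boxes are converted to grayscale so that a detector cannot rely on object color. The box format, the function names, and the BT.601 luma weights are assumptions made for illustration and may differ from the SAGA implementation.

```python
def to_gray(pixel):
    """Convert one (r, g, b) pixel to a gray triple using ITU-R BT.601 luma weights."""
    r, g, b = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)

def gray_augment(image, boxes):
    """Return a copy of image with the pixels inside each box turned gray.

    image is a list of rows of (r, g, b) tuples; boxes are (x0, y0, x1, y1)
    with exclusive upper bounds. Boxes are clipped to the image extent.
    """
    out = [list(row) for row in image]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(y0, 0), min(y1, len(out))):
            for x in range(max(x0, 0), min(x1, len(out[y]))):
                out[y][x] = to_gray(out[y][x])
    return out

# Usage: only the red pixel lies inside the (hypothetical) object box.
image = [[(255, 0, 0), (0, 255, 0)]]
augmented = gray_augment(image, [(0, 0, 1, 1)])
```

Leaving the background in color while graying the objects keeps scene context intact and removes only the object-level color cue that has no counterpart in IR images.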
Authors:Aihua Zheng, Yongqi Sun, Zi Wang, Chenglong Li, Jin Tang
Abstract:
The performance of multi-spectral vehicle Re-identification (ReID) is significantly degraded when some important discriminative cues in visible, near-infrared, and thermal infrared spectra are lost. Existing methods generate or enhance missing details in low-quality spectral data using the high-quality one, generally called the primary spectrum, but how to justify the choice of primary spectrum is a challenging problem. In addition, when the quality of the primary spectrum is low, the enhancement effect would be greatly degraded, thus limiting the performance of multi-spectral vehicle ReID. To address these problems, we propose the Collaborative Enhancement Network (CoEN), which generates a high-quality proxy from all spectra data and leverages it to supervise the selection of the primary spectrum and enhance all spectra features in a collaborative manner, for robust multi-spectral vehicle ReID. First, to integrate the rich cues from all spectra data, we design the Proxy Generator (PG) to progressively aggregate multi-spectral features. Second, we design the Dynamic Quality Sort Module (DQSM), which sorts all spectra data by measuring their correlations with the proxy, to accurately select the primary spectrum as the one with the highest correlation. Finally, we design the Collaborative Enhancement Module (CEM) to effectively compensate for missing contents of all spectra by combining the primary spectrum and the proxy, thereby mitigating the impact of a low-quality primary spectrum. Extensive experiments on three benchmark datasets are conducted to validate the efficacy of the proposed approach against other multi-spectral vehicle ReID methods. The codes will be released at https://github.com/yongqisun/CoEN.
Authors:Mengyuan Li, Changhong Fu, Ziyu Lu, Zijie Zhang, Haobo Zuo, Liangliang Yao
Abstract:
Thermal imaging can greatly enhance the application of intelligent unmanned aerial vehicles (UAV) in challenging environments. However, the inherent low resolution of thermal sensors leads to insufficient details and blurred boundaries. Super-resolution (SR) offers a promising solution to this issue, but most existing SR methods are designed for fixed-scale SR. They are computationally expensive and inflexible in practical applications. To address the above issues, this work proposes a novel any-scale thermal SR method (AnyTSR) for UAV within a single model. Specifically, a new image encoder is proposed to explicitly assign specific feature code to enable more accurate and flexible representation. Additionally, by effectively embedding coordinate offset information into the local feature ensemble, an innovative any-scale upsampler is proposed to better understand spatial relationships and reduce artifacts. Moreover, a novel dataset (UAV-TSR), covering both land and water scenes, is constructed for thermal SR tasks. Experimental results demonstrate that the proposed method consistently outperforms state-of-the-art methods across all scaling factors and generates more accurate and detailed high-resolution images. The code is located at https://github.com/vision4robotics/AnyTSR.
Authors:Anning Hu, Ang Li, Xirui Jin, Danping Zou
Abstract:
We introduce ThermoStereoRT, a real-time thermal stereo matching method designed for all-weather conditions that recovers disparity from two rectified thermal stereo images, envisioning applications such as night-time drone surveillance or under-bed cleaning robots. Leveraging a lightweight yet powerful backbone, ThermoStereoRT constructs a 3D cost volume from thermal images and employs multi-scale attention mechanisms to produce an initial disparity map. To refine this map, we design a novel channel and spatial attention module. Addressing the challenge of sparse ground truth data in thermal imagery, we utilize knowledge distillation to boost performance without increasing computational demands. Comprehensive evaluations on multiple datasets demonstrate that ThermoStereoRT delivers both real-time capacity and robust accuracy, making it a promising solution for real-world deployment in various challenging environments. Our code will be released on https://github.com/SJTU-ViSYS-team/ThermoStereoRT
Authors:Shiao Wang, Xiao Wang, Bo Jiang, Lin Zhu, Guoqi Li, Yaowei Wang, Yonghong Tian, Jin Tang
Abstract:
Human Activity Recognition (HAR) has primarily relied on traditional RGB cameras to achieve high-performance activity recognition. However, the challenging factors in real-world scenarios, such as insufficient lighting and rapid movements, inevitably degrade the performance of RGB cameras. To address these challenges, biologically inspired event cameras offer a promising solution to overcome the limitations of traditional RGB cameras. In this work, we rethink human activity recognition by combining the RGB and event cameras. The first contribution is the proposed large-scale multi-modal RGB-Event human activity recognition benchmark dataset, termed HARDVS 2.0, which bridges the dataset gaps. It contains 300 categories of everyday real-world actions with a total of 107,646 paired videos covering various challenging scenarios. Inspired by the physics-informed heat conduction model, we propose a novel multi-modal heat conduction operation framework for effective activity recognition, termed MMHCO-HAR. In more detail, given the RGB frames and event streams, we first extract the feature embeddings using a stem network. Then, multi-modal Heat Conduction blocks are designed to fuse the dual features, the key module of which is the multi-modal Heat Conduction Operation layer. We integrate RGB and event embeddings through a multi-modal DCT-IDCT layer while adaptively incorporating the thermal conductivity coefficient via FVEs into this module. After that, we propose an adaptive fusion module based on a policy routing strategy for high-performance classification. Comprehensive experiments demonstrate that our method consistently performs well, validating its effectiveness and robustness. The source code and benchmark dataset will be released on https://github.com/Event-AHU/HARDVS/tree/HARDVSv2
Authors:Houzhang Fang, Xiaolin Wang, Zengyang Li, Lu Wang, Qingshan Li, Yi Chang, Luxin Yan
Abstract:
Infrared unmanned aerial vehicle (UAV) images captured using thermal detectors are often affected by temperature-dependent low-frequency nonuniformity, which significantly reduces the contrast of the images. Detecting UAV targets under nonuniform conditions is crucial in UAV surveillance applications. Existing methods typically treat infrared nonuniformity correction (NUC) as a preprocessing step for detection, which leads to suboptimal performance. Balancing the two tasks while enhancing detection-beneficial information remains challenging. In this paper, we present a detection-friendly union framework, termed UniCD, that simultaneously addresses both infrared NUC and UAV target detection tasks in an end-to-end manner. We first model NUC as a parameter estimation problem with a small number of parameters, jointly driven by priors and data, to generate detection-conducive images. Then, we incorporate a new auxiliary loss with target mask supervision into the backbone of the infrared UAV target detection network to strengthen target features while suppressing the background. To better balance correction and detection, we introduce a detection-guided self-supervised loss to reduce feature discrepancies between the two tasks, thereby enhancing detection robustness to varying nonuniformity levels. Additionally, we construct a new benchmark, called IRBFD, composed of 50,000 infrared images with various nonuniformity types, multi-scale UAV targets, and rich backgrounds, together with target annotations. Extensive experiments on IRBFD demonstrate that our UniCD is a robust union framework for NUC and UAV target detection while achieving real-time processing capabilities. The dataset is available at https://github.com/IVPLaboratory/UniCD.
Authors:Jay N. Paranjape, Celso de Melo, Vishal M. Patel
Abstract:
Thermal imaging is crucial for scene understanding, particularly in low-light and nighttime conditions. However, collecting large thermal datasets is costly and labor-intensive due to the specialized equipment required for infrared image capture. To address this challenge, researchers have explored visible-to-thermal image translation. Most existing methods rely on Generative Adversarial Networks (GANs) or Diffusion Models (DMs), treating the task as a style transfer problem. As a result, these approaches attempt to learn both the modality distribution shift and underlying physical principles from limited training data. In this paper, we propose F-ViTA, a novel approach that leverages the general world knowledge embedded in foundation models to guide the diffusion process for improved translation. Specifically, we condition an InstructPix2Pix Diffusion Model with zero-shot masks and labels from foundation models such as SAM and Grounded DINO. This allows the model to learn meaningful correlations between scene objects and their thermal signatures in infrared imagery. Extensive experiments on five public datasets demonstrate that F-ViTA outperforms state-of-the-art (SOTA) methods. Furthermore, our model generalizes well to out-of-distribution (OOD) scenarios and can generate Long-Wave Infrared (LWIR), Mid-Wave Infrared (MWIR), and Near-Infrared (NIR) translations from the same visible image. Code: https://github.com/JayParanjape/F-ViTA/tree/master.
Authors:Ukcheol Shin, Jinsun Park
Abstract:
Achieving robust and accurate spatial perception under adverse weather and lighting conditions is crucial for the high-level autonomy of self-driving vehicles and robots. However, existing perception algorithms relying on the visible spectrum are highly affected by weather and lighting conditions. A long-wave infrared camera (i.e., thermal imaging camera) can be a potential solution to achieve high-level robustness. However, the absence of large-scale datasets and standardized benchmarks remains a significant bottleneck to progress in active research for robust visual perception from thermal images. To this end, this manuscript provides a large-scale Multi-Spectral Stereo (MS$^2$) dataset that consists of stereo RGB, stereo NIR, stereo thermal, stereo LiDAR data, and GNSS/IMU information along with semi-dense depth ground truth. The MS$^2$ dataset includes 162K synchronized multi-modal data pairs captured across diverse locations (e.g., urban city, residential area, campus, and highway) at different times (e.g., morning, daytime, and nighttime) and under various weather conditions (e.g., clear-sky, cloudy, and rainy). Secondly, we conduct a thorough evaluation of monocular and stereo depth estimation networks across RGB, NIR, and thermal modalities to establish standardized benchmark results on MS$^2$ depth test sets (e.g., day, night, and rainy). Lastly, we provide in-depth analyses and discuss the challenges revealed by the benchmark results, such as the performance variability for each modality under adverse conditions, domain shift between different sensor modalities, and potential research directions for thermal perception. Our dataset and source code are publicly available at https://sites.google.com/view/multi-spectral-stereo-dataset and https://github.com/UkcheolShin/SupDepth4Thermal.
Authors:Yu-Hsi Chen
Abstract:
Detecting and tracking multiple unmanned aerial vehicles (UAVs) in thermal infrared video is inherently challenging due to low contrast, environmental noise, and small target sizes. This paper provides a straightforward approach to address multi-UAV tracking in thermal infrared video, leveraging recent advances in detection and tracking. Instead of relying on the well-established YOLOv5 with DeepSORT combination, we present a tracking framework built on YOLOv12 and BoT-SORT, enhanced with tailored training and inference strategies. We evaluate our approach following the 4th Anti-UAV Challenge metrics and reach competitive performance. Notably, we achieved strong results without using contrast enhancement or temporal information fusion to enrich UAV features, highlighting our approach as a "Strong Baseline" for multi-UAV tracking tasks. We provide implementation details, in-depth experimental analysis, and a discussion of potential improvements. The code is available at https://github.com/wish44165/YOLOv12-BoT-SORT-ReID.
Authors:Ruiyang Ha, Songyi Jiang, Bin Li, Bikang Pan, Yihang Zhu, Junjie Zhang, Xiatian Zhu, Shaogang Gong, Jingya Wang
Abstract:
Conventional person re-identification (ReID) research is often limited to single-modality sensor data from static cameras, which fails to address the complexities of real-world scenarios where multi-modal signals are increasingly prevalent. For instance, consider an urban ReID system integrating stationary RGB cameras, nighttime infrared sensors, and UAVs equipped with dynamic tracking capabilities. Such systems face significant challenges due to variations in camera perspectives, lighting conditions, and sensor modalities, hindering effective person ReID. To address these challenges, we introduce the MP-ReID benchmark, a novel dataset designed specifically for multi-modality and multi-platform ReID. This benchmark uniquely compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging, captured by both UAVs and ground-based cameras in indoor and outdoor environments. Building on this benchmark, we introduce Uni-Prompt ReID, a framework with specifically designed prompts, tailored for cross-modality and cross-platform scenarios. Our method consistently outperforms state-of-the-art approaches, establishing a robust foundation for future research in complex and dynamic ReID environments. Our dataset is available at: https://mp-reid.github.io/.
Authors:Jiayi Zhao, Fei Teng, Kai Luo, Guoqiang Zhao, Zhiyong Li, Xu Zheng, Kailun Yang
Abstract:
The perception capability of robotic systems relies on the richness of the dataset. Although Segment Anything Model 2 (SAM2), trained on large datasets, demonstrates strong potential in perception tasks, its inherent training paradigm makes it unsuitable for RGB-T tasks. To address these challenges, we propose SHIFNet, a novel SAM2-driven Hybrid Interaction Paradigm that unlocks the potential of SAM2 with linguistic guidance for efficient RGB-Thermal perception. Our framework consists of two key components: (1) a Semantic-Aware Cross-modal Fusion (SACF) module that dynamically balances modality contributions through text-guided affinity learning, overcoming SAM2's inherent RGB bias; (2) a Heterogeneous Prompting Decoder (HPD) that enhances global semantic information through a semantic enhancement module and then combines it with category embeddings to amplify cross-modal semantic consistency. With 32.27M trainable parameters, SHIFNet achieves state-of-the-art segmentation performance on public benchmarks, reaching 89.8% on PST900 and 67.8% on FMB. The framework facilitates the adaptation of pre-trained large models to RGB-T segmentation tasks, effectively mitigating the high costs associated with data collection while endowing robotic systems with comprehensive perception capabilities. The source code will be made publicly available at https://github.com/iAsakiT3T/SHIFNet.
Authors:Nikita Kazeev, Wei Nong, Ignat Romanov, Ruiming Zhu, Andrey Ustyuzhanin, Shuya Yamazaki, Kedar Hippalgaonkar
Abstract:
The symmetry of a crystal plays a fundamental role in determining its physical, chemical, and electronic properties such as electrical and thermal conductivity, optical and polarization behavior, and mechanical strength. Almost all known crystalline materials have internal symmetry. However, this is often inadequately addressed by existing generative models, making the consistent generation of stable and symmetrically valid crystal structures a significant challenge. We introduce WyFormer, a generative model that directly tackles this by formally conditioning on space group symmetry. It achieves this by using Wyckoff positions as the basis for an elegant, compressed, and discrete structure representation. To model the distribution, we develop a permutation-invariant autoregressive model based on a Transformer encoder without positional encoding. Extensive experimentation demonstrates WyFormer's compelling combination of attributes: it achieves best-in-class symmetry-conditioned generation, incorporates a physics-motivated inductive bias, produces structures with competitive stability, predicts material properties with competitive accuracy even without atomic coordinates, and exhibits unparalleled inference speed.
Authors:Mingjie Wen, Jiahe Han, Wenjuan Li, Xiaoya Chang, Qingzhao Chu, Dongping Chen
Abstract:
The discovery and optimization of high-energy materials (HEMs) are constrained by the prohibitive computational expense and prolonged development cycles inherent in conventional approaches. In this work, we develop a general neural network potential (NNP) that efficiently predicts the structural, mechanical, and decomposition properties of HEMs composed of C, H, N, and O. Our framework leverages pre-trained NNP models, fine-tuned using transfer learning on energy and force data derived from density functional theory (DFT) calculations. This strategy enables rapid adaptation across 20 different HEM systems while maintaining DFT-level accuracy, significantly reducing computational costs. A key aspect of this work is the ability of the NNP model to capture the chemical activity space of HEMs and accurately describe the key atomic interactions and reaction mechanisms during thermal decomposition. The general NNP model has been applied in molecular dynamics (MD) simulations and validated with experimental data for various HEM structures. Results show that the NNP model accurately predicts the structural, mechanical, and decomposition properties of HEMs by effectively describing their chemical activity space. Compared to traditional force fields, it offers superior DFT-level accuracy and generalization across both microscopic and macroscopic properties, reducing the computational and experimental costs. This work provides an efficient strategy for the design and development of HEMs and proposes a promising framework for integrating DFT, machine learning, and experimental methods in materials research. (To facilitate further research and practical applications, we open-source our NNP model on GitHub: https://github.com/MingjieWen/General-NNP-model-for-C-H-N-O-Energetic-Materials.)
Authors:Xingyuan Li, Zirui Wang, Yang Zou, Zhixin Chen, Jun Ma, Zhiying Jiang, Long Ma, Jinyuan Liu
Abstract:
Infrared imaging is essential for autonomous driving and robotic operations as a supportive modality due to its reliable performance in challenging environments. Despite its popularity, the limitations of infrared cameras, such as low spatial resolution and complex degradations, consistently challenge imaging quality and subsequent visual tasks. Hence, infrared image super-resolution (IISR) has been developed to address this challenge. While recent developments in diffusion models have greatly advanced this field, current methods either ignore the unique modal characteristics of infrared imaging or overlook the machine perception requirements. To bridge these gaps, we propose DifIISR, an infrared image super-resolution diffusion model optimized for visual quality and perceptual performance. Our approach achieves task-based guidance for diffusion by injecting gradients derived from visual and perceptual priors into the noise during the reverse process. Specifically, we introduce an infrared thermal spectrum distribution regulation to preserve visual fidelity, ensuring that the reconstructed infrared images closely align with high-resolution images by matching their frequency components. Subsequently, we incorporate various visual foundational models as the perceptual guidance for downstream visual tasks, infusing generalizable perceptual features beneficial for detection and segmentation. As a result, our approach achieves superior visual results while attaining state-of-the-art downstream task performance. Code is available at https://github.com/zirui0625/DifIISR
Authors:Zhu Liu, Zijun Wang, Jinyuan Liu, Fanqi Meng, Long Ma, Risheng Liu
Abstract:
Thermal imaging is often compromised by dynamic, complex degradations caused by hardware limitations and unpredictable environmental factors. The scarcity of high-quality infrared data, coupled with the challenges of dynamic, intricate degradations, makes it difficult to recover details using existing methods. In this paper, we introduce thermal degradation simulation integrated into the training process via a mini-max optimization, by modeling these degraded factors as adversarial attacks on thermal images. The simulation is dynamic to maximize objective functions, thus capturing a broad spectrum of degraded data distributions. This approach enables training with limited data, thereby improving model performance. Additionally, we introduce a dual-interaction network that combines the benefits of spiking neural networks with scale transformation to capture degraded features with sharp spike signal intensities. This architecture ensures compact model parameters while preserving efficient feature representation. Extensive experiments demonstrate that our method not only achieves superior visual quality under diverse single and composite degradations, but also delivers a significant reduction in processing when trained on only fifty clear images, outperforming existing techniques in efficiency and accuracy. The source code will be available at https://github.com/LiuZhu-CV/DEAL.
Authors:Ukcheol Shin, Kyunghyun Lee, Jean Oh
Abstract:
Deploying depth estimation networks in the real world requires high-level robustness against various adverse conditions to ensure safe and reliable autonomy. For this purpose, many autonomous vehicles employ multi-modal sensor systems, including an RGB camera, NIR camera, thermal camera, LiDAR, or Radar. They mainly adopt two strategies to use multiple sensors: modality-wise and multi-modal fused inference. The former method is flexible but memory-inefficient, unreliable, and vulnerable. Multi-modal fusion can provide high-level reliability, yet it needs a specialized architecture. In this paper, we propose an effective solution, named the align-and-fuse strategy, for depth estimation from multi-spectral images. In the align stage, we align embedding spaces between multiple spectrum bands to learn a shareable representation across multi-spectral images by minimizing the contrastive loss of global and spatially aligned local features with geometric cues. After that, in the fuse stage, we train an attachable feature fusion module that can selectively aggregate the multi-spectral features for reliable and robust prediction results. Based on the proposed method, a single depth network can achieve both spectral-invariant and multi-spectral fused depth estimation while preserving reliability, memory efficiency, and flexibility.
Authors:He Wang, Tianyang Xu, Zhangyong Tang, Xiao-Jun Wu, Josef Kittler
Abstract:
Multi-modal tracking is essential in single-object tracking (SOT), as different sensor types contribute unique capabilities to overcome challenges caused by variations in object appearance. However, existing unified RGB-X trackers (X represents depth, event, or thermal modality) either rely on a task-specific training strategy for individual RGB-X image pairs or fail to address the critical importance of modality-adaptive perception in real-world applications. In this work, we propose UASTrack, a unified adaptive selection framework that facilitates both model and parameter unification, as well as adaptive modality discrimination across various multi-modal tracking tasks. To achieve modality-adaptive perception in joint RGB-X pairs, we design a Discriminative Auto-Selector (DAS) capable of identifying modality labels, thereby distinguishing the data distributions of auxiliary modalities. Furthermore, we propose a Task-Customized Optimization Adapter (TCOA) tailored to various modalities in the latent space. This strategy effectively filters noise redundancy and mitigates background interference based on the specific characteristics of each modality. Extensive comparisons conducted on five benchmarks including LasHeR, GTOT, RGBT234, VisEvent, and DepthTrack, covering RGB-T, RGB-E, and RGB-D tracking scenarios, demonstrate that our approach achieves competitive performance while introducing only 1.87M additional trainable parameters and 1.95G FLOPs. The code will be available at https://github.com/wanghe/UASTrack.
Authors:Ahmed Sharshar, Yasser Attia, Mohammad Yaqub, Mohsen Guizani
Abstract:
Traditional remote spirometry lacks the precision required for effective pulmonary monitoring. We present a novel, non-invasive approach using multimodal predictive models that integrate RGB or thermal video data with patient metadata. Our method leverages energy-efficient Spiking Neural Networks (SNNs) for the regression of Peak Expiratory Flow (PEF) and classification of Forced Expiratory Volume (FEV1) and Forced Vital Capacity (FVC), using lightweight CNNs to overcome SNN limitations in regression tasks. Multimodal data integration is improved with a Multi-Head Attention Layer, and we employ K-Fold validation and ensemble learning to boost robustness. Using thermal data, our SNN models achieve 92% accuracy on a breathing-cycle basis and 99.5% patient-wise. PEF regression models attain Relative RMSEs of 0.11 (thermal) and 0.26 (RGB), with an MAE of 4.52% for FEV1/FVC predictions, establishing state-of-the-art performance. Code and dataset can be found on https://github.com/ahmed-sharshar/RespiroDynamics.git
Authors:Xie Zhang, Chenxiao Li, Chenshu Wu
Abstract:
This paper presents the design and implementation of TAPOR, a privacy-preserving, non-contact, and fully passive sensing system for accurate and robust 3D hand pose reconstruction for around-device interaction using a single low-cost thermal array sensor. Thermal sensing using inexpensive and miniature thermal arrays emerges with an excellent utility-privacy balance, offering an imaging resolution significantly lower than cameras but far superior to RF signals like radar or WiFi. The design of TAPOR, however, is challenging, mainly because the captured temperature maps are low-resolution and textureless. To overcome the challenges, we investigate thermo-depth and thermo-pose properties, proposing a novel physics-inspired neural network that learns effective 3D spatial representations of potential hand poses. We then formulate the 3D pose reconstruction problem as a distinct retrieval task, enabling accurate hand pose determination from the input temperature map. To deploy TAPOR on IoT devices, we introduce an effective heterogeneous knowledge distillation method, reducing computation by 377x. TAPOR is fully implemented and tested in real-world scenarios, showing remarkable performance, supported by four gesture control and finger tracking case studies. We envision TAPOR to be a ubiquitous interface for around-device control and have open-sourced it at https://github.com/aiot-lab/TAPOR.
Authors:Akash Kundu
Abstract:
The Sachdev-Ye-Kitaev (SYK) model, known for its strong quantum correlations and chaotic behavior, serves as a key platform for quantum gravity studies. However, variationally preparing thermal states on near-term quantum processors for large systems ($N>12$, where $N$ is the number of Majorana fermions) presents a significant challenge due to the rapid growth in the complexity of parameterized quantum circuits. This paper addresses this challenge by integrating reinforcement learning (RL) with convolutional neural networks, employing an iterative approach to optimize the quantum circuit and its parameters. The refinement process is guided by a composite reward signal derived from entropy and the expectation values of the SYK Hamiltonian. This approach reduces the number of CNOT gates by two orders of magnitude for systems $N\geq12$ compared to traditional methods like first-order Trotterization. We demonstrate the effectiveness of the RL framework in both noiseless and noisy quantum hardware environments, maintaining high accuracy in thermal state preparation. This work advances a scalable, RL-based framework with applications for quantum gravity studies and out-of-time-ordered thermal correlators computation in quantum many-body systems on near-term quantum hardware. The code is available at https://github.com/Aqasch/solving_SYK_model_with_RL.
Authors:Peifu Liu, Tingfa Xu, Guokai Shi, Jingxuan Xu, Huan Chen, Jianan Li
Abstract:
Hyperspectral salient object detection (HSOD) aims to extract targets or regions with significantly different spectra from hyperspectral images. While existing deep learning-based methods can achieve good detection results, they generally necessitate pixel-level annotations, which are notably challenging to acquire for hyperspectral images. To address this issue, we introduce point supervision into HSOD, and incorporate Spectral Saliency, derived from conventional HSOD methods, as a pivotal spectral representation within the framework. This integration leads to the development of a novel Spectrum-oriented Point-supervised Saliency Detector (SPSD). Specifically, we propose a novel pipeline, designed for HSIs, to generate pseudo-labels, effectively mitigating the performance decline associated with the point supervision strategy. Additionally, Spectral Saliency is employed to counteract information loss during model supervision and saliency refinement, thereby maintaining the structural integrity and edge accuracy of the detected objects. Furthermore, we introduce a Spectrum-transformed Spatial Gate to focus more precisely on salient regions while reducing feature redundancy. We have carried out comprehensive experiments on both HSOD-BIT and HS-SOD datasets to validate the efficacy of our proposed method, using mean absolute error (MAE), E-measure, F-measure, Area Under Curve, and Cross Correlation as evaluation metrics. For instance, on the HSOD-BIT dataset, our SPSD achieves an MAE of 0.031 and an F-measure of 0.878. Thorough ablation studies have substantiated the effectiveness of each individual module and provided insights into the model's working mechanism. Further evaluations on RGB-thermal salient object detection datasets highlight the versatility of our approach.
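Two of the metrics named in the abstract above, MAE and F-measure, can be sketched as follows for a single saliency map. This is an illustrative sketch rather than the evaluation code used in the paper: the function names are hypothetical, the maps are flattened into lists, and both the fixed 0.5 threshold and the weighting beta squared = 0.3 (a common convention in salient object detection) are assumptions that the abstract does not state.

```python
def saliency_mae(pred, gt):
    """Mean absolute error between a predicted saliency map and a binary mask.

    Both inputs are flat lists of equal length with values in [0, 1].
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """Weighted F-measure of the thresholded prediction against a binary mask."""
    binary = [1 if p >= threshold else 0 for p in pred]
    true_pos = sum(1 for b, g in zip(binary, gt) if b == 1 and g == 1)
    if true_pos == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = true_pos / sum(binary)
    recall = true_pos / sum(1 for g in gt if g == 1)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    prediction = [0.9, 0.8, 0.2, 0.1]
    mask = [1, 0, 1, 0]
    print(saliency_mae(prediction, mask))
    print(f_measure(prediction, mask))
```

Benchmarks often report the maximum or mean F-measure over a sweep of thresholds rather than a single fixed one, so values computed this way are not directly comparable to the figures quoted above.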
Authors:Sofiane Bouaziz, Adel Hafiane, Raphael Canals, Rachid Nedjai
Abstract:
Land Surface Temperature (LST) plays a key role in climate monitoring, urban heat assessment, and land-atmosphere interactions. However, current thermal infrared satellite sensors cannot simultaneously achieve high spatial and temporal resolution. Spatio-temporal fusion (STF) techniques address this limitation by combining complementary satellite data, one with high spatial but low temporal resolution, and another with high temporal but low spatial resolution. Existing STF techniques, from classical models to modern deep learning (DL) architectures, were primarily developed for surface reflectance (SR). Their application to thermal data remains limited and often overlooks LST-specific spatial and temporal variability. This study provides a focused review of DL-based STF methods for LST. We present a formal mathematical definition of the thermal fusion task, propose a refined taxonomy of relevant DL methods, and analyze the modifications required when adapting SR-oriented models to LST. To support reproducibility and benchmarking, we introduce a new dataset comprising 51 Terra MODIS-Landsat LST pairs from 2013 to 2024, and evaluate representative models to explore their behavior on thermal data. The analysis highlights performance gaps, architecture sensitivities, and open research challenges. The dataset and accompanying resources are publicly available at https://github.com/Sofianebouaziz1/STF-LST.
Authors:Kunpeng Wang, Keke Chen, Chenglong Li, Zhengzheng Tu, Bin Luo
Abstract:
Alignment-free RGB-Thermal (RGB-T) salient object detection (SOD) aims to achieve robust performance in complex scenes by directly leveraging the complementary information from unaligned visible-thermal image pairs, without requiring manual alignment. However, the labor-intensive process of collecting and annotating image pairs limits the scale of existing benchmarks, hindering the advancement of alignment-free RGB-T SOD. In this paper, we construct a large-scale and high-diversity unaligned RGB-T SOD dataset named UVT20K, comprising 20,000 image pairs, 407 scenes, and 1256 object categories. All samples are collected from real-world scenarios with various challenges, such as low illumination, image clutter, complex salient objects, and so on. To support the exploration for further research, each sample in UVT20K is annotated with a comprehensive set of ground truths, including saliency masks, scribbles, boundaries, and challenge attributes. In addition, we propose a Progressive Correlation Network (PCNet), which models inter- and intra-modal correlations on the basis of explicit alignment to achieve accurate predictions in unaligned image pairs. Extensive experiments conducted on unaligned and aligned datasets demonstrate the effectiveness of our method. Code and dataset are available at https://github.com/Angknpng/PCNet.
Authors:Qingyu Xu, Longguang Wang, Weidong Sheng, Yingqian Wang, Chao Xiao, Chao Ma, Wei An
Abstract:
Tracking multiple tiny objects is highly challenging due to their weak appearance and limited features. Existing multi-object tracking algorithms generally focus on single-modality scenes, and overlook the complementary characteristics of tiny objects captured by multiple remote sensors. To enhance tracking performance by integrating complementary information from multiple sources, we propose a novel framework called HGT-Track (Heterogeneous Graph Transformer based Multi-Tiny-Object Tracking). Specifically, we first employ a Transformer-based encoder to embed images from different modalities. Subsequently, we utilize a Heterogeneous Graph Transformer to aggregate spatial and temporal information from multiple modalities to generate detection and tracking features. Additionally, we introduce a target re-detection module (ReDet) to ensure tracklet continuity by maintaining consistency across different modalities. Furthermore, this paper introduces the first benchmark VT-Tiny-MOT (Visible-Thermal Tiny Multi-Object Tracking) for RGB-T fused multiple tiny object tracking. Extensive experiments are conducted on VT-Tiny-MOT, and the results demonstrate the effectiveness of our method. Compared to other state-of-the-art methods, our method achieves better performance in terms of MOTA (Multiple-Object Tracking Accuracy) and ID-F1 score. The code and dataset will be made available at https://github.com/xuqingyu26/HGTMT.
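For readers unfamiliar with the two tracking metrics named in this abstract, the sketch below shows how MOTA and ID-F1 are commonly computed from aggregate counts, following the usual CLEAR-MOT and identity-metric definitions. The function names and the example counts are hypothetical and are not results from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def id_f1(idtp, idfp, idfn):
    """ID-F1 = 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Hypothetical counts for a short sequence.
print(mota(false_negatives=10, false_positives=5, id_switches=1, num_gt=100))
print(id_f1(idtp=80, idfp=10, idfn=10))
```

MOTA can become negative when the total number of errors exceeds the number of ground-truth objects, so it is not bounded below by zero.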
Authors:Tao Zhang, Zhenhai Liu, Feipeng Qi, Yongjun Jiao, Tailin Wu
Abstract:
Multiphysics simulation, which models the interactions between multiple physical processes, and multi-component simulation of complex structures are critical in fields like nuclear and aerospace engineering. Previous studies use numerical solvers or ML-based surrogate models for these simulations. However, multiphysics simulations typically require integrating multiple specialized solvers, each for a specific physical process, into a coupled program, which introduces significant development challenges. Furthermore, existing numerical algorithms struggle with highly complex large-scale structures in multi-component simulations. Here we propose compositional Multiphysics and Multi-component PDE Simulation with Diffusion models (M2PDE) to overcome these challenges. During diffusion-based training, M2PDE learns energy functions modeling the conditional probability of one physical process/component conditioned on other processes/components. In inference, M2PDE generates coupled multiphysics and multi-component solutions by sampling from the joint probability distribution. We evaluate M2PDE on two multiphysics tasks (reaction-diffusion and nuclear thermal coupling), where it achieves more accurate predictions than surrogate models in challenging scenarios. We then apply it to a multi-component prismatic fuel element problem, demonstrating that M2PDE scales from single-component training to a 64-component structure and outperforms existing domain-decomposition and graph-based approaches. The code is available at https://github.com/AI4Science-WestlakeU/M2PDE.
Authors:Hao Tang, Zechao Li, Dong Zhang, Shengfeng He, Jinhui Tang
Abstract:
RGB-Thermal Salient Object Detection aims to pinpoint prominent objects within aligned pairs of visible and thermal infrared images. Traditional encoder-decoder architectures, while designed for cross-modality feature interactions, may not have adequately considered the robustness against noise originating from defective modalities. Inspired by hierarchical human visual systems, we propose the ConTriNet, a robust Confluent Triple-Flow Network employing a Divide-and-Conquer strategy. Specifically, ConTriNet comprises three flows: two modality-specific flows explore cues from RGB and Thermal modalities, and a third modality-complementary flow integrates cues from both modalities. ConTriNet presents several notable advantages. It incorporates a Modality-induced Feature Modulator in the modality-shared union encoder to minimize inter-modality discrepancies and mitigate the impact of defective samples. Additionally, a foundational Residual Atrous Spatial Pyramid Module in the separated flows enlarges the receptive field, allowing for the capture of multi-scale contextual information. Furthermore, a Modality-aware Dynamic Aggregation Module in the modality-complementary flow dynamically aggregates saliency-related cues from both modality-specific flows. Leveraging the proposed parallel triple-flow framework, we further refine saliency maps derived from different flows through a flow-cooperative fusion strategy, yielding a high-quality, full-resolution saliency map for the final prediction. To evaluate the robustness and stability of our approach, we collect a comprehensive RGB-T SOD benchmark, VT-IMAG, covering various real-world challenging scenarios. Extensive experiments on public benchmarks and our VT-IMAG dataset demonstrate that ConTriNet consistently outperforms state-of-the-art competitors in both common and challenging scenarios.
Authors:Gijs Vermariën, Serena Viti, Rahul Ravichandran, Thomas G. Bisbas
Abstract:
We present a novel dataset of simulations of the photodissociation region (PDR) in the Orion Bar and provide benchmarks of emulators for the dataset. Numerical models of PDRs are computationally expensive since the modeling of these changing regions requires resolving the thermal balance and chemical composition along a line-of-sight into an interstellar cloud. This often makes them a bottleneck for 3D simulations of these regions. In this work, we provide a dataset of 8192 models with different initial conditions simulated with 3D-PDR. We then benchmark different architectures, focusing on Augmented Neural Ordinary Differential Equation (ANODE) based models (code can be found at https://github.com/uclchem/neuralpdr). The aim is to obtain fast and robust emulators that can be included as preconditioners of classical codes or as full emulators in 3D simulations of PDRs.
Authors:Pengfei Lyu, Pak-Hei Yeung, Xiaosheng Yu, Chengdong Wu, Jagath C. Rajapakse
Abstract:
The rapid development of deep learning has significantly improved salient object detection (SOD) combining both RGB and thermal (RGB-T) images. However, existing deep learning-based RGB-T SOD models suffer from two major limitations. First, Transformer-based models with quadratic complexity are computationally expensive and memory-intensive, limiting their application in high-resolution bi-modal feature fusion. Second, even when these models converge to an optimal solution, there remains a frequency gap between the prediction and ground-truth. To overcome these limitations, we propose a purely Fourier transform-based model, namely Deep Fourier-Embedded Network (DFENet), for accurate RGB-T SOD. To address the computational complexity when dealing with high-resolution images, we leverage the efficiency of fast Fourier transform with linear complexity to design three key components: (1) the Modal-coordinated Perception Attention, which fuses RGB and thermal modalities with enhanced multi-dimensional representation; (2) the Frequency-decomposed Edge-aware Block, which clarifies object edges by deeply decomposing and enhancing frequency components of low-level features; and (3) the Fourier Residual Channel Attention Block, which prioritizes high-frequency information while aligning channel-wise global relationships. To mitigate the frequency gap, we propose Co-focus Frequency Loss, which dynamically weights hard frequencies during edge frequency reconstruction by cross-referencing bi-modal edge information in the Fourier domain. Extensive experiments on four RGB-T SOD benchmark datasets demonstrate that DFENet outperforms fifteen existing state-of-the-art RGB-T SOD models. Comprehensive ablation studies further validate the value and effectiveness of our newly proposed components. The code is available at https://github.com/JoshuaLPF/DFENet.
Authors:Wassim El Ahmar, Dhanvin Kolhatkar, Farzan Nowruzi, Robert Laganiere
Abstract:
Multiple Object Tracking (MOT) in thermal imaging presents unique challenges due to the lack of visual features and the complexity of motion patterns. This paper introduces an innovative approach to improve MOT in the thermal domain by developing a novel box association method that utilizes both thermal object identity and motion similarity. Our method merges thermal feature sparsity and dynamic object tracking, enabling more accurate and robust MOT performance. Additionally, we present a new dataset comprising a large-scale collection of thermal and RGB images captured in diverse urban environments, serving as both a benchmark for our method and a new resource for thermal imaging. We conduct extensive experiments to demonstrate the superiority of our approach over existing methods, showing significant improvements in tracking accuracy and robustness under various conditions. Our findings suggest that incorporating thermal identity with motion data enhances MOT performance. The newly collected dataset and source code are available at https://github.com/wassimea/thermalMOT
Authors:Yang Zou, Zhixin Chen, Zhipeng Zhang, Xingyuan Li, Long Ma, Jinyuan Liu, Peng Wang, Yanning Zhang
Abstract:
Image super-resolution (SR) is a classical yet still active low-level vision problem that aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts, serving as a key technique for image enhancement. Current approaches to address SR tasks, such as transformer-based and diffusion-based methods, are either dedicated to extracting RGB image features or assuming similar degradation patterns, neglecting the inherent modal disparities between infrared and visible images. When directly applied to infrared image SR tasks, these methods inevitably distort the infrared spectral distribution, compromising the machine perception in downstream tasks. In this work, we emphasize the infrared spectral distribution fidelity and propose a Contourlet refinement gate framework to restore infrared modal-specific features while preserving spectral distribution fidelity. Our approach captures high-pass subbands from multi-scale and multi-directional infrared spectral decomposition to recover infrared-degraded information through a gate architecture. The proposed Spectral Fidelity Loss regularizes the spectral frequency distribution during reconstruction, which ensures the preservation of both high- and low-frequency components and maintains the fidelity of infrared-specific features. We propose a two-stage prompt-learning optimization to guide the model in learning infrared HR characteristics from LR degradation. Extensive experiments demonstrate that our approach outperforms existing image SR models in both visual and perceptual tasks while notably enhancing machine perception in downstream tasks. Our code is available at https://github.com/hey-it-s-me/CoRPLE.
Authors:Ismail Can Yagmur, Hasan F. Ates, Bahadir K. Gunturk
Abstract:
Accurate multispectral image matching presents significant challenges due to non-linear intensity variations across spectral modalities, extreme viewpoint changes, and the scarcity of labeled datasets. Current state-of-the-art methods are typically specialized for a single spectral difference, such as visible-infrared, and struggle to adapt to other modalities due to their reliance on expensive supervision, such as depth maps or camera poses. To address the need for rapid adaptation across modalities, we introduce XPoint, a self-supervised, modular image-matching framework designed for adaptive training and fine-tuning on aligned multispectral datasets, allowing users to customize key components based on their specific tasks. XPoint employs modularity and self-supervision to allow for the adjustment of elements such as the base detector, which generates pseudo-ground-truth keypoints invariant to viewpoint and spectrum variations. The framework integrates a VMamba encoder, pretrained on segmentation tasks, for robust feature extraction, and includes three joint decoder heads: two are dedicated to interest point and descriptor extraction, and a task-specific homography regression head imposes geometric constraints for superior performance in tasks like image registration. This flexible architecture enables quick adaptation to a wide range of modalities, demonstrated by training on Optical-Thermal data and fine-tuning on settings such as visual-near infrared, visual-infrared, visual-longwave infrared, and visual-synthetic aperture radar. Experimental results show that XPoint consistently outperforms or matches state-of-the-art methods in feature matching and image registration tasks across five distinct multispectral datasets. Our source code is available at https://github.com/canyagmur/XPoint.
Authors:Pengfei Lyu, Pak-Hei Yeung, Xiaosheng Yu, Xiufei Cheng, Chengdong Wu, Jagath C. Rajapakse
Abstract:
Unmanned aerial vehicle (UAV)-based bi-modal salient object detection (BSOD) aims to segment salient objects in a scene utilizing complementary cues in unaligned RGB and thermal image pairs. However, the high computational expense of existing UAV-based BSOD models limits their applicability to real-world UAV devices. To address this problem, we propose an efficient Fourier filter network with contrastive learning that achieves both real-time and accurate performance. Specifically, we first design a semantic contrastive alignment loss to align the two modalities at the semantic level, which facilitates mutual refinement in a parameter-free way. Second, inspired by the fast Fourier transform that obtains global relevance in linear complexity, we propose synchronized alignment fusion, which aligns and fuses bi-modal features in the channel and spatial dimensions by a hierarchical filtering mechanism. Our proposed model, AlignSal, reduces the number of parameters by 70.0%, decreases the floating point operations by 49.4%, and increases the inference speed by 152.5% compared to the cutting-edge BSOD model (i.e., MROS). Extensive experiments on the UAV RGB-T 2400 and seven bi-modal dense prediction datasets demonstrate that AlignSal achieves both real-time inference speed and better performance and generalizability compared to nineteen state-of-the-art models across most evaluation metrics. In addition, our ablation studies further verify AlignSal's potential in boosting the performance of existing aligned BSOD models on UAV-based unaligned data. The code is available at: https://github.com/JoshuaLPF/AlignSal.
Authors:Zhicheng Zhao, Juanjuan Gu, Chenglong Li, Chun Wang, Zhongling Huang, Jin Tang
Abstract:
Optics-guided Thermal UAV image Super-Resolution (OTUAV-SR) has attracted significant research interest due to its potential applications in security inspection, agricultural measurement, and object detection. Existing methods often employ a single guidance model to generate the guidance features from optical images to assist thermal UAV image super-resolution. However, single guidance models make it difficult to generate effective guidance features under favorable and adverse conditions in UAV scenarios, thus limiting the performance of OTUAV-SR. To address this issue, we propose a novel Guidance Disentanglement network (GDNet), which disentangles the optical image representation according to typical UAV scenario attributes to form guidance features under both favorable and adverse conditions, for robust OTUAV-SR. Moreover, we design an attribute-aware fusion module to combine all attribute-based optical guidance features, which could form a more discriminative representation and fit the attribute-agnostic guidance process. To facilitate OTUAV-SR research in complex UAV scenarios, we introduce VGTSR2.0, a large-scale benchmark dataset containing 3,500 aligned optical-thermal image pairs captured under diverse conditions and scenes. Extensive experiments on VGTSR2.0 demonstrate that GDNet significantly improves OTUAV-SR performance over state-of-the-art methods, especially in the challenging low-light and foggy environments commonly encountered in UAV scenarios. The dataset and code will be publicly available at https://github.com/Jocelyney/GDNet.
Authors:Dong-Guw Lee, Jeongyun Kim, Younggun Cho, Ayoung Kim
Abstract:
Thermal Infrared (TIR) imaging provides robust perception for navigating in challenging outdoor environments but faces issues with poor texture and low image contrast due to its 14/16-bit format. Conventional methods apply various tone-mapping techniques to enhance the contrast and photometric consistency of TIR images; however, the choice of tone-mapping depends largely on knowing task- and temperature-dependent priors in advance. In this paper, we present Thermal Chameleon Network (TCNet), a task-adaptive tone-mapping approach for RAW 14-bit TIR images. Given the same image, TCNet tone-maps different representations of TIR images tailored for each specific task, eliminating the heuristic image rescaling preprocessing and reliance on extensive prior knowledge of the scene temperature or task-specific characteristics. TCNet exhibits improved generalization performance across object detection and monocular depth estimation, with minimal computational overhead and modular integration into existing architectures for various tasks. Project Page: https://github.com/donkeymouse/ThermalChameleon
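To make the "heuristic image rescaling" mentioned in this abstract concrete, the sketch below shows a plain min-max tone-mapping of raw 14-bit values to 8 bits. It illustrates the conventional baseline only, not TCNet itself. The function name, the clipping parameters, and the sample values are assumptions.

```python
def minmax_tone_map(raw, lo=None, hi=None):
    """Linearly rescale raw 14-bit TIR values to 8-bit [0, 255].

    lo/hi are clipping bounds; when omitted they default to the per-image
    min and max, which is the scene-dependent heuristic.
    """
    flat = [v for row in raw for v in row]
    lo = min(flat) if lo is None else lo
    hi = max(flat) if hi is None else hi
    if hi <= lo:
        return [[0 for _ in row] for row in raw]
    out = []
    for row in raw:
        # Clip to [lo, hi], then map that interval onto [0, 255].
        out.append([round((min(max(v, lo), hi) - lo) * 255 / (hi - lo)) for v in row])
    return out


# Made-up raw counts occupying a narrow band of the 14-bit range.
raw = [[7000, 7100], [7200, 7400]]
print(minmax_tone_map(raw))               # per-image bounds: full 8-bit contrast
print(minmax_tone_map(raw, 0, 2**14 - 1)) # full-range bounds: nearly flat output
```

The two calls show why the bounds matter: per-image bounds stretch the contrast but change from frame to frame, while fixed full-range bounds are consistent but compress the scene into a few gray levels.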
Authors:Yi Liu, Chengxin Li, Shoukun Xu, Jungong Han
Abstract:
Multi-modal fusion has played a vital role in multi-modal scene understanding. Most existing methods focus on cross-modal fusion involving two modalities, often overlooking more complex multi-modal fusion, which is essential for real-world applications like autonomous driving, where visible, depth, event, LiDAR, etc., are used. Besides, the few existing attempts at multi-modal fusion, e.g., simple concatenation, cross-modal attention, and token selection, cannot adequately capture the intrinsic shared and specific details of multiple modalities. To tackle this challenge, we propose a Part-Whole Relational Fusion (PWRF) framework. For the first time, this framework treats multi-modal fusion as part-whole relational fusion. It routes multiple individual part-level modalities to a fused whole-level modality using the part-whole relational routing ability of Capsule Networks (CapsNets). Through this part-whole routing, our PWRF generates modal-shared and modal-specific semantics from the whole-level modal capsules and the routing coefficients, respectively. On top of that, modal-shared and modal-specific details can be employed to solve the issue of multi-modal scene understanding, including synthetic multi-modal segmentation and visible-depth-thermal salient object detection in this paper. Experiments on several datasets demonstrate the superiority of the proposed PWRF framework for multi-modal scene understanding. The source code has been released at https://github.com/liuyi1989/PWRF.
Authors:Andong Lu, Jiacong Zhao, Chenglong Li, Yun Xiao, Bin Luo
Abstract:
The modality gap between RGB and thermal infrared (TIR) images is a crucial but often overlooked issue in existing RGBT tracking methods. It can be observed that the modality gap mainly lies in the image style difference. In this work, we propose a novel Coupled Knowledge Distillation framework called CKD, which pursues common styles of different modalities to break the modality gap, for high performance RGBT tracking. In particular, we introduce two student networks and employ the style distillation loss to make their style features as consistent as possible. By alleviating the style difference between the two student networks, we can effectively break the modality gap between the modalities. However, the distillation of style features might harm the content representations of the two modalities in the student networks. To handle this issue, we take the original RGB and TIR networks as the teachers, and distill their content knowledge into the two student networks respectively by the style-content orthogonal feature decoupling scheme. We couple the above two distillation processes in an online optimization framework to form new feature representations of RGB and thermal modalities without the modality gap. In addition, we incorporate a masked modeling strategy and a multi-modal candidate token elimination strategy into CKD to improve tracking robustness and efficiency respectively. Extensive experiments on five standard RGBT tracking datasets validate the effectiveness of the proposed method against state-of-the-art methods while achieving the fastest tracking speed of 96.4 FPS. Code available at https://github.com/Multi-Modality-Tracking/CKD.
Authors:Yifan Gong, Yushu Wu, Zheng Zhan, Pu Zhao, Liangkai Liu, Chao Wu, Xulong Tang, Yanzhi Wang
Abstract:
Two-stage object detectors exhibit high accuracy and precise localization, especially for identifying small objects that are favorable for various edge applications. However, the high computation costs associated with two-stage detection methods cause more severe thermal issues on edge devices, incurring dynamic runtime frequency change and thus large inference latency variations. Furthermore, the dynamic number of proposals in different frames leads to various computations over time, resulting in further latency variations. The significant latency variations of detectors on edge devices can harm user experience and waste hardware resources. To avoid thermal throttling and provide stable inference speed, we propose Lotus, a novel framework that is tailored for two-stage detectors to dynamically scale CPU and GPU frequencies jointly in an online manner based on deep reinforcement learning (DRL). To demonstrate the effectiveness of Lotus, we implement it on NVIDIA Jetson Orin Nano and Mi 11 Lite mobile platforms. The results indicate that Lotus can consistently and significantly reduce latency variation, achieve faster inference, and maintain lower CPU and GPU temperatures under various settings.
Authors:Robin Gerster, Holger Caesar, Matthias Rapp, Alexander Wolpert, Michael Teutsch
Abstract:
Despite their success in various vision tasks, deep neural network architectures often underperform in out-of-distribution scenarios due to the difference between training and target domain style. To address this limitation, we introduce One-Shot Style Adaptation (OSSA), a novel unsupervised domain adaptation method for object detection that utilizes a single, unlabeled target image to approximate the target domain style. Specifically, OSSA generates diverse target styles by perturbing the style statistics derived from a single target image and then applies these styles to a labeled source dataset at the feature level using Adaptive Instance Normalization (AdaIN). Extensive experiments show that OSSA establishes a new state-of-the-art among one-shot domain adaptation methods by a significant margin, and in some cases, even outperforms strong baselines that use thousands of unlabeled target images. By applying OSSA in various scenarios, including weather, simulated-to-real (sim2real), and visual-to-thermal adaptations, our study explores the overarching significance of the style gap in these contexts. OSSA's simplicity and efficiency allow easy integration into existing frameworks, providing a potentially viable solution for practical applications with limited data availability. Code is available at https://github.com/RobinGerster7/OSSA
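The style-transfer step described in this abstract can be sketched as follows: Adaptive Instance Normalization (AdaIN) re-normalizes a source feature channel to a target mean and standard deviation, and the target statistics are jittered to generate varied styles from one image. This is a minimal single-channel sketch. The Gaussian perturbation, the `noise_scale` parameter, and the function names are assumptions, not the authors' implementation.

```python
import random
import statistics


def adain(content, style_mean, style_std, eps=1e-5):
    """Re-normalize one feature channel to the given style mean and std."""
    mu = statistics.fmean(content)
    sigma = statistics.pstdev(content)
    return [style_std * (x - mu) / (sigma + eps) + style_mean for x in content]


def perturbed_style(target, noise_scale=0.1, rng=random):
    """Draw a (mean, std) pair near the statistics of a single target channel.

    Gaussian jitter scaled by noise_scale is an assumed perturbation scheme.
    """
    mu = statistics.fmean(target)
    sigma = statistics.pstdev(target)
    new_mu = mu + rng.gauss(0.0, noise_scale * sigma)
    new_sigma = abs(sigma * (1.0 + rng.gauss(0.0, noise_scale)))
    return new_mu, new_sigma


# Hypothetical flattened feature channels from a source and a target image.
source_channel = [1.0, 2.0, 3.0, 4.0]
target_channel = [0.0, 2.0, 4.0, 6.0]
mean, std = perturbed_style(target_channel, rng=random.Random(0))
print(adain(source_channel, mean, std))
```

In a real detector this would be applied per channel and per sample to intermediate feature maps, so the labeled source data is seen under many plausible target styles during training.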
Authors:Ruiqiang Xiao, Xiaohu Chen
Abstract:
Crack segmentation is crucial in civil engineering, particularly for assessing pavement integrity and ensuring the durability of infrastructure. While deep learning has advanced RGB-based segmentation, performance degrades under adverse conditions like low illumination or motion blur. Thermal imaging offers complementary information by capturing emitted radiation, improving crack detection in challenging environments. Combining RGB and thermal images (RGB-T) for crack segmentation shows promise in complex real-world conditions, such as adverse weather, yet research in this area remains limited. Current RGB-T segmentation methods often fail to fully exploit the complementary relationships between modalities at various levels of interaction. To address this, we propose IRFusionFormer, a novel model for crack segmentation that effectively integrates RGB and thermal data. Our Efficient RGB-T Cross Fusion Module captures multi-scale relationships and long-range dependencies between modalities without significant computational overhead. Additionally, we introduce the Interaction-Hybrid-Branch-Supervision framework, which enhances interaction between modalities by distributing fused features across branches with joint supervision. To maintain the topological structure of cracks, we introduce a novel topology-based loss function that preserves connectivity during training. Our method achieves state-of-the-art performance, with a Dice score of 90.01% and an IoU of 81.83%, significantly improving robustness and accuracy in varying environmental conditions. These advancements address key challenges in pavement crack segmentation, offering a more reliable and efficient solution. For access to the codes, data, and models from this study, visit https://github.com/sheauhuu/IRFusionFormer
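The two scores quoted in this abstract, Dice and IoU, can be computed from a pair of binary masks as in the sketch below. The function name and the toy masks are illustrative assumptions. Treating two empty masks as a perfect match is one common convention, not something stated in the paper.

```python
def dice_and_iou(pred, gt):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1."""
    inter = pred_sum = gt_sum = 0
    for pr, gr in zip(pred, gt):
        for p, g in zip(pr, gr):
            inter += p & g
            pred_sum += p
            gt_sum += g
    union = pred_sum + gt_sum - inter
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated as a perfect match
    return 2 * inter / (pred_sum + gt_sum), inter / union


# Toy masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
pred = [[1, 1, 0], [0, 1, 0]]
gt = [[1, 0, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, gt)
print(dice, iou)
```

For a single pair of masks the two are linked by Dice = 2·IoU / (1 + IoU). Plugging in IoU = 0.8183 gives roughly 0.900, in line with the reported Dice of 90.01%, although scores averaged over a dataset need not satisfy the identity exactly.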
Authors:Pinxue Guo, Wanyun Li, Hao Huang, Lingyi Hong, Xinyu Zhou, Zhaoyu Chen, Jinglun Li, Kaixun Jiang, Wei Zhang, Wenqiang Zhang
Abstract:
Multi-modal Video Object Segmentation (VOS), including RGB-Thermal, RGB-Depth, and RGB-Event, has garnered attention due to its capability to address challenging scenarios where traditional VOS methods struggle, such as extreme illumination, rapid motion, and background distraction. Existing approaches often involve designing specific additional branches and performing full-parameter fine-tuning for fusion in each task. However, this paradigm not only duplicates research efforts and hardware costs but also risks model collapse with the limited multi-modal annotated data. In this paper, we propose a universal framework named X-Prompt for all multi-modal video object segmentation tasks, designated as RGB+X. The X-Prompt framework first pre-trains a video object segmentation foundation model using RGB data, and then utilizes the additional modality as a prompt to adapt it to downstream multi-modal tasks with limited data. Within the X-Prompt framework, we introduce the Multi-modal Visual Prompter (MVP), which allows prompting the foundation model with the various modalities to segment objects precisely. We further propose the Multi-modal Adaptation Experts (MAEs) to adapt the foundation model with pluggable modality-specific knowledge without compromising the generalization capacity. To evaluate the effectiveness of the X-Prompt framework, we conduct extensive experiments on 3 tasks across 4 benchmarks. The proposed universal X-Prompt framework consistently outperforms the full fine-tuning paradigm and achieves state-of-the-art performance. Code: https://github.com/PinxueGuo/X-Prompt.git
Authors:Xie Zhang, Chenshu Wu
Abstract:
Human sensing has gained increasing attention in various applications. Among the available technologies, visual images offer high accuracy, while sensing on the RF spectrum preserves privacy, creating a conflict between imaging resolution and privacy preservation. In this paper, we explore thermal array sensors as an emerging modality that strikes an excellent resolution-privacy balance for ubiquitous sensing. To this end, we present TADAR, the first multi-user Thermal Array-based Detection and Ranging system that estimates the inherently missing range information, extending thermal array outputs from 2D thermal pixels to 3D depths and empowering them as a promising modality for ubiquitous privacy-preserving human sensing. We prototype TADAR using a single commodity thermal array sensor and conduct extensive experiments in different indoor environments. Our results show that TADAR achieves a mean F1 score of 88.8% for multi-user detection and a mean accuracy of 32.0 cm for multi-user ranging, which further improves to 20.1 cm for targets located within 3 m. We conduct two case studies on fall detection and occupancy estimation to showcase the potential applications of TADAR. We hope TADAR will inspire the vast community to explore new directions of thermal array sensing, beyond wireless and acoustic sensing. TADAR is open-sourced on GitHub: https://github.com/aiot-lab/TADAR.
Authors:Qian Chen, Shihao Shu, Xiangzhi Bai
Abstract:
Novel-view synthesis based on visible light has been extensively studied. In comparison to visible light imaging, thermal infrared imaging offers the advantage of all-weather imaging and strong penetration, providing increased possibilities for reconstruction in nighttime and adverse weather scenarios. However, thermal infrared imaging is influenced by physical characteristics such as atmospheric transmission effects and thermal conduction, hindering the precise reconstruction of intricate details in thermal infrared scenes, manifesting as issues of floaters and indistinct edge features in synthesized images. To address these limitations, this paper introduces a physics-induced 3D Gaussian splatting method named Thermal3D-GS. Thermal3D-GS begins by modeling atmospheric transmission effects and thermal conduction in three-dimensional media using neural networks. Additionally, a temperature consistency constraint is incorporated into the optimization objective to enhance the reconstruction accuracy of thermal infrared images. Furthermore, to validate the effectiveness of our method, the first large-scale benchmark dataset for this field named Thermal Infrared Novel-view Synthesis Dataset (TI-NSD) is created. This dataset comprises 20 authentic thermal infrared video scenes, covering indoor, outdoor, and UAV (Unmanned Aerial Vehicle) scenarios, totaling 6,664 frames of thermal infrared image data. Based on this dataset, this paper experimentally verifies the effectiveness of Thermal3D-GS. The results indicate that our method outperforms the baseline method with a 3.03 dB improvement in PSNR and significantly addresses the issues of floaters and indistinct edge features present in the baseline method. Our dataset and codebase will be released at https://github.com/mzzcdf/Thermal3DGS.
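To put the reported PSNR gain in context, the sketch below computes PSNR between a rendered and a reference image and converts a dB gain into the equivalent reduction factor in mean squared error. The function names and the toy images are illustrative assumptions, with pixel values taken to lie in [0, 1].

```python
import math


def psnr(pred, ref, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(p - r) ** 2 for pr, rr in zip(pred, ref) for p, r in zip(pr, rr)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain at fixed max_val."""
    return 10.0 ** (gain_db / 10.0)


# Toy 2x2 images (made-up values).
ref = [[0.5, 0.5], [0.5, 0.5]]
pred = [[0.6, 0.4], [0.5, 0.5]]
print(psnr(pred, ref))
print(mse_ratio_for_gain(3.03))
```

For a single image, a gain of about 3 dB corresponds to roughly halving the mean squared error. PSNR averaged over a dataset does not translate into an MSE ratio quite so directly.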
Authors:Rongfeng Lu, Hangyu Chen, Zunjie Zhu, Yuhang Qin, Ming Lu, Le Zhang, Chenggang Yan, Anke Xue
Abstract:
Thermography is especially valuable for the military and other users of surveillance cameras. Some recent methods based on Neural Radiance Fields (NeRF) have been proposed to reconstruct thermal scenes in 3D from a set of thermal and RGB images. However, unlike NeRF, 3D Gaussian splatting (3DGS) prevails due to its rapid training and real-time rendering. In this work, we propose ThermalGaussian, the first thermal 3DGS approach capable of rendering high-quality images in RGB and thermal modalities. We first calibrate the RGB camera and the thermal camera to ensure that both modalities are accurately aligned. Subsequently, we use the registered images to learn the multimodal 3D Gaussians. To prevent the overfitting of any single modality, we introduce several multimodal regularization constraints. We also develop smoothing constraints tailored to the physical characteristics of the thermal modality. In addition, we contribute a real-world dataset named RGBT-Scenes, captured by a handheld thermal-infrared camera, facilitating future research on thermal scene reconstruction. We conduct comprehensive experiments to show that ThermalGaussian achieves photorealistic rendering of thermal images and improves the rendering quality of RGB images. With the proposed multimodal regularization constraints, we also reduce the model's storage cost by 90%. Our project page is at https://thermalgaussian.github.io/.
Authors:Ang He, Xiaobo Li, Ximei Wu, Chengyue Su, Jing Chen, Sheng Xu, Xiaobin Guo
Abstract:
Unmanned aerial vehicles (UAVs) equipped with thermal infrared (TIR) cameras play a crucial role in combating nocturnal wildlife poaching. However, TIR images often face challenges such as jitter and wildlife overlap, requiring UAVs to identify blurred and overlapping small targets. Current traditional lightweight networks deployed on UAVs struggle to extract features from blurry small targets. To address this issue, we developed ALSS-YOLO, an efficient and lightweight detector optimized for TIR aerial images. Firstly, we propose a novel Adaptive Lightweight Channel Split and Shuffling (ALSS) module. This module employs an adaptive channel split strategy to optimize feature extraction and integrates a channel shuffling mechanism to enhance information exchange between channels. This improves the extraction of blurry features, crucial for handling jitter-induced blur and overlapping targets. Secondly, we developed a Lightweight Coordinate Attention (LCA) module that employs adaptive pooling and grouped convolution to integrate feature information across dimensions. This module ensures lightweight operation while maintaining high detection precision and robustness against jitter and target overlap. Additionally, we developed a single-channel focus module to aggregate the width and height information of each channel into four-dimensional channel fusion, which improves the feature representation efficiency of infrared images. Finally, we modify the localization loss function to emphasize the loss value associated with small objects to improve localization accuracy. Extensive experiments on the BIRDSAI and ISOD TIR UAV wildlife datasets show that ALSS-YOLO achieves state-of-the-art performance. Our code is openly available at https://github.com/helloworlder8/computer_vision.
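The channel shuffling mentioned in the abstract above can be illustrated with the generic ShuffleNet-style operation. The abstract does not specify how ALSS implements its shuffle or its adaptive split, so the sketch below is an assumption: `channel_shuffle` is a hypothetical helper that works on a plain list of channel indices rather than on feature tensors.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (generic ShuffleNet-style shuffle).

    channels: list of channel identifiers, laid out group after group.
    groups:   number of groups; must divide the channel count.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # View the list as a (groups x per_group) grid and read it column by column,
    # so each output neighbourhood mixes channels from every group.
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]
```

With six channels in two groups, the channels of the two groups alternate in the output, which is what lets a following grouped convolution see information from both groups.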
Authors:Zhengyi Liu, Sheng Deng, Xinrui Wang, Linbo Wang, Xianyong Fang, Bin Tang
Abstract:
Scribble supervised salient object detection (SSSOD) learns to segment attractive objects from their surroundings under the supervision of sparse scribble labels. For better segmentation, depth and thermal infrared modalities serve as supplements to RGB images in complex scenes. Existing methods design separate feature extraction and multi-modal fusion strategies for RGB, RGB-Depth, RGB-Thermal, and Visual-Depth-Thermal image inputs, leading to a flood of similar models. As the recently proposed Segment Anything Model (SAM) possesses extraordinary segmentation and prompt interactive capability, we propose an SSSOD family based on SAM, named SSFam, for combined inputs of different modalities. Firstly, different modal-aware modulators are designed to attain modal-specific knowledge, which cooperates with modal-agnostic information extracted from the frozen SAM encoder for a better feature ensemble. Secondly, a siamese decoder is tailored to bridge the gap between training with scribble prompts and testing with no prompt, for stronger decoding ability. Our model demonstrates remarkable performance across combinations of different modalities, sets a new state of the art among scribble supervised methods, and comes close to fully supervised methods. Code: https://github.com/liuzywen/SSFam
Authors:Zihan Qin, Jialei Xu, Wenbo Zhao, Junjun Jiang, Xianming Liu
Abstract:
Depth estimation under adverse conditions remains a significant challenge. Recently, multi-spectral depth estimation, which integrates both visible light and thermal images, has shown promise in addressing this issue. However, existing algorithms struggle with precise pixel-level feature matching, limiting their ability to fully exploit geometric constraints across different spectra. To address this, we propose a novel framework incorporating stereo depth estimation to enforce accurate geometric constraints. In particular, we treat the visible light and thermal images as a stereo pair and utilize a Cross-modal Feature Matching (CFM) Module to construct a cost volume for pixel-level matching. To mitigate the effects of poor lighting on stereo matching, we introduce Degradation Masking, which leverages robust monocular thermal depth estimation in degraded regions. Our method achieves state-of-the-art (SOTA) performance on the Multi-Spectral Stereo (MS2) dataset, with qualitative evaluations demonstrating high-quality depth maps under varying lighting conditions.
Authors:Nan Zhou, Huandong Wang, Jiahao Li, Yang Li, Xiao-Ping Zhang, Yong Li, Xinlei Chen
Abstract:
Fine-grained fire prediction plays a crucial role in emergency response. Infrared images and fire masks provide complementary thermal and boundary information, yet current methods are predominantly limited to binary mask modeling with inherent signal sparsity, failing to capture the complex dynamics of fire. While world models show promise in video generation, their physical inconsistencies pose significant challenges for fire forecasting. This paper introduces PhysFire-WM, a Physics-informed World Model for emulating Fire spread dynamics. Our approach internalizes combustion dynamics by encoding structured priors from a Physical Simulator to rectify physical discrepancies, coupled with a Cross-task Collaborative Training strategy (CC-Train) that alleviates the issue of limited information in mask-based modeling. Through parameter sharing and gradient coordination, CC-Train effectively integrates thermal radiation dynamics and spatial boundary delineation, enhancing both physical realism and geometric accuracy. Extensive experiments on a fine-grained multimodal fire dataset demonstrate the superior accuracy of PhysFire-WM in fire spread prediction. Validation underscores the importance of physical priors and cross-task collaboration, providing new insights for applying physics-informed world models to disaster prediction.
Authors:Zhicheng Zhao, Fengjiao Peng, Jinquan Yan, Wei Lu, Chenglong Li, Jin Tang
Abstract:
Optics-guided thermal UAV image super-resolution has attracted significant research interest due to its potential in all-weather monitoring applications. However, existing methods typically compress optical features to match thermal feature dimensions for cross-modal alignment and fusion, which not only causes the loss of high-frequency information that is beneficial for thermal super-resolution, but also introduces physically inconsistent artifacts such as texture distortions and edge blurring by overlooking differences in the imaging physics between modalities. To address these challenges, we propose PCNet to achieve cross-resolution mutual enhancement between optical and thermal modalities, while physically constraining the optical guidance process via thermal conduction to enable robust thermal UAV image super-resolution. In particular, we design a Cross-Resolution Mutual Enhancement Module (CRME) to jointly optimize thermal image super-resolution and optical-to-thermal modality conversion, facilitating effective bidirectional feature interaction across resolutions while preserving high-frequency optical priors. Moreover, we propose a Physics-Driven Thermal Conduction Module (PDTM) that incorporates two-dimensional heat conduction into optical guidance, modeling spatially-varying heat conduction properties to prevent inconsistent artifacts. In addition, we introduce a temperature consistency loss that enforces regional distribution consistency and boundary gradient smoothness to ensure generated thermal images align with real-world thermal radiation principles. Extensive experiments on VGTSR2.0 and DroneVehicle datasets demonstrate that PCNet significantly outperforms state-of-the-art methods on both reconstruction quality and downstream tasks including semantic segmentation and object detection.
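To make the "two-dimensional heat conduction" idea in the abstract above concrete, here is a minimal sketch of one explicit finite-difference step of the 2D heat equation on a grid. It is a textbook scheme, not PCNet's PDTM: the function name `heat_step`, the single uniform coefficient `alpha` (the abstract describes spatially-varying properties), and the insulated borders are all simplifying assumptions.

```python
def heat_step(temp, alpha=0.2):
    """One explicit finite-difference step of 2D heat diffusion.

    temp:  2D list of temperatures.
    alpha: diffusion number (conductivity * dt / dx^2); the explicit
           scheme is stable for alpha <= 0.25.
    Borders are treated as insulated: missing neighbours contribute no flux.
    """
    rows, cols = len(temp), len(temp[0])
    out = [row[:] for row in temp]
    for i in range(rows):
        for j in range(cols):
            flux = 0.0
            # Sum temperature differences to the four direct neighbours.
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    flux += temp[ni][nj] - temp[i][j]
            out[i][j] = temp[i][j] + alpha * flux
    return out
```

A single hot pixel spreads to its neighbours while the total heat on the grid is conserved, which is the kind of smooth, physically plausible behaviour a conduction prior encourages.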
Authors:Aihua Zheng, Ya Gao, Shihao Li, Chenglong Li, Jin Tang
Abstract:
Multi-modal vehicle Re-Identification (ReID) aims to leverage complementary information from RGB, Near Infrared (NIR), and Thermal Infrared (TIR) modalities to retrieve the same vehicle. The challenges of multi-modal vehicle ReID arise from the uncertainty of modality quality distribution induced by inherent discrepancies across modalities, resulting in distinct conflicting fusion requirements for data with balanced and unbalanced quality distributions. Existing methods handle all multi-modal data within a single fusion model, overlooking the different needs of the two data types and making it difficult to decouple the conflict between intra-class consistency and inter-modal heterogeneity. To this end, we propose Disentangle Collaboration and Guidance Fusion Representations for Multi-modal Vehicle ReID (DCG-ReID). Specifically, to disentangle heterogeneous quality-distributed modal data without mutual interference, we first design the Dynamic Confidence-based Disentangling Weighting (DCDW) mechanism: dynamically reweighting three-modal contributions via interaction-derived modal confidence to build a disentangled fusion framework. Building on DCDW, we develop two scenario-specific fusion strategies: (1) for balanced quality distributions, Collaboration Fusion Module (CFM) mines pairwise consensus features to capture shared discriminative information and boost intra-class consistency; (2) for unbalanced distributions, Guidance Fusion Module (GFM) implements differential amplification of modal discriminative disparities to reinforce dominant modality advantages, guide auxiliary modalities to mine complementary discriminative info, and mitigate inter-modal divergence to boost multi-modal joint decision performance. Extensive experiments on three multi-modal ReID benchmarks (WMVeID863, MSVR310, RGBNT100) validate the effectiveness of our method. Code will be released upon acceptance.
Authors:Andong Lu, Mai Wen, Jinhu Wang, Yuanzhi Guo, Chenglong Li, Jin Tang, Bin Luo
Abstract:
Existing multimodal tracking studies focus on bi-modal scenarios such as RGB-Thermal, RGB-Event, and RGB-Language. Although promising tracking performance is achieved through leveraging complementary cues from different sources, it remains challenging in complex scenes due to the limitations of bi-modal scenarios. In this work, we introduce a general multimodal visual tracking task that fully exploits the advantages of four modalities, including RGB, thermal infrared, event, and language, for robust tracking under challenging conditions. To provide a comprehensive evaluation platform for general multimodal visual tracking, we construct QuadTrack600, a large-scale, high-quality benchmark comprising 600 video sequences (totaling 384.7K high-resolution (640x480) frame groups). In each frame group, all four modalities are spatially aligned and meticulously annotated with bounding boxes, while 21 sequence-level challenge attributes are provided for detailed performance analysis. Although quad-modal data provides richer information, the differences in information quantity among modalities and the computational burden from four modalities are two challenging issues in fusing four modalities. To handle these issues, we propose a novel approach called QuadFusion, which incorporates an efficient Multiscale Fusion Mamba with four different scanning scales to achieve sufficient interactions of the four modalities while overcoming the exponential computational burden, for general multimodal visual tracking. Extensive experiments on the QuadTrack600 dataset and three bi-modal tracking datasets, including LasHeR, VisEvent, and TNL2K, validate the effectiveness of our QuadFusion.
Authors:Yifei Deng, Chenglong Li, Zhenyu Chen, Zihen Xu, Jin Tang
Abstract:
The performance of traditional text-image person retrieval tasks is easily affected by lighting variations due to imaging limitations of visible spectrum sensors. In recent years, cross-modal information fusion has emerged as an effective strategy to enhance retrieval robustness. By integrating complementary information from different spectral modalities, it becomes possible to achieve more stable person recognition and matching under complex real-world conditions. Motivated by this, we introduce a novel task: Text-RGBT Person Retrieval, which incorporates cross-spectrum information fusion by combining the complementary cues from visible and thermal modalities for robust person retrieval in challenging environments. The key challenge of Text-RGBT person retrieval lies in aligning text with multi-modal visual features. However, the inherent heterogeneity between visible and thermal modalities may interfere with the alignment between vision and language. To handle this problem, we propose a Decoupled Cross-modal Alignment network (DCAlign), which thoroughly mines the relationships between the text and both modality-specific and modality-collaborative visual features, for Text-RGBT person retrieval. To promote the research and development of this field, we create a high-quality Text-RGBT person retrieval dataset, RGBT-PEDES. RGBT-PEDES contains 1,822 identities from different age groups and genders with 4,723 pairs of calibrated RGB and T images, and covers highly diverse scenes from both daytime and nighttime with a variety of challenges such as occlusion, weak alignment and adverse lighting conditions. Additionally, we carefully annotate 7,987 fine-grained textual descriptions for all RGBT person image pairs. Extensive experiments on RGBT-PEDES demonstrate that our method outperforms existing text-image person retrieval methods.
Authors:Chenglong Li, Tao Wang, Zhaodong Ding, Yun Xiao, Jin Tang
Abstract:
RGBT tracking usually suffers from various challenging factors such as low resolution, similar appearance, extreme illumination, thermal crossover and occlusion, to name a few. Existing works often study complex fusion models to handle challenging scenarios, but cannot adapt well to various challenges, which might limit tracking performance. To handle this problem, we propose a novel Dynamic Disentangled Fusion Network called DDFNet, which disentangles the fusion process into several dynamic fusion models via the challenge attributes to adapt to various challenging scenarios, for robust RGBT tracking. In particular, we design six attribute-based fusion models to integrate RGB and thermal features under the six challenging scenarios respectively. Since each fusion model deals with its corresponding challenge, such a disentangled fusion scheme could increase the fusion capacity without depending on large-scale training data. Considering that every challenging scenario also has different levels of difficulty, we propose to optimize the combination of multiple fusion units to form each attribute-based fusion model in a dynamic manner, which could adapt well to the difficulty of the corresponding challenging scenario. To address the question of which fusion models should be activated in the tracking process, we design an adaptive aggregation fusion module to integrate all features from attribute-based fusion models in an adaptive manner with a three-stage training algorithm. In addition, we design an enhancement fusion module to further strengthen the aggregated feature and modality-specific features. Experimental results on benchmark datasets demonstrate the effectiveness of our DDFNet against other state-of-the-art methods.
Authors:Prashant Kumar Choudhary, Nouhaila Innan, Muhammad Shafique, Rajeev Singh
Abstract:
Quantum circuit design is a key bottleneck for practical quantum machine learning on complex, real-world data. We present an automated framework that discovers and refines variational quantum circuits (VQCs) using graph-based Bayesian optimization with a graph neural network (GNN) surrogate. Circuits are represented as graphs and mutated and selected via an expected improvement acquisition function informed by surrogate uncertainty with Monte Carlo dropout. Candidate circuits are evaluated with a hybrid quantum-classical variational classifier on the next generation firewall telemetry and network internet of things (NF-ToN-IoT-V2) cybersecurity dataset, after feature selection and scaling for quantum embedding. We benchmark our pipeline against an MLP-based surrogate, random search, and greedy GNN selection. The GNN-guided optimizer consistently finds circuits with lower complexity and competitive or superior classification accuracy compared to all baselines. Robustness is assessed via a noise study across standard quantum noise channels, including amplitude damping, phase damping, thermal relaxation, depolarizing, and readout bit flip noise. The implementation is fully reproducible, with time benchmarking and export of best found circuits, providing a scalable and interpretable route to automated quantum circuit discovery.
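The expected improvement acquisition function named in the abstract above has a standard closed form when the surrogate's prediction is treated as Gaussian. The sketch below shows that form for a maximization objective, with the mean and spread estimated from repeated stochastic (dropout) forward passes. The function names, the use of the population standard deviation, and the sample values are assumptions for illustration; the paper's actual surrogate and hyper-parameters are not described in the abstract.

```python
import math
import statistics


def expected_improvement(mean, std, best_so_far):
    """Closed-form expected improvement (maximization) for a Gaussian prediction."""
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mean - best_so_far) * cdf + std * pdf


def ei_from_dropout_samples(samples, best_so_far):
    """Estimate mean and spread from stochastic forward passes, then score with EI."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    return expected_improvement(mean, std, best_so_far)
```

A candidate whose predicted mean equals the current best still receives a positive score when the surrogate is uncertain, which is how the acquisition function trades off exploration against exploitation.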
Authors:Lanhu Wu, Zilin Gao, Hao Fei, Mong-Li Lee, Wynne Hsu
Abstract:
RGB-D salient object detection (SOD) aims to identify the most conspicuous objects in a scene with the incorporation of depth cues. Existing methods mainly rely on CNNs, which are limited by local receptive fields, or Vision Transformers, which suffer from quadratic complexity, posing a challenge in balancing performance and computational efficiency. Recently, state space models (SSMs) such as Mamba have shown great potential for modeling long-range dependency with linear complexity. However, directly applying SSMs to RGB-D SOD may lead to deficient local semantics as well as inadequate cross-modality fusion. To address these issues, we propose a Local Emphatic and Adaptive Fusion state space model (LEAF-Mamba) that contains two novel components: 1) a local emphatic state space module (LE-SSM) to capture multi-scale local dependencies for both modalities; and 2) an SSM-based adaptive fusion module (AFM) for complementary cross-modality interaction and reliable cross-modality integration. Extensive experiments demonstrate that LEAF-Mamba consistently outperforms 16 state-of-the-art RGB-D SOD methods in both efficacy and efficiency. Moreover, our method achieves excellent performance on the RGB-T SOD task, demonstrating strong generalization ability.
Authors:Jinhao Li, Zijian Chen, Lirong Deng, Changbo Wang, Guangtao Zhai
Abstract:
Person re-identification (ReID) aims to retrieve images of a person of interest from a gallery, with wide applications in medical rehabilitation, abnormal behavior detection, and public security. However, traditional person ReID models suffer from uni-modal capability, leading to poor generalization ability in multi-modal data, such as RGB, thermal, infrared, sketch images, textual descriptions, etc. Recently, the emergence of multi-modal large language models (MLLMs) shows a promising avenue for addressing this problem. Despite this potential, existing methods merely regard MLLMs as feature extractors or caption generators, which do not fully unleash their reasoning, instruction-following, and cross-modal understanding capabilities. To bridge this gap, we introduce MMReID-Bench, the first multi-task multi-modal benchmark specifically designed for person ReID. MMReID-Bench includes 20,710 multi-modal queries and gallery images covering 10 different person ReID tasks. Comprehensive experiments demonstrate the remarkable capabilities of MLLMs in delivering effective and versatile person ReID. Nevertheless, they also have limitations in handling a few modalities, particularly thermal and infrared data. We hope MMReID-Bench can help the community develop more robust and generalizable multimodal foundation models for person ReID.
Authors:Tangin Amir Smrity, MD Zahin Muntaqim Hasan Muhammad Kafi, Abu Saleh Musa Miah, Najmul Hassan, Yuichi Okuyama, Nobuyoshi Asai, Taro Suzuki, Jungpil Shin
Abstract:
Induction motors (IMs) are indispensable in industrial and daily life, but they are susceptible to various faults that can lead to overheating, wasted energy consumption, and service failure. Early detection of faults is essential to protect the motor and prolong its lifespan. This paper presents a hybrid method that integrates BYOL with CNNs for classifying thermal images of induction motors for fault detection. The thermal dataset used in this work includes different operating states of the motor, such as normal operation, overload, and faults. We employed multiple deep learning (DL) models for the BYOL technique, including popular architectures such as ResNet-50, DenseNet-121, DenseNet-169, EfficientNetB0, VGG16, and MobileNetV2. Additionally, we introduced a new high-performance yet lightweight CNN model named BYOL-IMNet, which comprises four custom-designed blocks tailored for fault classification in thermal images. Our experimental results demonstrate that the proposed BYOL-IMNet achieves 99.89% test accuracy and an inference time of 5.7 ms per image, outperforming state-of-the-art models. This study highlights the promising performance of the CNN-BYOL hybrid method in enhancing accuracy for detecting faults in induction motors, offering a robust methodology for online monitoring in industrial settings.
Authors:Muhammad Haris Khan, Miguel Altamirano Cabrera, Dmitrii Iarchuk, Yara Mahmoud, Daria Trinitatova, Issatay Tokmurziyev, Dzmitry Tsetserukou
Abstract:
This paper introduces HapticVLM, a novel multimodal system that integrates vision-language reasoning with deep convolutional networks to enable real-time haptic feedback. HapticVLM leverages a ConvNeXt-based material recognition module to generate robust visual embeddings for accurate identification of object materials, while a state-of-the-art Vision-Language Model (Qwen2-VL-2B-Instruct) infers ambient temperature from environmental cues. The system synthesizes tactile sensations by delivering vibrotactile feedback through speakers and thermal cues via a Peltier module, thereby bridging the gap between visual perception and tactile experience. Experimental evaluations demonstrate an average recognition accuracy of 84.67% across five distinct auditory-tactile patterns and a temperature estimation accuracy of 86.7% based on a tolerance-based evaluation method with an 8°C margin of error across 15 scenarios. Although promising, the current study is limited by the use of a small set of prominent patterns and a modest participant pool. Future work will focus on expanding the range of tactile patterns and increasing user studies to further refine and validate the system's performance. Overall, HapticVLM presents a significant step toward context-aware, multimodal haptic interaction with potential applications in virtual reality and assistive technologies.
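The tolerance-based evaluation mentioned in the abstract above can be read as counting a temperature estimate as correct when it falls within a fixed margin of the reference value. The sketch below follows that reading; the function name `tolerance_accuracy` and the inclusive comparison at the margin are assumptions, since the abstract does not spell out the exact protocol.

```python
def tolerance_accuracy(predicted, actual, tolerance=8.0):
    """Fraction of predictions within +/- tolerance of the reference value.

    predicted, actual: equally long sequences of temperatures in deg C.
    tolerance:         allowed absolute error (8 deg C in the abstract above).
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # A prediction counts as a hit when its absolute error is within the margin.
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)
```

For example, with hypothetical estimates of 20, 30 and 5 against references of 25, 21 and 5, two of the three fall within 8°C, giving an accuracy of about 0.67.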
Authors:Peiran Peng, Tingfa Xu, Liqiang Song, Mengqi Zhu, Yuqiang Fang, Jianan Li
Abstract:
Detecting tiny objects in multimodal Red-Green-Blue-Thermal (RGBT) imagery is a critical challenge in computer vision, particularly in surveillance, search and rescue, and autonomous navigation. Drone-based scenarios exacerbate these challenges due to spatial misalignment, low-light conditions, occlusion, and cluttered backgrounds. Current methods struggle to leverage the complementary information between visible and thermal modalities effectively. We propose COXNet, a novel framework for RGBT tiny object detection, addressing these issues through three core innovations: i) the Cross-Layer Fusion Module, fusing high-level visible and low-level thermal features for enhanced semantic and spatial accuracy; ii) the Dynamic Alignment and Scale Refinement module, correcting cross-modal spatial misalignments and preserving multi-scale features; and iii) an optimized label assignment strategy using the GeoShape Similarity Measure for better localization. COXNet achieves a 3.32% mAP50 improvement on the RGBTDronePerson dataset over state-of-the-art methods, demonstrating its effectiveness for robust detection in complex environments.
Authors:Ce Zhang, Zifu Wan, Simon Stepputtis, Katia Sycara, Yaqi Xie
Abstract:
Semantic segmentation relying solely on RGB data often struggles in challenging conditions such as low illumination and obscured views, limiting its reliability in critical applications like autonomous driving. To address this, integrating additional thermal radiation data with RGB images demonstrates enhanced performance and robustness. However, how to effectively reconcile the modality discrepancies and fuse the RGB and thermal features remains a well-known challenge. In this work, we address this challenge from a novel spectral perspective. We observe that the multi-modal features can be categorized into two spectral components: low-frequency features that provide broad scene context, including color variations and smooth areas, and high-frequency features that capture modality-specific details such as edges and textures. Inspired by this, we propose the Spectral-aware Global Fusion Network (SGFNet) to effectively enhance and fuse the multi-modal features by explicitly modeling the interactions between the high-frequency, modality-specific features. Our experimental results demonstrate that SGFNet outperforms the state-of-the-art methods on the MFNet and PST900 datasets.
Authors:Penelope Brown, Julie Stephany Berrio Perez, Mao Shan, Stewart Worrall
Abstract:
Vulnerable road users (VRUs) such as pedestrians, cyclists, and motorcyclists represent more than half of global traffic deaths, yet their detection remains challenging in poor lighting, adverse weather, and unbalanced data sets. This paper presents a multimodal detection framework that integrates RGB and thermal infrared imaging with a fine-tuned YOLOv8 model. Training leveraged KITTI, BDD100K, and Teledyne FLIR datasets, with class re-weighting and light augmentations to improve minority-class performance and robustness. Experiments show that 640-pixel resolution and partial backbone freezing optimise accuracy and efficiency, while class-weighted losses enhance recall for rare VRUs. Results highlight that thermal models achieve the highest precision, and RGB-to-thermal augmentation boosts recall, demonstrating the potential of multimodal detection to improve VRU safety at intersections.
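Class re-weighting, as mentioned in the abstract above, is often done with inverse-frequency weights so that rare classes contribute more to the loss. The abstract does not say which weighting scheme was used, so the "balanced" formula below (total / (number of classes * class count)) is an assumption, as are the function name and the example labels.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class loss weights inversely proportional to class frequency.

    Uses the common 'balanced' heuristic: total / (num_classes * count),
    so a perfectly balanced dataset gives every class a weight of 1.0.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {cls: total / (num_classes * n) for cls, n in counts.items()}
```

With eight "car" labels and two "cyclist" labels, the cyclist class receives a weight of 2.5 against 0.625 for cars, so errors on the rare class are penalised four times as heavily.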
Authors:Meng Yu, Te Cui, Qitong Chu, Wenjie Song, Yi Yang, Yufeng Yue
Abstract:
Reliable semantic segmentation of open environments is essential for intelligent systems, yet significant problems remain: 1) Existing RGB-T semantic segmentation models mainly rely on low-level visual features and lack high-level textual information, and therefore struggle with accurate segmentation when categories share similar visual characteristics. 2) While SAM excels in instance-level segmentation, integrating it with thermal images and text is hindered by modality heterogeneity and computational inefficiency. To address these, we propose TASeg, a text-aware RGB-T segmentation framework that uses Low-Rank Adaptation (LoRA) fine-tuning to adapt vision foundation models. Specifically, we propose a Dynamic Feature Fusion Module (DFFM) in the image encoder, which effectively merges features from multiple visual modalities while freezing SAM's original transformer blocks. Additionally, we incorporate CLIP-generated text embeddings in the mask decoder to enable semantic alignment, which further rectifies classification errors and improves semantic understanding accuracy. Experimental results across diverse datasets demonstrate that our method achieves superior performance in challenging scenarios with fewer trainable parameters.
Authors:Meng Yu, Luojie Yang, Xunjie He, Yi Yang, Yufeng Yue
Abstract:
Semantic segmentation is a critical technique for effective scene understanding. Traditional RGB-T semantic segmentation models often struggle to generalize across diverse scenarios due to their reliance on pretrained models and predefined categories. Recent advancements in Visual Language Models (VLMs) have facilitated a shift from closed-set to open-vocabulary semantic segmentation methods. However, these models face challenges in dealing with intricate scenes, primarily due to the heterogeneity between RGB and thermal modalities. To address this gap, we present Open-RGBT, a novel open-vocabulary RGB-T semantic segmentation model. Specifically, we obtain instance-level detection proposals by incorporating visual prompts to enhance category understanding. Additionally, we employ the CLIP model to assess image-text similarity, which helps correct semantic consistency and mitigates ambiguities in category identification. Empirical evaluations demonstrate that Open-RGBT achieves superior performance in diverse and challenging real-world scenarios, even in the wild, significantly advancing the field of RGB-T semantic segmentation.
Authors:Xue-Feng Zhu, Tianyang Xu, Yifan Pan, Jinjie Gu, Xi Li, Jiwen Lu, Xiao-Jun Wu, Josef Kittler
Abstract:
Existing multi-modal object tracking approaches primarily focus on dual-modal paradigms, such as RGB-Depth or RGB-Thermal, yet remain challenged in complex scenarios due to limited input modalities. To address this gap, this work introduces a novel multi-modal tracking task that leverages three complementary modalities, including visible RGB, Depth (D), and Thermal Infrared (TIR), aiming to enhance robustness in complex scenarios. To support this task, we construct a new multi-modal tracking dataset, coined RGBDT500, which consists of 500 videos with synchronised frames across the three modalities. Each frame provides spatially aligned RGB, depth, and thermal infrared images with precise object bounding box annotations. Furthermore, we propose a novel multi-modal tracker, dubbed RDTTrack. RDTTrack integrates tri-modal information for robust tracking by leveraging a pretrained RGB-only tracking model and prompt learning techniques. Specifically, RDTTrack fuses thermal infrared and depth modalities under a proposed orthogonal projection constraint, then integrates them with RGB signals as prompts for the pre-trained foundation tracking model, effectively harmonising tri-modal complementary cues. The experimental results demonstrate the effectiveness and advantages of the proposed method, showing significant improvements over existing dual-modal approaches in terms of tracking accuracy and robustness in complex scenarios.
Authors:Zhangyong Tang, Tianyang Xu, Xuefeng Zhu, Hui Li, Shaochuan Zhao, Tao Zhou, Chunyang Cheng, Xiaojun Wu, Josef Kittler
Abstract:
The development of smart cities has led to the generation of massive amounts of multi-modal data in the context of a range of tasks that enable a comprehensive monitoring of the smart city infrastructure and services. This paper surveys one of the most critical tasks, multi-modal visual object tracking (MMVOT), from the perspective of multimodality analysis. Generally, MMVOT differs from single-modal tracking in four key aspects: data collection, modality alignment and annotation, model design, and evaluation. Accordingly, we begin with an introduction to the relevant data modalities, laying the groundwork for their integration. This naturally leads to a discussion of challenges of multi-modal data collection, alignment, and annotation. Subsequently, existing MMVOT methods are categorised, based on different ways to deal with visible (RGB) and X modalities: programming the auxiliary X branch with replicated or non-replicated experimental configurations from the RGB branch. Here X can be thermal infrared (T), depth (D), event (E), near infrared (NIR), language (L), or sonar (S). The final part of the paper addresses evaluation and benchmarking. In summary, we undertake an omni survey of all aspects of multi-modal visual object tracking (VOT), covering six MMVOT tasks and featuring 338 references in total. In addition, we discuss the fundamental rhetorical question: Is multi-modal tracking always guaranteed to provide a superior solution to unimodal tracking with the help of information fusion, and if not, in what circumstances is its application beneficial? Furthermore, for the first time in this field, we analyse the distributions of the object categories in the existing MMVOT datasets, revealing their pronounced long-tail nature and a noticeable lack of animal categories when compared with RGB datasets.
Authors:Chen Zhou, Peng Cheng, Junfeng Fang, Yifan Zhang, Yibo Yan, Xiaojun Jia, Yanyan Xu, Kun Wang, Xiaochun Cao
Abstract:
Multispectral object detection, utilizing RGB and TIR (thermal infrared) modalities, is widely recognized as a challenging task. It requires not only the effective extraction of features from both modalities and robust fusion strategies, but also the ability to address issues such as spectral discrepancies, spatial misalignment, and environmental dependencies between RGB and TIR images. These challenges significantly hinder the generalization of multispectral detection systems across diverse scenarios. Although numerous studies have attempted to overcome these limitations, it remains difficult to clearly distinguish the performance gains of multispectral detection systems from the impact of these "optimization techniques". Worse still, despite the rapid emergence of high-performing single-modality detection models, there is still a lack of specialized training techniques that can effectively adapt these models for multispectral detection tasks. The absence of a standardized benchmark with fair and consistent experimental setups also poses a significant barrier to evaluating the effectiveness of new approaches. To this end, we propose the first fair and reproducible benchmark specifically designed to evaluate the training "techniques", which systematically classifies existing multispectral object detection methods, investigates their sensitivity to hyper-parameters, and standardizes the core configurations. A comprehensive evaluation is conducted across multiple representative multispectral object detection datasets, utilizing various backbone networks and detection frameworks. Additionally, we introduce an efficient and easily deployable multispectral object detection framework that can seamlessly optimize high-performing single-modality models into dual-modality models, integrating our advanced training techniques.
Authors:Alessandro Ottaviano, Andrino Meli, Paul Scheffler, Giovanni Bambini, Robert Balas, Davide Rossi, Andrea Bartolini, Luca Benini
Abstract:
Managing energy and thermal profiles is critical for many-core HPC processors with hundreds of application-class processing elements (PEs). Advanced model predictive control (MPC) delivers state-of-the-art performance but requires solving an online optimization problem over a thousand times per second (1 kHz control bandwidth), with computational and memory demands scaling with PE count. Traditional MPC approaches execute the controller on the PEs, but operating system overheads create jitter and limit control bandwidth. Running MPC on dedicated on-chip controllers enables fast, deterministic control but raises concerns about area and power overhead. In this work, we tackle these challenges by proposing a hardware-software codesign of a lightweight MPC controller, based on an operator-splitting quadratic programming solver and an embedded multi-core RISC-V controller. Key innovations include pruning weak thermal couplings to reduce model memory and ahead-of-time scheduling for efficient parallel execution of sparse triangular systems arising from the optimization problem. The proposed controller achieves sub-millisecond latency when controlling 144 PEs at 500 MHz, delivering 33x lower latency and 7.9x higher energy efficiency than a single-core baseline. Operating within a compact less than 1 MiB memory footprint, it consumes as little as 325 mW while occupying less than 1.5% of a typical HPC processor's die area.
Authors:Xin Chen, Ben Kang, Wanting Geng, Jiawen Zhu, Yi Liu, Dong Wang, Huchuan Lu
Abstract:
In this paper, we propose a simple yet unified single object tracking (SOT) framework, dubbed SUTrack. It consolidates five SOT tasks (RGB-based, RGB-Depth, RGB-Thermal, RGB-Event, RGB-Language Tracking) into a unified model trained in a single session. Due to the distinct nature of the data, current methods typically design individual architectures and train separate models for each task. This fragmentation results in redundant training processes, repetitive technological innovations, and limited cross-modal knowledge sharing. In contrast, SUTrack demonstrates that a single model with a unified input representation can effectively handle various common SOT tasks, eliminating the need for task-specific designs and separate training sessions. Additionally, we introduce a task-recognition auxiliary training strategy and a soft token type embedding to further enhance SUTrack's performance with minimal overhead. Experiments show that SUTrack outperforms previous task-specific counterparts across 11 datasets spanning five SOT tasks. Moreover, we provide a range of models catering to edge devices as well as high-performance GPUs, striking a good trade-off between speed and accuracy. We hope SUTrack can serve as a strong foundation for further compelling research into unified tracking models. Code and models are available at github.com/chenxin-dlut/SUTrack.
Authors:Yiming Sun, Zifan Ye, Qinghua Hu, Pengfei Zhu
Abstract:
Multi-modal image fusion aims to integrate complementary information from multiple source images to produce high-quality fused images with enriched content. Although existing approaches based on state space models have achieved satisfactory performance with high computational efficiency, they tend to either over-prioritize infrared intensity at the cost of visible details, or conversely, preserve visible structure while diminishing thermal target salience. To overcome these challenges, we propose DIFF-MF, a novel difference-driven channel-spatial state space model for multi-modal image fusion. Our approach leverages feature discrepancy maps between modalities to guide feature extraction, followed by a fusion process across both channel and spatial dimensions. In the channel dimension, a channel-exchange module enhances channel-wise interaction through cross-attention dual state space modeling, enabling adaptive feature reweighting. In the spatial dimension, a spatial-exchange module employs cross-modal state space scanning to achieve comprehensive spatial fusion. By efficiently capturing global dependencies while maintaining linear computational complexity, DIFF-MF effectively integrates complementary multi-modal features. Experimental results on driving-scenario and low-altitude UAV datasets demonstrate that our method outperforms existing approaches in both visual quality and quantitative evaluation.
Authors:Qiyi Tong, Olivia Nocentini, Marta Lagomarsino, Kuanqi Cai, Marta Lorenzini, Arash Ajoudani
Abstract:
Facial Landmark Detection (FLD) in thermal imagery is critical for applications in challenging lighting conditions, but it is hampered by the lack of rich visual cues. Conventional cross-modal solutions, like feature fusion or image translation from RGB data, are often computationally expensive or introduce structural artifacts, limiting their practical deployment. To address this, we propose Multi-Level Cross-Modal Knowledge Distillation (MLCM-KD), a novel framework that decouples high-fidelity RGB-to-thermal knowledge transfer from model compression to create both accurate and efficient thermal FLD models. A central challenge during knowledge transfer is the profound modality gap between RGB and thermal data, where traditional unidirectional distillation fails to enforce semantic consistency across disparate feature spaces. To overcome this, we introduce Dual-Injected Knowledge Distillation (DIKD), a bidirectional mechanism designed specifically for this task. DIKD establishes a connection between modalities: it not only guides the thermal student with rich RGB features but also validates the student's learned representations by feeding them back into the frozen teacher's prediction head. This closed-loop supervision forces the student to learn modality-invariant features that are semantically aligned with the teacher, ensuring a robust and profound knowledge transfer. Experiments show that our approach sets a new state-of-the-art on public thermal FLD benchmarks, notably outperforming previous methods while drastically reducing computational overhead.
Authors:Qingsen Ma, Chen Zou, Dianyun Wang, Jia Wang, Liuyu Xiang, Zhaofeng He
Abstract:
Under extremely low-light conditions, novel view synthesis (NVS) faces severe degradation in terms of geometry, color consistency, and radiometric stability. Standard 3D Gaussian Splatting (3DGS) pipelines fail when applied directly to underexposed inputs, as independent enhancement across views causes illumination inconsistencies and geometric distortion. To address this, we present DTGS, a unified framework that tightly couples Retinex-inspired illumination decomposition with thermal-guided 3D Gaussian Splatting for illumination-invariant reconstruction. Unlike prior approaches that treat enhancement as a pre-processing step, DTGS performs joint optimization across enhancement, geometry, and thermal supervision through a cyclic enhancement-reconstruction mechanism. A thermal supervisory branch stabilizes both color restoration and geometry learning by dynamically balancing enhancement, structural, and thermal losses. Moreover, a Retinex-based decomposition module embedded within the 3DGS loop provides physically interpretable reflectance-illumination separation, ensuring consistent color and texture across viewpoints. To evaluate our method, we construct RGBT-LOW, a new multi-view low-light thermal dataset capturing severe illumination degradation. Extensive experiments show that DTGS significantly outperforms existing low-light enhancement and 3D reconstruction baselines, achieving superior radiometric consistency, geometric fidelity, and color stability under extreme illumination.
Authors:Hao Wang, Hongkui Zheng, Kai He, Abolfazl Razi
Abstract:
Scanning transmission electron microscopy (STEM) plays a critical role in modern materials science, enabling direct imaging of atomic structures and their evolution under external interferences. However, interpreting time-resolved STEM data remains challenging due to two entangled degradation effects: spatial drift caused by mechanical and thermal instabilities, and beam-induced signal loss resulting from radiation damage. These factors distort both geometry and intensity in complex, temporally correlated ways, making it difficult for existing methods to explicitly separate their effects or model material dynamics at atomic resolution. In this work, we present AtomDiffuser, a time-aware degradation modeling framework that disentangles sample drift and radiometric attenuation by predicting an affine transformation and a spatially varying decay map between any two STEM frames. Unlike traditional denoising or registration pipelines, our method leverages degradation as a physically heuristic, temporally conditioned process, enabling interpretable structural evolutions across time. Trained on synthetic degradation processes, AtomDiffuser also generalizes well to real-world cryo-STEM data. It further supports high-resolution degradation inference and drift alignment, offering tools for visualizing and quantifying degradation patterns that correlate with radiation-induced atomic instabilities.
Authors:Barış Kavas, Efe C. Balta, Lars Witte, Michael R. Tucker, John Lygeros, Markus Bambach
Abstract:
This study investigates the stabilization of interlayer temperature in the laser powder bed fusion process through a novel switched layer-to-layer closed-loop feedback controller. The controller architecture aims to measure the interlayer temperature by a laterally positioned thermal camera and maintain a preset reference temperature by switching between the heating mode through dynamic laser power adjustment and the cooling mode by assigning interlayer dwell time to allow cooling between layers. The switching controller employs a feedback optimization control algorithm for the heating mode to adjust the laser power, and a triggering algorithm that increases the interlayer dwell time until the interlayer temperature reaches the reference value. Additionally, the study compares the performance of the proposed controller in both supported and unsupported overhanging parts to evaluate the effect of support structures on the controller performance as well as the thermal behavior of overhanging parts. Results demonstrate the controller's effectiveness in stabilizing interlayer temperature across varying cross-sectional areas while remaining within the material's stable processing zone. In the heating mode, the controller efficiently stabilizes temperature, even in geometries with significant cross-section variation. The study also identifies trade-offs among process efficiency, energy consumption, and build time. Supported parts exhibit reduced overheating but consume more energy and material, while unsupported parts stabilize interlayer temperature faster but with longer build times due to increased dwell time assignments. The research highlights notable improvements in interlayer temperature control for geometries prone to excessive thermal stresses. Moreover, the introduction of interlayer dwell time offers a practical solution to maintaining thermal stability in complex geometries.
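The switching logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' controller: a simple proportional power update stands in for the feedback optimization step, and the gain, power limits, and dwell increment are hypothetical numbers chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class LayerCommand:
    mode: str             # "heating" or "cooling"
    laser_power_w: float  # laser power for the next layer
    dwell_time_s: float   # interlayer dwell time before the next layer


def next_command(temp_c, ref_c, power_w, dwell_s,
                 gain_w_per_c=0.5, p_min=100.0, p_max=300.0,
                 dwell_step_s=1.0):
    """Choose the mode and actuation for the next layer.

    temp_c is the measured interlayer temperature, ref_c the reference.
    All default parameter values are assumptions for illustration.
    """
    error = ref_c - temp_c
    if error >= 0.0:
        # Heating mode: layer is at or below the reference, so raise the
        # laser power in proportion to the error, within the power limits.
        new_power = min(p_max, max(p_min, power_w + gain_w_per_c * error))
        return LayerCommand("heating", new_power, 0.0)
    # Cooling mode: layer is too hot, so keep the power and extend the
    # dwell time by one step; repeated calls keep extending it until the
    # measured temperature falls back to the reference.
    return LayerCommand("cooling", power_w, dwell_s + dwell_step_s)


# A layer measured 20 degrees C below the reference triggers heating mode.
cold_layer = next_command(temp_c=180.0, ref_c=200.0, power_w=150.0, dwell_s=0.0)
# A layer measured 20 degrees C above the reference triggers cooling mode.
hot_layer = next_command(temp_c=220.0, ref_c=200.0, power_w=150.0, dwell_s=2.0)
```

Keeping the two modes mutually exclusive mirrors the layer-to-layer structure in the abstract: each layer receives either a power adjustment or additional dwell time, never both.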
Authors:Yuepeng Zhang, Yu Chen, Yuda Li, Shaoyuan Li, Xiang Yin
Abstract:
Control Barrier Functions (CBFs) have emerged as an effective and non-invasive safety filter for ensuring the safety of autonomous systems in dynamic environments with formal guarantees. However, most existing works on CBF synthesis focus on fully known settings. Synthesizing CBFs online based on perception data in unknown environments poses particular challenges. Specifically, this requires the construction of CBFs from high-dimensional data efficiently in real time. This paper proposes a new approach for online synthesis of CBFs directly from local Occupancy Grid Maps (OGMs). Inspired by steady-state thermal fields, we show that the smoothness requirement of CBFs corresponds to the solution of the steady-state heat conduction equation with suitably chosen boundary conditions. By leveraging the sparsity of the coefficient matrix in Laplace's equation, our approach allows for efficient computation of safety values for each grid cell in the map. Simulation and real-world experiments demonstrate the effectiveness of our approach. Specifically, the results show that our CBFs can be synthesized in an average of milliseconds on a 200 * 200 grid map, highlighting its real-time applicability.
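To make the heat-conduction analogy concrete, the sketch below relaxes Laplace's equation on a small occupancy grid. It is an illustration only: the boundary conditions (occupied cells held at 0, free cells on the map border held at 1) are an assumption, since the abstract says only that they are "suitably chosen", and a plain Gauss-Seidel sweep in pure Python replaces the sparse linear solve that the paper relies on for speed. The name `safety_field` is hypothetical.

```python
def safety_field(occupancy, sweeps=500):
    """Relax the discrete Laplace equation on an occupancy grid.

    occupancy[r][c] is 1 for an occupied cell and 0 for a free cell.
    Assumed boundary conditions: occupied cells stay at 0.0 and free
    cells on the map border stay at 1.0. Interior free cells converge
    to the average of their four neighbours.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    h = [[0.0 if occupancy[r][c] else 1.0 for c in range(cols)]
         for r in range(rows)]
    for _ in range(sweeps):
        # Only interior cells are updated, so border values stay fixed.
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if occupancy[r][c]:
                    continue  # Dirichlet condition on obstacles
                h[r][c] = 0.25 * (h[r - 1][c] + h[r + 1][c]
                                  + h[r][c - 1] + h[r][c + 1])
    return h


# A 7 x 7 map with a single obstacle in the centre.
grid = [[0] * 7 for _ in range(7)]
grid[3][3] = 1
field = safety_field(grid)
```

Under these assumptions the resulting values lie between 0 and 1 and decrease smoothly toward the obstacle, which is the qualitative property a grid-based safety value needs. Each cell couples only to its four neighbours, which is the source of the sparsity mentioned in the abstract.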
Authors:Mohammadreza Baharani, Ghazal Alinezhad Noghre, Armin Danesh Pazho, Gabriel Maldonado, Hamed Tabkhi
Abstract:
Foundation Models (FM) have increasingly drawn the attention of researchers due to their scalability and generalization across diverse tasks. Inspired by the success of FMs and the principles that have driven advancements in Large Language Models (LLMs), we introduce MoFM as a novel Motion Foundation Model. MoFM is designed for the semantic understanding of complex human motions in both time and space. To facilitate large-scale training, MotionBook, a comprehensive human motion dictionary of discretized motions is designed and employed. MotionBook utilizes Thermal Cubes to capture spatio-temporal motion heatmaps, applying principles from discrete variational models to encode human movements into discrete units for a more efficient and scalable representation. MoFM, trained on a large corpus of motion data, provides a foundational backbone adaptable to diverse downstream tasks, supporting paradigms such as one-shot, unsupervised, and supervised tasks. This versatility makes MoFM well-suited for a wide range of motion-based applications.
Authors:Mengmeng Wang, Teli Ma, Shuo Xin, Xiaojun Hou, Jiazheng Xing, Guang Dai, Jingdong Wang, Yong Liu
Abstract:
Visual Object Tracking (VOT) is an attractive and significant research area in computer vision, which aims to recognize and track specific targets in video sequences where the target objects are arbitrary and class-agnostic. VOT technology can be applied in various scenarios, processing data of diverse modalities such as RGB, thermal infrared and point cloud. Moreover, since no single sensor can handle all dynamic and varying environments, multi-modal VOT is also investigated. This paper presents a comprehensive survey of the recent progress of both single-modal and multi-modal VOT, especially the deep learning methods. Specifically, we first review three types of mainstream single-modal VOT, including RGB, thermal infrared and point cloud tracking. In particular, we identify four widely used single-modal frameworks, abstracting their schemas and categorizing the existing inheritors. Then we summarize four kinds of multi-modal VOT, including RGB-Depth, RGB-Thermal, RGB-LiDAR and RGB-Language. Moreover, comparison results on numerous VOT benchmarks of the discussed modalities are presented. Finally, we provide recommendations and insightful observations, inspiring the future development of this fast-growing literature.
Authors:Banglei Guan, Dongcai Tan, Jing Tao, Ang Su, Yang Shang, Qifeng Yu
Abstract:
In the deformation measurement of high-temperature structures, image degradation caused by thermal radiation and random errors introduced by heat haze restrict the accuracy and effectiveness of deformation measurement. This work aims to suppress thermal radiation and heat haze using fusion-restoration image processing methods, thereby improving the accuracy and effectiveness of digital image correlation (DIC) in the measurement of high-temperature deformation. For image degradation caused by thermal radiation, based on the image layered representation, the image is decomposed into positive and negative channels for parallel processing, and then optimized for quality by multi-exposure image fusion. To counteract the high-frequency, random errors introduced by heat haze, we adopt the FSIM as the objective function to guide the iterative optimization of model parameters, and the grayscale average algorithm is applied to equalize anomalous gray values, thereby reducing measurement error. The proposed multi-exposure image fusion algorithm effectively suppresses image degradation caused by complex illumination conditions, boosting the effective computation area from 26% to 50% for under-exposed images and from 32% to 40% for over-exposed images without degrading measurement accuracy in the experiment. Meanwhile, the image restoration combined with the grayscale average algorithm reduces static thermal deformation measurement errors. The error in ε_xx is reduced by 85.3%, while the errors in ε_yy and γ_xy are reduced by 36.0% and 36.4%, respectively. We present image processing methods to suppress the interference of thermal radiation and heat haze in high-temperature deformation measurement using DIC. The experimental results verify that the proposed method can effectively improve image quality, reduce deformation measurement errors, and has potential application value in thermal deformation measurement.
Authors:Elise Zhang, François Mirallès, Stéphane Dellacherie, Di Wu, Benoit Boulet
Abstract:
Weather is a dominant external driver of residential electricity demand, but adding many meteorological covariates can inflate model complexity and may even impair accuracy. Selecting appropriate exogenous features is non-trivial and calls for a principled selection framework, given the direct operational implications for day-to-day planning and reliability. This work investigates whether causal feature selection can retain the most informative weather drivers while improving parsimony and robustness for short-term load forecasting. We present a case study on Southern Ontario with two open-source datasets: (i) IESO hourly electricity consumption by Forward Sortation Areas; (ii) ERA5 weather reanalysis data. We compare different feature selection regimes (no feature selection, non-causal selection, PCMCI-causal selection) on city-level forecasting with three different time series forecasting models: GRU, TCN, PatchTST. In the feature analysis, non-causal selection prioritizes radiation and moisture variables that show correlational dependence, whereas PCMCI-causal selection emphasizes more direct thermal drivers and prunes the indirect covariates. We detail the evaluation pipeline and report diagnostics on prediction accuracy and extreme-weather robustness, positioning causal feature selection as a practical complement to modern forecasters when integrating weather into residential load forecasting.
Authors:Dongcai Tan, Shunkun Liang, Bin Li, Banglei Guan, Ang Su, Yuan Lin, Dapeng Zhang, Minggang Wan, Zibin Liu, Chenglong Wang, Jiajian Zhu, Zhang Li, Yang Shang, Qifeng Yu
Abstract:
Stereo optical measurement techniques, such as digital image correlation (DIC), are widely used in 3D deformation measurement as non-contact, full-field measurement methods, in which stereo calibration is a crucial step. However, current stereo calibration methods lack intuitive optimal pose guidance, leading to inefficiency and suboptimal accuracy in deformation measurements. The aim of this study is to develop an interactive calibration framework that automatically generates the next optimal pose, enabling high-accuracy stereo calibration for 3D deformation measurement. We propose a pose optimization method that introduces joint optimization of relative and absolute extrinsic parameters, with the minimization of the covariance matrix trace adopted as the loss function to solve for the next optimal pose. Integrated with this method is a user-friendly graphical interface, which guides even non-expert users to capture qualified calibration images. Our proposed method demonstrates superior efficiency (requiring fewer images) and accuracy (demonstrating lower measurement errors) compared to random pose, while maintaining robustness across varying FOVs. In the thermal deformation measurement tests on an S-shaped specimen, the results exhibit high agreement with finite element analysis (FEA) simulations in both deformation magnitude and evolutionary trends. We present a pose guidance method for high-precision stereo calibration in 3D deformation measurement. The simulation experiments, real-world experiments, and thermal deformation measurement applications all demonstrate the significant application potential of our proposed method in the field of 3D deformation measurement. Keywords: Stereo calibration, Optimal pose guidance, 3D deformation measurement, Digital image correlation
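The idea of choosing the next pose by minimizing the trace of the covariance matrix can be illustrated on a toy two-parameter problem. This is a sketch under stated assumptions, not the paper's method: each candidate pose is assumed to contribute a known 2 x 2 information matrix, the covariance is taken as the inverse of the accumulated information, and the candidate names and numbers are hypothetical.

```python
def trace_of_inverse_2x2(m):
    """Trace of the inverse of a 2 x 2 matrix, i.e. (a + d) / det."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("information matrix must be positive definite")
    return (a + d) / det


def add_2x2(m1, m2):
    """Element-wise sum of two 2 x 2 matrices."""
    return [[m1[i][j] + m2[i][j] for j in range(2)] for i in range(2)]


def next_best_pose(current_info, candidates):
    """Pick the candidate whose added information minimizes trace(covariance)."""
    best_name, best_trace = None, float("inf")
    for name, info in candidates.items():
        trace = trace_of_inverse_2x2(add_2x2(current_info, info))
        if trace < best_trace:
            best_name, best_trace = name, trace
    return best_name, best_trace


# Assumed state: the first parameter is already well constrained,
# the second is not.
current_info = [[4.0, 0.0], [0.0, 1.0]]
candidates = {
    "tilt_x": [[3.0, 0.0], [0.0, 0.5]],  # mostly informs parameter 1
    "tilt_y": [[0.5, 0.0], [0.0, 3.0]],  # mostly informs parameter 2
}
best_name, best_trace = next_best_pose(current_info, candidates)
```

With these numbers the selection favours `tilt_y`, the pose that informs the poorly constrained parameter, which is the behaviour a trace-of-covariance criterion is meant to produce.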
Authors:Yunqi Shi, Chengrui Gao, Wanqi Ren, Siyuan Xu, Ke Xue, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou
Abstract:
This work introduces Open3DBench, an open-source 3D-IC backend implementation benchmark built upon the OpenROAD-flow-scripts framework, enabling comprehensive evaluation of power, performance, area, and thermal metrics. Our proposed flow supports modular integration of 3D partitioning, placement, 3D routing, RC extraction, and thermal simulation, aligning with advanced 3D flows that rely on commercial tools and in-house scripts. We present two foundational 3D placement algorithms: Open3D-Tiling, which emphasizes regular macro placement, and Open3D-DMP, which enhances wirelength optimization through cross-die co-placement with analytical placer DREAMPlace. Experimental results show significant improvements in area (51.19%), wirelength (24.06%), timing (30.84%), and power (5.72%) compared to 2D flows. The results also highlight that better wirelength does not necessarily lead to PPA gain, emphasizing the need to develop PPA-driven methods. Open3DBench offers a standardized, reproducible platform for evaluating 3D EDA methods, effectively bridging the gap between open-source tools and commercial solutions in 3D-IC design.
Authors:Arunkumar Rathinam, Leo Pauly, Abd El Rahman Shabayek, Wassim Rharbaoui, Anis Kacem, Vincent Gaudillière, Djamila Aouada
Abstract:
Multispectral pedestrian detection has gained significant attention in recent years, particularly in autonomous driving applications. To address the challenges posed by adverse illumination conditions, the combination of thermal and visible images has demonstrated its advantages. However, existing fusion methods rely on the critical assumption that the RGB-Thermal (RGB-T) image pairs are fully overlapping. This assumption often does not hold in real-world applications, where only partial overlap between images can occur due to sensor configuration. Moreover, sensor failure can cause loss of information in one modality. In this paper, we propose a novel module called the Hybrid Attention (HA) mechanism as our main contribution to mitigate performance degradation caused by partial overlap and sensor failure, i.e. when at least part of the scene is acquired by only one sensor. We propose an improved RGB-T fusion algorithm, robust against partial overlap and sensor failure encountered during inference in real-world applications. We also leverage a mobile-friendly backbone to cope with resource constraints in embedded systems. We conducted experiments by simulating various partial overlap and sensor failure scenarios to evaluate the performance of our proposed method. The results demonstrate that our approach outperforms state-of-the-art methods, showcasing its superiority in handling real-world challenges.
Authors:Qipan Wang, Tianxiang Zhu, Tianyu Jia, Yibo Lin, Runsheng Wang, Ru Huang
Abstract:
Rising demand in AI and automotive applications is accelerating 2.5D IC adoption, with multiple chiplets tightly placed to enable high-speed interconnects and heterogeneous integration. As chiplet counts grow, traditional placement tools, limited by poor scalability and reliance on slow simulations, must evolve beyond wirelength minimization to address thermal and mechanical reliability, critical challenges in heterogeneous integration. In this paper, we present ATMPlace, the first analytical placer for 2.5D ICs that jointly optimizes wirelength, peak temperature, and operational warpage using physics-based compact models. It generates Pareto optimal placements for systems with dozens of chiplets. Experimental results demonstrate superior performance: 146 percent and 52 percent geometric mean wirelength improvement over TAP 2.5D and TACPlace, respectively, with 3 to 13 percent lower temperature and 5 to 27 percent less warpage, all achieved approximately 10 times faster. The proposed framework is general and can be extended to enable fast, scalable, and reliable design exploration for next-generation 2.5D systems.
Authors:Cheng-Hau Yang, Guglielmo Scovazzi, Adarsh Krishnamurthy, Baskar Ganapathysubramanian
Abstract:
This paper presents an incomplete Octree mesh implementation of the Shifted Boundary Method (Octree-SBM) for multiphysics simulations of coupled flow and heat transfer. Specifically, a semi-implicit formulation of the thermal Navier-Stokes equations is used to accelerate the simulations while maintaining accuracy. The SBM enables precise enforcement of field and derivative boundary conditions on cut (intercepted) elements, allowing for accurate flux calculations near complex geometries, when using non-boundary fitted meshes. Both Dirichlet and Neumann boundary conditions are implemented within the SBM framework, with results demonstrating that the SBM ensures precise enforcement of Neumann boundary conditions on Octree-based meshes. We illustrate this approach by simulating flows across different regimes, spanning several orders of magnitude in both the Rayleigh number ($Ra \sim 10^3$--$10^9$) and the Reynolds number ($Re \sim 10^0$--$10^4$), and covering the laminar, transitional, and turbulent flow regimes. Coupled thermal-flow phenomena and their statistics across all these regimes are accurately captured without any additional numerical treatments, beyond a Residual-based Variational Multiscale formulation (RB-VMS). This approach offers a reliable and efficient solution for complex geometries, boundary conditions and flow regimes in computational multiphysics simulations.
Authors:Tianxiang Zhu, Qipan Wang, Yibo Lin, Runsheng Wang, Ru Huang
Abstract:
Thermomechanical stress induced by through-silicon vias (TSVs) plays an important role in the performance and reliability analysis of 2.5D/3D ICs. While the finite element method (FEM) adopted by commercial software can provide accurate simulation results, it is very time- and memory-consuming for large-scale analysis. Over the past decade, the linear superposition method has been utilized to perform fast thermal stress estimations of TSV arrays, but it suffers from a lack of accuracy. In this paper, we propose MORE-Stress, a novel strict numerical algorithm for efficient thermal stress simulation of TSV arrays based on model order reduction. Extensive experimental results demonstrate that our algorithm can realize a 153-504 times reduction in computational time and a 39-115 times reduction in memory usage compared with the commercial software ANSYS, with negligible errors less than 1%. Our algorithm is as efficient as the linear superposition method, with an order of magnitude smaller errors and fast convergence.
Authors:Shan Tang, Ziwei Cao, Zhenling Yang, Jiachen Guo, Yicheng Lu, Wing Kam Liu, Xu Guo
Abstract:
Generative artificial intelligence (GAI) plays a fundamental role in high-impact AI-based systems such as SORA and AlphaFold. Currently, GAI shows limited capability in specialized domains due to data scarcity. In this paper, we develop a continuum mechanics-based theoretical framework to generalize the optimal transport theory from pure mathematics, which can be used to describe the dynamics of data, realizing the generative tasks with a small amount of data. The developed theory is used to solve three typical problems involved in many mechanical designs and engineering applications: at the material level, how to generate the stress-strain response outside the range of experimental conditions based on experimentally measured stress-strain data; at the structure level, how to generate the temperature-dependent stress fields under thermal loading; at the system level, how to generate the plastic strain fields under transient dynamic loading. Our results show the proposed theory can complete the generation successfully, showing its potential to solve many difficult problems involved in engineering applications, not limited to mechanics problems, such as image generation. The present work shows that mechanics can provide new tools for computer science. The limitation of the proposed theory is also discussed.
Authors:Ke Li, Di Wang, Zhangyuan Hu, Shaofeng Li, Weiping Ni, Lin Zhao, Quan Wang
Abstract:
Infrared-visible object detection (IVOD) seeks to harness the complementary information in infrared and visible images, thereby enhancing the performance of detectors in complex environments. However, existing methods often neglect the frequency characteristics of complementary information, such as the abundant high-frequency details in visible images and the valuable low-frequency thermal information in infrared images, thus constraining detection performance. To solve this problem, we introduce a novel Frequency-Driven Feature Decomposition Network for IVOD, called FD2-Net, which effectively captures the unique frequency representations of complementary information across multimodal visual spaces. Specifically, we propose a feature decomposition encoder, wherein the high-frequency unit (HFU) utilizes discrete cosine transform to capture representative high-frequency features, while the low-frequency unit (LFU) employs dynamic receptive fields to model the multi-scale context of diverse objects. Next, we adopt a parameter-free complementary strengths strategy to enhance multimodal features through seamless inter-frequency recoupling. Furthermore, we innovatively design a multimodal reconstruction mechanism that recovers image details lost during feature extraction, further leveraging the complementary information from infrared and visible images to enhance overall representational capacity. Extensive experiments demonstrate that FD2-Net outperforms state-of-the-art (SOTA) models across various IVOD benchmarks, i.e. LLVIP (96.2% mAP), FLIR (82.9% mAP), and M3FD (83.5% mAP).
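As a rough illustration of how a discrete cosine transform separates frequency content, the sketch below splits a 1D signal into a low-frequency and a high-frequency part by masking DCT coefficients. It is not the HFU from FD2-Net: the abstract does not describe how coefficients are selected, so the hard cutoff, the 1D setting, and the function names are assumptions made for the example.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)))
    return out


def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(
            math.sqrt((1.0 if k == 0 else 2.0) / n) * coeffs[k]
            * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n)))
    return out


def split_bands(x, cutoff):
    """Split x into low and high bands at coefficient index `cutoff`."""
    c = dct(x)
    low = idct([v if k < cutoff else 0.0 for k, v in enumerate(c)])
    high = idct([v if k >= cutoff else 0.0 for k, v in enumerate(c)])
    return low, high


# A flat signal with one sharp spike; the spike is fine detail and
# therefore carries most of the high-frequency energy.
signal = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]
low, high = split_bands(signal, cutoff=2)
```

Because the transform is linear and the two masks are complementary, `low` and `high` add back to the original signal. In an image or feature-map setting the same masking would be applied to a 2D DCT, typically per channel.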
Authors:Houzhang Fang, Chenxing Wu, Kun Bai, Tianqi Chen, Xiaolin Wang, Xiyang Liu, Yi Chang, Luxin Yan
Abstract:
Unmanned aerial vehicle (UAV) target tracking based on thermal infrared imaging has been one of the most important sensing technologies in anti-UAV applications. However, the infrared UAV targets often exhibit weak features and complex backgrounds, posing significant challenges to accurate tracking. To address these problems, we introduce SiamDFF, a novel dynamic feature fusion Siamese network that integrates feature enhancement and global contextual attention knowledge distillation for infrared UAV target (IRUT) tracking. The SiamDFF incorporates a selective target enhancement network (STEN), a dynamic spatial feature aggregation module (DSFAM), and a dynamic channel feature aggregation module (DCFAM). The STEN employs intensity-aware multi-head cross-attention to adaptively enhance important regions for both template and search branches. The DSFAM enhances multi-scale UAV target features by integrating local details with global features, utilizing spatial attention guidance within the search frame. The DCFAM effectively integrates the mixed template generated from STEN in the template branch and original template, avoiding excessive background interference with the template and thereby enhancing the emphasis on UAV target region features within the search frame. Furthermore, to enhance the feature extraction capabilities of the network for IRUT without adding extra computational burden, we propose a novel tracking-specific target-aware contextual attention knowledge distiller. It transfers the target prior from the teacher network to the student model, significantly improving the student network's focus on informative regions at each hierarchical level of the backbone network. Extensive experiments on real infrared UAV datasets demonstrate that the proposed approach outperforms state-of-the-art target trackers under complex backgrounds while achieving a real-time tracking speed.
Authors:Jiahuan Long, Tingsong Jiang, Hanqing Liu, Chao Ma, Wen Yao
Abstract:
Adversarial patches have emerged as a popular privacy-preserving approach for resisting AI-driven surveillance systems. However, their conspicuous appearance makes them difficult to deploy in real-world scenarios. In this paper, we propose a thermally activated adversarial wearable designed to ensure adaptability and effectiveness in complex real-world environments. The system integrates thermochromic dyes with flexible heating units to induce visually dynamic adversarial patterns on clothing surfaces. In its default state, the clothing appears as an ordinary black T-shirt. Upon heating via an embedded thermal unit, hidden adversarial patterns on the fabric are activated, allowing the wearer to effectively evade detection across both visible and infrared modalities. Physical experiments demonstrate that the adversarial wearable achieves rapid texture activation within 50 seconds and maintains an adversarial success rate above 80% across diverse real-world surveillance environments. This work demonstrates a new pathway toward physically grounded, user-controllable anti-AI systems, highlighting the growing importance of proactive adversarial techniques for privacy protection in the age of ubiquitous AI surveillance.
Authors:Ziming Liu, Yizhou Liu, Jeff Gore, Max Tegmark
Abstract:
Beyond neural scaling laws, little is known about the laws underlying large language models (LLMs). We introduce Neural Thermodynamic Laws (NTL) -- a new framework that offers fresh insights into LLM training dynamics. On the theoretical side, we demonstrate that key thermodynamic quantities (e.g., temperature, entropy, heat capacity, thermal conduction) and classical thermodynamic principles (e.g., the three laws of thermodynamics and the equipartition theorem) naturally emerge under river-valley loss landscape assumptions. On the practical side, this scientific perspective yields intuitive guidelines for designing learning rate schedules.
Authors:Jiahuan Long, Wen Yao, Tingsong Jiang, Chao Ma
Abstract:
Adversarial patches are widely used to evaluate the robustness of object detection systems in real-world scenarios. These patches were initially designed to deceive single-modal detectors (e.g., visible or infrared) and have recently been extended to target visible-infrared dual-modal detectors. However, existing dual-modal adversarial patch attacks have limited attack effectiveness across diverse physical scenarios. To address this, we propose CDUPatch, a universal cross-modal patch attack against visible-infrared object detectors across scales, views, and scenarios. Specifically, we observe that color variations lead to different levels of thermal absorption, resulting in temperature differences in infrared imaging. Leveraging this property, we propose an RGB-to-infrared adapter that maps RGB patches to infrared patches, enabling unified optimization of cross-modal patches. By learning an optimal color distribution on the adversarial patch, we can manipulate its thermal response and generate an adversarial infrared texture. Additionally, we introduce a multi-scale clipping strategy and construct a new visible-infrared dataset, MSDrone, which contains aerial vehicle images in varying scales and perspectives. These data augmentation strategies enhance the robustness of our patch in real-world conditions. Experiments on four benchmark datasets (e.g., DroneVehicle, LLVIP, VisDrone, MSDrone) show that our method outperforms existing patch attacks in the digital domain. Extensive physical tests further confirm strong transferability across scales, views, and scenarios.
Authors:Alfreds Lapkovskis, Boris Sedlak, Sindri Magnússon, Schahram Dustdar, Praveen Kumar Donta
Abstract:
Ensuring Service Level Objectives (SLOs) in large-scale architectures, such as Distributed Computing Continuum Systems (DCCS), is challenging due to their heterogeneous nature and varying service requirements across different devices and applications. Additionally, unpredictable workloads and resource limitations lead to fluctuating performance and violated SLOs. To improve SLO compliance in DCCS, one possibility is to apply machine learning; however, the design choices are often left to the developer. To that extent, we provide a benchmark of Active Inference -- an emerging method from neuroscience -- against three established reinforcement learning algorithms (Deep Q-Network, Advantage Actor-Critic, and Proximal Policy Optimization). We consider a realistic DCCS use case: an edge device running a video conferencing application alongside a WebSocket server streaming videos. Using one of the respective algorithms, we continuously monitor key performance metrics, such as latency and bandwidth usage, to dynamically adjust parameters -- including the number of streams, frame rate, and resolution -- to optimize service quality and user experience. To test algorithms' adaptability to constant system changes, we simulate dynamically changing SLOs and both instant and gradual data-shift scenarios, such as network bandwidth limitations and fluctuating device thermal states. Although the evaluated algorithms all showed advantages and limitations, our findings demonstrate that Active Inference is a promising approach for ensuring SLO compliance in DCCS, offering lower memory usage, stable CPU utilization, and fast convergence.
Authors:Yinchao Ma, Yuyang Tang, Wenfei Yang, Tianzhu Zhang, Xu Zhou, Feng Wu
Abstract:
Single object tracking aims to localize a target object with specific reference modalities (bounding box, natural language or both) in a sequence of specific video modalities (RGB, RGB+Depth, RGB+Thermal or RGB+Event). Different reference modalities enable various human-machine interactions, and different video modalities are demanded in complex scenarios to enhance tracking robustness. Existing trackers are designed for single or several video modalities with single or several reference modalities, which leads to separate model designs and limits practical applications. Practically, a unified tracker is needed to handle various requirements. To the best of our knowledge, there is still no tracker that can perform tracking with the above reference modalities across these video modalities simultaneously. Thus, in this paper, we present a unified tracker, UniSOT, for different combinations of three reference modalities and four video modalities with uniform parameters. Extensive experimental results on 18 visual tracking, vision-language tracking and RGB+X tracking benchmarks demonstrate that UniSOT shows superior performance against modality-specific counterparts. Notably, UniSOT outperforms previous counterparts by over 3.0% AUC on TNL2K across all three reference modalities and outperforms Un-Track by over 2.0% main metric across all three RGB+X video modalities.
Authors:Weiheng Zhong, Qibang Liu, Diab Abueidda, Seid Koric, Hadi Meidani
Abstract:
Neural operators have emerged as powerful tools for learning nonlinear mappings between function spaces, enabling real-time prediction of complex dynamics in diverse scientific and engineering applications. With their growing adoption in engineering design evaluation, a wide range of neural operator architectures have been proposed for various problem settings. However, model selection remains challenging due to the absence of fair and comprehensive comparisons. To address this, we propose and standardize six representative 3D industry-scale engineering design datasets spanning thermal analysis, linear elasticity, elasto-plasticity, time-dependent plastic problems, and computational fluid dynamics. All datasets include fully preprocessed inputs and outputs for model training, making them directly usable across diverse neural operator architectures. Using these datasets, we conduct a systematic comparison of four types of neural operator variants, including Branch-Trunk-based Neural Operators inspired by DeepONet, Graph-based Neural Operators inspired by Graph Neural Networks, Grid-based Neural Operators inspired by Fourier Neural Operators, and Point-based Neural Operators inspired by PointNet. We further introduce practical enhancements to adapt these models to different engineering settings, improving the fairness of the comparison. Our benchmarking study evaluates each model's strengths and limitations in terms of predictive performance, computational efficiency, memory usage, and deployment complexity. The findings provide actionable insights to guide future neural operator development.
Authors:Kazuma Kobayashi, Jaewan Park, Qibang Liu, Seid Koric, Diab Abueidda, Syed Bahauddin Alam
Abstract:
Scientific applications increasingly demand real-time surrogate models that can capture the behavior of strongly coupled multiphysics systems driven by multiple input functions, such as in thermo-mechanical and electro-thermal processes. While neural operator frameworks, such as Deep Operator Networks (DeepONets), have shown considerable success in single-physics settings, their extension to multiphysics problems remains poorly understood. In particular, the challenge of learning nonlinear interactions between tightly coupled physical fields has received little systematic attention. This study addresses a foundational question: should the architectural design of a neural operator reflect the strength of physical coupling it aims to model? To answer this, we present the first comprehensive, architecture-aware evaluation of DeepONet variants across three regimes: single-physics, weakly coupled, and strongly coupled multiphysics systems. We consider a reaction-diffusion equation with dual spatial inputs, a nonlinear thermo-electrical problem with bidirectional coupling through temperature-dependent conductivity, and a viscoplastic thermo-mechanical model of steel solidification governed by transient phase-driven interactions. Two operator-learning frameworks, the classical DeepONet and its sequential GRU-based extension, S-DeepONet, are benchmarked using both single-branch and multi-branch (MIONet-style) architectures. Our results demonstrate that architectural alignment with physical coupling is crucial: single-branch networks significantly outperform multi-branch counterparts in strongly coupled settings, whereas multi-branch encodings offer advantages for decoupled or single-physics problems. Once trained, these surrogates achieve full-field predictions up to 1.8e4 times faster than high-fidelity finite-element solvers, without compromising solution accuracy.
Authors:Xinling Yu, Ziyue Liu, Hai Li, Yixing Li, Xin Ai, Zhiyu Zeng, Ian Young, Zheng Zhang
Abstract:
Thermal analysis is crucial in three-dimensional integrated circuit (3D-IC) design due to increased power density and complex heat dissipation paths. Although operator learning frameworks such as DeepOHeat have demonstrated promising preliminary results in accelerating thermal simulation, they face critical limitations in prediction capability for multi-scale thermal patterns, training efficiency, and trustworthiness of results during design optimization. This paper presents DeepOHeat-v1, an enhanced physics-informed operator learning framework that addresses these challenges through three key innovations. First, we integrate Kolmogorov-Arnold Networks with learnable activation functions as trunk networks, enabling an adaptive representation of multi-scale thermal patterns. This approach achieves a $1.25\times$ and $6.29\times$ reduction in error in two representative test cases. Second, we introduce a separable training method that decomposes the basis function along the coordinate axes, achieving $62\times$ training speedup and $31\times$ GPU memory reduction in our baseline case, and enabling thermal analysis at resolutions previously infeasible due to GPU memory constraints. Third, we propose a confidence score to evaluate the trustworthiness of the predicted results, and further develop a hybrid optimization workflow that combines operator learning with finite difference (FD) using Generalized Minimal Residual (GMRES) method for incremental solution refinement, enabling efficient and trustworthy thermal optimization. Experimental results demonstrate that DeepOHeat-v1 achieves accuracy comparable to optimization using high-fidelity finite difference solvers, while speeding up the entire optimization process by $70.6\times$ in our test cases, effectively minimizing the peak temperature through optimal placement of heat-generating components.
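The idea of decomposing a basis function along the coordinate axes can be sketched as follows. This is only an illustration, assuming a rank-2 separable form u(x, y, z) = sum_r fx_r(x) * fy_r(y) * fz_r(z); the per-axis factors `fx`, `fy`, `fz` are hypothetical closed-form stand-ins, not the networks used in DeepOHeat-v1.

```python
import math

# Hypothetical per-axis factors, each returning one value per rank index.
def fx(x):
    return [math.sin(x), x]

def fy(y):
    return [math.cos(y), 1.0]

def fz(z):
    return [z * z, math.exp(-z)]

def separable_grid(xs, ys, zs):
    """Evaluate the separable function on the tensor grid xs x ys x zs.

    Each factor is evaluated once per axis point, i.e.
    len(xs) + len(ys) + len(zs) factor calls, and the grid values
    are then assembled from products of those per-axis results.
    """
    fxs = [fx(x) for x in xs]
    fys = [fy(y) for y in ys]
    fzs = [fz(z) for z in zs]
    return [
        [[sum(a * b * c for a, b, c in zip(fa, fb, fc)) for fc in fzs] for fb in fys]
        for fa in fxs
    ]

def direct(x, y, z):
    """Reference evaluation at a single point, used only for checking."""
    return sum(a * b * c for a, b, c in zip(fx(x), fy(y), fz(z)))

grid = separable_grid([0.0, 0.5], [0.1, 0.2, 0.3], [1.0])
```

On an n x n x n grid this needs on the order of 3n factor evaluations rather than n^3, which is one plausible reading of where a separable scheme saves time and memory.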
Authors:Jiaxin Xu, Gang Liu, Ruilan Guo, Meng Jiang, Tengfei Luo
Abstract:
The advancement of polymer informatics has been significantly propelled by the integration of machine learning (ML) techniques, enabling the rapid prediction of polymer properties and expediting the discovery of high-performance polymeric materials. However, the field lacks a standardized workflow that encompasses prediction accuracy, uncertainty quantification, ML interpretability, and polymer synthesizability. In this study, we introduce POINT$^{2}$ (POlymer INformatics Training and Testing), a comprehensive benchmark database and protocol designed to address these critical challenges. Leveraging the existing labeled datasets and the unlabeled PI1M dataset, a collection of approximately one million virtual polymers generated via a recurrent neural network trained on the realistic polymers, we develop an ensemble of ML models, including Quantile Random Forests, Multilayer Perceptrons with dropout, Graph Neural Networks, and pretrained large language models. These models are coupled with diverse polymer representations such as Morgan, MACCS, RDKit, Topological, Atom Pair fingerprints, and graph-based descriptors to achieve property predictions, uncertainty estimations, model interpretability, and template-based polymerization synthesizability across a spectrum of properties, including gas permeability, thermal conductivity, glass transition temperature, melting temperature, fractional free volume, and density. The POINT$^{2}$ database can serve as a valuable resource for the polymer informatics community for polymer discovery and optimization.
Authors:Qibang Liu, Pengfei Cai, Diab Abueidda, Sagar Vyas, Seid Koric, Rafael Gomez-Bombarelli, Philippe Geubelle
Abstract:
Under some initial and boundary conditions, the rapid reaction-thermal diffusion process taking place during frontal polymerization (FP) destabilizes the planar mode of front propagation, leading to spatially varying, complex hierarchical patterns in thermoset polymeric materials. Although modern reaction-diffusion models can predict the patterns resulting from unstable FP, the inverse design of patterns, which aims to retrieve process conditions that produce a desired pattern, remains an open challenge due to the non-unique and non-intuitive mapping between process conditions and manufactured patterns. In this work, we propose a probabilistic generative model named univariate conditional variational autoencoder (UcVAE) for the inverse design of hierarchical patterns in FP-based manufacturing. Unlike the cVAE, which encodes both the design space and the design target, the UcVAE encodes only the design space. In the encoder of the UcVAE, the number of training parameters is significantly reduced compared to the cVAE, resulting in a shorter training time while maintaining comparable performance. Given desired pattern images, the trained UcVAE can generate multiple process condition solutions that produce high-fidelity hierarchical patterns.
Authors:Raisa Bentay Hossain, Farid Ahmed, Kazuma Kobayashi, Seid Koric, Diab Abueidda, Syed Bahauddin Alam
Abstract:
Effective real-time monitoring is a foundation of digital twin technology, crucial for detecting material degradation and maintaining the structural integrity of nuclear systems to ensure both safety and operational efficiency. Traditional physical sensor systems face limitations such as installation challenges, high costs, and difficulty measuring critical parameters in hard-to-reach or harsh environments, often resulting in incomplete data coverage. Machine learning-driven virtual sensors, integrated within a digital twin framework, offer a transformative solution by enhancing physical sensor capabilities to monitor critical degradation indicators like pressure, velocity, and turbulence. However, conventional machine learning models struggle with real-time monitoring due to the high-dimensional nature of reactor data and the need for frequent retraining. This paper introduces the use of Deep Operator Networks (DeepONet) as a core component of a digital twin framework to predict key thermal-hydraulic parameters in the hot leg of an AP-1000 Pressurized Water Reactor (PWR). DeepONet serves as a dynamic and scalable virtual sensor by accurately mapping the interplay between operational input parameters and spatially distributed system behaviors. In this study, DeepONet is trained with different operational conditions, which relaxes the requirement of continuous retraining, making it suitable for online and real-time prediction components for digital twin. Our results show that DeepONet achieves accurate predictions with low mean squared error and relative L2 error and can make predictions on unknown data 1400 times faster than traditional CFD simulations. This speed and accuracy enable DeepONet to synchronize with the physical system in real-time, functioning as a dynamic virtual sensor that tracks degradation-contributing conditions.
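The two error measures named above have standard definitions, sketched here for flat lists of values. The function names are illustrative, and the abstract does not state how the authors aggregate errors over fields or samples.

```python
import math

def mean_squared_error(pred, true):
    """Average of squared differences between prediction and reference."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)

def relative_l2_error(pred, true):
    """L2 norm of the error divided by the L2 norm of the reference."""
    err = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    ref = math.sqrt(sum(t ** 2 for t in true))
    return err / ref

mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
rel = relative_l2_error([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

The relative L2 error is scale-free, which makes it easier to compare across quantities with different units such as pressure and velocity.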
Authors:Srivathsan Badrinarayanan, Yue Su, Janghoon Ock, Alan Pham, Sanya Ahuja, Amir Barati Farimani
Abstract:
Protein mutations can have profound effects on biological function, making accurate prediction of property changes critical for drug discovery, protein engineering, and precision medicine. Current approaches rely on fine-tuning protein-specific transformers for individual datasets, but struggle with cross-dataset generalization due to heterogeneous experimental conditions and limited target domain data. We introduce two key innovations: (1) the first application of Model-Agnostic Meta-Learning (MAML) to protein mutation property prediction, and (2) a novel mutation encoding strategy using separator tokens to directly incorporate mutations into sequence context. We build upon transformer architectures, integrating them with MAML to enable rapid adaptation to new tasks through minimal gradient steps rather than learning dataset-specific patterns. Our mutation encoding addresses the critical limitation where standard transformers treat mutation positions as unknown tokens, significantly degrading performance. Evaluation across three diverse protein mutation datasets (functional fitness, thermal stability, and solubility) demonstrates significant advantages over traditional fine-tuning. In cross-task evaluation, our meta-learning approach achieves 29% better accuracy for functional fitness with 65% less training time, and 94% better accuracy for solubility with 55% faster training. The framework maintains consistent training efficiency regardless of dataset size, making it particularly valuable for industrial applications and early-stage protein design where experimental data is limited. This work establishes a systematic application of meta-learning to protein mutation analysis and introduces an effective mutation encoding strategy, offering a transformative methodology for cross-domain generalization in protein engineering.
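The MAML inner/outer loop can be shown on a toy problem. This is a minimal sketch, assuming a single scalar parameter w and per-task losses (w - a)^2 with hypothetical task targets a; it illustrates the general algorithm only and is unrelated to the transformer models or datasets in the abstract.

```python
def inner_adapt(w, target, alpha):
    """One gradient step on the task loss (w - target)^2."""
    grad = 2.0 * (w - target)
    return w - alpha * grad

def maml_step(w, targets, alpha, beta):
    """One meta-update: differentiate the post-adaptation loss w.r.t. w."""
    meta_grad = 0.0
    for a in targets:
        w_adapted = inner_adapt(w, a, alpha)
        # Chain rule: d(w_adapted)/dw = 1 - 2*alpha for this quadratic loss.
        meta_grad += 2.0 * (w_adapted - a) * (1.0 - 2.0 * alpha)
    return w - beta * meta_grad / len(targets)

def train(targets, alpha=0.1, beta=0.1, steps=200, w0=0.0):
    """Run repeated meta-updates from the initial parameter w0."""
    w = w0
    for _ in range(steps):
        w = maml_step(w, targets, alpha, beta)
    return w

meta_w = train([1.0, 2.0, 6.0])
```

For these symmetric quadratic losses the meta-learned initialization settles at the mean of the task targets, the point from which one inner step helps every task equally.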
Authors:Jian Xiao, Ji Wang, Ming Zeng, Hongbo Xu, Xingwang Li, Arumugam Nallanathan
Abstract:
The advent of Rydberg atomic quantum receivers (RAQRs) offers a new solution for the evolution of wireless transceiver architecture, promising unprecedented sensitivity and immunity to thermal noise. However, RAQRs introduce a unique non-linear signal model based on biased phase retrieval, which complicates fundamental channel estimation tasks. Traditional iterative algorithms often struggle in low signal-to-noise regimes and fail to capture complex and non-ideal system characteristics. To address this, we propose a novel model-driven deep learning framework for channel estimation in RAQRs. Specifically, we propose a Transformer-based unrolling architecture, termed URformer, which is derived by unrolling a stabilized variant of the expectation-maximization Gerchberg-Saxton (EM-GS) algorithm. Each layer of the proposed URformer incorporates three trainable modules: 1) a learnable filter implemented by a neural network that replaces the fixed Bessel function ratio in the classic EM-GS algorithm; 2) a trainable gating mechanism that adaptively combines classic and model-based updates to ensure training stability; and 3) an efficient channel Transformer block that learns to correct residual errors by capturing non-local dependencies across the channel matrix. Numerical results demonstrate that the proposed URformer significantly outperforms classic iterative algorithms and conventional black-box neural networks with less pilot overhead.
Authors:Sri Krishna Vadlamani, Kfir Sulimany, Zhihui Gao, Tingjun Chen, Dirk Englund
Abstract:
Machine intelligence on edge devices enables low-latency processing and improved privacy, but is often limited by the energy and delay of moving and converting data. Current systems frequently avoid local model storage by sending queries to a server, incurring uplink cost, network latency, and privacy risk. We present the opposite approach: broadcasting model weights to clients that perform inference locally using in-physics computation inside the radio receive chain. A base station transmits weights as radio frequency (RF) waveforms; the client encodes activations onto the waveform and computes the result using existing mixer and filter stages, RF components already present in billions of edge devices such as cellphones, eliminating repeated signal conversions and extra hardware. Analysis shows that thermal noise and nonlinearity create an optimal energy window for accurate analog inner products. Hardware-tailored training through a differentiable RF chain preserves accuracy within this regime. Circuit-informed simulations, consistent with a companion experiment, demonstrate reduced memory and conversion overhead while maintaining high accuracy in realistic wireless edge scenarios.
Authors:Mark Ballard, Guanqun Song, Ting Zhu
Abstract:
The rapid proliferation of satellite constellations, particularly in Low Earth Orbit (LEO), has fundamentally altered the global space infrastructure, shifting the risk landscape from purely kinetic collisions to complex cyber-physical threats. While traditional safety frameworks focus on debris mitigation, ground-based adversaries increasingly exploit radio-frequency links, supply chain vulnerabilities, and software update pathways to degrade space assets. This paper presents a comparative analysis of satellite cybersecurity across LEO, Medium Earth Orbit (MEO), and Geostationary Earth Orbit (GEO) regimes. By synthesizing data from 60 publicly documented security incidents with key vulnerability proxies--including Telemetry, Tracking, and Command (TT&C) anomalies, encryption weaknesses, and environmental stressors--we characterize how orbital altitude dictates attack feasibility and impact. Our evaluation reveals distinct threat profiles: GEO systems are predominantly targeted via high-frequency uplink exposure, whereas LEO constellations face unique risks stemming from limited power budgets, hardware constraints, and susceptibility to thermal and radiation-induced faults. We further bridge the gap between security and sustainability, arguing that unmitigated cyber vulnerabilities accelerate hardware obsolescence and debris accumulation, undermining efforts toward carbon-neutral space operations. The results demonstrate that weak encryption and command path irregularities are the most consistent predictors of adversarial success across all orbits.
Authors:Fuyang Liu, Shun Lu, Jilin Mei, Yu Hu
Abstract:
RGB-Thermal fusion is a potential solution for various weather and light conditions in challenging scenarios. However, plenty of studies focus on designing complex modules to fuse different modalities. With the widespread application of large language models (LLMs), valuable information can be more effectively extracted from natural language. Therefore, we aim to leverage the advantages of large language models to design a structurally simple and highly adaptable multimodal fusion model architecture. We propose the MultimodAl Segmentation with TExt PRompts (MASTER) architecture, which integrates an LLM into the fusion of RGB-Thermal multimodal data and allows complex query text to participate in the fusion process. Our model utilizes a dual-path structure to extract information from different modalities of images. Additionally, we employ an LLM as the core module for multimodal fusion, enabling the model to generate learnable codebook tokens from RGB, thermal images, and textual information. A lightweight image decoder is used to obtain semantic segmentation results. The proposed MASTER performs exceptionally well in benchmark tests across various automated driving scenarios, yielding promising results.
Authors:Masaki Adachi, Siu Lun Chau, Wenjie Xu, Anurag Singh, Michael A. Osborne, Krikamol Muandet
Abstract:
We introduce Social Bayesian Optimization (SBO), a vote-efficient algorithm for consensus-building in collective decision-making. In contrast to single-agent scenarios, collective decision-making encompasses group dynamics that may distort agents' preference feedback, thereby impeding their capacity to achieve a social-influence-free consensus -- the most preferable decision based on the aggregated agent utilities. We demonstrate that under mild rationality axioms, reaching social-influence-free consensus using noisy feedback alone is impossible. To address this, SBO employs a dual voting system: cheap but noisy public votes (e.g., show of hands in a meeting), and more accurate, though expensive, private votes (e.g., one-to-one interview). We model social influence using an unknown social graph and leverage the dual voting system to efficiently learn this graph. Our theoretical findings show that social graph estimation converges faster than the black-box estimation of agents' utilities, allowing us to reduce reliance on costly private votes early in the process. This enables efficient consensus-building primarily through noisy public votes, which are debiased based on the estimated social graph to infer social-influence-free feedback. We validate the efficacy of SBO across multiple real-world applications, including thermal comfort, team building, travel negotiation, and energy trading collaboration.
Authors:Zhehu Yuan, Jinyang Liu, Guanqun Song, Ting Zhu
Abstract:
In satellite applications, managing thermal conditions is a significant challenge due to the extreme fluctuations in temperature during orbital cycles. One of the solutions is to heat the satellite when it is not exposed to sunlight, which could protect the satellite from extremely low temperatures. However, heat dissipation is necessary for Graphics Processing Units (GPUs) to operate properly and efficiently. Accordingly, this paper investigates the use of GPUs as a means of passive heating in low-earth orbit (LEO) satellites. Our approach uses GPUs to generate heat during the eclipse phase of satellite orbits, substituting traditional heating systems, while the GPUs are also cooled down during this process. The results highlight the potential advantages and limitations of this method, including the cost implications, operational restrictions, and the technical complexity involved. Also, this paper explores the thermal behavior of GPUs under different computational loads, specifically focusing on execution-dominated and FLOP-dominated workloads. Moreover, this paper discusses future directions for improving GPU-based heating solutions, including further cost analysis, system optimization, and practical testing in real satellite missions.
Authors:Yury Zabegaev, Inga Berre, Eirik Keilegavlen
Abstract:
The numerical modeling of fracture contact thermo-poromechanics is crucial for advancing subsurface engineering applications, including CO2 sequestration, production of geo-energy resources, energy storage and wastewater disposal operations. Accurately modeling this problem presents substantial challenges due to the complex physics involved in strongly coupled thermo-poromechanical processes and the frictional contact mechanics of fractures. To resolve process couplings in the resulting mathematical model, it is common to apply fully implicit time stepping. This necessitates the use of an iterative linear solver to run the model. The solver's efficiency primarily depends on a robust preconditioner, which is particularly challenging to develop because it must handle the mutual couplings between linearized contact mechanics and energy, momentum, and mass balance. In this work, we introduce a preconditioner for the problem based on the nested approximations of Schur complements. To decouple the momentum balance, we utilize the fixed-stress approximation, extended to account for both the porous media and fracture subdomains. The singularity of the contact mechanics submatrix is resolved by a linear transformation. Two variations of the algorithm are proposed to address the coupled mass and energy balance submatrix: either the Constrained Pressure Residual or the System-AMG approach. The preconditioner is evaluated through numerical experiments of fluid injection into fractured porous media, which causes thermal contraction and subsequent sliding and opening of fractures. The experiments show that the preconditioner performs robustly for a wide range of simulation regimes governed by various fracture states, friction coefficients and Peclet numbers. The grid refinement experiments demonstrate that the preconditioner scales well in terms of GMRES iterations, in both two and three dimensions.
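For readers unfamiliar with Schur-complement preconditioning, the generic two-block case is written out below. This is the textbook factorization, assuming an invertible leading block; the block names are illustrative, and the paper's nested ordering of contact, momentum, mass and energy blocks is not reproduced here.

```latex
% Generic 2x2 block system and its Schur complement (illustrative notation).
\[
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},
\qquad
S = A_{22} - A_{21} A_{11}^{-1} A_{12},
\]
\[
A =
\begin{pmatrix} I & 0 \\ A_{21} A_{11}^{-1} & I \end{pmatrix}
\begin{pmatrix} A_{11} & 0 \\ 0 & S \end{pmatrix}
\begin{pmatrix} I & A_{11}^{-1} A_{12} \\ 0 & I \end{pmatrix}.
\]
% A preconditioner replaces A_{11}^{-1} and S^{-1} with cheap approximations;
% applying the same idea recursively inside S gives a nested scheme.
```

In practice the exact inverse of the leading block is never formed; the quality of the approximations to it and to S largely determines how many Krylov iterations are needed.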
Authors:Minghui Lin, Shu Wang, Xiang Wang, Jianhua Tang, Longbin Fu, Zhengrong Zuo, Nong Sang
Abstract:
Current multi-modal object re-identification approaches based on large-scale pre-trained backbones (e.g., ViT) have displayed remarkable progress and achieved excellent performance. However, these methods usually adopt the standard full fine-tuning paradigm, which requires the optimization of considerable backbone parameters, causing extensive computational and storage requirements. In this work, we propose an efficient prompt-tuning framework tailored for multi-modal object re-identification, dubbed DMPT, which freezes the main backbone and only optimizes several newly added decoupled modality-aware parameters. Specifically, we explicitly decouple the visual prompts into modality-specific prompts, which leverage prior modality knowledge from a powerful text encoder, and modality-independent semantic prompts, which extract semantic information from multi-modal inputs such as visible, near-infrared, and thermal-infrared. Built upon the extracted features, we further design a Prompt Inverse Bind (PromptIBind) strategy that employs bind prompts as a medium to connect the semantic prompt tokens of different modalities and facilitates the exchange of complementary multi-modal information, boosting final re-identification results. Experimental results on multiple common benchmarks demonstrate that DMPT achieves results competitive with existing state-of-the-art methods while fine-tuning only 6.5% of the backbone parameters.
Authors:Olaf Borsboom, Arnab Bhadra, Mauro Salazar, Theo Hofman
Abstract:
In this paper, we present geometric scaling models for axial flux motors (AFMs) to be used for in-wheel powertrain design optimization purposes. We first present a vehicle and powertrain model, with emphasis on the electric motor model. We construct the latter by formulating the analytical scaling laws for AFMs, based on the scaling concept of radial flux motors (RFMs) from the literature, specifically deriving the model of the main loss component in electric motors: the copper losses. We further present separate scaling models of motor parameters, losses and thermal models, as well as the torque limits and cost, as a function of the design variables. Second, we validate these scaling laws with several experiments leveraging high-fidelity finite-element simulations. Finally, we define an optimization problem that minimizes the energy consumption over a drive cycle, optimizing the motor size and transmission ratio for a wide range of electric vehicle powertrain topologies. In our study, we observe that the all-wheel drive topology equipped with in-wheel AFMs is the most efficient, but also incurs the highest material cost.
Authors:Soumyendu Sarkar, Antonio Guillen-Perez, Zachariah J Carmichael, Avisek Naug, Refik Mert Cam, Vineet Gundecha, Ashwin Ramesh Babu, Sahand Ghorbanpour, Ricardo Luna Gutierrez
Abstract:
Reducing energy consumption and carbon emissions in data centers by enabling real-time temperature prediction is critical for sustainability and operational efficiency. Achieving this requires accurate modeling of the 3D temperature field to capture airflow dynamics and thermal interactions under varying operating conditions. Traditional thermal CFD solvers, while accurate, are computationally expensive and require expert-crafted meshes and boundary conditions, making them impractical for real-time use. To address these limitations, we develop a vision-based surrogate modeling framework that operates directly on a 3D voxelized representation of the data center, incorporating server workloads, fan speeds, and HVAC temperature set points. We evaluate multiple architectures, including 3D CNN U-Net variants, a 3D Fourier Neural Operator, and 3D vision transformers, to map these thermal inputs to high-fidelity heat maps. Our results show that the surrogate models generalize across data center configurations and achieve up to 20,000x speedup (hundreds of milliseconds vs. hours). This fast and accurate estimation of hot spots and temperature distribution enables real-time cooling control and workload redistribution, leading to substantial energy savings (7%) and reduced carbon footprint.
Authors:Avisek Naug, Antonio Guillen, Vineet Kumar, Scott Greenwood, Wesley Brewer, Sahand Ghorbanpour, Ashwin Ramesh Babu, Vineet Gundecha, Ricardo Luna Gutierrez, Soumyendu Sarkar
Abstract:
Liquid cooling is critical for thermal management in high-density data centers with the rising AI workloads. However, machine learning-based controllers are essential to unlock greater energy efficiency and reliability, promoting sustainability. We present LC-Opt, a Sustainable Liquid Cooling (LC) benchmark environment, for reinforcement learning (RL) control strategies in energy-efficient liquid cooling of high-performance computing (HPC) systems. Built on the baseline of a high-fidelity digital twin of Oak Ridge National Lab's Frontier Supercomputer cooling system, LC-Opt provides detailed Modelica-based end-to-end models spanning site-level cooling towers to data center cabinets and server blade groups. RL agents optimize critical thermal controls like liquid supply temperature, flow rate, and granular valve actuation at the IT cabinet level, as well as cooling tower (CT) setpoints through a Gymnasium interface, with dynamic changes in workloads. This environment creates a multi-objective real-time optimization challenge balancing local thermal regulation and global energy efficiency, and also supports additional components like a heat recovery unit (HRU). We benchmark centralized and decentralized multi-agent RL approaches, demonstrate policy distillation into decision and regression trees for interpretable control, and explore LLM-based methods that explain control actions in natural language through an agentic mesh architecture designed to foster user trust and simplify system management. LC-Opt democratizes access to detailed, customizable liquid cooling models, enabling the ML community, operators, and vendors to develop sustainable data center liquid cooling control solutions.
Authors:Xiaolei Zhu, Xiaofei Jin, Ziyang Kang, Chonghui Sun, Junjie Feng, Dingwen Hu, Zengyi Wang, Hanyue Zhuang, Qian Zheng, Huajin Tang, Shi Gu, Xin Du, De Ma, Gang Pan
Abstract:
Neuromorphic computing promises brain-like efficiency, yet today's multi-chip systems scale over PCBs and incur orders-of-magnitude penalties in bandwidth, latency, and energy, undermining biological algorithms and system efficiency. We present DarwinWafer, a hyperscale system-on-wafer that replaces off-chip interconnects with wafer-scale, high-density integration of 64 Darwin3 chiplets on a 300 mm silicon interposer. A GALS NoC within each chiplet and an AER-based asynchronous wafer fabric with hierarchical time-step synchronization provide low-latency, coherent operation across the wafer. Each chiplet implements 2.35 M neurons and 0.1 B synapses, yielding 0.15 B neurons and 6.4 B synapses per wafer. At 333 MHz and 0.8 V, DarwinWafer consumes ~100 W and achieves 4.9 pJ/SOP, with 64 TSOPS peak throughput (0.64 TSOPS/W). Realization is enabled by a holistic chiplet-interposer co-design flow (including an in-house interposer-bump planner with early SI/PI and electro-thermal closure) and a warpage-tolerant assembly that fans out I/O via PCBlets and compliant pogo-pin connections, enabling robust, demountable wafer-to-board integration. Measurements confirm 10 mV supply droop and a uniform thermal profile (34-36 °C) under ~100 W. Application studies demonstrate whole-brain simulations: two zebrafish brains per chiplet with high connectivity fidelity (Spearman r = 0.896) and a mouse brain mapped across 32 chiplets (r = 0.645). To our knowledge, DarwinWafer represents a pioneering demonstration of wafer-scale neuromorphic computing, establishing a viable and scalable path toward large-scale, brain-like computation on silicon by replacing PCB-level interconnects with high-density, on-wafer integration.
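The wafer-level figures quoted in this abstract follow from the per-chiplet numbers by simple multiplication. A minimal Python sketch of that arithmetic, using only values stated in the abstract (the variable and function names are illustrative, and the power figure is the approximate "~100 W"):

```python
# Per-chiplet figures and operating point as stated in the abstract.
CHIPLETS = 64
NEURONS_PER_CHIPLET = 2.35e6   # 2.35 M neurons
SYNAPSES_PER_CHIPLET = 0.1e9   # 0.1 B synapses
PEAK_TSOPS = 64.0              # peak throughput in TSOPS
POWER_W = 100.0                # approximate ("~100 W")


def wafer_totals():
    """Return (neurons, synapses, TSOPS per watt) for the whole wafer."""
    neurons = CHIPLETS * NEURONS_PER_CHIPLET
    synapses = CHIPLETS * SYNAPSES_PER_CHIPLET
    tsops_per_watt = PEAK_TSOPS / POWER_W
    return neurons, synapses, tsops_per_watt


neurons, synapses, efficiency = wafer_totals()
print(f"{neurons / 1e9:.2f} B neurons, {synapses / 1e9:.1f} B synapses, "
      f"{efficiency:.2f} TSOPS/W")
```

The exact neuron count is 150.4 M, which the abstract rounds to 0.15 B.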
Authors:Valeria Zitz, Michael Küttner, Jonas Hummel, Michael T. Knierim, Michael Beigl, Tobias Röddiger
Abstract:
Maintaining thermal comfort in shared indoor environments remains challenging, as centralized HVAC systems are slow to adapt and standardized to group norms. Cold exposure not only reduces subjective comfort but can impair cognitive performance, particularly under moderate to severe cold stress. Personal Comfort Systems (PCS) have shown promise by providing localized heating, yet many designs target distal body parts with low thermosensitivity and often lack portability. In this work, we investigate whether targeted thermal stimulation using in-ear worn devices can manipulate thermal perception and enhance thermal comfort. We present Heatables, a novel in-ear wearable that emits Near-Infrared (NIR) and Infrared (IR) radiation via integrated LEDs to deliver localized optical heating. This approach leverages NIR-IR's ability to penetrate deeper tissues, offering advantages over traditional resistive heating limited to surface warming. In a placebo-controlled study with 24 participants, each exposed for 150 minutes in a cool office environment (approximately 17.5 degrees Celsius) to simulate sustained cold stress during typical sedentary office activities, Heatables significantly increased the perceived ambient temperature by around 1.5 degrees Celsius and delayed cold discomfort. Importantly, thermal benefits extended beyond the ear region, improving both whole-body comfort and thermal acceptability. These findings position in-ear NIR-IR-LED-based stimulation as a promising modality for unobtrusive thermal comfort enhancement in everyday contexts.
Authors:M-Mahdi Naddaf-Sh, Andrew Lee, Kin Yen, Eemon Amini, Iman Soltani
Abstract:
This study investigates the potential of infrared (IR) camera technology to enhance driver safety for emergency vehicles operating in low-visibility conditions, particularly at night and in dense fog. Such environments significantly increase the risk of collisions, especially for tow trucks and snowplows that must remain operational in challenging conditions. Conventional driver assistance systems often struggle under these conditions due to limited visibility. In contrast, IR cameras, which detect the thermal signatures of obstacles, offer a promising alternative. The evaluation combines controlled laboratory experiments, real-world field tests, and surveys of emergency vehicle operators. In addition to assessing detection performance, the study examines the feasibility of retrofitting existing Department of Transportation (DoT) fleets with cost-effective IR-based driver assistance systems. Results underscore the utility of IR technology in enhancing driver awareness and provide data-driven recommendations for scalable deployment across legacy emergency vehicle fleets.
Authors:Jinchang Zhang, Zijun Li, Guoyu Lu
Abstract:
Depth-guided multimodal fusion combines depth information from visible and infrared images, significantly enhancing the performance of 3D reconstruction and robotics applications. Existing thermal-visible image fusion mainly focuses on detection tasks, ignoring other critical information such as depth. By addressing the limitations of single modalities in low-light and complex environments, the depth information from fused images not only generates more accurate point cloud data, improving the completeness and precision of 3D reconstruction, but also provides comprehensive scene understanding for robot navigation, localization, and environmental perception. This supports precise recognition and efficient operations in applications such as autonomous driving and rescue missions. We introduce a text-guided and depth-driven infrared and visible image fusion network. The model consists of an image fusion branch for extracting multi-channel complementary information through a diffusion model, equipped with a text-guided module, and two auxiliary depth estimation branches. The fusion branch uses CLIP to extract semantic information and parameters from depth-enriched image descriptions to guide the diffusion model in extracting multi-channel features and generating fused images. These fused images are then input into the depth estimation branches to calculate depth-driven loss, optimizing the image fusion network. This framework aims to integrate vision-language and depth to directly generate color-fused images from multimodal inputs.
Authors:Yan Zhang, Wen Yang, Chang Xu, Qian Hu, Fang Xu, Gui-Song Xia
Abstract:
Drone-based RGBT object detection plays a crucial role in many around-the-clock applications. However, real-world drone-viewed RGBT data suffers from the prominent position shift problem, i.e., the position of a tiny object differs greatly in different modalities. For instance, a slight deviation of a tiny object in the thermal modality will cause it to drift away from its own main body in the RGB modality. Considering RGBT data are usually labeled on one modality (reference), this will cause the unlabeled modality (sensed) to lack accurate supervision signals and prevent the detector from learning a good representation. Moreover, the mismatch of the corresponding feature point between the modalities will make the fused features confusing for the detection head. In this paper, we propose to cast the cross-modality box shift issue as the label noise problem and address it on the fly via a novel Mean Teacher-based Cross-modality Box Correction head ensemble (CBC). In this way, the network can learn more informative representations for both modalities. Furthermore, to alleviate the feature map mismatch problem in RGBT fusion, we devise a Shifted Window-Based Cascaded Alignment (SWCA) module. SWCA mines long-range dependencies between the spatially unaligned features inside shifted windows and aligns the sensed features with the reference ones in a cascaded manner. Extensive experiments on two drone-based RGBT object detection datasets demonstrate that the correction results are both visually and quantitatively favorable, thereby improving the detection performance. In particular, our CBC module boosts the precision of the sensed modality ground truth by 25.52 aSim points. Overall, the proposed detector achieves an mAP50 of 43.55 points on RGBTDronePerson and surpasses a state-of-the-art method by 8.6 mAP50 on a shift subset of the DroneVehicle dataset. The code and data will be made publicly available.
Authors:Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Esha Choukse, Haoran Qiu, Rodrigo Fonseca, Josep Torrellas, Ricardo Bianchini
Abstract:
The rising demand for generative large language models (LLMs) poses challenges for thermal and power management in cloud datacenters. Traditional techniques often are inadequate for LLM inference due to the fine-grained, millisecond-scale execution phases, each with distinct performance, thermal, and power profiles. Additionally, LLM inference workloads are sensitive to various configuration parameters (e.g., model parallelism, size, and quantization) that involve trade-offs between performance, temperature, power, and output quality. Moreover, clouds often co-locate SaaS and IaaS workloads, each with different levels of visibility and flexibility. We propose TAPAS, a thermal- and power-aware framework designed for LLM inference clusters in the cloud. TAPAS enhances cooling and power oversubscription capabilities, reducing the total cost of ownership (TCO) while effectively handling emergencies (e.g., cooling and power failures). The system leverages historical temperature and power data, along with the adaptability of SaaS workloads, to: (1) efficiently place new GPU workload VMs within cooling and power constraints, (2) route LLM inference requests across SaaS VMs, and (3) reconfigure SaaS VMs to manage load spikes and emergency situations. Our evaluation on a large GPU cluster demonstrates significant reductions in thermal and power throttling events, boosting system efficiency.
Authors:Samarth Chopra, Fernando Cladera, Varun Murali, Vijay Kumar
Abstract:
Neural Radiance Fields (NeRFs) have shown significant promise in 3D scene reconstruction and novel view synthesis. In agricultural settings, NeRFs can serve as digital twins, providing critical information about fruit detection for yield estimation and other important metrics for farmers. However, traditional NeRFs are not robust to challenging lighting conditions, such as low light, extremely bright light and varying lighting. To address these issues, this work leverages three different sensors: an RGB camera, an event camera and a thermal camera. Our RGB scene reconstruction improves PSNR and SSIM by +2.06 dB and +8.3%, respectively. Our cross-spectral scene reconstruction enhances downstream fruit detection by +43.0% in mAP50 and +61.1% in mAP50-95. The integration of additional sensors leads to a more robust and informative NeRF. We demonstrate that our multi-modal system yields high-quality photo-realistic reconstructions under various tree canopy covers and at different times of the day. This work results in a resilient NeRF capable of performing well in visibly degraded scenarios, as well as a learnt cross-spectral representation that is used for automated fruit detection.
Authors:Ziang Yin, Hongjian Zhou, Nicholas Gangi, Meng Zhang, Jeff Zhang, Zhaoran Rena Huang, Jiaqi Gu
Abstract:
In this work, we identify three considerations that are essential for realizing practical photonic AI systems at scale: (1) dynamic tensor operation support for modern models rather than only weight-static kernels, especially for attention/Transformer-style workloads; (2) systematic management of conversion, control, and data-movement overheads, where multiplexing and dataflow must amortize electronic costs instead of letting ADC/DAC and I/O dominate; and (3) robustness under hardware non-idealities that become more severe as integration density grows. To study these coupled tradeoffs quantitatively, and to ensure they remain meaningful under real implementation constraints, we build a cross-layer toolchain that supports photonic AI design from early exploration to physical realization. SimPhony provides implementation-aware modeling and rapid cross-layer evaluation, translating physical costs into system-level metrics so architectural decisions are grounded in realistic assumptions. ADEPT and ADEPT-Z enable end-to-end circuit and topology exploration, connecting system objectives to feasible photonic fabrics under practical device and circuit constraints. Finally, Apollo and LiDAR provide scalable photonic physical design automation, turning candidate circuits into manufacturable layouts while accounting for routing, thermal, and crosstalk constraints.
Authors:Thomas Krug, Fabian Raisch, Dominik Aimer, Markus Wirnsberger, Ferdinand Sigg, Felix Koch, Benjamin Schäfer, Benjamin Tischler
Abstract:
Data-driven modeling of building thermal dynamics is emerging as an increasingly important field of research for large-scale intelligent building control. However, research in data-driven modeling using machine learning (ML) techniques requires massive amounts of thermal building data, which is not easily available. Neither empirical public datasets nor existing data generators meet the needs of ML research in terms of data quality and quantity. Moreover, existing data generation approaches typically require expert knowledge in building simulation. To fill this gap, we present a thermal building data generation framework which we call BuilDa. BuilDa is designed to produce synthetic data of adequate quality and quantity for ML research. The framework does not require profound building simulation knowledge to generate large volumes of data. BuilDa uses a single-zone Modelica model that is exported as a Functional Mock-up Unit (FMU) and simulated in Python. We demonstrate BuilDa by generating data and utilizing it for a transfer learning study involving the fine-tuning of 486 data-driven models.
Authors:Lyes Saad Saoud, Irfan Hussain
Abstract:
Biomimetic intelligence and robotics are transforming field ecology by enabling lifelike robotic surrogates that interact naturally with animals under real-world conditions. Studying avian behavior in the wild remains challenging due to the need for highly realistic morphology, durable outdoor operation, and intelligent perception that can adapt to uncontrolled environments. We present a next-generation bio-inspired robotic platform that replicates the morphology and visual appearance of the female Houbara bustard to support controlled ethological studies and conservation-oriented field research. The system introduces a fully digitally replicable fabrication workflow that combines high-resolution structured-light 3D scanning, parametric CAD modelling, articulated 3D printing, and photorealistic UV-textured vinyl finishing to achieve anatomically accurate and durable robotic surrogates. A six-wheeled rocker-bogie chassis ensures stable mobility on sand and irregular terrain, while an embedded NVIDIA Jetson module enables real-time RGB and thermal perception, lightweight YOLO-based detection, and an autonomous visual servoing loop that aligns the robot's head toward detected targets without human intervention. A lightweight thermal-visible fusion module enhances perception in low-light conditions. Field trials in desert aviaries demonstrated reliable real-time operation at 15 to 22 FPS with latency under 100 ms and confirmed that the platform elicits natural recognition and interactive responses from live Houbara bustards under harsh outdoor conditions. This integrated framework advances biomimetic field robotics by uniting reproducible digital fabrication, embodied visual intelligence, and ecological validation, providing a transferable blueprint for animal-robot interaction research, conservation robotics, and public engagement.
Authors:Fabian Raisch, Max Langtry, Felix Koch, Ruchi Choudhary, Christoph Goebel, Benjamin Tischler
Abstract:
Transfer Learning (TL) is currently the most effective approach for modeling building thermal dynamics when only limited data are available. TL uses a pretrained model that is fine-tuned to a specific target building. However, it remains unclear how to proceed after initial fine-tuning, as more operational measurement data are collected over time. This challenge becomes even more complex when the dynamics of the building change, for example, after a retrofit or a change in occupancy. In Machine Learning literature, Continual Learning (CL) methods are used to update models of changing systems. TL approaches can also address this challenge by reusing the pretrained model at each update step and fine-tuning it with new measurement data. A comprehensive study on how to incorporate new measurement data over time to improve prediction accuracy and address the challenges of concept drifts (changes in dynamics) for building thermal dynamics is still missing.
Therefore, this study compares several CL and TL strategies, as well as a model trained from scratch, for thermal dynamics modeling during building operation. The methods are evaluated using 5-7 years of simulated data representative of single-family houses in Central Europe, including scenarios with concept drifts from retrofits and changes in occupancy. We propose a CL strategy, Seasonal Memory Learning (SML), that provides greater accuracy improvements than existing CL and TL methods, while maintaining low computational effort. SML outperformed the benchmark of initial fine-tuning by 28.1% without concept drifts and 34.9% with concept drifts.
Authors:Thomas Krug, Fabian Raisch, Dominik Aimer, Markus Wirnsberger, Ferdinand Sigg, Benjamin Schäfer, Benjamin Tischler
Abstract:
Transfer learning (TL) can improve data-driven modeling of building thermal dynamics. Therefore, many new TL research areas emerge in the field, such as selecting the right source model for TL. However, these research directions require massive amounts of thermal building data which is lacking presently. Neither public datasets nor existing data generators meet the needs of TL research in terms of data quality and quantity. Moreover, existing data generation approaches typically require expert knowledge in building simulation. We present BuilDa, a thermal building data generation framework for producing synthetic data of adequate quality and quantity for TL research. The framework does not require profound building simulation knowledge to generate large volumes of data. BuilDa uses a single-zone Modelica model that is exported as a Functional Mock-up Unit (FMU) and simulated in Python. We demonstrate BuilDa by generating data and utilizing it for pretraining and fine-tuning TL models.
Authors:Javier Penuela, Sahar Moghimian Hoosh, Ilia Kamyshev, Aldo Bischi, Henni Ouerdane
Abstract:
The optimal management of a building's microclimate to satisfy the occupants' needs and objectives in terms of comfort, energy efficiency, and costs is particularly challenging. This complexity arises from the non-linear, time-dependent interactions among all the variables of the control problem and the changing internal and external constraints. Focusing on the accurate modeling of the indoor temperature, we propose a data-driven approach to address this challenge. We account for thermal inertia, non-linear effects, small perturbations of the indoor climate dynamics caused by ventilation and weather variations, as well as for the stochastic nature of the control system due to the observed noise in the input signal. Since the prohibitive cost of quality data acquisition and processing limits the implementation of data-driven approaches for real-life problems, we applied a method that merges several Bayesian machine learning and deep learning architectures that are suitable for predicting complex system dynamics, while relaxing the dataset quality requirements. Our framework includes a built-in deep Kalman filter, which makes it deployable even with low-accuracy temperature sensors. It achieves state-of-the-art performance, best performing with a 150-minute prediction horizon with an RMSE of 0.2455, an MAE of 0.162, and an $R^2$ of 0.926. The model's performance remains consistent even when exposed to highly noisy data. Finally, we show how our approach can be extended to other applications including demand response event duration prediction and equipment failure detection.
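The three scores reported in this abstract are standard regression metrics. As a reminder of how they are defined, here is a minimal pure-Python sketch; the function name and the sample temperatures are hypothetical and are not taken from the paper, and the abstract does not state the units of its RMSE and MAE values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    rmse = math.sqrt(ss_res / n)                    # root mean squared error
    mae = sum(abs(e) for e in errors) / n           # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination
    return rmse, mae, r2


# Hypothetical indoor temperatures in degrees Celsius (illustration only).
measured = [21.0, 21.4, 21.9, 22.3, 22.0]
predicted = [21.1, 21.3, 22.1, 22.2, 21.8]
rmse, mae, r2 = regression_metrics(measured, predicted)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

RMSE penalizes large errors more heavily than MAE, so an RMSE noticeably above the MAE, as in the reported 0.2455 versus 0.162, indicates that some prediction errors are considerably larger than the typical one.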
Authors:Jialun Pei, Diandian Guo, Donghui Yang, Zhixi Li, Yuxin Feng, Long Ma, Bo Du, Pheng-Ann Heng
Abstract:
In laparoscopic surgery, a clear and high-quality visual field is critical for surgeons to make accurate intraoperative decisions. However, persistent visual degradation, including smoke generated by energy devices, lens fogging from thermal gradients, and lens contamination due to blood or tissue fluid splashes during surgical procedures, severely impairs visual clarity. These degradations can seriously hinder surgical workflow and pose risks to patient safety. To systematically investigate and address various forms of surgical scene degradation, we introduce a real-world open-source surgical image restoration dataset covering laparoscopic environments, called SurgClean, which involves multi-type image restoration tasks, e.g., desmoking, defogging, and desplashing. SurgClean comprises 1,020 images with diverse degradation types and corresponding paired reference labels. Based on SurgClean, we establish a standardized evaluation benchmark and report the performance of 22 representative image restoration approaches, including 12 generic and 10 task-specific ones. Experimental results reveal substantial performance gaps relative to clinical requirements, highlighting a critical opportunity for algorithm advancements in intelligent surgical restoration. Furthermore, we explore the degradation discrepancies between surgical and natural scenes from structural perception and semantic understanding perspectives, providing fundamental insights for domain-specific image restoration research. Our work aims to empower the capabilities of restoration algorithms to increase surgical environments and improve the efficiency of clinical procedures.
Authors:Ge Meng, Zhongnan Cai, Jingyan Tu, Yingying Wang, Chenxin Li, Yue Huang, Xinghao Ding
Abstract:
Panchromatic (PAN)-assisted Dual-Camera Compressive Hyperspectral Imaging (DCCHI) is a key technology in snapshot hyperspectral imaging. Existing research primarily focuses on exploring spectral information from 2D compressive measurements and spatial information from PAN images in an explicit manner, leading to a bottleneck in hyperspectral image (HSI) reconstruction. Various physical factors, such as temperature, emissivity, and multiple reflections between objects, play a critical role in the process of a sensor acquiring hyperspectral thermal signals. Inspired by this, we attempt to investigate the interrelationships between physical properties to provide deeper theoretical insights for HSI reconstruction. In this paper, we propose a Physics-Informed Cross-Modal State Space Model Network (PCMamba) for DCCHI, which incorporates the forward physical imaging process of HSI into the linear complexity of Mamba to facilitate lightweight and high-quality HSI reconstruction. Specifically, we analyze the imaging process of hyperspectral thermal signals to enable the network to disentangle the three key physical properties: temperature, emissivity, and texture. By fully exploiting the potential information embedded in 2D measurements and PAN images, the HSIs are reconstructed through a physics-driven synthesis process. Furthermore, we design a Cross-Modal Scanning Mamba Block (CSMB) that introduces inter-modal pixel-wise interaction with positional inductive bias by cross-scanning the backbone features and PAN features. Extensive experiments conducted on both real and simulated datasets demonstrate that our method significantly outperforms state-of-the-art methods in both quantitative and qualitative metrics.
Authors:Arslan Mazitov, Filippo Bigi, Matthias Kellner, Paolo Pegolo, Davide Tisi, Guillaume Fraux, Sergey Pozdnyakov, Philip Loche, Michele Ceriotti
Abstract:
Machine-learning interatomic potentials (MLIPs) have greatly extended the reach of atomic-scale simulations, offering the accuracy of first-principles calculations at a fraction of the cost. Leveraging large quantum mechanical databases and expressive architectures, recent "universal" models deliver qualitative accuracy across the periodic table but are often biased toward low-energy configurations. We introduce PET-MAD, a generally applicable MLIP trained on a dataset combining stable inorganic and organic solids, systematically modified to enhance atomic diversity. Using a moderate but highly consistent level of electronic-structure theory, we assess PET-MAD's accuracy on established benchmarks and advanced simulations of six materials. Despite the small training set and lightweight architecture, PET-MAD is competitive with state-of-the-art MLIPs for inorganic solids, while also being reliable for molecules, organic materials, and surfaces. It is stable and fast, enabling the near-quantitative study of thermal and quantum mechanical fluctuations, functional properties, and phase transitions out of the box. It can be efficiently fine-tuned to deliver full quantum mechanical accuracy with a minimal number of targeted calculations.
Authors:Fabian Raisch, Thomas Krug, Christoph Goebel, Benjamin Tischler
Abstract:
Transfer Learning (TL) is an emerging field in modeling building thermal dynamics. This method reduces the data required for a data-driven model of a target building by leveraging knowledge from a source building. Consequently, it enables the creation of data-efficient models that can be used for advanced control and fault detection & diagnosis. A major limitation of the TL approach is its inconsistent performance across different sources. Although accurate source-building selection for a target is crucial, it remains a persistent challenge. We present GenTL, a general transfer learning model for single-family houses in Central Europe. GenTL can be efficiently fine-tuned to a large variety of target buildings. It is a Long Short-Term Memory (LSTM) network pretrained with data from 450 different buildings. The general transfer learning model eliminates the need for source-building selection by serving as a universal source for fine-tuning. Comparative analysis with conventional single-source to single-target TL demonstrates the efficacy and reliability of the general pretraining approach. Fine-tuning GenTL on 144 target buildings reveals an average prediction error (RMSE) reduction of 42.1% compared to fine-tuning single-source models.
Authors:Mathis Bode, Damian Alvarez, Paul Fischer, Christos E. Frouzakis, Jens Henrik Göbbert, Joseph A. Insley, Yu-Hsiang Lan, Victor A. Mateevitsi, Misun Min, Michael E. Papka, Silvio Rizzi, Roshan J. Samuel, Jörg Schumacher
Abstract:
Turbulent heat and momentum transfer processes due to thermal convection cover many scales and are of great importance for several natural and technical flows. One consequence is that a fully resolved three-dimensional analysis of these turbulent transfers at high Rayleigh numbers, which includes the boundary layers, is possible only using supercomputers. The visualization of these dynamics poses an additional hurdle since the thermal and viscous boundary layers in thermal convection fluctuate strongly. In order to track these fluctuations continuously, data must be tapped at high frequency for visualization, which is difficult to achieve using conventional methods. This paper makes two main contributions in this context. First, it discusses the simulations of turbulent Rayleigh-Bénard convection up to Rayleigh numbers of $Ra=10^{12}$ computed with NekRS on GPUs. The largest simulation was run on 840 nodes with 3360 GPUs on the JUWELS Booster supercomputer. Second, an in-situ workflow using ASCENT is presented, which was successfully used to visualize the high-frequency turbulent fluctuations.
Authors:Max Langtry, Chaoqun Zhuang, Rebecca Ward, Nikolas Makasis, Monika J. Kreitmair, Zack Xuereb Conti, Domenic Di Francesco, Ruchi Choudhary
Abstract:
The use of data collection to support decision making through the reduction of uncertainty is ubiquitous in the management, operation, and design of building energy systems. However, no existing studies in the building energy systems literature have quantified the economic benefits of data collection strategies to determine whether they are worth their cost. This work demonstrates that Value of Information analysis (VoI), a Bayesian Decision Analysis framework, provides a suitable methodology for quantifying the benefits of data collection. Three example decision problems in building energy systems are studied: air-source heat pump maintenance scheduling, ventilation scheduling for indoor air quality, and ground-source heat pump system design. Smart meters, occupancy monitoring systems, and ground thermal tests are shown to be economically beneficial for supporting these decisions respectively. It is proposed that further study of VoI in building energy systems would allow expenditure on data collection to be economised and prioritised, avoiding wastage.
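As a rough illustration of the kind of quantity a Value of Information analysis produces, the sketch below computes the expected value of perfect information (EVPI) for a discrete decision problem. The states, actions, probabilities, and payoffs are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the paper's own case studies may use a different VoI variant.

```python
def expected_value_of_perfect_information(prior, utility):
    """EVPI for a discrete decision problem.

    prior:   dict mapping state -> probability (hypothetical values).
    utility: dict mapping action -> dict mapping state -> payoff.
    """
    # Best expected payoff when the action is chosen before the state is known.
    prior_value = max(
        sum(prior[s] * payoffs[s] for s in prior) for payoffs in utility.values()
    )
    # Expected payoff when the state is revealed first and the best action follows.
    informed_value = sum(
        prior[s] * max(payoffs[s] for payoffs in utility.values()) for s in prior
    )
    return informed_value - prior_value


# Hypothetical maintenance decision: costs are negative payoffs.
prior = {"healthy": 0.75, "degraded": 0.25}
utility = {
    "maintain": {"healthy": -100.0, "degraded": -100.0},
    "skip": {"healthy": 0.0, "degraded": -600.0},
}
evpi = expected_value_of_perfect_information(prior, utility)
print(evpi)
```

Under these made-up numbers, a measurement would be worth buying only if it costs less than the computed EVPI, which is the comparison the abstract describes as deciding whether data collection is "worth its cost".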
Authors:Rudra Biswas, Jiahui Duan, Shan Deng, Xuezhong Niu, Yixin Qin, Prapti Panigrahi, Varun Parekh, Rajiv Joshi, Kai Ni, Vijaykrishnan Narayanan
Abstract:
This work presents a novel approach to configure 2T-nC ferroelectric RAM (FeRAM) for performing single cell logic-in-memory operations, highlighting its advantages in energy-efficient computation over conventional DRAM-based approaches. Unlike conventional 1T-1C dynamic RAM (DRAM), which incurs refresh overhead, 2T-nC FeRAM offers a promising alternative as a non-volatile memory solution with low energy consumption. Our key findings include the potential of quasi-nondestructive readout (QNRO) sensing in 2T-nC FeRAM for logic-in-memory (LiM) applications, demonstrating its inherent capability to perform inverting logic without requiring external modifications, a feature absent in traditional 1T-1C DRAM. We successfully implement the MINORITY function within a single cell of 2T-nC FeRAM, enabling universal NAND and NOR logic, validated through SPICE simulations and experimental data. Additionally, the research investigates the feasibility of 3D integration with 2T-nC FeRAM, showing substantial improvements in storage and computational density, facilitating bulk-bitwise computation. Our evaluation of eight real-world, data-intensive applications reveals that 2T-nC FeRAM achieves 2x higher performance and 2.5x lower energy consumption compared to DRAM. Furthermore, the thermal stability of stacked 2T-nC FeRAM is validated, confirming its reliable operation when integrated on a compute die. These findings emphasize the advantages of 2T-nC FeRAM for LiM, offering superior performance and energy efficiency over conventional DRAM.
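The claim that a single MINORITY gate yields both NAND and NOR can be checked with a small truth-table sketch. This models only the Boolean behavior; the function names are illustrative and nothing here reflects the cell-level circuit or the authors' implementation.

```python
from itertools import product


def minority(a: int, b: int, c: int) -> int:
    """Return 1 when fewer than two of the three input bits are 1."""
    return int(a + b + c < 2)


def nand(a: int, b: int) -> int:
    # Tying the third input to 0 turns MINORITY into NAND.
    return minority(a, b, 0)


def nor(a: int, b: int) -> int:
    # Tying the third input to 1 turns MINORITY into NOR.
    return minority(a, b, 1)


for a, b in product((0, 1), repeat=2):
    print(a, b, "NAND:", nand(a, b), "NOR:", nor(a, b))
```

Because NAND (or NOR) alone is functionally complete, any Boolean function can in principle be composed from such cells.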
Authors:Jordi Grau-Haro, Ruben Ribes-Serrano, Javier Naranjo-Alcazar, Marta Garcia-Ballesteros, Pedro Zuccarello
Abstract:
Convolutional Neural Networks (CNNs) have demonstrated exceptional performance in audio tagging tasks. However, deploying these models on resource-constrained devices like the Raspberry Pi poses challenges related to computational efficiency and thermal management. In this paper, a comprehensive evaluation of multiple convolutional neural network (CNN) architectures for audio tagging on the Raspberry Pi is conducted, encompassing all 1D and 2D models from the Pretrained Audio Neural Networks (PANNs) framework, a ConvNeXt-based model adapted for audio classification, as well as MobileNetV3 architectures. In addition, two PANNs-derived networks, CNN9 and CNN13, recently proposed, are also evaluated. To enhance deployment efficiency and portability across diverse hardware platforms, all models are converted to the Open Neural Network Exchange (ONNX) format. Unlike previous works that focus on a single model, our analysis encompasses a broader range of architectures and involves continuous 24-hour inference sessions to assess performance stability. Our experiments reveal that, with appropriate model selection and optimization, it is possible to maintain consistent inference latency and manage thermal behavior effectively over extended periods. These findings provide valuable insights for deploying audio tagging models in real-world edge computing scenarios.
Authors:Piotr Białas, Piotr Korcyl, Tomasz Stebel, Dawid Zapolski
Abstract:
We present an application of autoregressive neural networks to Monte Carlo simulations of quantum spin chains using the correspondence with classical two-dimensional spin systems. We use a hierarchy of neural networks capable of estimating conditional probabilities of consecutive spins to evaluate elements of reduced density matrices directly. Using the Ising chain as an example, we calculate the continuum limit of the ground state's von Neumann and Rényi bipartite entanglement entropies of an interval built of up to 5 spins. We demonstrate that our architecture is able to estimate all the needed matrix elements with just a single training for a fixed time discretization and lattice volume. Our method can be applied to other types of spin chains, possibly with defects, as well as to estimating entanglement entropies of thermal states at non-zero temperature.
Authors:Lukas Meyer, Josef Grün, Maximilian Weiherer, Bernhard Egger, Marc Stamminger, Linus Franke
Abstract:
We present MS-Splatting -- a multi-spectral 3D Gaussian Splatting (3DGS) framework that is able to generate multi-view consistent novel views from images of multiple, independent cameras with different spectral domains. In contrast to previous approaches, our method does not require cross-modal camera calibration and is versatile enough to model a variety of different spectra, including thermal and near-infrared, without any algorithmic changes.
Unlike existing 3DGS-based frameworks that treat each modality separately (by optimizing per-channel spherical harmonics) and therefore fail to exploit the underlying spectral and spatial correlations, our method leverages a novel neural color representation that encodes multi-spectral information into a learned, compact, per-splat feature embedding. A shallow multi-layer perceptron (MLP) then decodes this embedding to obtain spectral color values, enabling joint learning of all bands within a unified representation.
Our experiments show that this simple yet effective strategy is able to improve multi-spectral rendering quality, while also leading to improved per-spectrum rendering quality over state-of-the-art methods. We demonstrate the effectiveness of this new technique in agricultural applications to render vegetation indices, such as normalized difference vegetation index (NDVI).
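For readers unfamiliar with the index mentioned above, NDVI is computed per pixel from a near-infrared band and a red band as (NIR - Red) / (NIR + Red). The following is a minimal sketch on plain nested lists; the band values are invented, and returning 0.0 for pixels where both bands are zero is an assumed convention rather than something the abstract specifies.

```python
def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red) for two equally sized 2D bands."""
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            denom = n + r
            # Assumed convention: report 0.0 where both bands are zero.
            row.append((n - r) / denom if denom != 0 else 0.0)
        result.append(row)
    return result


# Hypothetical reflectance values for a 2x2 patch.
nir_band = [[0.5, 0.25], [0.0, 0.75]]
red_band = [[0.25, 0.25], [0.0, 0.25]]
index_map = ndvi(nir_band, red_band)
print(index_map)
```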
Authors:Nikolaos Anastasiou, Spyros Kondylatos, Ioannis Papoutsis
Abstract:
Accurate prediction of wildfire spread is crucial for effective risk management, emergency response, and strategic resource allocation. In this study, we present a deep learning (DL)-based framework for forecasting the final extent of burned areas, using data available at the time of ignition. We leverage a spatio-temporal dataset that covers the Mediterranean region from 2006 to 2022, incorporating remote sensing data, meteorological observations, vegetation maps, land cover classifications, anthropogenic factors, topography data, and thermal anomalies. To evaluate the influence of temporal context, we conduct an ablation study examining how the inclusion of pre- and post-ignition data affects model performance, benchmarking the temporal-aware DL models against a baseline trained exclusively on ignition-day inputs. Our results indicate that multi-day observational data substantially improve predictive accuracy. Particularly, the best-performing model, incorporating a temporal window of four days before to five days after ignition, improves both the F1 score and the Intersection over Union by almost 5% in comparison to the baseline on the test dataset. We publicly release our dataset and models to enhance research into data-driven approaches for wildfire modeling and response.
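The two reported metrics are closely related for binary burned-area masks. The sketch below computes both from flattened 0/1 masks; the masks are made up for illustration, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def f1_and_iou(pred, target):
    """Return (F1, IoU) for two equally long sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    f1_denom = 2 * tp + fp + fn
    iou_denom = tp + fp + fn
    # Assumed convention: both scores are 0.0 when nothing is predicted or labeled.
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    iou = tp / iou_denom if iou_denom else 0.0
    return f1, iou


# Hypothetical flattened masks: 1 = burned, 0 = unburned.
predicted = [1, 1, 1, 0, 0, 0]
observed = [1, 1, 0, 1, 0, 0]
f1, iou = f1_and_iou(predicted, observed)
print(f1, iou)
```

For a single pair of masks the two scores satisfy F1 = 2 IoU / (1 + IoU), so they rank predictions identically; that identity does not carry over to scores averaged across many samples.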
Authors:Siyang Jiang, Bufang Yang, Lilin Xu, Mu Yuan, Yeerzhati Abudunuer, Kaiwei Liu, Liekang Zeng, Hongkai Chen, Zhenyu Yan, Xiaofan Jiang, Guoliang Xing
Abstract:
The rapid advancements in Large Vision Language Models (LVLMs) offer the potential to surpass conventional labeling by generating richer, more detailed descriptions for on-device human behavior understanding (HBU) in low-resolution vision systems, such as depth, thermal, and infrared. However, existing LVLM approaches are unable to understand low-resolution data well, as they are primarily designed for high-resolution data, such as RGB images. A quick fix is to caption a large amount of low-resolution data, but this requires significant labor-intensive annotation effort. In this paper, we propose a novel, labor-saving system, Llambda, designed to support low-resolution HBU. The core idea is to leverage limited labeled data and a large amount of unlabeled data to guide LLMs in generating informative captions, which can be combined with raw data to effectively fine-tune LVLMs for understanding low-resolution videos in HBU. First, we propose a Contrastive-Oriented Data Labeler, which can capture behavior-relevant information from long, low-resolution videos and generate high-quality pseudo labels for unlabeled data via contrastive learning. Second, we propose a Physical-Knowledge Guided Captioner, which utilizes spatial and temporal consistency checks to mitigate errors in pseudo labels. It can thereby improve LLMs' understanding of sequential data and generate high-quality video captions. Finally, to ensure on-device deployability, we employ LoRA-based efficient fine-tuning to adapt LVLMs for low-resolution data. We evaluate Llambda using a region-scale real-world testbed and three distinct low-resolution datasets, and the experiments show that Llambda outperforms several state-of-the-art LVLM systems by up to $40.03\%$ in average BERTScore.
Authors:Varun Darshana Parekh, Zachary Wyatt Hazenstab, Srivatsa Rangachar Srinivasa, Krishnendu Chakrabarty, Kai Ni, Vijaykrishnan Narayanan
Abstract:
Chiplet-based architectures and advanced packaging have emerged as transformative approaches in semiconductor design. While conventional physical design for 2.5D heterogeneous systems typically prioritizes wirelength reduction through tight chiplet packing, this strategy creates thermal bottlenecks and intensifies coefficient of thermal expansion (CTE) mismatches, compromising long-term reliability. Addressing these challenges requires holistic consideration of thermal performance, mechanical stress, and interconnect efficiency. We introduce STAMP-2.5D, the first automated floorplanning methodology that simultaneously optimizes these critical factors. Our approach employs finite element analysis to simulate temperature distributions and stress profiles across chiplet configurations while minimizing interconnect wirelength. Experimental results demonstrate that our thermal- and structure-aware automated floorplanning approach reduces overall stress by 11% while maintaining excellent thermal performance with a negligible 0.5% temperature increase and simultaneously reducing total wirelength by 11% compared to temperature-only optimization. Additionally, we conduct an exploratory study on the effects of temperature gradients on structural integrity, providing crucial insights for reliability-conscious chiplet design. STAMP-2.5D establishes a robust platform for navigating critical trade-offs in advanced semiconductor packaging.
Authors:Huthaifa I. Ashqar, Ahmed Jaber, Taqwa I. Alhadidi, Mohammed Elhenawy
Abstract:
This study aims to comprehensively review and empirically evaluate the application of multimodal large language models (MLLMs) and Large Vision Models (VLMs) in object detection for transportation systems. First, we provide background on the potential benefits of MLLMs in transportation applications and conduct a comprehensive review of current MLLM technologies in previous studies. We highlight their effectiveness and limitations in object detection within various transportation scenarios. Second, we provide an overview of the taxonomy of end-to-end object detection in transportation applications and future directions. Building on this, we propose an empirical analysis for testing MLLMs on three real-world transportation problems that include object detection tasks, namely road safety attributes extraction, safety-critical event detection, and visual reasoning of thermal images. Our findings provide a detailed assessment of MLLM performance, uncovering both strengths and areas for improvement. Finally, we discuss practical limitations and challenges of MLLMs in enhancing object detection in transportation, thereby offering a roadmap for future research and development in this critical area.
Authors:Qihua Liang, Liang Chen, Yaozong Zheng, Jian Nong, Zhiyi Mo, Bineng Zhong
Abstract:
Multi-modal object tracking has attracted considerable attention by integrating multiple complementary inputs (e.g., thermal, depth, and event data) to achieve outstanding performance. Although current general-purpose multi-modal trackers primarily unify various modal tracking tasks (i.e., RGB-Thermal infrared, RGB-Depth or RGB-Event tracking) through prompt learning, they still overlook the effective capture of spatio-temporal cues. In this work, we introduce a novel multi-modal tracking framework based on a mamba-style state space model, termed UBATrack. Our UBATrack comprises two simple yet effective modules: a Spatio-temporal Mamba Adapter (STMA) and a Dynamic Multi-modal Feature Mixer. The former leverages Mamba's long-sequence modeling capability to jointly model cross-modal dependencies and spatio-temporal visual cues in an adapter-tuning manner. The latter further enhances multi-modal representation capacity across multiple feature dimensions to improve tracking robustness. In this way, UBATrack eliminates the need for costly full-parameter fine-tuning, thereby improving the training efficiency of multi-modal tracking algorithms. Experiments show that UBATrack outperforms state-of-the-art methods on RGB-T, RGB-D, and RGB-E tracking benchmarks, achieving outstanding results on the LasHeR, RGBT234, RGBT210, DepthTrack, VOT-RGBD22, and VisEvent datasets.
Authors:Zheng Jiang, Wei Wang, Gaowei Zhang, Yi Wang
Abstract:
Sea Surface Temperature (SST) is crucial for understanding upper-ocean thermal dynamics and ocean-atmosphere interactions, which have profound economic and social impacts. While data-driven models show promise in SST prediction, their black-box nature often limits interpretability and overlooks key physical processes. Recently, physics-informed neural networks have been gaining momentum but struggle with complex ocean-atmosphere dynamics due to 1) inadequate characterization of seawater movement (e.g., coastal upwelling) and 2) insufficient integration of external SST drivers (e.g., turbulent heat fluxes). To address these challenges, we propose SSTODE, a physics-informed Neural Ordinary Differential Equations (Neural ODEs) framework for SST prediction. First, we derive ODEs from fluid transport principles, incorporating both advection and diffusion to model ocean spatiotemporal dynamics. Through variational optimization, we recover a latent velocity field that explicitly governs the temporal dynamics of SST. Building upon this ODE, we introduce an Energy Exchanges Integrator (EEI), inspired by ocean heat budget equations, to account for external forcing factors. Thus, the variations in the components of these factors provide deeper insights into SST dynamics. Extensive experiments demonstrate that SSTODE achieves state-of-the-art performance in global and regional SST forecasting benchmarks. Furthermore, SSTODE visually reveals the impact of advection dynamics, thermal diffusion patterns, and diurnal heating-cooling cycles on SST evolution. These findings demonstrate the model's interpretability and physical consistency.
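For orientation, a textbook advection-diffusion equation with a source term has the form below. This is a generic sketch, not the equation derived in the paper: the symbols $T$ (temperature), $\mathbf{u}$ (velocity field), $\kappa$ (diffusivity), and $Q$ (external forcing) are assumptions introduced here for illustration.

```latex
\frac{\partial T}{\partial t}
  = -\,\mathbf{u}\cdot\nabla T
  + \kappa\,\nabla^{2} T
  + Q
```

In this reading, the first term on the right transports heat with the flow, the second smooths temperature gradients, and $Q$ stands in for the external drivers that the abstract assigns to the Energy Exchanges Integrator. Discretizing the spatial operators on a grid turns the equation into a system of ODEs in time, which is the setting a Neural ODE solver operates in.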
Authors:Hao Tu, Yebin Wang, Shaoshuai Mou, Huazhen Fang
Abstract:
Electric vertical take-off and landing (eVTOL) aircraft have emerged as a promising solution to transform urban transportation. They present a few technical challenges for battery management, a prominent one of which is the prediction of the power capability of their lithium-ion battery systems. The challenge originates from the high C-rate discharging conditions required during eVTOL flights as well as the complexity of lithium-ion batteries' electro-thermal dynamics. This paper, for the first time, formulates a power limit prediction problem for eVTOL which explicitly considers long prediction horizons and the possible occurrence of emergency landings. We then harness machine learning to solve this problem in two intertwined ways. First, we adopt a dynamic model that integrates physics with machine learning to predict a lithium-ion battery's voltage and temperature behaviors with high accuracy. Second, while performing search for the maximum power, we leverage machine learning to predict the remaining discharge time and use the prediction to accelerate the search with fast computation. Our validation results show the effectiveness of the proposed study for eVTOL operations.
Authors:Hongtao Yang, Bineng Zhong, Qihua Liang, Zhiruo Zhu, Yaozong Zheng, Ning Li
Abstract:
Recently, visual prompt tuning has been introduced to RGB-Thermal (RGB-T) tracking as a parameter-efficient finetuning (PEFT) method. However, these PEFT-based RGB-T tracking methods typically rely solely on spatial domain information as prompts for feature extraction. As a result, they often fail to achieve optimal performance by overlooking the crucial role of frequency-domain information in prompt learning. To address this issue, we propose an efficient Visual Fourier Prompt Tracking (named VFPTrack) method to learn modality-related prompts via Fast Fourier Transform (FFT). Our method consists of a symmetric feature extraction encoder with shared parameters, visual Fourier prompts, and a Modality Fusion Prompt Generator that generates bidirectional interaction prompts through multi-modal feature fusion. Specifically, we first use a frozen feature extraction encoder to extract RGB and thermal infrared (TIR) modality features. Then, we combine the visual prompts in the spatial domain with the frequency domain prompts obtained from the FFT, which allows for the full extraction and understanding of modality features from different domain information. Finally, unlike previous fusion methods, the modality fusion prompt generation module we use combines features from different modalities to generate a fused modality prompt. This modality prompt interacts with each individual modality to fully enable feature interaction across different modalities. Extensive experiments conducted on three popular RGB-T tracking benchmarks show that our method demonstrates outstanding performance.
Authors:Soumyoraj Mallick, Sanchita Ghosh, Tanushree Roy
Abstract:
Battery management systems (BMSs) rely on real-time estimation of the temperature distribution in battery cells to ensure safe and optimal operation of Lithium-ion batteries (LIBs). However, physical BMSs often lack the memory and computational resources required by high-fidelity models. Temperature prediction using physics-based models becomes challenging due to their higher computational time. In contrast, machine learning based approaches offer faster predictions but demand larger memory overhead. In this work, we develop a lightweight and efficient Kolmogorov-Arnold networks (KAN) based thermal model, KAN-Therm, to predict the core temperature of a cylindrical battery. We have compared the memory overhead and computation costs of our method with Multi-layer perceptron (MLP), recurrent neural network (RNN), and long short-term memory (LSTM) network. Our results show that the proposed KAN-Therm model exhibits the best prediction accuracy with the least memory overhead and computation time.
Authors:Rupak Bose, Chinedu Innocent Nwoye, Jorge Lazo, Joël Lukas Lavanchy, Nicolas Padoy
Abstract:
Intraoperative adverse events (IAEs), such as bleeding or thermal injury, can lead to severe postoperative complications if undetected. However, their rarity results in highly imbalanced datasets, posing challenges for AI-based detection and severity quantification. We propose BetaMixer, a novel deep learning model that addresses these challenges through a Beta distribution-based mixing approach, converting discrete IAE severity scores into continuous values for precise severity regression (0-5 scale). BetaMixer employs Beta distribution-based sampling to enhance underrepresented classes and regularizes intermediate embeddings to maintain a structured feature space. A generative approach aligns the feature space with sampled IAE severity, enabling robust classification and severity regression via a transformer. Evaluated on the MultiBypass140 dataset, which we extended with IAE labels, BetaMixer achieves a weighted F1 score of 0.76, recall of 0.81, PPV of 0.73, and NPV of 0.84, demonstrating strong performance on imbalanced data. By integrating Beta distribution-based sampling, feature mixing, and generative modeling, BetaMixer offers a robust solution for IAE detection and quantification in clinical settings.
Authors:Luca Colagrande, Jayanth Jonnalagadda, Luca Benini
Abstract:
Modern general-purpose accelerators integrate a large number of programmable area- and energy-efficient processing elements (PEs), to deliver high performance while meeting stringent power delivery and thermal dissipation constraints. In this context, PEs are often implemented by scalar in-order cores, which are highly sensitive to pipeline stalls. Traditional software techniques, such as loop unrolling, mitigate the issue at the cost of increased register pressure, limiting flexibility. We propose scalar chaining, a novel hardware-software solution, to address this issue without incurring the drawbacks of traditional software-only techniques. We demonstrate our solution on register-limited stencil codes, achieving >93% FPU utilization, a 4% speedup, and 10% higher energy efficiency, on average, over highly-optimized baselines. Our implementation is fully open source and performance experiments are reproducible using free software.
Authors:Wei-Lun Chen, Chia-Yeh Hsieh, Yu-Hsiang Kao, Kai-Chun Liu, Sheng-Yu Peng, Yu Tsao
Abstract:
This study presents a novel approach to human keypoint detection in low-resolution thermal images using transfer learning techniques. We introduce the first application of the Timed Up and Go (TUG) test in thermal image computer vision, establishing a new paradigm for mobility assessment. Our method leverages a MobileNetV3-Small encoder and a ViTPose decoder, trained using a composite loss function that balances latent representation alignment and heatmap accuracy. The model was evaluated using the Object Keypoint Similarity (OKS) metric from the COCO Keypoint Detection Challenge. The proposed model achieves better performance with AP, AP50, and AP75 scores of 0.861, 0.942, and 0.887 respectively, outperforming traditional supervised learning approaches like Mask R-CNN and ViTPose-Base. Moreover, our model demonstrates superior computational efficiency in terms of parameter count and FLOPS. This research lays a solid foundation for future clinical applications of thermal imaging in mobility assessment and rehabilitation monitoring.
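The Object Keypoint Similarity used for the AP scores above can be sketched as follows, following the COCO keypoint evaluation definition: each labeled keypoint contributes exp(-d^2 / (2 s^2 k^2)), where d is the prediction error, s^2 is the object area, and k = 2 sigma is a per-keypoint constant. The function name, argument layout, and example values are illustrative assumptions, not the authors' evaluation code.

```python
import math


def object_keypoint_similarity(pred, truth, visible, area, sigmas):
    """OKS for one instance.

    pred, truth: lists of (x, y) keypoint coordinates in pixels.
    visible:     list of visibility flags; keypoints with flag <= 0 are skipped.
    area:        object area in pixels^2, used as the squared scale s^2.
    sigmas:      per-keypoint falloff constants (hypothetical values below).
    """
    total, count = 0.0, 0
    for (px, py), (tx, ty), v, sigma in zip(pred, truth, visible, sigmas):
        if v <= 0:
            continue
        d2 = (px - tx) ** 2 + (py - ty) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    # Assumed convention: an instance with no labeled keypoints scores 0.0.
    return total / count if count else 0.0


# Hypothetical two-keypoint instance; the second keypoint is unlabeled.
score = object_keypoint_similarity(
    pred=[(1.0, 1.0), (50.0, 50.0)],
    truth=[(0.0, 0.0), (10.0, 10.0)],
    visible=[1, 0],
    area=100.0,
    sigmas=[0.05, 0.05],
)
print(score)
```

AP50 and AP75 then count a detection as correct when its OKS is at least 0.50 or 0.75, respectively, in the same way box IoU thresholds are used for object detection.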
Authors:Sanchita Ghosh, Soumyoraj Mallick, Tanushree Roy
Abstract:
Monitoring of internal short circuit (ISC) in Lithium-ion battery packs is imperative to safe operations, optimal performance, and extension of pack life. Since ISC in one of the modules inside a battery pack can eventually lead to thermal runaway, it is crucial to detect its early onset. However, the inaccuracy and aging variability of battery models and the unavailability of adequate ISC datasets pose several challenges for both model-based and data-driven approaches. Thus, in this paper, we propose a model-free Koopman Mode-based module-level ISC detection algorithm for battery packs. The algorithm adopts two parallel Koopman mode generation schemes with the Arnoldi algorithm to capture the Kullback-Leibler divergence-based distributional deviations in Koopman mode statistics in the presence of ISC. Our proposed algorithm utilizes module-level voltage measurements to accurately identify the shorted battery module of the pack without using specific battery models or pre-training with historical battery data. Furthermore, we present two case studies on shorted battery module detection under both resting and charging conditions. The simulation results illustrate the sensitivity of the proposed algorithm toward ISC and the robustness against measurement noise.
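The distributional deviation mentioned above rests on the Kullback-Leibler divergence, which for two discrete distributions P and Q is the sum over bins of p log(p / q). The sketch below shows only this generic quantity on hypothetical histograms; how the paper builds distributions from Koopman mode statistics is not described in the abstract and is not modeled here.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # bins with zero mass in P contribute nothing
        if qi == 0.0:
            raise ValueError("Q must be positive wherever P is positive")
        total += pi * math.log(pi / qi)
    return total


# Hypothetical normalized histograms for a reference and a monitored module.
reference = [0.5, 0.5]
monitored = [0.25, 0.75]
deviation = kl_divergence(reference, monitored)
print(deviation)
```

The divergence is zero only when the two distributions coincide and grows as they separate, which is what makes it usable as a deviation score; note that it is asymmetric in its arguments.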
Authors:Sonia Dupuis, Nando Metzger, Konrad Schindler, Frank Göttsche, Stefan Wunderle
Abstract:
Land surface temperature (LST) is an essential climate variable (ECV) crucial for understanding land-atmosphere energy exchange and monitoring climate change, especially in the rapidly warming Arctic. Long-term satellite-based LST records, such as those derived from the Advanced Very High Resolution Radiometer (AVHRR), are essential for detecting climate trends. However, the coarse spatial resolution of AVHRR's global area coverage (GAC) data limits their utility for analyzing fine-scale permafrost dynamics and other surface processes in the Arctic. This paper presents a new 42-year pan-Arctic LST dataset, downscaled from AVHRR GAC to 1 km with a super-resolution algorithm based on a deep anisotropic diffusion model. The model is trained on MODIS LST data, using coarsened inputs and native-resolution outputs, guided by high-resolution land cover, digital elevation, and vegetation height maps. The resulting dataset provides twice-daily, 1 km LST observations for the entire pan-Arctic region over four decades. This enhanced dataset enables improved modelling of permafrost, reconstruction of near-surface air temperature, and assessment of surface mass balance of the Greenland Ice Sheet. Additionally, it supports climate monitoring efforts in the pre-MODIS era and offers a framework adaptable to future satellite missions for thermal infrared observation and climate data record continuity.
Authors:Jiachen Li, Shihao Li, Dongmei Chen
Abstract:
Real-time model-based control of high-dimensional nonlinear systems faces computational intractability, while traditional reduced-order model (ROM) control requires manual expert tuning without online adaptation. We propose AURORA (Autonomous Updating of ROM and Controller via Recursive Adaptation), a multi-agent LLM framework automating ROM-based controller design with online adaptation. AURORA employs five specialized agents collaborating through iterative generation-judge-revision cycles, with an Evaluation Agent diagnosing degradation sources and routing corrections appropriately. The framework is validated on eight benchmark systems spanning mechanical assemblies, thermal PDEs, and robots. Comparative evaluation across five state-of-the-art LLMs demonstrates high autonomy with minimal intervention, establishing practical viability for autonomous control design.
Authors:Jovana Kovačević, Felix Langner, Erfan Tajalli-Ardekani, Marvin Dorn, Simon Waczowicz, Ralf Mikut, Jörg Matthes, Hüseyin K. Çakmak, Veit Hagenmeyer
Abstract:
Integrating flexible loads and storage systems into the residential sector contributes to the alignment of volatile renewable generation with demand. Besides batteries serving as a short-term storage solution, residential buildings can benefit from a Hydrogen (H2) storage system, allowing seasonal shifting of renewable energy. However, as the initial costs of H2 systems are high, coupling a Fuel Cell (FC) with a Heat Pump (HP) can contribute to the size reduction of the H2 system. The present study develops a Comfort-Oriented Energy Management System for Residential Buildings (ComEMS4Build) comprising Photovoltaics (PV), Battery Energy Storage System (BESS), and H2 storage, where FC and HP are envisioned as complementary technologies. The fuzzy-logic-based ComEMS4Build is designed and evaluated over a period of 12 weeks in winter for a family household building in Germany using a semi-synthetic modeling approach. The Rule-Based Control (RBC), which serves as a lower benchmark, is a scheduler designed to require minimal inputs for operation. The Model Predictive Control (MPC) is intended as a cost-optimal benchmark with an ideal forecast. The results show that ComEMS4Build, similar to MPC, does not violate the thermal comfort of occupants in 10 out of 12 weeks, while RBC has a slightly higher median discomfort of 0.68 Kh. The ComEMS4Build increases the weekly electricity costs by 12.06 EUR compared to MPC, while RBC increases the weekly costs by 30.14 EUR. The ComEMS4Build improves the Hybrid Energy Storage System (HESS) utilization and energy exchange with the main grid compared to the RBC. However, when it comes to the FC operation, the RBC has an advantage, as it reduces the toggling counts by 3.48% and working hours by 7.59% compared to MPC...
Authors:Zhen Huang, Hong Wang, Wenkai Yang, Muxi Tang, Depeng Xie, Ting-Jung Lin, Yu Zhang, Wei W. Xing, Lei He
Abstract:
Thermal management in 3D ICs is increasingly challenging due to higher power densities. Traditional PDE-solving-based methods, while accurate, are too slow for iterative design. Machine learning approaches like FNO provide faster alternatives but suffer from high-frequency information loss and high-fidelity data dependency. We introduce Self-Attention U-Net Fourier Neural Operator (SAU-FNO), a novel framework combining self-attention and U-Net with FNO to capture long-range dependencies and model local high-frequency features effectively. Transfer learning is employed to fine-tune low-fidelity data, minimizing the need for extensive high-fidelity datasets and speeding up training. Experiments demonstrate that SAU-FNO achieves state-of-the-art thermal prediction accuracy and provides an 842x speedup over traditional FEM methods, making it an efficient tool for advanced 3D IC thermal simulations.
Authors:Fabrizio Orlando, Deborah Volpe, Giacomo Orlandi, Mariagrazia Graziano, Fabrizio Riente, Marco Vacca
Abstract:
Combinatorial Optimization (CO) problems exhibit exponential complexity, making their resolution challenging. Simulated Adiabatic Bifurcation (aSB) is a quantum-inspired algorithm to obtain approximate solutions to large-scale CO problems written in the Ising form. It explores the solution space by emulating the adiabatic evolution of a network of Kerr-nonlinear parametric oscillators (KPOs), where each oscillator represents a variable in the problem. The optimal solution corresponds to the ground state of this system. A key advantage of this approach is the possibility of updating multiple variables simultaneously, making it particularly suited for hardware implementation. To enhance solution quality and convergence speed, variations of the algorithm have been proposed in the literature, including ballistic (bSB), discrete (dSB), and thermal (HbSB) versions. In this work, we have comprehensively analyzed dSB, bSB, and HbSB using dedicated software models, evaluating the feasibility of using a fixed-point representation for hardware implementation. We then present an open-source hardware architecture implementing the dSB algorithm for Field-Programmable Gate Arrays (FPGAs). The design allows users to adjust the degree of algorithmic parallelization based on their specific requirements. A proof-of-concept implementation that solves 256-variable problems was achieved on an AMD Kria KV260 SoM, a low-tier FPGA, validated using well-known max-cut and knapsack problems.
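The oscillator update described above can be sketched in a few lines of Python. This is a minimal floating-point illustration, assuming the commonly published bSB/dSB update rule (symplectic Euler steps with perfectly inelastic walls at |x| = 1); it is not the fixed-point FPGA architecture from the paper. The function names and the parameters `steps`, `dt`, `a0`, and `c0` are illustrative assumptions.

```python
import random


def ising_energy(J, s):
    """Ising energy E = -1/2 * sum_ij J[i][j] * s[i] * s[j] for spins s in {-1, +1}."""
    n = len(s)
    return -0.5 * sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def simulated_bifurcation(J, steps=200, dt=0.5, a0=1.0, c0=0.5, discrete=True, seed=0):
    """Sketch of ballistic (discrete=False) or discrete (discrete=True) SB.

    All parameter values are illustrative assumptions, not tuned settings.
    """
    rng = random.Random(seed)
    n = len(J)
    x = [rng.uniform(-0.1, 0.1) for _ in range(n)]  # oscillator positions
    y = [rng.uniform(-0.1, 0.1) for _ in range(n)]  # oscillator momenta
    for k in range(steps):
        a_t = a0 * k / steps  # pump amplitude ramped linearly from 0 towards a0
        # dSB couples through sign(x); bSB couples through x itself.
        src = [1.0 if xi >= 0 else -1.0 for xi in x] if discrete else list(x)
        for i in range(n):  # every momentum is updated from the same snapshot of x
            force = c0 * sum(J[i][j] * src[j] for j in range(n))
            y[i] += dt * (-(a0 - a_t) * x[i] + force)
        for i in range(n):
            x[i] += dt * a0 * y[i]
            if abs(x[i]) > 1.0:  # inelastic wall: clamp position, reset momentum
                x[i] = 1.0 if x[i] > 0 else -1.0
                y[i] = 0.0
    return [1 if xi >= 0 else -1 for xi in x]
```

For a max-cut instance with edge weights W, one would pass J = -W and read the cut from the returned spin signs. Because every momentum update reads the same snapshot of x, the inner loops can run in parallel, which is the property the abstract highlights for hardware.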
Authors:Wolfgang Rannetbauer, Simon Hubmer, Carina Hambrock, Ronny Ramlau
Abstract:
The implementation of thermally sprayed components in steel manufacturing presents challenges for production and plant maintenance. While enhancing performance through specialized surface properties, these components may encounter difficulties in meeting modified requirements due to standardization in the refurbishment process. This article proposes updating the established coating process for thermally spray coated components for steel manufacturing (TCCSM) by integrating real-time data analytics and predictive quality management. Two essential components--the data aggregator and the quality predictor--are designed through continuous process monitoring and the application of data-driven methodologies to meet the dynamic demands of the evolving steel landscape. The quality predictor is powered by the simple and effective multiple kernel learning strategy with the goal of realizing predictive quality. The data aggregator, designed with sensors, flow meters, and intelligent data processing for the thermal spray coating process, is proposed to facilitate real-time analytics. The performance of this combination was verified using small-scale tests that enabled not only the accurate prediction of coating quality based on the collected data but also proactive notification to the operator as soon as significant deviations are identified.
Authors:Xingyuan Li, Ruichao Hou, Tongwei Ren, Gangshan Wu
Abstract:
Existing RGB-thermal salient object detection (RGB-T SOD) methods aim to identify visually significant objects by leveraging both RGB and thermal modalities to enable robust performance in complex scenarios, but they often suffer from limited generalization due to the constrained diversity of available datasets and the inefficiencies in constructing multi-modal representations. In this paper, we propose a novel prompt learning-based RGB-T SOD method, named KAN-SAM, which reveals the potential of visual foundational models for RGB-T SOD tasks. Specifically, we extend Segment Anything Model 2 (SAM2) for RGB-T SOD by introducing thermal features as guiding prompts through efficient and accurate Kolmogorov-Arnold Network (KAN) adapters, which effectively enhance RGB representations and improve robustness. Furthermore, we introduce a mutually exclusive random masking strategy to reduce reliance on RGB data and improve generalization. Experimental results on benchmarks demonstrate superior performance over the state-of-the-art methods.
Authors:Yun Li, Jicheng Shi, Colin N. Jones, Neil Yorke-Smith, Tamas Keviczky
Abstract:
Noise pollution from heat pumps (HPs) has been an emerging concern to their broader adoption, especially in densely populated areas. This paper explores a model predictive control (MPC) approach for building climate control, aimed at minimizing the noise nuisance generated by HPs. By exploiting a piecewise linear approximation of HP noise patterns and assuming linear building thermal dynamics, the proposed design can be generalized to handle various HP acoustic patterns with mixed-integer linear programming (MILP). Additionally, two computationally efficient options for defining the noise cost function in the proposed MPC design are discussed. Numerical experiments on a high-fidelity building simulator are performed to demonstrate the viability and effectiveness of the proposed design. Simulation results show that the proposed approach can effectively reduce the noise pollution caused by HPs with negligible energy cost increase.
Authors:Yang Yang, Mingjiao Yan, Zongliang Zhang, Dengmiao Hao, Xuedong Chen, Weixiong Chen
Abstract:
This work develops a polygonal finite element method (PFEM) for the analysis of steady-state and transient thermal stresses in two-dimensional continua. The method employs Wachspress rational basis functions to construct conforming interpolations over arbitrary convex polygonal meshes, providing enhanced geometric flexibility and accuracy in capturing complex boundary conditions and heterogeneous material behavior. A quadtree-based acceleration strategy is introduced to significantly reduce computational cost through the reuse of precomputed stiffness and mass matrices. The PFEM is implemented in ABAQUS via a user-defined element (UEL) framework. Comprehensive benchmark problems, including multi-scale and non-matching mesh scenarios, are conducted to verify the accuracy, convergence properties, and computational efficiency of the method. Results indicate that the proposed PFEM offers notable advantages over conventional FEM in terms of mesh adaptability, solution quality, and runtime performance. The method shows strong potential for large-scale simulations involving thermal-mechanical coupling, complex geometries, and multi-resolution modeling.
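As a rough illustration of the Wachspress basis functions mentioned above, the sketch below evaluates them at a point strictly inside a convex polygon using the standard triangle-area formula. It assumes counter-clockwise vertex ordering; the function names are hypothetical and unrelated to the paper's ABAQUS UEL implementation.

```python
def tri_area(a, b, c):
    """Signed area of triangle (a, b, c); positive for counter-clockwise order."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def wachspress(vertices, p):
    """Wachspress coordinates of p, assumed strictly inside a convex CCW polygon."""
    n = len(vertices)
    weights = []
    for i in range(n):
        prev, cur, nxt = vertices[i - 1], vertices[i], vertices[(i + 1) % n]
        corner = tri_area(prev, cur, nxt)  # depends on the polygon only
        a_prev = tri_area(p, prev, cur)    # area between p and the incoming edge
        a_next = tri_area(p, cur, nxt)     # area between p and the outgoing edge
        weights.append(corner / (a_prev * a_next))
    total = sum(weights)
    return [w / total for w in weights]
```

The resulting values are non-negative, sum to one, and reproduce linear fields exactly. On a unit square they coincide with the familiar bilinear shape functions. The formula divides by zero when p lies on an edge, so boundary points need separate handling.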
Authors:Sota Iwabuchi, Ryoya Onishi, Shun Suzuki, Takaaki Kamigaki, Yasutoshi Makino, Hiroyuki Shinoda
Abstract:
In this study, we propose a non-contact thermal presentation method using airborne ultrasound. We generate a strong sound field directly on the human skin and present a perceivable temperature rise. The proposed method enables simultaneous presentation of mechanical and thermal stimuli. In preliminary experiments, we confirmed that a temperature increase of 5.4 $^\circ$C occurs at the palm after 5.0 s.
Authors:Aniket Datar, Anuj Pokhrel, Mohammad Nazeri, Madhan B. Rao, Chenhui Pan, Yufan Zhang, Andre Harrison, Maggie Wigness, Philip R. Osteen, Jinwei Ye, Xuesu Xiao
Abstract:
Long-duration, off-road, autonomous missions require robots to continuously perceive their surroundings regardless of the ambient lighting conditions. Most existing autonomy systems heavily rely on active sensing, e.g., LiDAR, RADAR, and Time-of-Flight sensors, or use (stereo) visible light imaging sensors, e.g., color cameras, to perceive environment geometry and semantics. In scenarios where fully passive perception is required and lighting conditions are degraded to an extent that visible light cameras fail to perceive, most downstream mobility tasks such as obstacle avoidance become impossible. To address such a challenge, this paper presents a Multi-Modal Passive Perception dataset, M2P2, to enable off-road mobility in low-light to no-light conditions. We design a multi-modal sensor suite including thermal, event, and stereo RGB cameras, GPS, two Inertial Measurement Units (IMUs), as well as a high-resolution LiDAR for ground truth, with a novel multi-sensor calibration procedure that can efficiently transform multi-modal perceptual streams into a common coordinate system. Our 10-hour, 32 km dataset also includes mobility data such as robot odometry and actions and covers well-lit, low-light, and no-light conditions, along with paved, on-trail, and off-trail terrain. Our results demonstrate that off-road mobility is possible through only passive perception in extreme low-light conditions using end-to-end learning and classical planning. The project website can be found at https://cs.gmu.edu/~xiao/Research/M2P2/
Authors:Dmitrii Torbunov, Yihui Ren, Lijun Wu, Yimei Zhu
Abstract:
Uncertainty quantification is critical in scientific inverse problems to distinguish identifiable parameters from those that remain ambiguous given available measurements. The Conditional Diffusion Model-based Inverse Problem Solver (CDI) has previously demonstrated effective probabilistic inference for one-dimensional temporal signals, but its applicability to higher-dimensional spatial data remains unexplored. We extend CDI to two-dimensional spatial conditioning, enabling probabilistic parameter inference directly from spatial observations. We validate this extension on convergent beam electron diffraction (CBED) parameter inference - a challenging multi-parameter inverse problem in materials characterization where sample geometry, electronic structure, and thermal properties must be extracted from 2D diffraction patterns. Using simulated CBED data with ground-truth parameters, we demonstrate that CDI produces well-calibrated posterior distributions that accurately reflect measurement constraints: tight distributions for well-determined quantities and appropriately broad distributions for ambiguous parameters. In contrast, standard regression methods - while appearing accurate on aggregate metrics - mask this underlying uncertainty by predicting training set means for poorly constrained parameters. Our results confirm that CDI successfully extends from temporal to spatial domains, providing the genuine uncertainty information required for robust scientific inference.
Authors:Tianyi Zhao, Jiawen Xi, Linhui Xiao, Junnan Li, Xue Yang, Maoxun Yuan, Xingxing Wei
Abstract:
Visual Grounding (VG) aims to localize specific objects in an image according to natural language expressions, serving as a fundamental task in vision-language understanding. However, existing VG benchmarks are mostly derived from datasets collected under clean environments, such as COCO, where scene diversity is limited. Consequently, they fail to reflect the complexity of real-world conditions, such as changes in illumination, weather, etc., that are critical to evaluating model robustness and generalization in safety-critical applications. To address these limitations, we present RGBT-Ground, the first large-scale visual grounding benchmark built for complex real-world scenarios. It consists of spatially aligned RGB and Thermal infrared (TIR) image pairs with high-quality referring expressions, corresponding object bounding boxes, and fine-grained annotations at the scene, environment, and object levels. This benchmark enables comprehensive evaluation and facilitates the study of robust grounding under diverse and challenging conditions. Furthermore, we establish a unified visual grounding framework that supports both uni-modal (RGB or TIR) and multi-modal (RGB-TIR) visual inputs. Based on it, we propose RGBT-VGNet, a simple yet effective baseline for fusing complementary visual modalities to achieve robust grounding. We conduct extensive adaptations to the existing methods on RGBT-Ground. Experimental results show that our proposed RGBT-VGNet significantly outperforms these adapted methods, particularly in nighttime and long-distance scenarios. All resources will be publicly released to promote future research on robust visual grounding in complex real-world environments.
Authors:Che-Chia Chang, Te-Sheng Lin, Ming-Chih Lai
Abstract:
The Stefan problem is a classical free-boundary problem that models phase-change processes and poses computational challenges due to its moving interface and nonlinear temperature-phase coupling. In this work, we develop a physics-informed neural network framework for solving two-phase Stefan problems. The proposed method explicitly tracks the interface motion and enforces the discontinuity in the temperature gradient across the interface while maintaining global consistency of the temperature field. Our approach employs two neural networks: one representing the moving interface and the other for the temperature field. The interface network allows rapid categorization of thermal diffusivity in the spatial domain, which is a crucial step for selecting training points for the temperature network. The temperature network's input is augmented with a modified zero-level set function to accurately capture the jump in its normal derivative across the interface. Numerical experiments on two-phase dynamical Stefan problems demonstrate the superior accuracy and effectiveness of our proposed method compared with other neural network methodologies in the literature. The results indicate that the proposed framework offers a robust and flexible alternative to traditional numerical methods for solving phase-change problems governed by moving boundaries. In addition, the proposed method can capture an unstable interface evolution associated with the Mullins-Sekerka instability.
Authors:Hong Wang, Wenkai Yang, Jie Wang, Huanshuo Dong, Zijie Geng, Zhen Huang, Depeng Xie, Zhezheng Hao, Hande Dong
Abstract:
Recent advances in data-driven approaches, such as neural operators (NOs), have shown substantial efficacy in reducing the solution time for integrated circuit (IC) thermal simulations. However, a limitation of these approaches is requiring a large amount of high-fidelity training data, such as chip parameters and temperature distributions, thereby incurring significant computational costs. To address this challenge, we propose a novel algorithm for the generation of IC thermal simulation data, named block Krylov and operator action (BlocKOA), which simultaneously accelerates the data generation process and enhances the precision of generated data. BlocKOA is specifically designed for IC applications. Initially, we use the block Krylov algorithm based on the structure of the heat equation to quickly obtain a few basic solutions. Then we combine them to get numerous temperature distributions that satisfy the physical constraints. Finally, we apply heat operators on these functions to determine the heat source distributions, efficiently generating precise data points. Theoretical analysis shows that the time complexity of BlocKOA is one order lower than that of the existing method. Experimental results further validate its efficiency, showing that BlocKOA achieves a 420-fold speedup in generating thermal simulation data for 5000 chips with varying physical parameters and IC structures. Even with just 4% of the generation time, data-driven approaches trained on the data generated by BlocKOA exhibit performance comparable to those trained using the existing method.
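The last two steps described above (combine basic solutions, then apply the heat operator to recover the source) can be illustrated on a toy 1D steady-state problem. This is only a sketch of the operator-action idea under strong simplifications: analytic sine modes stand in for the Krylov-generated basic solutions, the conductivity `k` is constant, and all names are hypothetical rather than taken from BlocKOA.

```python
import math


def combine(basis, coeffs):
    """Linear combination of basis temperature fields sampled on the same grid."""
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(len(basis[0]))]


def heat_source_from_temperature(T, dx, k=1.0):
    """Apply q = -k * d2T/dx2 at interior nodes using central differences."""
    return [-k * (T[i - 1] - 2.0 * T[i] + T[i + 1]) / dx ** 2
            for i in range(1, len(T) - 1)]


def make_sample(coeffs, n=201, k=1.0):
    """Build one (source, temperature) training pair without solving any PDE."""
    dx = 1.0 / (n - 1)
    xs = [i * dx for i in range(n)]
    # Stand-in basis: sine modes that vanish at both ends of the unit interval.
    basis = [[math.sin((m + 1) * math.pi * x) for x in xs] for m in range(len(coeffs))]
    T = combine(basis, coeffs)
    q = heat_source_from_temperature(T, dx, k)
    return xs, T, q
```

Going from temperature to source needs only a cheap stencil evaluation, whereas going from source to temperature requires a linear solve per sample. That asymmetry is what makes this direction attractive for bulk data generation.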
Authors:Donato Francesco Falcone, Stephan Menzel, Tommaso Stecconi, Matteo Galetta, Antonio La Porta, Bert Jan Offrein, Valeria Bragaglia
Abstract:
The recent co-optimization of memristive technologies and programming algorithms enabled neural network training with in-memory computing systems. In this context, novel analog filamentary conductive-metal-oxide (CMO)/HfOx redox-based resistive switching memory (ReRAM) represents a key technology. Despite device performance enhancements reported in the literature, the underlying mechanism behind resistive switching is not fully understood. This work presents the first physics-based analytical model of the current transport and of the resistive switching in these devices. As a case study, analog TaOx/HfOx ReRAM devices are considered. The current transport is explained by a trap-to-trap tunneling process, and the resistive switching by a modulation of the defect density within the sub-band of the TaOx that behaves as an electric field and temperature confinement layer. The local temperature and electric field distributions are derived from the solution of the electric and heat transport equations in a 3D finite element ReRAM model. The intermediate resistive states are described as a gradual modulation of the TaOx defect density, which results in a variation of its electrical conductivity. The drift-dynamics of ions during the resistive switching is analytically described, allowing the estimation of defect migration energies in the TaOx layer. Moreover, the role of the electro-thermal properties of the CMO layer is unveiled. The proposed analytical model accurately describes the experimental switching characteristic of analog TaOx/HfOx ReRAM devices, increasing the physical understanding and providing the equations necessary for circuit simulations incorporating this technology.
Authors:Alicia Tierz, Jad Mounayer, Beatriz Moya, Francisco Chinesta
Abstract:
Generative thermal design for complex geometries is fundamental in many areas of engineering, yet it faces two main challenges: the high computational cost of high-fidelity simulations and the limitations of conventional generative models. Approaches such as autoencoders (AEs) and variational autoencoders (VAEs) often produce unstructured latent spaces with discontinuities, which restricts their capacity to explore designs and generate physically consistent solutions. To address these limitations, we propose a hybrid framework that combines Variational Rank-Reduction Autoencoders (VRRAEs) with Deep Operator Networks (DeepONets). The VRRAE introduces a truncated SVD within the latent space, leading to continuous, interpretable, and well-structured representations that mitigate posterior collapse and improve geometric reconstruction. The DeepONet then exploits this compact latent encoding in its branch network, together with spatial coordinates in the trunk network, to predict temperature gradients efficiently and accurately. This hybrid approach not only enhances the quality of generated geometries and the accuracy of gradient prediction, but also provides a substantial advantage in inference efficiency compared to traditional numerical solvers. Overall, the study underscores the importance of structured latent representations for operator learning and highlights the potential of combining generative models and operator networks in thermal design and broader engineering applications.
Authors:Steve Chien, Itai Zilberstein, Alberto Candela, David Rijlaarsdam, Tom Hendrix, Aubrey Dunne, Aragon Oriol, Miquel Juan Puig
Abstract:
Dynamic targeting (DT) is a spacecraft autonomy concept in which sensor data is acquired and rapidly analyzed and used to drive subsequent observation. We describe the low Earth orbit application of this approach in which lookahead imagery is analyzed to detect clouds, thermal anomalies, or land use cases to drive higher quality near nadir imaging. Use cases for such a capability include: cloud avoidance, storm hunting, search for planetary boundary layer events, plume study, and beyond. The DT concept requires a lookahead sensor or agility to use a primary sensor in such a mode, edge computing to analyze images rapidly onboard, and a primary followup sensor. Additionally, an inter-satellite or low latency communications link can be leveraged for cross platform tasking. We describe implementation in progress to fly DT in early 2025 on the CogniSAT-6 (Ubotica/Open Cosmos) spacecraft that launched in March 2024 on the SpaceX Transporter-10 launch.
Authors:Mehdi Elahi, Mohamed R. Elshamy, Abdel-Hameed A. Badawy, Ahmad Patooghy
Abstract:
3D-stacked High Bandwidth Memory (HBM) architectures provide high-performance memory interactions to address the well-known performance challenge, namely the memory wall. However, these architectures are susceptible to thermal vulnerabilities due to the inherent vertical adjacency that occurs during the manufacturing process of HBM architectures. We anticipate that adversaries may exploit the intense vertical and lateral adjacency to design and develop thermal performance degradation attacks on the memory banks that host data/instructions from victim applications. In such attacks, the adversary manages to inject short and intense heat pulses from vertically and/or laterally adjacent memory banks, creating a convergent thermal wave that maximizes impact and delays the victim application from accessing its data/instructions. As the attacking application does not access any out-of-range memory locations, it can bypass both design-time security tests and the operating system's memory management policies. In other words, since the attack mimics legitimate workloads, it will be challenging to detect.
Authors:Alexandros Gkillas, Christos Anagnostopoulos, Nikos Piperigkos, Dimitris Tsiktsiris, Theofilos Christodoulou, Theofanis Siamatras, Dimitrios Triantafyllou, Christos Basdekis, Theoktisti Marinopoulou, Panagiotis Lepentsiotis, Elefterios Blitsis, Aggeliki Zacharaki, Nearchos Stylianidis, Leonidas Katelaris, Lamberto Salvan, Aris S. Lalos, Christos Laoudias, Antonios Lalas, Konstantinos Votis
Abstract:
This paper introduces a holistic perception system for internal and external monitoring of autonomous vehicles, with the aim of demonstrating a novel AI-leveraged self-adaptive framework of advanced vehicle technologies and solutions that optimize perception and experience on-board. The internal monitoring system relies on a multi-camera setup designed for predicting and identifying driver and occupant behavior through facial recognition, and additionally exploits a large language model as a virtual assistant. Moreover, the in-cabin monitoring system includes AI-empowered smart sensors that measure air quality and perform thermal comfort analysis for efficient on- and off-boarding. The external monitoring system, on the other hand, perceives the surrounding environment of the vehicle through a LiDAR-based cost-efficient semantic segmentation approach that performs highly accurate and efficient super-resolution on low-quality raw 3D point clouds. The holistic perception framework is developed in the context of the EU's Horizon Europe programme AutoTRUST, and has been integrated and deployed on a real electric vehicle provided by ALKE. Experimental validation and evaluation at the integration site of the Joint Research Centre at Ispra, Italy, highlight increased performance and efficiency of the modular blocks of the proposed perception architecture.
Authors:Mohamed R. Elshamy, Mehdi Elahi, Ahmad Patooghy, Abdel-Hameed A. Badawy
Abstract:
Efficient thermal and power management in modern multiprocessor systems-on-chip (MPSoCs) demands accurate power consumption estimation. One of the state-of-the-art approaches, Alternative Blind Power Identification (ABPI), theoretically eliminates the dependence on steady-state temperatures, addressing a major shortcoming of previous approaches. However, ABPI performance has remained unverified in actual hardware implementations. In this study, we conduct the first empirical validation of ABPI on commercial hardware using the NVIDIA Jetson Xavier AGX platform. Our findings reveal that, while ABPI provides computational efficiency and independence from steady-state temperature, it exhibits considerable accuracy deficiencies in real-world scenarios. To overcome these limitations, we introduce a novel approach that integrates Custom Physics-Informed Neural Networks (CPINNs) with the underlying thermal model of ABPI. Our approach employs a specialized loss function that harmonizes physical principles with data-driven learning, complemented by multi-objective genetic algorithm optimization to balance estimation accuracy and computational cost. In experimental validation, CPINN-ABPI achieves a reduction of 84.7% (CPU) and 73.9% (GPU) in the mean absolute error (MAE) relative to ABPI, with the weighted mean absolute percentage error (WMAPE) improving from 47%-81% to approximately 12%. The method maintains real-time performance with 195.3 μs of inference time, with similar 85%-99% accuracy gains across heterogeneous SoCs.
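For readers unfamiliar with the two error metrics quoted above, the sketch below shows one common way to compute them. The abstract does not define WMAPE, so the form used here (total absolute error divided by total absolute actual value) is an assumption, and the function names are illustrative.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def wmape(actual, predicted):
    """Weighted MAPE, assumed here as sum(|a - p|) / sum(|a|)."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / sum(abs(a) for a in actual)


def relative_reduction(baseline_error, new_error):
    """Fractional error reduction of a new estimator relative to a baseline."""
    return (baseline_error - new_error) / baseline_error
```

With made-up power readings of 10, 20, and 30 W and estimates of 12, 18, and 33 W, `mae` gives 7/3 W and `wmape` gives 7/60, or about 11.7%. A figure such as the "84.7% reduction in MAE" corresponds to `relative_reduction(mae_baseline, mae_new)`.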
Authors:Liu Ziyin, Yizhou Xu, Isaac Chuang
Abstract:
With the rapid discovery of emergent phenomena in deep learning and large language models, explaining and understanding their cause has become an urgent need. Here, we propose a rigorous entropic-force theory for understanding the learning dynamics of neural networks trained with stochastic gradient descent (SGD) and its variants. Building on the theory of parameter symmetries and an entropic loss landscape, we show that representation learning is crucially governed by emergent entropic forces arising from stochasticity and discrete-time updates. These forces systematically break continuous parameter symmetries and preserve discrete ones, leading to a series of gradient balance phenomena that resemble the equipartition property of thermal systems. These phenomena, in turn, (a) explain the universal alignment of neural representations between AI models and lead to a proof of the Platonic Representation Hypothesis, and (b) reconcile the seemingly contradictory observations of sharpness- and flatness-seeking behavior of deep learning optimization. Our theory and experiments demonstrate that a combination of entropic forces and symmetry breaking is key to understanding emergent phenomena in deep learning.
Authors:Jiuhong Xiao, Giuseppe Loianno
Abstract:
Geo-localization is an essential component of Unmanned Aerial Vehicle (UAV) navigation systems to ensure precise absolute self-localization in outdoor environments. To address the challenges of GPS signal interruptions or low illumination, Thermal Geo-localization (TG) employs aerial thermal imagery to align with reference satellite maps to accurately determine the UAV's location. However, existing TG methods lack uncertainty measurement in their outputs, compromising system robustness in the presence of textureless or corrupted thermal images, self-similar or outdated satellite maps, geometric noise, or thermal images exceeding satellite maps. To overcome these limitations, this paper presents UASTHN, a novel approach for Uncertainty Estimation (UE) in Deep Homography Estimation (DHE) tasks for TG applications. Specifically, we introduce a novel Crop-based Test-Time Augmentation (CropTTA) strategy, which leverages the homography consensus of cropped image views to effectively measure data uncertainty. This approach is complemented by Deep Ensembles (DE) employed for model uncertainty, offering comparable performance with improved efficiency and seamless integration with any DHE model. Extensive experiments across multiple DHE models demonstrate the effectiveness and efficiency of CropTTA in TG applications. Analysis of detected failure cases underscores the improved reliability of CropTTA under challenging conditions. Finally, we demonstrate the capability of combining CropTTA and DE for a comprehensive assessment of both data and model uncertainty. Our research provides profound insights into the broader intersection of localization and uncertainty estimation. The code and models are publicly available.
Authors:Kazuma Kobayashi, Farid Ahmed, Syed Bahauddin Alam
Abstract:
Real-time monitoring of critical parameters is essential for energy systems' safe and efficient operation. However, traditional sensors often fail and degrade in harsh environments where physical sensors cannot be placed (inaccessible locations). In addition, there are important parameters that cannot be directly measured by sensors. We need machine learning (ML)-based real-time monitoring in those remote locations to ensure system operations. However, traditional ML models struggle to process continuous sensor profile data to fit model requirements, leading to the loss of spatial relationships. Another challenge for real-time monitoring is "dataset shift" and the need for frequent retraining under varying conditions, where extensive retraining prohibits real-time inference. To resolve these challenges, this study addressed the limitations of real-time monitoring methods by enabling monitoring in locations where physical sensors are impractical to deploy. Our proposed approach, utilizing Multi-Input Operator Network virtual sensors, leverages deep learning to seamlessly integrate diverse data sources and accurately predict key parameters in real-time without the need for additional physical sensors. The approach's effectiveness is demonstrated through thermal-hydraulic monitoring in a nuclear reactor subchannel, achieving remarkable accuracy.
Authors:Israt Zarin Era, Fan Zhou, Ahmed Shoyeb Raihan, Imtiaz Ahmed, Alan Abul-Haj, James Craig, Srinjoy Das, Zhichao Liu
Abstract:
Directed Energy Deposition (DED) offers significant potential for manufacturing complex and multi-material parts. However, internal defects such as porosity and cracks can compromise mechanical properties and overall performance. This study focuses on in-situ monitoring and characterization of melt pools associated with porosity, aiming to improve defect detection and quality control in DED-printed parts. Traditional machine learning approaches for defect identification rely on extensive labeled datasets, often scarce and expensive to generate in real-world manufacturing. To address this, our framework employs self-supervised learning on unlabeled melt pool data using a Vision Transformer-based Masked Autoencoder (MAE) to produce highly representative embeddings. These fine-tuned embeddings are leveraged via transfer learning to train classifiers on a limited labeled dataset, enabling the effective identification of melt pool anomalies. We evaluate two classifiers: (1) a Vision Transformer (ViT) classifier utilizing the fine-tuned MAE Encoder's parameters and (2) the fine-tuned MAE Encoder combined with an MLP classifier head. Our framework achieves overall accuracy ranging from 95.44% to 99.17% and an average F1 score exceeding 80%, with the ViT Classifier slightly outperforming the MAE Encoder Classifier. This demonstrates the scalability and cost-effectiveness of our approach for automated quality control in DED, effectively detecting defects with minimal labeled data.
Authors:Mohamed R. Elshamy, Mehdi Elahi, Ahmad Patooghy, Abdel-Hameed A. Badawy
Abstract:
Fine-grained power estimation in multicore Systems on Chips (SoCs) is crucial for efficient thermal management. BPI (Blind Power Identification) is a recent approach that determines the power consumption of different cores and the thermal model of the chip using only thermal sensor measurements and total power consumption. BPI relies on steady-state thermal data along with a naive initialization in its Non-negative Matrix Factorization (NMF) process, which negatively impacts the power estimation accuracy of BPI. This paper proposes a two-fold approach to reduce these impacts on BPI. First, this paper introduces an innovative approach for NMF initialization, i.e., density-oriented spatial clustering to identify centroid data points of active cores as initial values. This enhances BPI accuracy by focusing on dense regions in the dataset and excluding outlier data points. Second, it proposes the utilization of steady-state temperature data points to enhance the power estimation accuracy by leveraging the physical relationship between temperature and power consumption. Our extensive simulations of real-world cases demonstrate that our approach enhances BPI accuracy in estimating the power per core with no performance cost. For instance, in a four-core processor, the proposed approach reduces the error rate by 76% compared to BPI and by 24% compared to the state of the art in the literature, namely, Blind Power Identification Steady State (BPISS). The results underline the potential of integrating advanced clustering techniques in thermal model identification, paving the way for more accurate and reliable thermal management in multicores and SoCs.
Authors:Mohamed R. Elshamy, Mehdi Elahi, Ahmad Patooghy, Abdel-Hameed A. Badawy
Abstract:
Modern multicore System-on-Chips (SoCs) feature hardware monitoring mechanisms that measure total power consumption. However, these aggregate measurements are often insufficient for fine-grained thermal and power management. This paper presents an enhanced Clustering Blind Power Identification (ICBPI) approach, designed to improve the sensitivity and robustness of the traditional Blind Power Identification (BPI) method. BPI estimates the power consumption of individual cores and models the thermal behavior of an SoC using only thermal sensor data and total power measurements. The proposed ICBPI approach refines BPI's initialization process, particularly improving the non-negative matrix factorization (NNMF) step, which is critical to the accuracy of BPI. ICBPI introduces density-based spatial clustering of applications with noise (DBSCAN) to better align temperature and power consumption data, thereby providing more accurate power consumption estimates. We validate the ICBPI method through two key tasks. The first task evaluates power estimation accuracy across four different multicore architectures, including a heterogeneous processor. Results show that ICBPI significantly enhances accuracy, reducing error rates by 77.56% compared to the original BPI and by 68.44% compared to the state-of-the-art BPISS method. The second task focuses on improving the detection and localization of malicious thermal sensor attacks in heterogeneous processors. The results demonstrate that ICBPI enhances the security and robustness of multicore SoCs against such attacks.
Authors:Hatef Otroshi Shahreza, Anjith George, Sébastien Marcel
Abstract:
Multimodal Large Language Models (MLLMs) have recently demonstrated strong performance on a wide range of vision-language tasks, raising interest in their potential use for biometric applications. In this paper, we conduct a systematic evaluation of state-of-the-art MLLMs for heterogeneous face recognition (HFR), where enrollment and probe images are from different sensing modalities, including visual (VIS), near infrared (NIR), short-wave infrared (SWIR), and thermal camera. We benchmark multiple open-source MLLMs across several cross-modality scenarios, including VIS-NIR, VIS-SWIR, and VIS-THERMAL face recognition. The recognition performance of MLLMs is evaluated using biometric protocols and based on different metrics, including Acquire Rate, Equal Error Rate (EER), and True Accept Rate (TAR). Our results reveal substantial performance gaps between MLLMs and classical face recognition systems, particularly under challenging cross-spectral conditions, in spite of recent advances in MLLMs. Our findings highlight the limitations of current MLLMs for HFR and also the importance of rigorous biometric evaluation when considering their deployment in face recognition systems.
Authors:Lukas Pfromm, Alish Kanani, Harsh Sharma, Janardhan Rao Doppa, Partha Pratim Pande, Umit Y. Ogras
Abstract:
Due to reduced manufacturing yields, traditional monolithic chips cannot keep up with the compute, memory, and communication demands of data-intensive applications, such as rapidly growing deep neural network (DNN) models. Chiplet-based architectures offer a cost-effective and scalable solution by integrating smaller chiplets via a network-on-interposer (NoI). Fast and accurate simulation approaches are critical to unlocking this potential, but existing methods lack the required accuracy, speed, and flexibility. To address this need, this work presents CHIPSIM, a comprehensive co-simulation framework designed for parallel DNN execution on chiplet-based systems. CHIPSIM concurrently models computation and communication, accurately capturing network contention and pipelining effects that conventional simulators overlook. Furthermore, it profiles the chiplet and NoI power consumptions at microsecond granularity for precise transient thermal analysis. Extensive evaluations with homogeneous/heterogeneous chiplets and different NoI architectures demonstrate the framework's versatility, up to 340% accuracy improvement, and power/thermal analysis capability.
Authors:Hadi Nemati, Pedro Sánchez-Martín, Álvaro Ortega, Lukas Sigrist, Luis Rouco
Abstract:
This paper proposes the integration of Concentrated Solar Power Plant (CSP) in the Renewable-only virtual power plant (RVPP) for bidding in the electricity day-ahead and secondary reserve markets, as well as trading thermal energy through a heat purchase agreement. A reformulated two-stage robust optimization approach is introduced to account for multiple uncertainties, including electricity prices, non-dispatchable renewable energy sources electrical production, CSP thermal production, and uncertainties in electrical and thermal demand consumption. The provision of energy and reserve by the thermal storage of CSP is modeled using an adjustable approach, which allocates a share of energy for up and down reserves based on the profitability of the RVPP. Simulations are conducted for several case studies to demonstrate the effectiveness and computational efficiency of the proposed approach under different RVPP operator decisions against uncertain parameters and various trading strategies for electricity and thermal energy. The simulation results show that integrating CSP into RVPP enhances RVPP flexibility for both electrical and thermal trading. Furthermore, the results indicate that the profitability of the RVPP increases when all trading options are considered, across different levels of conservatism adopted by the RVPP operator in response to uncertain parameters.
Authors:Philip Arm, Oliver Fischer, Joseph Church, Adrian Fuhrer, Hendrik Kolvenbach, Marco Hutter
Abstract:
Legged robots are promising candidates for exploring challenging areas on low-gravity bodies such as the Moon, Mars, or asteroids, thanks to their advanced mobility on unstructured terrain. However, as planetary robots' power and thermal budgets are highly restricted, these robots need energy-efficient control approaches that easily transfer to multiple gravity environments. In this work, we introduce a reinforcement learning-based control approach for legged robots with gravity-scaled power-optimized reward functions. We use our approach to develop and validate a locomotion controller and a base pose controller in gravity environments from lunar gravity (1.62 m/s^2) to a hypothetical super-Earth (19.62 m/s^2). Our approach successfully scales across these gravity levels for locomotion and base pose control with the gravity-scaled reward functions. The power-optimized locomotion controller reached a power consumption for locomotion of 23.4 W in Earth gravity on a 15.65 kg robot at 0.4 m/s, a 23% improvement over the baseline policy. Additionally, we designed a constant-force spring offload system that allowed us to conduct real-world experiments on legged locomotion in lunar gravity. In lunar gravity, the power-optimized control policy reached 12.2 W, 36% less than a baseline controller which is not optimized for power efficiency. Our method provides a scalable approach to developing power-efficient locomotion controllers for legged robots across multiple gravity levels.
Authors:Jiaqi Zhu, Bikramjit Das, Yong Xie, Nikolaos Pappas, Howard H. Yang
Abstract:
Federated learning facilitates collaborative model training across multiple clients while preserving data privacy. However, its performance is often constrained by limited communication resources, particularly in systems supporting a large number of clients. To address this challenge, integrating over-the-air computations into the training process has emerged as a promising solution to alleviate communication bottlenecks. The system significantly increases the number of clients it can support in each communication round by transmitting intermediate parameters via analog signals rather than digital ones. This improvement, however, comes at the cost of channel-induced distortions, such as fading and noise, which affect the aggregated global parameters. To elucidate these effects, this paper develops a theoretical framework to analyze the performance of over-the-air federated learning in large-scale client scenarios. Our analysis reveals three key advantages of scaling up the number of participating clients: (1) Enhanced Privacy: The mutual information between a client's local gradient and the server's aggregated gradient diminishes, effectively reducing privacy leakage. (2) Mitigation of Channel Fading: The channel hardening effect eliminates the impact of small-scale fading in the noisy global gradient. (3) Improved Convergence: Reduced thermal noise and gradient estimation errors benefit the convergence rate. These findings solidify over-the-air model training as a viable approach for federated learning in networks with a large number of clients. The theoretical insights are further substantiated through extensive experimental evaluations.
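The effect of client count on thermal noise can be sketched with a toy simulation. It assumes an idealized channel with unit gain and no fading, so the server receives the sum of client gradients plus one Gaussian noise draw per coordinate and then divides by the number of clients. The functions `ota_average` and `rms_error`, the uniform gradients, and all parameter values are hypothetical; the paper's actual channel model is richer.

```python
import random
from math import sqrt

def ota_average(gradients, noise_std, rng):
    """Superpose client gradients over the air, add receiver noise, then normalize."""
    n_clients = len(gradients)
    dim = len(gradients[0])
    # One noise sample per coordinate, added to the analog sum (not per client).
    received = [sum(g[d] for g in gradients) + rng.gauss(0.0, noise_std)
                for d in range(dim)]
    return [r / n_clients for r in received]

def rms_error(n_clients, dim=200, noise_std=1.0, seed=0):
    """RMS gap between the noisy over-the-air average and the exact average."""
    rng = random.Random(seed)
    gradients = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                 for _ in range(n_clients)]
    exact = [sum(g[d] for g in gradients) / n_clients for d in range(dim)]
    noisy = ota_average(gradients, noise_std, rng)
    return sqrt(sum((a - b) ** 2 for a, b in zip(noisy, exact)) / dim)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, rms_error(n))
```

Because the noise is added once to the superposed signal, its contribution to the averaged gradient shrinks roughly in proportion to 1/N under these assumptions, which is the intuition behind the convergence benefit listed as item (3).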
Authors:Alish Kanani, Lukas Pfromm, Harsh Sharma, Janardhan Rao Doppa, Partha Pratim Pande, Umit Y. Ogras
Abstract:
Chiplet-based integration enables large-scale systems that combine diverse technologies, enabling higher yield, lower costs, and scalability, making them well-suited to AI workloads. Processing-in-Memory (PIM) has emerged as a promising solution for AI inference, leveraging technologies such as ReRAM, SRAM, and FeFET, each offering unique advantages and trade-offs. A heterogeneous chiplet-based PIM architecture can harness the complementary strengths of these technologies to enable higher performance and energy efficiency. However, scheduling AI workloads across such a heterogeneous system is challenging due to competing performance objectives, dynamic workload characteristics, and power and thermal constraints. To address this need, we propose THERMOS, a thermally-aware, multi-objective scheduling framework for AI workloads on heterogeneous multi-chiplet PIM architectures. THERMOS trains a single multi-objective reinforcement learning (MORL) policy that is capable of achieving Pareto-optimal execution time, energy, or a balanced objective at runtime, depending on the target preferences. Comprehensive evaluations show that THERMOS achieves up to 89% faster average execution time and 57% lower average energy consumption than baseline AI workload scheduling algorithms with only 0.14% runtime and 0.022% energy overhead.
Authors:Harsh Sharma, Janardhan Rao Doppa, Umit Y. Ogras, Partha Pratim Pande
Abstract:
Multi-chiplet architectures enabled by glass interposer offer superior electrical performance, enable higher bus widths due to reduced crosstalk, and have lower capacitance in the redistribution layer than current silicon interposer-based systems. These advantages result in lower energy per bit, higher communication frequencies, and extended interconnect range. However, deformation of the package (warpage) in glass interposer-based systems becomes a critical challenge as system size increases, leading to severe mechanical stress and reliability concerns. Beyond a certain size, conventional packaging techniques fail to manage warpage effectively, necessitating new approaches to mitigate warpage-induced bending with scalable performance for glass interposer-based multi-chiplet systems. To address these intertwined challenges, we propose a thermal-, warpage-, and performance-aware design framework that employs architecture and packaging co-optimization. The proposed framework disintegrates the surface and embedded chiplets to balance conflicting design objectives, ensuring optimal trade-offs between performance, power, and structural reliability. Our experiments demonstrate that optimized multi-chiplet architectures from our design framework achieve up to 64.7% performance improvement and 40% power reduction compared to traditional 2.5D systems to execute deep neural network workloads with lower fabrication costs.
Authors:Akash Mahajan, Shivam Chaturvedi, Srijita Das, Wencong Su, Van-Hai Bui
Abstract:
The selection of optimal design for power electronic converter parameters involves balancing efficiency and thermal constraints to ensure high performance without compromising safety. This paper introduces a probabilistic-learning-based stochastic surrogate modeling framework to address this challenge and significantly reduce the time required during the design phase. The approach begins with a neural network classifier that evaluates the feasibility of parameter configurations, effectively filtering out unsafe and/or impractical inputs. Subsequently, a probabilistic prediction model estimates the converter's efficiency and temperature while quantifying prediction uncertainty, providing both performance insights and reliability metrics. Finally, a heuristic optimization-based model is employed to optimize a multi-objective function that maximizes efficiency while adhering to thermal constraints. The optimization process incorporates penalty terms to discourage solutions that violate practical thresholds, ensuring actionable and realistic recommendations. An advanced heuristic optimization method is used to find the optimal solution and is compared with several well-known search algorithms, including Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Simulated Annealing (SA), Tabu-Search (TS), and Stochastic Hill Climbing (SHC). The results demonstrate significant improvements in predictive accuracy and optimization outcomes, offering a robust solution for advancing power electronics design.
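The penalty-term idea mentioned above can be illustrated with a minimal sketch. The abstract does not give the actual objective, so the linear penalty, the function names `penalized_objective` and `best_design`, the penalty weight, and the candidate (efficiency, temperature) pairs are all hypothetical.

```python
def penalized_objective(efficiency, temperature, temp_limit, penalty_weight=10.0):
    """Score a design: efficiency minus a penalty for exceeding the thermal limit."""
    violation = max(0.0, temperature - temp_limit)  # zero when the limit is met
    return efficiency - penalty_weight * violation

def best_design(candidates, temp_limit):
    """Pick the (efficiency, temperature) pair with the highest penalized score."""
    return max(candidates,
               key=lambda c: penalized_objective(c[0], c[1], temp_limit))

# Hypothetical surrogate predictions: (efficiency, temperature in degrees C).
candidates = [(0.95, 90.0), (0.97, 105.0), (0.90, 70.0)]

if __name__ == "__main__":
    print(best_design(candidates, temp_limit=100.0))
```

Here the most efficient candidate is rejected because it exceeds the assumed limit, so a search method such as those listed in the abstract would be steered toward feasible designs.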
Authors:Shanthan Kumar Padisala, Satadru Dey
Abstract:
In autonomous electric vehicles (AEVs), battery energy must be judiciously allocated to satisfy primary propulsion demands and secondary auxiliary demands, particularly the Heating, Ventilation, and Air Conditioning (HVAC) system. This becomes especially critical when the battery is in a low state of charge under cold ambient conditions, and cabin heating and battery preconditioning (prior to actual charging) can consume a significant percentage of available energy, directly impacting the driving range. In such cases, one usually prioritizes propulsion or applies heuristic rules for thermal management, often resulting in suboptimal energy utilization. There is a pressing need for a principled approach that can dynamically allocate battery power in a way that balances thermal comfort, battery health and preconditioning, along with range preservation. This paper attempts to address this issue using real-time Model Predictive Control to optimize the power consumption between the propulsion, HVAC, and battery temperature preparation so that it can be charged immediately once the destination is reached.
Authors:Michele Grimaldi, Patryk Cieslak, Eduardo Ochoa, Vibhav Bharti, Hayat Rajani, Ignacio Carlucho, Maria Koskinopoulou, Yvan R. Petillot, Nuno Gracias
Abstract:
Simulations are highly valuable in marine robotics, offering a cost-effective and controlled environment for testing in the challenging conditions of underwater and surface operations. Given the high costs and logistical difficulties of real-world trials, simulators capable of capturing the operational conditions of subsea environments have become key in developing and refining algorithms for remotely-operated and autonomous underwater vehicles. This paper highlights recent enhancements to the Stonefish simulator, an advanced open-source platform supporting development and testing of marine robotics solutions. Key updates include a suite of additional sensors, such as an event-based camera, a thermal camera, and an optical flow camera, as well as visual light communication, support for tethered operations, improved thruster modelling, more flexible hydrodynamics, and enhanced sonar accuracy. These developments and an automated annotation tool significantly bolster Stonefish's role in marine robotics research, especially in the field of machine learning, where training data with a known ground truth is hard or impossible to collect.
Authors:Leon Nissen, Philipp Zagar, Vishnu Ravi, Aydin Zahedivash, Lara Marie Reimer, Stephan Jonas, Oliver Aalami, Paul Schmiedmayer
Abstract:
The deployment of Large Language Models (LLMs) on mobile devices offers significant potential for medical applications, enhancing privacy, security, and cost-efficiency by eliminating reliance on cloud-based services and keeping sensitive health data local. However, the performance and accuracy of on-device LLMs in real-world medical contexts remain underexplored. In this study, we benchmark publicly available on-device LLMs using the AMEGA dataset, evaluating accuracy, computational efficiency, and thermal limitation across various mobile devices. Our results indicate that compact general-purpose models like Phi-3 Mini achieve a strong balance between speed and accuracy, while medically fine-tuned models such as Med42 and Aloe attain the highest accuracy. Notably, deploying LLMs on older devices remains feasible, with memory constraints posing a greater challenge than raw processing power. Our study underscores the potential of on-device LLMs for healthcare while emphasizing the need for more efficient inference and models tailored to real-world clinical reasoning.
Authors:Yang Li, Jiankai Gao, Yuanzheng Li, Chen Chen, Sen Li, Mohammad Shahidehpour, Zhe Chen
Abstract:
To coordinate the interests of operator and users in a microgrid under complex and changeable operating conditions, this paper proposes a microgrid scheduling model considering the thermal flexibility of thermostatically controlled loads and demand response by leveraging physical informed-inspired deep reinforcement learning (DRL) based bi-level programming. To overcome the non-convex limitations of Karush-Kuhn-Tucker (KKT)-based methods, a novel optimization solution method based on DRL theory is proposed to handle the bi-level programming through alternate iterations between levels. Specifically, by combining a DRL algorithm named asynchronous advantage actor-critic (A3C) and automated machine learning-prioritized experience replay (AutoML-PER) strategy to improve the generalization performance of A3C to address the above problems, an improved A3C algorithm, called AutoML-PER-A3C, is designed to solve the upper-level problem; while the DOCPLEX optimizer is adopted to address the lower-level problem. In this solution process, AutoML is used to automatically optimize hyperparameters and PER improves learning efficiency and quality by extracting the most valuable samples. The test results demonstrate that the presented approach manages to reconcile the interests between multiple stakeholders in the microgrid (MG) by fully exploiting various flexibility resources. Furthermore, in terms of economic viability and computational efficiency, the proposal vastly exceeds other advanced reinforcement learning methods.
Authors:Lukas Pfromm, Alish Kanani, Harsh Sharma, Parth Solanki, Eric Tervo, Jaehyun Park, Janardhan Rao Doppa, Partha Pratim Pande, Umit Y. Ogras
Abstract:
Rapidly evolving artificial intelligence and machine learning applications require ever-increasing computational capabilities, while monolithic 2D design technologies approach their limits. Heterogeneous integration of smaller chiplets using a 2.5D silicon interposer and 3D packaging has emerged as a promising paradigm to address this limit and meet performance demands. These approaches offer a significant cost reduction and higher manufacturing yield than monolithic 2D integrated circuits. However, the compact arrangement and high compute density exacerbate the thermal management challenges, potentially compromising performance. Addressing these thermal modeling challenges is critical, especially as system sizes grow and different design stages require varying levels of accuracy and speed. Since no single thermal modeling technique meets all these needs, this paper introduces MFIT, a range of multi-fidelity thermal models that effectively balance accuracy and speed. These multi-fidelity models can enable efficient design space exploration and runtime thermal management. Our extensive testing on systems with 16, 36, and 64 2.5D integrated chiplets and 16x3 3D integrated chiplets demonstrates that these models can reduce execution times from days to mere seconds and milliseconds with negligible loss in accuracy.
Authors:Yunpeng Gong, Qingyuan Zeng, Dejun Xu, Zhenzhong Wang, Min Jiang
Abstract:
In recent years, despite significant advancements in adversarial attack research, the security challenges in cross-modal scenarios, such as the transferability of adversarial attacks between infrared, thermal, and RGB images, have been overlooked. These heterogeneous image modalities collected by different hardware devices are widely prevalent in practical applications, and the substantial differences between modalities pose significant challenges to attack transferability. In this work, we explore a novel cross-modal adversarial attack strategy, termed multiform attack. We propose a dual-layer optimization framework based on gradient-evolution, facilitating efficient perturbation transfer between modalities. In the first layer of optimization, the framework utilizes image gradients to learn universal perturbations within each modality and employs evolutionary algorithms to search for shared perturbations with transferability across different modalities through secondary optimization. Through extensive testing on multiple heterogeneous datasets, we demonstrate the superiority and robustness of Multiform Attack compared to existing techniques. This work not only enhances the transferability of cross-modal adversarial attacks but also provides a new perspective for understanding security vulnerabilities in cross-modal systems.
Authors:Danilo Amigo, Felipe Lepe, Enrique Otarola, Gonzalo Rivera
Abstract:
We develop a virtual element method to solve a convective Brinkman-Forchheimer problem coupled with a heat equation. This coupled model may allow for thermal diffusion and viscosity as a function of temperature. Under standard discretization assumptions, we prove the well posedness of the proposed numerical scheme. We also derive optimal error estimates under appropriate regularity assumptions for the solution. We conclude with a series of numerical tests performed with different mesh families that complement our theoretical findings.
Authors:Shihang Li, Zhiqiang Gong, Weien Zhou, Yue Gao, Wen Yao
Abstract:
Accurate reconstruction of temperature field of heat-source systems (TFR-HSS) is crucial for thermal monitoring and reliability assessment in engineering applications such as electronic devices and aerospace structures. However, the high cost of measurement acquisition and the substantial distributional shifts in temperature field across varying conditions present significant challenges for developing reconstruction models with robust generalization capabilities. Existing DNNs-based methods typically formulate TFR-HSS as a one-to-one regression problem based solely on target sparse measurements, without effectively leveraging reference simulation data that implicitly encode thermal knowledge. To address this limitation, we propose IPTR, an implicit physics-guided temperature field reconstruction framework that introduces sparse monitoring-temperature field pair from reference simulations as priors to enrich physical understanding. To integrate both reference and target information, we design a dual physics embedding module consisting of two complementary branches: an implicit physics-guided branch employing cross-attention to distill latent physics from the reference data, and an auxiliary encoding branch based on Fourier layers to capture the spatial characteristics of the target observation. The fused representation is then decoded to reconstruct the full temperature field. Extensive experiments under single-condition, multi-condition, and few-shot settings demonstrate that IPTR consistently outperforms existing methods, achieving state-of-the-art reconstruction accuracy and strong generalization capability.
Authors:Seungwoo Yoo, Kyeongmin Yeo, Jisung Hwang, Minhyuk Sung
Abstract:
We introduce Neural Green's Function, a neural solution operator for linear partial differential equations (PDEs) whose differential operators admit eigendecompositions. Inspired by Green's functions, the solution operators of linear PDEs that depend exclusively on the domain geometry, we design Neural Green's Function to imitate their behavior, achieving superior generalization across diverse irregular geometries and source and boundary functions. Specifically, Neural Green's Function extracts per-point features from a volumetric point cloud representing the problem domain and uses them to predict a decomposition of the solution operator, which is subsequently applied to evaluate solutions via numerical integration. Unlike recent learning-based solution operators, which often struggle to generalize to unseen source or boundary functions, our framework is, by design, agnostic to the specific functions used during training, enabling robust and efficient generalization. In the steady-state thermal analysis of mechanical part geometries from the MCB dataset, Neural Green's Function outperforms state-of-the-art neural operators, achieving an average error reduction of 13.9% across five shape categories, while being up to 350 times faster than a numerical solver that requires computationally expensive meshing.
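For orientation, the classical spectral form of a Green's function is sketched below. It assumes a self-adjoint operator $\mathcal{L}$ with homogeneous boundary conditions on a domain $\Omega$, orthonormal eigenfunctions $\phi_i$, and nonzero eigenvalues $\lambda_i$. These symbols are illustrative and not taken from the paper, whose learned decomposition may differ.

```latex
\mathcal{L}u = f, \qquad \mathcal{L}\phi_i = \lambda_i \phi_i
\quad\Longrightarrow\quad
G(x, y) = \sum_i \frac{\phi_i(x)\,\phi_i(y)}{\lambda_i},
\qquad
u(x) = \int_\Omega G(x, y)\, f(y)\, \mathrm{d}y .
```

Since $G$ depends only on the operator and the domain, the same $G$ can be reused for any source $f$. On a point cloud with quadrature weights $w_j$, the integral would be approximated by the sum $u(x) \approx \sum_j G(x, y_j)\, f(y_j)\, w_j$.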
Authors:Ainesh Bakshi, Allen Liu, Ankur Moitra, Ewin Tang
Abstract:
A central challenge in quantum physics is to understand the structural properties of many-body systems, both in equilibrium and out of equilibrium. For classical systems, we have a unified perspective which connects structural properties of systems at thermal equilibrium to the Markov chain dynamics that mix to them. We lack such a perspective for quantum systems: there is no framework to translate the quantitative convergence of the Markovian evolution into strong structural consequences. We develop a general framework that brings the breadth and flexibility of the classical theory to quantum Gibbs states at high temperature. At its core is a natural quantum analog of a Dobrushin condition; whenever this condition holds, a concise path-coupling argument proves rapid mixing for the corresponding Markovian evolution. The same machinery bridges dynamic and structural properties: rapid mixing yields exponential decay of conditional mutual information (CMI) without restrictions on the size of the probed subsystems, resolving a central question in the theory of open quantum systems. Our key technical insight is an optimal transport viewpoint which couples quantum dynamics to a linear differential equation, enabling precise control over how local deviations from equilibrium propagate to distant sites.
Authors:Dongqi Zheng, Wenjin Fu
Abstract:
We introduce Constraint-Aware Federated Learning with Lagrangian Dual Optimization (CAFL-L), a principled extension of FedAvg that explicitly incorporates device-level resource constraints including energy, communication, memory, and thermal budgets. CAFL-L employs Lagrangian dual optimization to dynamically adapt training hyperparameters -- freezing depth, local steps, batch size, and communication compression -- while preserving training stability through token-budget preservation via gradient accumulation. Experiments on a character-level language model demonstrate that CAFL-L achieves superior constraint satisfaction compared to standard FedAvg (reducing memory usage by 20% and communication by 95%) while maintaining competitive validation performance, making it practical for deployment on resource-constrained edge devices.
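A minimal sketch of the dual-update idea follows, assuming a standard projected dual-ascent rule and a simple batch-size policy. The function names `dual_ascent_step` and `adapt_batch`, the normalized usage values, the step size, and the way the multiplier shrinks the batch are all hypothetical; the abstract does not state the exact rules CAFL-L uses.

```python
def dual_ascent_step(multipliers, usage, budgets, step_size=0.1):
    """Raise a multiplier when its constraint is violated; never go below zero."""
    return {
        name: max(0.0, multipliers[name]
                  + step_size * (usage[name] / budgets[name] - 1.0))
        for name in multipliers
    }

def adapt_batch(base_batch, memory_multiplier):
    """Shrink the batch under memory pressure and accumulate gradients to compensate."""
    batch = max(1, int(base_batch / (1.0 + memory_multiplier)))
    accum_steps = -(-base_batch // batch)  # ceiling division
    # batch * accum_steps >= base_batch, so the per-update sample budget is kept.
    return batch, accum_steps

if __name__ == "__main__":
    multipliers = {"energy": 0.0, "memory": 0.0}
    usage = {"energy": 0.8, "memory": 1.5}    # measured usage, hypothetical units
    budgets = {"energy": 1.0, "memory": 1.0}  # per-device budgets, same units
    multipliers = dual_ascent_step(multipliers, usage, budgets)
    print(multipliers)
    print(adapt_batch(32, multipliers["memory"]))
```

Only the violated memory constraint gains a positive multiplier in this example, so only the memory-related hyperparameter is tightened while the energy-related ones would be left alone.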
Authors:Dongqi Zheng, Wenjin Fu, Guangzong Chen
Abstract:
We present an automated vision-based system for defect detection and classification of laser power meter sensor coatings. Our approach addresses the critical challenge of identifying coating defects such as thermal damage and scratches that can compromise laser energy measurement accuracy in medical and industrial applications. The system employs an unsupervised anomaly detection framework that trains exclusively on "good" sensor images to learn normal coating distribution patterns, enabling detection of both known and novel defect types without requiring extensive labeled defect datasets. Our methodology consists of three key components: (1) a robust preprocessing pipeline using Laplacian edge detection and K-means clustering to segment the area of interest, (2) synthetic data augmentation via StyleGAN2, and (3) a UFlow-based neural network architecture for multi-scale feature extraction and anomaly map generation. Experimental evaluation on 366 real sensor images demonstrates 93.8% accuracy on defective samples and 89.3% accuracy on good samples, with image-level AUROC of 0.957 and pixel-level AUROC of 0.961. The system provides potential annual cost savings through automated quality control and processing times of 0.5 seconds per image in on-device implementation.
Authors:Ramona Rubini, Siavash Khodakarami, Aniruddha Bora, George Em Karniadakis, Michele Dassisti
Abstract:
Accurate time-series forecasting for complex physical systems is the backbone of modern industrial monitoring and control. While deep learning models excel at capturing complex dynamics, currently, their deployment is limited due to physical inconsistency and robustness, hence constraining their reliability in regulated environments. We introduce process-informed forecasting (PIF) models for temperature in pharmaceutical lyophilization. We investigate a wide range of models, from classical ones such as Autoregressive Integrated Moving Average Model (ARIMA) and Exponential Smoothing Model (ETS), to modern deep learning architectures, including Kolmogorov-Arnold Networks (KANs). We compare three different loss function formulations that integrate a process-informed trajectory prior: a fixed-weight loss, a dynamic uncertainty-based loss, and a Residual-Based Attention (RBA) mechanism. We evaluate all models not only for accuracy and physical consistency but also for robustness to sensor noise. Furthermore, we test the practical generalizability of the best model in a transfer learning scenario on a new process. Our results show that PIF models outperform their data-driven counterparts in terms of accuracy, physical plausibility and noise resilience. This work provides a roadmap for developing reliable and generalizable forecasting solutions for critical applications in the pharmaceutical manufacturing landscape.
Authors:Harris Song, Tuan-Anh Vu, Sanjith Menon, Sriram Narasimhan, M. Khalid Jawed
Abstract:
Detecting hidden or partially concealed objects remains a fundamental challenge in multimodal environments, where factors like occlusion, camouflage, and lighting variations significantly hinder performance. Traditional RGB-based detection methods often fail under such adverse conditions, motivating the need for more robust, modality-agnostic approaches. In this work, we present HiddenObject, a fusion framework that integrates RGB, thermal, and depth data using a Mamba-based fusion mechanism. Our method captures complementary signals across modalities, enabling enhanced detection of obscured or camouflaged targets. Specifically, the proposed approach identifies modality-specific features and fuses them in a unified representation that generalizes well across challenging scenarios. We validate HiddenObject across multiple benchmark datasets, demonstrating state-of-the-art or competitive performance compared to existing methods. These results highlight the efficacy of our fusion design and expose key limitations in current unimodal and naïve fusion strategies. More broadly, our findings suggest that Mamba-based fusion architectures can significantly advance the field of multimodal object detection, especially under visually degraded or complex conditions.
Authors:Sofiane Bouaziz, Adel Hafiane, Raphael Canals, Rachid Nedjai
Abstract:
Urban heatwaves, droughts, and land degradation are pressing and growing challenges in the context of climate change. A valuable approach to studying them requires accurate spatio-temporal information on land surface conditions. One of the most important variables for assessing and understanding these phenomena is Land Surface Temperature (LST), which is derived from satellites and provides essential information about the thermal state of the Earth's surface. However, satellite platforms inherently face a trade-off between spatial and temporal resolutions. To bridge this gap, we propose FuseTen, a novel generative framework that produces daily LST observations at a fine 10 m spatial resolution by fusing spatio-temporal observations derived from Sentinel-2, Landsat 8, and Terra MODIS. FuseTen employs a generative architecture trained using an averaging-based supervision strategy grounded in physical principles. It incorporates attention and normalization modules within the fusion process and uses a PatchGAN discriminator to enforce realism. Experiments across multiple dates show that FuseTen outperforms linear baselines, with an average 32.06% improvement in quantitative metrics and 31.42% in visual fidelity. To the best of our knowledge, this is the first non-linear method to generate daily LST estimates at such fine spatial resolution.
Authors:Jinming Liu, Mingtong Chen, Zhengbao Yang
Abstract:
BiFeO3 has attracted much attention as a potential candidate for replacing conventional, lead-based piezoelectric materials due to its remarkable spontaneous polarization and high Curie temperature. However, its inherent high leakage currents, which lead to low piezoelectric response and poor temperature stability, have severely limited its practical applications. In this study, lead-free piezoelectric ceramics of the 0.7BiFeO3-0.3BaTiO3 (BF-BT) system were prepared, and their microstructures along with high-temperature electrical performance were modulated by introducing Nd3+. The results indicate that moderate Nd doping improves lattice symmetry and reduces oxygen vacancy-related defect dipoles, thereby effectively suppressing leakage currents at temperatures above 75°C. The Nd-doped samples exhibit significantly lower leakage current densities, reduced by over 99% compared to the undoped ceramics, reaching values as low as $10^{-5}$ A cm$^{-2}$. They also show higher resistivity under elevated temperatures and electric fields, offering notable improvements in thermal stability over the undoped counterparts. In addition, the Nd-doped samples achieved piezoelectric coefficients as high as 172 pC N$^{-1}$ at room temperature while still maintaining high dielectric and piezoelectric responses at elevated temperatures. This work not only provides an effective way to solve the leakage current problem of BF-BT ceramics in high-temperature applications but also indicates a new design strategy for optimizing the high-temperature stability of lead-free piezoelectric materials, which shows a broad application prospect in the field of high-temperature sensors and actuators.
Authors:Shiyu Xuan, Zechao Li, Jinhui Tang
Abstract:
Multi-modal object tracking integrates auxiliary modalities such as depth, thermal infrared, event flow, and language to provide additional information beyond RGB images, showing great potential in improving tracking stabilization in complex scenarios. Existing methods typically start from an RGB-based tracker and learn to understand auxiliary modalities only from training data. Constrained by the limited multi-modal training data, the performance of these methods is unsatisfactory. To alleviate this limitation, this work proposes a unified multi-modal tracker Diff-MM by exploiting the multi-modal understanding capability of the pre-trained text-to-image generation model. Diff-MM leverages the UNet of pre-trained Stable Diffusion as a tracking feature extractor through the proposed parallel feature extraction pipeline, which enables pairwise image inputs for object tracking. We further introduce a multi-modal sub-module tuning method that learns to gain complementary information between different modalities. By harnessing the extensive prior knowledge in the generation model, we achieve a unified tracker with uniform parameters for RGB-N/D/T/E tracking. Experimental results demonstrate the promising performance of our method compared with recently proposed trackers, e.g., its AUC outperforms OneTracker by 8.3% on TNL2K.
Authors:Shuai Lu, Zeyin Hou, Wei Gu, Yijun Xu
Abstract:
The inherent thermal storage capacity of buildings brings considerable thermal flexibility to the heating/cooling loads, which are promising demand response resources for power systems. It is widely believed that integrating the thermal flexibility of buildings into the distribution system can improve the operating economy and reliability of the system. However, the private information of the buildings needs to be transferred to the distribution system operator (DSO) to achieve a coordinated optimization, bringing serious privacy concerns to users. Given this issue, we propose a novel privacy-preserved optimal dispatch approach for the distribution system incorporating buildings. Using it, the DSO can exploit the thermal flexibility of buildings without accessing their private information, such as model parameters and indoor temperature profiles. Specifically, we first develop an optimal dispatch model for the distribution system integrating buildings, which can be extended to other storage-like flexibility resources. Second, we reveal that the privacy-preserved integration of buildings is a joint privacy preservation problem for both parameters and state variables and then design a privacy-preserved algorithm based on transformation-based encryption, constraint relaxation, and constraint extension techniques. Besides, we implement a detailed privacy analysis for the proposed method, considering both semi-honest adversaries and external eavesdroppers. Case studies demonstrate the accuracy, privacy-preserved performance, and computational efficiency of the proposed method.
Authors:Etienne Chassaing, Florent Forest, Olga Fink, Malcolm Mielle
Abstract:
In the European Union, buildings account for 42% of energy use and 35% of greenhouse gas emissions. Since most existing buildings will still be in use by 2050, retrofitting is crucial for emissions reduction. However, current building assessment methods rely mainly on qualitative thermal imaging, which limits data-driven decisions for energy savings. On the other hand, quantitative assessments using finite element analysis (FEA) offer precise insights but require manual CAD design, which is tedious and error-prone. Recent advances in 3D reconstruction, such as Neural Radiance Fields (NeRF) and Gaussian Splatting, enable precise 3D modeling from sparse images but lack clearly defined volumes and the interfaces between them needed for FEA. We propose Thermoxels, a novel voxel-based method able to generate FEA-compatible models, including both geometry and temperature, from a sparse set of RGB and thermal images. Using pairs of RGB and thermal images as input, Thermoxels represents a scene's geometry as a set of voxels comprising color and temperature information. After optimization, a simple process is used to transform Thermoxels' models into tetrahedral meshes compatible with FEA. We demonstrate Thermoxels' capability to generate RGB+Thermal meshes of 3D scenes, surpassing other state-of-the-art methods. To showcase the practical applications of Thermoxels' models, we conduct a simple heat conduction simulation using FEA, achieving convergence from an initial state defined by Thermoxels' thermal reconstruction. Additionally, we compare Thermoxels' image synthesis abilities with current state-of-the-art methods, showing competitive results, and discuss the limitations of existing metrics in assessing mesh quality.
Authors:Jinho Yang, Hyeongtaek Lee, Junil Choi
Abstract:
Different from conventional passive reconfigurable intelligent surfaces (RISs), incident signals and thermal noise can be amplified at active RISs. By exploiting the amplifying capability of active RISs, noticeable performance improvement can be expected when precise channel state information (CSI) is available. Since obtaining perfect CSI related to an RIS is difficult in practice, a robust transmission design is proposed in this paper to tackle the channel uncertainty issue, which will be more severe for active RIS-aided systems. To account for the worst-case scenario, the minimum achievable rate of each user is derived under a statistical CSI error model. Subsequently, an optimization problem is formulated to maximize the sum of the minimum achievable rate. Since the objective function is non-concave, the formulated problem is transformed into a tractable lower bound maximization problem, which is solved using an alternating optimization method. Numerical results show that the proposed robust design outperforms a baseline scheme that only exploits estimated CSI.
Authors:Tatsuro Sakai, Kanji Tanaka, Yuki Minase, Jonathan Tay Yu Liang, Muhammad Adil Luqman, Daiki Iwata
Abstract:
In robot vision, thermal cameras hold great potential for recognizing humans even in complete darkness. However, their application to multi-person tracking (MPT) has been limited due to data scarcity and the inherent difficulty of distinguishing individuals. In this study, we propose a cooperative MPT system that utilizes co-located RGB and thermal cameras, where pseudo-annotations (bounding boxes and person IDs) are used to train both RGB and thermal trackers. Evaluation experiments demonstrate that the thermal tracker performs robustly in both bright and dark environments. Moreover, the results suggest that a tracker-switching strategy, guided by a binary brightness classifier, is more effective for information integration than a tracker-fusion approach. As an application example, we present an image change pattern recognition (ICPR) method, the "human-as-landmark," which combines two key properties: the thermal recognizability of humans in dark environments and the rich landmark characteristics (appearance, geometry, and semantics) of static objects (occluders). Whereas conventional SLAM focuses on mapping static landmarks in well-lit environments, the present study takes a first step toward a new Human-Only SLAM paradigm, "Dynamic-Dark SLAM," which aims to map even dynamic landmarks in complete darkness. Additionally, this study demonstrates that knowledge transfer between thermal and depth modalities enables reliable person tracking using low-resolution 3D LiDAR data without RGB input, contributing an important advance toward cross-robot SLAM systems.
Authors:Pouya Shaeri, Saud AlKhaled, Ariane Middel
Abstract:
Outdoor thermal comfort is a critical determinant of urban livability, particularly in hot desert climates where extreme heat poses challenges to public health, energy consumption, and urban planning. Mean Radiant Temperature ($T_{mrt}$) is a key parameter for evaluating outdoor thermal comfort, especially in urban environments where radiation dynamics significantly impact human thermal exposure. Traditional methods of estimating $T_{mrt}$ rely on field measurements and computational simulations, both of which are resource intensive. This study introduces a Physics-Informed Neural Network (PINN) approach that integrates shortwave and longwave radiation modeling with deep learning techniques. By leveraging a multimodal dataset that includes meteorological data, built environment characteristics, and fisheye image-derived shading information, our model enhances predictive accuracy while maintaining physical consistency. Our experimental results demonstrate that the proposed PINN framework outperforms conventional deep learning models, with the best-performing configurations achieving an RMSE of 3.50 and an $R^2$ of 0.88. This approach highlights the potential of physics-informed machine learning in bridging the gap between computational modeling and real-world applications, offering a scalable and interpretable solution for urban thermal comfort assessments.
Authors:Daniel Barros, Paula Fraga-Lamas, Tiago M. Fernandez-Carames, Sergio Ivan Lopes
Abstract:
The Industry 5.0 paradigm focuses on industrial operator well-being and sustainable manufacturing practices, where humans play a central role, not only during the repetitive and collaborative tasks of the manufacturing process, but also in the management of the factory floor assets. Human factors, such as ergonomics, safety, and well-being, push the human-centric smart factory to efficiently adopt novel technologies while minimizing environmental and social impact. As operations at the factory floor increasingly rely on collaborative robots (CoBots) and flexible manufacturing systems, there is a growing demand for redundant safety mechanisms (i.e., automatic human detection in the proximity of machinery that is under operation). Fostering enhanced process safety for human proximity detection allows for the protection against possible incidents or accidents with the deployed industrial devices and machinery. This paper introduces the design and implementation of a cost-effective thermal imaging Safety Sensor that can be used in the scope of Industry 5.0 to trigger distinct safe mode states in manufacturing processes that rely on collaborative robotics. The proposed Safety Sensor uses a hybrid detection approach and has been evaluated under controlled environmental conditions. The obtained results show a 97% accuracy at low computational cost when using the developed hybrid method to detect the presence of humans in thermal images.
Authors:Andrii Lysyi, Anatoliy Sachenko, Pavlo Radiuk, Mykola Lysyi, Oleksandr Melnychenko, Diana Zahorodnia
Abstract:
The subject of this research is the development of an intelligent, integrated framework for the automated inspection of photovoltaic (PV) infrastructure that addresses the critical shortcomings of conventional methods, including thermal palette bias, data redundancy, and high communication bandwidth requirements. The goal of this study is to design, develop, and validate a comprehensive, multi-modal system that fully automates the monitoring workflow, from data acquisition to the generation of actionable, geo-located maintenance alerts, thereby enhancing plant safety and operational efficiency. The methods employed involve a synergistic architecture that begins with a palette-invariant thermal embedding, learned by enforcing representational consistency, which is fused with a contrast-normalized RGB stream via a gated mechanism. This is supplemented by a closed-loop, adaptive re-acquisition controller that uses Rodrigues-based updates for targeted confirmation of ambiguous anomalies and a geospatial deduplication module that clusters redundant alerts using DBSCAN over the haversine distance. In conclusion, this study establishes a powerful new paradigm for proactive PV inspection, with the proposed system achieving a mean Average Precision (mAP@0.5) of 0.903 on the public PVF-10 benchmark, a significant 12-15% improvement over single-modality baselines. Field validation confirmed the system's readiness, achieving 96% recall, while the de-duplication process reduced duplicate-induced false positives by 15-20%, and relevance-only telemetry cut airborne data transmission by 60-70%.
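The geospatial deduplication step described above (DBSCAN over the haversine distance) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the radius `eps_m`, the `min_samples` value, the mean Earth radius constant, and the sample coordinates are all assumed.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def dbscan(points, eps_m, min_samples):
    """Plain DBSCAN over haversine distance; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force neighbourhood query; includes the point itself.
        return [j for j in range(len(points)) if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, so keep expanding
    return labels


# Three alerts within a few metres of each other and one about 1 km away.
alerts = [(50.0, 30.0), (50.00001, 30.0), (50.00002, 30.00001), (50.01, 30.0)]
labels = dbscan(alerts, eps_m=5.0, min_samples=2)
```

Alerts that share a label would be reported once; the brute-force neighbour search is quadratic and only suitable for small alert sets.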
Authors:Zhipeng Ma, Bo Nørregaard Jørgensen, Zheng Grace Ma
Abstract:
Industrial process monitoring increasingly relies on sensor-generated time-series data, yet the lack of labels, high variability, and operational noise make it difficult to extract meaningful patterns using conventional methods. Existing clustering techniques either rely on fixed distance metrics or deep models designed for static data, limiting their ability to handle dynamic, unstructured industrial sequences. Addressing this gap, this paper proposes a novel framework for unsupervised discovery of operational modes in univariate time-series data using image-based convolutional clustering with composite internal evaluation. The proposed framework improves upon existing approaches in three ways: (1) raw time-series sequences are transformed into grayscale matrix representations via overlapping sliding windows, allowing effective feature extraction using a deep convolutional autoencoder; (2) the framework integrates both soft and hard clustering outputs and refines the selection through a two-stage strategy; and (3) clustering performance is objectively evaluated by a newly developed composite score, S_eva, which combines normalized Silhouette, Calinski-Harabasz, and Davies-Bouldin indices. Applied to over 3900 furnace melting operations from a Nordic foundry, the method identifies seven explainable operational patterns, revealing significant differences in energy consumption, thermal dynamics, and production duration. Compared to classical and deep clustering baselines, the proposed approach achieves superior overall performance, greater robustness, and domain-aligned explainability. The framework addresses key challenges in unsupervised time-series analysis, such as sequence irregularity, overlapping modes, and metric inconsistency, and provides a generalizable solution for data-driven diagnostics and energy optimization in industrial systems.
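Step (1) above, turning a univariate series into a grayscale matrix via overlapping sliding windows, might look roughly like the sketch below. The window length, stride, row-wise stacking, and global min-max scaling to 0..255 are assumptions; the abstract does not specify how the matrix is built.

```python
def series_to_grayscale(series, window, stride):
    """Stack overlapping windows as rows and min-max scale values to 0..255."""
    if window <= 0 or stride <= 0 or len(series) < window:
        raise ValueError("need window > 0, stride > 0 and len(series) >= window")
    # Each row is one window; consecutive rows overlap when stride < window.
    rows = [series[s:s + window] for s in range(0, len(series) - window + 1, stride)]
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    return [[round(255 * (v - lo) / span) for v in row] for row in rows]


# A length-5 series with window 3 and stride 1 gives a 3 x 3 matrix.
image = series_to_grayscale([0.0, 1.0, 2.0, 3.0, 4.0], window=3, stride=1)
```

The resulting matrix is what a convolutional autoencoder would take as its single-channel input.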
Authors:Rahul Kumar Padhy, Krishnan Suresh, Aaditya Chandrasekhar
Abstract:
Latent heat thermal energy storage (LHTES) systems are compelling candidates for energy storage, primarily owing to their high storage density. Improving their performance is crucial for developing the next generation of efficient and cost-effective devices. Topology optimization (TO) has emerged as a powerful computational tool to design LHTES systems by optimally distributing a high-conductivity material (HCM) and a phase change material (PCM). However, conventional TO is typically limited to optimizing the geometry for fixed, pre-selected materials. This approach does not leverage the large and expanding databases of novel materials. Consequently, the co-design of material and geometry for LHTES remains challenging and largely unexplored. To address this limitation, we present an automated design framework for the concurrent optimization of material choice and topology. A key challenge is the discrete nature of material selection, which is incompatible with the gradient-based methods used for TO. We overcome this by using a data-driven variational autoencoder (VAE) to project discrete material databases for both the HCM and PCM onto continuous and differentiable latent spaces. These continuous material representations are integrated into an end-to-end differentiable, transient nonlinear finite-element solver that accounts for phase change. We demonstrate this framework on a problem aimed at maximizing the discharged energy within a specified time, subject to cost constraints. The effectiveness of the proposed method is validated through several illustrative examples.
Authors:Neham Jain, Andrew Jong, Sebastian Scherer, Ioannis Gkioulekas
Abstract:
Smoke in real-world scenes can severely degrade the quality of images and hamper visibility. Recent methods for image restoration either rely on data-driven priors that are susceptible to hallucinations, or are limited to static low-density smoke. We introduce SmokeSeer, a method for simultaneous 3D scene reconstruction and smoke removal from a video capturing multiple views of a scene. Our method uses thermal and RGB images, leveraging the fact that the reduced scattering in thermal images enables us to see through the smoke. We build upon 3D Gaussian splatting to fuse information from the two image modalities, and decompose the scene explicitly into smoke and non-smoke components. Unlike prior approaches, SmokeSeer handles a broad range of smoke densities and can adapt to temporally varying smoke. We validate our approach on synthetic data and introduce a real-world multi-view smoke dataset with RGB and thermal images. We provide open-source code and data at the project website.
Authors:Serhii Svystun, Pavlo Radiuk, Oleksandr Melnychenko, Oleg Savenko, Anatoliy Sachenko
Abstract:
Unmanned aerial vehicles (UAVs) equipped with advanced sensors have opened up new opportunities for monitoring wind power plants, including blades, towers, and other critical components. However, reliable defect detection requires high-resolution data and efficient methods to process multispectral imagery. In this research, we aim to enhance defect detection accuracy through the development of an ensemble of YOLO-based deep learning models that integrate both visible and thermal channels. We propose an ensemble approach that integrates a general-purpose YOLOv8 model with a specialized thermal model, using a sophisticated bounding box fusion algorithm to combine their predictions. Our experiments show this approach achieves a mean Average Precision (mAP@.5) of 0.93 and an F1-score of 0.90, outperforming a standalone YOLOv8 model, which scored an mAP@.5 of 0.91. These findings demonstrate that combining multiple YOLO architectures with fused multispectral data provides a more reliable solution, improving the detection of both visual and thermal defects.
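The abstract above does not describe its bounding box fusion algorithm. As a stand-in, the sketch below shows one simple way to combine two detectors' outputs: greedy IoU matching followed by confidence-weighted averaging of matched boxes. The function names, the `iou_thr` threshold, and the choice to keep the higher score are assumptions and should not be read as the authors' method.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(dets_a, dets_b, iou_thr=0.5):
    """Fuse two detection lists of (box, score) pairs.

    Each box from dets_a is greedily matched to its best unused overlap in
    dets_b; matched pairs are averaged with score weights, and unmatched
    boxes from either list are kept unchanged.
    """
    fused, used = [], set()
    for box_a, s_a in dets_a:
        best_j, best_iou = None, iou_thr
        for j, (box_b, _) in enumerate(dets_b):
            if j in used:
                continue
            overlap = iou(box_a, box_b)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            fused.append((box_a, s_a))
            continue
        used.add(best_j)
        box_b, s_b = dets_b[best_j]
        total = s_a + s_b
        # Fall back to equal weights if both scores are zero.
        w_a, w_b = (s_a / total, s_b / total) if total > 0 else (0.5, 0.5)
        box = tuple(w_a * p + w_b * q for p, q in zip(box_a, box_b))
        fused.append((box, max(s_a, s_b)))
    fused.extend(d for j, d in enumerate(dets_b) if j not in used)
    return fused


# One shared detection plus one box seen only by the second (e.g. thermal) model.
rgb_dets = [((0, 0, 10, 10), 0.8)]
thermal_dets = [((0, 0, 10, 10), 0.2), ((50, 50, 60, 60), 0.9)]
merged = fuse(rgb_dets, thermal_dets)
```

Keeping unmatched boxes from both lists is what lets a thermal-only defect survive fusion even when the visible-light model misses it.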
Authors:Md Mazharul Islam, Diego Ferrer, Shamiul Alam, Juan P. Mendez, Denis Mamaluy, Wei Pan, Ahmedullah Aziz
Abstract:
The growing demand for ultra-low-power computing and the emergence of quantum technologies have intensified interest in cryogenic electronics, particularly superconducting devices. Despite their promise, current-controlled superconducting components face fundamental challenges in cascadability, limiting their effectiveness in complex logic architectures. To overcome this, recent efforts have focused on developing gate-tunable superconducting devices, such as Josephson junction field effect transistors (JJFETs). However, achieving robust control and sufficient supercurrent gain, both critical for transistor-like performance in logic circuits, remains a key challenge. A recent advancement in JJFET design, based on InAs and GaSb heterostructures, demonstrates enhanced gain and favorable device characteristics suitable for circuit integration. Building on this innovation, we propose and analyze fundamental voltage-controlled logic topologies using the quantum-enhanced JJFET. We develop a Verilog-A based, circuit-compatible compact model of the quantum-enhanced JJFET which accurately captures the experimentally observed device characteristics. To ensure cascadability, our logic circuits incorporate the multilayered Heater Nanocryotron (nTron), a superconducting nanowire-based thermal switch. Through simulation-based analysis, we demonstrate the successful implementation of fundamental logic gates, including NOT, NAND, and NOR. Furthermore, we design a 3-input majority gate, which plays a pivotal role in quantum and reversible computing due to its universality. Finally, to demonstrate the cascadability of our proposed logic topology, we show the operation of a 2-input XOR gate built from our JJFET-based NOT, NAND, and NOR gates.
Authors:Weihong Tang, Yun Li, Shalika Walker, Tamas Keviczky
Abstract:
Increasing penetration of renewable energy sources (RES) and electrification of energy systems necessitate the engagement of demand-side management (DSM) to help alleviate congestion in the electricity grid. Heat pump and thermal energy storage (HPTES) systems, being energy-efficient solutions, are becoming popular in modern buildings and are promising contributors to DSM due to their significant share in household electricity consumption. For typical HPTES systems, this paper presents a systematic design framework covering a control-oriented modeling process and energy-flexible model predictive control (MPC) design. The proposed MPC-based DSM strategy offers an innovative solution for efficient DSM by following a two-step DSM framework. In the first step, flexibility assessment is performed to quantitatively evaluate the flexibility potential of the HPTES system by solving a mixed-integer economic MPC problem. In the second step, flexibility exploitation is achieved through reacting to feasible demand response (DR) requests while respecting system constraints. Both numerical simulations and real-world experiments are performed based on a real HPTES installation to showcase the viability and effectiveness of the proposed design.
Authors:Sangwon Kang, Hao Tu, Huazhen Fang
Abstract:
Lithium-ion batteries are the enabling power source for transportation electrification. However, in real-world applications, they remain vulnerable to internal short circuits (ISCs) and the consequential risk of thermal runaway (TR). Toward addressing the challenge of ISCs and TR, we undertake a systematic study that extends from dynamic modeling to fault detection in this paper. First, we develop BattBee, the first equivalent circuit model to specifically describe the onset of ISCs and the evolution of subsequently induced TR. Drawing upon electrochemical modeling, the model can simulate ISCs at different severity levels and predict their impact on the initiation and progression of TR events. With the physics-inspired design, this model offers strong physical interpretability and predictive accuracy, while maintaining structural simplicity to allow fast computation. Then, building upon the BattBee model, we develop fault detection observers and derive detection criteria together with decision-making logics to identify the occurrence and emergence of ISC and TR events. This detection approach is principled in design and fast in computation, lending itself to practical applications. Validation based on simulations and experimental data demonstrates the effectiveness of both the BattBee model and the ISC/TR detection approach. The research outcomes underscore this study's potential for real-world battery safety risk management.
Authors:Junjie Yu, John S. Schreck, David John Gagne, Keith W. Oleson, Jie Li, Yongtu Liang, Qi Liao, Mingfei Sun, David O. Topping, Zhonghua Zheng
Abstract:
Reinforcement learning (RL)-based heating, ventilation, and air conditioning (HVAC) control has emerged as a promising technology for reducing building energy consumption while maintaining indoor thermal comfort. However, the efficacy of such strategies is influenced by the background climate and their implementation may potentially alter both the indoor climate and local urban climate. This study proposes an integrated framework combining RL with an urban climate model that incorporates a building energy model, aiming to evaluate the efficacy of RL-based HVAC control across different background climates, impacts of RL strategies on indoor climate and local urban climate, and the transferability of RL strategies across cities. Our findings reveal that the reward (defined as a weighted combination of energy consumption and thermal comfort) and the impacts of RL strategies on indoor climate and local urban climate exhibit marked variability across cities with different background climates. The sensitivity of reward weights and the transferability of RL strategies are also strongly influenced by the background climate. Cities in hot climates tend to achieve higher rewards across most reward weight configurations that balance energy consumption and thermal comfort, and those cities with more varying atmospheric temperatures demonstrate greater RL strategy transferability. These findings underscore the importance of thoroughly evaluating RL-based HVAC control strategies in diverse climatic contexts. This study also provides a new insight that city-to-city learning will potentially aid the deployment of RL-based HVAC control.
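The abstract above defines the reward only as a weighted combination of energy consumption and thermal comfort. One plausible shape for such a reward is sketched below; the comfort band, the use of degrees outside that band as the discomfort measure, the single weight `w_energy`, and the sign convention are all assumptions for illustration.

```python
def hvac_reward(energy_kwh, indoor_temp_c, w_energy=0.5, comfort_band=(20.0, 24.0)):
    """Negative weighted sum of energy use and thermal discomfort.

    Discomfort is taken as the number of degrees the indoor temperature lies
    outside the (assumed) comfort band, and is zero inside it.
    """
    if not 0.0 <= w_energy <= 1.0:
        raise ValueError("w_energy must lie in [0, 1]")
    low, high = comfort_band
    discomfort = max(low - indoor_temp_c, 0.0, indoor_temp_c - high)
    return -(w_energy * energy_kwh + (1.0 - w_energy) * discomfort)


# Same energy use, once inside and once 2 degrees above the comfort band.
r_comfortable = hvac_reward(2.0, 22.0, w_energy=0.5)
r_too_warm = hvac_reward(2.0, 26.0, w_energy=0.5)
```

Sweeping `w_energy` from 0 to 1 corresponds to the reward-weight configurations whose sensitivity the study reports as climate-dependent.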
Authors:Lei Wan, Prabesh Gupta, Andreas Eich, Marcel Kettelgerdes, Hannan Ejaz Keen, Michael Klöppel-Gersdorf, Alexey Vinel
Abstract:
Perception is a core capability of automated vehicles and has been significantly advanced through modern sensor technologies and artificial intelligence. However, perception systems still face challenges in complex real-world scenarios. To improve robustness against various external factors, multi-sensor fusion techniques are essential, combining the strengths of different sensor modalities. With recent developments in Vehicle-to-Everything (V2X) communication, sensor fusion can now extend beyond a single vehicle to a cooperative multi-agent system involving Connected Automated Vehicles (CAVs) and intelligent infrastructure. This paper presents VALISENS, an innovative multi-sensor system distributed across multiple agents. It integrates onboard and roadside LiDARs, radars, thermal cameras, and RGB cameras to enhance situational awareness and support cooperative automated driving. The thermal camera adds critical redundancy for perceiving Vulnerable Road Users (VRUs), while fusion with roadside sensors mitigates visual occlusions and extends the perception range beyond the limits of individual vehicles. We introduce the corresponding perception module built on this sensor system, which includes object detection, tracking, motion forecasting, and high-level data fusion. The proposed system demonstrates the potential of cooperative perception in real-world test environments and lays the groundwork for future Cooperative Intelligent Transport Systems (C-ITS) applications.
Authors:Anjith George, Sebastien Marcel
Abstract:
Heterogeneous Face Recognition (HFR) addresses the challenge of matching face images across different sensing modalities, such as thermal to visible or near-infrared to visible, expanding the applicability of face recognition systems in real-world, unconstrained environments. While recent HFR methods have shown promising results, many rely on computation-intensive architectures, limiting their practicality for deployment on resource-constrained edge devices. In this work, we present a lightweight yet effective HFR framework by adapting a hybrid CNN-Transformer architecture originally designed for face recognition. Our approach enables efficient end-to-end training with minimal paired heterogeneous data while preserving strong performance on standard RGB face recognition tasks. This makes it a compelling solution for both homogeneous and heterogeneous scenarios. Extensive experiments across multiple challenging HFR and face recognition benchmarks demonstrate that our method consistently outperforms state-of-the-art approaches while maintaining a low computational overhead.
Authors:Alessandro Arduino, Oriano Bottauscio, Denise Grappein, Stefano Scialó, Fabio Vicini, Umberto Zanovello, Luca Zilberti
Abstract:
Safety assessment of patients with one-dimensionally structured passive implants, like cranial meshes or stents, exposed to low or medium frequency magnetic fields, like those generated in magnetic resonance imaging or magnetic hyperthermia, can be challenging because of the different length scales of the implant and the human body. Most of the methods used to estimate the heating induced near such implants neglect the presence of the metallic materials within the body, modeling the metal as thermal seeds. To overcome this limitation, a novel numerical approach that solves three-dimensional and one-dimensional coupled problems is proposed. The proposed method is compared with measurements performed on a cranial mesh exposed to the magnetic field generated by a gradient coil system for magnetic resonance imaging. Then, it is applied to a magnetic hyperthermia case study in which a patient with a cranial mesh is exposed to the magnetic field generated by a collar-type magnetic hyperthermia applicator for neck tumour treatment. Comparison of the proposed method's predictions with the measurement data shows that accuracy near the maximum temperature increase improves by up to 25% with respect to the method based on thermal seeds. Applied to the magnetic hyperthermia case study, the proposed method predicts a maximum temperature increase that is 10% lower than the overestimate obtained with thermal seeds. At the same time, the proposed method corrects the underestimation produced by the thermal-seed method in the regions where the electromagnetic power is not directly deposited and the temperature increase is only due to heat transfer. The proposed method leads to improved results with respect to previous approximations by modelling the thermal diffusion through the highly conductive metallic implants.
Authors:Jonas Mirlach, Lei Wan, Andreas Wiedholz, Hannan Ejaz Keen, Andreas Eich
Abstract:
In autonomous driving, the integration of roadside perception systems is essential for overcoming occlusion challenges and enhancing the safety of Vulnerable Road Users (VRUs). While LiDAR and visual (RGB) sensors are commonly used, thermal imaging remains underrepresented in datasets, despite its acknowledged advantages for VRU detection in extreme lighting conditions. In this paper, we present R-LiViT, the first dataset to combine LiDAR, RGB, and thermal imaging from a roadside perspective, with a strong focus on VRUs. R-LiViT captures three intersections during both day and night, ensuring a diverse dataset. It includes 10,000 LiDAR frames and 2,400 temporally and spatially aligned RGB and thermal images across 150 traffic scenarios, with 7 and 8 annotated classes respectively, providing a comprehensive resource for tasks such as object detection and tracking. The dataset and the code for reproducing our evaluation results are made publicly available.
Authors:Navneet Singh, Shiva Raj Pokhrel
Abstract:
Quantum Machine Learning (QML) offers significant potential for complex tasks like genome sequence classification, but quantum noise on Noisy Intermediate-Scale Quantum (NISQ) devices poses practical challenges. This study systematically evaluates how various quantum noise models including dephasing, amplitude damping, depolarizing, thermal noise, bit-flip, and phase-flip affect key QML algorithms (QSVC, Peg-QSVC, QNN, VQC) and feature mapping techniques (ZFeatureMap, ZZFeatureMap, and PauliFeatureMap). Results indicate that QSVC is notably robust under noise, whereas Peg-QSVC and QNN are more sensitive, particularly to depolarizing and amplitude-damping noise. The PauliFeatureMap is especially vulnerable, highlighting difficulties in maintaining accurate classification under noisy conditions. These findings underscore the critical importance of feature map selection and noise mitigation strategies in optimizing QML for genomic classification, with promising implications for personalized medicine.
Authors:Serhii Svystun, Oleksandr Melnychenko, Pavlo Radiuk, Oleg Savenko, Andrii Lysyi
Abstract:
With the rapid development of green energy, the efficiency and reliability of wind turbines are key to sustainable renewable energy production. For that reason, this paper presents a novel intelligent system architecture designed for the dynamic collection and real-time processing of visual data to detect defects in wind turbines. The system employs advanced algorithms within a distributed framework to enhance inspection accuracy and efficiency using unmanned aerial vehicles (UAVs) with integrated visual and thermal sensors. An experimental study conducted at the "Staryi Sambir-1" wind power plant in Ukraine demonstrates the system's effectiveness, showing a significant improvement in defect detection accuracy (up to 94%) and a reduction in inspection time per turbine (down to 1.5 hours) compared to traditional methods. The results show that the proposed intelligent system architecture provides a scalable and reliable solution for wind turbine maintenance, contributing to the durability and performance of renewable energy infrastructure.
Authors:Serhii Svystun, Oleksandr Melnychenko, Pavlo Radiuk, Oleg Savenko, Anatoliy Sachenko, Andrii Lysyi
Abstract:
The inspection of wind turbine blades (WTBs) is crucial for ensuring their structural integrity and operational efficiency. Traditional inspection methods can be dangerous and inefficient, prompting the use of unmanned aerial vehicles (UAVs) that access hard-to-reach areas and capture high-resolution imagery. In this study, we address the challenge of enhancing defect detection on WTBs by integrating thermal and RGB images obtained from UAVs. We propose a multispectral image composition method that combines thermal and RGB imagery through spatial coordinate transformation, key point detection, binary descriptor creation, and weighted image overlay. Using a benchmark dataset of WTB images annotated for defects, we evaluated several state-of-the-art object detection models. Our results show that composite images significantly improve defect detection efficiency. Specifically, the YOLOv8 model's accuracy increased from 91% to 95%, precision from 89% to 94%, recall from 85% to 92%, and F1-score from 87% to 93%. The number of false positives decreased from 6 to 3, and missed defects reduced from 5 to 2. These findings demonstrate that integrating thermal and RGB imagery enhances defect detection on WTBs, contributing to improved maintenance and reliability.
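The F1-score quoted above is the harmonic mean of precision and recall, so the reported figures can be cross-checked. The following is a minimal sketch of that arithmetic; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs quoted in the abstract for YOLOv8,
# before and after switching to composite thermal+RGB images.
f1_before = f1_score(0.89, 0.85)
f1_after = f1_score(0.94, 0.92)
print(f"F1 before: {f1_before:.2f}, F1 after: {f1_after:.2f}")
```

Both values round to the F1-scores stated in the abstract (87% and 93%), so the reported precision, recall, and F1 figures are mutually consistent.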
Authors:Smruti Suresh, Michael Angelo Carvajal, Nathaniel Hanson, Ethan Holand, Samuel Hibbard, Taskin Padir
Abstract:
The inspection of confined critical infrastructure such as attics or crawlspaces is challenging for human operators due to insufficient task space, limited visibility, and the presence of hazardous materials. This paper introduces a prototype of PARIS (Precision Application Robot for Inaccessible Spaces): a use-inspired teleoperated mobile robot manipulator system that was conceived, developed, and tested for and selected as a Phase I winner of the U.S. Department of Energy's E-ROBOT Prize. To improve the thermal efficiency of buildings, the PARIS platform supports: 1) teleoperated mapping and navigation, enabling the human operator to explore compact spaces; 2) inspection and sensing, facilitating the identification and localization of under-insulated areas; and 3) air-sealing targeted gaps and cracks through which thermal energy is lost. The resulting versatile platform can also be tailored for targeted application of treatments and remediation in constrained spaces.
Authors:Devansh Dhrafani, Yifei Liu, Andrew Jong, Ukcheol Shin, Yao He, Tyler Harp, Yaoyu Hu, Jean Oh, Sebastian Scherer
Abstract:
Robust depth perception in visually-degraded environments is crucial for autonomous aerial systems. Thermal imaging cameras, which capture infrared radiation, are robust to visual degradation. However, due to lack of a large-scale dataset, the use of thermal cameras for unmanned aerial system (UAS) depth perception has remained largely unexplored. This paper presents a stereo thermal depth perception dataset for autonomous aerial perception applications. The dataset consists of stereo thermal images, LiDAR, IMU and ground truth depth maps captured in urban and forest settings under diverse conditions like day, night, rain, and smoke. We benchmark representative stereo depth estimation algorithms, offering insights into their performance in degraded conditions. Models trained on our dataset generalize well to unseen smoky conditions, highlighting the robustness of stereo thermal imaging for depth perception. We aim for this work to enhance robotic perception in disaster scenarios, allowing for exploration and operations in previously unreachable areas. The dataset and source code are available at https://firestereo.github.io.
Authors:Chao Tian, Zikun Zhou, Chao Yang, Guoqing Zhu, Fu'an Zhong, Zhenyu He
Abstract:
The advantage of RGB-Thermal (RGB-T) detection lies in its ability to perform modality fusion and integrate cross-modality complementary information, enabling robust detection under diverse illumination and weather conditions. However, under extreme conditions where one modality exhibits poor quality and disturbs detection, modality separation is necessary to mitigate the impact of noise. To address this problem, we propose a Modality-Decoupled RGB-T detection framework with Query Fusion (MDQF) to balance modality complementation and separation. In this framework, DETR-like detectors are employed as separate branches for the RGB and TIR images, with query fusion interspersed between the two branches in each refinement stage. Herein, query fusion is performed by feeding the high-quality queries from one branch to the other one after query selection and adaptation. This design effectively excludes the degraded modality and corrects the predictions using high-quality queries. Moreover, the decoupled framework allows us to optimize each individual branch with unpaired RGB or TIR images, eliminating the need for paired RGB-T data. Extensive experiments demonstrate that our approach delivers superior performance to existing RGB-T detectors and achieves better modality independence.
Authors:Manuel G. Satué, Manuel R. Arahal, Luis F. Acedo, Manuel G. Ortega
Abstract:
Economic model predictive control has been proposed as a means for solving the unit loading and unit allocation problem in multi-chiller cooling plants. The adjective economic stems from the use of the financial cost of electricity consumption over a time horizon as the loss function minimized at each sampling period. The energetic approach is rarely encountered. This article presents for the first time a comparison between the energetic optimization objective and the economic one. The comparison is made on a cooling plant using air-cooled water chillers and a cold storage system. The models developed have been integrated into Simscape, and non-convex mixed optimization methods are used to achieve optimal control trajectories for both energetic and economic goals considered separately. The results over several scenarios, and in different seasons, support the consideration of the energetic approach despite the current prevalence of the economic one. The results are dependent on the electric season and the available tariffs. In particular, for the high electric season and considering a representative tariff, the results show that an increment of about 2.15% in energy consumption takes place when using the economic approach instead of the energetic one. On the other hand, a reduction in cost of 2.94% is achieved.
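The difference between the two objectives can be illustrated with a small sketch. It assumes a piecewise-constant electrical power profile and a time-of-use tariff; the function names, the tariff, and both schedules are hypothetical numbers chosen for illustration and are not results from the article.

```python
from typing import Sequence


def energy_kwh(power_kw: Sequence[float], dt_h: float) -> float:
    """Energetic objective: total electrical energy over the horizon."""
    return sum(p * dt_h for p in power_kw)


def cost(power_kw: Sequence[float], tariff_per_kwh: Sequence[float], dt_h: float) -> float:
    """Economic objective: energy weighted by the tariff in each period."""
    if len(power_kw) != len(tariff_per_kwh):
        raise ValueError("power and tariff profiles must have the same length")
    return sum(p * c * dt_h for p, c in zip(power_kw, tariff_per_kwh))


# Hypothetical 4-hour horizon with a cheap and an expensive tariff period.
tariff = [0.10, 0.10, 0.30, 0.30]
flat_schedule = [100.0, 100.0, 100.0, 100.0]    # no use of cold storage
shifted_schedule = [160.0, 160.0, 45.0, 45.0]   # pre-cools storage while the tariff is low

for name, schedule in [("flat", flat_schedule), ("shifted", shifted_schedule)]:
    print(name, energy_kwh(schedule, 1.0), round(cost(schedule, tariff, 1.0), 2))
```

In this toy case the shifted schedule consumes more energy (for example through storage losses) yet costs less, which is the kind of trade-off between the energetic and economic objectives that the abstract quantifies.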
Authors:Reza T. Batley, Sourav Saha
Abstract:
Inverse design of heterogeneous material microstructures is a fundamentally ill-posed and famously computationally expensive problem. This is exacerbated by the high-dimensional design spaces associated with finely resolved images, multimodal input property streams, and a highly nonlinear forward physics. Whilst modern generative models excel at accurately modeling such complex forward behavior, most of them are not intrinsically structured to support fast, stable \emph{deterministic} inversion with a physics-informed bias. This work introduces Janus, a unified generative-predictive framework to address this problem. Janus couples a deep encoder-decoder architecture with a predictive KHRONOS head, a separable neural architecture. Topologically speaking, Janus learns a latent manifold simultaneously isometric for generative inversion and pruned for physical prediction; the joint objective inducing \emph{disentanglement} of the latent space. Janus is first validated on the MNIST dataset, demonstrating high-fidelity reconstruction, accurate classification and diverse generative inversion of all ten target classes. It is then applied to the inverse design of heterogeneous microstructures labeled with thermal conductivity. It achieves a forward prediction accuracy $R^2=0.98$ (2\% relative error) and sub-5\% pixelwise reconstruction error. Inverse solutions satisfy target properties to within $1\%$ relative error. Inverting a sweep through properties reveals smooth traversal of the latent manifold, and UMAP visualization confirms the emergence of a low-dimensional, disentangled manifold. By unifying prediction and generation within a single latent space, Janus enables real-time, physics-informed inverse microstructure generation at a lower computational cost than is typically associated with classical optimization-based approaches.
Authors:Elham Kiyani, Amit Makarand Deshpande, Madhura Limaye, Zhiwei Gao, Sai Aditya Pradeep, Srikanth Pilla, Gang Li, Zhen Li, George Em Karniadakis
Abstract:
Fiber reinforcement and polymer matrix respond differently to manufacturing conditions due to mismatch in coefficient of thermal expansion and matrix shrinkage during curing of thermosets. These heterogeneities generate residual stresses over multiple length scales, whose partial release leads to process-induced deformation (PID), requiring accurate prediction and mitigation via optimized non-isothermal cure cycles. This study considers a unidirectional AS4 carbon fiber/amine bi-functional epoxy prepreg and models PID using a two-mechanism framework that accounts for thermal expansion/shrinkage and cure shrinkage. The model is validated against manufacturing trials to identify initial and boundary conditions, then used to generate PID responses for a diverse set of non-isothermal cure cycles (time-temperature profiles). Building on this physics-based foundation, we develop a data-driven surrogate based on Deep Operator Networks (DeepONets). A DeepONet is trained on a dataset combining high-fidelity simulations with targeted experimental measurements of PID. We extend this to a Feature-wise Linear Modulation (FiLM) DeepONet, where branch-network features are modulated by external parameters, including the initial degree of cure, enabling prediction of time histories of degree of cure, viscosity, and deformation. Because experimental data are available only at limited time instances (for example, final deformation), we use transfer learning: simulation-trained trunk and branch networks are fixed and only the final layer is updated using measured final deformation. Finally, we augment the framework with Ensemble Kalman Inversion (EKI) to quantify uncertainty under experimental conditions and to support optimization of cure schedules for reduced PID in composites.
Authors:Hyunseung Kim, Dae-Woong Jeong, Changyoung Park, Won-Ji Lee, Ha-Eun Lee, Ji-Hye Lee, Rodrigo Hormazabal, Sung Moon Ko, Sumin Lee, Soorin Yim, Chanhui Lee, Sehui Han, Sang-Ho Cha, Woohyung Lim
Abstract:
Artificial intelligence (AI) has emerged as a powerful accelerator of materials discovery, yet most existing models remain problem-specific, requiring additional data collection and retraining for each new property. Here we introduce and validate GATE (Geometrically Aligned Transfer Encoder) -- a generalizable AI framework that jointly learns 34 physicochemical properties spanning thermal, electrical, mechanical, and optical domains. By aligning these properties within a shared geometric space, GATE captures cross-property correlations that reduce disjoint-property bias -- a key factor causing false negatives in multi-criteria screening. To demonstrate its generalizability, GATE -- without any problem-specific reconfiguration -- was directly applied to the discovery of immersion cooling fluids for data centers, a stringent real-world challenge defined by the Open Compute Project (OCP). Screening billions of candidates, GATE identified 92,861 molecules as promising for practical deployment. Four were validated experimentally or against the literature, showing strong agreement with wet-lab measurements and performance comparable to or exceeding a commercial coolant. These results establish GATE as a ready-to-use, generalizable AI platform readily applicable across diverse materials discovery tasks.
Authors:Muhammad Ishfaq Hussain, Ma Van Linh, Zubia Naz, Unse Fatima, Yeongmin Ko, Moongu Jeon
Abstract:
Enhancing scene understanding in adverse visibility conditions remains a critical challenge for surveillance and autonomous navigation systems. Conventional imaging modalities, such as RGB and thermal infrared (MWIR / LWIR), when fused, often struggle to deliver comprehensive scene information, particularly under conditions of atmospheric interference or inadequate illumination. To address these limitations, Short-Wave Infrared (SWIR) imaging has emerged as a promising modality due to its ability to penetrate atmospheric disturbances and differentiate materials with improved clarity. However, the advancement and widespread implementation of SWIR-based systems face significant hurdles, primarily due to the scarcity of publicly accessible SWIR datasets. In response to this challenge, our research introduces an approach to synthetically generate images with SWIR-like structural/contrast cues (without claiming spectral reproduction) from existing LWIR data using advanced contrast enhancement techniques. We then propose a multimodal fusion framework integrating synthetic SWIR, LWIR, and RGB modalities, employing an optimized encoder-decoder neural network architecture with modality-specific encoders and a softmax-gated fusion head. Comprehensive experiments on public RGB-LWIR benchmarks (M3FD, TNO, CAMEL, MSRS, RoadScene) and an additional private real RGB-MWIR-SWIR dataset demonstrate that our synthetic-SWIR-enhanced fusion framework improves fused-image quality (contrast, edge definition, structural fidelity) while maintaining real-time performance. We also add fair trimodal baselines (LP, LatLRR, GFF) and cascaded trimodal variants of U2Fusion/SwinFusion under a unified protocol. The outcomes highlight substantial potential for real-world applications in surveillance and autonomous systems.
Authors:Elton Pan, Soonhyoung Kwon, Sulin Liu, Mingrou Xie, Alexander J. Hoffman, Yifei Duan, Thorben Prein, Killian Sheriff, Yuriy Roman-Leshkov, Manuel Moliner, Rafael Gomez-Bombarelli, Elsa Olivetti
Abstract:
The synthesis of crystalline materials, such as zeolites, remains a significant challenge due to a high-dimensional synthesis space, intricate structure-synthesis relationships and time-consuming experiments. Considering the one-to-many relationship between structure and synthesis, we propose DiffSyn, a generative diffusion model trained on over 23,000 synthesis recipes spanning 50 years of literature. DiffSyn generates probable synthesis routes conditioned on a desired zeolite structure and an organic template. DiffSyn achieves state-of-the-art performance by capturing the multi-modal nature of structure-synthesis relationships. We apply DiffSyn to differentiate among competing phases and generate optimal synthesis routes. As a proof of concept, we synthesize a UFI material using DiffSyn-generated synthesis routes. These routes, rationalized by density functional theory binding energies, resulted in the successful synthesis of a UFI material with a high Si/Al$_{\text{ICP}}$ of 19.0, which is expected to improve thermal stability and is higher than that of any previously recorded.
Authors:Ruchira V Bhat, Rahul Bhowmick, Avinash Singh, Krishna Kumar Sabapathy
Abstract:
The preparation of quantum Gibbs states is a fundamental challenge in quantum computing, essential for applications ranging from modeling open quantum systems to quantum machine learning. Building on the Meta-Variational Quantum Eigensolver framework proposed by Cervera-Lierta et al. (2021) and a problem-driven ansatz design, we introduce two meta-learning algorithms: Meta-Variational Quantum Thermalizer (Meta-VQT) and Neural Network Meta-VQT (NN-Meta VQT) for efficient thermal state preparation of parametrized Hamiltonians on Noisy Intermediate-Scale Quantum (NISQ) devices. Meta-VQT utilizes a fully quantum ansatz, while NN-Meta VQT integrates a quantum-classical hybrid architecture. Both leverage collective optimization over training sets to generalize Gibbs state preparation to unseen parameters. We validate our methods on the Transverse Field Ising Model with up to 8 qubits and the 2-qubit Heisenberg model with all field terms, demonstrating efficient thermal state generation beyond training data. For larger systems, we show that our meta-learned parameters, when combined with an appropriately designed ansatz, serve as warm-start initializations, significantly outperforming random initializations in the optimization tasks. Furthermore, a 3-qubit Kitaev ring example showcases our algorithm's effectiveness across finite-temperature crossover regimes. Finally, we apply our algorithms to train a Quantum Boltzmann Machine (QBM) on a 2-qubit Heisenberg model with all field terms, achieving enhanced training efficiency, improved Gibbs state accuracy, and a 30-fold runtime speedup over existing techniques such as variational quantum imaginary time (VarQITE)-based QBM, highlighting the scalability and practicality of meta-algorithm-based QBMs.
Authors:Hanyang Zhou, Haozhe Lou, Wenhao Liu, Enyu Zhao, Yue Wang, Daniel Seita
Abstract:
Advancing dexterous manipulation with multi-fingered robotic hands requires rich sensory capabilities, while existing designs lack onboard thermal and torque sensing. In this work, we propose the MOTIF hand, a novel multimodal and versatile robotic hand that extends the LEAP hand by integrating: (i) dense tactile information across the fingers, (ii) a depth sensor, (iii) a thermal camera, (iv) IMU sensors, and (v) a visual sensor. The MOTIF hand is designed to be relatively low-cost (under 4000 USD) and easily reproducible. We validate our hand design through experiments that leverage its multimodal sensing for two representative tasks. First, we integrate thermal sensing into 3D reconstruction to guide temperature-aware, safe grasping. Second, we show how our hand can distinguish objects with identical appearance but different masses - a capability beyond methods that use vision only.
Authors:Evan J. D. Anderson, Michael S. Bullock, Ohad Kimelfeld, Christopher K. Eyre, Filip Rozpędek, Uzi Pereg, Boulat A. Bash
Abstract:
We explore covert entanglement generation over the lossy thermal-noise bosonic channel, which is a quantum-mechanical model of many practical settings, including optical, microwave, and radio-frequency (RF) channels. Covert communication ensures that an adversary is unable to detect the presence of transmissions, which are concealed in channel noise. We show that a $\textit{square root law}$ (SRL) holds for covert entanglement generation, similar to that for classical communication: $L_{\rm EG}\sqrt{n}$ entangled bits (ebits) can be generated covertly and reliably over $n$ uses of a bosonic channel. We report a single-letter expression for the optimal $L_{\rm EG}$ as well as an achievable method. We additionally analyze the performance of covert entanglement generation using single- and dual-rail photonic qubits, which may be more practical for physical implementation.
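The scaling implied by the square root law can be made concrete with a short sketch. The constant passed as `l_eg` below is a placeholder chosen for illustration; the abstract does not give a numerical value for $L_{\rm EG}$.

```python
import math


def covert_ebits(l_eg: float, n: int) -> float:
    """Ebits generated covertly over n channel uses under the square root law."""
    if n < 0:
        raise ValueError("number of channel uses must be non-negative")
    return l_eg * math.sqrt(n)


# Placeholder constant (an assumption for illustration only).
L_EG_EXAMPLE = 0.5
for n in (400, 1600, 6400):
    total = covert_ebits(L_EG_EXAMPLE, n)
    print(n, total, total / n)  # the per-use rate shrinks as 1/sqrt(n)
```

Quadrupling the number of channel uses only doubles the number of covert ebits, so the rate per channel use tends to zero as $n$ grows.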
Authors:Mia Thomas, Trevor Ablett, Jonathan Kelly
Abstract:
Unmanned aerial vehicles (UAVs) enable operations in remote and hazardous environments, yet the visible-spectrum, camera-based navigation systems often relied upon by UAVs struggle in low-visibility conditions. Thermal cameras, which capture long-wave infrared radiation, are able to function effectively in darkness and smoke, where visible-light cameras fail. This work explores learned cross-spectral (thermal-visible) point features as a means to integrate thermal imagery into established camera-based navigation systems. Existing methods typically train a feature network's detection and description outputs directly, which often focuses training on image regions where thermal and visible-spectrum images exhibit similar appearance. Aiming to more fully utilize the available data, we propose a method to train the feature network on the tasks of matching and registration. We run our feature network on thermal-visible image pairs, then feed the network response into a differentiable registration pipeline. Losses are applied to the matching and registration estimates of this pipeline. Our selected model, trained on the task of matching, achieves a registration error (corner error) below 10 pixels for more than 75% of estimates on the MultiPoint dataset. We further demonstrate that our model can also be used with a classical pipeline for matching and registration.
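Corner error is commonly computed by warping the four image corners with the estimated and the reference homography and averaging the distances between the results. The sketch below follows that common definition; the exact protocol used in the paper may differ, and the function names are illustrative.

```python
import math
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def warp_point(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Apply a 3x3 homography to a 2D point using homogeneous coordinates."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


def mean_corner_error(h_est: Matrix3, h_true: Matrix3, width: int, height: int) -> float:
    """Mean Euclidean distance between the image corners warped by both homographies."""
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    total = 0.0
    for x, y in corners:
        ex, ey = warp_point(h_est, x, y)
        tx, ty = warp_point(h_true, x, y)
        total += math.hypot(ex - tx, ey - ty)
    return total / len(corners)


# Example: the estimate is off by a pure translation of (3, 4) pixels.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shifted = [[1.0, 0.0, 3.0], [0.0, 1.0, 4.0], [0.0, 0.0, 1.0]]
error_px = mean_corner_error(shifted, identity, width=640, height=480)
print(error_px)
```

Under this definition, an estimate counts toward the "below 10 pixels" figure quoted above when its mean corner error is less than 10.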
Authors:Zhiwei Cao, Minghao Li, Feng Lin, Jimin Jia, Yonggang Wen, Jianxiong Yin, Simon See
Abstract:
Data centers (DCs) as mission-critical infrastructures are pivotal in powering the growth of artificial intelligence (AI) and the digital economy. The evolution from Internet DC to AI DC has introduced new challenges in operating and managing data centers for improved business resilience and reduced total cost of ownership. As a result, new paradigms, beyond the traditional approaches based on best practices, are needed for future data centers. In this research, we propose and develop a novel Physical AI (PhyAI) framework for advancing DC operations and management. Our system leverages the emerging capabilities of state-of-the-art industrial products and our in-house research and development. Specifically, it presents three core modules, namely: 1) an industry-grade in-house simulation engine to simulate DC operations in a highly accurate manner, 2) an AI engine built upon NVIDIA PhysicsNemo for the training and evaluation of physics-informed machine learning (PIML) models, and 3) a digital twin platform built upon NVIDIA Omniverse for our proposed 5-tier digital twin framework. This system presents a scalable and adaptable solution to digitalize, optimize, and automate future data center operations and management, by enabling real-time digital twins for future data centers. To illustrate its effectiveness, we present a compelling case study on building a surrogate model for predicting the thermal and airflow profiles of a large-scale DC in a real-time manner. Our results demonstrate its superior performance over traditional time-consuming Computational Fluid Dynamics/Heat Transfer (CFD/HT) simulation, with a median absolute temperature prediction error of 0.18 °C. This emerging approach would open doors to several potential research directions for advancing Physical AI in future DC operations.
Authors:Sayed Pedram Haeri Boroujeni, Niloufar Mehrabi, Fatemeh Afghah, Connor Peter McGrath, Danish Bhatkar, Mithilesh Anil Biradar, Abolfazl Razi
Abstract:
Fire and smoke phenomena pose a significant threat to the natural environment, ecosystems, and global economy, as well as human lives and wildlife. In this context, there is a demand for more sophisticated and advanced technologies to implement an effective strategy for early detection, real-time monitoring, and minimizing the overall impacts of fires on ecological balance and public safety. Recently, the rapid advancement of Artificial Intelligence (AI) and Computer Vision (CV) frameworks has substantially accelerated the development of efficient fire management systems. However, these systems rely extensively on the availability of adequate and high-quality fire and smoke data to create proficient Machine Learning (ML) methods for various tasks, such as detection and monitoring. Although fire and smoke datasets play a critical role in training, evaluating, and testing advanced Deep Learning (DL) models, a comprehensive review of the existing datasets is still lacking. For this purpose, we provide an in-depth review to systematically analyze and evaluate fire and smoke datasets collected over the past 20 years. We investigate the characteristics of each dataset, including type, size, format, collection methods, and geographical diversities. We also review and highlight the unique features of each dataset, such as imaging modalities (RGB, thermal, infrared) and their applicability for different fire management tasks (classification, segmentation, detection). Furthermore, we summarize the strengths and weaknesses of each dataset and discuss their potential for advancing research and technology in fire management. Ultimately, we conduct extensive experimental analyses across different datasets using several state-of-the-art algorithms, such as ResNet-50, DeepLab-V3, and YOLOv8.
Authors:Faaiq Waqar, Jungyoun Kwak, Junmo Lee, Minji Shon, Mohammadhosein Gholamrezaei, Kevin Skadron, Shimeng Yu
Abstract:
The Last Level Cache (LLC) is the processor's critical bridge between on-chip and off-chip memory levels - optimized for high density, high bandwidth, and low operation energy. To date, high-density (HD) SRAM has been the conventional device of choice; however, with the slowing of transistor scaling, as reflected in the industry's almost identical HD SRAM cell size from 5 nm to 3 nm, alternative solutions such as 3D stacking with advanced packaging like hybrid bonding are pursued (as demonstrated in AMD's V-cache). Escalating data demands necessitate ultra-large on-chip caches to decrease costly off-chip memory movement, pushing the exploration of device technology toward monolithic 3D (M3D) integration where transistors can be stacked in the back-end-of-line (BEOL) at the interconnect level. M3D integration requires fabrication techniques compatible with a low thermal budget (<400 degC). Among promising BEOL device candidates are amorphous oxide semiconductor (AOS) transistors, particularly desirable for their ultra-low leakage (seconds) when used in a gain-cell configuration. This paper examines device, circuit, and system-level tradeoffs when optimizing BEOL-compatible AOS-based 2-transistor gain cell (2T-GC) for LLC. A cache early-exploration tool, NS-Cache, is developed to model caches in advanced 7 and 3 nm nodes and is integrated with the Gem5 simulator to systematically benchmark the impact of the newfound density/performance when compared to HD-SRAM, MRAM, and 1T1C eDRAM alternatives for LLC.
Authors:Wenjing Gong, Xinyue Ye, Keshu Wu, Suphanut Jamonnak, Wenyu Zhang, Yifan Yang, Xiao Huang
Abstract:
Extreme heat events, exacerbated by climate change, pose significant challenges to urban resilience and planning. This study introduces a climate-responsive digital twin framework integrating the Spatiotemporal Vision Transformer (ST-ViT) model to enhance heat stress forecasting and decision-making. Using a Texas campus as a testbed, we synthesized high-resolution physical model simulations with spatial and meteorological data to develop fine-scale human thermal predictions. The ST-ViT-powered digital twin enables efficient, data-driven insights for planners and stakeholders, supporting targeted heat mitigation strategies and advancing climate-adaptive urban design. This campus-scale demonstration offers a foundation for future applications across broader and more diverse urban contexts.
Authors:Sertac Kilickaya, Cansu Celebioglu, Levent Eren, Murat Askar
Abstract:
Condition monitoring of induction machines is crucial to prevent costly interruptions and equipment failure. Mechanical faults such as misalignment and rotor issues are among the most common problems encountered in industrial environments. To effectively monitor and detect these faults, a variety of sensors, including accelerometers, current sensors, temperature sensors, and microphones, are employed in the field. As a non-contact alternative, thermal imaging offers a powerful monitoring solution by capturing temperature variations in machines with thermal cameras. In this study, we propose using 2-dimensional Self-Organized Operational Neural Networks (Self-ONNs) to diagnose misalignment and broken rotor faults from thermal images of squirrel-cage induction motors. We evaluate our approach by benchmarking its performance against widely used Convolutional Neural Networks (CNNs), including ResNet, EfficientNet, PP-LCNet, SEMNASNet, and MixNet, using a Workswell InfraRed Camera (WIC). Our results demonstrate that Self-ONNs, with their non-linear neurons and self-organizing capability, achieve diagnostic performance comparable to more complex CNN models while utilizing a shallower architecture with just three operational layers. Its streamlined architecture ensures high performance and is well-suited for deployment on edge devices, enabling its use also in more complex multi-function and/or multi-device monitoring systems.
Authors:Jingchao Peng, Thomas Bashford-Rogers, Zhuang Shao, Haitao Zhao, Aru Ranjan Singh, Abhishek Goswami, Kurt Debattista
Abstract:
Infrared (IR) imaging offers advantages in several fields due to its unique ability to capture content in extreme light conditions. However, the demanding hardware requirements of high-resolution IR sensors limit its widespread application. As an alternative, visible light can be used to synthesize IR images, but this causes a loss of fidelity in image details and introduces inconsistencies due to a lack of contextual awareness of the scene. The loss of fidelity stems from using visible light with a standard dynamic range, especially under extreme lighting, while the lack of contextual awareness can result in pseudo-thermal-crossover artifacts. These occur when multiple objects with similar temperatures appear indistinguishable in the training data, further exacerbating the loss of fidelity. To solve this challenge, this paper proposes CapHDR2IR, a novel framework incorporating vision-language models using high dynamic range (HDR) images as inputs to generate IR images. HDR images capture a wider range of luminance variations, ensuring reliable IR image generation in different light conditions. Additionally, a dense caption branch integrates semantic understanding, resulting in more meaningful and discernible IR outputs. Extensive experiments on the HDRT dataset show that the proposed CapHDR2IR achieves state-of-the-art performance compared with existing general domain transfer methods and those tailored for visible-to-infrared image translation.
Authors:Xiaoyang Wang, Xin Chen
Abstract:
The large-scale integration of inverter-interfaced renewable energy sources presents significant challenges to maintaining power balance and nominal frequency in modern power systems. This paper studies grid-level coordinated control of grid-forming (GFM) and grid-following (GFL) inverter-based resources (IBRs) for scalable and optimal frequency control. We propose a fully distributed optimal frequency control algorithm that is based on the projected primal-dual gradient method and leverages the structure of the underlying physical system dynamics. The proposed algorithm i) restores the nominal system frequency while minimizing total control cost and enforcing IBR power capacity limits and line thermal constraints, and ii) operates in a distributed manner that only needs local measurements and neighbor-to-neighbor communication. In particular, when the line thermal constraints are disregarded, the proposed algorithm admits a fully local implementation that requires no communication, while still ensuring optimality and satisfying IBR power capacity limits. We establish the global asymptotic convergence of the algorithm using Lyapunov stability analysis. The effectiveness and optimality of the proposed algorithms are validated through high-fidelity, 100% inverter-based electromagnetic transient (EMT) simulations on the IEEE 39-bus system.
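For readers unfamiliar with the projected primal-dual gradient method named in this abstract, the following is a minimal centralized sketch on a toy problem, not the paper's distributed controller. The quadratic costs, the single power-balance constraint, the box limits, and the names `projected_primal_dual`, `costs`, and `demand` are all assumptions introduced for illustration.

```python
def projected_primal_dual(costs, demand, lo, hi, step=0.05, iters=5000):
    """Toy problem: minimize sum(0.5 * c_i * u_i**2)
    subject to sum(u_i) == demand and lo <= u_i <= hi.

    Lagrangian: L(u, lam) = sum(0.5 * c_i * u_i**2) + lam * (demand - sum(u)).
    """
    u = [0.0] * len(costs)   # primal variables (hypothetical resource set-points)
    lam = 0.0                # dual variable of the balance constraint
    for _ in range(iters):
        # Primal step: gradient descent on L, then projection onto the box [lo, hi].
        new_u = [min(hi, max(lo, ui - step * (c * ui - lam)))
                 for c, ui in zip(costs, u)]
        # Dual step: gradient ascent on the balance residual.
        lam += step * (demand - sum(u))
        u = new_u
    return u, lam


if __name__ == "__main__":
    u, lam = projected_primal_dual([1.0, 2.0, 4.0], demand=3.5, lo=0.0, hi=1.5)
    print([round(x, 4) for x in u], round(lam, 4))
```

With these illustrative inputs the cheapest resource saturates at its upper limit of 1.5, and the other two split the remaining 2.0 so that their marginal costs equal the multiplier (about 2.6667). The projection is what enforces the capacity limits at every iteration.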
Authors:Dongjun Li, Qiuhao Hu, Weiran Jiang, Haoxuan Dong, Ziyou Song
Abstract:
Effective power and thermal management are essential for ensuring battery efficiency, safety, and longevity in Connected and Automated Electric Vehicles (CAEVs). However, real-time implementation is challenging due to the multi-timescale dynamics and complex trade-offs between energy consumption, battery degradation, traffic efficiency, and thermal regulation. This paper proposes a novel integrated power and thermal management strategy based on the Multi-Horizon Model Predictive Control (MH-MPC) framework to enhance energy efficiency, optimize battery temperature, ensure traffic safety and efficiency, and reduce battery degradation for CAEVs. The proposed strategy is formulated with a focus on the aging term, allowing it to more effectively manage the trade-offs between energy consumption, battery degradation, and temperature regulation. Moreover, the proposed strategy leverages multi-horizon optimization to achieve substantial improvements, reducing computation time by 7.18%, cooling energy by 14.22%, traction energy by 8.26%, battery degradation loss by over 22%, and battery degradation inconsistency by 36.57% compared to the benchmark strategy. Furthermore, sensitivity analyses of key parameters, including weighting factors, sampling time, and prediction horizons, demonstrate the robustness of the strategy and underscore its potential for practical applications in extending battery lifespan while ensuring safety and efficiency.
Authors:Dhrumil Patel, Mark M. Wilde
Abstract:
Thermal states play a fundamental role in various areas of physics, and they are becoming increasingly important in quantum information science, with applications related to semi-definite programming, quantum Boltzmann machine learning, Hamiltonian learning, and the related task of estimating the parameters of a Hamiltonian. Here we establish formulas underlying the basic geometry of parameterized thermal states, and we delineate quantum algorithms for estimating the values of these formulas. More specifically, we prove formulas for the Fisher--Bures and Kubo--Mori information matrices of parameterized thermal states, and our quantum algorithms for estimating their matrix elements involve a combination of classical sampling, Hamiltonian simulation, and the Hadamard test. These results have applications in developing a natural gradient descent algorithm for quantum Boltzmann machine learning, which takes into account the geometry of thermal states, and in establishing fundamental limitations on the ability to estimate the parameters of a Hamiltonian, when given access to thermal-state samples. For the latter task, and for the special case of estimating a single parameter, we sketch an algorithm that realizes a measurement that is asymptotically optimal for the estimation task. We finally stress that the natural gradient descent algorithm developed here can be used for any machine learning problem that employs the quantum Boltzmann machine ansatz.
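The Hadamard test mentioned in this abstract is a standard primitive for estimating the real part of an expectation value of a unitary. The sketch below is a classical simulation of that primitive only, for a single-qubit unitary; it is not the paper's estimator for the Fisher-Bures or Kubo-Mori matrix elements. The helper names and the example gate are assumptions chosen for illustration.

```python
import cmath
import math
import random


def apply(U, psi):
    """Matrix-vector product for small dense matrices stored as nested lists."""
    return [sum(U[r][c] * psi[c] for c in range(len(psi))) for r in range(len(U))]


def hadamard_test_p0(U, psi):
    """Probability of measuring the ancilla in |0>.

    Circuit: H on ancilla, controlled-U on the system, H on ancilla.
    The |0> branch carries (psi + U psi) / 2, so P(0) = (1 + Re<psi|U|psi>) / 2.
    """
    u_psi = apply(U, psi)
    branch0 = [(a + b) / 2 for a, b in zip(psi, u_psi)]
    return sum(abs(x) ** 2 for x in branch0)


def estimate_real_expectation(U, psi, shots=20000, seed=0):
    """Sample ancilla outcomes and return the estimate 2 * P(0) - 1 of Re<psi|U|psi>."""
    rng = random.Random(seed)
    p0 = hadamard_test_p0(U, psi)
    zeros = sum(1 for _ in range(shots) if rng.random() < p0)
    return 2 * zeros / shots - 1


if __name__ == "__main__":
    theta = math.pi / 3
    U = [[1, 0], [0, cmath.exp(1j * theta)]]      # phase gate (illustrative choice)
    psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # |+> state
    print(estimate_real_expectation(U, psi))      # close to (1 + cos(theta)) / 2
```

The statistical error of the estimate shrinks as the inverse square root of the number of shots, which is why sample-complexity bounds for algorithms built on this primitive are typically polynomial in the inverse target precision.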
Authors:Dhrumil Patel, Daniel Koch, Saahil Patel, Mark M. Wilde
Abstract:
Estimating the ground-state energy of Hamiltonians is a fundamental task for which it is believed that quantum computers can be helpful. Several approaches have been proposed toward this goal, including algorithms based on quantum phase estimation and hybrid quantum-classical optimizers involving parameterized quantum circuits, the latter falling under the umbrella of the variational quantum eigensolver. Here, we analyze the performance of quantum Boltzmann machines for this task, which is a less explored ansatz based on parameterized thermal states and which is not known to suffer from the barren-plateau problem. We delineate a hybrid quantum-classical algorithm for this task and rigorously prove that it converges to an $\varepsilon$-approximate stationary point of the energy function optimized over parameter space, while using a number of parameterized-thermal-state samples that is polynomial in $\varepsilon^{-1}$, the number of parameters, and the norm of the Hamiltonian being optimized. Our algorithm estimates the gradient of the energy function efficiently by means of a novel quantum circuit construction that combines classical sampling, Hamiltonian simulation, and the Hadamard test, thus overcoming a key obstacle to quantum Boltzmann machine learning that has been left open since [Amin et al., Phys. Rev. X 8, 021050 (2018)]. Additionally supporting our main claims are calculations of the gradient and Hessian of the energy function, as well as an upper bound on the matrix elements of the latter that is used in the convergence analysis.
Authors:Xingwei Zhong, Kui Cai, Guanghui Song
Abstract:
As an emerging non-volatile memory (NVM) technology, spin-torque transfer magnetic random access memory (STT-MRAM) has received great attention in recent years since it combines the features of low switching energy, fast write/read speed, and high scalability. However, process variation and thermal fluctuation severely affect the data integrity of STT-MRAM, resulting in both write errors and read errors. Therefore, effective error correction codes (ECCs) are necessary for correcting memory cell errors. Meanwhile, the design of the channel quantizer plays a critical role in supporting error correction coding for STT-MRAM. In this work, we propose a union bound analysis which can accurately predict the word error rates (WERs) of ECCs with maximum-likelihood (ML) decoding over the quantized STT-MRAM channel. The derived bound provides a theoretical tool for comparing the performance of ECCs with different quantization schemes at very low error rate levels without resorting to lengthy computer simulations. Moreover, we also propose a new criterion to design the channel quantizer by minimizing the WERs of ECC decoding that are obtained from the union bound analysis. Numerical results show that the proposed union-bound-optimized (UBO) quantizer can achieve better error rate performance than the state-of-the-art quantizers for STT-MRAM.
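To illustrate what a union bound on the word error rate looks like, here is a minimal sketch that swaps in a much simpler channel than the paper's: a binary symmetric channel (BSC) with crossover probability `p`, standing in for the quantized STT-MRAM channel. The code, the (7,4) Hamming weight enumerator, and all function names are assumptions for illustration; the paper's actual bound is derived for its own channel model.

```python
from math import comb


def pairwise_error_bsc(d, p):
    """Probability that ML decoding prefers a codeword at Hamming distance d.

    An error occurs when more than d/2 of the d differing bits flip;
    ties (exactly d/2 flips) are assumed to be broken at random.
    """
    total = 0.0
    for k in range(d + 1):
        prob = comb(d, k) * p**k * (1 - p) ** (d - k)
        if 2 * k > d:
            total += prob
        elif 2 * k == d:
            total += 0.5 * prob
    return total


def union_bound_wer(weight_enumerator, p):
    """Union bound: WER <= sum over weights d of A_d * P2(d)."""
    return sum(a_d * pairwise_error_bsc(d, p) for d, a_d in weight_enumerator.items())


# Nonzero weights of the (7,4) Hamming code: 7 words of weight 3, 7 of weight 4, 1 of weight 7.
HAMMING_7_4 = {3: 7, 4: 7, 7: 1}

if __name__ == "__main__":
    p = 0.01
    exact = 1 - (1 - p) ** 7 - 7 * p * (1 - p) ** 6  # perfect code: fails on 2+ bit flips
    print(union_bound_wer(HAMMING_7_4, p), exact)
```

Because the bound is a closed-form sum over the code's weight spectrum, it can be evaluated at error rates far too low to reach by Monte Carlo simulation, which is the practical advantage the abstract points to.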
Authors:Ehsanoddin Ghorbanichemazkati, Amro M. Farid
Abstract:
In the 20th century, individual technology products like the generator, telephone, and automobile were connected to form many of the large-scale, complex, infrastructure networks we know today: the power grid, the communication infrastructure, and the transportation system. Progressively, these networked systems began interacting, forming what is now known as systems-of-systems. Because the component systems in the system-of-systems differ, modeling and analysis techniques with primitives applicable across multiple domains or disciplines are needed. For example, linear graphs and bond graphs have been used extensively in the electrical engineering, mechanical engineering, and mechatronic fields to design and analyze a wide variety of engineering systems. In contrast, hetero-functional graph theory (HFGT) has emerged to study many complex engineering systems and systems-of-systems (e.g. electric power, potable water, wastewater, natural gas, oil, coal, multi-modal transportation, mass-customized production, and personalized healthcare delivery systems). This paper seeks to relate hetero-functional graphs to linear graphs and bond graphs and demonstrate that the former is a generalization of the latter two. The contribution is relayed in three stages. First, the three modeling techniques are compared conceptually. Next, these techniques are contrasted on six example systems: (a) an electrical system, (b) a translational mechanical system, (c) a rotational mechanical system, (d) a fluidic system, (e) a thermal system, and (f) a multi-energy (electro-mechanical) system. Finally, this paper proves mathematically that hetero-functional graphs are a formal generalization of both linear graphs and bond graphs.
Authors:Zixin Jiang, Weili Xu, Bing Dong
Abstract:
The urgent need for building decarbonization calls for a paradigm shift in future autonomous building energy operation, from human-intensive engineering workflows toward intelligent agents that interact with physics-grounded digital environments. This study proposes an end-to-end agentic AI-enabled Physics-Informed Machine Learning (PIML) environment for scalable building energy modeling, simulation, control, and automation. The framework consists of (1) a modular and physics-consistent PIML digital environment spanning building thermal dynamics, Heating, Ventilation, and Air Conditioning (HVAC), and distributed energy resources (DER) for grid-interactive energy management; and (2) an agentic AI layer with 11 specialist agents and 72 Model Context Protocol (MCP) tools that enable end-to-end execution of multi-step energy analytics. A representative case study demonstrates multi-domain, multi-agent coordination for assessing how system and control upgrades affect energy use, operating cost, thermal comfort, and flexibility. In addition, a large-scale benchmark (about 4000 runs) systematically evaluates workflow performance in terms of accuracy, token consumption, execution time, and inference cost. The results quantify the impacts of intelligence mode design, model size, task complexity, and orchestrator-specialist coordination, and provide key lessons for building future agentic AI systems in real-world building energy applications. This work establishes a scalable, physics-grounded foundation for deploying agentic AI in decarbonized and grid-interactive building operations.
Authors:Kai Zhu, Darong Huang, Luis Costero, David Atienza
Abstract:
The increasing power densities and intricate heat dissipation paths in advanced 2.5D/3D chiplet systems necessitate thermal modeling frameworks that deliver detailed thermal maps with high computational efficiency. Traditional compact thermal models (CTMs) often struggle to scale with the complexity and heterogeneity of modern architectures. This work introduces 3D-ICE 4.0, designed for heterogeneous chip-based systems. Key innovations include: (i) preservation of material heterogeneity and anisotropy directly from industrial layouts, integrated with OpenMP and SuperLU_MT-based parallel solvers for scalable performance, (ii) adaptive vertical layer partitioning to accurately model vertical heat conduction, and (iii) temperature-aware non-uniform grid generation. The results with different benchmarks demonstrate that 3D-ICE 4.0 achieves speedups ranging from 3.61x to 6.46x over state-of-the-art tools, while reducing grid complexity by more than 23.3% without compromising accuracy. Compared to the commercial software COMSOL, 3D-ICE 4.0 effectively captures both lateral and vertical heat flows, validating its precision and robustness. These advances demonstrate that 3D-ICE 4.0 is an efficient solution for thermal modeling in emerging heterogeneous 2.5D/3D integrated systems.
Authors:Madhav Vadlamani, Dyutimoy Chakraborty, Jianwei Jia, Halid Mulaosmanovic, Stefan Duenkel, Sven Beyer, Suman Datta, Shimeng Yu
Abstract:
Ferroelectric-based capacitive crossbar arrays have been proposed for energy-efficient in-memory computing in the charge domain. They combat challenges such as sneak paths and high static power faced by resistive crossbar arrays, but are susceptible to thermal noise, which limits the effective number of bits (ENOB) for the weighted sum. A direct way to reduce this thermal noise is to lower the temperature, as thermal noise is proportional to temperature. In this work, we first characterize the non-volatile capacitors (nvCaps) on a foundry 28 nm platform at cryogenic temperatures to evaluate the memory window and ON state retention as a function of temperature down to 77 K, and then use the calibrated device models to simulate the capacitive crossbar arrays in SPICE at lower temperatures to demonstrate higher ENOB (~5 bits) for 128x128 multiply-and-accumulate (MAC) operations.
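The temperature argument in this abstract can be made concrete with the textbook kT/C expression for sampled thermal noise on a capacitor. The sketch below is a back-of-the-envelope estimate only: the 1 fF capacitance and the 300 K reference temperature are assumed values not taken from the abstract, and it assumes the readout is limited purely by kT/C noise with temperature-independent device parameters. It does not reproduce the paper's SPICE results.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def ktc_noise_vrms(cap_farads, temp_kelvin):
    """RMS thermal noise voltage sampled on a capacitor: sqrt(k_B * T / C)."""
    return math.sqrt(K_B * temp_kelvin / cap_farads)


def enob_gain_bits(t_ref, t_cold):
    """ENOB gained by cooling when noise power scales linearly with temperature.

    Noise power drops by t_ref / t_cold, and each factor of 4 in noise power
    (factor of 2 in noise voltage) is worth one bit.
    """
    return 0.5 * math.log2(t_ref / t_cold)


if __name__ == "__main__":
    cap = 1e-15  # 1 fF, hypothetical value for illustration
    print(ktc_noise_vrms(cap, 300.0), ktc_noise_vrms(cap, 77.0))
    print(enob_gain_bits(300.0, 77.0))
```

Under these assumptions, cooling from 300 K to 77 K roughly halves the RMS noise voltage, which corresponds to a gain of just under one bit of ENOB from the thermal-noise term alone.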
Authors:Cristina Luna, Alba Guerra, Almudena Moreno, Manuel Esquer, Willy Roa, Mateusz Krawczak, Robert Popela, Piotr Osica, Davide Nicolis
Abstract:
Planetary exploration missions require robust locomotion systems capable of operating in extreme environments over extended periods. This paper presents the DISTANT (Distant Transmission and Steering Systems) design, a novel approach for relocating rover traction and steering actuators from wheel-mounted positions to a thermally protected warm box within the rover body. The design addresses critical challenges in long-distance traversal missions by protecting sensitive components from thermal cycling, dust contamination, and mechanical wear. A double wishbone suspension configuration with cardan joints and capstan drive steering has been selected as the optimal architecture following comprehensive trade-off analysis. The system enables independent wheel traction, steering control, and suspension management whilst maintaining all motorisation within the protected environment. The design meets a 50 km traverse requirement without performance degradation, with integrated dust protection mechanisms and thermal management solutions. Testing and validation activities are planned for Q1 2026 following breadboard manufacturing at 1:3 scale.
Authors:Zhengyi Liu, Xinrui Wang, Xianyong Fang, Zhengzheng Tu, Linbo Wang
Abstract:
RGB-T salient object detection (SOD) aims to segment attractive objects by combining RGB and thermal infrared images. To enhance performance, the Segment Anything Model has been fine-tuned for this task. However, the imbalanced convergence of the two modalities and the significant gradient difference between high- and low-activations are ignored, thereby leaving room for further performance enhancement. In this paper, we propose a model called SAMSOD, which utilizes unimodal supervision to enhance the learning of the non-dominant modality and employs gradient deconfliction to reduce the impact of conflicting gradients on model convergence. The method also leverages two decoupled adapters to separately mask high- and low-activation neurons, emphasizing foreground objects by enhancing background learning. Fundamental experiments on RGB-T SOD benchmark datasets and generalizability experiments on scribble-supervised RGB-T SOD, fully supervised RGB-D SOD datasets, and fully supervised RGB-D rail surface defect detection all demonstrate the effectiveness of our proposed method.
Authors:Julie Rousseau, Hanmin Cai, Philipp Heer, Kristina Orehounig, Gabriela Hug
Abstract:
Buildings represent a promising flexibility source to support the integration of renewable energy sources, as they may shift their heating energy consumption over time without impacting users' comfort. However, a building's predicted flexibility potential is based on uncertain ambient weather forecasts and a typically inaccurate building thermal model. Hence, this paper presents an uncertainty-aware flexibility quantifier using a chance-constrained formulation. Because such a quantifier may be conservative, we additionally model real-time feedback in the quantification, in the form of affine feedback policies. Such adaptation can take the form of intra-day trades or rebound around the flexibility provision period. To assess the flexibility quantification formulations, we further assume that flexible buildings participate in secondary frequency control markets. The results show some increase in flexibility and revenues when introducing affine feedback policies. Additionally, it is demonstrated that accounting for uncertainties in the flexibility quantification is necessary, especially when intra-day trades are not available. Even though an uncertainty-ignorant potential may seem financially profitable in secondary frequency control markets, it comes at the cost of significant thermal discomfort for inhabitants. Hence, we suggest a comfort-preserving approach, aiming to truly reflect thermal discomfort on the economic flexibility revenue, to obtain a fairer comparison.
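The chance-constrained formulation mentioned in this abstract can be illustrated with the simplest possible case: a single comfort constraint under Gaussian forecast error. The Gaussian assumption, the numerical values, and the function name below are assumptions for illustration; the paper's formulation, including its affine feedback policies, is richer than this sketch.

```python
from statistics import NormalDist


def tightened_lower_bound(t_min, sigma, eps):
    """Deterministic equivalent of P(T >= t_min) >= 1 - eps for T ~ N(mu, sigma^2).

    The constraint holds exactly when mu >= t_min + z * sigma, where z is the
    (1 - eps) quantile of the standard normal distribution.
    """
    z = NormalDist().inv_cdf(1.0 - eps)
    return t_min + z * sigma


if __name__ == "__main__":
    # Hypothetical numbers: 20 degC comfort limit, 0.5 degC prediction std, 5% violation risk.
    bound = tightened_lower_bound(t_min=20.0, sigma=0.5, eps=0.05)
    print(round(bound, 3))
    # Sanity check: a prediction sitting exactly on the bound violates comfort with probability eps.
    print(NormalDist(mu=bound, sigma=0.5).cdf(20.0))
```

The safety margin `z * sigma` is what makes an uncertainty-aware quantifier conservative: the larger the forecast uncertainty or the stricter the risk level, the less of the comfort band remains available as flexibility.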
Authors:Jianren Wang, Jie Han, Abhinav Gupta, Deepak Pathak, Yang Zhang
Abstract:
Quasi-direct-drive (QDD) actuation is transforming legged and manipulator robots by eliminating high-ratio gearboxes, yet it demands motors that deliver very high torque at low speed within a thin, disc-shaped joint envelope. Axial-flux permanent-magnet (AFPM) machines meet these geometric and torque requirements, but scaling them below a 20 mm outer diameter is hampered by poor copper fill in conventional wound stators, inflating resistance and throttling continuous torque. This paper introduces a micro-scale AFPM motor that overcomes these limitations through printed-circuit-board (PCB) windings fabricated with advanced IC-substrate high-density interconnect (HDI) technology. The resulting 48-layer stator, formed by stacking four 12-layer HDI modules, achieves a record 45% copper fill in a package only 5 mm thick and 19 mm in diameter. We perform comprehensive electromagnetic and thermal analyses to inform the motor design, then fabricate a prototype whose performance characteristics are experimentally verified.
Authors:Sriram Narayanan, Mani Ramanagopal, Srinivasa G. Narasimhan
Abstract:
Long-wave infrared radiation captured by a thermal camera consists of two components: (a) light from the environment reflected or transmitted by a surface, and (b) light emitted by the surface after undergoing heat transport through the object and exchanging heat with the surrounding environment. Separating these components is essential for understanding object properties such as emissivity, temperature, reflectance and shape. Previous thermography studies often assume that only one component is dominant (e.g., in welding) or that the second component is constant and can be subtracted. However, in near-ambient conditions, which are most relevant to computer vision applications, both components are typically comparable in magnitude and vary over time. We introduce the first method that separates reflected and emitted components of light in videos captured by two thermal cameras with different spectral sensitivities. We derive a dual-band thermal image formation model and develop algorithms to estimate the surface's emissivity and its time-varying temperature while isolating a dynamic background. We quantitatively evaluate our approach using carefully calibrated emissivities for a range of materials and show qualitative results on complex everyday scenes, such as a glass filled with hot liquid and people moving in the background.
Authors:Zeqing Leo Yuan, Mani Ramanagopal, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan
Abstract:
Decomposing an image into its intrinsic photometric factors, shading and reflectance, is a long-standing challenge due to the lack of extensive ground-truth data for real-world scenes. Recent methods rely on synthetic data or sparse annotations for limited indoor and even fewer outdoor scenes. We introduce a novel training-free approach for intrinsic image decomposition using only a pair of visible and thermal images. We leverage the principle that light not reflected from an opaque surface is absorbed and detected as heat by a thermal camera. This allows us to relate the ordinalities between visible and thermal image intensities to the ordinalities of shading and reflectance, which can densely self-supervise an optimizing neural network to recover shading and reflectance. We perform quantitative evaluations with known reflectance and shading under natural and artificial lighting, and qualitative experiments across diverse outdoor scenes. The results demonstrate superior performance over recent learning-based models and point toward a scalable path to curating real-world ordinal supervision, previously infeasible via manual labeling.
Authors:Brooks Kinch, Benjamin Shaffer, Elizabeth Armstrong, Michael Meehan, John Hewson, Nathaniel Trask
Abstract:
We present a framework for constructing real-time digital twins based on structure-preserving reduced finite element models conditioned on a latent variable Z. The approach uses conditional attention mechanisms to learn both a reduced finite element basis and a nonlinear conservation law within the framework of finite element exterior calculus (FEEC). This guarantees numerical well-posedness and exact preservation of conserved quantities, regardless of data sparsity or optimization error. The conditioning mechanism supports real-time calibration to parametric variables, allowing the construction of digital twins which support closed loop inference and calibration to sensor data. The framework interfaces with conventional finite element machinery in a non-invasive manner, allowing treatment of complex geometries and integration of learned models with conventional finite element techniques.
Benchmarks include advection diffusion, shock hydrodynamics, electrostatics, and a complex battery thermal runaway problem. The method achieves accurate predictions on complex geometries with sparse data (25 LES simulations), including capturing the transition to turbulence and achieving real-time inference (~0.1 s) with a speedup of 3.1x10^8 relative to LES. An open-source implementation is available on GitHub.
Authors:David Atienza, Kai Zhu, Darong Huang, Luis Costero
Abstract:
As processor performance advances, increasing power densities and complex thermal behaviors threaten both energy efficiency and system reliability. This survey covers more than two decades of research on power and thermal modeling and management in modern processors. We start by comparing analytical, regression-based, and neural network-based techniques for power estimation, then review thermal modeling methods, including finite element, finite difference, and data-driven approaches. Next, we categorize dynamic runtime management strategies that balance performance, power consumption, and reliability. Finally, we conclude with a discussion of emerging challenges and promising research directions.
Authors:Bin Xie, Congxuan Zhang, Fagan Wang, Peng Liu, Feng Lu, Zhen Chen, Weiming Hu
Abstract:
The widespread application of Unmanned Aerial Vehicles (UAVs) has raised serious public safety and privacy concerns, making UAV perception crucial for anti-UAV tasks. However, existing UAV tracking datasets predominantly feature conspicuous objects and lack diversity in scene complexity and attribute representation, limiting their applicability to real-world scenarios. To overcome these limitations, we present CST Anti-UAV, a new thermal infrared dataset specifically designed for Single Object Tracking (SOT) in Complex Scenes with Tiny UAVs (CST). It contains 220 video sequences with over 240k high-quality bounding box annotations, highlighting two key properties: a significant number of tiny-sized UAV targets and diverse, complex scenes. To the best of our knowledge, CST Anti-UAV is the first dataset to incorporate complete manual frame-level attribute annotations, enabling precise evaluations under varied challenges. To conduct an in-depth performance analysis for CST Anti-UAV, we evaluate 20 existing SOT methods on the proposed dataset. Experimental results demonstrate that tracking tiny UAVs in complex environments remains a challenge, as the state-of-the-art method achieves only 35.92% state accuracy, much lower than the 67.69% observed on the Anti-UAV410 dataset. These findings underscore the limitations of existing benchmarks and the need for further advancements in UAV tracking research. The CST Anti-UAV benchmark will be publicly released, which will not only foster the development of more robust SOT methods but also drive innovation in anti-UAV systems.
Authors:Mehdi Elahi, Mohamed R. Elshamy, Abdel-Hameed Badawy, Ahmad Patooghy
Abstract:
Thermal Trojan attacks present a pressing concern for the security and reliability of System-on-Chips (SoCs), especially in mobile applications. The situation becomes more complicated when such attacks are more evasive and operate sporadically to stay hidden from detection mechanisms. In this paper, we introduce Intermittent Thermal Trojans (iThermTroj) that exploit the chips' thermal information in a random time-triggered manner. According to our experiments, iThermTroj attacks can easily bypass available threshold-based thermal Trojan detection solutions. We investigate SoC vulnerabilities to variations of iThermTroj through an in-depth analysis of Trojan activation and duration scenarios. We also propose a set of tiny Machine Learning classifiers for run-time anomaly detection to protect SoCs against such intermittent thermal Trojan attacks. Compared to existing methods, our approach improves the attack detection rate by 29.4%, 17.2%, and 14.3% in scenarios where iThermTroj manipulates up to 80%, 60%, and 40% of the SoC's thermal data, respectively. Additionally, our method increases the full protection resolution to 0.8 degrees Celsius, meaning that any temperature manipulations exceeding $\pm 0.8$ degrees will be detected with 100% accuracy.
Authors:Zeinab Salehi, Yijun Chen, Ian R. Petersen, Guodong Shi, Duncan S. Callaway, Elizabeth L. Ratnam
Abstract:
The recent widespread adoption of rooftop solar backed by battery storage is enabling energy customers to both produce and consume electricity (i.e., prosumers of electricity). To facilitate prosumer participation in the electric grid, new market mechanisms are required. In this paper, we design peer-to-peer energy markets where prosumers trade their excess energy with peers to gain profit while satisfying the overall balance in electricity supply and demand. We first consider a market structure for the case where voltage and/or thermal constraints are binding. When such grid constraints are binding, market clearing prices can vary across locations. However, regulators may consider heterogeneous prices to lack fairness. To ensure uniform pricing, we design two peer-to-peer energy markets with dynamic operating envelopes (DOEs). DOEs enable us to decompose global voltage and thermal constraints across the power grid into local constraints for each prosumer, resulting in uniform prices across the grid. By means of numerical simulations on an IEEE 13-node feeder, we benchmark the proposed market-based approaches in the presence of binding voltage constraints.
Authors:Tingting Liu, Yuan Liu, Jinhui Tang, Liyin Yuan, Chengyu Liu, Chunlai Li, Xiubao Sui, Qian Chen
Abstract:
Thermal infrared (TIR) images, acquired through thermal radiation imaging, are unaffected by variations in lighting conditions and atmospheric haze. However, TIR images inherently lack color and texture information, limiting downstream tasks and potentially causing visual fatigue. Existing colorization methods primarily rely on single-band images with limited spectral information and insufficient feature extraction capabilities, which often result in image distortion and semantic ambiguity. In contrast, multiband infrared imagery provides richer spectral data, facilitating the preservation of finer details and enhancing semantic accuracy. In this paper, we propose a generative adversarial network (GAN)-based framework designed to integrate spectral information to enhance the colorization of infrared images. The framework employs a multi-stage spectral self-attention Transformer network (MTSIC) as the generator. Each spectral feature is treated as a token for self-attention computation, and a multi-head self-attention mechanism forms a spatial-spectral attention residual block (SARB), achieving multi-band feature mapping and reducing semantic confusion. Multiple SARB units are integrated into a Transformer-based single-stage network (STformer), which uses a U-shaped architecture to extract contextual information, combined with multi-scale wavelet blocks (MSWB) to align semantic information in the spatial-frequency dual domain. Multiple STformer modules are cascaded to form MTSIC, progressively optimizing the reconstruction quality. Experimental results demonstrate that the proposed method significantly outperforms traditional techniques and effectively enhances the visual quality of infrared images.
Authors:Ahmed Aboudonia, Johannes Estermann, Keith Moffat, Manfred Morari, John Lygeros
Abstract:
We aim to improve the energy efficiency of train climate control architectures, with a focus on a specific class of regional trains operating throughout Switzerland, especially in Zurich and Geneva. Heating, Ventilation, and Air Conditioning (HVAC) systems represent the second largest energy consumer in these trains after traction. The current architecture comprises a high-level rule-based controller and a low-level tracking controller. To improve train energy efficiency, we propose adding a middle data-driven predictive control layer aimed at minimizing HVAC energy consumption while maintaining passenger comfort. The scheme incorporates a multistep prediction model developed using real-world data collected from a limited number of train coaches. To validate the effectiveness of the proposed architecture, we conduct multiple experiments on a separate set of train coaches; our results suggest energy savings between 10% and 35% with respect to the current architecture.
Authors:Marcella Astrid, Abdelrahman Shabayek, Djamila Aouada
Abstract:
Batteries are essential for various applications, including electric vehicles and renewable energy storage, making safety and efficiency critical concerns. Anomaly detection in battery thermal images helps identify failures early, but traditional deep learning methods require extensive labeled data, which is difficult to obtain, especially for anomalies, due to safety risks and high data collection costs. To overcome this, we explore zero-shot anomaly detection using Visual Question Answering (VQA) models, which leverage pretrained knowledge and text-based prompts to generalize across vision tasks. By incorporating prior knowledge of normal battery thermal behavior, we design prompts to detect anomalies without battery-specific training data. We evaluate three VQA models (ChatGPT-4o, LLaVa-13b, and BLIP-2), analyzing their robustness to prompt variations, repeated trials, and qualitative outputs. Despite the lack of fine-tuning on battery data, our approach demonstrates competitive performance compared to state-of-the-art models that are trained on battery data. Our findings highlight the potential of VQA-based zero-shot learning for battery anomaly detection and suggest future directions for improving its effectiveness.
Authors:Simon Baeuerle, Ian F. Mendonca, Kristof Van Laerhoven, Ralf Mikut, Andreas Steimer
Abstract:
Coverage Path Planning of Thermal Interface Materials (TIM) plays a crucial role in the design of power electronics and electronic control units. Until now, this has been done manually by experts or by using optimization approaches with a high computational effort. We propose a novel AI-based approach to generate dispense paths for TIM and similar dispensing applications. It is a drop-in replacement for optimization-based approaches. An Artificial Neural Network (ANN) receives the target cooling area as input and directly outputs the dispense path. Our proposed setup does not require labels, and we show its feasibility on multiple target areas. The resulting dispense paths can be directly transferred to automated manufacturing equipment and do not exhibit air entrapments. The approach of using an ANN to predict process parameters for a desired target state in real-time could potentially be transferred to other manufacturing processes.
Authors:Shijun Liao, Shijie Qin
Abstract:
Using clean numerical simulation (CNS) in which artificial numerical noise is negligible over a finite, sufficiently long interval of time, we provide evidence, for the first time, that artificial numerical noise in direct numerical simulation (DNS) of turbulence is approximately equivalent to thermal fluctuation and/or stochastic environmental noise. This confers physical significance on the artificial numerical noise of DNS of the Navier-Stokes equations. As a result, DNS on a fine mesh should correspond to turbulence under small internal/external physical disturbance, whereas DNS on a sparse mesh corresponds to turbulent flow under large physical disturbance. The key point is that all of them have physical meanings and so are correct in terms of their deterministic physics, even if their statistics are quite different. This is illustrated herein. Our paper provides a positive viewpoint regarding the presence of artificial numerical noise in DNS.
Authors:Qinqin Zhang, Xiaoyu Liang, Ning Xu, Yu Chen
Abstract:
With the advent of the post-Moore era, the 2.5D advanced package is a promising solution to sustain the development of very large-scale integrated circuits. However, the thermal placement of chiplets is very challenging due to the high complexity of thermal simulation. In this paper, a surrogate-assisted simulated annealing algorithm is proposed to simultaneously minimize both the wirelength and the maximum temperature of integrated chips. To alleviate the computational cost of thermal simulation, a radial basis function network is introduced to approximate the thermal field, with the help of which the simulated annealing algorithm converges to a better placement in less time. Numerical results demonstrate that the surrogate-assisted simulated annealing algorithm is competitive with the state-of-the-art thermal placement algorithms for chiplets, suggesting its potential application in the agile design of 2.5D packaged chips.
Authors:Stefano Riva, Andrea Missaglia, Carolina Introini, In Cheol Bang, Antonio Cammi
Abstract:
In recent years, algorithms aiming at learning models from available data have become quite popular due to two factors: 1) the significant developments in Artificial Intelligence techniques and 2) the availability of large amounts of data. Nevertheless, this topic has already been addressed by methodologies belonging to the Reduced Order Modelling framework, of which perhaps the most famous equation-free technique is Dynamic Mode Decomposition. This algorithm aims to learn the best linear model that represents the physical phenomena described by a time series dataset: its output is a best state operator of the underlying dynamical system that can be used, in principle, to advance the original dataset in time even beyond its span. However, in its standard formulation, this technique cannot deal with parametric time series, meaning that a different linear model has to be derived for each parameter realization. Research on this is ongoing, and some versions of a parametric Dynamic Mode Decomposition already exist. This work contributes to this research field by comparing the different algorithms presently deployed and assessing their advantages and shortcomings compared to each other. To this aim, three different thermal-hydraulics problems are considered: two benchmark 'flow over cylinder' test cases at diverse Reynolds numbers, whose datasets are, respectively, obtained with the FEniCS finite element solver and retrieved from the CFDbench dataset, and the DYNASTY experimental facility operating at Politecnico di Milano, which studies the natural circulation established by internally heated fluids for Generation IV nuclear applications, whose dataset was generated using the RELAP5 nodal solver.
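As a rough illustration of the "best linear model" idea mentioned above, the following minimal sketch fits a linear operator A to a time series by least squares so that x_{k+1} ≈ A x_k. This is only the core fitting step of Dynamic Mode Decomposition for a two-dimensional state, without the SVD-based rank truncation used in practice and without any parametric extension. The helper names and the synthetic rotation-and-decay data are assumptions made for illustration; they are not taken from the paper.

```python
# Minimal sketch (illustrative assumptions only): least-squares fit of a linear
# operator A with x_{k+1} ~ A x_k, the core step of Dynamic Mode Decomposition.
# Restricted to 2-D states so that a closed-form 2x2 inverse suffices.

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def fit_linear_operator(snapshots):
    """snapshots: time-ordered list of 2-element state vectors."""
    x = transpose(snapshots[:-1])   # states at steps 0..m-2, shape 2 x (m-1)
    y = transpose(snapshots[1:])    # states at steps 1..m-1, shape 2 x (m-1)
    xt = transpose(x)
    # Normal-equation solution A = Y X^T (X X^T)^{-1}
    return matmul(matmul(y, xt), inv2(matmul(x, xt)))

def advance(a, state):
    """Apply the fitted operator once to step the state forward in time."""
    return [sum(c * s for c, s in zip(row, state)) for row in a]

# Synthetic data from a known (hypothetical) linear system: decay plus rotation.
A_TRUE = [[0.9, -0.2], [0.2, 0.9]]
snapshots = [[1.0, 0.0]]
for _ in range(10):
    snapshots.append(advance(A_TRUE, snapshots[-1]))

A_fit = fit_linear_operator(snapshots)
next_state = advance(A_fit, snapshots[-1])  # one step beyond the data span
```

On noise-free data generated by a linear system, the fitted operator should match the generating one up to round-off, and `advance` can then be applied repeatedly to extrapolate past the last snapshot.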
Authors:Jie Tian, Martin Taylor Sobczak, Dhanush Patil, Jixin Hou, Lin Pang, Arunachalam Ramanathan, Libin Yang, Xianyan Chen, Yuval Golan, Xiaoming Zhai, Hongyue Sun, Kenan Song, Xianqiao Wang
Abstract:
Metamaterials, renowned for their exceptional mechanical, electromagnetic, and thermal properties, hold transformative potential across diverse applications, yet their design remains constrained by labor-intensive trial-and-error methods and limited data interoperability. Here, we introduce CrossMatAgent -- a novel multi-agent framework that synergistically integrates large language models with state-of-the-art generative AI to revolutionize metamaterial design. By orchestrating a hierarchical team of agents -- each specializing in tasks such as pattern analysis, architectural synthesis, prompt engineering, and supervisory feedback -- our system leverages the multimodal reasoning of GPT-4o alongside the generative precision of DALL-E 3 and a fine-tuned Stable Diffusion XL model. This integrated approach automates data augmentation, enhances design fidelity, and produces simulation- and 3D printing-ready metamaterial patterns. Comprehensive evaluations, including CLIP-based alignment, SHAP interpretability analyses, and mechanical simulations under varied load conditions, demonstrate the framework's ability to generate diverse, reproducible, and application-ready designs. CrossMatAgent thus establishes a scalable, AI-driven paradigm that bridges the gap between conceptual innovation and practical realization, paving the way for accelerated metamaterial development.
Authors:Stefano Riva, Carolina Introini, J. Nathan Kutz, Antonio Cammi
Abstract:
The recent developments in data-driven methods have paved the way to new methodologies to provide accurate state reconstruction of engineering systems; nuclear reactors represent particularly challenging applications for this task due to the complexity of the strongly coupled physics involved and the extremely harsh and hostile environments, especially for new technologies such as Generation-IV reactors. Data-driven techniques can combine different sources of information, including computational proxy models and local noisy measurements on the system, to robustly estimate the state. This work leverages the novel Shallow Recurrent Decoder architecture to infer the entire state vector (including neutron fluxes, precursor concentrations, temperature, pressure and velocity) of a reactor from three out-of-core time-series neutron flux measurements alone. In particular, this work extends the standard architecture to treat parametric time-series data, ensuring the possibility of investigating different accidental scenarios and showing the capabilities of this approach to provide an accurate state estimation in various operating conditions. This paper considers as a test case the Molten Salt Fast Reactor (MSFR), a Generation-IV reactor concept, characterised by strong coupling between the neutronics and the thermal hydraulics due to the liquid nature of the fuel. The promising results of this work are further strengthened by the possibility of quantifying the uncertainty associated with the state estimation, due to the considerably low training cost. The accurate reconstruction of every characteristic field in real-time makes this approach suitable for monitoring and control purposes in the framework of a reactor digital twin.
Authors:Clayton Miller, Yun Xuan Chua, Matias Quintana, Binyu Lei, Filip Biljecki, Mario Frei
Abstract:
Humans can play a more active role in improving their comfort in the built environment if given the right information at the right place and time. This paper outlines the use of Just-in-Time Adaptive Interventions (JITAI) implemented in the context of the built environment to provide information that helps humans minimize the impact of heat and noise on their daily lives. This framework is based on the open-source Cozie iOS smartwatch platform. It includes data collection through micro-surveys and intervention messages triggered by environmental, contextual, and personal history conditions. An eight-month deployment of the method was completed in Singapore with 103 participants who submitted more than 12,000 micro-surveys and had more than 3,600 JITAI intervention messages delivered to them. A weekly survey conducted during two deployment phases revealed an overall increase in perceived usefulness ranging from 8-19% over the first three weeks of data collection. For noise-related interventions, participants showed an overall increase in location changes ranging from 4-11% and a 2-17% increase in earphone use to mitigate noise distractions. For thermal comfort-related interventions, participants demonstrated a 3-13% increase in adjustments to their location or thermostat to feel more comfortable. The analysis found evidence that personality traits (such as conscientiousness), gender, and environmental preferences could be factors in determining the perceived helpfulness of JITAIs and influencing behavior change. These findings underscore the importance of tailoring intervention strategies to individual traits and environmental conditions, setting the stage for future research to refine the delivery, timing, and content of intervention messages.
Authors:Haotian Ji, Dong Wu, Chi Zhang, Xiangyu Hu
Abstract:
Maintaining a comfortable temperature inside a building requires appropriate thermal insulation of windows, which can be optimised iteratively with numerical simulation. Smoothed particle hydrodynamics (SPH) is a fully Lagrangian method widely used for simulating multi-physics applications with high computational efficiency and accuracy. It is advantageous in physically coupled problems such as heat-fluid-solid or any other type of physically coupled simulations. The focus of this study is to simulate the heat transfer process in various window frames under convective boundary conditions according to ISO 10077-2:2012. This paper demonstrates the accuracy and compatibility of SPH when dealing with heat transfer problems, which ensures further development of thermal coupling with other physical fields. The results and methods used in this paper provide some guidance on how to properly handle heat transfer simulations using SPH, which can be extended to multi-physics coupled simulations in the future.
Authors:Sheng Zhang, Yunhao Fan, Kotaro Shimizu, Gia-Wei Chern
Abstract:
We review the recent development of machine-learning (ML) force-field frameworks for Landau-Lifshitz-Gilbert (LLG) dynamics simulations of itinerant electron magnets, focusing on the general theory and implementations of symmetry-invariant representations of spin configurations. The crucial properties that such magnetic descriptors must satisfy are differentiability with respect to spin rotations and invariance to both lattice point-group symmetry and internal spin rotation symmetry. We propose an efficient implementation based on the concept of reference irreducible representations, modified from the group-theoretical power-spectrum and bispectrum methods. The ML framework is demonstrated using the s-d models, which are widely applied in spintronics research. We show that LLG simulations based on local fields predicted by the trained ML models successfully reproduce representative non-collinear spin structures, including 120$^\circ$, tetrahedral, and skyrmion crystal orders of the triangular-lattice s-d models. Large-scale thermal quench simulations enabled by ML models further reveal intriguing freezing dynamics and glassy stripe states consisting of skyrmions and bi-merons. Our work highlights the utility of the ML force-field approach to dynamical modeling of complex spin orders in itinerant electron magnets.
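To make the role of the local field in an LLG simulation concrete, here is a minimal single-spin sketch. It integrates the Landau-Lifshitz form dS/dt = -gamma S x H - alpha gamma S x (S x H) with explicit Euler steps and renormalizes the spin after each step. In the reviewed framework the local field would come from a trained ML model; here the field is a fixed vector chosen by hand, and the values of gamma, alpha, and the time step are arbitrary assumptions for illustration.

```python
# Minimal sketch (illustrative assumptions only): damped precession of a single
# unit spin in a fixed local field H. In an ML force-field setting, H would be
# predicted by the model from the neighbouring spin configuration.
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def llg_step(spin, field, dt, gamma=1.0, alpha=0.1):
    """One explicit Euler step of dS/dt = -gamma S x H - alpha*gamma S x (S x H)."""
    precession = cross(spin, field)        # S x H
    damping = cross(spin, precession)      # S x (S x H)
    new = [s - dt * gamma * (p + alpha * d)
           for s, p, d in zip(spin, precession, damping)]
    norm = math.sqrt(sum(c * c for c in new))
    return tuple(c / norm for c in new)    # keep |S| = 1

def relax(spin, field, steps=10000, dt=0.01):
    for _ in range(steps):
        spin = llg_step(spin, field, dt)
    return spin

# Start perpendicular to the field; damping should align the spin with H.
final_spin = relax((1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

With damping switched on, the spin spirals toward the field direction, which is the relaxation behaviour a lattice-scale simulation relies on at every site.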
Authors:Mehdi Elahi, Mohamed R. Elshamy, Abdel-Hameed Badawy, Mahdi Fazeli, Ahmad Patooghy
Abstract:
As mobile systems become more advanced, the security of System-on-Chips (SoCs) is increasingly threatened by thermal attacks. This research introduces a new attack method called the Multi-stage Adaptive Thermal Trojan for Efficiency and Resilience Degradation (MATTER). MATTER takes advantage of weaknesses in Dynamic Thermal Management (DTM) systems by manipulating temperature sensor interfaces, which leads to incorrect thermal sensing and disrupts the SoC's ability to manage heat effectively. Our experiments show that this attack can degrade DTM performance by as much as 73%, highlighting serious vulnerabilities in modern mobile devices. By exploiting the trust placed in temperature sensors, MATTER causes DTM systems to make poor decisions, i.e., failing to activate cooling when needed. This not only affects how well the system works but also threatens the lifespan of the hardware. This paper provides a thorough analysis of how MATTER works and emphasizes the need for stronger thermal management systems in SoCs.
Authors:Zixin Huang, Mark M. Wilde
Abstract:
Bosonic Gaussian thermal states form a fundamental class of states in quantum information science. This paper explores the information geometry of these states, focusing on characterizing the distance between two nearby states and the geometry induced by a parameterization in terms of their mean vectors and Hamiltonian matrices. In particular, for the family of bosonic Gaussian thermal states, we derive expressions for their Fisher-Bures and Kubo-Mori information matrices with respect to their mean vectors and Hamiltonian matrices. An important application of our formulas consists of fundamental limits on how well one can estimate these parameters. We additionally establish formulas for the derivatives and the symmetric logarithmic derivatives of bosonic Gaussian thermal states. The former could have applications in gradient descent algorithms for quantum machine learning when using bosonic Gaussian thermal states as an ansatz, and the latter in formulating optimal strategies for single parameter estimation of bosonic Gaussian thermal states. Finally, the expressions for the aforementioned information matrices could have additional applications in natural gradient descent algorithms when using bosonic Gaussian thermal states as an ansatz.
Authors:Taeheon Kim, Sangyun Chung, Youngjoon Yu, Yong Man Ro
Abstract:
Multispectral pedestrian detection is a crucial component in various critical applications. However, a significant challenge arises from the misalignment between the RGB and thermal modalities, particularly under real-world conditions where data often appear heavily misaligned. Conventional methods developed on well-aligned or minimally misaligned datasets fail to address these discrepancies adequately. This paper introduces a new framework for multispectral pedestrian detection designed specifically to handle heavily misaligned datasets without the need for costly and complex traditional pre-processing calibration. By leveraging Large-scale Vision-Language Models (LVLM) for cross-modal semantic alignment, our approach seeks to enhance detection accuracy by aligning semantic information across the RGB and thermal domains. This method not only simplifies the operational requirements but also extends the usability of multispectral detection technologies in practical applications.
Authors:Shijun Liao, Shijie Qin
Abstract:
Randomness is one of the most important characteristics of turbulence, but its origin remains an open question. By means of a "thought experiment" via several clean numerical experiments based on the Navier-Stokes equations for two-dimensional turbulent Kolmogorov flow, we reveal a new phenomenon, which we call the "noise-expansion cascade", whereby all micro-level noises/disturbances at different orders of magnitude in the initial condition of the Navier-Stokes equations enlarge consistently, say, one by one like an inverse cascade, to the macro-level. More importantly, each noise/disturbance input may greatly change the macro-level characteristics and statistics of the resulting turbulence, clearly indicating that micro-level noise/disturbance might have great influence on macro-level characteristics and statistics of turbulence. Besides, the noise-expansion cascade closely connects the randomness of micro-level noise/disturbance and the macro-level disorder of turbulence, thus revealing an origin of the randomness of turbulence. This also strongly suggests that unavoidable thermal fluctuations must be considered when simulating turbulence, even if such fluctuations are several orders of magnitude smaller than other external environmental disturbances. Hopefully, the "noise-expansion cascade" as a fundamental property of the NS equations could greatly deepen our understanding of turbulence, and may also help in attacking the fourth millennium problem posed by the Clay Mathematics Institute in 2000.
Authors:Stefano Riva, Carolina Introini, Antonio Cammi, J. Nathan Kutz
Abstract:
Reliable, real-time state estimation in nuclear reactors is of critical importance for monitoring, control and safety. It further empowers the development of digital twins that are sufficiently accurate for real-world deployment. As nuclear engineering systems are typically characterised by extreme environments, their in-core sensing is a challenging task, even more so in Generation-IV reactor concepts, which feature molten salt or liquid metals as thermal carriers. The emergence of data-driven methods allows for new techniques for accurate and robust estimation of the full state space vector characterising the reactor (mainly composed of neutron fluxes and the thermal-hydraulics fields). These techniques can combine different sources of information, including computational proxy models and local noisy measurements on the system, in order to robustly estimate the state. This work leverages the Shallow Recurrent Decoder (SHRED) architecture to estimate the entire state vector of a reactor from three out-of-core time-series neutron flux measurements alone. Specifically, the Molten Salt Fast Reactor, in the EVOL geometry (Evaluation and Viability of Liquid Fuel Fast Reactor System project), is demonstrated as a test case, with neutron flux measurements alone allowing for reconstruction of the 20 coupled field variables of the dynamics. This approach can further quantify the uncertainty associated with the state estimation due to its considerably low training cost on compressed data. The accurate reconstruction of every characteristic field in real-time makes this approach suitable for monitoring and control purposes in the framework of a reactor digital twin.
Authors:Niraj Aryal, Sheng Zhang, Weiguo Yin, Gia-Wei Chern
Abstract:
We present a machine learning (ML) method for efficient computation of vibrational thermal expectation values of physical properties from first principles. Our approach is based on the non-perturbative frozen phonon formulation, in which a stochastic Monte Carlo algorithm is employed to sample configurations of nuclei in a supercell at finite temperatures based on a first-principles phonon model. A deep-learning neural network is trained to accurately predict physical properties associated with sampled phonon configurations, thus bypassing the time-consuming ab initio calculations. To incorporate the point-group symmetry of the electronic system into the ML model, group-theoretical methods are used to develop a symmetry-invariant descriptor for phonon configurations in the supercell. We apply our ML approach to compute the temperature-dependent electronic energy gap of silicon based on density functional theory (DFT). We show that, with fewer than a hundred DFT calculations for training the neural network model, an order of magnitude more samples can be obtained for the computation of the vibrational thermal expectation values. Our work highlights the promising potential of ML techniques for finite-temperature first-principles electronic structure methods.
Authors:Mohammad Pivezhandi, Mahdi Banisharif, Abusayeed Saifullah, Ali Jannesari
Abstract:
Dynamic voltage and frequency scaling (DVFS) and task-to-core allocation are critical for thermal management and balancing energy and performance in embedded systems. Existing approaches either rely on utilization-based heuristics that overlook stall times, or require extensive offline profiling for table generation, preventing runtime adaptation. We propose a model-based hierarchical multi-agent reinforcement learning (MARL) framework for thermal- and energy-aware scheduling on multi-core platforms. Two collaborative agents decompose the exponential action space, achieving 358ms latency for subsequent decisions. First decisions require 3.5 to 8.0s including one-time LLM feature extraction. An accurate environment model leverages regression techniques to predict thermal dynamics and performance states. When combined with LLM-extracted semantic features, the environment model enables zero-shot deployment for new workloads on trained platforms by generating synthetic training data without requiring workload-specific profiling samples. We introduce LLM-based semantic feature extraction that characterizes OpenMP programs through 13 code-level features without execution. The Dyna-Q-inspired framework integrates direct reinforcement learning with model-based planning, achieving 20x faster convergence than model-free methods. Experiments on BOTS and PolybenchC benchmarks across NVIDIA Jetson TX2, Jetson Orin NX, RubikPi, and Intel Core i7 demonstrate 7.09x better energy efficiency and 4.0x better makespan than Linux ondemand governor. First-decision latency is 8,300x faster than table-based profiling, enabling practical deployment in dynamic embedded systems.
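The Dyna-Q idea referenced above (direct reinforcement learning interleaved with planning on a learned model) can be sketched on a toy problem. The following is a generic tabular Dyna-Q on a hypothetical five-state chain, not the hierarchical multi-agent scheduler described in the abstract: there are no DVFS actions, thermal models, or LLM features here, and every name and hyperparameter is an assumption chosen for illustration.

```python
# Minimal sketch (illustrative assumptions only): tabular Dyna-Q on a toy chain.
# The agent starts in state 0 and receives reward 1 on reaching the last state.
import random

def dyna_q(n_states=5, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = (0, 1)                      # 0: move left, 1: move right
    goal = n_states - 1
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    model = {}                            # learned model: (s, a) -> (reward, next state)

    def env_step(s, a):
        s2 = max(0, s - 1) if a == 0 else min(goal, s + 1)
        return (1.0 if s2 == goal else 0.0), s2

    def q_update(s, a, r, s2):
        best_next = 0.0 if s2 == goal else max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:                         # greedy with random tie-breaking
                a = max(actions, key=lambda b: (q[(s, b)], rng.random()))
            r, s2 = env_step(s, a)
            q_update(s, a, r, s2)         # direct RL from real experience
            model[(s, a)] = (r, s2)       # model learning
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                q_update(ps, pa, pr, ps2) # planning from simulated experience
            s = s2
    return q

q_values = dyna_q()
greedy_policy = [max((0, 1), key=lambda a: q_values[(s, a)]) for s in range(4)]
```

Each real transition is reused many times through the stored model, which is the mechanism behind the faster convergence that model-based planning is generally expected to give over purely model-free updates.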
Authors:Mohammad Pivezhandi, Abusayeed Saifullah, Ali Jannesari
Abstract:
With advancements in multicore embedded systems, leakage power, exponentially tied to chip temperature, has surpassed dynamic power consumption. Energy-aware solutions use dynamic voltage and frequency scaling (DVFS) to mitigate overheating in performance-intensive scenarios, while software approaches allocate high-utilization tasks across core configurations in parallel systems to reduce power. However, existing heuristics lack per-core frequency monitoring, failing to address overheating from uneven core activity, and task assignments without detailed profiling overlook irregular execution patterns. We target OpenMP DAG workloads. Because makespan, energy, and thermal goals often conflict within a single benchmark, this work prioritizes performance (makespan) while reporting energy and thermal as secondary outcomes. To overcome these issues, we propose HiDVFS (a hierarchical multi-agent, performance-aware DVFS scheduler) for parallel systems that optimizes task allocation based on profiling data, core temperatures, and makespan-first objectives. It employs three agents: one selects cores and frequencies using profiler data, another manages core combinations via temperature sensors, and a third sets task priorities during resource contention. A makespan-focused reward with energy and temperature regularizers estimates future states and enhances sample efficiency. Experiments on the NVIDIA Jetson TX2 using the BOTS suite (9 benchmarks) compare HiDVFS against state-of-the-art approaches. With multi-seed validation (seeds 42, 123, 456), HiDVFS achieves the best finetuned performance with 4.16 +/- 0.58 s average makespan (L10), representing a 3.44x speedup over GearDVFS (14.32 +/- 2.61 s) and 50.4% energy reduction (63.7 kJ vs 128.4 kJ). Across all BOTS benchmarks, HiDVFS achieves an average 3.95x speedup and 47.1% energy reduction.
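The headline ratios in this abstract follow directly from the reported means. The short check below recomputes the speedup and the energy reduction from the four figures quoted above; the variable names are illustrative, and the calculation ignores the reported standard deviations.

```python
# Recompute the reported speedup and energy reduction from the quoted means.
geardvfs_makespan_s = 14.32
hidvfs_makespan_s = 4.16
geardvfs_energy_kj = 128.4
hidvfs_energy_kj = 63.7

speedup = geardvfs_makespan_s / hidvfs_makespan_s
energy_reduction_pct = 100.0 * (geardvfs_energy_kj - hidvfs_energy_kj) / geardvfs_energy_kj

print(f"speedup: {speedup:.2f}x")                          # speedup: 3.44x
print(f"energy reduction: {energy_reduction_pct:.1f}%")    # energy reduction: 50.4%
```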
Authors:Ibai Ramirez, Jokin Alcibar, Joel Pino, Mikel Sanz, Jose I. Aizpurua
Abstract:
Physics-Informed Neural Networks (PINNs) provide a framework for integrating physical laws with data. However, their application to Prognostics and Health Management (PHM) remains constrained by their limited uncertainty quantification (UQ) capabilities. Most existing PINN-based prognostics approaches are deterministic or account only for epistemic uncertainty, limiting their suitability for risk-aware decision-making. This work introduces a heteroscedastic Bayesian Physics-Informed Neural Network (B-PINN) framework that jointly models epistemic and aleatoric uncertainty, yielding full predictive posteriors for spatiotemporal insulation material ageing estimation. The approach integrates Bayesian Neural Networks (BNNs) with physics-based residual enforcement and prior distributions, enabling probabilistic inference within a physics-informed learning architecture. The framework is evaluated on a transformer insulation ageing application, validated with a finite-element thermal model and field measurements from a solar power plant, and benchmarked against deterministic PINNs, dropout-based PINNs (d-PINNs), and alternative B-PINN variants. Results show that the proposed B-PINN provides improved predictive accuracy and better-calibrated uncertainty estimates than competing approaches. A systematic sensitivity study further analyzes the impact of boundary-condition, initial-condition, and residual sampling strategies on accuracy, calibration, and generalization. Overall, the findings highlight the potential of Bayesian physics-informed learning to support uncertainty-aware prognostics and informed decision-making in transformer asset management.
Authors:Mohammad Pivezhandi, Mahdi Banisharif, Saeed Bakhshan, Abusayeed Saifullah, Ali Jannesari
Abstract:
Performance prediction for OpenMP workloads on heterogeneous embedded SoCs is challenging due to complex interactions between task DAG structure, control-flow irregularity, cache and branch behavior, and thermal dynamics; classical heuristics struggle under workload irregularity, tabular regressors discard structural information, and model-free RL risks overheating resource-constrained devices. We introduce GraphPerf-RT, the first surrogate that unifies task DAG topology, CFG-derived code semantics, and runtime context (per-core DVFS, thermal state, utilization) in a heterogeneous graph representation with typed edges encoding precedence, placement, and contention. Multi-task evidential heads predict makespan, energy, cache and branch misses, and utilization with calibrated uncertainty (Normal-Inverse-Gamma), enabling risk-aware scheduling that filters low-confidence rollouts. We validate GraphPerf-RT on three embedded ARM platforms (Jetson TX2, Jetson Orin NX, RUBIK Pi), achieving R^2 > 0.95 with well-calibrated uncertainty (ECE < 0.05). To demonstrate end-to-end scheduling utility, we integrate the surrogate with four RL methods on Jetson TX2: single-agent model-free (SAMFRL), single-agent model-based (SAMBRL), multi-agent model-free (MAMFRL-D3QN), and multi-agent model-based (MAMBRL-D3QN). Experiments across 5 seeds (200 episodes each) show that MAMBRL-D3QN with GraphPerf-RT as the world model achieves 66% makespan reduction (0.97 +/- 0.35s) and 82% energy reduction (0.006 +/- 0.005J) compared to model-free baselines, demonstrating that accurate, uncertainty-aware surrogates enable effective model-based planning on thermally constrained embedded systems.
Authors:Jing Tao, Yonghong Zong, Banglei Guan, Pengju Sun, Taihang Lei, Yang Shang, Qifeng Yu
Abstract:
In photogrammetry, accurately fusing infrared (IR) and visible (VIS) spectra while preserving the geometric fidelity of visible features and incorporating thermal radiation is a significant challenge, particularly under extreme conditions. Existing methods often compromise visible imagery quality, impacting measurement accuracy. To solve this, we propose a region perception-based fusion framework that combines multi-exposure and multi-modal imaging using a spatially varying exposure (SVE) camera. This framework co-fuses multi-modal and multi-exposure data, overcoming single-exposure method limitations in extreme environments. The framework begins with region perception-based feature fusion to ensure precise multi-modal registration, followed by adaptive fusion with contrast enhancement. A structural similarity compensation mechanism, guided by regional saliency maps, optimizes IR-VIS spectral integration. Moreover, the framework adapts to single-exposure scenarios for robust fusion across different conditions. Experiments conducted on both synthetic and real-world data demonstrate superior image clarity and improved performance compared to state-of-the-art methods, as evidenced by both quantitative and visual evaluations.
Authors:Farah Alsafadi, Alexandra Akins, Xu Wu
Abstract:
Deep generative modeling provides a powerful pathway to overcome data scarcity in energy-related applications where experimental data are often limited, costly, or difficult to obtain. By learning the underlying probability distribution of the training dataset, deep generative models, such as the diffusion model (DM), can generate high-fidelity synthetic samples that statistically resemble the training data. Such synthetic data generation can significantly enrich the size and diversity of the available training data, and more importantly, improve the robustness of downstream machine learning models in predictive tasks. The objective of this paper is to investigate the effectiveness of DM for overcoming data scarcity in nuclear energy applications. By leveraging a public dataset on critical heat flux (CHF) that covers a wide range of commercial nuclear reactor operational conditions, we developed a DM that can generate an arbitrary number of synthetic samples for augmenting the CHF dataset. Since a vanilla DM can only generate samples randomly, we also developed a conditional DM capable of generating targeted CHF data under user-specified thermal-hydraulic conditions. The performance of the DMs was evaluated based on their ability to capture empirical feature distributions and pair-wise correlations, as well as to maintain physical consistency. The results showed that both the DM and conditional DM can successfully generate realistic and physics-consistent CHF data. Furthermore, uncertainty quantification was performed to establish confidence in the generated data. The results demonstrated that the conditional DM is highly effective in augmenting CHF data while maintaining acceptable levels of uncertainty.
Authors:Mohamed Youssef, Lukas Brunner, Klaus Rundhammer, Gerald Czech, Oliver Bimber
Abstract:
We introduce a novel method for reconstructing surface temperatures through occluding forest vegetation by combining signal processing and machine learning. Our goal is to enable fully automated aerial wildfire monitoring using autonomous drones, allowing for the early detection of ground fires before smoke or flames are visible. While synthetic aperture (SA) sensing mitigates occlusion from the canopy and sunlight, it introduces thermal blur that obscures the actual surface temperatures. To address this, we train a visual state space model to recover the subtle thermal signals of partially occluded soil and fire hotspots from this blurred data. A key challenge was the scarcity of real-world training data. We overcome this by integrating a latent diffusion model into a vector quantized to generate a large volume of realistic surface temperature simulations from real wildfire recordings, which we further expanded through temperature augmentation and procedural thermal forest simulation. On simulated data across varied ambient and surface temperatures, forest densities, and sunlight conditions, our method reduced the RMSE by a factor of 2 to 2.5 compared to conventional thermal and uncorrected SA imaging. In field experiments focused on high-temperature hotspots, the improvement was even more significant, with a 12.8-fold RMSE gain over conventional thermal and a 2.6-fold gain over uncorrected SA images. We also demonstrate our model's generalization to other thermal signals, such as human signatures for search and rescue. Since simple thresholding is frequently inadequate for detecting subtle thermal signals, the morphological characteristics are equally essential for accurate classification. Our experiments demonstrated another clear advantage: we reconstructed the complete morphology of fire and human signatures, whereas conventional imaging is defeated by partial occlusion.
Authors:Feng Guo, Luis D. Couto, Khiem Trad, Guangdi Hu, Mohammadhosein Safari
Abstract:
This paper addresses state of charge (SOC) estimation for lithium iron phosphate (LFP) batteries, where the relatively flat open-circuit voltage (OCV-SOC) characteristic reduces observability. A residual bias compensation dual extended Kalman filter (RBC-DEKF) is developed. Unlike conventional bias compensation methods that treat the bias as an augmented state within a single filter, the proposed dual-filter structure decouples residual bias estimation from electrochemical state estimation. One EKF estimates the system states of a control-oriented parameter-grouped single particle model with thermal effects, while the other EKF estimates a residual bias that continuously corrects the voltage observation equation, thereby refining the model-predicted voltage in real time. Unlike bias-augmented single-filter schemes that enlarge the covariance coupling, the decoupled bias estimator refines the voltage observation without perturbing electrochemical state dynamics. Validation is conducted on an LFP cell from a public dataset under three representative operating conditions: US06 at 0 degC, DST at 25 degC, and FUDS at 50 degC. Compared with a conventional EKF using the same model and identical state filter settings, the proposed method reduces the average SOC RMSE from 3.75% to 0.20% and the voltage RMSE between the filtered model voltage and the measured voltage from 32.8 mV to 0.8 mV. The improvement is most evident in the mid-SOC range where the OCV-SOC curve is flat, confirming that residual bias compensation significantly enhances accuracy for model-based SOC estimation of LFP batteries across a wide temperature range.
Authors:Nicolas Rouger, Luiz Villa, Matthieu Masson, Pauline Kergus, Joseph Kemdeg, Lorenzo Leijnen, Jean Alinei, Adrien Colomb, Ayoub Farah-Hassan, Arnauld Biganzoli
Abstract:
The switching losses of power transistors are generally measured using the so-called double pulse method. Measuring the opposition of two switching cells is a complementary method that is more accurate but indirect. However, implementing this method can be more complex and requires calibration steps and comprehensive control, with the added issue of thermal management. In this context, we proposed to address this topic through open and collaborative science, first in the form of a two-day hackathon, followed by monthly open sessions. More than 20 participants contributed to the two-day hackathon, followed by monthly sessions for those wishing to continue working together. This enabled us to set up an automated bench, in open science, including the generation of switching commands, the configuration and control of measuring instruments, and the hardware part. Here we present and share our work and this open approach.
Authors:Johannes Autenrieb, Patrick Gruhn
Abstract:
Hypersonic glide vehicles (HGVs) operate in challenging flight regimes characterized by strong nonlinearities in actuation and stringent physical constraints. These include state-dependent actuator limitations, asymmetric control bounds, and thermal loads that vary with maneuvering conditions. This paper introduces an iterative control allocation method to address these challenges in real time. The proposed algorithm searches for control inputs that achieve the desired moment commands while respecting constraints on input magnitude and rate. For slender HGV configurations, thermal loads and drag generation are strongly correlated: lower drag typically results in reduced surface heating. By embedding drag-sensitive soft constraints, the method improves energy efficiency and implicitly reduces surface temperatures, lowering the vehicle's infrared signature. These features are particularly advantageous for long-range military operations that require low observability. The approach is demonstrated using the DLR's Generic Hypersonic Glide Vehicle 2 (GHGV-2) simulation model. The results confirm the method's effectiveness in maintaining control authority under realistic, constrained flight conditions.
Authors:Jihun Lim, Sungwon Lee
Abstract:
To facilitate the widespread adoption of renewable energy, dispatchable, zero-emission power sources are essential for grid stability. This work performs a comprehensive techno-economic analysis of a self-sustainable thermophotovoltaic (TPV) system, an architecture that integrates solar charging to function as a standalone power generation asset. Using theory-based models for air-bridge InGaAs and Si diode cells, our analysis reveals that while the system is not currently competitive from a pure levelized cost of storage (LCOS) perspective due to the high capital expenditure for thermal battery materials, its primary value lies in its competitive levelized cost of electricity (LCOE). The results demonstrate that the LCOE of this self-sustaining system can be competitive with conventional dispatchable generators, such as gas turbines. Furthermore, at scales exceeding the gigawatt-hour level, a Si-based system can also achieve an LCOE comparable to that of traditional gas-turbine power plants, despite having a lower conversion efficiency than its InGaAs counterpart. This highlights a practical engineering pathway for leveraging silicon's immense manufacturing scalability, offering a lower-risk route to deployment compared to III-V materials. Ultimately, this work establishes the self-sustainable TPV architecture as a compelling pathway toward providing grid-scale, on-demand, zero-emission power.
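For readers unfamiliar with the metric, LCOE is conventionally the discounted lifetime cost divided by the discounted lifetime energy delivered. The sketch below shows that standard textbook definition under simplifying assumptions (constant yearly cost and output, capital spent up front). The parameter names and numbers are hypothetical and do not reproduce the paper's techno-economic model.

```python
def lcoe(capex, annual_opex, annual_energy_mwh, discount_rate, lifetime_years):
    """Levelized cost of electricity: discounted cost / discounted energy.

    Assumes capital is spent in year 0 and that operating cost and energy
    output are constant in every year of operation.
    """
    cost = float(capex)
    energy = 0.0
    for year in range(1, lifetime_years + 1):
        factor = (1.0 + discount_rate) ** year
        cost += annual_opex / factor
        energy += annual_energy_mwh / factor
    return cost / energy

# Hypothetical plant: with no discounting, (1000 + 10*100) / (10*50) = 4 per MWh.
undiscounted = lcoe(1000.0, 100.0, 50.0, 0.0, 10)
# A positive discount rate penalises the up-front capital and raises the LCOE.
discounted = lcoe(1000.0, 100.0, 50.0, 0.05, 10)
```

Because capital is paid before any energy is delivered, capital-heavy systems are the most sensitive to the discount rate in this formulation.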
Authors:Ahmad Alsheikh, Andreas Fischer
Abstract:
Accurate and efficient temperature prediction is critical for optimizing the preheating process of PET preforms in industrial microwave systems prior to blow molding. We propose a novel deep learning framework for generalized temperature prediction. Unlike traditional models that require extensive retraining for each material or design variation, our method introduces a data-efficient neural architecture that leverages transfer learning and model fusion to generalize across unseen scenarios. By pretraining specialized neural regressors on distinct conditions such as recycled PET heat capacities or varying preform geometries and integrating their representations into a unified global model, we create a system capable of learning shared thermal dynamics across heterogeneous inputs. The architecture incorporates skip connections to enhance stability and prediction accuracy. Our approach reduces the need for large simulation datasets while achieving superior performance compared to models trained from scratch. Experimental validation on two case studies, material variability and geometric diversity, demonstrates significant improvements in generalization, establishing a scalable ML-based solution for intelligent thermal control in manufacturing environments. Moreover, the approach highlights how data-efficient generalization strategies can extend to other industrial applications involving complex physical modeling with limited data.
Authors:C. Coelho, M. Hohmann, D. Fernández, L. Penter, S. Ihlenfeldt, O. Niggemann
Abstract:
Thermal errors in machine tools significantly impact machining precision and productivity. Traditional thermal error correction/compensation methods rely on measured temperature-deformation fields or on transfer functions. Most existing data-driven compensation strategies employ neural networks (NNs) to directly predict thermal errors or specific compensation values. While effective, these approaches are tightly bound to particular error types, spatial locations, or machine configurations, limiting their generality and adaptability. In this work, we introduce a novel paradigm in which NNs are trained to predict high-fidelity temperature and heat flux fields within the machine tool. The proposed framework enables subsequent computation and correction of a wide range of error types using modular, swappable downstream components. The NN is trained using data obtained with the finite element method under varying initial conditions and incorporates a correlation-based selection strategy that identifies the most informative measurement points, minimising hardware requirements during inference. We further benchmark state-of-the-art time-series NN architectures, namely Recurrent NN, Gated Recurrent Unit, Long-Short Term Memory (LSTM), Bidirectional LSTM, Transformer, and Temporal Convolutional Network, by training both specialised models, tailored for specific initial conditions, and general models, capable of extrapolating to unseen scenarios. The results show accurate and low-cost prediction of temperature and heat flux fields, laying the basis for enabling flexible and generalisable thermal error correction in machine tool environments.
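The abstract does not spell out its correlation-based selection criterion. One plausible reading, sketched here purely as an assumption, is to rank candidate measurement points by the absolute Pearson correlation of their signal with a quantity of interest and keep the top k. The function name, array layout, and synthetic data are all hypothetical.

```python
import numpy as np

def select_probes(readings, target, k):
    """Indices of the k probes whose signals correlate most strongly with target.

    readings: array of shape (time, probes); target: array of shape (time,).
    This ranking rule is an illustrative assumption, not the paper's method.
    """
    corr = np.array([np.corrcoef(readings[:, j], target)[0, 1]
                     for j in range(readings.shape[1])])
    order = np.argsort(-np.abs(corr))  # strongest absolute correlation first
    return order[:k], corr

# Synthetic example: probe 2 is an exact linear function of the target.
rng = np.random.default_rng(1)
target = rng.standard_normal(100)
readings = np.column_stack([
    rng.standard_normal(100),            # probe 0: unrelated noise
    -target + rng.standard_normal(100),  # probe 1: noisy, inverted
    2.0 * target + 1.0,                  # probe 2: perfectly correlated
])
chosen, corr = select_probes(readings, target, k=1)
```

Ranking by absolute value keeps strongly anti-correlated probes, which carry as much information as positively correlated ones.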
Authors:Malek Succar, Mohamed I. Ibrahim
Abstract:
Current control techniques for cryogenically cooled qubits are realized with coaxial cables, posing multiple challenges in terms of cost, thermal load, size, and long-term scalability. Emerging approaches to tackle this issue include cryogenic CMOS electronics at 4 K, and photonic links for direct qubit control. In this paper, we propose a multiplexed all-passive cryogenic high frequency direct detection control platform (cryo-HFDD). The proposed classical interface for direct qubit control utilizes optical or sub-THz bands. We present the possible tradeoffs of this platform, and compare it with current state-of-the-art cryogenic CMOS and conventional coaxial approaches. We assess the feasibility of adopting these efficient links for a wide range of microwave qubit power levels. Specifically, we estimate the heat load to achieve the required signal-to-noise ratio (SNR), considering different noise sources, component losses, as well as link density. We show that multiplexed photonic receivers at 4 K can aggressively scale the control of thousands of qubits. This opens the door for low-cost, scalable quantum computing systems.
Authors:Manuel Nkegoum, Minh-Tan Pham, Élisa Fromont, Bruno Avignon, Sébastien Lefèvre
Abstract:
Few-shot multispectral object detection (FSMOD) addresses the challenge of detecting objects across visible and thermal modalities with minimal annotated data. In this paper, we explore this complex task and introduce a framework named "FSMODNet" that leverages cross-modality feature integration to improve detection performance even with limited labels. By effectively combining the unique strengths of visible and thermal imagery using deformable attention, the proposed method demonstrates robust adaptability in complex illumination and environmental conditions. Experimental results on two public datasets show effective object detection performance in challenging low-data regimes, outperforming several baselines we established from state-of-the-art models. All code, models, and experimental data splits can be found at https://anonymous.4open.science/r/Test-B48D.
Authors:Naveed D. Riaziat, Joseph Chen, Axel Krieger, Jeremy D. Brown
Abstract:
Electrosurgery is a surgical technique that can improve tissue cutting by reducing cutting force and bleeding. However, electrosurgery adds a risk of thermal injury to surrounding tissue. Expert surgeons estimate desirable cutting velocities based on experience but have no quantifiable reference to indicate if a particular velocity is optimal. Furthermore, prior demonstrations of autonomous electrosurgery have primarily used constant tool velocity, which is not robust to changes in electrosurgical tissue characteristics, power settings, or tool type. Thermal imaging feedback provides information that can be used to reduce thermal injury while balancing cutting force by controlling tool velocity. We introduce Thermography for Electrosurgical Rate Modulation via Optimization (ThERMO) to autonomously reduce thermal injury while balancing cutting force by intelligently controlling tool velocity. We demonstrate ThERMO in tissue phantoms and compare its performance to the constant velocity approach. Overall, ThERMO improves cut success rate by a factor of three and can reduce peak cutting force by a factor of two. ThERMO responds to varying environmental disturbances, reduces damage to tissue, and completes cutting tasks that would otherwise result in catastrophic failure for the constant velocity approach.
Authors:Alex Keilmann, Claudia Redenbach, Francois Willot
Abstract:
In fields such as material design or biomedicine, fiber materials play an important role. Fiber simulations, also called digital twins, provide a basis for testing and optimizing the material's physical behavior digitally. Inter-fiber contacts can influence the thermal and mechanical behavior of a fiber system; to our knowledge, however, there exist no parametric fiber models allowing for explicit modeling of the number of inter-fiber contacts. Therefore, this paper proposes an extension of the iterative force-biased fiber packing by Altendorf & Jeulin. In this extension, we model the inter-fiber contacts explicitly and add another force to the force-biased packing to increase the number of contacts. We successfully validate the packing with respect to its parameter accuracy. Moreover, we show that the extension indeed increases the number of contacts, even exceeding theoretical values. Hence, this packing scheme has the potential to achieve higher accuracy in physical simulations.
Authors:C. Coelho, D. Fernández, M. Hohmann, L. Penter, S. Ihlenfeldt, O. Niggemann
Abstract:
This data set descriptor introduces a structured, high-resolution dataset of transient thermal simulations for a vertical axis of a machine tool test rig. The data set includes temperature and heat flux values recorded at 29 probe locations at 1800 time steps, sampled every second over a 30-minute range, across 17 simulation runs derived from a fractional factorial design. First, a computer-aided design model was de-featured, segmented, and optimized, followed by finite element (FE) modelling. Detailed information on material, mesh, and boundary conditions is included. To support research and model development, the dataset provides summary statistics, thermal evolution plots, correlation matrix analyses, and a reproducible Jupyter notebook. The data set is designed to support machine learning and deep learning applications in thermal modelling for prediction, correction, and compensation of thermally induced deviations in mechanical systems, and aims to support researchers without FE expertise by providing ready-to-use simulation data.
Authors:Ibai Ramirez, Jokin Alcibar, Joel Pino, Mikel Sanz, David Pardo, Jose I. Aizpurua
Abstract:
Scientific Machine Learning (SciML) integrates physics and data into the learning process, offering improved generalization compared with purely data-driven models. Despite its potential, applications of SciML in prognostics remain limited, partly due to the complexity of incorporating partial differential equations (PDEs) for ageing physics and the scarcity of robust uncertainty quantification methods. This work introduces a Bayesian Physics-Informed Neural Network (B-PINN) framework for probabilistic prognostics estimation. By embedding Bayesian Neural Networks into the PINN architecture, the proposed approach produces principled, uncertainty-aware predictions. The method is applied to a transformer ageing case study, where insulation degradation is primarily driven by thermal stress. The heat diffusion PDE is used as the physical residual, and different prior distributions are investigated to examine their impact on predictive posterior distributions and their ability to encode a priori physical knowledge. The framework is validated against a finite element model developed and tested with real measurements from a solar power plant. Results, benchmarked against a dropout-PINN baseline, show that the proposed B-PINN delivers more reliable prognostic predictions by accurately quantifying predictive uncertainty. This capability is crucial for supporting robust and informed maintenance decision-making in critical power assets.
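Using the heat diffusion PDE as the physical residual means penalising r = u_t - alpha * u_xx wherever the network's temperature prediction u(x, t) is evaluated. PINNs normally obtain these derivatives by automatic differentiation. The sketch below swaps in central finite differences so that it runs with NumPy alone. The one-dimensional form, the diffusivity value, and the test functions are assumptions for illustration.

```python
import numpy as np

def heat_residual(u, x, t, alpha, h=1e-3):
    """Finite-difference estimate of u_t - alpha * u_xx for a callable u(x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return u_t - alpha * u_xx

alpha = 0.1  # hypothetical thermal diffusivity

def exact(x, t):
    """An exact solution of u_t = alpha * u_xx, so its residual should vanish."""
    return np.exp(-alpha * np.pi**2 * t) * np.sin(np.pi * x)

def not_a_solution(x, t):
    """u = x^2 has u_t = 0 and u_xx = 2, so its residual is -2 * alpha."""
    return x**2

r_exact = heat_residual(exact, 0.3, 0.5, alpha)
r_wrong = heat_residual(not_a_solution, 0.3, 0.5, alpha)
```

In a PINN loss, the squared residual would be averaged over many collocation points and added to the data-fitting term.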
Authors:Johannes van Randenborgh, Moritz Schulze Darup
Abstract:
An aquifer thermal energy storage (ATES) can mitigate CO2 emissions of heating, ventilation, and air conditioning (HVAC) systems for buildings. In application, an ATES keeps large quantities of thermal energy in groundwater-saturated aquifers. Normally, an ATES system comprises two (one for heat and one for cold) storages and supports the heating and cooling efforts of simultaneously present HVAC system components. This way, the operation and emissions of installed and, usually, fossil fuel-based components are reduced. The control of ATES systems is challenging, and various control schemes, including model predictive control (MPC), have been proposed. In this context, we present a lightweight input-output-data-based autoregressive with exogenous input (ARX) model of the hybrid ATES system dynamics. The ARX model allows the design of an output-based MPC scheme, resulting in an easy-to-solve quadratic program and avoiding challenging state estimations of ground temperatures. A numerical study discusses the accuracy of the ARX predictor and controller performance.
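An ARX model predicts the next output from a linear combination of past outputs and past inputs, so its coefficients can be identified from input-output data by ordinary least squares. The following is a minimal sketch of that identification step on a synthetic first-order system. It deliberately ignores the hybrid (mode-switching) behaviour the abstract mentions, and all names and coefficients are hypothetical.

```python
import numpy as np

def fit_arx(y, u, na, nb):
    """Least-squares fit of y[k] = sum_i a_i*y[k-i] + sum_j b_j*u[k-j]."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(na, nb)
    rows = []
    for k in range(start, len(y)):
        past_y = [y[k - i] for i in range(1, na + 1)]
        past_u = [u[k - j] for j in range(1, nb + 1)]
        rows.append(past_y + past_u)
    phi = np.array(rows)  # regressor matrix, one row per time step
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta[:na], theta[na:]

# Simulate a noise-free first-order system and recover its coefficients.
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y = np.zeros(200)
for k in range(1, 200):
    y[k] = 0.9 * y[k - 1] + 0.5 * u[k - 1]
a, b = fit_arx(y, u, na=1, nb=1)
```

Because the predictor uses only measured inputs and outputs, an MPC scheme built on it does not need a separate estimate of unmeasured ground temperatures.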
Authors:Haoshuo Zhang, Yufei Bo, Meixia Tao
Abstract:
Multimodal semantic communication has great potential to enhance downstream task performance by integrating complementary information across modalities. This paper introduces ProMSC-MIS, a novel Prompt-based Multimodal Semantic Communication framework for Multi-Spectral Image Segmentation. It enables efficient task-oriented transmission of spatially aligned RGB and thermal images over band-limited channels. Our framework has two main design novelties. First, by leveraging prompt learning and contrastive learning, unimodal semantic encoders are pre-trained to learn diverse and complementary semantic representations by using features from one modality as prompts for another. Second, a semantic fusion module that combines cross-attention mechanism and squeeze-and-excitation (SE) networks is designed to effectively fuse cross-modal features. Experimental results demonstrate that ProMSC-MIS substantially outperforms conventional image transmission combined with state-of-the-art segmentation methods. Notably, it reduces the required channel bandwidth by 50%--70% at the same segmentation performance, while also decreasing the storage overhead and computational complexity by 26% and 37%, respectively. Ablation studies also validate the effectiveness of the proposed pre-training and semantic fusion strategies. Our scheme is highly suitable for applications such as autonomous driving and nighttime surveillance.
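The squeeze-and-excitation (SE) part of such a fusion module reweights feature channels with gates computed from globally pooled statistics. A minimal NumPy sketch of a generic SE block is given below. It omits biases and the cross-attention stage, and the shapes, reduction ratio, and random weights are assumptions rather than the paper's configuration.

```python
import numpy as np

def squeeze_excite(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); biases are omitted.
    """
    squeezed = features.mean(axis=(1, 2))         # squeeze: global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)       # bottleneck layer with ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate per channel
    return features * gates[:, None, None], gates

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 3, 3))  # C=4 channels on a 3x3 grid
w1 = rng.standard_normal((2, 4))           # reduction ratio r=2 (hypothetical)
w2 = rng.standard_normal((4, 2))
out, gates = squeeze_excite(features, w1, w2)
```

Each gate lies between 0 and 1, so the block can only attenuate channels relative to one another; it never changes the spatial layout.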
Authors:Anton Belichenko, Daria Trinitatova, Aigul Nasibullina, Lev Yakovlev, Dzmitry Tsetserukou
Abstract:
Understanding the neural correlates of sensory imagery is crucial for advancing cognitive neuroscience and developing novel Brain-Computer Interface (BCI) paradigms. This study investigated the influence of imagined temperature sensations (ITS) on neural activity within the sensorimotor cortex. The experimental study involved the evaluation of neural activity using electroencephalography (EEG) during both real thermal stimulation (TS: 40°C Hot, 20°C Cold) applied to the participants' hand, and the mental temperature imagination (ITS) of the corresponding hot and cold sensations. The analysis focused on quantifying the event-related desynchronization (ERD) of the sensorimotor mu-rhythm (8-13 Hz). The experimental results revealed a characteristic mu-ERD localized over central scalp regions (e.g., C3) during both TS and ITS conditions. Although the magnitude of mu-ERD during ITS was slightly lower than during TS, this difference was not statistically significant (p>.05). However, ERD during both ITS and TS was statistically significantly different from the resting baseline (p<.001). These findings demonstrate that imagining temperature sensations engages sensorimotor cortical mechanisms in a manner comparable to actual thermal perception. This insight expands our understanding of the neurophysiological basis of sensory imagery and suggests the potential utility of ITS for non-motor BCI control and neurorehabilitation technologies.
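ERD is commonly expressed as the percentage change in band power during a task relative to a resting baseline, with negative values indicating desynchronization. The sketch below computes that quantity for the 8-13 Hz band using a simple periodogram. The sampling rate and the synthetic sinusoidal signals are hypothetical, and real EEG pipelines add filtering, epoching, and trial averaging that are not shown.

```python
import numpy as np

def band_power(signal, fs, lo=8.0, hi=13.0):
    """Mean periodogram power of `signal` between lo and hi Hz."""
    signal = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= lo) & (freqs <= hi)
    return float(power[mask].mean())

def erd_percent(task, rest, fs):
    """Relative band-power change versus rest; negative means desynchronization."""
    p_rest = band_power(rest, fs)
    return 100.0 * (band_power(task, fs) - p_rest) / p_rest

fs = 250  # Hz, hypothetical sampling rate
t = np.arange(0, 2.0, 1.0 / fs)
rest = np.sin(2 * np.pi * 10 * t)        # strong 10 Hz mu rhythm at rest
task = 0.5 * np.sin(2 * np.pi * 10 * t)  # amplitude halved during the task
erd = erd_percent(task, rest, fs)        # power drops to a quarter: -75 %
```

Halving the amplitude quarters the power, which is why the toy example yields an ERD of -75%.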
Authors:Sijie Yang, Binyu Lei, Filip Biljecki
Abstract:
Ensuring liveability and comfort is one of the fundamental objectives of urban planning. Numerous studies have employed computational methods to assess and quantify factors related to urban comfort such as greenery coverage, thermal comfort, and walkability. However, a clear definition of urban comfort and its comprehensive evaluation framework remain elusive. Our research explores the theoretical interpretations and methodologies for assessing urban comfort within digital planning, emphasising three key dimensions: multidimensional analysis, data support, and AI assistance.
Authors:Michele Minervini, Madison Chin, Jacob Kupperman, Nana Liu, Ivy Luo, Meghan Ly, Soorya Rethinasamy, Kathie Wang, Mark M. Wilde
Abstract:
A quantum thermodynamic system is described by a Hamiltonian and a list of conserved, non-commuting charges, and a fundamental goal is to determine the minimum energy of the system subject to constraints on the charges. Recently, [Liu et al., arXiv:2505.04514] proposed first- and second-order classical and hybrid quantum-classical algorithms for solving a dual chemical potential maximization problem, and they proved that these algorithms converge to global optima by means of gradient-ascent approaches. In this paper, we benchmark these algorithms on several problems of interest in thermodynamics, including one- and two-dimensional quantum Heisenberg models with nearest and next-to-nearest neighbor interactions and with the charges set to the total $x$, $y$, and $z$ magnetizations. We also offer an alternative compelling interpretation of these algorithms as methods for designing ground and thermal states of controllable Hamiltonians, with potential applications in molecular and material design. Furthermore, we introduce stabilizer thermodynamic systems as thermodynamic systems based on stabilizer codes, with the Hamiltonian constructed from a given code's stabilizer operators and the charges constructed from the code's logical operators. We benchmark the aforementioned algorithms on several examples of stabilizer thermodynamic systems, including those constructed from the one-to-three-qubit repetition code, the perfect one-to-five-qubit code, and the two-to-four-qubit error-detecting code. Finally, we observe that the aforementioned hybrid quantum-classical algorithms, when applied to stabilizer thermodynamic systems, can serve as alternative methods for encoding qubits into stabilizer codes at a fixed temperature, and we provide an effective method for warm-starting these encoding algorithms whenever a single qubit is encoded into multiple physical qubits.
Authors:Johannes van Randenborgh, Steffen Daniel, Moritz Schulze Darup
Abstract:
Borehole thermal energy storage (BTES) can reduce the operation of fossil fuel-based heating, ventilation, and air conditioning systems for buildings. With BTES, thermal energy is stored via a borehole heat exchanger in the ground. Model predictive control (MPC) may maximize the use of BTES by achieving a dynamic interaction between the building and BTES. However, modeling BTES for MPC is challenging, and a trade-off between model accuracy and an easy-to-solve optimal control problem (OCP) must be found. This manuscript presents an accurate numerical model yielding an easy-to-solve linear-quadratic OCP.
Authors:Ziqi Zhang, Shiheng Chen, Runze Yang, Zhisheng Wei, Wei Zhang, Lei Wang, Zhanzhi Liu, Fengshan Zhang, Jing Wu, Xiaoyong Pan, Hongbin Shen, Longbing Cao, Zhaohong Deng
Abstract:
Developing enzymes with desired thermal properties is crucial for a wide range of industrial and research applications, and determining temperature stability is an essential step in this process. Experimental determination of thermal parameters is labor-intensive, time-consuming, and costly. Moreover, existing computational approaches are often hindered by limited data availability and imbalanced distributions. To address these challenges, we introduce a curated temperature stability dataset designed for model development and benchmarking in enzyme thermal modeling. Leveraging this dataset, we present the Segment Transformer, a novel deep learning framework that enables efficient and accurate prediction of enzyme temperature stability. The model achieves state-of-the-art performance with an RMSE of 24.03, MAE of 18.09, and Pearson and Spearman correlations of 0.33, respectively. These results highlight the effectiveness of incorporating segment-level representations, grounded in the biological observation that different regions of a protein sequence contribute unequally to thermal behavior. As a proof of concept, we applied the Segment Transformer to guide the engineering of a cutinase enzyme. Experimental validation demonstrated a 1.64-fold improvement in relative activity following heat treatment, achieved through only 17 mutations and without compromising catalytic function.
Authors:Aurelio Venditti, Walter Gubinelli, Enise F. Altin, Luca Colombo, Pietro Simeoni, Benyamin Davaji, Matteo Rinaldi
Abstract:
This letter introduces a novel class of miniaturized, uncooled, and ultra-fast infrared (IR) resonant thermal detectors (RTDs) based on 30%-doped Aluminum Scandium Nitride (AlScN) nanoplates. Exploiting high electromechanical coupling, good thermal properties, and enhanced and selective IR absorption, the presented device aims to demonstrate significant advancements over the state-of-the-art IR RTDs. This single pixel combines compact footprint, high spectral selectivity and responsivity, reduced noise, and fast thermal response, allowing for the potential development of innovative IR thermal imagers through multi-pixel integration. The flexural nature of the actuated resonance mode eventually enables an interferometric optical readout, paving the way towards achieving extremely low Noise Equivalent Power levels. These results demonstrate a high IR responsivity of around 130 ppt/pW, a thermal time constant of around 330 us, and a large out-of-plane displacement. This work represents the first experimental integration on a resonating platform of plasmonic absorbers that utilize AlScN as dielectric layer.
Authors:Francesco Malandrino, Olga Chukhno, Alessandro Catania, Antonella Molinaro, Carla Fabiana Chiasserini
Abstract:
Extended reality (XR) devices, commonly known as wearables, must handle significant computational loads under tight latency constraints. To meet these demands, they rely on a combination of on-device processing and edge offloading. This letter focuses on offloading strategies for wearables by considering their impact across three time scales: instantaneous power consumption, short-term temperature fluctuations, and long-term battery duration. We introduce a comprehensive system model that captures these temporal dynamics, and propose a stochastic and stationary offloading strategy, called TAO (for temperature-aware offloading), designed to minimize the offloading cost while adhering to power, thermal, and energy constraints. Our performance evaluation, leveraging COMSOL models of real-world wearables, confirms that TAO reduces offloading cost by over 35% compared to state-of-the-art approaches, without violating the wearable operational limits.
Authors:Alex C. Newkirk, Jared Fernandez, Jonathan Koomey, Imran Latif, Emma Strubell, Arman Shehabi, Constantine Samaras
Abstract:
As AI's energy demand continues to grow, it is critical to enhance the understanding of characteristics of this demand, to improve grid infrastructure planning and environmental assessment. By combining empirical measurements from Brookhaven National Laboratory during AI training on 8-GPU H100 systems with open-source benchmarking data, we develop statistical models relating computational intensity to node-level power consumption. We measure the gap between manufacturer-rated thermal design power (TDP) and actual power demand during AI training. Our analysis reveals that even computationally intensive workloads operate at only 76% of the 10.2 kW TDP rating. Our architecture-specific model, calibrated to floating-point operations, predicts energy consumption with 11.4% mean absolute percentage error, significantly outperforming TDP-based approaches (27-37% error). We identified distinct power signatures between transformer and CNN architectures, with transformers showing characteristic fluctuations that may impact grid stability.
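The headline figures can be turned into a quick back-of-the-envelope check: 76% of a 10.2 kW rating is about 7.75 kW, and assuming the full rating would overstate that draw. The sketch below computes both, using the usual definition of mean absolute percentage error. Only the 10.2 kW and 76% values come from the abstract; the single-sample error is an illustration and not the paper's evaluation.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

TDP_KW = 10.2                # rated node TDP, from the abstract
observed_kw = 0.76 * TDP_KW  # 76% of TDP, from the abstract: 7.752 kW

# Error made by assuming the node always draws its full rated TDP.
tdp_assumption_error = mape([observed_kw], [TDP_KW])
```

Dividing by the observed value rather than the rating is what makes the overestimate larger than the 24% gap between the two numbers.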
Authors:Farida Mohsen, Ali Safa
Abstract:
Accurate rotational odometry is crucial for autonomous robotic systems, particularly for small, power-constrained platforms such as drones and mobile robots. This study introduces thermal-gyro fusion, a novel sensor fusion approach that integrates ultra-low-resolution thermal imaging with gyroscope readings for rotational odometry. Unlike RGB cameras, thermal imaging is invariant to lighting conditions and, when fused with gyroscopic data, mitigates drift which is a common limitation of inertial sensors. We first develop a multimodal data acquisition system to collect synchronized thermal and gyroscope data, along with rotational speed labels, across diverse environments. Subsequently, we design and train a lightweight Convolutional Neural Network (CNN) that fuses both modalities for rotational speed estimation. Our analysis demonstrates that thermal-gyro fusion enables a significant reduction in thermal camera resolution without significantly compromising accuracy, thereby improving computational efficiency and memory utilization. These advantages make our approach well-suited for real-time deployment in resource-constrained robotic systems. Finally, to facilitate further research, we publicly release our dataset as supplementary material.
Authors:Tianci Miao, Qihang Zheng, Yangyang Hu, Xiaoyu Cheng, Jie Liang, Liang Chen, Aiying Guo, Jingjing Liu, Kailin Ren, Jianhua Zhang
Abstract:
As the technology node continues to shrink, nanosheet field effect transistors (NSFETs) and complementary FETs (CFETs) become valid candidates for the 3nm and sub-nanometre nodes. However, due to the shrinking device size, self-heating and inter-device thermal crosstalk of NSFETs and CFETs become more severe. It is important to accurately calculate the self-heating and thermal crosstalk of devices and to study the electrical and thermal characteristics of logic gates, etc. In this work, a thermal network model considering the thermal crosstalk of neighboring devices is proposed, which can accurately calculate the self-heating and thermal crosstalk. The electrical and thermal characteristics of NSFETs and CFETs are compared, and it is found that CFETs have more severe self-heating and thermal crosstalk. The electro-thermal characteristics of inverters, logic gates and ring oscillators composed of NSFETs and CFETs are further investigated. Compared with NSFETs, logic gates and ring oscillators composed of CFETs are more seriously affected by self-heating and should be given extra attention. The thermal network model proposed in this paper can be further used to study the thermal optimization strategy of devices and circuits to enhance the electrical performance, achieving the design technology co-optimizations (DTCO).
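A common way to express self-heating and crosstalk together is a steady-state thermal resistance matrix, where each device's temperature rise is a weighted sum of the power dissipated by every device. The sketch below illustrates that generic formulation. The matrix entries, units, and three-device layout are hypothetical, and the paper's actual network topology is not reproduced here.

```python
import numpy as np

# Hypothetical thermal resistance matrix in K/mW for three neighboring devices.
# Diagonal entries model self-heating; off-diagonal entries model crosstalk.
R_th = np.array([
    [2.0, 0.5, 0.1],
    [0.5, 2.0, 0.5],
    [0.1, 0.5, 2.0],
])
power_mw = np.array([1.0, 1.0, 0.0])  # dissipated power per device (hypothetical)

delta_t = R_th @ power_mw                # total temperature rise per device
self_heating = np.diag(R_th) * power_mw  # rise caused by the device itself
crosstalk = delta_t - self_heating       # rise caused by neighboring devices
```

In this toy case the idle third device still warms by 0.6 K, entirely through crosstalk from its active neighbors.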
Authors:Ozan Baris Mulayim, Pengrui Quan, Liying Han, Xiaomin Ouyang, Dezhi Hong, Mario Bergés, Mani Srivastava
Abstract:
Building energy management (BEM) tasks require processing and learning from a variety of time-series data. Existing solutions rely on bespoke task- and data-specific models to perform these tasks, limiting their broader applicability. Inspired by the transformative success of Large Language Models (LLMs), Time-Series Foundation Models (TSFMs), trained on diverse datasets, have the potential to change this. Were TSFMs to achieve a level of generalizability across tasks and contexts akin to LLMs, they could fundamentally address the scalability challenges pervasive in BEM. To understand where they stand today, we evaluate TSFMs across four dimensions: (1) generalizability in zero-shot univariate forecasting, (2) forecasting with covariates for thermal behavior modeling, (3) zero-shot representation learning for classification tasks, and (4) robustness to performance metrics and varying operational conditions. Our results reveal that TSFMs exhibit limited generalizability, performing only marginally better than statistical models on unseen datasets and modalities for univariate forecasting. Similarly, inclusion of covariates in TSFMs does not yield performance improvements, and their performance remains inferior to conventional models that utilize covariates. While TSFMs generate effective zero-shot representations for downstream classification tasks, they may remain inferior to statistical models in forecasting when statistical models perform test-time fitting. Moreover, TSFMs' forecasting performance is sensitive to evaluation metrics, and they struggle in more complex building environments compared to statistical models. These findings underscore the need for targeted advancements in TSFM design, particularly their handling of covariates and incorporating context and temporal dynamics into prediction mechanisms, to develop more adaptable and scalable solutions for BEM.
Authors:Johannes van Randenborgh, Moritz Schulze Darup
Abstract:
Aquifer thermal energy storages (ATES) are groundwater-saturated aquifers that store thermal energy in the form of heated or cooled groundwater. Combining two ATES, one can harness excess thermal energy from summer (heat) and winter (cold) to support a building's heating, ventilation, and air conditioning (HVAC) technology. In general, a dynamic operation of ATES throughout the year is beneficial to avoid using fossil fuel-based HVAC technology and maximize the "green use" of ATES. Model predictive control (MPC) with an appropriate system model may become a crucial control approach for ATES systems. Consequently, the MPC model should reflect spatial temperature profiles around ATES' boreholes to predict extracted groundwater temperatures accurately. However, meaningful predictions require the estimation of the current state of the system, as measurements are usually only available at the borehole of the ATES. In control, this is often realized by model-based observers. Still, observing the state of an ATES system is non-trivial, since the model is typically hybrid. We show how to exploit the specific structure of the hybrid ATES model and design an easy-to-solve moving horizon estimator based on a quadratic program.
Authors:Yorick Estievenart, Sukanya Patra, Souhaib Ben Taieb
Abstract:
Efficient and reliable operation of Concentrated Solar Power (CSP) plants is essential for meeting the growing demand for sustainable energy. However, high-temperature solar receivers face severe operational risks, such as freezing, deformation, and corrosion, resulting in costly downtime and maintenance. To monitor CSP plants, cameras mounted on solar receivers record infrared images at irregular intervals ranging from one to five minutes throughout the day. Anomalous images can be detected by thresholding an anomaly score, where the threshold is chosen to optimize metrics such as the F1-score on a validation set. First, this work proposes a framework, using risk control, for generating more reliable decision thresholds with finite-sample coverage guarantees on any chosen risk function. Our framework also incorporates an abstention mechanism, allowing high-risk predictions to be deferred to domain experts. Second, we propose a density forecasting method to estimate the likelihood of an observed image given a sequence of previously observed images, using this likelihood as its anomaly score. Third, we analyze the deployment results of our framework across multiple training scenarios over several months for two CSP plants. This analysis provides valuable insights to our industry partner for optimizing maintenance operations. Finally, given the confidential nature of our dataset, we provide an extended simulated dataset, leveraging recent advancements in generative modeling to create diverse thermal images that simulate multiple CSP plants. Our code is publicly available.
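The baseline thresholding step mentioned in this abstract (choosing the anomaly-score threshold that maximizes F1 on a validation set) can be sketched as follows. This is only an illustration of that baseline, not the paper's risk-control framework; the function names and the toy scores and labels are assumptions made for the example.

```python
import numpy as np

def f1_score(labels, preds):
    """F1 for binary labels, where 1 marks an anomalous image."""
    tp = np.sum((preds == 1) & (labels == 1))
    fp = np.sum((preds == 1) & (labels == 0))
    fn = np.sum((preds == 0) & (labels == 1))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def select_threshold(scores, labels):
    """Try every observed score as a threshold and keep the best F1."""
    best_t, best_f1 = None, -1.0
    for t in np.unique(scores):
        preds = (scores >= t).astype(int)  # flag scores at or above t
        f1 = f1_score(labels, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Hypothetical validation data: anomaly scores and ground-truth labels.
val_scores = np.array([0.1, 0.2, 0.35, 0.4, 0.8, 0.9])
val_labels = np.array([0, 0, 0, 1, 1, 1])
threshold, best_f1 = select_threshold(val_scores, val_labels)
```

A threshold chosen this way carries no finite-sample guarantee, which is the gap the abstract says the risk-control framework targets.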
Authors:Kun Yang, Yuxiang Liu, Zeyu Cui, Yu Liu, Maojun Zhang, Shen Yan, Qing Wang
Abstract:
Thermal infrared imaging offers the advantage of all-weather capability, enabling non-intrusive measurement of an object's surface temperature. Consequently, thermal infrared images are employed to reconstruct 3D models that accurately reflect the temperature distribution of a scene, aiding in applications such as building monitoring and energy management. However, existing approaches predominantly focus on static 3D reconstruction for a single time period, overlooking the impact of environmental factors on thermal radiation and failing to predict or analyze temperature variations over time. To address these challenges, we propose the NTR-Gaussian method, which treats temperature as a form of thermal radiation, incorporating elements like convective heat transfer and radiative heat dissipation. Our approach utilizes neural networks to predict thermodynamic parameters such as emissivity, convective heat transfer coefficient, and heat capacity. By integrating these predictions, we can accurately forecast thermal temperatures at various times throughout a nighttime scene. Furthermore, we introduce a dynamic dataset specifically for nighttime thermal imagery. Extensive experiments and evaluations demonstrate that NTR-Gaussian significantly outperforms comparison methods in thermal reconstruction, achieving a predicted temperature error within 1 degree Celsius.
Authors:Mohammad Pivezhandi, Abusayeed Saifullah, Prashant Modekurthy
Abstract:
Optimizing task-to-core allocation can substantially reduce power consumption in multi-core platforms without degrading user experience. However, many existing approaches overlook critical factors such as parallelism, compute intensity, and heterogeneous core types. In this paper, we introduce a statistical learning approach for feature selection that identifies the most influential features - such as core type, speed, temperature, and application-level parallelism or memory intensity - for accurate environment modeling and efficient energy optimization. Our experiments, conducted with state-of-the-art Linux governors and thermal modeling techniques, show that correlation-aware task-to-core allocation lowers energy consumption by up to 10% and reduces core temperature by up to 5 degrees Celsius compared to random core selection. Furthermore, our compressed, bootstrapped regression model improves thermal prediction accuracy by 6% while cutting model parameters by 16%, yielding an overall mean square error reduction of 61.6% relative to existing approaches. We provided results based on superscalar Intel Core i7 12th Gen processors with 14 cores, but validated our method across a diverse set of hardware platforms and effectively balanced performance, power, and thermal demands through statistical feature evaluation.
Authors:Xiang Fu, Brandon M. Wood, Luis Barroso-Luque, Daniel S. Levine, Meng Gao, Misko Dzamba, C. Lawrence Zitnick
Abstract:
Machine learning interatomic potentials (MLIPs) have become increasingly effective at approximating quantum mechanical calculations at a fraction of the computational cost. However, lower errors on held-out test sets do not always translate to improved results on downstream physical property prediction tasks. In this paper, we propose testing MLIPs on their practical ability to conserve energy during molecular dynamics simulations. For models that pass this test, improved correlations are found between test errors and performance on physical property prediction tasks. We identify choices which may lead to models failing this test, and use these observations to improve upon highly-expressive models. The resulting model, eSEN, provides state-of-the-art results on a range of physical property prediction tasks, including materials stability prediction, thermal conductivity prediction, and phonon calculations.
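An energy-conservation check of the kind this abstract describes can be illustrated with a toy constant-energy simulation. The sketch below integrates a 1D harmonic oscillator with velocity Verlet and reports the relative energy drift. The harmonic force stands in for an MLIP, and every name and parameter value here is an assumption made for illustration, not the paper's protocol.

```python
import numpy as np

def force(x, k=1.0):
    """Conservative force derived from the potential 0.5 * k * x**2."""
    return -k * x

def total_energy(x, v, k=1.0, m=1.0):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * m * v**2 + 0.5 * k * x**2

def run_nve(x0, v0, dt, n_steps, m=1.0):
    """Velocity Verlet integration; returns the energy at every step."""
    x, v = x0, v0
    f = force(x)
    energies = [total_energy(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / m   # half-step velocity update
        x = x + dt * v_half             # full-step position update
        f = force(x)                    # force at the new position
        v = v_half + 0.5 * dt * f / m   # second half-step velocity update
        energies.append(total_energy(x, v))
    return np.array(energies)

energies = run_nve(x0=1.0, v0=0.0, dt=0.01, n_steps=10000)
# Largest relative deviation from the initial energy over the trajectory.
drift = np.max(np.abs(energies - energies[0])) / energies[0]
```

With a force that is the exact gradient of an energy, the drift stays small and bounded; a model whose predicted forces are not consistent with its energy would typically show a growing drift in the same test.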
Authors:Qinghao Zhang, Wenrui Li, Pinjia Zhang
Abstract:
The thermal sensitive electrical parameter (TSEP) method is crucial for enhancing the reliability of power devices through junction temperature monitoring. The TSEP method comprises three key processes: calibration, regression, and application. While significant efforts have been devoted to improving regression algorithms and increasing TSEP sensitivity to enhance junction temperature monitoring accuracy, these approaches have reached a bottleneck. In reality, the calibration method significantly influences monitoring accuracy, an aspect often overlooked in conventional TSEP methods. To address this issue, we propose a high-accuracy calibration method for transient TSEPs. First, a temperature compensation strategy based on thermal analysis is introduced to mitigate the temperature difference caused by load current during dual pulse tests. Second, the impact of stray parameters is analyzed to identify coupled parameters, which are typically neglected in existing methods. Third, it is observed that random errors follow a logarithm Gaussian distribution, covering a hidden variable. A neural network is used to obtain the junction temperature predictive model. The proposed calibration method is experimentally validated using threshold voltage as an example. Compared with conventional calibration methods, the mean absolute error is reduced by over 30%. Moreover, this method does not require additional hardware cost and generalizes well.
Authors:Armin Gooran-Shoorakchaly, Sarah Sharif, Yaser Banad
Abstract:
Achieving reliable resistive switching in oxide-based memristive devices requires precise control over conductive filament (CF) formation and behavior, yet the fundamental relationship between oxide material properties and switching uniformity remains incompletely understood. Here, we develop a comprehensive physical model to investigate how electrical and thermal conductivities influence CF dynamics in TaOx-based memristors. Our simulations reveal that higher electrical conductivity promotes oxygen vacancy generation and reduces forming voltage, while higher thermal conductivity enhances heat dissipation, leading to increased forming voltage. The uniformity of resistive switching is strongly dependent on the interplay between these transport properties. We identify two distinct pathways for achieving optimal High Resistance State (HRS) uniformity with standard deviation-to-mean ratios as low as 0.045, each governed by different balances of electrical and thermal transport mechanisms. For the Low Resistance State (LRS), high uniformity (0.009) can be maintained when either electrical or thermal conductivity is low. The resistance ratio between HRS and LRS shows a strong dependence on these conductivities, with higher ratios observed at lower conductivity values. These findings provide essential guidelines for material selection in RRAM devices, particularly for applications demanding high reliability and uniform switching characteristics.
Authors:Zaid Abulawi, Rui Hu, Prasanna Balaprakash, Yang Liu
Abstract:
Accurate predictions and uncertainty quantification (UQ) are essential for decision-making in risk-sensitive fields such as system safety modeling. Deep ensembles (DEs) are efficient and scalable methods for UQ in Deep Neural Networks (DNNs); however, their performance is limited when constructed by simply retraining the same DNN multiple times with randomly sampled initializations. To overcome this limitation, we propose a novel method that combines Bayesian optimization (BO) with DE, referred to as BODE, to enhance both predictive accuracy and UQ.
We apply BODE to a case study involving a Densely connected Convolutional Neural Network (DCNN) trained on computational fluid dynamics (CFD) data to predict eddy viscosity in sodium fast reactor thermal stratification modeling. Compared to a manually tuned baseline ensemble, BODE estimates total uncertainty approximately four times lower in a noise-free environment, primarily due to the baseline's overestimation of aleatoric uncertainty. Specifically, BODE estimates aleatoric uncertainty close to zero, while aleatoric uncertainty dominates the total uncertainty in the baseline ensemble. We also observe a reduction of more than 30% in epistemic uncertainty. When Gaussian noise with standard deviations of 5% and 10% is introduced into the data, BODE accurately fits the data and estimates uncertainty that aligns with the data noise. These results demonstrate that BODE effectively reduces uncertainty and enhances predictions in data-driven models, making it a flexible approach for various applications requiring accurate predictions and robust UQ.
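The aleatoric and epistemic terms discussed in this abstract are commonly separated in a deep ensemble by treating each member as predicting a Gaussian mean and variance. The sketch below shows that standard decomposition. It is an assumption that the paper uses this exact form, and the member outputs are made-up numbers.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split ensemble uncertainty into aleatoric and epistemic parts.

    means, variances: arrays of shape (n_members, n_points) holding each
    member's predicted mean and predicted variance.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    aleatoric = variances.mean(axis=0)  # average predicted noise variance
    epistemic = means.var(axis=0)       # disagreement between member means
    return aleatoric, epistemic, aleatoric + epistemic

# Hypothetical outputs from three ensemble members at two query points.
member_means = [[1.0, 2.0], [1.2, 2.0], [0.8, 2.0]]
member_vars = [[0.1, 0.0], [0.1, 0.0], [0.1, 0.0]]
aleatoric, epistemic, total = decompose_uncertainty(member_means, member_vars)
```

At the second query point the members agree and predict zero noise, so both terms vanish, which mirrors the near-zero aleatoric estimate the abstract reports for noise-free data.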
Authors:Stefan Henneking, Jacob Grosek, Leszek Demkowicz
Abstract:
This article presents an ultraweak discontinuous Petrov-Galerkin (DPG) formulation of the time-harmonic Maxwell equations for the vectorial envelope of the electromagnetic field in a weakly-guiding multi-mode fiber waveguide. This formulation is derived using an envelope ansatz for the vector-valued electric and magnetic field components, factoring out an oscillatory term of $\exp(-i \mathsf{k}z)$ with a user-defined wavenumber $\mathsf{k}$, where $z$ is the longitudinal fiber axis and field propagation direction. The resulting formulation is a modified system of the time-harmonic Maxwell equations for the vectorial envelope of the propagating field. This envelope is less oscillatory in the $z$-direction than the original field, so that it can be more efficiently discretized and computed, enabling solution of the vectorial DPG Maxwell system for $1000\times$ longer fibers than previously possible. Different approaches for incorporating a perfectly matched layer for absorbing the outgoing wave modes at the fiber end are derived and compared numerically. The resulting formulation is used to solve a 3D Maxwell model of an ytterbium-doped active gain fiber amplifier, coupled with the heat equation for including thermal effects. The nonlinear model is then used to simulate thermally-induced transverse mode instability (TMI). The numerical experiments demonstrate that it is computationally feasible to perform simulations and analysis of real-length optical fiber laser amplifiers using discretizations of the full vectorial time-harmonic Maxwell equations. The approach promises a new high-fidelity methodology for analyzing TMI in high-power fiber laser systems and is extendable to including other nonlinearities.
Authors:Gargi Panda, Soumitra Kundu, Saumik Bhattacharya, Aurobinda Routray
Abstract:
Multi-modal image fusion (MMIF) enhances the information content of the fused image by combining the unique as well as common features obtained from different modality sensor images, improving visualization, object detection, and many more tasks. In this work, we introduce an interpretable network for the MMIF task, named FNet, based on an l0-regularized multi-modal convolutional sparse coding (MCSC) model. Specifically, for solving the l0-regularized CSC problem, we develop an algorithm unrolling-based l0-regularized sparse coding (LZSC) block. Given different modality source images, FNet first separates the unique and common features from them using the LZSC block and then these features are combined to generate the final fused image. Additionally, we propose an l0-regularized MCSC model for the inverse fusion process. Based on this model, we introduce an interpretable inverse fusion network named IFNet, which is utilized during FNet's training. Extensive experiments show that FNet achieves high-quality fusion results across five different MMIF tasks. Furthermore, we show that FNet enhances downstream object detection in visible-thermal image pairs. We have also visualized the intermediate results of FNet, which demonstrates the good interpretability of our network.
Authors:S. A. N. Nouwens, M. M. Paulides, W. P. M. H. Heemels
Abstract:
Optimization-based controllers, such as Model Predictive Control (MPC), have attracted significant research interest due to their intuitive concept, constraint handling capabilities, and natural application to multi-input multi-output systems. However, the computational complexity of solving a receding horizon problem at each time step remains a challenge for the deployment of MPC. This is particularly the case for systems constrained by many inequalities. Recently, we introduced the concept of constraint-adaptive MPC (ca-MPC) to address this challenge for linear systems with hard constraints. In ca-MPC, at each time step, a subset of the constraints is removed from the optimization problem, thereby accelerating the optimization procedure, while resulting in identical closed-loop behavior. The present paper extends this framework to soft-constrained MPC by detecting and removing constraints based on sub-optimal predicted input sequences, which is rather easy for soft-constrained MPC due to the receding horizon principle and the inclusion of slack variables. We translate these new ideas explicitly to an offset-free output tracking problem. The effectiveness of these ideas is demonstrated on a two-dimensional thermal transport model, showing an improvement of three orders of magnitude in the online computational time of the MPC scheme.
Authors:Luca Fehlings, Md Hanif Ali, Paolo Gibertini, Egidio A. Gallicchio, Udayan Ganguly, Veeresh Deshpande, Erika Covi
Abstract:
The growing use of ferroelectric-based technology, extending beyond conventional memory storage applications, necessitates the development of compact models that can be easily integrated into circuit simulation environments. These models assist circuit designers in the design and the early assessment of the performance of their systems. The Heracles model is a physics-based compact model for circuit simulations in a SPICE environment for HfO2-based ferroelectric capacitors (FeCaps). The model has been calibrated based on experimental data obtained from HfO2-based FeCaps. A thermal model with an accurate description of the device parasitics is included to derive precise device characteristics based on first principles. The incorporation of statistical device data enables Monte Carlo analysis based on realistic distributions, thereby rendering the model particularly well-suited for design-technology co-optimization (DTCO). The model's efficacy is further demonstrated in circuit simulations using an integrated circuit with current programming, wherein partial switching of the ferroelectric polarization is observed. Finally, the model was benchmarked in an array simulation, reaching convergence in 1.8 s with an array size of 100 kb.
Authors:Bach Do, Sina Jafari Ghalekohneh, Taiwo Adebiyi, Bo Zhao, Ruda Zhang
Abstract:
Nonreciprocal thermal emitters that break Kirchhoff's law of thermal radiation promise exciting thermal and energy applications. The design of the bandwidth and angular range of the nonreciprocal effect, which directly affects the performance of nonreciprocal emitters, typically relies on physical intuition. In this study, we present a general numerical approach to maximize the nonreciprocal effect. We choose doped magneto-optic materials and magnetic Weyl semimetal materials as model materials and focus on pattern-free multilayer structures. The optimization starts from a randomly chosen, less effective structure and incrementally improves the broadband nonreciprocity through the combination of Bayesian optimization and reparameterization. Optimization results show that the proposed approach can discover structures that achieve broadband nonreciprocal emission at wavelengths from 5 to 40 micrometers using only a few layers, significantly outperforming current state-of-the-art designs based on intuition in terms of both performance and simplicity.
Authors:Prakash Thakolkaran, Yiwen Zheng, Yaqi Guo, Aniruddh Vashisth, Siddhant Kumar
Abstract:
The thermal conductivity of covalent organic frameworks (COFs), an emerging class of nanoporous polymeric materials, is crucial for many applications, yet the link between their structure and thermal properties remains poorly understood. Analysis of a dataset containing over 2,400 COFs reveals that conventional features such as density, pore size, void fraction, and surface area do not reliably predict thermal conductivity. To address this, an attention-based machine learning model was trained, accurately predicting thermal conductivities even for structures outside the training set. The attention mechanism was then utilized to investigate the model's success. The analysis identified dangling molecular branches as a key predictor of thermal conductivity, a discovery supported by feature importance assessments conducted on regression models. These findings indicate that COFs with dangling functional groups exhibit lower thermal transfer capabilities. Molecular dynamics simulations support this observation, revealing significant mismatches in the vibrational density of states due to the presence of dangling branches.
Authors:Amir Farzin Nikkhah, Dong Chen, Bradford Campbell, Somayeh Asadi, Arsalan Heydarian
Abstract:
Unmanned Aerial Vehicles (UAVs) are transforming infrastructure inspections in the Architecture, Engineering, Construction, and Facility Management (AEC+FM) domain. By synthesizing insights from over 150 studies, this review paper highlights UAV-based methodologies for data acquisition, photogrammetric modeling, defect detection, and decision-making support. Key innovations include path optimization, thermal integration, and advanced machine learning (ML) models such as YOLO and Faster R-CNN for anomaly detection. UAVs have demonstrated value in structural health monitoring (SHM), disaster response, urban infrastructure management, energy efficiency evaluations, and cultural heritage preservation. Despite these advancements, challenges in real-time processing, multimodal data fusion, and generalizability remain. A proposed workflow framework, informed by literature and a case study, integrates RGB imagery, LiDAR, and thermal sensing with transformer-based architectures to improve accuracy and reliability in detecting structural defects, thermal anomalies, and geometric inconsistencies. The proposed framework ensures precise and actionable insights by fusing multimodal data and dynamically adapting path planning for complex environments, presented as a comprehensive step-by-step guide to address these challenges effectively. This paper concludes with future research directions emphasizing lightweight AI models, adaptive flight planning, synthetic datasets, and richer modality fusion to streamline modern infrastructure inspections.
Authors:Teddy Koker, Abhijeet Gangan, Mit Kotak, Jaime Marian, Tess Smidt
Abstract:
Many materials properties depend on higher-order derivatives of the potential energy surface, yet machine learned interatomic potentials (MLIPs) trained with a standard loss on energy, force, and stress errors can exhibit error in curvature, degrading the prediction of vibrational properties. We introduce phonon fine-tuning (PFT), which directly supervises second-order force constants of materials by matching MLIP energy Hessians to DFT-computed force constants from finite displacement phonon calculations. To scale to large supercells, PFT stochastically samples Hessian columns and computes the loss with a single Hessian-vector product. We also use a simple co-training scheme to incorporate upstream data to mitigate catastrophic forgetting. On the MDR Phonon benchmark, PFT improves Nequix MP by 55% on average across phonon thermodynamic properties and achieves state-of-the-art accuracy among models trained on Materials Project trajectories. PFT also generalizes to improve properties beyond second-derivatives, improving thermal conductivity predictions that rely on third-order derivatives of the potential energy.
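The column-sampling idea in this abstract, where one Hessian-vector product with a unit vector yields one Hessian column, can be sketched on a toy quadratic energy. Central finite differences of the gradient stand in for automatic differentiation here, and the matrix `A`, the function names, and the mean-squared-error loss are all assumptions made for illustration rather than the paper's implementation.

```python
import numpy as np

# Toy "model": quadratic energy E(x) = 0.5 * x^T A x, so the Hessian is A.
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

def grad_energy(x):
    """Gradient of the toy energy (the negative of the forces)."""
    return A @ x

def hvp(grad_fn, x, v, eps=1e-4):
    """Hessian-vector product H @ v via a central difference of the gradient."""
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2 * eps)

def sampled_column_loss(grad_fn, x, reference_hessian, rng):
    """Sample one column index and compare that column with the reference."""
    j = int(rng.integers(reference_hessian.shape[0]))
    e_j = np.zeros_like(x)
    e_j[j] = 1.0                     # unit vector selects column j
    column = hvp(grad_fn, x, e_j)    # one HVP gives one Hessian column
    loss = np.mean((column - reference_hessian[:, j]) ** 2)
    return j, loss

rng = np.random.default_rng(0)
x_eq = np.zeros(3)  # hypothetical equilibrium configuration
j, loss = sampled_column_loss(grad_energy, x_eq, A, rng)
```

Sampling a column per step avoids forming the full Hessian, whose size grows quadratically with the number of atomic coordinates.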
Authors:Sahan Sanjaya, Aruna Jayasena, Prabhat Mishra
Abstract:
Side-channel attacks try to extract secret information from a system by analyzing different side-channel signatures, such as power consumption, electromagnetic emanation, thermal dissipation, acoustics, time, etc. Power-based side-channel attacks, which rely on data-dependent power variations in a system to extract sensitive information, are among the most prominent side-channel attacks in cybersecurity. While there are related surveys, they primarily focus on power side-channel attacks on cryptographic implementations. In recent years, power side-channel attacks have been explored in diverse application domains, including key extraction from cryptographic implementations, reverse engineering of machine learning models, user behavior data exploitation, and instruction-level disassembly. In this paper, we provide a comprehensive survey of power side-channel attacks and their countermeasures in different application domains. Specifically, this survey aims to classify recent power side-channel attacks and provide a comprehensive comparison based on application-specific considerations.
Authors:O. Tansel Baydas, Ozgur B. Akan
Abstract:
The convergence of Terahertz (THz) communications and Federated Learning (FL) promises ultra-fast distributed learning, yet the impact of realistic wideband impairments on optimization dynamics remains theoretically uncharacterized. This paper bridges this gap by developing a multicarrier stochastic framework that explicitly couples local gradient updates with frequency-selective THz effects, including beam squint, molecular absorption, and jitter. Our analysis uncovers a critical diversity trap: under standard unbiased aggregation, the convergence error floor is driven by the harmonic mean of subcarrier SNRs. Consequently, a single spectral hole caused by severe beam squint can render the entire bandwidth useless for reliable model updates. We further identify a fundamental bandwidth limit, revealing that expanding the spectrum beyond a critical point degrades convergence due to the integration of thermal noise and gain collapse at band edges. Finally, we demonstrate that an SNR-weighted aggregation strategy is necessary to suppress the variance singularity at these spectral holes, effectively recovering convergence in high-squint regimes where standard averaging fails. Numerical results validate the expected impact of the discussed physical layer parameters on the performance of THz-FL systems.
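The claim that a single spectral hole can dominate the harmonic mean of subcarrier SNRs is easy to check numerically. The sketch below uses made-up linear-scale SNR values for eight subcarriers; it illustrates only the arithmetic, not the paper's convergence analysis.

```python
import numpy as np

def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of reciprocals."""
    values = np.asarray(values, dtype=float)
    return len(values) / np.sum(1.0 / values)

# Hypothetical linear-scale SNRs on 8 subcarriers.
snr_flat = np.full(8, 100.0)   # all subcarriers healthy
snr_hole = snr_flat.copy()
snr_hole[3] = 0.01             # one subcarrier collapses (spectral hole)

hm_flat = harmonic_mean(snr_flat)   # equals the common value
hm_hole = harmonic_mean(snr_hole)   # pulled down toward the hole
am_hole = float(np.mean(snr_hole))  # arithmetic mean barely moves
```

The reciprocal of the weak subcarrier dominates the sum, so the harmonic mean falls by roughly three orders of magnitude while the arithmetic mean stays near 87.5.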
Authors:Qionglin Ren, Dawei Zhang, Chunxu Tian, Dan Zhang
Abstract:
Research in Anti-UAV (Unmanned Aerial Vehicle) tracking has explored various modalities, including RGB, TIR, and RGB-T fusion. However, a unified framework for cross-modal collaboration is still lacking. Existing approaches have primarily focused on independent models for individual tasks, often overlooking the potential for cross-modal information sharing. Furthermore, Anti-UAV tracking techniques are still in their infancy, with current solutions struggling to achieve effective multimodal data fusion. To address these challenges, we propose UAUTrack, a unified single-target tracking framework built upon a single-stream, single-stage, end-to-end architecture that effectively integrates multiple modalities. UAUTrack introduces a key component: a text prior prompt strategy that directs the model to focus on UAVs across various scenarios. Experimental results show that UAUTrack achieves state-of-the-art performance on the Anti-UAV and DUT Anti-UAV datasets, and maintains a favourable trade-off between accuracy and speed on the Anti-UAV410 dataset, demonstrating both high accuracy and practical efficiency across diverse Anti-UAV scenarios.
Authors:Soumyadeep Chandra, Sayeed Shafayet Chowdhury, Kaushik Roy
Abstract:
Thermal analysis is increasingly critical in modern integrated circuits, where non-uniform power dissipation and high transistor densities can cause rapid temperature spikes and reliability concerns. Traditional methods, such as FEM-based simulations, offer high accuracy but are computationally prohibitive for early-stage design, often requiring multiple iterative redesign cycles to resolve late-stage thermal failures. To address these challenges, we propose 'ThermAl', a physics-informed generative AI framework which effectively identifies heat sources and estimates full-chip transient and steady-state thermal distributions directly from input activity profiles. ThermAl employs a hybrid U-Net architecture enhanced with positional encoding and a Boltzmann regularizer to maintain physical fidelity. Our model is trained on an extensive dataset of heat dissipation maps, ranging from simple logic gates (e.g., inverters, NAND, XOR) to complex designs, generated via COMSOL. Experimental results demonstrate that ThermAl delivers precise temperature mappings for large circuits, with a root mean squared error (RMSE) of only 0.71°C, and outperforms conventional FEM tools by running up to ~200 times faster. We analyze performance across diverse layouts and workloads, and discuss its applicability to large-scale EDA workflows. While thermal reliability assessments often extend beyond 85°C for post-layout signoff, our focus here is on early-stage hotspot detection and thermal pattern learning. To ensure generalization beyond the nominal operating range of 25-55°C, we additionally performed cross-validation on an extended dataset spanning 25-95°C, maintaining high accuracy (<2.2% full-scale RMSE) even under elevated temperature conditions representative of peak power and stress scenarios.
Authors:Yukta Pareek, Khadija Omar Said, Satadru Dey, Ashish Ranjan Kumar
Abstract:
Underground mining operations are actively exploring the use of large-format lithium-ion batteries (LIBs) to power their equipment. LIBs have high energy density, long cycle life, and a favorable safety record. They also have low noise, heat, and emission footprints, which fosters a conducive workplace environment for underground mining personnel. However, many occurrences of LIB failure have resulted in dangerous situations in underground mines. The combustion products, including toxic emissions, can rapidly travel throughout the mine via the ventilation network. Therefore, it is critical to monitor the temperature and smoke concentration underground at all times to ensure the safety of the miners. High-fidelity models can be developed for specific scenarios of LIB failure, but are computationally prohibitive for large underground mine volumes, complex geometries, and long-duration combustion events. To mitigate computation-related issues associated with high-fidelity models, we developed cyber-physical systems (CPS) models to examine temperature and smoke dynamics. The mine supervisory control center, acting as the cyber framework, operates in conjunction with the physical underground mine. The CPS models, trained on high-fidelity computational fluid dynamics (CFD) model data sets, present an exceptional estimate of the evolution of temperature and smoke concentration in the underground mine tunnel. Once implemented, the research results can help mine operators make informed decisions during emergencies.
Authors:Abishek Karthik, Sreya Mynampati, Pandiyaraju V
Abstract:
Solar energy is one of the most abundant and widely tapped sources of renewable energy, with enormous future potential. Solar panel output can vary widely with factors such as light intensity, temperature, dirt, and debris. We have implemented a model for detecting dust and faults on solar panels. These two applications are combined in a single platform that can be used for routine maintenance and other checks. Panels are checked against various parameters such as power output, sinusoidal wave (I-V component of solar cell), voltage across each solar cell, and others. First, we filter and preprocess the obtained images using gamma removal and Gaussian filtering, alongside predefined steps such as normalization. The first application detects whether a solar cell is dusty based on various predetermined factors such as shadowing, leaves, droppings, air pollution, and other human activities, down to the level of fine-grained solar modules. The second detects faults on solar panels, such as cracks and cell malfunctions, using thermal imaging. This centralized platform can be vital because solar panel efficiency differs across geographies (owing to air and heat effects), and it can serve uses ranging from small-scale household installations to the upkeep of large-scale solar farms. It incorporates CNN and ResNet models along with a self-attention-based KerNet model, which are used for classification and result in a fine-tuned system that detects dust or any fault that occurs. Thus, this multi-application model proves to be efficient and optimized in detecting dust and faults on solar panels. Our comparisons demonstrate that our model achieves better overall efficiency and accuracy than existing models.
Authors:Leonhard Duda, Khadijeh Alibabaei, Elena Vollmer, Leon Klug, Valentin Kozlov, Lisana Berberi, Mishal Benz, Rebekka Volk, Juan Pedro Gutiérrez Hermosillo Muriedas, Markus Götz, Judith Sáínz-Pardo Díaz, Álvaro López García, Frank Schultmann, Achim Streit
Abstract:
Federated Learning (FL) is an approach for training a shared Machine Learning (ML) model with distributed training data and multiple participants. FL allows bypassing limitations of the traditional Centralized Machine Learning (CL) if data cannot be shared or stored centrally due to privacy or technical restrictions -- the participants train the model locally with their training data and do not need to share it among the other participants. This paper investigates the practical implementation and effectiveness of FL in a real-world scenario, specifically focusing on unmanned aerial vehicle (UAV)-based thermal images for common thermal feature detection in urban environments. The distributed nature of the data arises naturally and makes it suitable for FL applications, as images captured in two German cities are available. This application presents unique challenges due to non-identical distribution and feature characteristics of data captured at both locations. The study makes several key contributions by evaluating FL algorithms in real deployment scenarios rather than simulation. We compare several FL approaches with a centralized learning baseline across key performance metrics such as model accuracy, training time, communication overhead, and energy usage. This paper also explores various FL workflows, comparing client-controlled workflows and server-controlled workflows. The findings of this work serve as a valuable reference for understanding the practical application and limitations of the FL methods in segmentation tasks in UAV-based imaging.
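As a rough illustration of the server-side aggregation step in FL, the sketch below averages client parameters weighted by local dataset size (the common FedAvg rule). The abstract does not name the algorithms that were compared, so FedAvg, the function name `fedavg`, and the toy parameter vectors are assumptions made only for this example.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style aggregation).

    client_weights: list of NumPy arrays, one per client, all of the same shape.
    client_sizes:   number of local training samples held by each client.
    """
    total = float(sum(client_sizes))
    # Each client contributes in proportion to the amount of data it trained on.
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))


if __name__ == "__main__":
    # Hypothetical parameters from two clients (e.g. two cities) after local training.
    city_a = np.array([1.0, 1.0])
    city_b = np.array([3.0, 3.0])
    print(fedavg([city_a, city_b], [1, 3]))
```

Only model parameters are exchanged in this scheme; the raw images stay with each participant, which is the property the abstract emphasizes.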
Authors:Khadija Omar Said, Yukta Pareek, Satadru Dey, Ashish Ranjan Kumar
Abstract:
Large-format lithium-ion batteries (LIBs) provide effective energy storage solutions for high-power equipment used in underground mining operations. They have high Coulombic efficiency and minimal heat and emission footprints. However, improper use of LIBs, accidents, or other factors may increase the probability of thermal runaway (TR), a rapid combustion reaction that discharges toxic and flammable substances. Several such incidents have been documented in mines. Since repeatable TR experiments to uncover the transient-state propagation of TR are expensive and hazardous, high-fidelity models are usually developed to mimic the impact of these events. They are resource-intensive and are impractical to develop for many scenarios that could be observed in a mine. Therefore, dynamic models within a reduced-order framework were constructed to represent the transient-state combustion event. Reduced order models (ROMs) reasonably replicate trends in temperature and smoke, showing strong alignment with the ground-truth dataset.
Authors:Zhiyuan Fan, Bolun Xu
Abstract:
The dual challenge of decarbonizing the economy and meeting rising global energy demand underscores the need for scalable and cost-effective carbon dioxide removal technologies. Direct air capture (DAC) is among the most promising approaches, but its high energy intensity, particularly the thermal energy required for sorbent regeneration, remains a critical barrier to cost reduction and sustainable deployment. This study explores solar-thermal DAC systems that combine concentrated solar thermal technology with low-cost sand-based thermal energy storage to meet this demand. We analyze the techno-economic performance of such systems in both grid-connected and stand-alone configurations. Results show that solar-thermal DAC can achieve annual capacity factors exceeding 80% and CO2 removal costs as low as 160-200 USD per ton, making it competitive with leading DAC technologies. The proposed system operates most efficiently with short-cycle sorbents that align with solar availability. The stand-alone Solar-DAC systems, which rely solely on solar energy for both electricity and thermal energy, are particularly promising in regions with high solar capacity and sandy terrain, exhibiting minimal sensitivity to ambient temperature and humidity. An optimal 6000 ton/yr modular system design requires less than 1 km2 of land, and a potential DAC capacity of more than 26 Gt/year is identified globally for sandy terrain alone. In areas with sedimentary basins suitable for CO2 storage, solar-powered DAC offers a lower-cost alternative to geothermal heating, which often faces geological and economic constraints.
Authors:Yukta Pareek, Abdul Malik Al Mardhouf Al Saadi, Amrita Basak, Satadru Dey
Abstract:
Laser Powder Bed Fusion (L-PBF) is a widely adopted additive manufacturing process for fabricating complex metallic parts layer by layer. Effective thermal management is essential to ensure part quality and structural integrity, as thermal gradients and residual stresses can lead to defects such as warping and cracking. However, existing experimental or computational techniques lack the ability to forecast future temperature distributions in real time, an essential capability for proactive process control. This paper presents a real-time thermal state forecasting framework for L-PBF, based on a physics-informed reduced-order thermal model integrated with a Kalman filtering scheme. The proposed approach efficiently captures inter-layer heat transfer dynamics and enables accurate tracking and forecasting of spatial and temporal temperature evolution. Validation across multiple part geometries using measured data demonstrates that the method reliably estimates and forecasts peak temperatures and cooling trends. By enabling predictive thermal control, this framework offers a practical and computationally efficient solution for thermal management in L-PBF, paving the way toward closed-loop control in L-PBF.
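To make the estimate-then-forecast idea concrete, the following is a minimal scalar Kalman filter sketch. It assumes a hypothetical one-state linear thermal model T[k+1] = a*T[k] + b*u[k] with process-noise variance q and measurement-noise variance r; the paper's physics-informed reduced-order model is multi-dimensional and is not reproduced here, and all names and numbers are illustrative.

```python
def kalman_step(x_est, p_est, z, a, b, u, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x_est, p_est: previous state estimate and its variance.
    z:            new temperature measurement.
    a, b, u:      assumed model coefficients and heat input.
    q, r:         process and measurement noise variances.
    """
    # Predict: propagate the estimate and its variance through the model.
    x_pred = a * x_est + b * u
    p_pred = a * p_est * a + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def forecast(x_est, a, b, u, steps):
    """Roll the model forward without measurements to forecast future states."""
    trajectory = []
    x = x_est
    for _ in range(steps):
        x = a * x + b * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    x, p = 300.0, 25.0  # hypothetical initial estimate and variance
    for z in [310.0, 305.0, 302.0]:  # hypothetical sensor readings
        x, p = kalman_step(x, p, z, a=0.9, b=1.0, u=30.0, q=1.0, r=4.0)
    print(x, forecast(x, a=0.9, b=1.0, u=30.0, steps=3))
```

The filter tracks the current state from noisy measurements, and the same model is then iterated forward to obtain the forecast, which is what makes proactive control possible.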
Authors:Binh Huy Nguyen, Matti Schneider
Abstract:
We investigate the implications of a given symmetry of a random microstructure on the obtained effective tensor and its fluctuation in the context of thermal conductivity, and study strategies for enforcing these symmetries in postprocessing via orthogonal projectors. Within the framework of the representative volume element (RVE) method, we establish the invariance conditions for the effective tensor and its fluctuation under different symmetry groups of the microstructure. Interestingly, the symmetry of the considered cell type in the RVE method may break the ensemble symmetry and compromise the approximation of the effective properties. To rectify this issue, we introduce dedicated techniques which permit enforcing the expected symmetries in postprocessing and study the implications on the bounds for the effective properties as well as the total, the random and the systematic errors. We provide theoretical arguments that suitable projections lead to unbiased variance-reduction strategies which furthermore enforce the expected symmetries exactly. Through large-scale FFT-based homogenization simulations, we study the symmetry structure of the estimated effective conductivities and their fluctuations. Moreover, we demonstrate the power of the symmetry-projection techniques for fiber-reinforced composite microstructures of industrial scale.
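One standard way to enforce a symmetry in postprocessing is to average a computed tensor over the symmetry group, which acts as an orthogonal projector with respect to the Frobenius inner product. The sketch below does this for a 3x3 conductivity tensor and the four-fold rotation group about the z-axis; the choice of group, the function names, and the sample tensor are assumptions for illustration and may differ from the projectors used in the paper.

```python
import numpy as np


def c4_group():
    """Rotations by 0, 90, 180 and 270 degrees about the z-axis."""
    r = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    return [np.linalg.matrix_power(r, k) for k in range(4)]


def project(tensor, group):
    """Average Q K Q^T over all group elements Q (group-averaging projector)."""
    return sum(q @ tensor @ q.T for q in group) / len(group)


if __name__ == "__main__":
    # Hypothetical effective conductivity estimated on a single realization.
    k_est = np.array([[1.0, 0.2, 0.3],
                      [0.2, 3.0, 0.1],
                      [0.3, 0.1, 5.0]])
    print(project(k_est, c4_group()))
```

Applying the projection a second time leaves the result unchanged, and the projected tensor is invariant under every rotation in the group, so the expected symmetry holds exactly rather than only approximately.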
Authors:Fengyi Wang, Xiangyu Fu, Nitish Thakor, Gordon Cheng
Abstract:
The human somatosensory system integrates multimodal sensory feedback, including tactile, proprioceptive, and thermal signals, to enable comprehensive perception and effective interaction with the environment. Inspired by the biological mechanism, we present a sensorized soft anthropomorphic hand equipped with diverse sensors designed to emulate the sensory modalities of the human hand. This system incorporates biologically inspired encoding schemes that convert multimodal sensory data into spike trains, enabling highly-efficient processing through Spiking Neural Networks (SNNs). By utilizing these neuromorphic signals, the proposed framework achieves 97.14% accuracy in object recognition across varying poses, significantly outperforming previous studies on soft hands. Additionally, we introduce a novel differentiator neuron model to enhance material classification by capturing dynamic thermal responses. Our results demonstrate the benefits of multimodal sensory fusion and highlight the potential of neuromorphic approaches for achieving efficient, robust, and human-like perception in robotic systems.
Authors:Yuqi Han, Songqian Zhang, Weijian Su, Ke Li, Jiayu Yang, Jinli Suo, Qiang Zhang
Abstract:
The thermal camera excels at perceiving outdoor environments under low-light conditions, making it ideal for applications such as nighttime autonomous driving and unmanned navigation. However, thermal cameras encounter challenges when capturing signage from objects made of similar materials, which can pose safety risks for accurately understanding semantics in autonomous driving systems. In contrast, the neuromorphic vision camera, also known as an event camera, detects changes in light intensity asynchronously and has proven effective in high-speed, low-light traffic environments. Recognizing the complementary characteristics of these two modalities, this paper proposes UTA-Sign, an unsupervised thermal-event video augmentation for traffic signage in low-illumination environments, targeting elements such as license plates and roadblock indicators. To address the signage blind spots of thermal imaging and the non-uniform sampling of event cameras, we developed a dual-boosting mechanism that fuses thermal frames and event signals for consistent signage representation over time. The proposed method utilizes thermal frames to provide accurate motion cues as temporal references for aligning the uneven event signals. At the same time, event signals contribute subtle signage content to the raw thermal frames, enhancing the overall understanding of the environment. The proposed method is validated on datasets collected from real-world scenarios, demonstrating superior quality in traffic signage sketching and improved detection accuracy at the perceptual level.
Authors:Arunava Chaudhuri, Shubhi Shukla, Sarani Bhattacharya, Debdeep Mukhopadhyay
Abstract:
Transformers have become the backbone of many Machine Learning (ML) applications, including language translation, summarization, and computer vision. As these models are increasingly deployed in shared Graphics Processing Unit (GPU) environments via Machine Learning as a Service (MLaaS), concerns around their security grow. In particular, the risk of side-channel attacks that reveal architectural details without physical access remains underexplored, despite the high value of the proprietary models they target. This work, to the best of our knowledge, is the first to investigate GPU power and thermal fluctuations as side channels and to exploit them to extract information from pre-trained transformer models. The proposed analysis shows how these side channels can be exploited at user privilege to reveal critical architectural details such as encoder/decoder layers and attention heads for both language and vision transformers. We demonstrate the practical impact by evaluating multiple publicly available pre-trained language and vision transformers. Through extensive experimental evaluations, we demonstrate that the attack model achieves a high accuracy of over 89% on average for model family identification and 100% for hyperparameter classification, in both single-process and noisy multi-process scenarios. Moreover, by leveraging the extracted architectural information, we demonstrate highly effective black-box transfer adversarial attacks with an average success rate exceeding 93%, underscoring the security risks posed by GPU side-channel leakage in deployed transformer models.
Authors:Mouyang Cheng, Weiliang Luo, Hao Tang, Bowen Yu, Yongqiang Cheng, Weiwei Xie, Ju Li, Heather J. Kulik, Mingda Li
Abstract:
Diffusion-based deep generative models have emerged as powerful tools for inverse materials design. Yet, many existing approaches overlook essential chemical constraints such as oxidation state balance, which can lead to chemically invalid structures. Here we introduce CrysVCD (Crystal generator with Valence-Constrained Design), a modular framework that integrates chemical rules directly into the generative process. CrysVCD first employs a transformer-based elemental language model to generate valence-balanced compositions, followed by a diffusion model to generate crystal structures. The valence constraint enables orders-of-magnitude more efficient chemical valence checking, compared to pure data-driven approaches with post-screening. When fine-tuned on stability metrics, CrysVCD achieves 85% thermodynamic stability and 68% phonon stability. Moreover, CrysVCD supports conditional generation of functional materials, enabling discovery of candidates such as high thermal conductivity semiconductors and high-$κ$ dielectric compounds. Designed as a general-purpose plugin, CrysVCD can be integrated into diverse generative pipelines to promote chemical validity, offering a reliable, scientifically grounded path for materials discovery.
Authors:Aidan Furlong, Xingang Zhao, Robert Salko, Xu Wu
Abstract:
Accurate prediction of critical heat flux (CHF) is an essential component of safety analysis in pressurized and boiling water reactors. To support reliable prediction of this quantity, several empirical correlations and lookup tables have been constructed from physical experiments over the past several decades. With the onset of accessible machine learning (ML) frameworks, multiple initiatives have been established with the goal of predicting CHF more accurately than these traditional methods. While purely data-driven surrogate modeling has been extensively investigated, these approaches lack interpretability, lack resilience to data scarcity, and have been developed mostly using data from tube experiments. As a result, bias-correction hybrid approaches have become increasingly popular, which correct initial "low-fidelity" estimates provided by deterministic base models by using ML-predicted residuals. This body of work has mostly considered round tube geometries; annular geometry-specific ML models have not yet been deployed in thermal hydraulic codes. This study developed, deployed, and validated four ML models to predict CHF in annular geometries using the CTF subchannel code. Three empirical correlation models, Biasi, Bowring, and Katto, were used as base models for comparison. The ML models were trained and tested using 577 experimental annulus data points from four datasets: Becker, Beus, Janssen, and Mortimore. Baseline CHF predictions were obtained from the empirical correlations, with mean relative errors above 26%. The ML-driven models achieved mean relative errors below 3.5%, with no more than one point exceeding the 10% error envelope. In all cases, the hybrid ML models significantly outperformed their empirical counterparts.
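The bias-correction idea described above can be sketched in a few lines: a base model gives a low-fidelity estimate, a data-driven model is fitted to the residuals, and the hybrid prediction is their sum. In this sketch the base model is a made-up toy function (not Biasi, Bowring, or Katto), and an ordinary least-squares line stands in for the ML residual model; both are assumptions chosen only to keep the example small.

```python
import numpy as np


def base_model(x):
    """Hypothetical low-fidelity estimate standing in for an empirical correlation."""
    return 2.0 * x


def fit_residual_model(x, y):
    """Fit a linear model to the residuals y - base_model(x) by least squares."""
    residual = y - base_model(x)
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, residual, rcond=None)
    return coef  # [slope, intercept] of the learned correction


def hybrid_predict(x, coef):
    """Hybrid prediction: base estimate plus the learned residual correction."""
    return base_model(x) + coef[0] * x + coef[1]


if __name__ == "__main__":
    # Synthetic "measurements" that the base model systematically underpredicts.
    x_train = np.array([0.0, 1.0, 2.0, 3.0])
    y_train = 2.5 * x_train + 1.0
    coef = fit_residual_model(x_train, y_train)
    print(base_model(4.0), hybrid_predict(4.0, coef))
```

Because the learned component only has to capture the discrepancy of the base model rather than the full response, it can often be trained with less data, which is one reason the abstract gives for the popularity of hybrid approaches.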
Authors:Raffael Theiler, Olga Fink
Abstract:
Accurate short-term state forecasting is essential for efficient and stable operation of modern power systems, especially in the context of increasing variability introduced by renewable and distributed energy resources. As these systems evolve rapidly, it becomes increasingly important to reliably predict their states in the short term to ensure operational stability, support control decisions, and enable interpretable monitoring of sensor and machine behavior. Modern power systems often span multiple physical domains - including electrical, mechanical, hydraulic, and thermal - posing significant challenges for modeling and prediction. Graph Neural Networks (GNNs) have emerged as a promising data-driven framework for system state estimation and state forecasting in such settings. By leveraging the topological structure of sensor networks, GNNs can implicitly learn inter-sensor relationships and propagate information across the network. However, most existing GNN-based methods are designed under the assumption of homogeneous sensor relationships and are typically constrained to a single physical domain. This limitation restricts their ability to integrate and reason over heterogeneous sensor data commonly encountered in real-world energy systems, such as those used in energy conversion infrastructure. In this work, we propose the use of Heterogeneous Graph Attention Networks to address these limitations. Our approach models both homogeneous intra-domain and heterogeneous inter-domain relationships among sensor data from two distinct physical domains - hydraulic and electrical - which exhibit fundamentally different temporal dynamics. Experimental results demonstrate that our method significantly outperforms conventional baselines on average by 35.5% in terms of normalized root mean square error, confirming its effectiveness in multi-domain, multi-rate power system state forecasting.
Authors:Luca Spagnuolo, Gabriel Giribaldi, Filippo Perli, Alberto Corigliano, Luca Colombo, Matteo Rinaldi
Abstract:
This study presents power handling improvements in cross-sectional Lamé-Mode Resonators (CLMRs) designed for operation in the Ku-band. Previously fabricated CLMR devices failed at approximately 8 dBm of input power, primarily due to electromigration in the aluminum interdigitated electrodes (IDTs). To better understand this mechanism in CLMRs, a data-driven thermal model is developed to analyze localized heating effects within the resonator body, which are known to accelerate electromigration. Based on insights from this model, Aluminum Silicon Copper (AlSiCu) was selected for the IDTs due to its superior thermal stability and resistance to electromigration. Devices fabricated with AlSiCu exhibited no signs of performance degradation, with the best-performing resonator achieving a mechanical quality factor (Qm) of 360, a maximum Bode quality factor (QBode) of 500, and an electromechanical coupling coefficient (kt2) of 6.3%. Moreover, the use of AlSiCu significantly increased the maximum input power the device can withstand, showing an improvement of up to 6 dBm over previous devices. These improvements in power handling make the devices strong candidates for high-power Ku-band filtering applications.
Authors:Jiadong He, Liang Yu, Zhiqiang Chen, Dawei Qiu, Dong Yue, Goran Strbac, Meng Zhang, Yujian Ye, Yi Wang
Abstract:
This letter proposes an Adversarial Inverse Reinforcement Learning (AIRL)-based energy management method for a smart home, which incorporates an implicit thermal dynamics model. In the proposed method, historical optimal decisions are first generated using a neural network-assisted Hierarchical Model Predictive Control (HMPC) framework. These decisions are then used as expert demonstrations in the AIRL module, which aims to train a discriminator to distinguish expert demonstrations from transitions generated by a reinforcement learning agent policy, while simultaneously updating the agent policy that can produce transitions to confuse the discriminator. The proposed HMPC-AIRL method eliminates the need for explicit thermal dynamics models, prior or predictive knowledge of uncertain parameters, or manually designed reward functions. Simulation results based on real-world traces demonstrate the effectiveness and data efficiency of the proposed method.
Authors:Aidan Furlong, Xingang Zhao, Robert Salko, Xu Wu
Abstract:
Critical heat flux (CHF) marks the transition from nucleate to film boiling, where heat transfer to the working fluid can rapidly deteriorate. Accurate CHF prediction is essential for efficiency, safety, and preventing equipment damage, particularly in nuclear reactors. Although widely used, empirical correlations frequently exhibit discrepancies in comparison with experimental data, limiting their reliability in diverse operational conditions. Traditional machine learning (ML) approaches have demonstrated the potential for CHF prediction but have often suffered from limited interpretability, data scarcity, and insufficient knowledge of physical principles. Hybrid model approaches, which combine data-driven ML with physics-based models, mitigate these concerns by incorporating prior knowledge of the domain. This study integrated a purely data-driven ML model and two hybrid models (using the Biasi and Bowring CHF correlations) within the CTF subchannel code via a custom Fortran framework. Performance was evaluated using two validation cases: a subset of the Nuclear Regulatory Commission CHF database and the Bennett dryout experiments. In both cases, the hybrid models exhibited significantly lower error metrics in comparison with conventional empirical correlations. The pure ML model remained competitive with the hybrid models. Trend analysis of error parity indicates that ML-based models reduce the tendency for CHF overprediction, improving overall accuracy. These results demonstrate that ML-based CHF models can be effectively integrated into subchannel codes and can potentially increase performance in comparison with conventional methods.
Authors:Shang Zhang, Huanbin Zhang, Dali Feng, Yujie Cui, Ruoyan Xiong, Cen He
Abstract:
Thermal infrared (TIR) object tracking often suffers from challenges such as target occlusion, motion blur, and background clutter, which significantly degrade the performance of trackers. To address these issues, this paper proposes a novel Siamese Motion Mamba Tracker (SMMT), which integrates a bidirectional state-space model and a self-attention mechanism. Specifically, we introduce the Motion Mamba module into the Siamese architecture to extract motion features and recover overlooked edge details using bidirectional modeling and self-attention. We propose a Siamese parameter-sharing strategy that allows certain convolutional layers to share weights. This approach reduces computational redundancy while preserving strong feature representation. In addition, we design a motion edge-aware regression loss to improve tracking accuracy, especially for motion-blurred targets. Extensive experiments are conducted on four TIR tracking benchmarks, including LSOTB-TIR, PTB-TIR, VOT-TIR2015, and VOT-TIR2017. The results show that SMMT achieves superior performance in TIR target tracking.
Authors:Shang Zhang, HuiPan Guan, XiaoBo Ding, Ruoyan Xiong, Yue Zhang
Abstract:
Thermal infrared target tracking is crucial in applications such as surveillance, autonomous driving, and military operations. In this paper, we propose a novel tracker, SMTT, which effectively addresses common challenges in thermal infrared imagery, such as noise, occlusion, and rapid target motion, by leveraging multi-task learning, joint sparse representation, and adaptive graph regularization. By reformulating the tracking task as a multi-task learning problem, the SMTT tracker independently optimizes the representation of each particle while dynamically capturing spatial and feature-level similarities using a weighted mixed-norm regularization strategy. To ensure real-time performance, we incorporate the Accelerated Proximal Gradient method for efficient optimization. Extensive experiments on benchmark datasets - including VOT-TIR, PTB-TIR, and LSOTB-TIR - demonstrate that SMTT achieves superior accuracy, robustness, and computational efficiency. These results highlight SMTT as a reliable and high-performance solution for thermal infrared target tracking in complex environments.
Authors:Shang Zhang, Xiaobo Ding, Huanbin Zhang, Ruoyan Xiong, Yue Zhang
Abstract:
Thermal infrared (TIR) target tracking methods often adopt the correlation filter (CF) framework due to its computational efficiency. However, the low resolution of TIR images, along with tracking interference, significantly limits the performance of TIR trackers. To address these challenges, we introduce STARS, a novel sparse learning-based CF tracker that incorporates spatio-temporal regularization and super-resolution reconstruction. First, we apply adaptive sparse filtering and temporal domain filtering to extract key features of the target while reducing interference from background clutter and noise. Next, we introduce an edge-preserving sparse regularization method to stabilize target features and prevent excessive blurring. This regularization integrates multiple terms and employs the alternating direction method of multipliers to optimize the solution. Finally, we propose a gradient-enhanced super-resolution method to extract fine-grained TIR target features and improve the resolution of TIR images, addressing performance degradation in tracking caused by low-resolution sequences. To the best of our knowledge, STARS is the first to integrate super-resolution methods within a sparse learning-based CF framework. Extensive experiments on the LSOTB-TIR, PTB-TIR, VOT-TIR2015, and VOT-TIR2017 benchmarks demonstrate that STARS outperforms state-of-the-art trackers in terms of robustness.
Authors:Ruoyan Xiong, Yuke Hou, Princess Retor Torboh, Hui He, Huanbin Zhang, Yue Zhang, Yanpin Wang, Huipan Guan, Shang Zhang
Abstract:
To address the challenge of capturing highly discriminative features in thermal infrared (TIR) tracking, we propose a novel Siamese tracker based on cross-channel fine-grained feature learning and progressive fusion. First, we introduce a cross-channel fine-grained feature learning network that employs masks and suppression coefficients to suppress dominant target features, enabling the tracker to capture more detailed and subtle information. The network employs a channel rearrangement mechanism to enhance efficient information flow, coupled with channel equalization to reduce parameter count. Additionally, we incorporate layer-by-layer combination units for effective feature extraction and fusion, thereby minimizing parameter redundancy and computational complexity. The network further employs feature redirection and channel shuffling strategies to better integrate fine-grained details. Second, we propose a specialized cross-channel fine-grained loss function designed to guide feature groups toward distinct discriminative regions of the target, thus improving overall target representation. This loss function includes an inter-channel loss term that promotes orthogonality between channels, maximizing feature diversity and facilitating finer detail capture. Extensive experiments demonstrate that our proposed tracker achieves the highest accuracy, scoring 0.81 on the VOT-TIR 2015 and 0.78 on the VOT-TIR 2017 benchmark, while also outperforming other methods across all evaluation metrics on the LSOTB-TIR and PTB-TIR benchmarks.
Authors:Ruoyan Xiong, Huanbin Zhang, Shentao Wang, Hui He, Yuke Hou, Yue Zhang, Yujie Cui, Huipan Guan, Shang Zhang
Abstract:
Thermal infrared (TIR) images typically lack detailed features and have low contrast, making it challenging for conventional feature extraction models to capture discriminative target characteristics. As a result, trackers are often affected by interference from visually similar objects and are susceptible to tracking drift. To address these challenges, we propose a novel saliency-guided Siamese network tracker based on key fine-grained feature information. First, we introduce a fine-grained feature parallel learning convolutional block with a dual-stream architecture and convolutional kernels of varying sizes. This design captures essential global features from shallow layers, enhances feature diversity, and minimizes the loss of fine-grained information typically encountered in residual connections. In addition, we propose a multi-layer fine-grained feature fusion module that uses bilinear matrix multiplication to effectively integrate features across both deep and shallow layers. Next, we introduce a Siamese residual refinement block that corrects saliency map prediction errors using residual learning. Combined with deep supervision, this mechanism progressively refines predictions, applying supervision at each recursive step to ensure consistent improvements in accuracy. Finally, we present a saliency loss function to constrain the saliency predictions, directing the network to focus on highly discriminative fine-grained features. Extensive experimental results demonstrate that the proposed tracker achieves the highest precision and success rates on the PTB-TIR and LSOTB-TIR benchmarks. It also achieves a top accuracy of 0.78 on the VOT-TIR 2015 benchmark and 0.75 on the VOT-TIR 2017 benchmark.
Authors:Shang Zhang, Yuke Hou, Guoqiang Gong, Ruoyan Xiong, Yue Zhang
Abstract:
Correlation filter (CF)-based trackers have gained significant attention for their computational efficiency in thermal infrared (TIR) target tracking. However, existing methods struggle with challenges such as low-resolution imagery, occlusion, background clutter, and target deformation, which severely impact tracking performance. To overcome these limitations, we propose RAMCT, a region-adaptive sparse correlation filter tracker that integrates multi-channel feature optimization with an adaptive regularization strategy. Firstly, we refine the CF learning process by introducing a spatially adaptive binary mask, which enforces sparsity in the target region while dynamically suppressing background interference. Secondly, we introduce generalized singular value decomposition (GSVD) and propose a novel GSVD-based region-adaptive iterative Tikhonov regularization method. This enables flexible and robust optimization across multiple feature channels, improving resilience to occlusion and background variations. Thirdly, we propose an online optimization strategy with dynamic discrepancy-based parameter adjustment. This mechanism facilitates real-time adaptation to target and background variations, thereby improving tracking accuracy and robustness. Extensive experiments on LSOTB-TIR, PTB-TIR, VOT-TIR2015, and VOT-TIR2017 benchmarks demonstrate that RAMCT outperforms other state-of-the-art trackers in terms of accuracy and robustness.
Authors:Qishun Wang, Zhengzheng Tu, Chenglong Li, Bo Jiang
Abstract:
RGB-Thermal Video Object Detection (RGBT VOD) can address the limitation of traditional RGB-based VOD in challenging lighting conditions, making it more practical and effective in many applications.
However, similar to most RGBT fusion tasks, it still mainly relies on manually aligned multimodal image pairs.
In this paper, we propose a novel Multimodal Spatio-temporal Graph learning Network (MSGNet) for the alignment-free RGBT VOD problem by leveraging the robust graph representation learning model.
Specifically, we first design an Adaptive Partitioning Layer (APL) to estimate the corresponding regions of the Thermal image within the RGB image (high-resolution), achieving a preliminary inexact alignment.
Then, we introduce the Spatial Sparse Graph Learning Module (S-SGLM) which employs a sparse information passing mechanism on the estimated inexact alignment to achieve reliable information interaction between different modalities.
Moreover, to fully exploit the temporal cues for the RGBT VOD problem, we introduce Hybrid Structured Temporal Modeling (HSTM), which involves a Temporal Sparse Graph Learning Module (T-SGLM) and a Temporal Star Block (TSB). T-SGLM aims to filter out some redundant information between adjacent frames by employing the sparse aggregation mechanism on the temporal graph. Meanwhile, TSB is dedicated to achieving the complementary learning of local spatial relationships.
Extensive comparative experiments conducted on both the aligned dataset VT-VOD50 and the unaligned dataset UVT-VOD2024 demonstrate the effectiveness and superiority of our proposed method. Our project will be made available on our website for free public access.
Authors:Edoardo Del Bianco, Davide Torielli, Federico Rollo, Damiano Gasperini, Arturo Laurenzi, Lorenzo Baccelliere, Luca Muratore, Marco Roveri, Nikos G. Tsagarakis
Abstract:
Modern humanoid robots have shown promising potential for executing various tasks involving the grasping and manipulation of objects using their end-effectors. Nevertheless, in most cases, the grasping and manipulation actions involve low to moderate payload and interaction forces. This is due to limitations often presented by the end-effectors, which cannot match their arm-reachable payload, and hence limit the payload that can be grasped and manipulated. In addition, grippers usually do not embed adequate perception in their hardware, and grasping actions are mainly driven by perception sensors installed in the rest of the robot body, frequently affected by occlusions due to the arm motions during the execution of the grasping and manipulation tasks. To address the above, we developed a modular high grasping force gripper equipped with embedded multi-modal perception functionalities. The proposed gripper can generate a grasping force of 110 N in a compact implementation. The high grasping force capability is combined with embedded multi-modal sensing, which includes an eye-in-hand camera, a Time-of-Flight (ToF) distance sensor, an Inertial Measurement Unit (IMU) and an omnidirectional microphone, permitting the implementation of perception-driven grasping functionalities.
We extensively evaluated the grasping force capacity of the gripper by introducing novel payload evaluation metrics that are a function of the robot arm's dynamic motion and gripper thermal states. We also evaluated the embedded multi-modal sensing by performing perception-guided enhanced grasping operations.
Authors:Silas Weinert, Jonas Bundschuh, Yvonne Späck-Leigsnering, Herbert De Gersem
Abstract:
Foil windings have, due to their layered structure, different properties than conventional wire windings, which make them advantageous for high frequency applications. Both electromagnetic and thermal analyses are relevant for foil windings. These two physical areas are coupled through Joule losses and temperature dependent material properties. For an efficient simulation of foil windings, homogenization techniques are used to avoid resolving the single turns. Therefore, this paper comprises a coupled magneto-thermal simulation that uses a homogenization method in the electromagnetic and thermal part. A weak coupling with different time step sizes for both parts is presented. The method is validated on a simple geometry and showcased for a pot transformer that uses a foil and a wire winding.
Authors:Yangfan Xu, Qu Hao, Lilian Zhang, Jun Mao, Xiaofeng He, Wenqi Wu, Changhao Chen
Abstract:
Visual SLAM is essential for mobile robots, drone navigation, and VR/AR, but traditional RGB camera systems struggle in low-light conditions, driving interest in thermal SLAM, which excels in such environments. However, thermal imaging faces challenges like low contrast, high noise, and limited large-scale annotated datasets, restricting the use of deep learning in outdoor scenarios. We present DarkSLAM, a novel deep learning-based monocular thermal SLAM system designed for large-scale localization and reconstruction in complex lighting conditions. Our approach incorporates the Efficient Channel Attention (ECA) mechanism in visual odometry and the Selective Kernel Attention (SKA) mechanism in depth estimation to enhance pose accuracy and mitigate thermal depth degradation. Additionally, the system includes thermal depth-based loop closure detection and pose optimization, ensuring robust performance in low-texture thermal scenes. Extensive outdoor experiments demonstrate that DarkSLAM significantly outperforms existing methods like SC-Sfm-Learner and Shin et al., delivering precise localization and 3D dense mapping even in challenging nighttime environments.
Authors:Shengyu Tao, Guangyuan Ma, Huixiong Yang, Minyan Lu, Guodan Wei, Guangmin Zhou, Xuan Zhang
Abstract:
As electric vehicles (EVs) approach the end of their operational life, their batteries retain significant economic value and present promising opportunities for second-life use and material recycling. This is particularly compelling for the Global South and other underdeveloped regions, where reliable energy storage is vital to addressing critical challenges posed by weak and even nonexistent power grid and energy infrastructures. However, despite this potential, widespread adoption has been hindered by critical uncertainties surrounding the technical performance, safety, and recertification of second-life batteries. In cases where they have been redeployed, mismatches between estimated and actual performance often render batteries technically unsuitable or hazardous, turning them into liabilities for communities they were intended to benefit. This considerable misalignment exacerbates energy access disparities and undermines the broader vision of energy justice, highlighting an urgent need for robust and scalable solutions to unlock the potential. In the PulseBat Dataset, the authors tested 464 retired lithium-ion batteries, covering 3 cathode material types, 6 historical usages, 3 physical formats, and 6 capacity designs. The pulse test experiments were performed repeatedly for each second-life battery with 10 pulse widths, 10 pulse magnitudes, multiple state-of-charge, and state-of-health conditions, e.g., from 0.37 to 1.03. The PulseBat Dataset recorded these test conditions and the voltage response as well as the temperature signals that were subject to the injected pulse current, which could be used as a valuable data resource for critical diagnostics tasks such as state-of-charge estimation, state-of-health estimation, cathode material type identification, open-circuit voltage reconstruction, thermal management, and beyond.
Authors:Aidan Furlong, Xingang Zhao, Bob Salko, Xu Wu
Abstract:
Over the past decade, the investigation of machine learning (ML) within the field of nuclear engineering has grown significantly. With many approaches reaching maturity, the next phase of investigation will determine the feasibility and usefulness of ML model implementation in a production setting. Several of the codes used for reactor design and assessment are primarily written in the Fortran language, which is not immediately compatible with TensorFlow-trained ML models. This study presents a framework for implementing deep neural networks (DNNs) and Bayesian neural networks (BNNs) in Fortran, allowing for native execution without TensorFlow's C API, Python runtime, or ONNX conversion. Designed for ease of use and computational efficiency, the framework can be implemented in any Fortran code, supporting iterative solvers and UQ via ensembles or BNNs. Verification was performed using a two-input, one-output test case composed of a noisy sinusoid to compare Fortran-based predictions to those from TensorFlow. The DNN predictions showed negligible differences and achieved a 19.6x speedup, whereas the BNN predictions exhibited minor disagreement, plausibly due to differences in random number generation. An 8.0x speedup was noted for BNN inference. The approach was then further verified on a nuclear-relevant problem predicting critical heat flux (CHF), which demonstrated similar behavior along with significant computational gains. Discussion regarding the framework's successful integration into the CTF thermal-hydraulics code is also included, outlining its practical usefulness. Overall, this framework was shown to be effective at implementing both DNN and BNN model inference within Fortran, allowing for the continued study of ML-based methods in real-world nuclear applications.
Authors:Leon Blumrich, Christian Bergfried, Armin Galetzka, Herbert De Gersem, Roland Seebacher, Annette Mütze, Yvonne Späck-Leigsnering
Abstract:
Accurate and efficient thermal simulations of induction machines are indispensable for detecting thermal hot spots and hence avoiding potential material failure in an early design stage. A goal is the better utilization of the machines with reduced safety margins due to a better knowledge of the critical conditions. In this work, the parameters of a two-dimensional induction machine model are calibrated according to evidence from measurements, by solving an inverse field problem. The set of parameters comprises material parameters as well as parameters that model three-dimensional effects. This allows a consideration of physical effects without explicit knowledge of their quantities. First, the accuracy of the approach is studied using an academic example in combination with synthetic data. Afterwards, it is successfully applied to a realistic induction machine model.
Authors:Lena Baumann, Lukas Einkemmer, Christian Klingenberg, Jonas Kusch
Abstract:
Computing numerical solutions of the thermal radiative transfer equations on a finely resolved grid can be costly due to high computational and memory requirements. A numerical reduced order method that has recently been applied to a wide variety of kinetic partial differential equations is the concept of dynamical low-rank approximation (DLRA). In this paper, we consider the thermal radiative transfer equations with Su-Olson closure, leading to a linearized kinetic model. For the conducted theoretical and practical considerations we use a multiplicative splitting of the distribution function that poses additional challenges in finding an energy stable discretization and deriving a hyperbolic Courant-Friedrichs-Lewy (CFL) condition. We propose such an energy stable DLRA scheme that makes use of the augmented basis update & Galerkin integrator. This integrator allows for additional basis augmentations, enabling us to give a mathematically rigorous proof of energy stability and local mass conservation. Numerical examples confirm the derived properties and show the computational advantages of the DLRA scheme compared to a numerical solution of the full system of equations.
Authors:Hamid Toshani, Janith Petangoda, Chatura Samarakoon, Phillip Stanley-Marbell
Abstract:
Uniform temperature distribution in Selective Laser Sintering (SLS) is essential for producing durable 3D prints. Achieving uniformity requires a laser power control system that minimises deviation of the printing temperatures from the target temperature. Because the estimate of the actual process temperature is an input to the laser power control, uncertainty in the estimate of the actual temperature can lead to fluctuations in laser power that affect the thermal performance of the SLS. This article investigates the sensitivity of a laser power control system to temperature measurement uncertainty. This article evaluates the effectiveness of two methods for quantifying the effect of input uncertainty on an SLS laser power control system: a recent innovation in uncertainty-tracked architecture and traditional Monte Carlo simulation. We show that recent advances in computer architecture for arithmetic on probability distributions make it possible, for the first time, to perform control system uncertainty analysis with latencies under 30 ms, while achieving the same level of uncertainty analysis as Monte Carlo methods with latencies that are two orders of magnitude slower.
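The Monte Carlo baseline named in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the proportional control law `laser_power`, the gain `kp`, the Gaussian measurement-noise model, and the numeric values. The sketch propagates temperature-measurement uncertainty to the commanded laser power by repeated sampling.

```python
import random
import statistics


def laser_power(target_temp: float, measured_temp: float, kp: float) -> float:
    """Hypothetical proportional control law: power scales with the temperature error."""
    return kp * (target_temp - measured_temp)


def monte_carlo_power(target_temp, true_temp, sigma, kp, n_samples=20000, seed=0):
    """Sample noisy temperature measurements; return (mean, std) of the commanded power."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    powers = [
        laser_power(target_temp, rng.gauss(true_temp, sigma), kp)
        for _ in range(n_samples)
    ]
    return statistics.fmean(powers), statistics.stdev(powers)


# Illustrative numbers only: 5 degree error, 1.5 degree measurement noise, gain of 2.
mean_power, std_power = monte_carlo_power(
    target_temp=180.0, true_temp=175.0, sigma=1.5, kp=2.0
)
```

For a linear law such as this one, the output spread is `kp` times the input spread, so the sampled estimate can be checked analytically. The need to draw many samples per control update is the cost that the abstract contrasts with the uncertainty-tracked architecture.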
Authors:Weiming Xu, Peng Zhang
Abstract:
As core thermal power generation equipment, steam turbines incur significant expenses and adverse effects on operation when facing interruptions like downtime, maintenance, and damage. Accurate anomaly detection is the prerequisite for ensuring the safe and stable operation of steam turbines. However, challenges in steam turbine anomaly detection, including inherent anomalies, lack of temporal information analysis, and high-dimensional data complexity, limit the effectiveness of existing methods. To address these challenges, we proposed an Enhanced Long Short-Term Memory Variational Autoencoder using Deep Advanced Features and Gaussian Mixture Model (ELSTMVAE-DAF-GMM) for precise unsupervised anomaly detection in unlabeled datasets. Specifically, LSTMVAE, integrating LSTM with VAE, was used to project high-dimensional time-series data to a low-dimensional phase space. The Deep Autoencoder-Local Outlier Factor (DAE-LOF) sample selection mechanism was used to eliminate inherent anomalies during training, further improving the model's precision and reliability. The novel deep advanced features (DAF) hybridize latent embeddings and reconstruction discrepancies from the LSTMVAE model and provide a more comprehensive data representation within a continuous and structured phase space, significantly enhancing anomaly detection by synergizing temporal dynamics with data pattern variations. These DAF were incorporated into GMM to ensure robust and effective unsupervised anomaly detection. We utilized real operating data from industry steam turbines and conducted both comparison and ablation experiments, demonstrating superior anomaly detection outcomes characterized by high accuracy and minimal false alarm rates compared with existing methods.
Authors:Daniel Menges, Florian Stadtmann, Henrik Jordheim, Adil Rasheed
Abstract:
This paper explores the development and practical application of a predictive digital twin specifically designed for condition monitoring, using advanced mathematical models and thermal imaging techniques. Our work presents a comprehensive approach to integrating Proper Orthogonal Decomposition (POD), Robust Principal Component Analysis (RPCA), and Dynamic Mode Decomposition (DMD) to establish a robust predictive digital twin framework. We employ these methods in a real-time experimental setup involving a heated plate monitored through thermal imaging. This system effectively demonstrates the digital twin's capabilities in real-time predictions, condition monitoring, and anomaly detection. Additionally, we introduce the use of a human-machine interface that includes virtual reality, enhancing user interaction and system understanding. The primary contributions of our research lie in the demonstration of these advanced techniques in a tangible setup, showcasing the potential of digital twins to transform industry practices by enabling more proactive and strategic asset management.
Authors:Duy Nhat Phan, Sushant Jha, James P. Mavo, Erin L. Lanigan, Linh Nguyen, Lokendra Poudel, Rahul Bhowmik
Abstract:
Additive Manufacturing (AM) is transforming the manufacturing sector by enabling efficient production of intricately designed products and small-batch components. However, metal parts produced via AM can include flaws that cause inferior mechanical properties, including reduced fatigue response, yield strength, and fracture toughness. To address this issue, we leverage convolutional neural networks (CNN) to analyze thermal images of printed layers, automatically identifying anomalies that impact these properties. We also investigate various synthetic data generation techniques to address limited and imbalanced AM training data. Our models' defect detection capabilities were assessed using images of Nickel alloy 718 layers produced on a laser powder bed fusion AM machine and synthetic datasets with and without added noise. Our results show significant accuracy improvements with synthetic data, emphasizing the importance of expanding training sets for reliable defect detection. Specifically, Generative Adversarial Networks (GAN)-generated datasets streamlined data preparation by eliminating human intervention while maintaining high performance, thereby enhancing defect detection capabilities. Additionally, our denoising approach effectively improves image quality, ensuring reliable defect detection. Finally, our work integrates these models in the CLoud ADditive MAnufacturing (CLADMA) module, a user-friendly interface, to enhance their accessibility and practicality for AM applications. This integration supports broader adoption and practical implementation of advanced defect detection in AM processes.
Authors:Lokendra Poudel, Sushant Jha, Ryan Meeker, Duy-Nhat Phan, Rahul Bhowmik
Abstract:
Ultrasonic Additive Manufacturing (UAM) employs ultrasonic welding to bond similar or dissimilar metal foils to a substrate, resulting in solid, consolidated metal components. However, certain processing conditions can lead to inter-layer defects, affecting the final product's quality. This study develops a method to monitor in-process quality using deep learning-based convolutional neural networks (CNNs). The CNN models were evaluated on their ability to classify samples with and without embedded thermocouples across five power levels (300W, 600W, 900W, 1200W, 1500W) using thermal images with supervised labeling. Four distinct CNN classification models were created for different scenarios including without (baseline) and with thermocouples, only without thermocouples across power levels, only with thermocouples across power levels, and combined without and with thermocouples across power levels. The models achieved 98.29% accuracy on combined baseline and thermocouple images, 97.10% for baseline images across power levels, 97.43% for thermocouple images, and 97.27% for both types across power levels. The high accuracy, above 97%, demonstrates the system's effectiveness in identifying and classifying conditions within the UAM process, providing a reliable tool for quality assurance and process control in manufacturing environments.
Authors:Christian Bergfried, Samaneh Abdi Qezeljeh, Ilia V. Roisman, Herbert De Gersem, Jeanette Hussong, Yvonne Späck-Leigsnering
Abstract:
The need for higher power density in electrical machines requires better cooling strategies. Spray cooling is a very promising and relatively simple technology to apply, but involves extremely complicated physics. In this paper, a quasi-3D thermal finite-element model of a stator winding is created, by extrusion of a 2D cross-sectional finite-element model along the winding direction. The possible effects of spray cooling are simulated as a heat flux using an impedance boundary condition at the surface of the winding overhang. The results confirm the beneficial performance of spray cooling. The model indicates that spray cooling may allow a ten times larger power density than for standard air- or water-cooled machines.
Authors:Qishun Wang, Zhengzheng Tu, Kunpeng Wang, Le Gu, Chuanwang Guo
Abstract:
Existing RGB-Thermal Video Object Detection (RGBT VOD) methods predominantly rely on the manual alignment of image pairs, which is both labor-intensive and time-consuming. This dependency significantly restricts the scalability and practical applicability of these methods in real-world scenarios. To address this critical limitation, we propose a novel framework termed the Mixture of Scale Experts Network (MSENet). MSENet integrates multiple experts trained at different perceptual scales, enabling the capture of scale discrepancies between RGB and thermal image pairs without the need for explicit alignment. Specifically, to address the issue of unaligned scales, MSENet introduces a set of experts designed to perceive the correlation between RGBT image pairs across various scales. These experts are capable of identifying and quantifying the scale differences inherent in the image pairs. Subsequently, a dynamic routing mechanism is incorporated to assign adaptive weights to each expert, allowing the network to dynamically select the most appropriate experts based on the specific characteristics of the input data. Furthermore, to address the issue of weakly unaligned positions, we integrate deformable convolution into the network. Deformable convolution is employed to learn position displacements between the RGB and thermal modalities, thereby mitigating the impact of spatial misalignment. To provide a comprehensive evaluation platform for alignment-free RGBT VOD, we introduce a new benchmark dataset. This dataset includes eleven common object categories, with a total of 60,988 images and 271,835 object instances. The dataset encompasses a wide range of scenes from both daily life and natural environments, ensuring high content diversity and complexity.
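The idea of a routing mechanism that assigns adaptive weights to experts can be sketched generically. This is a common softmax-gated mixture-of-experts pattern, not MSENet's actual router, which the abstract does not specify. The function names, the gate scores, and the feature vectors are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: turns raw gate scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_experts(expert_outputs: List[List[float]], gate_scores: List[float]) -> List[float]:
    """Combine per-expert feature vectors using softmax weights from the gate scores."""
    if len(expert_outputs) != len(gate_scores):
        raise ValueError("need exactly one gate score per expert")
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]


# Two hypothetical scale experts with equal gate scores contribute equally.
fused = route_experts([[1.0, 2.0], [3.0, 4.0]], gate_scores=[0.0, 0.0])
```

In a trained network the gate scores would themselves be predicted from the input, so the mixture changes from one image pair to the next. Here they are fixed constants to keep the example self-contained.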
Authors:Yichen Guo, Paul Fischer, Misun Min
Abstract:
A spectral-element-based formulation of incompressible MHD is presented in the context of the open-source fluid-thermal code, Nek5000/RS. The formulation supports magnetic fields in a solid domain that surrounds the fluid domain. Several steady-state and time-transient model problems are presented as part of the code verification process. Nek5000/RS is designed for large-scale turbulence simulations, which will be the next step with this new MHD capability.
Authors:Matteo Tomasetto, Andrea Manzoni, Francesco Braghin
Abstract:
Steering a system towards a desired target in a very short amount of time is challenging from a computational standpoint. Indeed, the intrinsically iterative nature of optimal control problems requires multiple simulations of the physical system to be controlled. Moreover, the control action needs to be updated whenever the underlying scenario undergoes variations. Full-order models based on, e.g., the Finite Element Method, do not meet these requirements due to the computational burden they usually entail. On the other hand, conventional reduced order modeling techniques such as the Reduced Basis method, are intrusive, rely on a linear superimposition of modes, and lack efficiency when addressing nonlinear time-dependent dynamics. In this work, we propose a non-intrusive Deep Learning-based Reduced Order Modeling (DL-ROM) technique for the rapid control of systems described in terms of parametrized PDEs in multiple scenarios. In particular, optimal full-order snapshots are generated and properly reduced by either Proper Orthogonal Decomposition or deep autoencoders (or a combination thereof) while feedforward neural networks are exploited to learn the map from scenario parameters to reduced optimal solutions. Nonlinear dimensionality reduction therefore allows us to consider state variables and control actions that are both low-dimensional and distributed. After (i) data generation, (ii) dimensionality reduction, and (iii) neural network training in the offline phase, optimal control strategies can be rapidly retrieved in an online phase for any scenario of interest. The computational speedup and the high accuracy obtained with the proposed approach are assessed on different PDE-constrained optimization problems, ranging from the minimization of energy dissipation in incompressible flows modelled through Navier-Stokes equations to the thermal active cooling in heat transfer.
Authors:Jeiyoon Park, Daehwan Lee, Changmin Yeo, Yongshin Han, Minseop Kim
Abstract:
Despite its efficiency, there has been little research on the practical aspects required for real-world deployment of on-device AI models, such as the device's CPU utilization and thermal conditions. In this paper, through extensive experiments, we investigate two key issues that must be addressed to deploy on-device models in real-world services: (i) the selection of on-device models and the resource consumption of each model, and (ii) the capability and potential of on-device models for domain adaptation. To this end, we focus on a task of translating live-stream chat messages and manually construct LiveChatBench, a benchmark consisting of 1,000 Korean-English parallel sentence pairs. Experiments on five mobile devices demonstrate that, although serving a large and heterogeneous user base requires careful consideration of highly constrained deployment settings and model selection, the proposed approach nevertheless achieves performance comparable to commercial models such as GPT-5.1 on the well-targeted task. We expect that our findings will provide meaningful insights to the on-device AI community.
Authors:Ahmed S. Alahmed, Audun Botterud, Saurabh Amin, Ali T. Al-Awami
Abstract:
We develop a mathematical framework for the optimal dispatch of flexible water desalination plants (WDPs) as hybrid generator-load resources. WDPs integrate thermal generation, membrane-based controllable loads, and renewable energy sources, offering unique operational flexibility for power system operations. They can simultaneously participate in two markets: selling desalinated water to a water utility, and bidirectionally transacting electricity with the grid based on their net electricity demand. We formulate the dispatch decision problem of a profit-maximizing WDP, capturing operational, technological, and market-based coupling between water and electricity flows. The threshold-based structure we derive provides computationally tractable coordination suitable for large-scale deployment, offering operational insights into how thermal generation and membrane-based loads complementarily provide continuous bidirectional flexibility. The thresholds are analytically characterized in closed form as explicit functions of technology and tariff parameters. We examine how small changes in the exogenous tariff and technology parameters affect the WDP's profit. Extensive simulations illustrate the optimal WDP's operation, profit, and water-electricity exchange, demonstrating significant improvements relative to benchmark algorithms.
Authors:Aahan Sachdeva, Dhanvinkumar Ganeshkumar, James E. Gallagher, Tyler Treat, Edward J. Oughton
Abstract:
Autonomous robotic platforms are playing a growing role across the emergency services sector, supporting missions such as search and rescue operations in disaster zones and reconnaissance. However, traditional red-green-blue (RGB) detection pipelines struggle in low-light environments, and thermal-based systems lack color and texture information. To overcome these limitations, we present an adaptive framework that fuses RGB and long-wave infrared (LWIR) video streams at multiple fusion ratios and dynamically selects the optimal detection model for each illumination condition. We trained 33 You Only Look Once (YOLO) models on over 22,000 annotated images spanning three light levels: no-light (<10 lux), dim-light (10-1000 lux), and full-light (>1000 lux). To integrate both modalities, fusion was performed by blending aligned RGB and LWIR frames at eleven ratios, from full RGB (100/0) to full LWIR (0/100) in 10% increments. Evaluation showed that the best full-light model (80/20 RGB-LWIR) and dim-light model (90/10 fusion) achieved 92.8% and 92.0% mean confidence; both significantly outperformed the YOLOv5 nano (YOLOv5n) and YOLOv11 nano (YOLOv11n) baselines. Under no-light conditions, the top 40/60 fusion reached 71.0%, exceeding baselines though not statistically significant. Adaptive RGB-LWIR fusion improved detection confidence and reliability across all illumination conditions, enhancing autonomous robotic vision performance.
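The blending and model-selection scheme described in this abstract can be sketched as follows. The assumptions are that both frames are already aligned and reduced to the same single-channel shape, and that "blending" means a pixel-wise weighted average. The function names are hypothetical. The lux thresholds and the best-performing ratios per light level are the ones quoted in the abstract.

```python
from typing import Dict, List, Tuple

Frame = List[List[float]]  # rows of pixel intensities; same shape for both modalities


def fusion_ratios() -> List[Tuple[int, int]]:
    """Eleven (RGB %, LWIR %) pairs from 100/0 to 0/100 in 10% steps."""
    return [(100 - 10 * i, 10 * i) for i in range(11)]


def blend_frames(rgb: Frame, lwir: Frame, rgb_weight: float) -> Frame:
    """Pixel-wise weighted average of two aligned frames (assumed blending rule)."""
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    if len(rgb) != len(lwir) or any(len(a) != len(b) for a, b in zip(rgb, lwir)):
        raise ValueError("frames must have identical shapes")
    return [
        [rgb_weight * a + (1.0 - rgb_weight) * b for a, b in zip(row_rgb, row_lwir)]
        for row_rgb, row_lwir in zip(rgb, lwir)
    ]


def light_level(lux: float) -> str:
    """Light levels as defined in the abstract: <10, 10-1000, >1000 lux."""
    if lux < 10:
        return "no-light"
    if lux <= 1000:
        return "dim-light"
    return "full-light"


# Best RGB weight per light level as reported: 40/60, 90/10, and 80/20 RGB-LWIR.
BEST_RGB_WEIGHT: Dict[str, float] = {"no-light": 0.4, "dim-light": 0.9, "full-light": 0.8}


def select_rgb_weight(lux: float) -> float:
    """Pick the reported best fusion ratio for the measured illumination."""
    return BEST_RGB_WEIGHT[light_level(lux)]
```

Eleven ratios across three light levels gives 33 combinations, which matches the number of trained models, although the abstract does not state that breakdown explicitly.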
Authors:James E. Gallagher, Edward J. Oughton, Jana Kosecka
Abstract:
Landmines remain a persistent humanitarian threat, with 110 million actively deployed mines across 60 countries, claiming 26,000 casualties annually. This research evaluates adaptive Red-Green-Blue (RGB) and Long-Wave Infrared (LWIR) fusion for Unmanned Aerial Systems (UAS)-based detection of surface-laid landmines, leveraging the thermal contrast between the ordnance and the surrounding soil to enhance feature extraction. Using You Only Look Once (YOLO) architectures (v8, v10, v11) across 114 test images, generating 35,640 model-condition evaluations, YOLOv11 achieved optimal performance (86.8% mAP), with 10 to 30% thermal fusion at 5 to 10m altitude identified as the optimal detection parameters. A complementary architectural comparison revealed that while RF-DETR achieved the highest accuracy (69.2% mAP), followed by Faster R-CNN (67.6%), YOLOv11 (64.2%), and RetinaNet (50.2%), YOLOv11 trained 17.7 times faster than the transformer-based RF-DETR (41 minutes versus 12 hours), presenting a critical accuracy-efficiency tradeoff for operational deployment. Aggregated multi-temporal training datasets outperformed season-specific approaches by 1.8 to 9.6%, suggesting that models benefit from exposure to diverse thermal conditions. Anti-Tank (AT) mines achieved 61.9% detection accuracy, compared with 19.2% for Anti-Personnel (AP) mines, reflecting both the size differential and thermal-mass differences between these ordnance classes. As this research examined surface-laid mines where thermal contrast is maximized, future research should quantify thermal contrast effects for mines buried at varying depths across heterogeneous soil types.
Authors:Zekai Shao, Yufan Hu, Jingyuan Liu, Bin Fan, Hongmin Liu
Abstract:
Parameter-efficient fine-tuning has emerged as a promising paradigm in RGB-T tracking, enabling downstream task adaptation by freezing pretrained parameters and fine-tuning only a small set of parameters. This set forms a rank space made up of multiple individual ranks, whose expressiveness directly shapes the model's adaptability. However, quantitative analysis reveals low-rank adaptation exhibits significant redundancy in the rank space, with many ranks contributing almost no practical information. This hinders the model's ability to learn more diverse knowledge to address the various challenges in RGB-T tracking. To address this issue, we propose the Group Orthogonal Low-Rank Adaptation (GOLA) framework for RGB-T tracking, which effectively leverages the rank space through structured parameter learning. Specifically, we adopt a rank decomposition partitioning strategy utilizing singular value decomposition to quantify rank importance, freeze crucial ranks to preserve the pretrained priors, and cluster the redundant ranks into groups to prepare for subsequent orthogonal constraints. We further design an inter-group orthogonal constraint strategy. This constraint enforces orthogonality between rank groups, compelling them to learn complementary features that target diverse challenges, thereby alleviating information redundancy. Experimental results demonstrate that GOLA effectively reduces parameter redundancy and enhances feature representation capabilities, significantly outperforming state-of-the-art methods across four benchmark datasets and validating its effectiveness in RGB-T tracking tasks.
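One simple way to express a constraint that pushes rank groups toward mutual orthogonality is a penalty on the dot products between vectors from different groups. The sketch below is a generic illustration under that assumption and is not GOLA's published loss, which the abstract does not give. The function names and the representation of ranks as plain vectors are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def inter_group_orthogonality_penalty(groups: List[List[Vector]]) -> float:
    """Sum of squared dot products over all vector pairs drawn from different groups.

    The penalty is zero exactly when every vector in one group is orthogonal to
    every vector in every other group; pairs inside the same group are ignored.
    """
    penalty = 0.0
    for gi in range(len(groups)):
        for gj in range(gi + 1, len(groups)):
            for u in groups[gi]:
                for v in groups[gj]:
                    penalty += dot(u, v) ** 2
    return penalty


# Two hypothetical rank groups that already span orthogonal directions.
orthogonal = inter_group_orthogonality_penalty([[[1.0, 0.0]], [[0.0, 1.0]]])
# Two groups that duplicate the same direction are penalized.
redundant = inter_group_orthogonality_penalty([[[1.0, 0.0]], [[1.0, 0.0]]])
```

Adding such a term to the training objective would discourage different groups from learning the same direction, which is the redundancy the abstract aims to reduce.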
Authors:Rostislav-Paul Wilhelm, Fabio Bacchini
Abstract:
Validity of fluid models breaks down for non-thermal or weakly collisional plasmas which often occur e.g. in the solar wind. In these regimes one has to resort to modelling through the first-principle Vlasov-Maxwell system, but its six-dimensional phase-space dynamics, strong filamentation, and multi-scale structure make direct numerical simulation extremely demanding. Particle-In-Cell (PIC) methods remain the standard for ion-scale studies, yet their memory cost and intrinsic noise hinder accurate electron-scale simulations. In this paper, we introduce an alternative method based on an iterative-in-time approximation of characteristics. The approach reconstructs the phase-space dynamics from the time history of the electromagnetic fields and the initial distribution functions, enabling extremely high effective resolution far below the phase-space grid scale without storing or advecting high-dimensional data. Earlier work demonstrated this capability for the multi-species electrostatic Vlasov system. Here we discuss an extension of the method to the full Vlasov-Maxwell equations using a Hamiltonian splitting to advance the solution in a structure-preserving way while retaining the reduced memory footprint.
Authors:David Lee, Kieran Ricardo, Tamara Tambyah
Abstract:
A high order discontinuous Galerkin method for the material transport of thermodynamic tracers is coupled to a low order mixed finite element solver in the context of the thermal shallow water equations. The coupling preserves the energy conserving structure of the low order dynamics solver, while the high order material transport scheme is provably tracer variance conserving, or damping with the inclusion of upwinding. The two methods are coupled via the multigrid hierarchy of the low order dynamics solver, with the basis functions of the high order transport being collocated at the Gauss-Legendre quadrature points with the low order dynamics on the finest scale multigrid mesh. Standard test cases are presented to verify the consistency and conservation properties of the method. While the overall scheme is limited by the formal order of accuracy of the low order dynamics, the use of high order, tracer variance conserving transport is shown to preserve richer turbulent solutions without compromising model stability compared to a purely low order method.
Authors:Zhenyu Chen, Yuguo Shao, Zhengwei Liu, Zhaohui Wei
Abstract:
Quantum algorithms based on parameterized quantum circuits (PQCs) have enabled a wide range of applications on near-term quantum devices. However, existing PQC architectures face several challenges, among which the "barren plateaus" phenomenon is particularly prominent. In such cases, the loss function concentrates exponentially with increasing system size, thereby hindering effective parameter optimization. To address this challenge, we propose a general and hardware-efficient method for eliminating barren plateaus in an arbitrary PQC. Specifically, our approach achieves this by inserting a layer of easily implementable quantum channels into the original PQC, each channel requiring only one ancilla qubit and four additional gates, yielding a modified PQC (MPQC) that is provably at least as expressive as the original PQC and, under mild assumptions, is guaranteed to be free from barren plateaus. Furthermore, by appropriately adjusting the structure of MPQCs, we rigorously prove that any parameter in the original PQC can be made trainable. Importantly, the absence of barren plateaus in MPQCs is robust against realistic noise, making our approach directly applicable to current noisy intermediate-scale quantum (NISQ) hardware. Numerically, we demonstrate the practicality of our method by modifying a commonly used PQC for thermal-state preparation. The results show that barren plateaus are effectively eliminated in this class of circuits with up to 100 qubits and 2400 layers, whereas the original ansatz suffers from severe gradient vanishing.
Authors:Daniela Martin, Connor O'Brien, Valmir P Moraes Filho, Jinsu Hong, Jasmine R. Kobayashi, Evangelia Samara, Joseph Gallego
Abstract:
We present a scalable machine learning framework for analyzing Parker Solar Probe (PSP) solar wind data using distributed processing and the quantum-inspired Kernel Density Matrices (KDM) method. The PSP dataset (2018-2024) exceeds 150 GB, challenging conventional analysis approaches. Our framework leverages Dask for large-scale statistical computations and KDM to estimate univariate and bivariate distributions of key solar wind parameters, including solar wind speed, proton density, and proton thermal speed, as well as anomaly thresholds for each parameter. We reveal characteristic trends in the inner heliosphere, including increasing solar wind speed with distance from the Sun, decreasing proton density, and the inverse relationship between speed and density. Solar wind structures play a critical role in enhancing and mediating extreme space weather phenomena and can trigger geomagnetic storms; our analyses provide quantitative insights into these processes. This approach offers a tractable, interpretable, and distributed methodology for exploring complex physical datasets and facilitates reproducible analysis of large-scale in situ measurements. Processed data products and analysis tools are made publicly available to advance future studies of solar wind dynamics and space weather forecasting. The code and configuration files used in this study are publicly available to support reproducibility.
Authors:Kunal Shankar, Ninad Gaikwad, Anamika Dubey
Abstract:
Harnessing flexibility from house heating, cooling, and ventilation (HVAC) systems has the potential to enable large-scale demand response by aggregating HVAC load adjustments across many homes. This demand response strategy helps the distribution grid flexibly ramp up or ramp down local load demand so that it can optimally match the bulk power system generation profile. However, achieving this capability requires house thermal models that are both computationally efficient and robust to operating conditions. In this work, parameters of the Resistance-Capacitance (RC) network thermal model for houses are estimated using three optimization algorithms: Nonlinear Least Squares (NLS), Batch Estimation (BE), and Maximum Likelihood Estimation (MLE). The resulting models are evaluated through forward simulation across four different seasons and three setpoints. The results illustrate a principled way of selecting reduced-order models and estimation methods with respect to their robustness to seasonal and setpoint variations in the training and testing datasets.
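As a rough illustration of RC-network parameter estimation, the sketch below fits a single-resistance, single-capacitance (1R1C) house model to synthetic data. It is a minimal sketch under stated assumptions, not the authors' method: the 1R1C structure, the forward-Euler discretization, the parameter values, and the function names `simulate_1r1c` and `fit_1r1c` are all hypothetical, and the fit uses ordinary linear least squares (possible because the discretized 1R1C model is linear in the lumped coefficients) rather than the NLS, BE, or MLE estimators named in the abstract.

```python
import math

def simulate_1r1c(R, C, T0, T_out, Q, dt):
    """Forward-Euler simulation of C * dT/dt = (T_out - T) / R + Q."""
    T = [T0]
    for k in range(len(T_out)):
        dT = dt / C * ((T_out[k] - T[k]) / R + Q[k])
        T.append(T[k] + dT)
    return T

def fit_1r1c(T, T_out, Q, dt):
    """Estimate (R, C) by linear least squares on the discretized model.

    T[k+1] - T[k] = a * (T_out[k] - T[k]) + b * Q[k],
    with a = dt / (R * C) and b = dt / C.
    """
    sxx = sxy = syy = sxz = syz = 0.0
    for k in range(len(T_out)):
        x = T_out[k] - T[k]      # temperature difference driving conduction
        y = Q[k]                 # heating power input
        z = T[k + 1] - T[k]      # observed temperature increment
        sxx += x * x
        sxy += x * y
        syy += y * y
        sxz += x * z
        syz += y * z
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 normal equations
    a = (sxz * syy - sxy * syz) / det
    b = (sxx * syz - sxy * sxz) / det
    return b / a, dt / b         # R = b / a, C = dt / b

# Hypothetical ground truth: R in K/W, C in J/K, 5-minute time step.
R_TRUE, C_TRUE, DT = 0.005, 1.0e7, 300.0
N = 2000
T_OUT = [5.0 + 5.0 * math.sin(2.0 * math.pi * k / 288) for k in range(N)]
Q_IN = [2000.0 if (k // 12) % 2 == 0 else 0.0 for k in range(N)]

T_IN = simulate_1r1c(R_TRUE, C_TRUE, 20.0, T_OUT, Q_IN, DT)
R_HAT, C_HAT = fit_1r1c(T_IN, T_OUT, Q_IN, DT)
print(R_HAT, C_HAT)
```

With noise-free synthetic data the estimates should match the assumed parameters closely. With real measurements, noise and unmodeled gains (solar, occupancy) would motivate the more elaborate estimators compared in the abstract.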
Authors:Gregory Yeghiyan, Jurius Azar, Devson Butani, Chan-Jin Chung
Abstract:
This paper presents a real-time spill detection system that utilizes pretrained deep learning models with RGB and thermal imaging to classify spill vs. no-spill scenarios across varied environments. Using a balanced binary dataset (4,000 images), our experiments demonstrate the advantages of thermal imaging in inference speed, accuracy, and model size. We achieve up to 100% accuracy using lightweight models like VGG19 and NasNetMobile, with thermal models performing faster and more robustly across different lighting conditions. Our system runs on consumer-grade hardware (RTX 4080) and achieves inference times as low as 44 ms with model sizes under 350 MB, highlighting its deployability in safety-critical contexts. Results from experiments with a real robot and test datasets indicate that a VGG19 model trained on thermal imaging performs best.
Authors:Sitan Chen, Jordan Cotler, Hsin-Yuan Huang
Abstract:
Characterizing quantum many-body systems is a fundamental problem across physics, chemistry, and materials science. While significant progress has been made, many existing Hamiltonian learning protocols demand digital quantum control over the entire system, creating a disconnect from many real-world settings that provide access only through small, local probes. Motivated by this, we introduce and formalize the problem of quantum probe tomography, where one seeks to learn the parameters of a many-body Hamiltonian using a single local probe access to a small subsystem of a many-body thermal state undergoing time evolution. We address the identifiability problem of determining which Hamiltonians can be distinguished from probe data through a new combination of tools from algebraic geometry and smoothed analysis. Using this approach, we prove that generic Hamiltonians in various physically natural families are identifiable up to simple, unavoidable structural symmetries. Building on these insights, we design the first efficient end-to-end algorithm for probe tomography that learns Hamiltonian parameters to accuracy $\varepsilon$, with query complexity scaling polynomially in $1/\varepsilon$ and classical post-processing time scaling polylogarithmically in $1/\varepsilon$. In particular, we demonstrate that translation- and rotation-invariant nearest-neighbor Hamiltonians on square lattices in one, two, and three dimensions can be efficiently reconstructed from single-site probes of the Gibbs state, up to inversion symmetry about the probed site. Our results demonstrate that robust Hamiltonian learning remains achievable even under severely constrained experimental access.
Authors:Kebin Contreras, Luis Toscano-Palomino, Mauro Dalla Mura, Jorge Bacca
Abstract:
Recovering the past from present observations is an intriguing challenge with potential applications in forensics and scene analysis. Thermal imaging, operating in the infrared range, provides access to otherwise invisible information. Since humans are typically warmer (37 °C / 98.6 °F) than their surroundings, interactions such as sitting, touching, or leaning leave residual heat traces. These fading imprints serve as passive temporal codes, allowing for the inference of recent events that exceed the capabilities of RGB cameras. This work proposes a time-reversed reconstruction framework that uses paired RGB and thermal images to recover scene states from a few seconds earlier. The proposed approach couples Visual-Language Models (VLMs) with a constrained diffusion process, where one VLM generates scene descriptions and another guides image reconstruction, ensuring semantic and structural consistency. The method is evaluated in three controlled scenarios, demonstrating the feasibility of reconstructing plausible past frames up to 120 seconds earlier, providing a first step toward time-reversed imaging from thermal traces.
Authors:Ahmed S. Alahmed, Audun Botterud, Saurabh Amin, Ali T. Al-Awami
Abstract:
We develop a mathematical framework to jointly schedule water and electricity in a profit-maximizing, renewable-colocated water desalination plant that integrates both thermal- and membrane-based technologies. The price-taking desalination plant sells desalinated water to a water utility at a given price and engages in bidirectional electricity transactions with the grid, purchasing or selling power based on its net electricity demand. We show that the optimal scheduling policy depends on the plant's internal renewable generation and follows a simple threshold structure. Under the optimal policy, thermal-based water output decreases monotonically with renewable output, while membrane-based water output increases monotonically. We characterize the structure and intuition behind the threshold policy and examine key special properties.
Authors:Earl Ranario, Ismael Mayanja, Heesup Yun, Brian N. Bailey, J. Mason Earles
Abstract:
Accurate plant segmentation in thermal imagery remains a significant challenge for high-throughput field phenotyping, particularly in outdoor environments where low contrast between plants and weeds and frequent occlusions hinder performance. To address this, we present a framework that leverages synthetic RGB imagery, a limited set of real annotations, and GAN-based cross-modality alignment to enhance semantic segmentation in thermal images. We trained models on 1,128 synthetic images containing complex mixtures of crop and weed plants in order to generate image segmentation masks for crop and weed plants. We additionally evaluated the benefit of integrating as few as five real, manually segmented field images within the training process using various sampling strategies. When combining all the synthetic images with a few labeled real images, we observed a maximum relative improvement of 22% for the weed class and 17% for the plant class compared to the full real-data baseline. Cross-modal alignment was enabled by translating RGB to thermal using CycleGAN-turbo, allowing robust template matching without calibration. Results demonstrated that combining synthetic data with limited manual annotations and cross-domain translation via generative models can significantly boost segmentation performance in complex field environments for multi-modal imagery.
Authors:Austin Wilson, Sahar Kapasi, Zane Greene, Alexis E. Block
Abstract:
Many research groups face challenges when legacy (unsupported) robotic platforms lose manufacturer support and cannot accommodate modern sensing, speech, and interaction capabilities. We present the Enhanced NAO, a revitalized version of Aldebaran's NAO robot that uses upgraded microphones, RGB-D and thermal cameras, and additional compute resources in a fully self-contained package. This system combines cloud and local models for perception and dialogue, while preserving the NAO's expressive body and behaviors. In a pilot validation study, the Enhanced NAO delivered significantly higher conversational quality and stronger user preference compared to the NAO AI Edition, without increasing response latency. Key upgrades, such as beamforming microphones and low-latency audio processing, reduced artifacts like self-hearing and improved multi-party separation. Expanded visual and thermal sensing established a foundation for future interaction capabilities. Beyond the NAO, our framework provides a platform-agnostic strategy for extending the lifespan and research utility of legacy robots, ensuring they remain valuable tools for human-robot interaction.
Authors:Selma Yahia, Ildi Alla, Girija Bangalore Mohan, Daniel Rau, Mridula Singh, Valeria Loscri
Abstract:
Autonomous vehicles (AVs) rely heavily on LiDAR sensors for accurate 3D perception. We show a novel class of low-cost, passive LiDAR spoofing attacks that exploit mirror-like surfaces to inject or remove objects from an AV's perception. Using planar mirrors to redirect LiDAR beams, these attacks require no electronics or custom fabrication and can be deployed in real settings. We define two adversarial goals: Object Addition Attacks (OAA), which create phantom obstacles, and Object Removal Attacks (ORA), which conceal real hazards. We develop geometric optics models, validate them with controlled outdoor experiments using a commercial LiDAR and an Autoware-equipped vehicle, and implement a CARLA-based simulation for scalable testing. Experiments show mirror attacks corrupt occupancy grids, induce false detections, and trigger unsafe planning and control behaviors. We discuss potential defenses (thermal sensing, multi-sensor fusion, light-fingerprinting) and their limitations.
Authors:Chi Yang, Fu Wang, Xiaofei Yang, Hao Huang, Weijia Cao, Xiaowen Chu
Abstract:
Cloud phase profiles are critical for numerical weather prediction (NWP), as they directly affect radiative transfer and precipitation processes. In this study, we present a benchmark dataset and a baseline framework for transforming multimodal satellite observations into detailed 3D cloud phase structures, aiming toward operational cloud phase profile retrieval and future integration with NWP systems to improve cloud microphysics parameterization. The multimodal observations consist of (1) high-spatiotemporal-resolution, multi-band visible (VIS) and thermal infrared (TIR) imagery from geostationary satellites, and (2) accurate vertical cloud phase profiles from spaceborne lidar (CALIOP/CALIPSO) and radar (CPR/CloudSat). The dataset consists of synchronized image-profile pairs across diverse cloud regimes, defining a supervised learning task: given VIS/TIR patches, predict the corresponding 3D cloud phase structure. We adopt SGMAGNet as the main model and compare it with several baseline architectures, including UNet variants and SegNet, all designed to capture multi-scale spatial patterns. Model performance is evaluated using standard classification metrics, including Precision, Recall, F1-score, and IoU. The results demonstrate that SGMAGNet achieves superior performance in cloud phase reconstruction, particularly in complex multi-layer and boundary transition regions. Quantitatively, SGMAGNet attains a Precision of 0.922, Recall of 0.858, F1-score of 0.763, and an IoU of 0.617, significantly outperforming all baselines across these key metrics.
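For readers unfamiliar with the four metrics named above, the following minimal sketch computes them for a single class from true-positive, false-positive, and false-negative counts. The function name `class_metrics` and the example counts are hypothetical, and the abstract does not state how the authors aggregate the metrics across cloud phase classes.

```python
def class_metrics(tp, fp, fn):
    """Per-class Precision, Recall, F1-score, and IoU from prediction counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}

# Hypothetical counts for one class.
M = class_metrics(tp=80, fp=20, fn=20)
print(M)

# For a single class, F1 and IoU are tied by F1 = 2 * IoU / (1 + IoU).
F1_FROM_IOU = 2.0 * M["iou"] / (1.0 + M["iou"])
```

The single-class identity F1 = 2·IoU/(1 + IoU) gives a quick sanity check: the reported IoU of 0.617 maps to roughly 0.763, matching the reported F1-score. The relation need not hold exactly once metrics are averaged over classes.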
Authors:Rui Chen, Domenico Chiaradia, Antonio Frisoli, Daniele Leonardis
Abstract:
This paper presents a novel fabric-based thermal-haptic interface for virtual reality and teleoperation. It integrates pneumatic actuation and conductive fabric with an innovative ultra-lightweight design, achieving only 2 g for each finger unit. By embedding heating elements within textile pneumatic chambers, the system delivers modulated pressure and thermal stimuli to fingerpads through a fully soft, wearable interface.
Comprehensive characterization demonstrates rapid thermal modulation with heating rates up to 3 °C/s, enabling dynamic thermal feedback for virtual or teleoperation interactions. The pneumatic subsystem generates forces up to 8.93 N at 50 kPa, while optimization of fingerpad-actuator clearance enhances cooling efficiency with minimal force reduction. Experimental validation in two user studies shows high temperature identification accuracy (0.98 overall) across three thermal levels, and significant manipulation improvements in a virtual pick-and-place task. Results show enhanced success rates (88.5% to 96.4%, p = 0.029) and improved force control precision (p = 0.013) when haptic feedback is enabled, validating the effectiveness of the integrated thermal-haptic approach for advanced human-machine interaction applications.
Authors:Ninad Gaikwad, Kasey Dettlaff, Athul Jose P, Anamika Dubey
Abstract:
We present a new open-source, GUI-based application created using Plotly-Dash, along with an integrated PostgreSQL-based relational database, developed to streamline EnergyPlus building model simulation workflows. The application facilitates data generation, aggregation (across thermal zones), and visualization based on customizable user preferences, while the database efficiently stores and retrieves complex simulation data generated by EnergyPlus. We demonstrate the need for this application and database, emphasizing how existing approaches for generating, managing, and analyzing EnergyPlus simulation data can be cumbersome, particularly when handling a large number of building models with varying simulation setups. This integrated framework enables building energy engineers and researchers to simplify their EnergyPlus simulations, manage generated simulation data, perform data analyses, and support data-driven modeling tasks.
Authors:Ninad Gaikwad, Kunal Shankar, Anamika Dubey, Alan Love, Olvar Bergland
Abstract:
We need computationally efficient and accurate building thermal dynamics models for use in grid-edge applications. This work evaluates two grey-box approaches for modeling building thermal dynamics: RC-network models and structured regression models. For RC-network models, we compare parameter estimation methods including Nonlinear Least Squares, Batch Estimation, and Maximum Likelihood Estimation. We use the Almon Lag Structure with Linear Least Squares for estimating the structured regression models. The performance of these models and methods is evaluated on simulated house and commercial building data across three different simulation types.
Authors:Siyuan He, Peiran Yan, Yandong He, Youwei Zhuo, Tianyu Jia
Abstract:
Autoregressive decoding in LLMs is the major inference bottleneck due to its memory-intensive operations and limited hardware bandwidth. The 3D-stacked architecture, which vertically stacks multiple DRAM dies on top of a logic die, is a promising solution with significantly improved memory bandwidth. However, our experiments also show that the 3D-stacked architecture faces more severe thermal issues than the 2D architecture in terms of temperature, thermal gradient, and scalability. To better exploit the potential of the 3D-stacked architecture, we present Tasa, a heterogeneous architecture with cross-stack thermal optimizations to balance the temperature distribution and maximize performance under thermal constraints. A high-performance core is designed for compute-intensive operations, while a high-efficiency core is used for memory-intensive operators, e.g., attention layers. Furthermore, we propose a bandwidth-sharing scheduling scheme to improve bandwidth utilization in such a heterogeneous architecture. Extensive thermal experiments show that our Tasa architecture demonstrates greater scalability than the homogeneous 3D-stacked architecture, i.e., up to 5.55 °C, 9.37 °C, and 7.91 °C peak temperature reduction for 48-, 60-, and 72-core configurations. Our experiments on Llama-65B and GPT-3 66B inference also demonstrate 2.85x and 2.21x speedups over the GPU baselines and the state-of-the-art heterogeneous PIM-based LLM accelerator.
Authors:Yukai Chen, Massimiliano Di Todaro, Bjorn Vermeersch, Herman Oprins, Daniele Jahier Pagliari, Julien Ryckaert, Dwaipayan Biswas, James Myers
Abstract:
Advances in nanosheet technologies have significantly increased power densities, exacerbating thermal management challenges in 2.5D/3D chiplet-based Systems-in-Package (SiP). While traditional thermal analyses often employ uniform power maps to simplify computational complexity, this practice neglects localized heating effects, leading to inaccuracies in thermal estimations, especially when comparing power delivery networks (PDN) in 3D integration. This work examines the thermal impact of non-uniform power distributions on SiPs utilizing frontside (FSPDN) and backside (BSPDN) power delivery approaches. Using high-resolution thermal simulations with non-uniform power maps at resolutions down to 5 micrometers, we demonstrate that uniform power assumptions substantially underestimate peak temperatures and fail to reveal critical thermal differences between BSPDN and FSPDN configurations in 3D scenarios. Our results highlight that BSPDN configurations in 3D, although beneficial in simplified uniform scenarios, exhibit pronounced thermal penalties under realistic, localized workloads due to limited lateral heat spreading. These findings emphasize the necessity of adopting fine-grained, workload-aware power maps in early-stage thermal modeling to enable accurate PDN assessment and informed thermal-aware design decisions in advanced nanosheet-based 3D SiP.
Authors:Imran Latif, Muhammad Ali Shafique, Hayat Ullah, Alex C. Newkirk, Xi Yu, Arslan Munir
Abstract:
The unprecedented growth in artificial intelligence (AI) workloads, recently dominated by large language models (LLMs) and vision-language models (VLMs), has intensified power and cooling demands in data centers. This study benchmarks LLMs and VLMs on two HGX nodes, each with 8x NVIDIA H100 graphics processing units (GPUs), using liquid and air cooling. Leveraging GPU Burn, Weights and Biases, and IPMItool, we collect detailed thermal, power, and computation data. Results show that the liquid-cooled systems maintain GPU temperatures between 41 and 50 degrees Celsius, while the air-cooled counterparts fluctuate between 54 and 72 degrees Celsius under load. This thermal stability of liquid-cooled systems yields 17 percent higher performance (54 TFLOPs per GPU vs. 46 TFLOPs per GPU), improved performance per watt, reduced energy overhead, and greater system efficiency than the air-cooled counterparts. These findings underscore the energy and sustainability benefits of liquid cooling, offering a compelling path forward for hyperscale data centers.
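As a quick check on the headline figure, and assuming the 17 percent refers to the relative gain in per-GPU throughput between the two quoted values, the arithmetic works out as:

```latex
\frac{54 - 46}{46} = \frac{8}{46} \approx 0.174 \quad (\text{about } 17\%)
```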
Authors:Doyeong Lim, Yang Liu, Zavier Ndum Ndum, Christian Young, Yassin Hassan
Abstract:
This paper presents a multipurpose artificial intelligence (AI)-driven thermal-fluid testbed designed to advance Small Modular Reactor technologies by seamlessly integrating physical experimentation with advanced computational intelligence. The platform uniquely combines a versatile three-loop thermal-fluid facility with a high-fidelity digital twin and sophisticated AI frameworks for real-time prediction, control, and operational assistance. Methodologically, the testbed's digital twin, built upon the System Analysis Module code, is coupled with a Gated Recurrent Unit (GRU) neural network. This machine learning model, trained on experimental data, enables faster-than-real-time simulation, providing predictive insights into the system's dynamic behavior. The practical application of this AI integration is showcased through case studies. In an AI-driven control framework, the GRU model accurately forecasts future system states and the corresponding control actions required to meet operational demands. Furthermore, an intelligent assistant, powered by a large language model, translates complex sensor data and simulation outputs into natural language, offering operators actionable analysis and safety recommendations. Comprehensive validation against experimental transients confirms the platform's high fidelity, with the GRU model achieving a temperature prediction root mean square error of 1.42 K. This work establishes an integrated research environment at the intersection of AI and thermal-fluid science, showcasing how AI-driven methodologies in modeling, control, and operator support can accelerate the innovation and deployment of next-generation nuclear systems.
Authors:Subed Lamichhane, Haotian Lu, Sheldon X. -D. Tan
Abstract:
Electromigration (EM) remains a critical reliability concern in current and future copper-based VLSI circuits. As technology scales down, EM-induced IR drop becomes increasingly severe. While several EM-aware IR drop analysis tools have been proposed, few incorporate the real impact of temperature distribution on both EM and IR drop effects. In this work, we introduce EMSpice 2.1, an enhanced tool built upon the existing coupled IR-EM analysis framework, EMSpice 2.0, for EM-aware IR drop analysis. For the first time, EMSpice 2.1 uniquely integrates Joule heating effects and practical thermal maps derived from actual chip conditions. Additionally, it features improved interoperability with commercial EDA tools, facilitating more comprehensive EM and IR drop sign-off analysis. Our findings demonstrate that specific hotspot patterns significantly impact the lifetime of interconnects and overall chip reliability due to EM failures. Furthermore, our tool exhibits strong agreement with industry-standard tools such as COMSOL, achieving a speedup of over 200 times while maintaining high accuracy.
Authors:Kazuma Kitazawa, Tsuyoshi Takatani
Abstract:
Shape estimation for transparent objects is challenging due to their complex light transport. To circumvent these difficulties, we leverage the Shape from Polarization (SfP) technique in the Long-Wave Infrared (LWIR) spectrum, where most materials are opaque and emissive. While a few prior studies have explored LWIR SfP, these attempts suffered from significant errors due to inadequate polarimetric modeling, particularly the neglect of reflection. Addressing this gap, we formulated a polarization model that explicitly accounts for the combined effects of emission and reflection. Based on this model, we estimated surface normals using not only a direct model-based method but also a learning-based approach employing a neural network trained on a physically-grounded synthetic dataset. Furthermore, we modeled the LWIR polarimetric imaging process, accounting for inherent systematic errors to ensure accurate polarimetry. We implemented a prototype system and created ThermoPol, the first real-world benchmark dataset for LWIR SfP. Through comprehensive experiments, we demonstrated the high accuracy and broad applicability of our method across various materials, including those transparent in the visible spectrum.
Authors:Lukas Schichler, Karin Festl, Selim Solmaz, Daniel Watzenig
Abstract:
Despite significant progress in autonomous navigation, a critical gap remains in ensuring reliable localization in hazardous environments such as tunnels, urban disaster zones, and underground structures. Tunnels present a uniquely difficult scenario: they are not only prone to GNSS signal loss, but also provide few features for visual localization due to their repetitive walls and poor lighting. These conditions degrade conventional vision-based and LiDAR-based systems, which rely on distinguishable environmental features. To address this, we propose a novel sensor fusion framework that integrates a thermal camera with a LiDAR to enable robust localization in tunnels and other perceptually degraded environments. The thermal camera provides resilience in low-light or smoke conditions, while the LiDAR delivers precise depth perception and structural awareness. By combining these sensors, our framework ensures continuous and accurate localization across diverse and dynamic environments. We use an Extended Kalman Filter (EKF) to fuse multi-sensor inputs, and leverage visual odometry and SLAM (Simultaneous Localization and Mapping) techniques to process the sensor data, enabling robust motion estimation and mapping even in GNSS-denied environments. This fusion of sensor modalities not only enhances system resilience but also provides a scalable solution for cyber-physical systems in connected and autonomous vehicles (CAVs). To validate the framework, we conduct tests in a tunnel environment, simulating sensor degradation and visibility challenges. The results demonstrate that our method sustains accurate localization where standard approaches deteriorate due to the tunnel's featureless geometry. The framework's versatility makes it a promising solution for autonomous vehicles, inspection robots, and other cyber-physical systems operating in constrained, perceptually poor environments.
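The idea of fusing two position estimates with a Kalman-type filter can be sketched in a few lines. The example below is a scalar, linear Kalman filter, i.e. the special case an EKF reduces to when the motion and measurement models are linear. It is an illustration only: the abstract does not give the authors' state vector or models, and the one-dimensional along-tunnel position, the noise variances, and the function names `predict` and `update` are all assumptions.

```python
def predict(x, P, u, dt, q):
    """Propagate position x with velocity u over dt; q is process noise variance."""
    return x + u * dt, P + q

def update(x, P, z, r):
    """Fuse one position measurement z with measurement noise variance r."""
    K = P / (P + r)                      # Kalman gain
    return x + K * (z - x), (1.0 - K) * P

# Hypothetical along-tunnel position (m) and its variance (m^2).
x, P = 0.0, 0.5
x, P = predict(x, P, u=1.0, dt=1.0, q=0.5)   # motion step from odometry
x, P = update(x, P, z=1.2, r=1.0)            # thermal visual-odometry fix (noisier)
x, P = update(x, P, z=0.9, r=0.5)            # LiDAR-based fix (more precise)
print(x, P)
```

Applying the two updates sequentially is equivalent to a joint update when the sensor errors are independent. Each update lowers the variance `P`, with the lower-noise sensor pulling the estimate more strongly.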
Authors:Xiaolei Bian, Changfu Zou, Björn Fridholm, Christian Sundvall, Torsten Wik
Abstract:
Accurate state-of-charge (SOC) estimation is essential for optimizing battery performance, ensuring safety, and maximizing economic value. Conventional current and voltage measurements, however, have inherent limitations in fully inferring the multiphysics-resolved dynamics inside battery cells. This creates an accuracy barrier that constrains battery usage and reduces cost-competitiveness and sustainability across industries dependent on battery technology. In this work, we introduce an integrated sensor framework that combines novel mechanical, thermal, gas, optical, and electrical sensors with traditional measurements to break through this barrier. We generate three unique datasets with eleven measurement types and propose an explainable machine-learning approach for SOC estimation. This approach renders the measured signals and the predictive result of machine learning physically interpretable with respect to battery SOC, offering fundamental insights into the time-varying importance of different signals. Our experimental results reveal a marked increase in SOC estimation accuracy (enhanced from 46.1% to 74.5%) compared to conventional methods. This approach not only advances SOC monitoring precision but also establishes a foundation for monitoring additional battery states to further improve safety, extend lifespan, and facilitate fast charging.
Authors:Dilshod Nematov, Mirabbos Hojamberdiev
Abstract:
The rapid advancement of machine learning and artificial intelligence (AI)-driven techniques is revolutionizing materials discovery, property prediction, and material design by minimizing human intervention and accelerating scientific progress. This review provides a comprehensive overview of smart, machine learning (ML)-driven approaches, emphasizing their role in predicting material properties, discovering novel compounds, and optimizing material structures. Key methodologies, ranging from deep learning, graph neural networks, and Bayesian optimization to automated generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs), enable the autonomous design of materials with tailored functionalities. By leveraging AutoML frameworks (e.g., AutoGluon, TPOT, and H2O.ai), researchers can automate model selection, hyperparameter tuning, and feature engineering, significantly improving the efficiency of materials informatics. Furthermore, the integration of AI-driven robotic laboratories and high-throughput computing has established a fully automated pipeline for rapid synthesis and experimental validation, drastically reducing the time and cost of material discovery. This review highlights real-world applications of automated ML-driven approaches in predicting mechanical, thermal, electrical, and optical properties of materials, demonstrating successful cases in superconductors, catalysts, photovoltaics, and energy storage systems. We also address key challenges, such as data quality, interpretability, and the integration of AutoML with quantum computing, which are essential for future advancements. Ultimately, the synergy between AI, automated experimentation, and computational modeling transforms the way materials are discovered, optimized, and designed, paving the way for next-generation innovations in energy, electronics, and nanotechnology.
Authors:Haozhen Cheng, Jan Stock, André Xhonneux, Hüseyin K. Çakmak, Veit Hagenmeyer
Abstract:
Improving energy efficiency by monitoring system behavior and predicting future energy scenarios in light of increased penetration of renewable energy sources are becoming increasingly important, especially for energy systems that distribute and provide heat. On this background, digital twins of cities become paramount in advancing urban energy system planning and infrastructure management. The use of recorded energy data from sensors in district digital twins in collaborative co-simulation platforms is a promising way to analyze detailed system behavior and estimate future scenarios. However, the development and coupling of multi-physics energy system models need to be validated before they can be used for further in-depth analyses. In the present paper, a new multi-physics/-modal and highly configurable building model is presented. Its accuracy and reliability are validated by comparison with data from the TABULA project, ensuring its relevance and applicability to real-world scenarios. The modularity and flexibility with regard to the system configurability of the developed building model is evaluated on various real building types. In addition, the applicability of the building model in a multi-energy system is highlighted by implementing the model in a collaborative co-simulation setup and by coupling it to a district heating grid model in yearly co-simulations. The simulation results for the proposed multi-physical/-modal building modeling concept show a very high level of agreement compared to published reference building data and can therefore be used individually as flexible and modular building models including both thermal and electrical systems for future sector-coupled energy system analyses in view of sustainability.
Authors:Zhangdi Liu, Ling An, Mengke Song, Zhuohang Yu, Shan Wang, Kezhen Qi, Zhenyu Zhang, Chichun Zhou
Abstract:
The design of inorganic catalysts and the prediction of their catalytic efficiency are fundamental challenges in chemistry and materials science. Traditional catalyst evaluation methods primarily rely on machine learning techniques; however, these methods often struggle to process multi-source heterogeneous data, limiting both predictive accuracy and generalization. To address these limitations, this study introduces the Embedding-Attention-Permutated CNN-Residual (EAPCR) deep learning model. EAPCR constructs a feature association matrix using embedding and attention mechanisms and enhances predictive performance through permutated CNN architectures and residual connections. This approach enables the model to accurately capture complex feature interactions across various catalytic conditions, leading to precise efficiency predictions. EAPCR serves as a powerful tool for computational researchers while also assisting domain experts in optimizing catalyst design, effectively bridging the gap between data-driven modeling and experimental applications. We evaluate EAPCR on datasets from TiO2 photocatalysis, thermal catalysis, and electrocatalysis, demonstrating its superiority over traditional machine learning methods (e.g., linear regression, random forest) as well as conventional deep learning models (e.g., ANN, NNs). Across multiple evaluation metrics (MAE, MSE, R2, and RMSE), EAPCR consistently outperforms existing approaches. These findings highlight the strong potential of EAPCR in inorganic catalytic efficiency prediction. As a versatile deep learning framework, EAPCR not only improves predictive accuracy but also establishes a solid foundation for future large-scale model development in inorganic catalysis.
Authors:Runxi Wang, Ziheng Wang, Ting Lin, Jacob M. Raby, Mircea R. Stan, Xinfei Guo
Abstract:
The rapid advancement of three-dimensional integrated circuits (3DICs) has heightened the need for early-phase design space exploration (DSE) to minimize design iterations and unexpected challenges. Emphasizing the pre-register-transfer level (Pre-RTL) design phase is crucial for reducing trial-and-error costs. However, 3DIC design introduces additional complexities due to thermal constraints and an expanded design space resulting from vertical stacking and various cooling strategies. Despite this need, existing Pre-RTL DSE tools for 3DICs remain scarce, with available solutions often lacking comprehensive design options and full customization support. To bridge this gap, we present Cool-3D, an end-to-end, thermal-aware framework for 3DIC design that integrates mainstream architectural-level simulators, including gem5, McPAT, and HotSpot 7.0, with advanced cooling models. Cool-3D enables broad and fine-grained design space exploration, built-in microfluidic cooling support for thermal analysis, and an extension interface for non-parameterizable customization, allowing designers to model and optimize 3DIC architectures with greater flexibility and accuracy. To validate the Cool-3D framework, we conduct three case studies demonstrating its ability to model various hardware design options and accurately capture thermal behaviors. Cool-3D serves as a foundational framework that not only facilitates comprehensive 3DIC design space exploration but also enables future innovations in 3DIC architecture, cooling strategies, and optimization techniques. The entire framework, along with the experimental data, is in the process of being released on GitHub.
Authors:Jorge García-Torres, Øyvind Meinich-Bache, Sara Brunner, Siren Rettedal, Vilde Kolstad, Kjersti Engan
Abstract:
Around 10% of newborns require some help to initiate breathing, and 5% need ventilation assistance. Accurate Time of Birth (ToB) documentation is essential for optimizing neonatal care, as timely interventions are vital for proper resuscitation. However, current clinical methods for recording ToB often rely on manual processes, which can be prone to inaccuracies. In this study, we present a novel two-stream fusion system that combines the power of image and video analysis to accurately detect the ToB from thermal recordings in the delivery room and operating theater. By integrating static and dynamic streams, our approach captures richer birth-related spatiotemporal features, leading to more robust and precise ToB estimation. We demonstrate that this synergy between data modalities enhances performance over single-stream approaches. Our system achieves 95.7% precision and 84.8% recall in detecting birth within short video clips. Additionally, with the help of a score aggregation module, it successfully identifies ToB in 100% of test cases, with a median absolute error of 2 seconds and an absolute mean deviation of 4.5 seconds compared to manual annotations.
Authors:Jorge García-Torres, Øyvind Meinich-Bache, Siren Rettedal, Kjersti Engan
Abstract:
Approximately 10% of newborns need some assistance to start breathing, and 5% need proper ventilation. It is crucial that interventions are initiated as soon as possible after birth. Accurate documentation of Time of Birth (ToB) is thereby essential for documenting and improving newborn resuscitation performance. However, current clinical practices rely on manual recording of ToB, typically with minute precision. In this study, we present an AI-driven, video-based system for automated ToB detection using thermal imaging, designed to preserve the privacy of healthcare providers and mothers by avoiding the use of identifiable visual data. Our approach achieves 91.4% precision and 97.4% recall in detecting ToB within thermal video clips during performance evaluation. Additionally, our system successfully identifies ToB in 96% of test cases with an absolute median deviation of 1 second compared to manual annotations. This method offers a reliable solution for improving ToB documentation and enhancing newborn resuscitation outcomes.
Authors:Erik Schnaubelt, Andrea Vitrano, Mariusz Wozniak, Emmanuele Ravaioli, Arjan Verweij, Sebastian Schöps
Abstract:
Thermal transient responses of superconducting magnets can be simulated using the finite element (FE) method. Some accelerator magnets use cables whose electric insulation is significantly thinner than the bare electric conductor. The FE discretisation of such geometries with high-quality meshes leads to many degrees of freedom. This increases the computational time, particularly since non-linear material properties are involved. In this work, we propose to use a thermal thin-shell approximation (TSA) to improve the computational efficiency when solving the heat diffusion equation in two dimensions. We apply the method to compute the thermal transient response of superconducting accelerator magnets used for CERN's Large Hadron Collider (LHC) and High-Luminosity LHC. The TSA collapses thin electrical insulation layers into lines while accurately representing the thermal gradient across the insulation's thickness. The TSA is implemented in the multipole module of the open-source Finite Element Quench Simulator (FiQuS), which can generate the multipole magnet models programmatically from input text files. First, the TSA approach is verified by comparison to classical FE simulations with meshed surface insulation regions for a simple block of four cables and a detailed model of the MBH dipole. The results show that the TSA approach reduces the computational time significantly while preserving the accuracy of the solution. Second, the quench heater (QH) delay computed with the TSA method is compared to measurements for the MBH magnet. To this end, the thermal transient simulation is coupled to a magnetostatic solution to account for magneto-resistive effects. Third, the TSA's full capabilities are showcased in non-linear magneto-thermal simulations of several LHC and HL-LHC superconducting magnet models. The full source code, including all input files, is publicly available.
Authors:Shaojie Zhang, Ozgur B. Akan
Abstract:
Molecular communication (MC) is an emerging paradigm that takes inspiration from biological processes, enabling communication at the nanoscale and facilitating the development of the Internet of Bio-Nano Things (IoBNT). Traditional models of MC often rely on idealized assumptions that overlook practical challenges related to noise and signal behavior. This paper proposes and evaluates the first physical MC ion transmitter (ITX) using an ion exchange membrane. The circuit network model is used to simulate ion transport and analyze both transient and steady-state behavior. This analysis includes the effects of noise sources such as thermal and shot noise on signal integrity and SNR. The main contributions of this paper are to demonstrate how a practical MC ITX can produce a realistic waveform and to highlight future research challenges associated with a physical membrane-based ITX.
Authors:Tun-Chieh Lou, Chung-Che Wang, Jyh-Shing Roger Jang, Henian Li, Lang Lin, Norman Chang
Abstract:
This paper proposes the use of iterative transfer learning applied to deep learning models for side-channel attacks. Currently, most of the side-channel attack methods train a model for each individual byte, without considering the correlation between bytes. However, since the models' parameters for attacking different bytes may be similar, we can leverage transfer learning, meaning that we first train the model for one of the key bytes, then use the trained model as a pretrained model for the remaining bytes. This technique can be applied iteratively, a process known as iterative transfer learning. Experimental results show that when using thermal or power consumption map images as input, and multilayer perceptron or convolutional neural network as the model, our method improves average performance, especially when the amount of data is insufficient.
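The warm-start chain described in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a two-weight linear model trained by plain gradient descent stands in for the multilayer perceptron or convolutional network, the per-byte datasets are synthetic, and the names `train`, `iterative_transfer`, and `make_task` are hypothetical rather than the authors' code.

```python
def predict(w, x):
    """Linear model output for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mse(w, data):
    """Mean squared error of weights w on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def train(w_init, data, steps, lr=0.05):
    """Plain gradient descent on MSE, starting from w_init."""
    w = list(w_init)
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def iterative_transfer(datasets, first_steps, later_steps):
    """Train the first byte from zeros, then warm-start each later byte
    from the model obtained for the previous byte."""
    models = []
    w = [0.0] * len(datasets[0][0][0])
    for i, data in enumerate(datasets):
        w = train(w, data, first_steps if i == 0 else later_steps)
        models.append(w)
    return models


def make_task(true_w):
    """Synthetic stand-in for one key byte's training set (assumption)."""
    xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    return [(x, predict(true_w, x)) for x in xs]


if __name__ == "__main__":
    # Two "bytes" whose ideal weights are similar but not identical.
    tasks = [make_task([2.0, -1.0]), make_task([2.1, -0.9])]
    models = iterative_transfer(tasks, first_steps=200, later_steps=5)
    scratch = train([0.0, 0.0], tasks[1], steps=5)
    print("warm-start loss:", mse(models[1], tasks[1]))
    print("from-scratch loss:", mse(scratch, tasks[1]))
```

With the same small training budget on the second task, the warm-started model ends at a lower loss than the one trained from zeros, which is the effect the abstract attributes to similar per-byte parameters.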
Authors:Mariusz Wozniak, Erik Schnaubelt, Sina Atalay, Bernardo Bordini, Julien Dular, Tim Mulder, Emmanuele Ravaioli, Arjan Verweij
Abstract:
High-temperature superconductor (HTS) coated conductors (CC) are often wound into pancake coils with electrical insulation in-between the turns. The copper terminals are used for current injection and conduction cooling. An inherent variation of the critical current along the CC length results from its manufacturing process. This variation causes non-uniform heat generation, particularly when the coil is operated at a high fraction of the nominal critical current or when large critical current defects are present. The temperature distribution resulting from the balance between cooling and heating, in combination with the magnetic field and critical current distributions, determines whether a thermal runaway occurs. Accurately predicting the level of critical current defects that can be tolerated during conduction-cooled operation is difficult and requires a 3D coupled electromagnetic and thermal simulation. This paper presents the results of simulations that are performed with the open-source Finite Element Quench Simulator (FiQuS) tool developed at CERN as part of the STEAM framework. The 3D coupled magnetodynamic-thermal simulations are based on the H-phi formulation and use thin shell approximations, a CC homogenization and conduction-cooling. The critical current (Ic) is varied along the CC length. The effect of a single defect specified as a reduction of Ic along the CC length is investigated in terms of the coil's ability to reach and maintain the operating conditions. The Ic and length of the defect that results in a thermal runaway are analyzed in terms of defect location. In addition, a classical 1D scenario with a quench heater is studied. Both the local defect and the heater cases are compared in terms of the voltage signal available for quench detection. These cases result in very different requirements for quench detection, and their implications are discussed.
Authors:Jorge García-Torres, Øyvind Meinich-Bache, Anders Johannessen, Siren Rettedal, Vilde Kolstad, Kjersti Engan
Abstract:
Around 5-10% of newborns need assistance to start breathing. Currently, there is a lack of evidence-based research, objective data collection, and opportunities for learning from real newborn resuscitation emergency events. Generating and evaluating automated newborn resuscitation algorithm activity timelines relative to the Time of Birth (ToB) offers a promising opportunity to enhance newborn care practices. Given the importance of prompt resuscitation interventions within the "golden minute" after birth, having an accurate ToB with second precision is essential for effective subsequent analysis of newborn resuscitation episodes. Instead, ToB is generally registered manually, often with minute precision, making the process inefficient and susceptible to error and imprecision. In this work, we explore the fusion of Artificial Intelligence (AI) and thermal imaging to develop the first AI-driven ToB detector. The use of temperature information offers a promising alternative to detect the newborn while respecting the privacy of healthcare providers and mothers. However, the frequent inconsistencies in thermal measurements, especially in a multi-camera setup, make normalization strategies critical. Our methodology involves a three-step process: first, we propose an adaptive normalization method based on Gaussian mixture models (GMM) to mitigate issues related to temperature variations; second, we implement and deploy an AI model to detect the presence of the newborn within the thermal video frames; and third, we evaluate and post-process the model's predictions to estimate the ToB. A precision of 88.1% and a recall of 89.3% are reported in the detection of the newborn within thermal frames during performance evaluation. Our approach achieves an absolute median deviation of 2.7 seconds in estimating the ToB relative to the manual annotations.
Authors:Max Linnander, Dustin Goetz, Gregory Reardon, Vijay Kumar, Elliot Hawkes, Yon Visell
Abstract:
Tactile displays that lend tangible form to digital content could transform computing interactions. However, achieving the resolution, speed, and dynamic range needed for perceptual fidelity remains challenging. We present a tactile display that directly converts projected light into visible tactile patterns via a photomechanical surface populated with millimeter-scale optotactile pixels. The pixels transduce incident light into mechanical displacements through photostimulated thermal gas expansion, yielding millimeter scale displacements with response times of 2 to 100 milliseconds. Employing projected light for power transmission and addressing renders these displays highly scalable. We demonstrate optically driven displays with up to 1,511 addressable pixels -- several times more pixels than any prior tactile display attaining comparable performance. Perceptual studies confirm that these displays can reproduce diverse spatiotemporal tactile patterns with high fidelity. This research establishes a foundation for practical, versatile high-resolution tactile displays driven by light.
Authors:Juno Nam, Sulin Liu, Gavin Winter, KyuJung Jun, Soojung Yang, Rafael Gómez-Bombarelli
Abstract:
We introduce LiFlow, a generative framework to accelerate molecular dynamics (MD) simulations for crystalline materials that formulates the task as conditional generation of atomic displacements. The model uses flow matching, with a Propagator submodel to generate atomic displacements and a Corrector to locally correct unphysical geometries, and incorporates an adaptive prior based on the Maxwell-Boltzmann distribution to account for chemical and thermal conditions. We benchmark LiFlow on a dataset comprising 25-ps trajectories of lithium diffusion across 4,186 solid-state electrolyte (SSE) candidates at four temperatures. The model obtains a consistent Spearman rank correlation of 0.7-0.8 for lithium mean squared displacement (MSD) predictions on unseen compositions. Furthermore, LiFlow generalizes from short training trajectories to larger supercells and longer simulations while maintaining high accuracy. With speed-ups of up to 600,000$\times$ compared to first-principles methods, LiFlow enables scalable simulations at significantly larger length and time scales.
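For readers unfamiliar with the metric quoted above, Spearman rank correlation is the Pearson correlation of the rank-transformed values. The sketch below is a plain-Python illustration with average ranks for ties; the function names and the sample numbers are assumptions for illustration and are not data from the paper.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Hypothetical reference vs. predicted MSD values for four compositions.
    reference = [0.2, 1.5, 3.1, 9.8]
    predicted = [0.3, 2.9, 2.0, 7.5]
    print(f"Spearman rho = {spearman(reference, predicted):.2f}")
```

Because only the ordering matters, a model can score well on this metric when it ranks candidates correctly even if its absolute MSD values are off.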
Authors:S. Ares de Parga, J. R. Bravo, N. Sibuet, J. A. Hernandez, R. Rossi, Stefan Boschert, Enrique S. Quintana-Ortí, Andrés E. Tomás, Cristian Cătălin Tatu, Fernando Vázquez-Novoa, Jorge Ejarque, Rosa M. Badia
Abstract:
The integration of reduced-order models (ROMs) with high-performance computing (HPC) is critical for developing digital twins, particularly for real-time monitoring and predictive maintenance of industrial systems. This paper presents a comprehensive, HPC-enabled workflow for developing and deploying projection-based reduced-order models (PROMs) for large-scale mechanical simulations. We use PyCOMPSs' parallel framework to efficiently execute ROM training simulations, employing parallel singular value decomposition (SVD) algorithms such as randomized SVD, Lanczos SVD, and full SVD based on tall-skinny QR (TSQR). Moreover, we introduce a partitioned version of the hyper-reduction scheme known as the Empirical Cubature Method (ECM) to further enhance computational efficiency in PROMs for mechanical systems. Despite the widespread use of HPC for PROMs, there is a significant lack of publications detailing comprehensive workflows for building and deploying end-to-end PROMs in HPC environments. Our workflow is validated through a case study focusing on the thermal dynamics of a motor, a multiphysics problem involving convective heat transfer and mechanical components. The PROM is designed to deliver a real-time prognosis tool that could enable rapid and safe motor restarts post-emergency shutdowns under different operating conditions, demonstrating its potential impact on the practice of simulations in engineering mechanics. To facilitate deployment, we use the Workflow as a Service (WaaS) strategy and Functional Mock-Up Units (FMUs) to ensure compatibility and ease of integration across HPC, edge, and cloud environments. The outcomes illustrate the efficacy of combining PROMs and HPC, establishing a precedent for scalable, real-time digital twin applications in computational mechanics across multiple industries.
Authors:Godwin K. Peprah, Yicun Huang, Torsten Wik, Faisal Altaf, Changfu Zou
Abstract:
Optimal cooling that minimises thermal gradients and the average temperature is essential for enhanced battery safety and health. This work presents a new modelling approach for battery cells of different shapes by integrating the Chebyshev spectral-Galerkin method and model component decomposition. As a result, a library of reduced-order computationally efficient battery thermal models is obtained, characterised by different numbers of states. These models are validated against a high-fidelity finite element model and are compared with a thermal equivalent circuit (TEC) model under real-world vehicle driving and battery cooling scenarios. Illustrative results demonstrate that the proposed model with four states can faithfully capture the two-dimensional thermal dynamics, while the model with only one state significantly outperforms the widely used two-state TEC model in both accuracy and computational efficiency, reducing computation time by 28.7%. Furthermore, our developed models allow for independent control of tab and surface cooling channels, enabling effective thermal performance optimisation. Additionally, the proposed model's versatility and effectiveness are demonstrated through various applications, including the evaluation of different cooling scenarios, closed-loop temperature control, and cell design optimisation.
Authors:Zhanyue Zhao, Yiwei Jiang, Charles Bales, Yang Wang, Gregory Fischer
Abstract:
Intracorporeal needle-based therapeutic ultrasound (NBTU) offers a minimally invasive approach for the thermal ablation of malignant brain tumors, including both primary and metastatic cancers. NBTU utilizes a high-frequency alternating electric field to excite a piezoelectric transducer, generating acoustic waves that cause localized heating and tumor cell ablation, and it provides a more precise ablation by delivering lower acoustic power doses directly to targeted tumors while sparing surrounding healthy tissue. Building on our previous work, this study introduces a database for optimizing pre-operative surgical planning by simulating ablation effects in varied tissue environments and develops an extended simulation model incorporating various tumor types and sizes to evaluate thermal damage under trans-tissue conditions. A comprehensive database is created from these simulations, detailing critical parameters such as CEM43 isodose maps, temperature changes, thermal dose areas, and maximum ablation distances for four directional probes. This database serves as a valuable resource for future studies, aiding in complex trajectory planning and parameter optimization for NBTU procedures. Moreover, a novel probe selection method is proposed to enhance pre-surgical planning, providing a strategic approach to selecting probes that maximize therapeutic efficiency and minimize ablation time. By avoiding unnecessary thermal propagation and optimizing probe angles, this method has the potential to improve patient outcomes and streamline surgical procedures. Overall, the findings of this study contribute significantly to the field of NBTU, offering a robust framework for enhancing treatment precision and efficacy in clinical settings.
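The CEM43 isodose maps mentioned above rest on the cumulative equivalent minutes at 43 °C thermal dose. A minimal sketch of the standard Sapareto-Dewey form, CEM43 = sum over samples of R^(43 - T) * dt, follows, using the commonly quoted R = 0.5 at or above 43 °C and R = 0.25 below it. The function name, the sampling interval, and the temperature trace are illustrative assumptions and do not come from the paper's simulations.

```python
def cem43(temps_c, dt_min):
    """Cumulative equivalent minutes at 43 degC for a temperature history
    sampled every dt_min minutes (temperatures in degC)."""
    total = 0.0
    for t in temps_c:
        # Commonly used constants: R = 0.5 at/above 43 degC, 0.25 below.
        r = 0.5 if t >= 43.0 else 0.25
        total += dt_min * r ** (43.0 - t)
    return total


if __name__ == "__main__":
    # Hypothetical one-minute samples at a single tissue point.
    trace = [37.0, 41.0, 44.0, 46.0, 44.0, 41.0]
    print(f"CEM43 = {cem43(trace, dt_min=1.0):.3f} min")
```

An isodose map would apply the same accumulation at every grid point and then contour the result at a chosen dose threshold.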
Authors:Zhanyue Zhao, Benjamin Szewczyk, Matthew Tarasek, Charles Bales, Yang Wang, Ming Liu, Yiwei Jiang, Chitresh Bhushan, Eric Fiveland, Zahabiya Campwala, Rachel Trowbridge, Phillip M. Johansen, Zachary Olmsted, Goutam Ghoshal, Tamas Heffter, Katie Gandomi, Farid Tavakkolmoghaddam, Christopher Nycz, Erin Jeannotte, Shweta Mane, Julia Nalwalk, E. Clif Burdette, Jiang Qian, Desmond Yeo, Julie Pilitsis, Gregory S. Fischer
Abstract:
Intracorporeal needle-based therapeutic ultrasound (NBTU) is a minimally invasive option for intervening in malignant brain tumors, commonly used in thermal ablation procedures. This technique is suitable for both primary and metastatic cancers, utilizing a high-frequency alternating electric field (up to 10 MHz) to excite a piezoelectric transducer. The resulting rapid deformation of the transducer produces an acoustic wave that propagates through tissue, leading to localized high-temperature heating at the target tumor site and inducing rapid cell death. To optimize the design of NBTU transducers for thermal dose delivery during treatment, numerical modeling of the acoustic pressure field generated by the deforming piezoelectric transducer is frequently employed. The bioheat transfer process generated by the input pressure field is used to track the thermal propagation of the applicator over time. Magnetic resonance thermal imaging (MRTI) can be used to experimentally validate these models. Validation results using MRTI demonstrated the feasibility of this model, showing a consistent thermal propagation pattern. However, a thermal damage isodose map is more advantageous for evaluating therapeutic efficacy. To achieve a more accurate simulation based on the actual brain tissue environment, a new finite element method (FEM) simulation with enhanced damage evaluation capabilities was conducted. The results showed that the highest temperature and ablated volume differed between experimental and simulation results by 2.1884°C (3.71%) and 0.0631 cm$^3$ (5.74%), respectively. The lowest Pearson correlation coefficient (PCC) for peak temperature was 0.7117, and the lowest Dice coefficient for the ablated area was 0.7021, indicating a good agreement in accuracy between simulation and experiment.
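The two agreement metrics quoted above have simple definitions: the Dice coefficient is 2|A ∩ B| / (|A| + |B|) for two binary masks, and the Pearson correlation coefficient is the covariance of two series divided by the product of their standard deviations. The sketch below is a plain-Python illustration; the function names and the small masks and temperature series are hypothetical and are not the study's data.

```python
import math


def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two flat binary masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    inter = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks are treated as a perfect match by convention here.
    return 1.0 if size == 0 else 2.0 * inter / size


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical flattened ablation masks: simulation vs. measurement.
    simulated = [0, 1, 1, 1, 0, 0]
    measured = [0, 0, 1, 1, 1, 0]
    print(f"Dice = {dice_coefficient(simulated, measured):.3f}")

    # Hypothetical peak-temperature series in degC.
    sim_temp = [37.0, 45.2, 52.8, 58.1, 59.0]
    exp_temp = [37.0, 44.0, 50.5, 57.0, 60.9]
    print(f"PCC = {pearson(sim_temp, exp_temp):.3f}")
```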
Authors:Xiaofan Yang, Yubin Liu, Wei Pan, Guoqing Chu, Junming Zhang, Jie Zhao, Zhuoqi Man, Xuanming Cao
Abstract:
Recent advances in multi-modal detection have significantly improved detection accuracy in challenging environments (e.g., low light, overexposure). By integrating RGB with modalities such as thermal and depth, multi-modal fusion increases data redundancy and system robustness. However, significant challenges remain in effectively extracting task-relevant information both within and across modalities, as well as in achieving precise cross-modal alignment. While CNNs excel at feature extraction, they are limited by constrained receptive fields, strong inductive biases, and difficulty in capturing long-range dependencies. Transformer-based models offer global context but suffer from quadratic computational complexity and are confined to pairwise correlation modeling. Mamba and other State Space Models (SSMs), on the other hand, are hindered by their sequential scanning mechanism, which flattens 2D spatial structures into 1D sequences, disrupting topological relationships and limiting the modeling of complex higher-order dependencies. To address these issues, we propose a multi-modal perception network based on hypergraph theory called M2I2HA. Our architecture includes an Intra-Hypergraph Enhancement module to capture global many-to-many high-order relationships within each modality, and an Inter-Hypergraph Fusion module to align, enhance, and fuse cross-modal features by bridging configuration and spatial gaps between data sources. We further introduce an M2-FullPAD module to enable adaptive multi-level fusion of multi-modal enhanced features within the network, while also enhancing data distribution and flow across the architecture. Extensive object detection experiments on multiple public datasets against baselines demonstrate that M2I2HA achieves state-of-the-art performance in multi-modal object detection tasks.
Authors:Sophie Villenave, Pierre Raimbaud, Guillaume Lavoué
Abstract:
Thermal feedback is critical to a range of Virtual Reality (VR) applications, such as firefighting training or thermal comfort simulation. Previous studies showed that adding congruent thermal feedback positively influences User eXperience (UX). However, existing work did not compare different levels of thermal feedback quality and mostly used less immersive virtual environments. To investigate these gaps in the scientific literature, we conducted a within-participant user study in two highly-immersive scenarios, Desert Island (n=25) and Snowy Mountains (n=24). Participants explored the scenarios in three conditions (Audio-Visual only, Static-Thermal Feedback, and Dynamic-Thermal Feedback). To assess the complex and subtle effects of thermal feedback on UX, we performed a multimodal analysis by crossing data from questionnaires, semi-structured interviews, and behavioral indicators. Our results show that despite an already high level of presence in the Audio-Visual only condition, adding thermal feedback increased presence further. Comparison between levels of thermal feedback quality showed no significant difference in UX questionnaires; however, this result is nuanced by participant profiles and interviews. Furthermore, we show that although the order of passage did not influence UX directly, it influenced user behavior. We propose guidelines for the use of thermal feedback in VR, and the design of studies in complex multisensory scenarios.
Authors:Qi Zhu, Yu Yang, Liang Yu, Qing-Shan Jia, Costas J. Spanos, Xiaohong Guan
Abstract:
The heating, ventilation and air-conditioning (HVAC) system dominates a building's energy consumption while exhibiting substantial operational flexibility that can be exploited for providing grid services. However, this goal is largely hindered by the difficulty of characterizing the system's operating flexibility due to the complex building thermal dynamics, system operating limits and human comfort constraints. To address this challenge, this paper develops a unified virtual battery (VB) modeling framework for characterizing the operating flexibility of both single-zone and multi-zone building HVAC systems, enabling flexible buildings to function like virtual batteries. Specifically, a physically meaningful representation state is first identified to represent building thermal conditions under thermal comfort constraints and a VB model is then established for characterizing the operating flexibility of single-zone HVAC systems. We subsequently extend the VB modeling framework to multi-zone HVAC systems and establish a set of zone-level VB models to characterize the building's zonal operating flexibility. We further develop a systematic method to aggregate the VB models into a low-order and low-complexity aggregated VB model, significantly reducing model and computational complexity. We demonstrate the VB model through demand response (DR) applications and conclude that the VB model can capture the operating flexibility of building HVAC systems well and enable effective DR participation. The DR strategies obtained from the VB model can be efficiently decomposed into zone-level control inputs for maintaining human thermal comfort while achieving near-optimal operation cost.
Authors:Manuel Nkegoum, Minh-Tan Pham, Élisa Fromont, Bruno Avignon, Sébastien Lefèvre
Abstract:
Multispectral object detection is critical for safety-sensitive applications such as autonomous driving and surveillance, where robust perception under diverse illumination conditions is essential. However, the limited availability of annotated multispectral data severely restricts the training of deep detectors. In such data-scarce scenarios, textual class information can serve as a valuable source of semantic supervision. Motivated by the recent success of Vision-Language Models (VLMs) in computer vision, we explore their potential for few-shot multispectral object detection. Specifically, we adapt two representative VLM-based detectors, Grounding DINO and YOLO-World, to handle multispectral inputs and propose an effective mechanism to integrate text, visual and thermal modalities. Through extensive experiments on two popular multispectral image benchmarks, FLIR and M3FD, we demonstrate that VLM-based detectors not only excel in few-shot regimes, significantly outperforming specialized multispectral models trained with comparable data, but also achieve competitive or superior results under fully supervised settings. Our findings reveal that the semantic priors learned by large-scale VLMs effectively transfer to unseen spectral modalities, offering a powerful pathway toward data-efficient multispectral perception.
Authors:Benjamin C. Koenig, Sili Deng
Abstract:
Thermal runaway in lithium-ion batteries is strongly influenced by the state of charge (SOC). Existing predictive models typically infer scalar kinetic parameters at a full SOC or a few discrete SOC levels, preventing them from capturing the continuous SOC dependence that governs exothermic behavior during abuse conditions. To address this, we apply the Kolmogorov-Arnold Chemical Reaction Neural Network (KA-CRNN) framework to learn continuous and realistic SOC-dependent exothermic cathode-electrolyte interactions. We apply a physics-encoded KA-CRNN to learn SOC-dependent kinetic parameters for cathode-electrolyte decomposition directly from differential scanning calorimetry (DSC) data. A mechanistically informed reaction pathway is embedded into the network architecture, enabling the activation energies, pre-exponential factors, enthalpies, and related parameters to be represented as continuous and fully interpretable functions of the SOC. The framework is demonstrated for NCA, NM, and NMA cathodes, yielding models that reproduce DSC heat-release features across all SOCs and provide interpretable insight into SOC-dependent oxygen-release and phase-transformation mechanisms. This approach establishes a foundation for extending kinetic parameter dependencies to additional environmental and electrochemical variables, supporting more accurate and interpretable thermal-runaway prediction and monitoring.
Authors:Hyuna Kwon, Babak Sadigh, Sebastien Hamel, Vincenzo Lordi, John Klepeis, Fei Zhou
Abstract:
Atomistic simulations generate large volumes of noisy structural data, but extracting phase labels, order parameters (OPs), and defect information in a way that is universal, robust, and interpretable remains challenging. Existing tools such as PTM and CNA are restricted to a small set of hand-crafted lattices (e.g., FCC/BCC/HCP), degrade under strong thermal disorder or defects, and produce hard, template-based labels without per-atom probability or confidence scores. Here we introduce a log-probability foundation model that unifies denoising, phase classification, and OP extraction within a single probabilistic framework. We reuse the MACE-MP foundation interatomic potential on crystal structures mapped to AFLOW prototypes, training it to predict per-atom, per-phase logits $l$ and to aggregate them into a global log-density $\log \hat{P}_\theta(\boldsymbol{r})$ whose gradient defines a conservative score field. Denoising corresponds to gradient ascent on this learned log-density, phase labels follow from $\arg\max_c l_{ac}$, and the $l$ values act as continuous, defect-sensitive and interpretable OPs quantifying the Euclidean distance to ideal phases. We demonstrate universality across hundreds of prototypes, robustness under strong thermal and defect-induced disorder, and accurate treatment of complex systems such as ice polymorphs, ice--water interfaces, and shock-compressed Ti.
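As a rough illustration of the labeling rule $\arg\max_c l_{ac}$, the sketch below turns per-atom, per-phase logits into a phase label plus a confidence score. The logit values and phase names are made up, and using a softmax to obtain the confidence is an assumption of this sketch, not something the abstract specifies.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one atom's per-phase logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_atoms(logits_per_atom, phase_names):
    """Return (phase label, confidence) per atom using argmax over phases."""
    results = []
    for logits in logits_per_atom:
        probs = softmax(logits)  # assumed confidence mapping (hypothetical)
        best = max(range(len(logits)), key=lambda c: logits[c])
        results.append((phase_names[best], probs[best]))
    return results

# Hypothetical logits for two atoms over three candidate phases.
PHASES = ["FCC", "BCC", "HCP"]
example_logits = [
    [2.0, 0.1, -1.0],  # clearly FCC-like environment
    [0.2, 0.3, 0.1],   # ambiguous, e.g. near a defect
]
labels = classify_atoms(example_logits, PHASES)
```

In this toy input the second atom receives a label with a much lower confidence than the first, which is the kind of soft, defect-sensitive signal the abstract contrasts with hard template-based labels.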
Authors:Song Zhang, Ruohan Guo, Xiaohua Ge, Perter Mahon, Weixiang Shen
Abstract:
Reliable health assessment of retired lithium-ion batteries is essential for safe and economically viable second-life deployment, yet remains difficult due to sparse measurements, incomplete historical records, heterogeneous chemistries, and limited or noisy battery health labels. Conventional laboratory diagnostics, such as full charge-discharge cycling, pulse tests, Electrochemical Impedance Spectroscopy (EIS) measurements, and thermal characterization, provide accurate degradation information but are too time-consuming, equipment-intensive, or condition-sensitive to be applied at scale during retirement-stage sorting, leaving real-world datasets fragmented and inconsistent. This review synthesizes recent advances that address these constraints through physical health indicators, experiment testing methods, data-generation and augmentation techniques, and a spectrum of learning-based modeling routes spanning supervised, semi-supervised, weakly supervised, and unsupervised paradigms. We highlight how minimal-test features, synthetic data, domain-invariant representations, and uncertainty-aware prediction enable robust inference under limited or approximate labels and across mixed chemistries and operating histories. A comparative evaluation further reveals trade-offs in accuracy, interpretability, scalability, and computational burden. Looking forward, progress toward physically constrained generative models, cross-chemistry generalization, calibrated uncertainty estimation, and standardized benchmarks will be crucial for building reliable, scalable, and deployment-ready health prediction tools tailored to the realities of retired-battery applications.
Authors:Ruike Lyu, Anna Li, Jianxiao Wang, Hongxi Luo, Yan Shen, Hongye Guo, Ershun Du, Chongqing Kang, Jesse Jenkins
Abstract:
In many countries, declining demand in energy-intensive industries (EIIs) such as cement, steel, and aluminum is leading to industrial overcapacity. Although overcapacity is traditionally seen as problematic, it could unlock EIIs' flexibility in electricity use. Using China's aluminum smelting sector as a case, we evaluate the system-level cost-benefit of retaining EII overcapacity for flexible electricity use in decarbonized systems. We find that overcapacity enables smelters to adopt a seasonal operation paradigm, ceasing production during winter load peaks driven by heating electrification and renewable seasonality. In a 2050-net-zero scenario, this paradigm reduces China's electricity-system investment and operating costs by 15-72 billion CNY per year (8-34% of the industry's product value), enough to offset the costs of maintaining overcapacity and product storage. Seasonal operation also cuts workforce fluctuations across aluminum smelting and thermal-power sectors by up to 62%, potentially mitigating socio-economic disruptions from industrial restructuring and the energy transition.
Authors:Jasan Zughaibi, Denis von Arx, Maurus Derungs, Florian Heemeyer, Luca A. Antonelli, Quentin Boehler, Michael Muehlebach, Bradley J. Nelson
Abstract:
Electromagnetic navigation systems (eMNS) enable a number of magnetically guided surgical procedures. A challenge in magnetically manipulating surgical tools is that the effective workspace of an eMNS is often severely constrained by power and thermal limits. We show that system-level control design significantly expands this workspace by reducing the currents needed to achieve a desired motion. We identified five key system approaches that enable this expansion: (i) motion-centric torque/force objectives, (ii) energy-optimal current allocation, (iii) real-time pose estimation, (iv) dynamic feedback, and (v) high-bandwidth eMNS components. As a result, we stabilize a 3D inverted pendulum on an eight-coil OctoMag eMNS with significantly lower currents (0.1-0.2 A vs. 8-14 A), by replacing a field-centric field-alignment strategy with a motion-centric torque/force-based approach. We generalize to multi-agent control by simultaneously stabilizing two inverted pendulums within a shared workspace, exploiting magnetic-field nonlinearity and coil redundancy for independent actuation. A structured analysis compares the electromagnetic workspaces of both paradigms and examines current-allocation strategies that map motion objectives to coil currents. Cross-platform evaluation of the clinically oriented Navion eMNS further demonstrates substantial workspace expansion by maintaining stable balancing at distances up to 50 cm from the coils. The results demonstrate that feedback is a practical path to scalable, efficient, and clinically relevant magnetic manipulation.
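One common way to realize an energy-optimal current allocation is a minimum-norm solution: among all coil currents that produce the requested torque/force, pick the one with the smallest sum of squared currents. The sketch below assumes a linear map from currents to the motion objective and identical coils, so that resistive loss scales with the squared current norm. The 2x4 actuation matrix is a made-up toy, not the eight-coil OctoMag model, and the abstract does not state which allocation rule the authors use.

```python
def solve_linear(a, b):
    """Solve a x = b for square a by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def min_norm_currents(actuation, wrench):
    """Minimum-norm currents I = A^T (A A^T)^{-1} w satisfying A I = w."""
    rows = len(actuation)
    gram = [[sum(p * q for p, q in zip(actuation[i], actuation[j]))
             for j in range(rows)] for i in range(rows)]
    lam = solve_linear(gram, wrench)
    n_coils = len(actuation[0])
    return [sum(actuation[i][c] * lam[i] for i in range(rows))
            for c in range(n_coils)]

# Hypothetical actuation matrix: 2 motion objectives, 4 coils.
A = [[1.0, 0.5, 0.0, -0.5],
     [0.0, 1.0, 1.0, 0.5]]
target = [1.0, -0.5]  # hypothetical desired torque/force components
currents = min_norm_currents(A, target)
achieved = [sum(a * i for a, i in zip(row, currents)) for row in A]
```

Because the system has more coils than objectives, many current vectors achieve the same target; the minimum-norm choice is the one with no component in the null space of `A`.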
Authors:Yating Zou, Batuhan Keskin, Gregor G. Taylor, Zenghui Li, Jie Wang, Eduard Alarcon, Fabio Sebastiano, Masoud Babaie, Edoardo Charbon
Abstract:
Quantum technologies offer unprecedented capabilities in computation and secure information transfer. Their implementation requires qubits to operate at cryogenic temperatures (CT) while control and readout electronics typically still remain at room temperature (RT). As systems scale to millions of qubits, the electronics should also operate at CT to avoid a wiring bottleneck. However, wired power transfer from RT for such electronics introduces severe challenges, including thermal load between cooling stages, Joule heating, noise coupling, and wiring scalability. This paper addresses those challenges by evaluating several candidate architectures for scalable power transfer in the dilution fridge: high-voltage (HV) wired power transfer, radiative wireless transfer, non-radiative wireless transfer, and a hybrid HV and non-radiative transfer. These architectures are analyzed in terms of thermal load, power loss, heating, coupling noise, power density, scalability, reliability, and complexity. Comparative analysis demonstrates the trade-offs among these architectures, while highlighting HV non-radiative transfer as a promising candidate for scalable quantum systems.
Authors:Sékou-Oumar Kaba, Kusha Sareen, Daniel Levy, Siamak Ravanbakhsh
Abstract:
Effectively leveraging prior knowledge of a system's physics is crucial for applications of machine learning to scientific domains. Previous approaches mostly focused on incorporating physical insights at the architectural level. In this paper, we propose a framework to incorporate physical information directly into the loss function for prediction and generative modeling tasks on systems like molecules and spins. We derive energy loss functions assuming that each data sample is in thermal equilibrium with respect to an approximate energy landscape. By using the reverse KL divergence with a Boltzmann distribution around the data, we obtain the loss as an energy difference between the data and the model predictions. This perspective also recasts traditional objectives like MSE as energy-based, but with a physically meaningless energy. In contrast, our formulation yields physically grounded loss functions with gradients that better align with valid configurations, while being architecture-agnostic and computationally efficient. The energy loss functions also inherently respect physical symmetries. We demonstrate our approach on molecular generation and spin ground-state prediction and report significant improvements over baselines.
Authors:Diego R. Rivera, Ernesto Castillo, Felipe Galarce, Douglas R. Q. Pacheco
Abstract:
This study evaluates a data assimilation framework based on reduced-order modeling (ROM-DA), complemented by a hybrid data-filling strategy, to reconstruct dynamic temperature fields in a phase-change-material (PCM) integrated solar chimney from limited temperature measurements. The goal is to enhance the estimation accuracy of the outlet airflow velocity. A regularized least-squares formulation is employed to estimate temperature distributions within an inclined solar chimney using RT-42 as the PCM. The methodology combines (i) a reduced-order model derived from high-fidelity finite-volume simulations of unsteady conjugate heat transfer with liquid-solid phase change and surface radiation, and (ii) three experimental datasets with 22, 135, and 203 measurement points. Missing data are reconstructed using a hybrid filling scheme based on boundary-layer and bicubic interpolations. The assimilated temperature fields are integrated into the thermally coupled forward solver to improve velocity predictions. Results show that the ROM-DA framework reconstructs the transient temperature fields in both the air and PCM domains with relative errors below 10 percent for sparse data and below 3 percent for expanded datasets. When applied to experimental measurements, the approach enhances the fidelity of temperature and velocity fields compared with the baseline model, reducing the outlet velocity RMS error by 20 percent. This represents the first application of a ROM-DA framework to a coupled multiphysics solar chimney with PCM integration, demonstrating its potential for near-real-time thermal state estimation and digital-twin development.
Authors:Paul Mayr, Alessandro Pisano, Stefan Koch, Markus Reichhartinger
Abstract:
A sliding-mode-based adaptive boundary control law is proposed for a class of uncertain thermal reaction-diffusion processes subject to matched disturbances. The disturbances are assumed to be bounded, but the corresponding bounds are unknown, thus motivating the use of adaptive control strategies. A boundary control law comprising a proportional and discontinuous term is proposed, wherein the magnitude of the discontinuous relay term is adjusted via a gradient-based adaptation algorithm. Depending on how the adaptation algorithm is parameterized, the adaptive gain can be either a nondecreasing function of time (monodirectional adaptation) or it can both increase and decrease (bidirectional adaptation). The convergence and stability properties of these two solutions are investigated by Lyapunov analyses, and two distinct stability results are derived, namely, asymptotic stability for the monodirectional adaptation and globally uniformly ultimately bounded solutions for the bidirectional adaptation. The proposed algorithms are then specified to address the control problem of stabilizing a desired temperature profile in a metal beam equipped with thermoelectric boundary actuators. Experiments are conducted to investigate the real-world performance of the proposed sliding-mode-based adaptive control, with a particular focus on comparing the monodirectional and bidirectional adaptation laws.
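To give a feel for the monodirectional adaptation described above, here is a deliberately simplified sketch. It replaces the reaction-diffusion PDE with a scalar ODE `dx/dt = u + d(t)` and uses the adaptation rule `dK/dt = gamma * |x|`, which is a standard textbook choice assumed here, not the law derived in the paper. All numerical values are hypothetical.

```python
import math

def sign(v):
    """Sign function returning -1, 0, or 1."""
    return (v > 0) - (v < 0)

def simulate(x0=1.0, k_p=1.0, gamma=5.0, dt=1e-3, t_end=10.0):
    """Euler simulation of a proportional-plus-relay law with adaptive relay gain."""
    x, gain, t = x0, 0.0, 0.0
    gains = [gain]
    for _ in range(int(round(t_end / dt))):
        d = 0.5 * math.sin(t)           # bounded disturbance; bound unknown to controller
        u = -k_p * x - gain * sign(x)   # proportional term plus discontinuous relay term
        x += dt * (u + d)
        gain += dt * gamma * abs(x)     # monodirectional: the gain can only grow
        t += dt
        gains.append(gain)
    return x, gains

x_final, gain_history = simulate()
```

The relay gain rises until it dominates the disturbance, after which the state stays in a small chattering band around zero. A bidirectional variant would add a decay term to the gain update so that the gain can also shrink.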
Authors:Piyush Dashpute, Niki Nezakati, Wolfgang Heidrich, Vishwanath Saragadam
Abstract:
Thermal images from low-cost cameras often suffer from low resolution, fixed pattern noise, and other localized degradations. Available datasets for thermal imaging are also limited in both size and diversity. To address these challenges, we propose a patch-based diffusion framework (TDiff) that leverages the local nature of these distortions by training on small thermal patches. In this approach, full-resolution images are restored by denoising overlapping patches and blending them using smooth spatial windowing. To our knowledge, this is the first patch-based diffusion framework that models a learned prior for thermal image restoration across multiple tasks. Experiments on denoising, super-resolution, and deblurring demonstrate strong results on both simulated and real thermal data, establishing our method as a unified restoration pipeline.
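The patch-and-blend step can be sketched as a weighted overlap-add: each restored patch is multiplied by a smooth window, accumulated into the output, and the result is divided by the accumulated window weights. The sketch below uses a shifted Hann window and a placeholder `restore_patch` callable standing in for the diffusion denoiser; both are assumptions of this example, and the window actually used by TDiff is not given in the abstract.

```python
import math

def window_1d(n):
    """Shifted Hann window; strictly positive so border pixels keep nonzero weight."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * (i + 0.5) / n) for i in range(n)]

def patch_starts(length, patch, stride):
    """Patch start indices covering [0, length), adding a final patch flush to the edge."""
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts

def restore_by_patches(image, patch, stride, restore_patch):
    """Restore overlapping patches independently and blend them with smooth windowing."""
    h, w = len(image), len(image[0])
    win = window_1d(patch)
    acc = [[0.0] * w for _ in range(h)]
    weight = [[0.0] * w for _ in range(h)]
    for top in patch_starts(h, patch, stride):
        for left in patch_starts(w, patch, stride):
            tile = [row[left:left + patch] for row in image[top:top + patch]]
            out = restore_patch(tile)  # placeholder for the learned patch denoiser
            for i in range(patch):
                for j in range(patch):
                    wgt = win[i] * win[j]
                    acc[top + i][left + j] += wgt * out[i][j]
                    weight[top + i][left + j] += wgt
    return [[acc[y][x] / weight[y][x] for x in range(w)] for y in range(h)]

# Toy 9x11 "thermal image"; an identity restorer should reproduce the input.
image = [[float(10 * y + x) for x in range(11)] for y in range(9)]
restored = restore_by_patches(image, patch=4, stride=2, restore_patch=lambda t: t)
```

Normalizing by the accumulated weights means the blend is exact wherever overlapping patches agree, and it tapers each patch's influence toward its borders, which is what suppresses visible seams.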
Authors:Christopher Silver, Thangarajah Akilan
Abstract:
Falls among seniors are a major public health issue. Existing solutions using wearable sensors, ambient sensors, and RGB-based vision systems face challenges in reliability, user compliance, and practicality. Studies indicate that stakeholders, such as older adults and eldercare facilities, prefer non-wearable, passive, privacy-preserving, and real-time fall detection systems that require no user interaction. This study proposes an advanced thermal fall detection method using a Bidirectional Convolutional Long Short-Term Memory (BiConvLSTM) model, enhanced with spatial, temporal, feature, self, and general attention mechanisms. Through systematic experimentation across hundreds of model variations exploring the integration of attention mechanisms, recurrent modules, and motion flow, we identified top-performing architectures. Among them, BiConvLSTM achieved state-of-the-art performance with a ROC-AUC of $99.7\%$ on the TSF dataset and demonstrated robust results on TF-66, a newly emerged, diverse, and privacy-preserving benchmark. These results highlight the generalizability and practicality of the proposed model, setting new standards for thermal fall detection and paving the way toward deployable, high-performance solutions.
Authors:Yusheng Zheng, Wenxue Liu, Yunhong Che, Ferdinand Grimm, Jingyuan Zhao, Xiaosong Hu, Simona Onori, Remus Teodorescu, Gregory J. Offer
Abstract:
Since the internal temperature is less accessible than surface temperature, there is an urgent need to develop accurate and real-time estimation algorithms for better thermal management and safety. This work presents a novel framework for resource-efficient and scalable development of accurate, robust, and adaptive internal temperature estimation algorithms by blending physics-based modeling with machine learning, in order to address the key challenges in data collection, model parameterization, and estimator design that traditionally hinder both approaches. In this framework, a physics-based model is leveraged to generate simulation data that includes different operating scenarios by sweeping the model parameters and input profiles. Such a cheap simulation dataset can be used to pre-train the machine learning algorithm to capture the underlying mapping relationship. To bridge the simulation-to-reality gap resulting from imperfect modeling, transfer learning with unsupervised domain adaptation is applied to fine-tune the pre-trained machine learning model, by using limited operational data (without internal temperature values) from target batteries. The proposed framework is validated under different operating conditions and across multiple cylindrical batteries with convective air cooling, achieving a root mean square error of 0.5 °C when relying solely on prior knowledge of battery thermal properties, and less than 0.1 °C when using thermal parameters close to the ground truth. Furthermore, the role of the simulation data quality in the proposed framework has been comprehensively investigated to identify promising ways of synthetic data generation to guarantee the performance of the machine learning model.
Authors:Rileigh Bandy, Rebecca Morrison, Erin Mussoni, Teresa Portone
Abstract:
During hypersonic flight, air reacts with a planetary re-entry vehicle's thermal protection system (TPS), creating reaction products that deplete the TPS. Reliable assessment of TPS performance depends on accurate ablation models. New finite-rate gas-surface chemistry models are advancing state-of-the-art in TPS ablation modeling, but model reductions that omit chemical species and reactions may be necessary in some cases for computational tractability. This work develops hybrid physics-based and data-driven enrichments to improve the predictive capability and quantify uncertainties in such low-fidelity models while maintaining computational tractability. We focus on discrepancies in predicted carbon monoxide production that arise because the low-fidelity model tracks only a subset of reactions. To address this, we embed targeted enrichments into the low-fidelity model to capture the influence of omitted reactions. Numerical results show that the hybrid enrichments significantly improve predictive accuracy while requiring the addition of only three reactions.
Authors:Matthieu Mesnage, Sophie Villenave, Bertrand Massot, Matthieu Blanchard, Pierre Raimbaud, Guillaume Lavoué, Claudine Gehin
Abstract:
Nowadays, the majority of wearable thermal feedback systems designed for use in virtual reality applications are not compatible or not integrated with standard controllers and are based on temperature control. The objectives of the present work are to enable integration with existing controllers, in this case Valve Index controllers, and to propose an alternative approach to managing thermal stimulation with Peltier modules by controlling heat flow instead of temperature. We introduce StimulHeat as a wireless, low-power thermal feedback system, based on the continuous relationship between heat and current injection in a thermoelectric device (TED). First, we designed an optimized TED driver capable of injecting a continuous, bidirectional current into the TED, thereby driving it as a heater or cooler. Subsequently, this driver was implemented in an electronic board to include temperature and heat flow control loops, as well as a Bluetooth Low Energy interface for remote control. A mechanical integration was conducted, in the form of a controller extension which is non-intrusive and can be clipped to Valve Index controllers to enclose the TED, temperature sensors and electronics. Finally, we present a user study validating StimulHeat for use in Virtual Reality, utilizing a Unity-built virtual environment with our open-source package.
Authors:Riddhiman Raut, Evan M. Mihalko, Amrita Basak
Abstract:
This study presents the development of a domain-responsive edge-aware multiscale Graph Neural Network for predicting steady, turbulent flow and thermal behavior in a two-dimensional channel containing arbitrarily shaped complex pin-fin geometries. The training dataset was constructed through an automated framework that integrated geometry generation, meshing, and flow-field solutions in ANSYS Fluent. The pin-fin geometry was parameterized using piecewise cubic splines, producing 1,000 diverse configurations through Latin Hypercube Sampling. Each simulation was converted into a graph structure, where nodes carried a feature vector containing spatial coordinates, a normalized streamwise position, one-hot boundary indicators, and a signed distance to the nearest boundary, such as a wall. This graph structure served as input to the newly developed Graph Neural Network, which was trained to predict temperature, velocity magnitude, and pressure at each node using data from ANSYS. The network predicted fields with outstanding accuracy, capturing boundary layers, recirculation, and the stagnation region upstream of the pin-fins while reducing wall time by 2-3 orders of magnitude. In conclusion, the novel graph neural network offered a fast and reliable surrogate for simulations in complex flow configurations.
Authors:Paul M. Riechers, Thomas J. Elliott
Abstract:
To make sense of the world around us, we develop models, constructed to enable us to replicate, describe, and explain the behaviours we see. Focusing on the broad case of sequences of correlated random variables, i.e., classical stochastic processes, we tackle the question of determining whether or not two different models produce the same observable behavior. This is the problem of identifiability. Curiously, the physics of the model need not correspond to the physics of the observations; recent work has shown that it is even advantageous -- in terms of memory and thermal efficiency -- to employ quantum models to generate classical stochastic processes. We resolve the identifiability problem in this regime, providing a means to compare any two models of a classical process, be the models classical, quantum, or `post-quantum', by mapping them to a canonical `generalized' hidden Markov model. Further, this enables us to place (sometimes tight) bounds on the minimal dimension required of a quantum model to generate a given classical stochastic process.
Authors:Ferdinand Thein, Hendrik Ranocha
Abstract:
The ultra-relativistic Euler equations describe gases in the relativistic case when the thermal energy dominates. These equations for an ideal gas are given in terms of the pressure, the spatial part of the dimensionless four-velocity, and the particle density. Kunik et al. (2024, https://doi.org/10.1016/j.jcp.2024.113330) proposed genuine multi-dimensional benchmark problems for the ultra-relativistic Euler equations. In particular, they compared full two-dimensional discontinuous Galerkin simulations for radially symmetric problems with solutions computed using a specific one-dimensional scheme. Of particular interest in the solutions are the formation of shock waves and a pressure blow-up. In the present work we derive an entropy-stable flux for the ultra-relativistic Euler equations. To this end, we derive the main field (or entropy variables) and the corresponding potentials. We then present the entropy-stable flux and conclude with simulation results for different test cases both in 2D and in 3D.
Authors:Adam Suski, Elina Spyrou, Richard Green
Abstract:
The ability of deeply decarbonised power systems to ensure adequacy may increasingly depend on long-duration energy storage (LDES). A central challenge is whether capacity markets (CMs), originally designed around thermal generation, can provide efficient investment signals when storage becomes a central participant. While recent studies have advanced methods for accrediting variable renewables and short-duration storage, the effectiveness of these methods in CMs with substantial LDES penetration remains largely unexplored. To address this gap, we extend a two-stage stochastic equilibrium investment model by endogenising continuous, duration-based capacity accreditation for storage and apply it to a Great Britain-based case using 40 years of weather-driven demand and renewable profiles under varying emission limits. Results show that well-calibrated CMs can sustain near-efficient investment and mitigate revenue volatility, but their effectiveness diminishes in deeply decarbonized systems, underscoring both their potential and the regulatory challenges of supporting large-scale LDES.
Authors:Cyril Voyant, Milan Despotovic, Luis Garcia-Gutierrez, Mohammed Asloune, Yves-Marie Saint-Drenan, Jean-Laurent Duchaud, Ghjuvan Antone Faggianelli, Elena Magliaro
Abstract:
A novel methodology for short-term energy forecasting using an Extreme Learning Machine ($\mathtt{ELM}$) is proposed. Using six years of hourly data collected in Corsica (France) from multiple energy sources (solar, wind, hydro, thermal, bioenergy, and imported electricity), our approach predicts both individual energy outputs and total production (including imports, which closely follow energy demand, modulo losses) through a Multi-Input Multi-Output ($\mathtt{MIMO}$) architecture. To address non-stationarity and seasonal variability, sliding window techniques and cyclic time encoding are incorporated, enabling dynamic adaptation to fluctuations. The $\mathtt{ELM}$ model significantly outperforms persistence-based forecasting, particularly for solar and thermal energy, achieving an $\mathtt{nRMSE}$ of $17.9\%$ and $5.1\%$, respectively, with $\mathtt{R^2} > 0.98$ (1-hour horizon). The model maintains high accuracy up to five hours ahead, beyond which renewable energy sources become increasingly volatile. The $\mathtt{MIMO}$ architecture provides marginal gains over Single-Input Single-Output ($\mathtt{SISO}$) architectures and offers key advantages over deep learning methods such as $\mathtt{LSTM}$: it provides a closed-form solution with lower computational demands, making it well-suited for real-time applications, including online learning. Beyond predictive accuracy, the proposed methodology is adaptable to various contexts and datasets, as it can be tuned to local constraints such as resource availability, grid characteristics, and market structures.
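The two ingredients named above, cyclic time encoding and the closed-form ELM fit, can be sketched in a few lines. An ELM keeps random hidden-layer weights fixed and solves only for the output weights by regularized least squares. The toy targets, the 16 hidden units, the tanh activation, and the ridge value below are illustrative assumptions, not the configuration used in the paper.

```python
import math
import random

def cyclic_hour(hour):
    """Encode hour-of-day on the unit circle so that 23:00 and 00:00 are neighbors."""
    angle = 2.0 * math.pi * (hour % 24) / 24.0
    return [math.sin(angle), math.cos(angle)]

def solve(a, b):
    """Solve a X = b (a square, b a matrix) by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + rhs[:] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [[m[i][n + k] / m[i][i] for k in range(len(b[0]))] for i in range(n)]

def make_elm(n_in, n_hidden, seed=0):
    """Draw the fixed random hidden-layer weights and biases."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w, b

def hidden(x, w, b):
    return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def fit_elm(xs, ys, w, b, ridge=1e-6):
    """Closed-form output weights: (H^T H + ridge I)^{-1} H^T Y."""
    hs = [hidden(x, w, b) for x in xs]
    n_h, n_out = len(b), len(ys[0])
    gram = [[sum(h[i] * h[j] for h in hs) + (ridge if i == j else 0.0)
             for j in range(n_h)] for i in range(n_h)]
    rhs = [[sum(h[i] * y[k] for h, y in zip(hs, ys)) for k in range(n_out)]
           for i in range(n_h)]
    return solve(gram, rhs)

def predict(x, w, b, beta):
    h = hidden(x, w, b)
    return [sum(h[i] * beta[i][k] for i in range(len(h))) for k in range(len(beta[0]))]

# Toy multi-output data: two smooth daily profiles (hypothetical, not Corsican data).
hours = list(range(24)) * 3
xs = [cyclic_hour(h) for h in hours]
ys = [[1.0 + 0.8 * x[0], 2.0 - 0.5 * x[1]] for x in xs]
w, b = make_elm(n_in=2, n_hidden=16)
beta = fit_elm(xs, ys, w, b)
errors = [p - t for x, y in zip(xs, ys) for p, t in zip(predict(x, w, b, beta), y)]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
```

Since training reduces to one linear solve, refitting on a sliding window of recent data is cheap, which is what makes this kind of model attractive for the real-time and online settings the abstract mentions.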
Authors:N. Marrani, T. Hageman, E. Martínez-Pañeda
Abstract:
The hydrogen trapping behaviour of metallic alloys is generally characterised using Thermal Desorption Spectroscopy (TDS). However, as an indirect method, extracting key parameters (trap binding energies and densities) remains a significant challenge. To address these limitations, this work introduces a machine learning-based scheme for parameter identification from TDS spectra. A multi-Neural Network (NN) model is developed and trained exclusively on synthetic data to predict trapping parameters directly from experimental data. The model comprises two multi-layer, fully connected, feed-forward NNs trained with backpropagation. The first network (classification model) predicts the number of distinct trap types. The second network (regression model) then predicts the corresponding trap densities and binding energies. The NN architectures, hyperparameters, and data pre-processing were optimised to minimise the amount of training data. The proposed model demonstrated strong predictive capabilities when applied to three tempered martensitic steels of different compositions. The code developed is freely provided.
Authors:Pallock Halder, Satyajit Mojumder
Abstract:
Modern engineering systems are increasingly equipped with sensors for real-time monitoring and decision-making. However, the data collected by these sensors is often noisy and difficult to interpret, limiting its utility for control and diagnostics. In this work, we propose a physics-informed denoising framework that integrates an energy-based model and Fisher score regularization to jointly reduce data noise and enforce physical consistency with a physics-based model. The approach is first validated on benchmark problems, including the simple harmonic oscillator, Burgers' equation, and Laplace's equation, across varying noise levels. We then apply the denoising framework to real thermal emission data from laser powder bed fusion (LPBF) additive manufacturing experiments, using a trained Physics-Informed Neural Network (PINN) surrogate model of the LPBF process to guide denoising. Results show that the proposed method outperforms baseline neural network denoisers, effectively reducing noise under a range of LPBF processing conditions. This physics-guided denoising strategy enables robust, real-time interpretation of low-cost sensor data, facilitating predictive control and improved defect mitigation in additive manufacturing.
Authors:Pegah GhafGhanbari, Mircea Lazar, Javad Mohammadpour Velni
Abstract:
Cold Atmospheric Pressure Plasma Jets (APPJs) show significant potential for biomedical applications, but their inherent complexity, characterized by nonlinear dynamics and strong sensitivity to operating conditions like tip-to-surface distance, presents considerable challenges for achieving robust and reliable real-time control. To address these issues, this paper presents the Neural Parameter-Varying Data-enabled Predictive Control (NPV-DeePC) framework. By integrating hyper neural networks (hypernets) into the neural Data-enabled Predictive Control (DeePC) paradigm, the proposed method adaptively captures system nonlinearities and parameter variations, updates the neural feature space accordingly, and enables efficient and accurate trajectory prediction and control. The NPV-DeePC framework is validated through extensive simulations involving surface temperature tracking and thermal dose delivery. The results highlight its ability to outperform existing controllers in terms of accuracy and adaptability. The computational efficiency of the NPV-DeePC approach makes it a viable candidate for real-time applications. These findings underscore its potential to advance the safe and precise control of APPJs and provide a scalable solution for other parameter-varying nonlinear systems.
Authors:Mohammad Jahanbakht, Alex Olsen, Ross Marchant, Emilie Fillols, Mostafa Rahimi Azghadi
Abstract:
Weed mapping plays a critical role in precision management by providing accurate and timely data on weed distribution, enabling targeted control and reduced herbicide use. This minimizes environmental impacts, supports sustainable land management, and improves outcomes across agricultural and natural environments. Recent advances in weed mapping leverage ground-vehicle Red Green Blue (RGB) cameras, satellite and drone-based remote sensing combined with sensors such as spectral, Near Infra-Red (NIR), and thermal cameras. The resulting data are processed using advanced techniques including big data analytics and machine learning, significantly improving the spatial and temporal resolution of weed maps and enabling site-specific management decisions. Despite a growing body of research in this domain, there is a lack of comprehensive literature reviews specifically focused on weed mapping. In particular, the absence of a structured analysis spanning the entire mapping pipeline, from data acquisition to processing techniques and mapping tools, limits progress in the field. This review addresses these gaps by systematically examining state-of-the-art methods in data acquisition (sensor and platform technologies), data processing (including annotation and modelling), and mapping techniques (such as spatiotemporal analysis and decision support tools). Following PRISMA guidelines, we critically evaluate and synthesize key findings from the literature to provide a holistic understanding of the weed mapping landscape. This review serves as a foundational reference to guide future research and support the development of efficient, scalable, and sustainable weed management systems.
Authors:Hang-Cheng Dong, Lu Zou, Bingguo Liu, Dong Ye, Guodong Liu
Abstract:
Surface defect detection plays a critical role in industrial quality inspection. Recent advances in artificial intelligence have significantly enhanced the automation level of detection processes. However, conventional semantic segmentation and object detection models heavily rely on large-scale annotated datasets, which conflicts with the practical requirements of defect detection tasks. This paper proposes a novel weakly supervised semantic segmentation framework comprising two key components: a region-aware class activation map (CAM) and pseudo-label training. To address the limitations of existing CAM methods, especially low-resolution thermal maps and insufficient detail preservation, we introduce filtering-guided backpropagation (FGBP), which refines target regions by filtering gradient magnitudes to identify areas with higher relevance to defects. Building upon this, we further develop a region-aware weighted module to enhance spatial precision. Finally, pseudo-label segmentation is implemented to refine the model's performance iteratively. Comprehensive experiments on industrial defect datasets demonstrate the superiority of our method. The proposed framework effectively bridges the gap between weakly supervised learning and high-precision defect segmentation, offering a practical solution for resource-constrained industrial scenarios.
Authors:Shruti Bansal, Wenshan Wang, Yifei Liu, Parv Maheshwari
Abstract:
Autonomous systems rely on sensors to estimate the environment around them. However, cameras, LiDARs, and RADARs have their own limitations. In nighttime or degraded environments such as fog, mist, or dust, thermal cameras can provide valuable information regarding the presence of objects of interest due to their heat signature. They make it easy to identify humans and vehicles that are usually at higher temperatures compared to their surroundings. In this paper, we focus on the adaptation of thermal cameras for robotics and automation, where the biggest hurdle is the lack of data. Several multi-modal datasets are available for driving robotics research in tasks such as scene segmentation, object detection, and depth estimation, which are the cornerstone of autonomous systems. However, they are found to be lacking in thermal imagery. Our paper proposes a solution to augment these datasets with synthetic thermal data to enable widespread and rapid adaptation of thermal cameras. We explore the use of conditional diffusion models to convert existing RGB images to thermal images using self-attention to learn the thermal properties of real-world objects.
Authors:R. Sharma, M. Raissi, Y. B. Guo
Abstract:
Efficient simulation of Laser Powder Bed Fusion (LPBF) is crucial for process prediction due to the long-standing issue of high computation cost using traditional numerical methods such as finite element analysis (FEA). This study presents an efficient modeling framework termed FEA-Regulated Physics-Informed Neural Network (FEA-PINN) to accelerate the thermal field prediction in an LPBF process while maintaining the FEA accuracy. A novel dynamic material updating strategy is developed to capture the dynamic phase change of powder-liquid-solid in the PINN model. The PINN model incorporates temperature-dependent material properties and phase change behavior using the apparent heat capacity method. While the PINN model demonstrates high accuracy with a small amount of training data and enables generalization to new process parameters via transfer learning, it faces the challenge of high computation cost in time-dependent problems due to the residual accumulation. To overcome this issue, the FEA-PINN framework integrates corrective FEA simulations during inference to enforce physical consistency and reduce error drift. A comparative analysis shows that FEA-PINN achieves equivalent accuracy to FEA while significantly reducing computational cost. The framework has been validated using the benchmark FEA data and demonstrated through single-track scanning in LPBF.
Authors:Yu Liu, Yangtao Meng, Xianfei Pan, Jie Jiang, Changhao Chen
Abstract:
Thermal cameras capture environmental data through heat emission, a fundamentally different mechanism compared to visible light cameras, which rely on pinhole imaging. As a result, traditional visual relocalization methods designed for visible light images are not directly applicable to thermal images. Despite significant advancements in deep learning for camera relocalization, approaches specifically tailored for thermal camera-based relocalization remain underexplored. To address this gap, we introduce ThermalLoc, a novel end-to-end deep learning method for thermal image relocalization. ThermalLoc effectively extracts both local and global features from thermal images by integrating EfficientNet with Transformers, and performs absolute pose regression using two MLP networks. We evaluated ThermalLoc on both the publicly available thermal-odometry dataset and our own dataset. The results demonstrate that ThermalLoc outperforms existing representative methods employed for thermal camera relocalization, including AtLoc, MapNet, PoseNet, and RobustLoc, achieving superior accuracy and robustness.
Authors:Jiawen Li, Jiang Guo, Yuanzhe Li, Zetian Mao, Jiaxing Shen, Tashi Xu, Diptesh Das, Jinming He, Run Hu, Yaerim Lee, Koji Tsuda, Junichiro Shiomi
Abstract:
Metamaterials are artificially engineered structures that manipulate electromagnetic waves, having optical properties absent in natural materials. Recently, machine learning for the inverse design of metamaterials has drawn attention. However, the highly nonlinear relationship between the metamaterial structures and optical behaviour, coupled with fabrication difficulties, poses challenges for using machine learning to design and manufacture complex metamaterials. Herein, we propose a general framework that implements customised spectrum-to-shape and size parameters to address one-to-many metamaterial inverse design problems using conditional diffusion models. Our method exhibits superior spectral prediction accuracy, generates a diverse range of patterns compared to other typical generative models, and offers valuable prior knowledge for manufacturing through the subsequent analysis of the diverse generated results, thereby facilitating the experimental fabrication of metamaterial designs. We demonstrate the efficacy of the proposed method by successfully designing and fabricating a free-form metamaterial with a tailored selective emission spectrum for thermal camouflage applications.
Authors:Chao Tian, Chao Yang, Guoqing Zhu, Qiang Wang, Zhenyu He
Abstract:
RGB-Thermal (RGB-T) object detection utilizes thermal infrared (TIR) images to complement RGB data, improving robustness in challenging conditions. Traditional RGB-T detectors assume balanced training data, where both modalities contribute equally. However, in real-world scenarios, modality degradation, due to environmental factors or technical issues, can lead to extreme modality imbalance, causing out-of-distribution (OOD) issues during testing and disrupting model convergence during training. This paper addresses these challenges by proposing a novel base-and-auxiliary detector architecture. We introduce a modality interaction module to adaptively weigh modalities based on their quality and handle imbalanced samples effectively. Additionally, we leverage modality pseudo-degradation to simulate real-world imbalances in training data. The base detector, trained on high-quality pairs, provides a consistency constraint for the auxiliary detector, which receives degraded samples. This framework enhances model robustness, ensuring reliable performance even under severe modality degradation. Experimental results demonstrate the effectiveness of our method in handling extreme modality imbalances (decreasing the Missing Rate by 55%) and improving performance across various baseline detectors.
Authors:Myeongseok Nam, Wongi Park, Minsol Kim, Hyejin Hur, Soomok Lee
Abstract:
Recently, 3D Gaussian Splatting (3D-GS) based on Thermal Infrared (TIR) imaging has gained attention in novel-view synthesis, showing real-time rendering. However, novel-view synthesis with thermal infrared images suffers from transmission effects, emissivity, and low resolution, leading to floaters and blur effects in rendered images. To address these problems, we introduce Veta-GS, which leverages a view-dependent deformation field and a Thermal Feature Extractor (TFE) to precisely capture subtle thermal variations and maintain robustness. Specifically, we design a view-dependent deformation field that leverages camera position and viewing direction to capture thermal variations. Furthermore, we introduce the Thermal Feature Extractor (TFE) and MonoSSIM loss, which consider appearance, edge, and frequency to maintain robustness. Extensive experiments on the TI-NSD benchmark show that our method achieves better performance than existing methods.
Authors:Michael Roop, Sagy Ephrati
Abstract:
We derive the global model of thermal quasi-geostrophy on the sphere via asymptotic expansion of the thermal rotating shallow water equations. The model does not rely on the asymptotic expansion of the Coriolis force and extends the quasi-geostrophic model on the sphere by including an additional transported buoyancy field acting as a source term for the potential vorticity. We give its Hamiltonian description in terms of semidirect product Lie-Poisson brackets. The Hamiltonian formulation reveals the existence of an infinite number of conservation laws, Casimirs, parameterized by two arbitrary smooth functions. A structure-preserving discretization is provided based on Zeitlin's self-consistent matrix approximation for hydrodynamics. A Casimir-preserving time integrator is employed to numerically fully preserve the resulting finite-dimensional Lie-Poisson structure. Simulations reveal the formation of vorticity and buoyancy fronts, and large-scale structures in the buoyancy dynamics induced by the buoyancy-bathymetry interaction.
Authors:Mingquan Feng, Yixin Huang, Yifan Fu, Shaobo Wang, Junchi Yan
Abstract:
The design of optimization algorithms for neural networks remains a critical challenge, with most existing methods relying on heuristic adaptations of gradient-based approaches. This paper introduces KO (Kinetics-inspired Optimizer), a novel neural optimizer inspired by kinetic theory and partial differential equation (PDE) simulations. We reimagine the training dynamics of network parameters as the evolution of a particle system governed by kinetic principles, where parameter updates are simulated via a numerical scheme for the Boltzmann transport equation (BTE) that models stochastic particle collisions. This physics-driven approach inherently promotes parameter diversity during optimization, mitigating the phenomenon of parameter condensation, i.e. collapse of network parameters into low-dimensional subspaces, through mechanisms analogous to thermal diffusion in physical systems. We analyze this property, establishing both a mathematical proof and a physical interpretation. Extensive experiments on image classification (CIFAR-10/100, ImageNet) and text classification (IMDB, Snips) tasks demonstrate that KO consistently outperforms baseline optimizers (e.g., Adam, SGD), achieving accuracy improvements while computation cost remains comparable.
Authors:Yi Zhang, Nikolaos Farmakidis, Ioannis Roumpos, Miltiadis Moralis-Pegios, Apostolos Tsakyridis, June Sang Lee, Bowei Dong, Yuhan He, Samarth Aggarwal, Nikolaos Pleros, Harish Bhaskaran
Abstract:
Optical computing systems deliver unrivalled processing speeds for scalar operations. Yet, integrated implementations have been constrained to low-dimensional tensor operations that fall short of the vector dimensions required for modern artificial intelligence. We demonstrate an all-optical neuromorphic computing system based on time division multiplexing, capable of processing input vectors exceeding 250,000 elements within a unified framework. The platform harnesses optically driven thermo-optic modulation in standing wave optical fields, with titanium nano-antennas functioning as wavelength-selective absorbers. Counterintuitively, the thermal time dynamics of the system enable simultaneous time integration of ultra-fast (50 GHz) signals and the application of programmable, non-linear activation functions, entirely within the optical domain. This unified framework constitutes a leap towards large-scale photonic computing that satisfies the dimensional requirements of AI workloads.
Authors:Adrian Esser, Chiara Basla, Peter Wolf, Robert Riener
Abstract:
Exosuits have recently been developed as alternatives to rigid exoskeletons and are increasingly adopted for both upper and lower limb therapy and assistance in clinical and home environments. Many cable-driven exosuits have been developed but little has been published on their electromechanical designs and performance. Therefore, this paper presents a comprehensive design and performance analysis of a two degree of freedom tendon driver unit (TDU) for cable-driven wearable exosuits. Detailed methodologies are presented to benchmark the functionality of the TDU. A static torque output test compares the commanded and measured torques. A velocity control test evaluates the attenuation and phase shift across velocities. A noise test evaluates how loud the TDU is for the wearer under different speeds. A thermal stress test captures the cooling performance of the TDU to ensure safe operation at higher loads. Finally, a battery endurance test evaluates the runtime of the TDU under various loading conditions to inform the usable time. To demonstrate these tests, a modular TDU system for cable-driven applications is introduced, which allows components such as motors, pulleys, and sensors to be adapted based on the requirements of the intended application. By sharing detailed methodologies and performance results, this study aims to provide a TDU design that may be leveraged by others and resources for researchers and engineers to better document the capabilities of their TDU designs.
Authors:Tarik Sahin, Jacopo Bonari, Sebastian Brandstaeter, Alexander Popp
Abstract:
The effective contact area in rough surface contact plays a critical role in multi-physics phenomena such as wear, sealing, and thermal or electrical conduction. Although accurate numerical methods, like the Boundary Element Method (BEM), are available to compute this quantity, their high computational cost limits their applicability in multi-query contexts, such as uncertainty quantification, parameter identification, and multi-scale algorithms, where many repeated evaluations are required. This study proposes a surrogate modeling framework for predicting the effective contact area using fast-to-evaluate data-driven techniques. Various machine learning algorithms are trained on a precomputed dataset, where the inputs are the imposed load and statistical roughness parameters, and the output is the corresponding effective contact area. All models undergo hyperparameter optimization to enable fair comparisons in terms of predictive accuracy and computational efficiency, evaluated using established quantitative metrics. Among the models, the Kernel Ridge Regressor demonstrates the best trade-off between accuracy and efficiency, achieving high predictive accuracy, low prediction time, and minimal training overhead, making it a strong candidate for general-purpose surrogate modeling. The Gaussian Process Regressor provides an attractive alternative when uncertainty quantification is required, although it incurs additional computational cost due to variance estimation. The generalization capability of the Kernel Ridge model is validated on an unseen simulation scenario, confirming its ability to transfer to new configurations. Database generation constitutes the dominant cost in the surrogate modeling process. Nevertheless, the approach proves practical and efficient for multi-query tasks, even when accounting for this initial expense.
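To illustrate the kind of surrogate described above, the following is a minimal from-scratch sketch of kernel ridge regression with a Gaussian (RBF) kernel. It is not the authors' pipeline: the function names, the hyperparameters `alpha` and `gamma`, and the toy training data (load and roughness as inputs, contact-area fraction as output) are all assumptions made for illustration.

```python
import math
from typing import List


def rbf_kernel(a: List[float], b: List[float], gamma: float) -> float:
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        residual = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = residual / M[i][i]
    return x


def fit_kernel_ridge(X, y, alpha=1e-6, gamma=50.0):
    """Return dual coefficients c solving (K + alpha * I) c = y."""
    K = [[rbf_kernel(xi, xj, gamma) + (alpha if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    return solve_linear(K, y)


def predict(X_train, coeffs, x_new, gamma=50.0):
    """Surrogate prediction: weighted sum of kernels to the training points."""
    return sum(c * rbf_kernel(xi, x_new, gamma) for xi, c in zip(X_train, coeffs))


# Toy, made-up data: [imposed load, roughness parameter] -> contact-area fraction.
X_train = [[0.1, 0.5], [0.2, 0.5], [0.3, 0.5], [0.4, 0.5]]
y_train = [0.05, 0.10, 0.15, 0.20]
coeffs = fit_kernel_ridge(X_train, y_train)
estimate = predict(X_train, coeffs, [0.25, 0.5])
```

Once the coefficients are fitted, each prediction costs only one kernel evaluation per training point, which is what makes such a surrogate attractive when many repeated evaluations are needed.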
Authors:Zihao Gong, Saikat Guha
Abstract:
Quick detection of transmittance changes in an optical channel is crucial for secure communication. We demonstrate that pre-shared entanglement using two-mode squeezed vacuum states significantly reduces detection latency compared to classical and entanglement-augmented coherent-state probes. The change detection latency is inversely proportional to the quantum relative entropy (QRE), which goes to infinity in the absence of thermal noise, suggesting idealized instantaneous detection. However, in realistic scenarios, we show that QRE scales logarithmically with the inverse of the thermal noise mean photon number. We propose a receiver that achieves this scaling and quantify its performance gains over existing methods. Additionally, we explore the fundamental trade-off between communication capacity and change detection latency, highlighting how pre-shared entanglement enhances both.
Authors:Alexander Winkler, Pranav Shah, Katrin Baumgärtner, Vasu Sharma, David Gordon, Jakob Andert
Abstract:
This study presents a novel state estimation approach integrating Deep Neural Networks (DNNs) into Moving Horizon Estimation (MHE). This is a shift from using traditional physics-based models within MHE towards data-driven techniques. Specifically, a Long Short-Term Memory (LSTM)-based DNN is trained using synthetic data derived from a high-fidelity thermal model of a Permanent Magnet Synchronous Machine (PMSM), applied within a thermal derating torque control strategy for battery electric vehicles. The trained DNN is directly embedded within an MHE formulation, forming a discrete-time nonlinear optimal control problem (OCP) solved via the acados optimization framework. Model-in-the-Loop simulations demonstrate accurate temperature estimation even under noisy sensor conditions and simulated sensor failures. Real-time implementation on embedded hardware confirms practical feasibility, achieving computational performance exceeding real-time requirements threefold. By integrating the learned LSTM-based dynamics directly into MHE, this work achieves state estimation accuracy, robustness, and adaptability while reducing modeling efforts and complexity. Overall, the results highlight the effectiveness of combining model-based and data-driven methods in safety-critical automotive control systems.
Authors:Abigail R. Hering, Mansha Dubey, Elahe Hosseini, Meghna Srivastava, Yu An, Juan-Pablo Correa-Baena, Houman Homayoun, Marina S. Leite
Abstract:
Halide perovskites exhibit unpredictable properties in response to environmental stressors, due to several composition-dependent degradation mechanisms. In this work, we apply data visualization and machine learning (ML) techniques to reveal unexpected correlations between composition, temperature, and material properties while using high throughput, in situ environmental photoluminescence (PL) experiments. Correlation heatmaps show the strong influence of Cs content on film degradation, and dimensionality reduction visualization methods uncover clear composition-based data clusters. An extreme gradient boosting algorithm (XGBoost) effectively forecasts PL features for ten perovskite films with both composition-agnostic (>85% accuracy) and composition-dependent (>75% accuracy) model approaches, while elucidating the relative feature importance of composition (up to 99%). This model validates a previously unseen anti-correlation between Cs content and material thermal stability. Our ML-based framework can be expanded to any perovskite family, significantly reducing the analysis time currently employed to identify stable options for photovoltaics.
Authors:Mehmet Basaran, Frederik Rogiers, Martine Baelmans, Maarten Blommaert
Abstract:
With advancements in additive manufacturing (AM) capabilities, new opportunities arise to design compact heat exchangers (cHEXs) that leverage AM's degrees of freedom (DOFs) to enhance energy and material efficiency. However, excessive size reduction in counterflow cHEXs can compromise effectiveness due to axial heat conduction through the solid material, influenced by thermal conductivity and wall thickness. This study investigates how AM material selection and thin-wall production limitations might constrain the core size of counterflow plate heat exchangers when targeting maximum power density. An optimization framework evaluates power densities for six materials: plastic, austenitic steel, Al2O3, AlN, aluminum, and copper. Evaluations are conducted under constant effectiveness and pressure drop while accounting for AM-specific plate thickness limits and a lower bound on plate spacing to address fouling. Across all scenarios, copper cHEXs exhibit the lowest power density, despite high thermal conductivity. Without constraints on plate thickness and spacing, the optimal plastic cHEX achieves a power density 1800x greater than the steel baseline, while copper decreases by a factor of 0.98. With equal plate thickness of 0.5 mm for all materials, plastic retains the highest power density, 12.2x more than copper. Introducing a fouling constraint of 0.8 mm plate spacing shifts the optimal material to austenitic steel. When material-specific plate thicknesses are considered, the plastic cHEX achieves the highest power density, five times greater than copper, due to superior thin-wall resolution. This study highlights the impact of AM constraints on the energy and material efficiency of cHEXs, and shows that low-conductivity materials like plastic or austenitic steel can outperform high-conductivity materials like copper in compact designs.
Authors:Mohammad Shadman Hashem, Ahsan Raza, Sama E Shan, Seokhee Jeon
Abstract:
A wide range of haptic feedback is crucial for achieving high realism and immersion in virtual environments. Therefore, a multi-modal haptic interface that provides various haptic signals simultaneously is highly beneficial. This paper introduces a novel silicone fingertip actuator that is pneumatically actuated, delivering a realistic and effective haptic experience by simultaneously providing pressure, vibrotactile, and cold thermal feedback. The actuator features a design with multiple air chambers, each with controllable volume achieved through pneumatic valves connected to compressed air tanks. The lower air chamber generates pressure feedback, while the upper chamber produces vibrotactile feedback. In addition, two integrated lateral air nozzles create a cold thermal sensation. To showcase the system's capabilities, we designed two unique 3D surfaces in the virtual environment: a frozen meat surface and an abrasive icy surface. These surfaces simulate tactile perceptions of coldness, pressure, and texture. Comprehensive performance assessments and user studies were conducted to validate the actuator's effectiveness, highlighting its diverse feedback capabilities compared to traditional actuators that offer only single feedback modalities.
Authors:Isaac Corley, Conor Wallace, Sourav Agrawal, Burton Putrah, Jonathan Lwowski
Abstract:
Solar photovoltaic (PV) farms represent a major source of global renewable energy generation, yet their true operational efficiency often remains unknown at scale. In this paper, we present a comprehensive, data-driven framework for large-scale airborne infrared inspection of North American solar installations. Leveraging high-resolution thermal imagery, we construct and curate a geographically diverse dataset encompassing thousands of PV sites, enabling machine learning-based detection and localization of defects that are not detectable in the visible spectrum. Our pipeline integrates advanced image processing, georeferencing, and airborne thermal infrared anomaly detection to provide rigorous estimates of performance losses. We highlight practical considerations in aerial data collection, annotation methodologies, and model deployment across a wide range of environmental and operational conditions. Our work delivers new insights into the reliability of large-scale solar assets and serves as a foundation for ongoing research on performance trends, predictive maintenance, and scalable analytics in the renewable energy sector.
Authors:Peiyi Chen, Irene M. Gamba, Qin Li, Li Wang
Abstract:
For nano-materials, heat conductivity is an ill-defined concept. This classical concept assumes the validity of Fourier's law, which states that the heat flux is proportional to the temperature gradient, with heat conductivity denoting the ratio. However, this macroscopic constitutive relation breaks down at nano-scales. Instead, heat propagation is described by the phonon transport equation, an ab initio model derived from first principles. In this equation, a material's thermal property is encoded in a coefficient termed the relaxation time ($\tau$). In this paper, we study an inverse problem: using a material's temperature response to heat injection to infer the relaxation time. This inverse problem is formulated as a PDE-constrained optimization and solved numerically by the Stochastic Gradient Descent (SGD) method and its variants. In the execution of SGD, the Fréchet derivative is computed and Lipschitz continuity is proved. This approach, in comparison to earlier studies, honors the nano-structure of heat conductivity in a nano-material, and we numerically verify the breakdown of Fourier's law.
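For reference, Fourier's law as described in this abstract can be written in its standard form. The symbols are chosen here and are not taken from the paper: $\mathbf{q}$ is the heat flux, $\kappa$ the heat conductivity, and $T$ the temperature.

```latex
\mathbf{q} = -\kappa \, \nabla T
```

The minus sign expresses that heat flows from hot to cold, that is, against the temperature gradient.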
Authors:Saeed Asadi, Mohsen Mohammadagha, Hajar Kazemi Naeini
Abstract:
In recent decades, Earth-to-Air Heat Exchangers (EAHEs), also known as underground air ducts, have garnered significant attention for their ability to provide energy-efficient cooling and heating solutions while maintaining a minimal environmental footprint. These systems leverage the relatively stable underground temperature to regulate indoor climates, reducing reliance on conventional heating, ventilation, and air conditioning (HVAC) systems. This review systematically categorizes and synthesizes research on EAHEs into three primary areas: analytical, numerical, and exergoeconomic studies. Analytical approaches focus on developing theoretical models to predict thermal performance, while numerical simulations provide insights into system optimization and real-world applications. Exergoeconomic analyses, integrating thermodynamic efficiency with economic considerations, offer valuable perspectives on cost-effectiveness and long-term viability. By consolidating existing contributions across these domains, this study serves as a comprehensive reference for researchers, engineers, and policymakers seeking to enhance the design, implementation, and performance of EAHE systems. The findings emphasize the pivotal role of EAHEs in reducing energy consumption, lowering greenhouse gas emissions, and improving economic sustainability. Additionally, this review identifies key challenges, including soil thermal conductivity variations, moisture effects, and system integration with renewable energy sources, which require further investigation. By addressing these challenges, EAHEs can be further optimized to serve as a cornerstone in sustainable energy management, contributing to global efforts toward energy-efficient building solutions and climate change mitigation.
Authors:F. Bagagiolo, E. Bertolazzi, L. Marzufero, A. Pegoretti, D. Rigotti
Abstract:
This paper presents research conducted at the University of Trento addressing an industrial challenge from Fater S.p.A. regarding the thermal bonding of non-woven fabrics for diaper production. The problem is to analyze the behavior of the bonding process of a non-woven fabric. In particular, bonding is achieved not with glue but solely by pressing two fiber webs between two high-velocity steel rollers. The research comprised the formulation and the theoretical and numerical analysis of analytical, mechanical, and thermal models for the stress-strain behavior of the non-woven fabric's fibers and for the bonding process with heating effects.
Authors:Zhong Zheng, Seyfal Sultanov, Michael E. Papka, Zhiling Lan
Abstract:
High-performance computing (HPC) systems are essential for scientific discovery and engineering innovation. However, their growing power demands pose significant challenges, particularly as systems scale to the exascale level. Prior uncore frequency tuning studies have primarily focused on conventional HPC workloads running on homogeneous systems. As HPC advances toward heterogeneous computing, integrating diverse GPU workloads on heterogeneous CPU-GPU systems, it is crucial to revisit and enhance uncore scaling. Our investigation reveals that uncore frequency scales down only when CPU power approaches its TDP (Thermal Design Power), an uncommon scenario in GPU-dominant applications, resulting in unnecessary power waste in modern heterogeneous computing systems. To address this, we present MAGUS, a user-transparent uncore frequency scaling runtime for heterogeneous computing. Effective uncore tuning is inherently complex, requiring dynamic detection of application execution phases that affect uncore utilization. Moreover, any robust strategy must work across a diverse range of applications, each with unique behaviors and resource requirements. Finally, an efficient runtime should introduce minimal overhead. We incorporate several key techniques in the design of MAGUS, including monitoring and predicting memory throughput, managing frequent phase transitions, and leveraging vendor-supplied power management support. We evaluate MAGUS using a diverse set of GPU benchmarks and applications across multiple heterogeneous systems with different CPU and GPU architectures. The experimental results show that MAGUS achieves up to 27% energy savings and 26% energy-delay product (EDP) reduction compared to the default settings while maintaining a performance loss below 5% and an overhead under 1%.
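As a rough illustration of the energy-delay product (EDP) metric reported above, the sketch below computes EDP as energy multiplied by runtime and the fractional reduction against a baseline. The function names and all numbers are hypothetical; they are not taken from the MAGUS evaluation.

```python
def energy_delay_product(energy_joules: float, runtime_seconds: float) -> float:
    """EDP = energy consumed * execution time (lower is better)."""
    return energy_joules * runtime_seconds


def relative_reduction(baseline: float, tuned: float) -> float:
    """Fractional reduction of `tuned` relative to `baseline`."""
    return (baseline - tuned) / baseline


# Hypothetical measurements, for illustration only: the tuned run uses less
# energy but takes slightly longer than the default configuration.
baseline_edp = energy_delay_product(energy_joules=1000.0, runtime_seconds=10.0)
tuned_edp = energy_delay_product(energy_joules=800.0, runtime_seconds=10.4)
edp_saving = relative_reduction(baseline_edp, tuned_edp)
```

Because EDP multiplies energy by time, a small slowdown partly offsets an energy saving, which is why energy savings and EDP reduction are reported as separate figures.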
Authors:Sirui Li, Federica Bragone, Matthieu Barreau, Tor Laneryd, Kateryna Morozovska
Abstract:
Our work aims at simulating and predicting the temperature conditions inside a power transformer using Physics-Informed Neural Networks (PINNs). The predictions obtained are then used to determine the optimal placement for temperature sensors inside the transformer under the constraint of a limited number of sensors, enabling efficient performance monitoring. The method consists of combining PINNs with Mixed Integer Optimization Programming to obtain the optimal temperature reconstruction inside the transformer. First, we extend our PINN model for the thermal modeling of power transformers to solve the heat diffusion equation from 1D to 2D space. Finally, we construct an optimal sensor placement model inside the transformer that can be applied to problems in 1D and 2D.
Authors:Julian Bedei, Lucas Koch, Kevin Badalian, Alexander Winkler, Patrick Schaber, Jakob Andert
Abstract:
This work introduces a toolchain for applying Reinforcement Learning (RL), specifically the Deep Deterministic Policy Gradient (DDPG) algorithm, in safety-critical real-world environments. As an exemplary application, transient load control is demonstrated on a single-cylinder internal combustion engine testbench in Homogeneous Charge Compression Ignition (HCCI) mode, which offers high thermal efficiency and low emissions. However, HCCI poses challenges for traditional control methods due to its nonlinear, autoregressive, and stochastic nature. RL provides a viable solution; however, safety concerns, such as excessive pressure rise rates, must be addressed when applying it to HCCI. A single unsuitable control input can severely damage the engine or cause misfiring and shutdown. Additionally, operating limits are not known a priori and must be determined experimentally. To mitigate these risks, real-time safety monitoring based on the k-nearest neighbor algorithm is implemented, enabling safe interaction with the testbench. The feasibility of this approach is demonstrated as the RL agent learns a control policy through interaction with the testbench. A root mean square error of 0.1374 bar is achieved for the indicated mean effective pressure, comparable to neural network-based controllers from the literature. The toolchain's flexibility is further demonstrated by adapting the agent's policy to increase ethanol energy shares, promoting renewable fuel use while maintaining safety. This RL approach addresses the longstanding challenge of applying RL to safety-critical real-world environments. The developed toolchain, with its adaptability and safety mechanisms, paves the way for future applicability of RL in engine testbenches and other safety-critical settings.
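The abstract does not say how the k-nearest neighbor safety monitor is built. The sketch below shows one possible conservative variant, under the assumption that past control inputs have been labelled safe or unsafe: a proposed input is accepted only if all of its k nearest recorded neighbours were safe. The names `Sample`, `is_action_safe`, and `history`, the acceptance rule, and the data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], bool]  # (control input, was it safe?)


def is_action_safe(candidate: Sequence[float], history: List[Sample], k: int = 3) -> bool:
    """Conservative k-NN check: accept only if all k nearest past inputs were safe."""
    if len(history) < k:
        return False  # too little evidence: reject by default
    nearest = sorted(history, key=lambda s: math.dist(candidate, s[0]))[:k]
    return all(safe for _, safe in nearest)


# Hypothetical 2-D control inputs with made-up safety labels.
history = [
    ((0.10, 0.20), True),
    ((0.15, 0.25), True),
    ((0.20, 0.20), True),
    ((0.80, 0.90), False),
    ((0.85, 0.95), False),
]

accept_near_safe_region = is_action_safe((0.12, 0.22), history)
accept_near_unsafe_region = is_action_safe((0.82, 0.90), history)
```

Requiring unanimity among the neighbours, and rejecting when evidence is scarce, trades exploration speed for caution; a majority vote would be a less conservative alternative.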
Authors:Suman Itani, Yibo Zhang, Jiadong Zang
Abstract:
Thermoelectric materials provide a sustainable way to convert waste heat into electricity. However, data-driven discovery and optimization of these materials are challenging because of the lack of a reliable database. Here we developed a comprehensive database of 7,123 thermoelectric compounds, containing key information such as chemical composition, structural detail, Seebeck coefficient, electrical and thermal conductivity, power factor, and figure of merit (ZT). We used the GPTArticleExtractor workflow, powered by large language models (LLMs), to extract and curate data automatically from the scientific literature published in Elsevier journals. This process enabled the creation of a structured database that addresses the challenges of manual data collection. The open access database could stimulate data-driven research and advance thermoelectric material analysis and discovery.
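The quantities listed in this abstract are linked by the standard definitions of the power factor, PF = S^2 * sigma, and the dimensionless figure of merit, ZT = S^2 * sigma * T / kappa. The sketch below evaluates both; the function names and the sample property values are hypothetical and are not drawn from the database.

```python
def power_factor(seebeck_v_per_k: float, elec_cond_s_per_m: float) -> float:
    """Power factor PF = S^2 * sigma, in W/(m*K^2)."""
    return seebeck_v_per_k ** 2 * elec_cond_s_per_m


def figure_of_merit(seebeck_v_per_k: float, elec_cond_s_per_m: float,
                    thermal_cond_w_per_mk: float, temperature_k: float) -> float:
    """Dimensionless figure of merit ZT = S^2 * sigma * T / kappa."""
    pf = power_factor(seebeck_v_per_k, elec_cond_s_per_m)
    return pf * temperature_k / thermal_cond_w_per_mk


# Hypothetical material: S = 200 uV/K, sigma = 1e5 S/m, kappa = 1.5 W/(m*K), T = 300 K.
pf = power_factor(200e-6, 1e5)
zt = figure_of_merit(200e-6, 1e5, 1.5, 300.0)
```

The Seebeck coefficient must be converted from microvolts per kelvin to volts per kelvin before it is squared, a common source of unit errors when combining values from different papers.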
Authors:Felipe Galarce, Diego Rivera, Douglas Pacheco, Alfonso Caiazzo, Ernesto Castillo
Abstract:
This article presents and assesses a framework for estimating temperature fields in real time for food-freezing applications, significantly reducing computational load while ensuring accurate temperature monitoring, which represents a promising technological tool for optimizing and controlling food engineering processes. The strategy is based on (i) a mathematical model of a convection-dominated problem coupling thermal convection and turbulence, and (ii) a least-squares approach for solving the inverse data assimilation problem, regularized by projecting the governing dynamics onto a reduced-order model (ROM). The unsteady freezing process considers a salmon slice in a freezer cabinet, modeled with temperature-dependent thermophysical properties. The forward problem is approximated using a third-order WENO finite volume solver, including an optimized second-order backward scheme for time discretization. We employ our data assimilation framework to reconstruct the temperature field based on a limited number of sensors and to estimate temperature distributions within frozen food. Sensor placement is optimized using a novel greedy algorithm, which maximizes the observability of the reduced-order dynamics for a fixed set of sensors. The proposed approach allows efficient extrapolation from external sensor measurements to the internal temperature of the food under realistic turbulent flow conditions, which is crucial for maintaining food quality.
Authors:R. Sharma, Y. B. Guo
Abstract:
Understanding thermal stress evolution in metal additive manufacturing (AM) is crucial for producing high-quality components. Recent advancements in machine learning (ML) have shown great potential for modeling complex multiphysics problems in metal AM. While physics-based simulations face the challenge of high computational costs, conventional data-driven ML models require large, labeled training datasets to achieve accurate predictions. Unfortunately, generating large datasets for ML model training through time-consuming experiments or high-fidelity simulations is highly expensive in metal AM. To address these challenges, this study introduces a physics-informed neural network (PINN) framework that incorporates governing physical laws into deep neural networks (NNs) to predict temperature and thermal stress evolution during the laser metal deposition (LMD) process. The study also discusses the enhanced accuracy and efficiency of the PINN model when supplemented with small simulation data. Furthermore, it highlights the PINN transferability, enabling fast predictions with a set of new process parameters using a pre-trained PINN model as an online soft sensor, significantly reducing computation time compared to physics-based numerical models while maintaining accuracy.
Authors:Temitayo N. Adeyeye, Sidra Gibeault, Daniel P. Lathrop, Matthew W. Daniels, Mark D. Stiles, Jabez J. McClelland, William A. Borders, Jason T. Ryan, Philippe Talatchian, Ursula Ebels, Advait Madhavan
Abstract:
In the superparamagnetic regime, magnetic tunnel junctions switch between two resistance states due to random thermal fluctuations. The dwell time distribution in each state is exponential. We sample this distribution using a temporal encoding scheme, in which information is encoded in the time at which the device switches between its resistance states. We then develop a circuit element known as a probabilistic delay cell that applies an electrical current step to a superparamagnetic tunnel junction and a temporal measurement circuit that measures the timing of the first switching event. Repeated experiments confirm that these times are exponentially distributed. Temporal processing methods then allow us to digitally compute with these exponentially distributed probabilistic delay cells. We describe how to use these circuits in a Metropolis-Hastings stepper and in a weighted random sampler, both of which are computationally intensive applications that benefit from the efficient generation of exponentially distributed random numbers.
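One standard way in which exponentially distributed delays yield a weighted random sampler is the "exponential race": each candidate draws a delay with rate equal to its weight, and the first to fire is selected with probability proportional to that weight. The following software sketch illustrates the idea only; whether it matches the authors' circuit is an assumption, and the function name is hypothetical.

```python
import random


def exponential_race_sample(weights, rng):
    """Return index i with probability weights[i] / sum(weights).

    Each candidate draws a first-switching time T_i ~ Exp(rate=weights[i]);
    the candidate that switches first wins the race.
    """
    best_index, best_time = None, float("inf")
    for i, w in enumerate(weights):
        if w <= 0:
            continue  # zero-weight candidates never fire
        t = rng.expovariate(w)  # exponentially distributed delay
        if t < best_time:
            best_index, best_time = i, t
    if best_index is None:
        raise ValueError("at least one weight must be positive")
    return best_index


rng = random.Random(0)
weights = [1.0, 2.0, 7.0]
counts = [0, 0, 0]
for _ in range(20000):
    counts[exponential_race_sample(weights, rng)] += 1
# Empirical frequencies should approach weights / sum(weights).
frequencies = [c / 20000 for c in counts]
```

In this picture the comparison of delays is the only operation needed, which is why a temporal encoding is a natural fit.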
Authors:Neil He, Ming-Cheng Cheng, Yu Liu
Abstract:
The rising demand for high-performance computing (HPC) has made full-chip dynamic thermal simulation in many-core GPUs critical for optimizing performance and extending device lifespans. Proper orthogonal decomposition (POD) with Galerkin projection (GP) has been shown to offer high accuracy and massive runtime improvements over direct numerical simulation (DNS). However, previous implementations of POD-GP use MPI-based libraries like PETSc and FEniCS and face significant runtime bottlenecks. We propose a $\textbf{Py}$Torch-based $\textbf{POD-GP}$ library (PyPOD-GP), a GPU-optimized library for chip-level thermal simulation. PyPOD-GP achieves over $23.4\times$ speedup in training and over $10\times$ speedup in inference on a GPU with over 13,000 cores, with just $1.2\%$ error over the device layer.
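The POD-GP idea can be sketched on a toy problem: collect temperature snapshots from a full-order model, extract a low-dimensional basis by singular value decomposition, and project the governing operator onto that basis. The example below uses NumPy on a hypothetical 1D conduction model; it is not the PyPOD-GP API, and all sizes, coefficients, and names are assumptions chosen for illustration.

```python
import numpy as np

n, r, dt, steps = 50, 10, 0.01, 200

# Full-order model dT/dt = A T + b: 1D conduction with fixed-temperature ends.
dx = 1.0 / (n + 1)
A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) * (0.1 / dx**2)
b = np.zeros(n)
b[20:30] = 5.0  # hypothetical localized heat source


def implicit_euler(A_sys, b_sys, x0, dt, steps):
    """Integrate dx/dt = A_sys x + b_sys; return all states as columns."""
    lhs = np.eye(len(x0)) - dt * A_sys
    states = [x0]
    for _ in range(steps):
        states.append(np.linalg.solve(lhs, states[-1] + dt * b_sys))
    return np.array(states).T


# "Training": snapshots of the full model, then POD modes from the SVD.
snapshots = implicit_euler(A, b, np.zeros(n), dt, steps)
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
Phi = U[:, :r]  # leading r POD modes

# Galerkin projection of the operator and source onto the POD basis.
A_r = Phi.T @ A @ Phi
b_r = Phi.T @ b

# "Inference": integrate the r-dimensional system and lift back.
reduced = implicit_euler(A_r, b_r, Phi.T @ np.zeros(n), dt, steps)
T_rom = Phi @ reduced
rel_error = (np.linalg.norm(T_rom[:, -1] - snapshots[:, -1])
             / np.linalg.norm(snapshots[:, -1]))
```

The reduced system has `r` unknowns instead of `n`, which is the source of the runtime savings; the dense linear algebra involved is the kind of workload that maps naturally onto GPU tensor libraries.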
Authors:Difei Zhang, Frank Schäfer, Julian Arnold
Abstract:
The detection of phase transitions is a central task in many-body physics. To automate this process, the task can be phrased as a classification problem. Classification problems can be approached in two fundamentally distinct ways: through either a discriminative or a generative method. In general, it is unclear which of these two approaches is most suitable for a given problem. The choice is expected to depend on factors such as the availability of system knowledge, dataset size, desired accuracy, computational resources, and other considerations. In this work, we answer the question of how one should approach the solution of phase-classification problems by performing a numerical case study on the thermal phase transition in the classical two-dimensional square-lattice ferromagnetic Ising model.
Authors:Kshitij Nikhal, Cedric Nimpa Fondje, Benjamin S. Riggan
Abstract:
Cross-spectral biometrics, such as matching imagery of faces or persons from visible (RGB) and infrared (IR) bands, have rapidly advanced over the last decade due to increasing sensitivity, size, quality, and ubiquity of IR focal plane arrays and enhanced analytics beyond the visible spectrum. Current techniques for mitigating large spectral disparities between RGB and IR imagery often include learning a discriminative common subspace by exploiting precisely curated data acquired from multiple spectra. Although there are challenges with determining robust architectures for extracting common information, a critical limitation for supervised methods is poor scalability in terms of acquiring labeled data. Therefore, we propose a novel unsupervised cross-spectral framework that combines (1) a new pseudo triplet loss with cross-spectral voting, (2) a new cross-spectral attention network leveraging multiple subspaces, and (3) structured sparsity to perform more discriminative cross-spectral clustering. We extensively compare our proposed RGB-IR biometric learning framework (and its individual components) with recent and previous state-of-the-art models on two challenging benchmark datasets: DEVCOM Army Research Laboratory Visible-Thermal Face Dataset (ARL-VTF) and RegDB person re-identification dataset, and, in some cases, achieve performance superior to completely supervised methods.
Authors:Cody R. Longwell, Conor K. Trygstad, Nestor O. Perez-Arancibia
Abstract:
We present a new evolution of the Very Little Eel-Inspired roBot, the VLEIBot++, a 900-mg swimmer driven by two 10-mg bare high-work density (HWD) actuators, whose functionality is based on the use of shape-memory alloy (SMA) wires. An actuator of this type consumes an average power of about 40 mW during in-air operation. We integrated onboard power and computation into the VLEIBot++ using a custom-built printed circuit board (PCB) and an 11-mAh 3.7-V 507-mg single-cell lithium-ion (Li-Ion) battery, which in conjunction enable autonomous swimming for about 20 min on a single charge. This robot can swim at speeds of up to 18.7 mm/s (0.46 Bl/s) and is the first subgram microswimmer with onboard power, actuation, and computation developed to date. Unfortunately, the approach employed to actuate VLEIBot++ prototypes is infeasible for underwater applications because a typical 10-mg bare SMA-based microactuator requires an average power on the order of 800 mW when operating underwater. To address this issue, we introduce a new 13-mg power-efficient high-performance SMA-based microactuator that can function with similar power requirements (approx. 80 mW on average) and actuation performance (approx. 3 mm at low frequencies) in air and water. This design is based on the use of a sealed flexible air-capsule that encloses the SMA wires that drive the microactuator with the purpose of passively controlling the heat-transfer rate of the thermal system. Furthermore, this new power-efficient encapsulated actuator requires low voltages of excitation (3 to 4 V) and simple power electronics to function. The breakthroughs presented in this paper represent a path towards the creation of insect-scale autonomous underwater vehicles (AUVs).
Authors:Aditya Kasliwal, Ishaan Gakhar, Aryan Kamani, Pratinav Seth, Ujjwal Verma
Abstract:
In the last few years, the fusion of multi-modal data has been widely studied for various applications such as robotics, gesture recognition, and autonomous navigation. High-quality visual sensors are expensive, and consumer-grade sensors produce low-resolution images. To overcome this limitation and improve resolution, researchers have developed methods that combine RGB color images with non-visual data, such as thermal. Fusing multiple modalities to produce visually appealing, high-resolution images often requires dense models with millions of parameters and a heavy computational load, which is commonly attributed to the intricate architecture of the model.
We propose LapGSR, a multimodal, lightweight, generative model incorporating Laplacian image pyramids for guided thermal super-resolution. This approach uses a Laplacian Pyramid on RGB color images to extract vital edge information, which is then used to bypass heavy feature map computation in the higher layers of the model in tandem with a combined pixel and adversarial loss. LapGSR preserves the spatial and structural details of the image while also being efficient and compact. This results in a model with significantly fewer parameters than other SOTA models while demonstrating excellent results on two cross-domain datasets viz. ULB17-VT and VGTSR datasets.
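A Laplacian pyramid stores, at each scale, the detail lost by downsampling, which is why it isolates edge information. The sketch below is a simplified variant that uses 2x2 box averaging and pixel repetition instead of the Gaussian filtering of the classical construction; it is not the LapGSR architecture, and the function names are hypothetical.

```python
import numpy as np


def downsample(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by repeating each pixel into a 2x2 block."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)


def build_laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse_residual].

    Assumes both image sides are divisible by 2**levels.
    """
    pyramid, current = [], img.astype(float)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # band-pass (edge) detail
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert the pyramid by upsampling and adding back each detail level."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current


rng = np.random.default_rng(0)
image = rng.random((32, 32))  # stand-in for one channel of an RGB guide image
pyr = build_laplacian_pyramid(image, levels=3)
restored = reconstruct(pyr)
```

Because the detail levels are computed with fixed arithmetic rather than learned filters, they add no parameters, which is consistent with the lightweight design goal described above.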
Authors:Takato Ito, Takeshi Tanabe, Shoichi Hasegawa, Naoto Ienaga, Yoshihiro Kuroda
Abstract:
In recent years, thermal feedback has emerged as a significant sensory modality in virtual reality. However, the concept of conveying the sensation of thermal movement remains largely unexplored. We propose HeatFlicker, a virtual campfire device that recreates the flickering of fire by using a thermal illusion of moving heat identified in preliminary experiments. This device creates the illusion of heat moving from a fixed heat source. In our demonstration, we provide a novel thermal experience by simulating the flickering of a real fire.
Authors:Souta Mizuno, Jiayi Xu, Shoichi Hasegawa, Naoto Ienaga, Yoshihiro Kuroda
Abstract:
Presenting pain sensations at a movable position is important for imitating the pain caused by objects in motion and the pain corresponding to a person's movements. We propose a novel dynamic pain sensation experience, called DynaPain. DynaPain is achieved by combining the non-contact thermal grill illusion with apparent movement. The demonstration provides a dynamic heat and pain experience through interaction with a flame beetle moving on the arm.
Authors:Mohammad Shadman Hashem, Ahsan Raza, Seokhee Jeon
Abstract:
Multi-mode haptic feedback is essential to achieve high realism and immersion in virtual environments. This paper proposes a novel silicone fingertip actuator integrated with a hot thermal fabric finger sleeve to render pressure, vibration, and hot thermal feedback simultaneously. The actuator is pneumatically actuated to render a realistic and effective tactile experience in accordance with hot thermal sensation. The silicone actuator has two air chambers controlled by pneumatic valves connected to compressed air tanks. Simultaneously, a PWM signal from a microcontroller regulates the temperature of the thermal fabric sleeve, enhancing overall system functionality. The lower chamber of the silicone actuator is responsible for pressure feedback, whereas the upper chamber is devoted to vibrotactile feedback. Conductive yarn or thread is used to spread the thermal feedback actuation points across the thermal fabric's surface. To demonstrate the actuator's capability, a VR environment consisting of a bowl of liquid and a stove with fire was designed. Depending on the selected functionality, the scenario can simulate the tactile perception of pressure, vibration, and temperature simultaneously or consecutively.
Authors:Jiayi Xu, Kazuma Nakamura, Yoshihiro Kuroda, Masahiko Inami
Abstract:
MoHeat is a modular hardware and software platform designed for rapid prototyping of highly responsive, non-contact thermal feedback interactions. In our previous work, we developed an intensity-adjustable, highly responsive, non-contact thermal feedback system by integrating the vortex effect and thermal radiation. In this study, we further enhanced the system by developing an authoring tool that allows users to freely adjust the intensity of thermal stimuli, the duration of stimuli, the delay time before stimuli, and the interval between alternating hot and cold stimuli. This modular approach enables countless combinations of non-contact thermal feedback experiences.
Authors:Haotong Liang, Chuangye Wang, Heshan Yu, Dylan Kirsch, Rohit Pant, Austin McDannald, A. Gilad Kusne, Ji-Cheng Zhao, Ichiro Takeuchi
Abstract:
Iterative cycles of theoretical prediction and experimental validation are the cornerstone of the modern scientific method. However, the proverbial "closing of the loop" in experiment-theory cycles is, in practice, usually ad hoc, often inherently difficult, or impractical to repeat on a systematic basis, beset by the scale or the time constraints of computation or the phenomena under study. Here, we demonstrate the Autonomous MAterials Search Engine (AMASE), where we enlist robot science to perform self-driving continuous cyclical interaction of experiments and computational predictions for materials exploration. In particular, we have applied the AMASE formalism to the rapid mapping of a temperature-composition phase diagram, a fundamental task for the search and discovery of new materials. Thermal processing and experimental determination of compositional phase boundaries in thin films are autonomously interspersed with real-time updating of the phase diagram prediction through the minimization of Gibbs free energies. AMASE was able to accurately determine the eutectic phase diagram of the Sn-Bi binary thin-film system on the fly from a self-guided campaign covering just a small fraction of the entire composition-temperature phase space, translating to a 6-fold reduction in the number of necessary experiments. This study demonstrates for the first time the possibility of real-time, autonomous, and iterative interactions of experiments and theory carried out without any human intervention.
Authors:Paul Fergus, Carl Chalmers, Steve Longmore, Serge Wich
Abstract:
The rapid decline in global biodiversity demands innovative conservation strategies. This paper examines the use of artificial intelligence (AI) in wildlife conservation, focusing on the Conservation AI platform. Leveraging machine learning and computer vision, Conservation AI detects and classifies animals, humans, and poaching-related objects using visual spectrum and thermal infrared cameras. The platform processes this data with convolutional neural networks (CNNs) and Transformer architectures to monitor species, including those which are critically endangered. Real-time detection provides the immediate responses required for time-critical situations (e.g. poaching), while non-real-time analysis supports long-term wildlife monitoring and habitat health assessment. Case studies from Europe, North America, Africa, and Southeast Asia highlight the platform's success in species identification, biodiversity monitoring, and poaching prevention. The paper also discusses challenges related to data quality, model accuracy, and logistical constraints, while outlining future directions involving technological advancements, expansion into new geographical regions, and deeper collaboration with local communities and policymakers. Conservation AI represents a significant step forward in addressing the urgent challenges of wildlife conservation, offering a scalable and adaptable solution that can be implemented globally.
Authors:Yosuke Ueno, Satoshi Imamura, Yuna Tomida, Teruo Tanimoto, Masamitsu Tanaka, Yutaka Tabuchi, Koji Inoue, Hiroshi Nakamura
Abstract:
Cryogenic quantum computers play a leading role in demonstrating quantum advantage. Given the severe constraints on the cooling capacity in cryogenic environments, thermal design is crucial for the scalability of these computers. The sources of heat dissipation include passive inflow via inter-temperature wires and the power consumption of components located in the cryostat, such as wire amplifiers and quantum-classical interfaces. Thus, a critical challenge is to reduce the number of wires by reducing the required inter-temperature bandwidth while maintaining minimal additional power consumption in the cryostat. One solution to address this challenge is near-data processing using ultra-low-power computational logic within the cryostat. Based on the workload analysis and domain-specific system design focused on Variational Quantum Algorithms (VQAs), we propose the Cryogenic Counter-based Co-processor for VQAs (C3-VQA) to enhance the design scalability of cryogenic quantum computers under the thermal constraint. The C3-VQA utilizes single-flux-quantum logic, which is an ultra-low-power superconducting digital circuit that operates at the 4 K environment. The C3-VQA precomputes a part of the expectation value calculations for VQAs and buffers intermediate values using simple bit operation units and counters in the cryostat, thereby reducing the required inter-temperature bandwidth with small additional power consumption. Consequently, the C3-VQA reduces the number of wires, leading to a reduction in the total heat dissipation in the cryostat. Our evaluation shows that the C3-VQA reduces the total heat dissipation at the 4 K stage by 30% and 81% under sequential-shot and parallel-shot execution scenarios, respectively. Furthermore, a case study in quantum chemistry shows that the C3-VQA reduces total heat dissipation by 87% with a 10,000-qubit system.
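The bandwidth argument can be illustrated in software: if the parity of each measured bitstring is accumulated in a signed counter next to the qubits, only the counter value needs to cross the temperature stages rather than every shot. The sketch below estimates a Pauli-Z string expectation value this way; it is a conceptual illustration, not the C3-VQA hardware design, and the function name and shot data are hypothetical.

```python
def pauli_z_expectation(shots, mask):
    """Estimate <Z...Z> on the qubits selected by `mask` from measured bitstrings.

    Each shot is an int whose bits are single-qubit measurement outcomes.
    Only one signed counter is kept across all shots.
    """
    if not shots:
        raise ValueError("need at least one shot")
    counter = 0
    for bits in shots:
        parity = bin(bits & mask).count("1") & 1  # bitwise AND, then parity
        counter += -1 if parity else 1            # eigenvalue -1 or +1
    return counter / len(shots)


shots = [0b00, 0b11, 0b01, 0b00]  # made-up outcomes for two qubits
zz = pauli_z_expectation(shots, mask=0b11)
z0 = pauli_z_expectation(shots, mask=0b01)
```

The loop uses only a mask, a parity check, and an increment or decrement, which corresponds to the "simple bit operation units and counters" named in the abstract.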
Authors:Sarah Nataj, Magnus Appel, Joe Alexandersen
Abstract:
We develop a space-time spectral element method for topology optimization of transient heat conduction. The forward problem is discretized with summation-by-parts (SBP) operators, and interface/boundary and initial/terminal conditions are imposed weakly via simultaneous approximation terms (SAT), yielding a stable monolithic space-time scheme on heterogeneous domains. Stability is proven under specific conditions on the SAT parameters, scaled with the spatial mesh resolution and material properties. We compute design sensitivities using a discrete space-time adjoint scheme that is dual-consistent with the primal SBP-SAT scheme. Dual consistency ensures that the discrete adjoint consistently approximates the continuous dual problem and, under standard smoothness assumptions, yields superconvergent functional estimates. We validate the resulting optimal designs by comparison with an independently computed reference optimal design and report time-to-solution and cost-of-accuracy curves, comparing against low-order time-marching and all-at-once solvers for the forward and adjoint systems. The proposed scheme attains high accuracy with fewer space-time degrees of freedom and remains stable, reducing time-to-solution and memory compared with an alternative all-at-once solver. This makes it a future candidate for large-scale topology optimization of time-dependent thermal systems.
Authors:Mingshu Cai, Osamu Yoshie, Yuya Ieiri
Abstract:
Modern surveillance systems increasingly rely on multi-wavelength sensors and deep neural networks to recognize faces in infrared images captured at night. However, most facial recognition models are trained on visible light datasets, leading to substantial performance degradation on infrared inputs due to significant domain shifts. Early feature-based methods for infrared face recognition proved ineffective, prompting researchers to adopt generative approaches that convert infrared images into visible light images for improved recognition. This paradigm, known as Heterogeneous Face Recognition (HFR), faces challenges such as model and modality discrepancies, leading to distortion and feature loss in generated images. To address these limitations, this paper introduces a novel latent diffusion-based model designed to generate high-quality visible face images from thermal inputs while preserving critical identity features. A multi-attribute classifier is incorporated to extract key facial attributes from visible images, mitigating feature loss during infrared-to-visible image restoration. Additionally, we propose the Self-attn Mamba module, which enhances global modeling of cross-modal features and significantly improves inference speed. Experimental results on two benchmark datasets demonstrate the superiority of our approach, achieving state-of-the-art performance in both image quality and identity preservation.
Authors:Nazanin Mahjourian, Vinh Nguyen
Abstract:
Many manufacturing environments operate in low-light conditions or within enclosed machines where conventional vision systems struggle. Infrared cameras provide complementary advantages in such environments. Simultaneously, supervised AI systems require large labeled datasets, which makes zero-shot learning frameworks more practical for applications including infrared cameras. Recent advances in vision-language foundation models (VLMs) offer a new path in zero-shot predictions from paired image-text representations. However, current VLMs cannot understand infrared camera data since they are trained on RGB data. This work introduces VLM-IRIS (Vision-Language Models for InfraRed Industrial Sensing), a zero-shot framework that adapts VLMs to infrared data by preprocessing infrared images captured by a FLIR Boson sensor into RGB-compatible inputs suitable for CLIP-based encoders. We demonstrate zero-shot workpiece presence detection on a 3D printer bed where temperature differences between the build plate and workpieces make the task well-suited for thermal imaging. VLM-IRIS converts the infrared images to magma representation and applies centroid prompt ensembling with a CLIP ViT-B/32 encoder to achieve high accuracy on infrared images without any model retraining. These findings demonstrate that the proposed improvements to VLMs can be effectively extended to thermal applications for label-free monitoring.
Authors:Michael Dumbser, Andrea Thomann, Maurizio Tavelli, Walter Boscheri
Abstract:
We introduce a novel structure-preserving vertex-staggered semi-implicit four-split discretization of a unified first order hyperbolic formulation of continuum mechanics that is able to describe at the same time fluid and solid materials within the same mathematical model. The governing PDE system goes back to the pioneering work of Godunov, Romenski, Peshkov and collaborators. Previous structure-preserving discretizations of this system were able to respect the curl-free properties of the distortion field and the specific thermal impulse in the absence of source terms and were consistent with the low Mach number limit with respect to the adiabatic sound speed. However, the evolution of the thermal impulse and the distortion field was still discretized explicitly, thus requiring a rather severe CFL stability restriction on the time step based on the shear sound speed and the finite, but potentially large, speed of heat waves. Instead, the new four-split semi-implicit scheme presented in this paper has a material time step restriction only. For this purpose, the governing PDE system is split into four subsystems: i) a convective subsystem, which is the only one that is treated explicitly; ii) a heat subsystem; iii) a subsystem containing momentum, distortion field and specific thermal impulse; iv) a pressure subsystem. The three subsystems ii)-iv) are all discretized implicitly, hence only a rather mild CFL restriction based on the velocity of the continuum is imposed. The method is asymptotically consistent with the low Mach number limit and the stiff relaxation limits. Moreover, it maintains an exactly curl-free distortion field and thermal impulse in the case of linear source terms or in their absence. The scheme is benchmarked against classical test cases verifying its theoretical properties.
Authors:Manisha More, Kavya Bhand, Kaustubh Mukdam, Kavya Sharma, Manas Kawtikwar, Hridayansh Kaware, Prajwal Kavhar
Abstract:
Early diagnosis of critical diseases can significantly improve patient survival and reduce treatment costs. However, existing diagnostic techniques are often costly, invasive, and inaccessible in low-resource regions. This paper presents a multimodal artificial intelligence (AI) diagnostic framework integrating image analysis, thermal imaging, and audio signal processing for early detection of three major health conditions: skin cancer, vascular blood clots, and cardiopulmonary abnormalities. A fine-tuned MobileNetV2 convolutional neural network was trained on the ISIC 2019 dataset for skin lesion classification, achieving 89.3% accuracy, 91.6% sensitivity, and 88.2% specificity. A support vector machine (SVM) with handcrafted features was employed for thermal clot detection, achieving 86.4% accuracy (AUC = 0.89) on synthetic and clinical data. For cardiopulmonary analysis, lung and heart sound datasets from PhysioNet and Pascal were processed using Mel-Frequency Cepstral Coefficients (MFCC) and classified via Random Forest, reaching 87.2% accuracy and 85.7% sensitivity. Comparative evaluation against state-of-the-art models demonstrates that the proposed system achieves competitive results while remaining lightweight and deployable on low-cost devices. The framework provides a promising step toward scalable, real-time, and accessible AI-based pre-diagnostic healthcare solutions.
Authors:Shilaj Baral, Youngkyu Lee, Sangam Khanal, Joongoo Jeon
Abstract:
Purely data-driven surrogates for fluid dynamics often fail catastrophically from error accumulation, while existing hybrid methods have lacked the automation and robustness for practical use. To solve this, we developed XRePIT, a novel hybrid simulation strategy that synergizes machine learning (ML) acceleration with solver-based correction. We specifically designed our method to be fully automated and physics-aware, ensuring the stability and practical applicability that previous approaches lacked. We demonstrate that this new design overcomes long-standing barriers, achieving the first stable, accelerated rollouts for over 10,000 timesteps. The method also generalizes robustly to unseen boundary conditions and, crucially, scales to 3D flows. Our approach delivers speedups up to 4.98$\times$ while maintaining high physical fidelity, resolving thermal fields with relative errors of ~1E-3 and capturing low magnitude velocity dynamics with errors below 1E-2 m s$^{-1}$. This work thus establishes a mature and scalable hybrid method, paving the way for its use in real-world engineering.
Authors:Francesco Pivi, Simone Gazza, Davide Evangelista, Roberto Amadini, Maurizio Gabbrielli
Abstract:
Generative models based on flow matching have demonstrated remarkable success in various domains, yet they suffer from a fundamental limitation: the lack of interpretability in their intermediate generation steps. These models learn to transform noise into data through a series of vector field updates, but the meaning of each step remains opaque. We address this problem by proposing a general framework constraining each flow step to be sampled from a known physical distribution. Flow trajectories are mapped to (and constrained to traverse) the equilibrium states of the simulated physical process. We implement this approach through the 2D Ising model in such a way that flow steps become thermal equilibrium points along a parametric cooling schedule. Our proposed architecture includes an encoder that maps discrete Ising configurations into a continuous latent space, a flow-matching network that performs temperature-driven diffusion, and a projector that returns to discrete Ising states while preserving physical constraints. We validate this framework across multiple lattice sizes, showing that it preserves physical fidelity while outperforming Monte Carlo generation in speed as the lattice size increases. In contrast with standard flow matching, each vector field represents a meaningful stepwise transition in the 2D Ising model's latent space. This demonstrates that embedding physical semantics into generative flows transforms opaque neural trajectories into interpretable physical processes.
Authors:Sen Zhan, Lingkang Jin, Haoyang Zhang, Nikolaos G. Paterakis
Abstract:
The secure operation of power distribution systems is challenged by the growing integration of distributed energy resources. Leveraging the flexibility of battery storage offers a cost-effective alternative to measures like generation curtailment, which results in energy losses. However, developing an effective operational model for battery storage is hindered by inaccurate grid models, unavailability of load data, nonlinear relationship between power injections and network states, intertemporal constraints, and complex electrochemical and thermal dynamics. To address these challenges, this paper proposes a data-driven operational control scheme for battery storage in distribution systems. Linear and convex quadratic operational constraints are constructed based on real-time distribution system and battery storage measurements. Lyapunov optimization decouples multi-period battery operation, enabling a real-time, forecast-free control strategy with low computational complexity. Numerical studies using nonlinear distribution system and battery storage simulators validate the effectiveness of the approach in ensuring secure distribution system operation and satisfaction of voltage and thermal constraints of battery storage.
Authors:Samuel Donachie, Ulysse Remond, Arthur Mathorel, Kyryl Kazymyrenko
Abstract:
Quantum computing holds great promise for solving classically intractable problems such as linear systems and partial differential equations (PDEs). While fully fault-tolerant quantum computers remain out of reach, current noisy intermediate-scale quantum (NISQ) devices enable the exploration of hybrid quantum-classical algorithms. Among these, Variational Quantum Algorithms (VQAs) have emerged as a leading candidate for near-term applications. In this work, we investigate the use of VQAs to solve PDEs arising in stationary heat transfer. These problems are discretized via the finite element method (FEM), yielding linear systems of the form Ku=f, where K is the stiffness matrix. We define a cost function that encodes the thermal energy of the system, and optimize it using various ansatz families. To improve trainability and bypass barren plateaus, we introduce a remeshing strategy which gradually increases resolution by reusing optimized parameters from coarser discretizations. Our results demonstrate convergence of scalar quantities with mesh refinement. This work provides a practical methodology for applying VQAs to PDEs, offering insight into the capabilities and limitations of current quantum hardware.
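For a symmetric positive-definite stiffness matrix, a cost function of the energy type is commonly written as E(u) = 1/2 u^T K u - f^T u, whose gradient K u - f vanishes exactly when Ku = f. The sketch below checks this classically on a small 1D mesh, with plain gradient descent standing in for the variational optimizer. The specific form of E, the mesh, and the load are assumptions, since the abstract does not give the authors' exact cost function or ansatz.

```python
import numpy as np

# 1D stationary heat conduction, linear elements, zero temperature at both ends.
n = 8                      # interior nodes (hypothetical small mesh)
h = 1.0 / (n + 1)
K = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h
f = h * np.ones(n)         # uniform heat source


def energy(u):
    """Quadratic energy whose minimizer solves K u = f (assumed form)."""
    return 0.5 * u @ K @ u - f @ u


# Gradient descent on E(u); the step size is set from the largest eigenvalue.
step = 1.0 / np.linalg.eigvalsh(K).max()
u = np.zeros(n)
for _ in range(2000):
    u -= step * (K @ u - f)  # gradient of E is K u - f

u_direct = np.linalg.solve(K, f)  # reference solution of the linear system
```

In a variational quantum setting the vector `u` would instead be encoded by a parametrized circuit and the energy estimated from measurements; the classical loop here only shows why minimizing the energy and solving the linear system are equivalent.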
Authors:Boyang Wu, Miguel Onorato, Zaher Hani, Yulin Pan
Abstract:
In this work, we provide a validity condition for the normal form transformation to remove the non-resonant cubic terms in the $β$-FPUT system. We show that for a wave field with random phases, the normal form transformation is valid by dominant probability if $β\ll 1/N^{1+ε}$, with $N$ the number of masses and $ε$ an arbitrarily small constant. To obtain this condition, a bound is needed for a summation in the transformation equation, which we prove rigorously in the paper. The condition also suggests that the importance of the non-resonant terms in the evolution equation is governed by the parameter $βN$. We design numerical experiments to demonstrate that this is indeed the case for spectra at both thermal-equilibrium and out-of-equilibrium conditions. The methodology developed in this paper is applicable to other Hamiltonian systems where a normal form transformation needs to be applied.
Authors:Anoy Saha, Mona Ghassemi
Abstract:
The electrification of aircraft is reshaping the foundations of aerospace design by positioning electrical systems at the center of propulsion, control, and onboard functionality. This chapter provides an overview of electrical system architectures for electric and hybrid electric aircraft, highlighting both established principles and emerging design strategies. The discussion begins with the motivations for electrification, including reducing environmental impact, improving operational efficiency, and replacing complex pneumatic and hydraulic subsystems with lighter and more reliable electrical alternatives. Aircraft electrical architectures are classified into four major categories: conventional, more electric, all electric, and hybrid electric. A range of system topologies is examined, including direct current (DC), alternating current (AC), hybrid, and distributed configurations. Each is considered in terms of its effectiveness in delivering power, enabling redundancy, supporting fault isolation, and managing thermal performance. Real-world examples are presented to demonstrate practical applications, with case studies drawn from the Boeing 787 Dreamliner, the Eviation Alice commuter aircraft, and the NASA X-57 Maxwell demonstrator. These examples illustrate the ongoing transition from incremental subsystem electrification toward fully integrated architectures that promise higher efficiency and greater sustainability.
Authors:Javane Rostampoor, Raviraj Adve
Abstract:
Quantum sensing has attracted significant attention due to its ability to measure physical quantities with extremely high accuracy. Rydberg atoms - typically alkali atoms with a highly excited valence electron that is far from the nucleus - exhibit strong sensitivity to external electromagnetic fields. This sensitivity leads to coupling between different atomic energy levels, which can be observed by monitoring changes in a control laser beam before and after it passes through a vapor cell containing the Rydberg atoms. By analyzing the transmitted laser signal with a photodetector, variations in transmission can be attributed to the presence and characteristics of the external electromagnetic field. Because Rydberg atoms operate in a highly excited quantum state without relying on traditional electronic circuitry, they inherently avoid thermal noise, thereby enabling more sensitive detection. In this paper, we investigate the performance of a Rydberg atomic receiver based on Rb-85 and compare it with that of a conventional receiver in detecting an 8-level pulse amplitude modulation (8-PAM) signal in the presence of off-resonant interference. We demonstrate that the Rydberg receiver can suppress interference without the need for an additional filter. Effectively, our results show that the Rydberg receiver serves as an integrated filter and demodulator, outperforming conventional circuit-based receivers in terms of achievable symbol error rate.
Authors:Subham Ghosh, Abhishek Tewari
Abstract:
The rapid discovery of materials is constrained by the lack of large, machine-readable datasets that couple performance metrics with structural context. Existing databases are either small, manually curated, or biased toward first-principles results, leaving experimental literature underexploited. We present an agentic, large language model (LLM)-driven workflow that autonomously extracts thermoelectric and structural properties from about 10,000 full-text scientific articles. The pipeline integrates dynamic token allocation, zero-shot multi-agent extraction, and conditional table parsing to balance accuracy against computational cost. Benchmarking on 50 curated papers shows that GPT-4.1 achieves the highest accuracy (F1 = 0.91 for thermoelectric properties and 0.82 for structural fields), while GPT-4.1 Mini delivers nearly comparable performance (F1 = 0.89 and 0.81) at a fraction of the cost, enabling practical large-scale deployment. Applying this workflow, we curated 27,822 temperature-resolved property records with normalized units, spanning figure of merit (ZT), Seebeck coefficient, conductivity, resistivity, power factor, and thermal conductivity, together with structural attributes such as crystal class, space group, and doping strategy. Dataset analysis reproduces known thermoelectric trends, such as the superior performance of alloys over oxides and the advantage of p-type doping, while also surfacing broader structure-property correlations. To facilitate community access, we release an interactive web explorer with semantic filters, numeric queries, and CSV export. This study delivers the largest LLM-curated thermoelectric dataset to date, provides a reproducible and cost-profiled extraction pipeline, and establishes a foundation for scalable, data-driven materials discovery beyond thermoelectrics.
Authors:Kaili Wang, Leonardo Ravaglia, Roberto Longo, Lore Goetschalckx, David Van Hamme, Julie Moeyersoms, Ben Stoffelen, Tom De Schepper
Abstract:
Thermal imaging in Advanced Driver Assistance Systems (ADAS) improves road safety with superior perception in low-light and harsh weather conditions compared to traditional RGB cameras. However, research in this area faces challenges due to limited dataset availability and poor representation in driving simulators. RGB-to-thermal image translation offers a potential solution, but existing methods focus on one-to-one mappings. We propose a one-to-many mapping using a multi-modal translation framework enhanced with our Component-aware Adaptive Instance Normalization (CoAdaIN). Unlike the original AdaIN, which applies styles globally, CoAdaIN adapts styles to different image components individually. The result, as we show, is more realistic and diverse thermal image translations. This is the accepted author manuscript of the paper published in IEEE Sensors Conference 2024. The final published version is available at 10.1109/SENSORS60989.2024.10785056.
Authors:Joseph Hunt, Koyo Fujii, Aly Magassouba, Praminda Caleb-Solly
Abstract:
Hospital patient falls remain a critical and costly challenge worldwide. While conventional fall prevention systems typically rely on post-fall detection or reactive alerts, they also often suffer from high false positive rates and fail to address the underlying patient needs that lead to bed-exit attempts. This paper presents a novel system architecture that leverages the Internet of Robotic Things (IoRT) to orchestrate human-robot-robot interaction for proactive and personalized patient assistance. The system integrates a privacy-preserving thermal sensing model capable of real-time bed-exit prediction, with two coordinated robotic agents that respond dynamically based on predicted intent and patient input. This orchestrated response could not only reduce fall risk but also attend to the patient's underlying motivations for movement, such as thirst, discomfort, or the need for assistance, before a hazardous situation arises. Our contributions with this pilot study are three-fold: (1) a modular IoRT-based framework enabling distributed sensing, prediction, and multi-robot coordination; (2) a demonstration of low-resolution thermal sensing for accurate, privacy-preserving preemptive bed-exit detection; and (3) results from a user study and systematic error analysis that inform the design of situationally aware, multi-agent interactions in hospital settings. The findings highlight how interactive and connected robotic systems can move beyond passive monitoring to deliver timely, meaningful assistance, empowering safer, more responsive care environments.
Authors:Jiali Zhang, Thomas S. White, Haoliang Zhang, Wenqing Hu, Donald C. Wunsch, Jian Liu
Abstract:
Infrared imaging has emerged as a robust solution for urban object detection under low-light and adverse weather conditions, offering significant advantages over traditional visible-light cameras. However, challenges such as class imbalance, thermal noise, and computational constraints can significantly hinder model performance in practical settings. To address these issues, we evaluate multiple YOLO variants on the FLIR ADAS V2 dataset, ultimately selecting YOLOv8 as our baseline due to its balanced accuracy and efficiency. Building on this foundation, we present \texttt{MS-YOLO} (\textbf{M}obileNetV4 and \textbf{S}lideLoss based on YOLO), which replaces YOLOv8's CSPDarknet backbone with the more efficient MobileNetV4, reducing computational overhead by \textbf{1.5\%} while sustaining high accuracy. In addition, we introduce \emph{SlideLoss}, a novel loss function that dynamically emphasizes under-represented and occluded samples, boosting precision without sacrificing recall. Experiments on the FLIR ADAS V2 benchmark show that \texttt{MS-YOLO} attains competitive mAP and superior precision while operating at only \textbf{6.7 GFLOPs}. These results demonstrate that \texttt{MS-YOLO} effectively addresses the dual challenge of maintaining high detection quality while minimizing computational costs, making it well-suited for real-time edge deployment in urban environments.
Authors:Muhammad Kaif Laghari, Areeb Ahmed Shaikh, Faiz Khan, Aafia Gul Siddiqui
Abstract:
Current mixed reality (MR) content creation relies primarily on external PC-centric platforms and third-party cameras, limiting adoption among standalone virtual reality (VR) users. In this work, we investigate the feasibility of integrating an enhanced LIV SDK-like MR compositing pipeline into the Meta Quest 3 hardware, enabling native first-person physical perspective (FPP) MR content creation without external infrastructure. We conducted a simulation-based feasibility study using hardware specifications, developer documentation, and benchmarking with ARM-based SoCs, including Snapdragon 8 Gen 3 and MediaTek Dimensity 9300. The proposed approach combines Camera Passthrough Enhancement, using Meta's experimental Passthrough Camera API with on-device machine learning segmentation through Unity Sentis and FastSAM, and an optimized real-time compositing engine for standalone VR. Benchmarking results show that Quest 3's Snapdragon XR2 Gen 2 can support lightweight native MR compositing at 720p30 resolution using 95\% resource utilization, leaving 5\% thermal headroom for sustained runtime. Comparison with next-generation SoCs such as Snapdragon 8 Gen 3 demonstrates 34\% headroom, enabling more robust MR experiences with 1.5--2x faster CPU/GPU performance and higher memory bandwidth. While current Quest 3 hardware supports basic native MR compositing, thermal limits restrict operation to 5--10 minutes before throttling. Experimental results confirm that standalone MR content creation is possible on current hardware for short recordings, with new XR SoCs offering the headroom for extended sessions and improved quality. These findings lay the groundwork for transitioning MR content creation from PC-based workflows to all-in-one VR devices, enhancing MR production for content creators and researchers.
Authors:Hemanth Puppala, Wayne Sarasua, Srinivas Biyaguda, Farhad Farzinpour, Mashrur Chowdhury
Abstract:
Deer-vehicle collisions represent a critical safety challenge in the United States, causing nearly 2.1 million incidents annually and resulting in approximately 440 fatalities, 59,000 injuries, and 10 billion USD in economic damages. These collisions also contribute significantly to declining deer populations. This paper presents a real-time detection and driver warning system that integrates thermal imaging, deep learning, and vehicle-to-everything communication to help mitigate deer-vehicle collisions. Our system was trained and validated on a custom dataset of over 12,000 thermal deer images collected in Mars Hill, North Carolina. Experimental evaluation demonstrates exceptional performance with 98.84 percent mean average precision, 95.44 percent precision, and 95.96 percent recall. The system was field tested during a follow-up visit to Mars Hill and readily sensed deer, providing the driver with advance warning. Field testing validates robust operation across diverse weather conditions, with thermal imaging maintaining between 88 and 92 percent detection accuracy in challenging scenarios where conventional visible-light-based cameras achieve less than 60 percent effectiveness. When a high probability threshold is reached, sensor data sharing messages are broadcast to surrounding vehicles and roadside units via cellular vehicle-to-everything (CV2X) communication devices. Overall, our system achieves end-to-end latency consistently under 100 milliseconds from detection to driver alert. This research establishes a viable technological pathway for reducing deer-vehicle collisions through thermal imaging and connected vehicles.
Authors:Florian Wiesner, Matthias Wessling, Stephen Baek
Abstract:
Foundation models have revolutionized natural language processing through a ``train once, deploy anywhere'' paradigm, where a single pre-trained model adapts to countless downstream tasks without retraining. Access to a Physics Foundation Model (PFM) would be transformative -- democratizing access to high-fidelity simulations, accelerating scientific discovery, and eliminating the need for specialized solver development. Yet current physics-aware machine learning approaches remain fundamentally limited to single, narrow domains and require retraining for each new system. We present the General Physics Transformer (GPhyT), trained on 1.8 TB of diverse simulation data, that demonstrates foundation model capabilities are achievable for physics. Our key insight is that transformers can learn to infer governing dynamics from context, enabling a single model to simulate fluid-solid interactions, shock waves, thermal convection, and multi-phase dynamics without being told the underlying equations. GPhyT achieves three critical breakthroughs: (1) superior performance across multiple physics domains, outperforming specialized architectures by up to 29x, (2) zero-shot generalization to entirely unseen physical systems through in-context learning, and (3) stable long-term predictions through 50-timestep rollouts. By establishing that a single model can learn generalizable physical principles from data alone, this work opens the path toward a universal PFM that could transform computational science and engineering.
Authors:Xuyuan Kang, Xiao Wang, Jingjing An, Da Yan
Abstract:
Thermal energy storage (TES) is an effective method for load shifting and demand response in buildings. Optimal TES control and management are essential to improve the performance of the cooling system. Most existing TES systems operate on a fixed schedule, which cannot take full advantage of their load-shifting capability and requires extensive investigation and optimization. This study proposed a novel integrated load prediction and optimized control approach for ice-based TES in commercial buildings. A cooling load prediction model was developed and a mid-day modification mechanism was introduced into the prediction model to improve the accuracy. Based on the predictions, a rule-based control strategy was proposed according to the time-of-use tariff; the mid-day control adjustment mechanism was introduced in accordance with the mid-day prediction modifications. The proposed approach was applied in the ice-based TES system of a commercial complex in Beijing, and achieved a mean absolute error (MAE) of 389 kW and a coefficient of variation of MAE of 12.5%. The integrated prediction-based control strategy achieved an energy cost saving rate of 9.9%. The proposed model was deployed in the real building automation system of the case building and significantly improved the efficiency and automation of the cooling system.
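The two error figures quoted above can be illustrated with a short sketch. It assumes that the normalized figure is the MAE divided by the mean observed cooling load, which the abstract does not state explicitly; the function names and the load values are hypothetical and are not data from the study.

```python
import numpy as np

def mae(actual, predicted) -> float:
    """Mean absolute error, in the same unit as the inputs (e.g. kW)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

def normalized_mae(actual, predicted) -> float:
    """MAE divided by the mean observed value (assumed normalization)."""
    return mae(actual, predicted) / float(np.mean(actual))

# Hypothetical hourly cooling loads in kW, for illustration only.
actual_load = np.array([3000.0, 3200.0, 2800.0, 3400.0])
predicted_load = np.array([2900.0, 3500.0, 2700.0, 3300.0])

print("MAE [kW]:", mae(actual_load, predicted_load))
print("normalized MAE:", normalized_mae(actual_load, predicted_load))
```

Under this assumed normalization, an MAE of 389 kW at 12.5% would correspond to a mean load of about 389 / 0.125, roughly 3100 kW.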
Authors:Trung Kien La, Eric Guiffo Kaigom
Abstract:
In this work, deep neural networks made up of multiple hidden Long Short-Term Memory (LSTM) and Feedforward layers are trained to predict the thermal behavior of the joint motors of robot manipulators. A model-free and scalable approach is adopted. It accommodates complexity and uncertainty challenges stemming from the derivation, identification, and validation of a large number of parameters of an approximation model that is hardly available. To this end, sensed joint torques are collected and processed to foresee the thermal behavior of joint motors. Promising prediction results of the machine learning based capture of the temperature dynamics of joint motors of a redundant robot with seven joints are presented.
Authors:Lucas Gallup, Kevin N. Long, Devin J. Roach, William D. Reinholtz, Adam Cook, Craig M. Hamel
Abstract:
Additive manufacturing (AM) allows for manufacturing of complex three-dimensional geometries not typically realizable with standard subtractive manufacturing practices. The internal microstructure of a 3D printed component can have a significant impact on its mechanical, vibrational, and shock properties and allows for a richer design space when this is controllable. Due to the complex interactions of the internal geometry of an extrusion-based AM component, it is common practice to assume a homogeneous behavior or to perform characterization testing on the specific toolpath configurations. To avoid unnecessary testing or material waste, it is necessary to develop an accurate and consistent numerical simulation framework with relevant boundary value problems that can handle the complicated geometry of internal material microstructure present in AM components. Herein, a framework is proposed to directly create computational meshes suitable for finite element analysis (FEA) of the fine-scale features generated from extrusion-based AM tool paths to maintain a strong process-structure-property-performance linkage. This mesh can be manually or automatically analyzed using standard FEA simulations such as quasi-static preloading, modal analysis, or thermal analysis. The framework allows an in-silico assessment of a target AM geometry where fine-scale features may greatly impact quantities of design interest such as in soft elastomeric lattices where toolpath infill can greatly influence the self contact of a structure in compression, which we will use as a motivating exemplar. This approach greatly reduces the waste of both time and resources consumed through traditional build and test design cycles for non-intuitive design spaces. It also further allows for the exploration of toolpath infill to optimize component properties beyond simple linear properties such as density and stiffness.
Authors:Xicheng Wang, Yun. Feng, Dmitry Grishchenko, Pavel Kudinov, Ruifeng Tian, Sichao Tan
Abstract:
Thermal-Hydraulic (TH) experiments provide valuable insight into the physics of heat and mass transfer and qualified data for code development, calibration and validation. However, measurements are typically collected from sparsely distributed sensors, offering limited coverage of the domain and phenomena of interest. Determination of the spatial configuration of these sensors is crucial and challenging during the pre-test design stage. This paper develops a data-driven framework for optimizing sensor placement in TH experiments, including (i) a sensitivity analysis to construct datasets, (ii) Proper Orthogonal Decomposition (POD) for dimensionality reduction, and (iii) QR factorization with column pivoting to determine optimal sensor configuration under spatial constraints. The framework is demonstrated on a test conducted in the TALL-3D Lead-bismuth eutectic (LBE) loop. In this case, the use of optical techniques, such as Particle Image Velocimetry (PIV), is impractical. Therefore, the quantification of momentum and energy transport relies heavily on readings from Thermocouples (TCs). The test section was previously instrumented with many TCs whose locations were determined through a manual process combining simulation results with expert judgement. The proposed framework provides a systematic and automated approach for sensor placement. The resulting TCs exhibit high sensitivity to the variation of uncertain input parameters and enable accurate full-field reconstruction while maintaining robustness against measurement noise.
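Steps (ii) and (iii) above can be sketched on synthetic data as follows. This is a minimal illustration of the generic POD plus pivoted-QR recipe, not the authors' implementation: the snapshot matrix is made up, the number of modes `r` is chosen by hand, and spatial constraints are not modeled (they could be imposed, for instance, by restricting the candidate rows before the factorization).

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)

# Synthetic snapshot matrix: 200 candidate locations x 50 simulated runs.
x = np.linspace(0.0, 1.0, 200)
true_modes = np.stack(
    [np.sin(np.pi * x), np.sin(2 * np.pi * x), np.sin(3 * np.pi * x)], axis=1
)
snapshots = true_modes @ rng.normal(size=(3, 50))

# (ii) POD: the leading left singular vectors form the reduced basis.
U, singular_values, _ = np.linalg.svd(snapshots, full_matrices=False)
r = 3
basis = U[:, :r]                       # shape (200, r)

# (iii) QR with column pivoting on basis^T: the first r pivots are sensor rows.
_, _, pivots = qr(basis.T, pivoting=True)
sensors = pivots[:r]

# Reconstruct an unseen field from its values at the selected sensors only.
field = true_modes @ rng.normal(size=3)
coefficients = np.linalg.solve(basis[sensors, :], field[sensors])
reconstruction = basis @ coefficients

print("sensor indices:", sensors)
print("max reconstruction error:", np.max(np.abs(reconstruction - field)))
```

The pivoting greedily picks rows of the basis that keep the small sensor matrix well conditioned, which is what makes the reconstruction tolerant of measurement noise in practice.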
Authors:Amira Abbas, Nunzia Cerrato, Francisco Escudero Gutiérrez, Dmitry Grinko, Francesco Anna Mele, Pulkit Sinha
Abstract:
We study the problem of learning Hamiltonians $H$ that are $s$-sparse in the Pauli basis, given access to their time evolution. Although Hamiltonian learning has been extensively investigated, two issues recur in much of the existing literature: the absence of matching lower bounds and the use of mathematically convenient but physically opaque error measures. We address both challenges by introducing two physically motivated distances between Hamiltonians and designing a nearly optimal algorithm with respect to one of these metrics. The first, time-constrained distance, quantifies distinguishability through dynamical evolution up to a bounded time. The second, temperature-constrained distance, captures distinguishability through thermal states at bounded inverse temperatures. We show that $s$-sparse Hamiltonians with bounded operator norm can be learned in both distances with $O(s \log(1/ε))$ experiments and $O(s^2/ε)$ evolution time. For the time-constrained distance, we further establish lower bounds of $Ω((s/n)\log(1/ε) + s)$ experiments and $Ω(\sqrt{s}/ε)$ evolution time, demonstrating near-optimality in the number of experiments. As an intermediate result, we obtain an algorithm that learns every Pauli coefficient of $s$-sparse Hamiltonians up to error $ε$ in $O(s\log(1/ε))$ experiments and $O(s/ε)$ evolution time, improving upon several recent results. The source of this improvement is a new isolation technique, inspired by the Valiant-Vazirani theorem (STOC'85), which shows that NP is as easy as detecting unique solutions. This isolation technique allows us to query the time evolution of a single Pauli coefficient of a sparse Hamiltonian--even when the Pauli support of the Hamiltonian is unknown--ultimately enabling us to recover the Pauli support itself.
Authors:Ueli Schilt, Somesh Vijayananda, Sarah Schneeberger, Manuel Meyer, Santhosh Iyyakkunnel, Pascal Marc Vecsei, Philipp Schuetz
Abstract:
Achieving net-zero targets requires the phase-out of fossil-based heating. A major challenge is the seasonal mismatch between renewable heat supply and demand. District heating networks often dispose of excess heat in summer and rely on fossil backups in winter. Large-scale thermal energy storage offers a solution by storing surplus summer heat for use during winter, thus reducing the need for fossil fuels. This study investigates the feasibility of a large-scale thermal storage system at a power production site that supplies a large district heating network in the city of Bern, Switzerland. Specifically, the study examines the potential of a geothermal storage system to offset fossil fuel heat generation in winter by utilising heat stored during the summer months. Using a Python-based multi-energy system model, we simulate the optimal operation of the geothermal storage system with respect to cost and emissions, considering both supply and demand on an hourly basis over one year. Multi-objective optimisation is applied to generate a Pareto-optimal front. The results show that the geothermal storage system eliminates the requirement of 8 GWh of gas-powered heat supply and increases the waste heat utilisation by 20%, therefore lowering emissions. This effect is further increased when combined with an expansion of the district heating network, as individual, emission-heavy heaters are replaced by low-emission heat from the district heating network. The findings presented in this study can prove useful when evaluating similar systems across Switzerland.
Authors:Mingyuan Yang, Qian Yu, Chao Yang
Abstract:
We present a Pseudo-Transient Topology Optimization (PeTTO) approach that can leverage graphics processing units (GPUs) to efficiently solve single-material and multi-material topology optimization problems. By integrating PeTTO with phase field methods, the partial differential equation (PDE)-constrained optimization problem in topology optimization is transformed into a set of time-dependent PDEs, which can be analyzed using the knowledge of transient physics. The sensitivities with respect to the design variable are calculated with automatic differentiation, which helps avoid tedious and error-prone manual derivations. The overall system of equations is efficiently solved using a hybrid of the pseudo-transient method and the accelerated pseudo-transient method, balancing the convergence rate and numerical stability. A variety of numerical examples are presented to demonstrate the effectiveness and efficiency of the proposed PeTTO approach. These examples cover different physics scenarios including mechanical and thermal problems, as well as single-material and multi-material cases in both 2D and 3D. The numerical results show a 40- to 50-fold speedup when running the same PeTTO code on a single GPU compared to desktop CPUs. This work helps bridge the gap between high-performance computing and topology optimization, potentially enabling faster and better designs for real-world problems.
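The basic pseudo-transient idea referred to above can be sketched on a much simpler problem than topology optimization: a steady 1D heat conduction equation is solved by marching an artificial time variable until the residual vanishes. The problem setup, function name, and parameters below are illustrative assumptions; the accelerated variant, the phase field coupling, and the GPU implementation from the paper are not shown.

```python
import numpy as np

def pseudo_transient_heat(n: int = 21, source: float = 1.0,
                          tol: float = 1e-10, max_iter: int = 200_000):
    """Solve -T'' = source on [0, 1] with T(0) = T(1) = 0 by iterating
    dT/dtau = T'' + source in pseudo-time until the residual is below tol."""
    x = np.linspace(0.0, 1.0, n)
    dx = x[1] - x[0]
    T = np.zeros(n)
    dtau = 0.4 * dx**2            # explicit stability requires dtau <= dx^2 / 2
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Residual of the steady equation at interior nodes.
        residual = (T[:-2] - 2.0 * T[1:-1] + T[2:]) / dx**2 + source
        T[1:-1] += dtau * residual
        if np.max(np.abs(residual)) < tol:
            break
    return x, T, iterations

x, T, iterations = pseudo_transient_heat()
exact = 0.5 * x * (1.0 - x)       # analytical steady solution for source = 1
print("iterations:", iterations)
print("max error:", np.max(np.abs(T - exact)))
```

Each update touches only neighboring nodes, which is why this family of iterations maps well onto GPUs; the accelerated variant adds a damped second-order term in pseudo-time to reduce the iteration count.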
Authors:Henri ter Hofte, Nick van Ravenzwaaij
Abstract:
We introduce NeedForHeat DataGear: an open hardware and open software data collection system designed to accelerate the residential heating transition. NeedForHeat DataGear collects time series monitoring data in homes that have not yet undergone a heating transition, enabling assessment of real-life thermal characteristics, heating system efficiency, and residents' comfort needs. This paper outlines its architecture and functionalities, emphasizing its modularity, adaptability, and cost-effectiveness for field data acquisition. Unlike conventional domestic monitoring solutions focused on home automation, direct feedback, or post-installation heat pump monitoring, it prioritizes time series data we deemed essential to evaluate the current situation in existing homes before the heating transition. Designed for seamless deployment across diverse households, NeedForHeat DataGear combines openness, security, and privacy with a low-cost, user-friendly approach, making it a valuable tool for researchers, energy professionals, and energy coaches.
Authors:Mohammad Ahangarkiasari, Hassan Pouraria
Abstract:
Buoyancy-driven heat transfer in closed cavities serves as a canonical testbed for thermal design. High-fidelity CFD modelling yields accurate thermal field solutions, yet its reliance on expert-crafted physics models, fine meshes, and intensive computation limits rapid iteration. Recent developments in data-driven modeling, especially Graph Neural Networks (GNNs), offer new alternatives for learning thermal-fluid behavior directly from simulation data, particularly on irregular mesh structures. However, conventional GNNs often struggle to capture long-range dependencies in high-resolution graph structures. To overcome this limitation, we propose a novel multi-stage GNN architecture that leverages hierarchical pooling and unpooling operations to progressively model global-to-local interactions across multiple spatial scales. We evaluate the proposed model on our newly developed CFD dataset simulating natural convection within rectangular cavities with varying aspect ratios, where the bottom wall is isothermal hot, the top wall is isothermal cold, and the two vertical walls are adiabatic. Experimental results demonstrate that the proposed model achieves higher predictive accuracy, improved training efficiency, and reduced long-term error accumulation compared to state-of-the-art (SOTA) GNN baselines. These findings underscore the potential of the proposed multi-stage GNN approach for modeling complex heat transfer in mesh-based fluid dynamics simulations.
Authors:Caleb Gates, Patrick Moorhead, Jayden Ferguson, Omar Darwish, Conner Stallman, Pablo Rivas, Paapa Quansah
Abstract:
Dust storms harm health and reduce visibility; quick detection from satellites is needed. We present a near real-time system that flags dust at the pixel level using multi-band images from NASA's Terra and Aqua (MODIS). A 3D convolutional network learns patterns across all 36 bands, plus split thermal bands, to separate dust from clouds and surface features. Simple normalization and local filling handle missing data. An improved version raises training speed by 21x and supports fast processing of full scenes. On 17 independent MODIS scenes, the model reaches about 0.92 accuracy with a mean squared error of 0.014. Maps show strong agreement in plume cores, with most misses along edges. These results show that joint band-and-space learning can provide timely dust alerts at global scale; using wider input windows or attention-based models may further sharpen edges.
Authors:Hugo Parada, Claudia Negulescu
Abstract:
This paper is concerned with the design of efficient numerical schemes for a specific Fokker-Planck equation describing the dynamics of energetic particles occurring in thermonuclear fusion plasmas (runaway electrons, for example). In the long-time limit, the velocity distribution function of these particles tends towards a thermal non-equilibrium $κ$-distribution function which is a steady state of the considered Fokker-Planck equation. These $κ$-distribution functions have the particularity of decaying only algebraically for large velocities, thus describing suprathermal particle populations very well. Our aim is to present two efficient spectral methods for the simulation of such energetic particle dynamics. The first method is based on rational Chebyshev basis functions, rather than on Hermite basis sets, which are the basis of choice for Maxwellian steady states. The second method is based on a different polynomial basis set, constructed via the Gram-Schmidt orthogonalisation process. These two new spectral schemes, specifically adapted to the physical context considered here, make it possible to handle the long-time asymptotics without significant numerical cost.
Authors:Yannick Weiss, Marlene Eder, Oguzhan Cesur, Steeven Villa
Abstract:
Thermal sensations are central to how we experience the world, yet most virtual and extended reality systems fail to simulate them effectively. While hardware-based thermal displays can provide accurate temperature changes, they are often bulky, power-intensive, and restrict user mobility. Consequently, recent works have explored thermal illusions, perceptual effects that rely on cross-modal interactions, to achieve thermal experiences without physical heating or cooling. While thermal illusions have been shown to consistently alter subjective ratings, the actual extent of their effect on the perceived temperature of interacted objects remains unexplored. To address this, we contribute the findings of two user studies following psychophysical procedures. We first ordered and scaled the effects of a variety of visual and auditory cues (N=20) and subsequently quantified their isolated and combined efficacy in offsetting physical temperature changes (N=24). We found that thermal illusions elicited robust changes in subjective judgments, and auditory cues showed potential as an alternative or complementary approach to established visual techniques. However, the actual effects induced by thermal illusions were relatively small (±0.5°C) and did not consistently align with abstract ratings, suggesting a need to reconsider how future thermal illusions or experiences are designed and evaluated.
Authors:Sitong Tao, Fei Han
Abstract:
A thermo-mechanical fracture model is proposed to address thermal failure issues, where the temperature field is calculated by a heat conduction model based on classical continuum mechanics (CCM), while the deformation field with discontinuities is calculated by the peridynamic (PD) model. The model is solved by a CCM/PD alternating scheme based on finite element discretization, which ensures calculation accuracy and facilitates engineering applications. The original PD model defines damage solely based on the number of broken bonds in the vicinity of the material point, neglecting the distribution of these bonds. To address this limitation, a new definition of the PD damage accounting for both the number of broken bonds and their specific distribution is proposed. As a result, damage in various directions can be captured, enabling more realistic thermal fracture simulations based on a unified mesh discretization. The effectiveness of the proposed model is validated by comparing numerical examples with analytical solutions. Moreover, simulation results of quasi-static and dynamic crack propagation demonstrate the model's ability to aid in understanding the initiation and propagation mechanisms of complex thermal fractures.
Authors:Mirkan Emir Sancak, Unal Sen, Ulker Diler Keris-Sen
Abstract:
Accurate determination of total oxidant concentration ([Ox]_{tot}) in non-thermal plasma (NTP)-treated aqueous systems remains a critical challenge due to the transient nature of reactive oxygen and nitrogen species and the subjectivity of conventional titration methods used for [Ox]_{tot} determination. This study introduces a novel, color-based computer analysis (CBCA) method that integrates advanced image processing with machine learning (ML) to quantify colorimetric shifts in potassium iodide (KI) solutions during oxidation. First, a custom-built visual data acquisition system captured high-resolution video of the color transitions in a KI solution during oxidation with an NTP system. The change in [Ox]_{tot} during the experiments was monitored with a standard titrimetric method. Second, the captured frames were processed using a robust image processing pipeline to extract RGB, HSV, and Lab color features. The extracted features were statistically evaluated, and the results revealed strong linear correlations with the measured [Ox]_{tot} values, particularly in the saturation (HSV), a and b (Lab), and blue (RGB) channels. Subsequently, the [Ox]_{tot} measurements and the extracted color features were used to train and validate five ML models. Among them, linear regression and gradient boosting models achieved the highest predictive accuracy (R^2 > 0.990). It was also found that reducing the feature set from nine to four resulted in comparable performance with improved prediction efficiency, especially for gradient boosting. Finally, comparison of the model predictions with real titration measurements revealed that the CBCA system successfully predicts the [Ox]_{tot} in KI solution with high accuracy (R^2 > 0.998) even with a reduced number of features.
Authors:Nicholas J. Sullivan, Julio J. Valdés, Kirk H. Bevan, Peter Grutter
Abstract:
Scanning probe microscopy (SPM) is a valuable technique by which one can investigate the physical characteristics of the surfaces of materials. However, its widespread use is hampered by the time-consuming nature of running an experiment and the significant domain knowledge required. Recent studies have shown the value of multiple forms of automation in improving this, but their use is limited due to the difficulty of integrating them with SPMs other than the one they were developed for. With this in mind, we propose an automation framework for SPMs aimed toward facilitating code sharing and reusability of developed components. Our framework defines generic control and data structure schemas which are passed among independent software processes (components), with the final SPM commands sent after passing through an SPM-specific translator. This approach permits multi-language support and allows for experimental components to be decoupled among multiple computers. Our mediation logic limits access to the SPM to a single component at a time, with a simple override mechanism in order to correct detected experiment problems. To validate our proposal, we integrated and tested it with two SPMs from separate manufacturers, and ran an experiment involving a thermal drift correction component.
Authors:He Li, Xinyu Liu, Weihang Kong, Xingchen Zhang
Abstract:
Visible and infrared image fusion (VIF) is an important multimedia task in computer vision. Most VIF methods focus primarily on optimizing fused image quality. Recent studies have begun incorporating downstream tasks, such as semantic segmentation and object detection, to provide semantic guidance for VIF. However, semantic segmentation requires extensive annotations, while object detection, despite reducing annotation efforts compared with segmentation, faces challenges in highly crowded scenes due to overlapping bounding boxes and occlusion. Moreover, although RGB-T crowd counting has gained increasing attention in recent years, no studies have integrated VIF and crowd counting into a unified framework. To address these challenges, we propose FusionCounting, a novel multi-task learning framework that integrates crowd counting into the VIF process. Crowd counting provides a direct quantitative measure of population density with minimal annotation, making it particularly suitable for dense scenes. Our framework leverages both input images and population density information in a mutually beneficial multi-task design. To accelerate convergence and balance task contributions, we introduce a dynamic loss function weighting strategy. Furthermore, we incorporate adversarial training to enhance the robustness of both VIF and crowd counting, improving the model's stability and resilience to adversarial attacks. Experimental results on public datasets demonstrate that FusionCounting not only enhances image fusion quality but also achieves superior crowd counting performance.
Authors:Joe Alexandersen, Magnus Appel
Abstract:
This paper presents a novel space-time topology optimisation framework for time-dependent thermal conduction problems, aiming to significantly reduce the time-to-solution. By treating time as an additional spatial dimension, we discretise the governing equations using a stabilised continuous Galerkin space-time finite element method. The resulting large all-at-once system is solved using an iterative Krylov solver preconditioned with a parallel space-time multigrid method employing a semi-coarsening strategy. Implemented in a fully parallel computing framework, the method yields a parallel-in-time method that demonstrates excellent scalability on a distributed-memory supercomputer, solving problems up to 4.2 billion degrees of freedom. Comparative studies show up to 52x speed-up over traditional time-stepping approaches, with only moderate increases in total computational cost in terms of core-hours. The framework is validated on benchmark problems with both time-constant and time-varying designs, and its flexibility is demonstrated through variations in material properties. These results establish the proposed space-time method as a promising approach for large-scale time-dependent topology optimisation in thermal applications.
Authors:Neil F. Johnson, Frank Yingjie Huo
Abstract:
Output from generative AI, such as ChatGPT, can be repetitive and biased. But more worrying is that this output can mysteriously tip mid-response from good (correct) to bad (misleading or wrong) without the user noticing. In 2024 alone, this reportedly caused $67 billion in losses and several deaths. Establishing a mathematical mapping to a multispin thermal system, we reveal a hidden tipping instability at the scale of the AI's 'atom' (basic Attention head). We derive a simple but essentially exact formula for this tipping point which shows directly the impact of a user's prompt choice and the AI's training bias. We then show how the output tipping can get amplified by the AI's multilayer architecture. As well as helping improve AI transparency, explainability and performance, our results open a path to quantifying users' AI risk and legal liabilities.
Authors:Shengao Yi, Xiaojiang Li, Wei Tu, Tianhong Zhao
Abstract:
As extreme heat events intensify due to climate change and urbanization, cities face increasing challenges in mitigating outdoor heat stress. While traditional physical models such as SOLWEIG and ENVI-met provide detailed assessments of human-perceived heat exposure, their computational demands limit scalability for city-wide planning. In this study, we propose GSM-UTCI, a multimodal deep learning framework designed to predict daytime average Universal Thermal Climate Index (UTCI) at 1-meter hyperlocal resolution. The model fuses surface morphology (nDSM), high-resolution land cover data, and hourly meteorological conditions using a feature-wise linear modulation (FiLM) architecture that dynamically conditions spatial features on atmospheric context. Trained on SOLWEIG-derived UTCI maps, GSM-UTCI achieves near-physical accuracy, with an R^2 of 0.9151 and a mean absolute error (MAE) of 0.41°C, while reducing inference time from hours to under five minutes for an entire city. To demonstrate its planning relevance, we apply GSM-UTCI to simulate systematic landscape transformation scenarios in Philadelphia, replacing bare earth, grass, and impervious surfaces with tree canopy. Results show spatially heterogeneous but consistently strong cooling effects, with impervious-to-tree conversion producing the highest aggregated benefit (-4.18°C average change in UTCI across 270.7 km²). Tract-level bivariate analysis further reveals strong alignment between thermal reduction potential and land cover proportions. These findings underscore the utility of GSM-UTCI as a scalable, fine-grained decision support tool for urban climate adaptation, enabling scenario-based evaluation of greening strategies across diverse urban environments.
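The FiLM conditioning mentioned in this abstract can be illustrated with a minimal sketch: each feature channel is scaled and shifted by values computed from a conditioning vector. The function name `film`, the linear maps `w_gamma` and `w_beta`, and the toy numbers are assumptions made for illustration only; they are not taken from GSM-UTCI, whose actual conditioning network is not described here.

```python
from typing import List


def film(features: List[List[float]], context: List[float],
         w_gamma: List[List[float]], w_beta: List[List[float]]) -> List[List[float]]:
    """Feature-wise linear modulation: out[c] = gamma[c] * features[c] + beta[c].

    features: one flat list of spatial values per channel.
    context:  conditioning vector (e.g. meteorological variables).
    w_gamma, w_beta: hypothetical linear maps (one row per channel) that
    turn the context into a per-channel scale and shift.
    """
    out = []
    for channel, wg, wb in zip(features, w_gamma, w_beta):
        gamma = sum(w * c for w, c in zip(wg, context))
        beta = sum(w * c for w, c in zip(wb, context))
        out.append([gamma * x + beta for x in channel])
    return out


# Toy example: two channels with three pixels each, two context variables.
features = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
context = [2.0, 1.0]
w_gamma = [[1.0, 0.0], [0.0, 1.0]]  # gives gamma = [2.0, 1.0]
w_beta = [[0.0, 1.0], [0.5, 0.0]]   # gives beta  = [1.0, 1.0]
modulated = film(features, context, w_gamma, w_beta)
```

In a trained model the scale and shift would usually come from a small learned network rather than fixed linear maps; the point of the sketch is only that the same spatial features produce different outputs under different atmospheric context.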
Authors:Daniele Lanzoni, Olivier Pierre-Louis, Roberto Bergamaschini, Francesco Montalenti
Abstract:
We show that Generative Adversarial Networks (GANs) may be fruitfully exploited to learn stochastic dynamics, surrogating traditional models while capturing thermal fluctuations. Specifically, we showcase the application to a two-dimensional, many-particle system, focusing on surface-step fluctuations and on the related time-dependent roughness. After the construction of a dataset based on Kinetic Monte Carlo simulations, a conditional GAN is trained to propagate stochastically the state of the system in time, allowing the generation of new sequences with a reduced computational cost. Modifications with respect to standard GANs, which facilitate convergence and increase accuracy, are discussed. The trained network is demonstrated to quantitatively reproduce equilibrium and kinetic properties, including scaling laws, with deviations of a few percent from the exact value. Extrapolation limits and future perspectives are critically discussed.
Authors:Walter Boscheri, Firas Dhaouadi
Abstract:
We propose a new curl-free and thermodynamically compatible finite volume scheme on Voronoi grids to solve compressible heat conducting flows written in first-order hyperbolic form. The approach is based on the definition of compatible discrete curl-grad operators, exploiting the triangular nature of the dual mesh. We design a cell solver reminiscent of the nodal solvers used in Lagrangian schemes to discretize the evolution equation for the thermal impulse vector, and we demonstrate that the resulting numerical scheme ensures energy conservation, local non-negative entropy production, as well as asymptotic consistency with the classical Fourier law in the stiff relaxation limit. A novel technique is proposed to transfer residuals from the dual to the primal mesh as subfluxes, which eventually yields the construction of entropy compatible semi-discrete methods. The scheme and its properties are validated on a set of numerical test cases.
Authors:Darío Slaifstein, Gautham Ram Chandra Mouli, Laura Ramirez-Elizondo, Pavol Bauer
Abstract:
The operation of residential energy hubs with multiple energy carriers (electricity, heat, mobility) poses a significant challenge due to different carrier dynamics, hybrid storage coordination and high-dimensional action-spaces. Energy management systems oversee their operation, deciding the set points of the primary control layer. This paper presents a novel 2-stage economic model predictive controller for electrified buildings including physics-based models of the battery degradation and thermal systems. The hierarchical control operates in the Dutch sequential energy markets. In particular, common assumptions regarding intra-day markets (auction and continuous-time) are discussed, as well as the coupling of the different storage systems. The best control policy is to co-optimize day-ahead and intra-day auctions in the first stage, to later follow intra-day auctions. If no intra-day prices are known at the time of the day-ahead auction, it is best to follow the continuous-time intra-day market in the summer and the intra-day auction in the winter. Additionally, this sequential operation increases battery degradation. Finally, under our controller the realized short-term flexibility of the thermal energy storage is marginal compared to the flexibility delivered by the static battery pack and electric vehicles with bidirectional charging.
Authors:Hassan Zahid Butt, Xingpeng Li
Abstract:
Traditional long-term microgrid planning models assume constant power charging for battery energy storage systems (BESS), overlooking efficiency losses that occur toward the end of charge due to rising internal resistance. While this issue can be mitigated at the cell level using constant current-constant voltage (CCCV) charging, it is impractical at the pack level in large-scale systems. However, battery management systems and inverter controls can emulate this effect by tapering charging power at high state-of-charge (SOC) levels, trading off charging speed for improved efficiency and reduced thermal stress. Ignoring this behavior in planning models can lead to undersized batteries and potential reliability issues. This paper proposes a tractable and scalable approach to approximate CCCV behavior using SOC-dependent tapered charging power (TCP) constraints. A MATLAB-based proof of concept demonstrates the energy delivery and efficiency benefits of tapering. The method is integrated into a long-term planning framework and evaluated under a synthetic load and solar profile. Results show tapering significantly affects BESS sizing, cost, and reliability under dynamic operating conditions that demand fast charging. These findings highlight tapering as a critical modeling factor for accurately capturing BESS performance in long-term microgrid planning.
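The idea of an SOC-dependent tapered charging power limit can be sketched as follows. This is only an illustration under stated assumptions: the linear taper shape, the 0.8 SOC threshold, the 10% power floor, and the names `tapered_charge_limit` and `hours_to_full` are hypothetical and do not come from the paper, and the efficiency gains it reports are not modelled.

```python
def tapered_charge_limit(soc, p_max, soc_taper=0.8, p_floor_frac=0.1):
    """Maximum charging power (kW) allowed at state of charge `soc` in [0, 1].

    Below `soc_taper` the full power is available; above it the limit falls
    linearly to `p_floor_frac * p_max` as the battery approaches full.
    """
    if soc <= soc_taper:
        return p_max
    if soc >= 1.0:
        return 0.0
    frac = (1.0 - soc) / (1.0 - soc_taper)  # 1 at taper start, 0 at full
    return p_max * (p_floor_frac + (1.0 - p_floor_frac) * frac)


def hours_to_full(capacity_kwh, p_max, soc0=0.2, dt_h=0.01, taper=True):
    """Time-step a charge from `soc0` to (almost) full and return the hours needed."""
    soc, t = soc0, 0.0
    while soc < 0.999:
        p = tapered_charge_limit(soc, p_max) if taper else p_max
        soc = min(1.0, soc + p * dt_h / capacity_kwh)
        t += dt_h
    return t


t_constant = hours_to_full(100.0, 50.0, taper=False)
t_tapered = hours_to_full(100.0, 50.0, taper=True)
```

With these assumed parameters the tapered charge takes longer than the constant-power one, which is the charging-speed side of the trade-off the abstract describes; a planning model that ignores the taper would overestimate how quickly stored energy becomes available.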
Authors:Waqar Muhammad Ashraf, Amir H. Keshavarzzadeh, Abdulelah S. Alshehri, Abdulrahman bin Jumah, Ramit Debnath, Vivek Dua
Abstract:
The domain-consistent adoption of artificial intelligence (AI) remains low in thermal power plants due to the black-box nature of AI algorithms and low representation of domain knowledge in conventional data-centric analytics. In this paper, we develop a MAhalanobis Distance-based OPTimization (MAD-OPT) framework that incorporates the Mahalanobis distance-based constraint to introduce domain knowledge into data-centric analytics. The developed MAD-OPT framework is applied to maximize thermal efficiency and minimize turbine heat rate for a 395 MW capacity gas turbine system. We demonstrate that the MAD-OPT framework can estimate domain-informed optimal process conditions under different ambient conditions, and the optimal solutions are found to be robust as evaluated by Monte Carlo simulations. We also apply the MAD-OPT framework to estimate optimal process conditions beyond the design power generation limit of the gas turbine system, and have found comparable results with the actual data of the power plant. We demonstrate that implementing data-centric optimization analytics without incorporating domain-informed constraints may provide ineffective solutions that may not be implementable in the real operation of the gas turbine system. This research advances the integration of data-driven domain knowledge into machine learning-powered analytics that enhances domain-informed operational excellence and paves the way for safe AI adoption in thermal power systems.
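A Mahalanobis distance constraint of the kind named in this abstract can be sketched for two process variables. The sketch below is a toy illustration, not the MAD-OPT implementation: the helper names, the threshold of 3.0, and the five historical operating points are all made up for the example.

```python
import math


def mean_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of x from the historical data cloud."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Uses the closed-form inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


def is_feasible(x, mean, cov, max_distance=3.0):
    """Constraint: keep a candidate set point close to observed operation."""
    return mahalanobis_2d(x, mean, cov) <= max_distance


# Made-up historical operating points (two normalised process variables).
history = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
mean, cov = mean_cov_2d(history)
```

An optimizer would evaluate `is_feasible` (or the distance itself) as a constraint, so that candidate operating conditions far from anything seen in plant data are rejected even if a surrogate model predicts high efficiency there.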
Authors:Reece Bourisaw, Reid McCants, Jean-Marie Le Corre, Anna Iskhakova, Arsen S. Iskhakov
Abstract:
Critical heat flux (CHF) marks the onset of boiling crisis in light-water reactors, defining safe thermal-hydraulic operating limits. To support Phase II of the OECD/NEA AI/ML CHF benchmark, which introduces spatially varying power profiles, this work compiles and digitizes a broad CHF dataset covering both uniform and non-uniform axial heating conditions. Heating profiles were extracted from technical reports, interpolated onto a consistent axial mesh, validated via energy-balance checks, and encoded in machine-readable formats for benchmark compatibility.
Classical CHF correlations exhibit substantial errors under uniform heating and degrade markedly when applied to non-uniform profiles, while modern tabular methods offer improved but still imperfect predictions. A neural network trained solely on uniform data performs well in that regime but fails to generalize to spatially varying scenarios, underscoring the need for models that explicitly incorporate axial power distributions. By providing these curated datasets and baseline modeling results, this study lays the groundwork for advanced transfer-learning strategies, rigorous uncertainty quantification, and design-optimization efforts in the next phase of the CHF benchmark.
Authors:Pietro Favaro, Jean-François Toubeau, François Vallée, Yury Dvorkin
Abstract:
Heating, Ventilation, and Air Conditioning (HVAC) is a major electricity end-use with a substantial potential for grid services such as demand response. Harnessing this flexibility requires accurate modeling of the thermal dynamics of buildings, which is challenging because their nonlinear and repetitive behavior (e.g., daily patterns) reduces the value of historical data. To address this issue, this paper presents an HVAC management system formulated as a Mixed Integer Quadratic Program (MIQP), where Neural Network (NN) models of thermal dynamics are embedded as exact mixed-integer linear constraints. We employ Decision-Focused Learning (DFL), which tunes the NN parameters to improve the HVAC performance rather than prediction metrics. However, the discrete nature of the MIQP poses challenges for this approach, as it leads to gradients that are undefined or discontinuous, thus impeding standard gradient-based training. Here, we employ Stochastic Smoothing (SS), which enables efficient gradient computation without the need to differentiate through the MIQP. Experiments on a realistic five-zone building using a high-fidelity building simulator demonstrate that the proposed SS-DFL approach outperforms conventional two-stage and relaxed DFL methods in both cost savings and grid service performance, highlighting its potential for scalable, grid-interactive building control.
Authors:Haitao Huang, Chuangtao Chen, Qinglin Zhao
Abstract:
The generation and preservation of complex quantum states against environmental noise are paramount challenges in advancing continuous-variable (CV) quantum information processing. This paper introduces a novel framework based on continuous-variable quantum diffusion principles, synergizing them with CV quantum neural networks (CVQNNs) to address these dual challenges. For the task of state generation, our Continuous-Variable Quantum Diffusion Generative model (CVQD-G) employs a physically driven forward diffusion process using a thermal loss channel, which is then inverted by a learnable, parameter-efficient backward denoising process based on a CVQNN with time-embedding. This framework's capability is further extended for state recovery by the Continuous-Variable Quantum Diffusion Restoration model (CVQD-R), a specialized variant designed to restore quantum states, particularly coherent states with unknown parameters, from thermal degradation. Extensive numerical simulations validate these dual capabilities, demonstrating the high-fidelity generation of diverse Gaussian (coherent, squeezed) and non-Gaussian (Fock, cat) states, typically with fidelities exceeding 99%, and confirming the model's ability to robustly restore corrupted states. Furthermore, a comprehensive complexity analysis reveals favorable training and inference costs, highlighting the framework's efficiency, scalability, and its potential as a robust tool for quantum state engineering and noise mitigation in realistic CV quantum systems.
Authors:Ning Chu, Siya Zheng, Shanqing Zhang, Li Li, Caifang Cai, Ali Mohammad-Djafari, Feng Zhao, Yuanbo Song
Abstract:
Infrared thermography faces persistent challenges in temperature accuracy due to material emissivity variations, and existing methods often neglect the joint optimization of radiometric calibration and image degradation. This study introduces a physically guided neural framework that unifies temperature correction and image enhancement through a symmetric skip-CNN architecture and an emissivity-aware attention module. The pre-processing stage segments the ROIs of the image and applies an initial correction to the firing rate. A novel dual-constrained loss function strengthens the statistical consistency between the target and reference regions through mean-variance alignment and histogram matching based on the Kullback-Leibler divergence. By dynamically fusing thermal radiation features and spatial context, the model suppresses emissivity artifacts while recovering structural details. Validated on an industrial blower system under different conditions, the improved network achieves this dynamic fusion of thermal radiation characteristics and spatial context, with accurate calibration results in various industrial conditions.
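The two ingredients of the loss described in this abstract, mean-variance alignment and histogram matching via the Kullback-Leibler divergence, can be sketched as below. Everything specific here is an assumption for illustration: the function names, the equal weights, the eight-bin hard histogram, and the small smoothing constant are not from the paper, and a trainable version would need a differentiable (soft) histogram.

```python
import math


def histogram(values, bins, lo, hi, eps=1e-6):
    """Normalised histogram with a small additive constant to avoid log(0)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = len(values) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def mean_var(values):
    m = sum(values) / len(values)
    return m, sum((x - m) ** 2 for x in values) / len(values)


def dual_constraint_loss(target, reference, bins=8, lo=0.0, hi=1.0,
                         w_stat=1.0, w_hist=1.0):
    """Mean-variance alignment term plus KL-based histogram matching term."""
    mt, vt = mean_var(target)
    mr, vr = mean_var(reference)
    stat_term = (mt - mr) ** 2 + (vt - vr) ** 2
    hist_term = kl_divergence(histogram(target, bins, lo, hi),
                              histogram(reference, bins, lo, hi))
    return w_stat * stat_term + w_hist * hist_term


# Normalised pixel intensities from two hypothetical image regions.
region_a = [0.1, 0.2, 0.3, 0.4]
region_b = [0.6, 0.7, 0.8, 0.9]
loss_same = dual_constraint_loss(region_a, region_a)
loss_diff = dual_constraint_loss(region_a, region_b)
```

The loss is zero when the two regions have identical statistics and grows as their means, variances, or intensity distributions drift apart.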
Authors:Shuchen Sun, Ligen Shi, Chang Liu, Lina Wu, Jun Qiu
Abstract:
Infrared and visible light image fusion aims to combine the strengths of both modalities to generate images that are rich in information and fulfill visual or computational requirements. This paper proposes an image fusion method based on Implicit Neural Representations (INR), referred to as INRFuse. This method parameterizes a continuous function through a neural network to implicitly represent the multimodal information of the image, breaking through the traditional reliance on discrete pixels or explicit features. The normalized spatial coordinates of the infrared and visible light images serve as inputs, and a multi-layer perceptron is used to adaptively fuse the features of both modalities, resulting in the output of the fused image. By designing multiple loss functions, the method jointly optimizes the similarity between the fused image and the original images, effectively preserving the thermal radiation information of the infrared image while maintaining the texture details of the visible light image. Furthermore, the resolution-independent characteristic of INR allows for the direct fusion of images with varying resolutions and achieves super-resolution reconstruction through high-density coordinate queries. Experimental results indicate that INRFuse outperforms existing methods in both subjective visual quality and objective evaluation metrics, producing fused images with clear structures, natural details, and rich information without the necessity for a training dataset.
Authors:Conor Rowan, John Evans, Kurt Maute, Alireza Doostan
Abstract:
From characterizing the speed of a thermal system's response to computing natural modes of vibration, eigenvalue analysis is ubiquitous in engineering. In spite of this, eigenvalue problems have received relatively little treatment compared to standard forward and inverse problems in the physics-informed machine learning literature. In particular, neural network discretizations of solutions to eigenvalue problems have seen only a handful of studies. Owing to their nonlinearity, neural network discretizations prevent the conversion of the continuous eigenvalue differential equation into a standard discrete eigenvalue problem. In this setting, eigenvalue analysis requires more specialized techniques. Using a neural network discretization of the eigenfunction, we show that a variational form of the eigenvalue problem called the "Rayleigh quotient" in tandem with a Gram-Schmidt orthogonalization procedure is a particularly simple and robust approach to find the eigenvalues and their corresponding eigenfunctions. This method is shown to be useful for finding sets of harmonic functions on irregular domains, parametric and nonlinear eigenproblems, and high-dimensional eigenanalysis. We also discuss the utility of harmonic functions as a spectral basis for approximating solutions to partial differential equations. Through various examples from engineering mechanics, the combination of the Rayleigh quotient objective, Gram-Schmidt procedure, and the neural network discretization of the eigenfunction is shown to offer unique advantages for handling continuous eigenvalue problems.
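The combination of a Rayleigh quotient objective with Gram-Schmidt orthogonalization can be shown on a small discrete problem. Note the substitution: the sketch below replaces the paper's neural network discretization with a plain finite-difference grid for a 1D Laplacian with fixed ends, so that it runs with the standard library alone. The function names, step size, iteration count, and initial guess are assumptions for illustration.

```python
import math


def apply_laplacian(v):
    """Matrix-free product with the tridiagonal (-1, 2, -1) matrix (fixed ends)."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(v, basis):
    """Gram-Schmidt: remove components along earlier modes, then normalise."""
    for b in basis:
        c = dot(v, b)
        v = [x - c * y for x, y in zip(v, b)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def smallest_eigenpairs(n, k, steps=3000, lr=0.25):
    """Find the k smallest eigenpairs by descending the Rayleigh quotient."""
    basis, values = [], []
    for _ in range(k):
        v = orthonormalize([1.0 + 0.3 * i for i in range(n)], basis)
        for _ in range(steps):
            av = apply_laplacian(v)
            rho = dot(v, av)  # Rayleigh quotient, since v has unit norm
            # Descent direction: A v - rho v is proportional to the gradient.
            v = [x - lr * (a - rho * x) for x, a in zip(v, av)]
            v = orthonormalize(v, basis)
        values.append(dot(v, apply_laplacian(v)))
        basis.append(v)
    return values, basis


eigenvalues, modes = smallest_eigenpairs(n=8, k=2)
```

For this matrix the exact eigenvalues are known in closed form, 2 - 2cos(jπ/(n+1)), which makes the sketch easy to check. Each new mode is kept orthogonal to the ones already found, so minimising the same objective yields the next eigenvalue rather than rediscovering the first.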
Authors:Walter Boscheri, Michael Dumbser, Raphael Loubère, Pierre-Henri Maire
Abstract:
In this work we present a novel structure-preserving scheme for the discretization of the Godunov-Peshkov-Romenski (GPR) model of continuum mechanics written in Lagrangian form. This model admits an extra conservation law for the total energy (first principle of thermodynamics) and satisfies the entropy inequality (second principle of thermodynamics). Furthermore, in the absence of algebraic source terms, the distortion field of the continuum and the specific thermal impulse satisfy a curl-free condition, provided the initial data are curl-free. Last but not least, the determinant of the distortion field is related to the density of the medium, i.e. the system is also endowed with a nonlinear algebraic constraint.
The objective of this work is to construct and analyze a new semi-discrete thermodynamically compatible cell-centered Lagrangian finite volume scheme on moving unstructured meshes that satisfies the following structural properties of the governing PDE exactly at the discrete level: i) compatibility with the first law of thermodynamics, i.e. discrete total energy conservation; ii) compatibility with the second law of thermodynamics, i.e. discrete entropy inequality; iii) exact discrete compatibility between the density and the determinant of the distortion field; iv) exact preservation of the curl-free property of the distortion field and of the specific thermal impulse in the absence of algebraic source terms. We show that it is possible to achieve all above properties simultaneously. Unlike in existing schemes, we choose to directly discretize the entropy inequality, hence obtaining total energy conservation as a consequence of an appropriate and thermodynamically compatible discretization of all the other equations.
Authors:Ondřej Benedikt, Michal Sojka, Přemysl Šůcha, Pavel Zaykov, Zdeněk Hanzálek
Abstract:
Multi-Processor Systems-on-Chip (MPSoC) can deliver high performance needed in many industrial domains, including aerospace. However, their high power consumption, combined with avionics safety standards, brings new thermal management challenges. This paper investigates techniques for offline thermal-aware allocation of periodic tasks on heterogeneous MPSoCs running at a fixed clock frequency, as required in avionics. The goal is to find the assignment of tasks to (i) cores and (ii) temporal isolation windows while minimizing the MPSoC temperature. To achieve that, we propose and analyze three power models, and integrate them within several novel optimization approaches based on heuristics, a black-box optimizer, and Integer Linear Programming (ILP). We perform the experimental evaluation on three popular MPSoC platforms (NXP i.MX8QM MEK, NXP i.MX8QM Ixora, NVIDIA TX2) and observe a difference of up to 5.5°C among the tested methods (corresponding to a 22% reduction w.r.t. the ambient temperature). We also show that our method, integrating the empirical power model with the ILP, outperforms the other methods on all tested platforms.
Authors:Alex Brown, Joscha Fregin, Thomas Bendall, Thomas Melvin, Daniel Ruprecht, Jemma Shipton
Abstract:
This paper investigates the application of a fast-wave slow-wave spectral deferred correction time-stepping method (FWSW-SDC) to the compressible Euler equations. The resulting model achieves arbitrary order accuracy in time, demonstrating robust performance in standard benchmark idealised test cases for dynamical cores used for numerical weather prediction. The model uses a compatible finite element spatial discretisation, achieving good linear wave dispersion properties without spurious computational modes. A convergence test confirms the model's high temporal accuracy. Arbitrarily high spatial-temporal convergence is demonstrated using a gravity wave test case. The model is further extended to include the parametrisation of a simple physics process by adding two phases of moisture and its validity is demonstrated for a rising thermal problem. Finally, a baroclinic wave is simulated in a Cartesian domain.
Authors:Magnus Appel, Joe Alexandersen
Abstract:
This paper presents Space-Time MultiGrid (STMG) methods which are suitable for performing topology optimisation of transient heat conduction problems. The proposed methods use a pointwise smoother and uniform Cartesian space-time meshes. For problems with high contrast in the diffusivity, it was found that it is beneficial to define a coarsening strategy based on the geometric mean of the minimum and maximum diffusivity. However, other coarsening strategies may be better for other smoothers. Several methods of discretising the coarse levels were tested. Of these, it was best to use a method which averages the thermal resistivities on the finer levels. However, this was likely a consequence of the fact that only one spatial dimension was considered for the test problems. A second coarsening strategy was proposed which ensures spatial resolution on the coarse grids. Mixed results were found for this strategy. The proposed STMG methods were used as a solver for a one-dimensional topology optimisation problem. In this context, the adjoint problem was also solved using the STMG methods. The STMG methods were sufficiently robust for this application, since they converged during every optimisation cycle. It was found that the STMG methods also work for the adjoint problem when the prolongation operator only sends information forwards in time, even though the direction of time for the adjoint problem is backwards.
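Two quantities named in this abstract are simple enough to sketch: a coarse-level coefficient obtained by averaging thermal resistivities on the finer level, and the geometric mean of the extreme diffusivities. The function names and the coarsening factor of 2 are assumptions, and the abstract does not say how the geometric mean enters the coarsening decision, so the sketch only computes it.

```python
import math


def coarsen_by_resistivity(fine, factor=2):
    """Coarse diffusivity per cell group: inverse of the mean resistivity (1/k)."""
    coarse = []
    for i in range(0, len(fine), factor):
        group = fine[i:i + factor]
        mean_resistivity = sum(1.0 / k for k in group) / len(group)
        coarse.append(1.0 / mean_resistivity)
    return coarse


def geometric_mean_of_extremes(fine):
    """Geometric mean of the minimum and maximum diffusivity."""
    return math.sqrt(min(fine) * max(fine))


# Hypothetical 1D diffusivity field with high contrast in the first cell pair.
fine_field = [1.0, 3.0, 2.0, 2.0]
coarse_field = coarsen_by_resistivity(fine_field)
reference = geometric_mean_of_extremes([1e-3, 1.0])
```

Averaging resistivities is equivalent to taking the harmonic mean of the diffusivities, which matches the effective conductivity of layers in series. That is consistent with the abstract's remark that this choice may have worked best because the test problems had only one spatial dimension.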
Authors:Takahiro Ito, Kiwamu Izumi, Isao Kawano, Ikkoh Funaki, Shuichi Sato, Tomotada Akutsu, Kentaro Komori, Mitsuru Musha, Yuta Michimura, Satoshi Satoh, Takuya Iwaki, Kentaro Yokota, Kenta Goto, Katsumi Furukawa, Taro Matsuo, Toshihiro Tsuzuki, Katsuhiko Yamada, Takahiro Sasaki, Taisei Nishishita, Yuki Matsumoto, Chikako Hirose, Wataru Torii, Satoshi Ikari, Koji Nagano, Masaki Ando, Seiji Kawamura, Hidehiro Kaneda, Shinsuke Takeuchi, Shinichiro Sakai
Abstract:
We propose SILVIA (Space Interferometer Laboratory Voyaging towards Innovative Applications), a mission concept designed to demonstrate ultra-precision formation flying between three spacecraft separated by 100 m. SILVIA aims to achieve sub-micrometer precision in relative distance control by integrating spacecraft sensors, laser interferometry, low-thrust and low-noise micro-propulsion for real-time measurement and control of distances and relative orientations between spacecraft. A 100-meter-scale mission in a near-circular low Earth orbit has been identified as an ideal, cost-effective setting for demonstrating SILVIA, as this configuration maintains a good balance between small relative perturbations and low risk for collision. This mission will fill the current technology gap towards future missions, including gravitational wave observatories such as DECIGO (DECihertz Interferometer Gravitational wave Observatory), designed to detect the primordial gravitational wave background, and high-contrast nulling infrared interferometers like LIFE (Large Interferometer for Exoplanets), designed for direct imaging of thermal emissions from nearby terrestrial planet candidates. The mission concept and its key technologies are outlined, paving the way for the next generation of high-precision space-based observatories.
Authors:Chiaki Kojima, Yuya Muto, Hikaru Akutsu, Rinnosuke Shima, Yoshihiko Susuki
Abstract:
In regions with heavy snowfall, snow accumulation seriously degrades the living environment. Road heating is an electrical system that promotes snow melting by burying a heating cable underground as a thermal source. When integrating road heating into power distribution systems, we need to optimize the flow of electric power by appropriately integrating distributed power sources and conventional power distribution equipment. In this paper, we introduce battery storage to a power distribution system that includes road heating, and extend the predictive switching control from the authors' previous study to the case where battery storage is installed. As a main result, we propose a predictive switching control that effectively utilizes photovoltaic (PV) power generation and surplus power stored in the battery, and simultaneously achieves reduced distribution loss, attenuated voltage fluctuation, and efficient snow melting. We verify the effectiveness of battery storage through numerical simulation using actual time-series data of weather conditions and of the active power of the PV generation and load.
Authors:Chengyi Wang, Ji Wang
Abstract:
This work is motivated by the engineering challenge of suppressing vibrations in turbine blades of aero engines, which often operate under extreme thermal conditions and high-Mach aerodynamic environments that give rise to complex vibration phenomena, commonly referred to as thermally-induced and flow-induced vibrations. Using Hamilton's variational principle, the system is modeled as a rotating slender Timoshenko beam under thermal and aerodynamic loads, described by a coupled system of 2*2 hyperbolic PIDEs, parabolic PDE, and ODEs, where the nonlocal terms exist in the hyperbolic PDE domain, and where the external disturbance (heat flux) flows into one boundary of the heat PDE. For the general form of such mixed systems, we present the state-feedback control design based on the PDE backstepping method, and then design an extended state observer for the unmeasurable distributed states and external disturbances using only available boundary measurements. In the resulting output-feedback closed-loop system, the state of the uncontrolled boundary, i.e., the furthest state from the control input, is proved to be exponentially convergent to zero, and all signals are proved to be uniformly ultimately bounded. Moreover, if the external disturbance vanishes, the exponential stability of the overall system is obtained. The proposed control design is validated on an aero-engine flexible blade under extreme thermal and aerodynamic conditions.
Authors:Pratyush Kumar Singh, Danial Faghihi
Abstract:
We present a thermodynamically consistent three-phase model for the coupled thermal transport and mechanical deformation of ceramic aerogel porous composite materials, which is formulated via continuum mixture theory. The composite comprises a solid silica skeleton, a gaseous fluid phase, and dispersed solid fibers. The thermal transport model incorporates the effects of meso- and macro-pore size variations due to the Knudsen effect, achieved by upscaling phonon transport relations to derive constitutive equations for the fluid thermal conductivity. The mechanical model captures solid-solid and solid-fluid interactions through momentum exchange between phases. A mixed finite element formulation is employed to solve the multiphase model, and numerical studies are conducted to analyze key features of the computational model.
Authors:Janis Nötzel, Pere Munar-Vallespir
Abstract:
We study the hypothesis testing problem of distinguishing between correlated thermal noise and uncorrelated thermal noise of the same average energy on $K$ detectors in asymptotic asymmetric hypothesis testing. We compare the performance of heterodyne or homodyne detection with classical post-processing, the most general quantum strategy (involving any arbitrary measurement), and a simple strategy involving a photonic chip and On-Off detection. When the average received energy per detector goes to zero, the photonic chip strategy asymptotically achieves the optimal decrease in the error, while heterodyne/homodyne measurements do not. Thus, we show that linear optics and On-Off measurement are enough to achieve better detection than classical methods when detecting correlations in thermal optical signals.
Authors:Leonardo D. González, Joshua L. Pulsipher, Shengli Jiang, Tyler Soderstrom, Victor M. Zavala
Abstract:
We present a digital-twin simulator for a pastillation process. The simulation framework produces realistic thermal image data of the process that is used to train computer vision-based soft sensors based on convolutional neural networks (CNNs); the soft sensors produce output signals for temperature and product flow rate that enable real-time monitoring and feedback control. Pastillation technologies are high-throughput devices that are used in a broad range of industries; these processes face operational challenges such as real-time identification of clog locations (faults) in the rotating shell and the automatic, real-time adjustment of conveyor belt speed and operating conditions to stabilize output. The proposed simulator is able to capture this behavior and generates realistic data that can be used to benchmark different algorithms for image processing and different control architectures. We present a case study to illustrate the capabilities; the study explores behavior over a range of equipment sizes, clog locations, and clog duration. A feedback controller (tuned using Bayesian optimization) is used to adjust the conveyor belt speed based on the CNN output signal to achieve the desired process outputs.
Authors:Leyang Wang, Joice Lin
Abstract:
The success of modern machine learning, particularly in facial translation networks, is highly dependent on the availability of high-quality, paired, large-scale datasets. However, acquiring sufficient data is often challenging and costly. Inspired by the recent success of diffusion models in high-quality image synthesis and advancements in Large Language Models (LLMs), we propose a novel framework called LLM-assisted Paired Image Generation (LaPIG). This framework enables the construction of comprehensive, high-quality paired visible and thermal images using captions generated by LLMs. Our method encompasses three parts: visible image synthesis with ArcFace embedding, thermal image translation using Latent Diffusion Models (LDMs), and caption generation with LLMs. Our approach not only generates multi-view paired visible and thermal images to increase data diversity but also produces high-quality paired data while maintaining their identity information. We evaluate our method on public datasets by comparing it with existing methods, demonstrating the superiority of LaPIG.
Authors:Darío Slaifstein, Gautham Ram Chandra Mouli, Laura Ramirez-Elizondo, Pavol Bauer
Abstract:
In the context of building electrification, the operation of distributed energy resources integrating multiple energy carriers (electricity, heat, mobility) poses a significant challenge due to the nonlinear device dynamics, uncertainty, and computational issues. As such, energy management systems seek to decide the power dispatch in the best way possible. The objective is to minimize and balance operative costs (energy bills or asset degradation) with user requirements (mobility, heating, etc.). Current energy management uses empirical battery ageing models outside of their specific fitting conditions, resulting in inaccuracies and poor performance. Moreover, the link to thermal systems is also overlooked. This paper presents an ageing-aware day-ahead algorithm for electrified buildings that incorporates physics-based battery ageing models. The models distinguish between energy storage systems and make explicit the trade-off between grid cost and battery degradation. The proposed day-ahead algorithm can either cut down on grid costs or extend battery lifetime (electric vehicle or stationary battery packs). Moreover, it exploits the differences between cathode chemistries improving grid costs by 25% when using LFP cells, with respect to NMC cells. Finally, the performance using aged batteries is also enhanced with 35% grid cost observed savings, when passing from new to aged batteries in the summer.
Authors:Donát M. Takács, Tamás Fülöp, Róbert Kovács, Mátyás Szücs
Abstract:
In the vicinity of the liquid--vapor critical point, supercritical fluids are strongly compressible and, in parallel, thermophysical properties have strong state dependence. These lead to various peculiar phenomena, one of which is the piston effect, where sudden heating induces a mechanical pulse. The coupling between thermal and mechanical processes, in the linear approximation, yields a non-trivially rich thermoacoustics. The numerous applications of supercritical fluids raise the need for a reliable yet fast and efficient numerical solution for thermoacoustic time and space dependence in this sensitive domain. Here, we present a second-order accurate, fully explicit staggered space-time grid finite difference method for such coupled linear thermoacoustic problems. Time integration is based on the splitting of the state space vector field representing the interactions that affect the dynamics into reversible and irreversible parts, a splitting procedure that leads to decoupled wave and heat equations. The former is a hyperbolic partial differential equation, while the latter is a parabolic one; therefore, different time integration algorithms must be amalgamated to obtain a reliable, dispersion error-free, and dissipation error-free numerical solution. Finally, the thermoacoustic approximation of the supercritical piston effect is investigated via the developed method.
Authors:Waqar Muhammad Ashraf, Vivek Dua, Ramit Debnath
Abstract:
Machine learning and optimisation techniques (MLOPT) hold significant potential to accelerate the decarbonisation of industrial systems by enabling data-driven operational improvements. However, the practical application of MLOPT in industrial settings is often hindered by a lack of domain compliance and system-specific consistency, resulting in suboptimal solutions with limited real-world applicability. To address this challenge, we propose a novel human-in-the-loop (HITL) constraint-based optimisation framework that integrates domain expertise with data-driven methods, ensuring solutions are both technically sound and operationally feasible. We demonstrate the efficacy of this framework through a case study focused on enhancing the thermal efficiency and reducing the turbine heat rate of a 660 MW supercritical coal-fired power plant. By embedding domain knowledge as constraints within the optimisation process, our approach yields solutions that align with the plant's operational patterns and are seamlessly integrated into its control systems. Empirical validation confirms a mean improvement in thermal efficiency of 0.64\% and a mean reduction in turbine heat rate of 93 kJ/kWh. Scaling our analysis to 59 global coal power plants with comparable capacity and fuel type, we estimate a cumulative lifetime reduction of 156.4 million tons of carbon emissions. These results underscore the transformative potential of our HITL-MLOPT framework in delivering domain-compliant, implementable solutions for industrial decarbonisation, offering a scalable pathway to mitigate the environmental impact of coal-based power generation worldwide.
Authors:Peretz Yafin, Nir Sochen, Iftach Klapp
Abstract:
Due to their affordability, low mass, and small dimensions, uncooled microbolometer-based thermal focal plane arrays (UC-FPAs) are useful for long-wave infrared (LWIR) imaging applications. However, in outdoor conditions typical in agricultural remote sensing, cameras based on UC-FPAs may suffer from drift in offset and gain. To tackle the persistent drift, the system requires continuous calibration. Our goal in this study was to eliminate this requirement via a computational schema. In a former study, we estimated unknown gain and offset values and thermographic images of an object from a sequence of pairs of successive images taken at two different blur levels. In the current work, we took on a similar problem using a sequence of shifted images, with relative shifts caused by realistic drone hovering modeled by homography transformation. This places our work in the realm of scene-based nonuniformity correction problems. We show that an object's thermographic values, as well as gain and offset, can be jointly estimated by relying on a few sets of shifted images. We use a minimum likelihood estimator, which is found using alternating minimization. Registration is done using a generalized Lucas-Kanade method. Simulations show promising accuracy with mean Pearson correlation of more than 0.9999998 between ground truth and restoration. Under ideal assumptions, this is equivalent to a mean restoration error of less than 0.01 degrees Celsius.
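The drift model implied in this abstract can be sketched for a single pixel as reading = gain * scene + offset. The code below is a minimal, hypothetical illustration of one half of an alternating scheme: with the scene values held fixed, the gain and offset of a pixel follow from an ordinary least-squares line fit. The linear model, the function name, and the use of least squares are assumptions for illustration; the paper's estimator and its homography-based registration are not reproduced here.

```python
def fit_gain_offset(scene, readings):
    """Least-squares fit of readings[t] = gain * scene[t] + offset.

    scene: radiometric values seen by one pixel across frames (held fixed).
    readings: raw values reported by that pixel in the same frames.
    """
    n = len(scene)
    if n != len(readings) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(scene) / n
    mean_y = sum(readings) / n
    var_x = sum((x - mean_x) ** 2 for x in scene)
    if var_x == 0.0:
        raise ValueError("scene values must vary to identify the gain")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scene, readings))
    gain = cov_xy / var_x
    offset = mean_y - gain * mean_x
    return gain, offset
```

In this simplified picture, the gain is identifiable only if the scene values seen by a pixel vary across frames, which is one way to see why shifted images are informative.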
Authors:Lei Chen, Juheon Lee, Juan Carlos Catana, Tsegai Yhdego, Nathan Moroney, Mohammad Amin Nabian, Hui Wang, Jun Zeng
Abstract:
This paper introduces a data-driven algorithm for modeling and compensating shape deviations in additive manufacturing (AM), addressing challenges in geometric accuracy and batch production. While traditional methods, such as analytical models and metrology, laid the groundwork for geometric precision, they are often impractical for large-scale production. Recent advancements in machine learning (ML) have improved compensation precision, but issues remain in generalizing across complex geometries and adapting to position-dependent variations. We present a novel approach for powder bed fusion (PBF) processes, using GraphCompNet, which is a computational framework combining graph-based neural networks with a generative adversarial network (GAN)-inspired training process. By leveraging point cloud data and dynamic graph convolutional neural networks (DGCNNs), GraphCompNet models complex shapes and incorporates position-specific thermal and mechanical factors. A two-stage adversarial training procedure iteratively refines compensated designs via a compensator-predictor architecture, offering real-time feedback and optimization. Experimental validation across diverse shapes and positions shows the framework significantly improves compensation accuracy (35 to 65 percent) across the entire print space, adapting to position-dependent variations. This work advances the development of Digital Twin technology for AM, enabling scalable, real-time monitoring and compensation, and addressing critical gaps in AM process control. The proposed method supports high-precision, automated industrial-scale design and manufacturing systems.
Authors:Guanzhou Ji, Sriram Narayanan, Azadeh Sawyer, Srinivasa Narasimhan
Abstract:
This paper presents a novel application for directly estimating indoor light and heat maps from captured indoor-outdoor High Dynamic Range (HDR) panoramas. In our image-based rendering method, the indoor panorama is used to estimate the 3D room layout, while the corresponding outdoor panorama serves as an environment map to infer spatially-varying light and material properties. We establish a connection between indoor light transport and heat transport and implement transient heat simulation to generate indoor heat panoramas. The sensitivity analysis of various thermal parameters is conducted, and the resulting heat maps are compared with the images captured by the thermal camera in real-world scenarios. This digital application enables automatic indoor light and heat estimation without manual inputs and cumbersome field measurements.
Authors:Barbara Wirthl, Paolo Decuzzi, Bernhard A. Schrefler, Wolfgang A. Wall
Abstract:
Heat-based cancer treatment, so-called hyperthermia, can be used to destroy tumour cells directly or to make them more susceptible to chemotherapy or radiation therapy. To apply heat locally, iron oxide nanoparticles are injected into the bloodstream and accumulate at the tumour site, where they generate heat when exposed to an alternating magnetic field. However, the temperature must be precisely controlled to achieve therapeutic benefits while avoiding damage to healthy tissue. We therefore present a computational model for nanoparticle-mediated hyperthermia treatment fully integrated into a multiphase porous-media model of the tumour and its microenvironment. We study how the temperature depends on the amount of nanoparticles accumulated in the tumour area and the specific absorption rate of the nanoparticles. Our results show that host tissue surrounding the tumour is also exposed to considerable doses of heat due to the high thermal conductivity of the tissue, which may cause pain or even unnecessary irreversible damage. Further, we include a lumped and a discrete model for the cooling effect of blood perfusion. Using a discrete model of a realistic microvasculature reveals that the small capillaries do not have a significant cooling effect during hyperthermia treatment and that the commonly used lumped model based on Pennes' bioheat equation overestimates the effect: within the specific conditions analysed, the difference between lumped and discrete approaches is approximately 0.75°C, which could influence the therapeutic intervention outcome. Such a comprehensive computational model, as presented here, can provide insights into the optimal treatment parameters for nanoparticle-mediated hyperthermia and can be used to design more efficient treatment strategies.
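For reference, the lumped perfusion model mentioned in this abstract is commonly written as Pennes' bioheat equation. The form below is the standard textbook statement, with an added source term $Q_{\mathrm{np}}$ standing in (as an assumption) for nanoparticle heating; the paper's exact formulation within its multiphase porous-media model may differ.

```latex
\rho c \,\frac{\partial T}{\partial t}
  = \nabla \cdot \left( k \nabla T \right)
  + \rho_b c_b\, \omega_b \left( T_a - T \right)
  + Q_m + Q_{\mathrm{np}}
```

Here $\rho$, $c$, and $k$ are the tissue density, specific heat, and thermal conductivity; $\rho_b$, $c_b$, and $\omega_b$ are the blood density, specific heat, and perfusion rate; $T_a$ is the arterial temperature; and $Q_m$ is metabolic heat generation. The perfusion term acts as a distributed heat sink wherever $T > T_a$, which is the cooling effect that the discrete microvasculature model is compared against.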
Authors:Ben S. Southworth, Steven Walton, Steven B. Roberts, HyeongKae Park
Abstract:
In this paper we develop a framework for moment-based adaptive time integration of deterministic multifrequency thermal radiation transport (TRT). We generalize our recent semi-implicit-explicit (IMEX) integration framework for gray TRT to multifrequency TRT, and also introduce a semi-implicit variation that facilitates higher-order integration of TRT, where each stage is implicit in all components except opacities. To appeal to the broad literature on adaptivity with Runge--Kutta methods, we derive new embedded methods for four asymptotic preserving IMEX Runge--Kutta schemes we have found to be robust in our previous work on TRT and radiation hydrodynamics. We then use a moment-based high-order-low-order representation of the transport equations. Due to the high dimensionality, memory is always a concern in simulating TRT. We form error estimates and adaptivity in time purely based on temperature and radiation energy, for a trivial overhead in computational cost and memory usage compared with the base second order integrators. We then test the adaptivity in time on the tophat and Larsen problem, demonstrating the ability of the adaptive algorithm to naturally vary the timestep across 4--5 orders of magnitude, ranging from the dynamical timescales of the streaming regime to the thick diffusion limit.
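Embedded Runge-Kutta pairs typically drive a step-size controller of the following kind. This is a minimal sketch of the standard elementary controller, not the authors' scheme; the function name, safety factor, and growth limits are assumed values chosen for illustration.

```python
def propose_timestep(dt, error, tolerance, order,
                     safety=0.9, min_factor=0.2, max_factor=5.0):
    """Return (accepted, dt_next) from an embedded error estimate.

    error: norm of the difference between the main and embedded solutions.
    order: order of the lower-order method of the pair, so the local
           error is assumed to scale like dt ** (order + 1).
    """
    if error <= 0.0:
        # No measurable error: grow by the largest allowed factor.
        return True, dt * max_factor
    factor = safety * (tolerance / error) ** (1.0 / (order + 1))
    # Limit how fast the step may shrink or grow between attempts.
    factor = max(min_factor, min(max_factor, factor))
    return error <= tolerance, dt * factor
```

In the setting of this abstract, `error` would presumably be computed from the temperature and radiation energy alone, so the controller adds little cost on top of the base integrator.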
Authors:Ran Zhang, Caihua Wan, Yingqian Xu, Xiaohan Li, Raik Hoffmann, Meike Hindenberg, Shiqiang Liu, Dehao Kong, Shilong Xiong, Shikun He, Alptekin Vardar, Qiang Dai, Junlu Gong, Yihui Sun, Zejie Zheng, Thomas Kämpfe, Guoqiang Yu, Xiufeng Han
Abstract:
Magnetic Tunnel Junctions (MTJs) have shown great promise as hardware sources for true random number generation (TRNG) due to their intrinsic stochastic switching behavior. However, practical deployment remains challenged by drift in switching probability caused by thermal fluctuations, device aging, and environmental instability. This work presents an engineering-oriented, drift-resilient MTJ-based TRNG architecture, enabled by a hybrid control strategy that combines self-stabilizing feedback with pulse width modulation. A key component is the Downcalibration-2 scheme, which updates the control parameter every two steps using only integer-resolution timing, ensuring excellent statistical quality without requiring bit discarding, pre-characterization, or external calibration. Extensive experimental measurements and numerical simulations demonstrate that this approach maintains stable randomness under dynamic temperature drift, using only simple digital logic. The proposed architecture offers high throughput, robustness, and scalability, making it well-suited for secure hardware applications, embedded systems, and edge computing environments.
Authors:Jose Guajardo, Ali Niknejad
Abstract:
Digital beamforming forms the foundation for massive MIMO in 6G wireless communications. At their core, digital beamforming architectures provide key benefits such as faster beam search, interference nulling via zero-force beamforming, higher spectral capacity, and increased flexibility. However, they generally trade off power consumption due to the large number of ADCs in such systems. This paper introduces an open-source MATLAB-based behavioral hardware model of a general digital beamforming system. More specifically, it models an end-to-end uplink between an arbitrary number of user elements (UEs) and an arbitrarily large base station (BS) with and without a strong interferer. This paper also presents and validates an equation-based model for the effects of interference on thermal and quantization noise. The behavioral model presented in this paper aims to deepen understanding of such digital beamforming systems to enable system designers to make optimizations. The results presented in this paper primarily center on implementations with low-resolution ADCs and, thus, focus on the effects of system parameters, including interferer strength, on quantization noise.
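As a baseline for the noise terms discussed in this abstract, the sketch below evaluates two standard textbook quantities: the thermal noise power $kTB$ and the ideal signal-to-quantization-noise ratio (SQNR) of an $N$-bit ADC driven by a full-scale sine wave. These are generic formulas written in Python for illustration; they are not the paper's MATLAB model and do not include its interference-dependent effects. The function names and the 290 K default temperature are assumptions.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def thermal_noise_dbm(bandwidth_hz, temperature_k=290.0):
    """Thermal noise power k*T*B expressed in dBm."""
    power_watts = BOLTZMANN * temperature_k * bandwidth_hz
    return 10.0 * math.log10(power_watts / 1e-3)


def ideal_sqnr_db(bits):
    """Ideal SQNR of a uniform quantizer with a full-scale sine input."""
    return 6.02 * bits + 1.76
```

Each bit of resolution removed costs about 6 dB of ideal SQNR, which is one reason quantization noise becomes the term to watch in low-resolution ADC designs.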
Authors:Pratyush Kumar Singh, Danial Faghihi
Abstract:
This paper presents a computationally efficient method for the optimal design of silica aerogel porous material systems, balancing thermal insulation performance with mechanical stability under stress concentrations. The proposed approach explicitly accounts for additive manufacturing uncertainties by modeling material porosity as a spatially correlated stochastic field within a multiphase finite element formulation. A risk-averse objective function, incorporating statistical moments of the design objective, is employed in conjunction with chance constraints that enforce mechanical stability by restricting the probability of exceeding critical stress thresholds. To mitigate the prohibitively high computational cost associated with the large-dimensional uncertainty space and Monte Carlo estimations of the objective function's statistical moments, a second-order Taylor expansion is utilized as a control variate. Furthermore, a continuation-based smoothing strategy is introduced to address the non-differentiability of the chance constraints, ensuring compatibility with gradient-based optimization. The resulting framework achieves computational scalability, remaining agnostic to the dimensionality of the stochastic design space. The effectiveness of the method is demonstrated through numerical experiments on two- and three-dimensional thermal break systems for building insulation. The results highlight the framework's capability to solve large-scale, chance-constrained optimal design problems governed by finite element models with uncertain design parameter spaces reaching dimensions in the hundreds of thousands.
Authors:Michael Bezick, Blake A. Wilson, Vaishnavi Iyer, Yuheng Chen, Vladimir M. Shalaev, Sabre Kais, Alexander V. Kildishev, Alexandra Boltasseva, Brad Lackey
Abstract:
PearSAN is a machine learning-assisted optimization algorithm applicable to inverse design problems with large design spaces, where traditional optimizers struggle. The algorithm leverages the latent space of a generative model for rapid sampling and employs a Pearson correlated surrogate model to predict the figure of merit of the true design metric. As a showcase example, PearSAN is applied to thermophotovoltaic (TPV) metasurface design by matching the working bands between a thermal radiator and a photovoltaic cell. PearSAN can work with any pretrained generative model with a discretized latent space, making it easy to integrate with VQ-VAEs and binary autoencoders. Its novel Pearson correlational loss can be used as both a latent regularization method, similar to batch and layer normalization, and as a surrogate training loss. We compare both to previous energy matching losses, which are shown to enforce poor regularization and performance, even with upgraded affine parameters. PearSAN achieves a state-of-the-art maximum design efficiency of 97%, and is at least an order of magnitude faster than previous methods, with an improved maximum figure-of-merit gain.
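A Pearson-correlation loss can be illustrated generically as one minus the sample correlation between surrogate predictions and true figure-of-merit values. The abstract does not give the exact loss used by PearSAN, so the form below, the function names, and the plain-Python implementation are assumptions for illustration only.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pearson_loss(predicted, target):
    """Loss in [0, 2]: 0 for perfect positive correlation, 2 for perfect negative."""
    return 1.0 - pearson_correlation(predicted, target)
```

Because the correlation is invariant to positive affine rescaling of the predictions, a loss of this kind rewards getting the trend right rather than matching absolute values.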
Authors:Zhen Hao, Ning Jiang, Liu Liu
Abstract:
In this paper, we develop and implement an efficient asymptotic-preserving (AP) scheme to solve the gas mixture of Boltzmann equations under the disparate mass scaling relevant to the so-called "epochal relaxation" phenomenon. The disparity in molecular masses, ranging across several orders of magnitude, leads to significant challenges in both the evaluation of collision operators and the designing of time-stepping schemes to capture the multi-scale nature of the dynamics. A direct implementation of the spectral method faces prohibitive computational costs as the mass ratio increases due to the need to resolve vastly different thermal velocities. Unlike [I. M. Gamba, S. Jin, and L. Liu, Commun. Math. Sci., 17 (2019), pp. 1257-1289], we propose an alternative approach based on proper truncation of asymptotic expansions of the collision operators, which significantly reduces the computational complexity and works well for small $\varepsilon$. By incorporating the separation of three time scales in the model's relaxation process [P. Degond and B. Lucquin-Desreux, Math. Models Methods Appl. Sci., 6 (1996), pp. 405-436], we design an AP scheme that captures the specific dynamics of the disparate mass model while maintaining computational efficiency. Numerical experiments demonstrate the effectiveness of the proposed scheme in handling large mass ratios of heavy and light species, as well as capturing the epochal relaxation phenomenon.
Authors:David López-García, Fermín Segovia, Jacob Rodríguez-Rivero, Javier Ramírez, David Pérez, Raúl Serrano, Juan Manuel Górriz
Abstract:
The RESISTO project represents a pioneering initiative in Europe aimed at enhancing the resilience of the power grid through the integration of advanced technologies. This includes artificial intelligence and thermal surveillance systems to mitigate the impact of extreme meteorological phenomena. RESISTO endeavors to predict, prevent, detect, and recover from weather-related incidents, ultimately enhancing the quality of service provided and ensuring grid stability and efficiency in the face of evolving climate challenges. In this study, we introduce one of the fundamental pillars of the project: a monitoring system for the operating temperature of different regions within power transformers, aiming to detect and alert early on potential thermal anomalies. To achieve this, a distributed system of thermal cameras for real-time temperature monitoring has been deployed in The Doñana National Park, alongside servers responsible for the storing, analyzing, and alerting of any potential thermal anomalies. An adaptive prediction model was developed for temperature forecasting, which learns online from the newly available data. In order to test the long-term performance of the proposed solution, we generated a synthetic temperature database for the whole of the year 2022. Overall, the proposed system exhibits promising capabilities in predicting and detecting thermal anomalies in power electric transformers, showcasing potential applications in enhancing grid reliability and preventing equipment failures.
Authors:Zhanwei Yu, Yi Zhao, Xiaoli Chu, Di Yuan
Abstract:
Passively cooled base stations (PCBSs) have emerged to deliver better cost and energy efficiency. However, passive cooling necessitates intelligent thermal control via traffic management, i.e., the instantaneous data traffic or throughput of a PCBS directly impacts its thermal performance. This is particularly challenging for outdoor deployment of PCBSs because the heat dissipation efficiency is uncertain and fluctuates over time. What is more, the PCBSs are interference-coupled in multi-cell scenarios. Thus, a higher-throughput PCBS leads to higher interference to the other PCBSs, which, in turn, would require more resource consumption to meet their respective throughput targets. In this paper, we address online decision-making for maximizing the total downlink throughput for a multi-PCBS system subject to constraints related to operating temperature. We demonstrate that a reinforcement learning (RL) approach, specifically soft actor-critic (SAC), can successfully perform throughput maximization while keeping the PCBSs cool, by adapting the throughput to time-varying heat dissipation conditions. Furthermore, we design a denial and reward mechanism that effectively mitigates the risk of overheating during the exploration phase of RL. Simulation results show that our approach achieves up to 88.6% of the global optimum. This is very promising, as our approach operates without prior knowledge of future heat dissipation efficiency, which is required by the global optimum.
Authors:Lucia Gordon, Nikhil Behari, Samuel Collier, Elizabeth Bondi-Kelly, Jackson A. Killian, Catherine Ressijac, Peter Boucher, Andrew Davies, Milind Tambe
Abstract:
Much of Earth's charismatic megafauna is endangered by human activities, particularly the rhino, which is at risk of extinction due to the poaching crisis in Africa. Monitoring rhinos' movement is crucial to their protection but has unfortunately proven difficult because rhinos are elusive. Therefore, instead of tracking rhinos, we propose the novel approach of mapping communal defecation sites, called middens, which give information about rhinos' spatial behavior valuable to anti-poaching, management, and reintroduction efforts. This paper provides the first-ever mapping of rhino midden locations by building classifiers to detect them using remotely sensed thermal, RGB, and LiDAR imagery in passive and active learning settings. As existing active learning methods perform poorly due to the extreme class imbalance in our dataset, we design MultimodAL, an active learning system employing a ranking technique and multimodality to achieve competitive performance with passive learning models with 94% fewer labels. Our methods could therefore save over 76 hours in labeling time when used on a similarly-sized dataset. Unexpectedly, our midden map reveals that rhino middens are not randomly distributed throughout the landscape; rather, they are clustered. Consequently, rangers should be targeted at areas with high midden densities to strengthen anti-poaching efforts, in line with UN Target 15.7.
Authors:Sarah A. Flanery, Christiana Chamon
Abstract:
This paper introduces a three-point biometric authentication system for a blockchain-based decentralized identity network. We use existing biometric authentication systems to demonstrate the unique noise fingerprints that belong to each individual human and the respective information leak from the biological characteristics. We then propose the concept of using unique thermal noise amplitudes generated by each user and explore the open questions regarding the robustness of unconditionally secure authentication.
Authors:Steffen Knoblauch, Ram Kumar Muthusamy, Hao Li, Iddy Chazua, Benedcto Adamu, Innocent Maholi, Alexander Zipf
Abstract:
Climate change is intensifying human heat exposure, particularly in densely built urban centers of the Global South. Low-cost construction materials and high thermal-mass surfaces further exacerbate this risk. Yet scalable methods for assessing such heat-relevant building attributes remain scarce. We propose a machine learning framework that fuses openly available unmanned aerial vehicle (UAV) and street-view (SV) imagery via a coupled global context vision transformer (CGCViT) to learn heat-relevant representations of urban structures. Thermal infrared (TIR) measurements from HotSat-1 are used to quantify the relationship between building attributes and heat-associated health risks. Our dual-modality cross-view learning approach outperforms the best single-modality models by up to $9.3\%$, demonstrating that UAV and SV imagery provide valuable complementary perspectives on urban structures. The presence of vegetation surrounding buildings (versus no vegetation), brighter roofing (versus darker roofing), and roofing made of concrete, clay, or wood (versus metal or tarpaulin) are all significantly associated with lower HotSat-1 TIR values. Deployed across the city of Dar es Salaam, Tanzania, the proposed framework illustrates how household-level inequalities in heat exposure - often linked to socio-economic disadvantage and reflected in building materials - can be identified and addressed using machine learning. Our results point to the critical role of localized, data-driven risk assessment in shaping climate adaptation strategies that deliver equitable outcomes.
Authors:Jiajun Sun, Yangyi Ou, Haoyuan Zheng, Chao Yang, Yue Ma
Abstract:
In complex environments, autonomous robot navigation and environmental perception pose higher requirements for SLAM technology. This paper presents a novel method for semantically enhancing 3D point cloud maps with thermal information. By first performing pixel-level fusion of visible and infrared images, the system projects real-time LiDAR point clouds onto this fused image stream. It then segments heat source features in the thermal channel to instantly identify high temperature targets and applies this temperature information as a semantic layer on the final 3D map. This approach generates maps that not only have accurate geometry but also possess a critical semantic understanding of the environment, making it highly valuable for specific applications like rapid disaster assessment and industrial preventive maintenance.
Authors:Chao Yang, Haoyuan Zheng, Yue Ma
Abstract:
Traditional two-dimensional thermography, despite being non-invasive and useful for defect detection in the construction field, is limited in effectively assessing complex geometries, inaccessible areas, and subsurface defects. This paper introduces Thermo-LIO, a novel multi-sensor system that can enhance Structural Health Monitoring (SHM) by fusing thermal imaging with high-resolution LiDAR. To achieve this, the study first develops a multimodal fusion method combining thermal imaging and LiDAR, enabling precise calibration and synchronization of multimodal data streams to create accurate representations of temperature distributions in buildings. Second, it integrates this fusion approach with LiDAR-Inertial Odometry (LIO), enabling full coverage of large-scale structures and allowing for detailed monitoring of temperature variations and defect detection across inspection cycles. Experimental validations, including case studies on a bridge and a hall building, demonstrate that Thermo-LIO can detect detailed thermal anomalies and structural defects more accurately than traditional methods. The system enhances diagnostic precision, enables real-time processing, and expands inspection coverage, highlighting the crucial role of multimodal sensor integration in advancing SHM methodologies for large-scale civil infrastructure.
Authors:Chao Yang, Haoyuan Zheng, Yue Ma
Abstract:
This paper addresses the critical bottleneck of infrared (IR) data scarcity in Printed Circuit Board (PCB) defect detection by proposing a cross-modal data augmentation framework integrating CycleGAN and YOLOv8. Unlike conventional methods relying on paired supervision, we leverage CycleGAN to perform unpaired image-to-image translation, mapping abundant visible-light PCB images into the infrared domain. This generative process synthesizes high-fidelity pseudo-IR samples that preserve the structural semantics of defects while accurately simulating thermal distribution patterns. Subsequently, we construct a heterogeneous training strategy that fuses generated pseudo-IR data with limited real IR samples to train a lightweight YOLOv8 detector. Experimental results demonstrate that this method effectively enhances feature learning under low-data conditions. The augmented detector significantly outperforms models trained on limited real data alone and approaches the performance benchmarks of fully supervised training, proving the efficacy of pseudo-IR synthesis as a robust augmentation strategy for industrial inspection.
Authors:Mohammad Walid Charrwi, Zaid Hussain
Abstract:
As Network-on-Chip (NoC) and Wireless Sensor Network architectures continue to scale, the topology of the underlying network becomes a critical factor in performance. Gaussian Interconnected Networks, based on the arithmetic of Gaussian integers, offer attractive properties regarding diameter and symmetry. Despite these theoretical properties, adaptive routing techniques in these networks are vulnerable to node and link faults, leading to rapid degradation in communication reliability. Node failures (particularly those following Gaussian distributions, such as thermal hotspots or physical damage clusters) pose severe challenges to traditional deterministic routing. This paper proposes a fault-aware Reinforcement Learning (RL) routing scheme tailored for Gaussian Interconnected Networks. By utilizing a PPO (Proximal Policy Optimization) agent with a specific reward structure designed to penalize fault proximity, the system dynamically learns to bypass faulty regions. We compare our proposed RL-based routing protocol against a greedy adaptive shortest-path routing algorithm. Experimental results demonstrate that the RL agent significantly outperforms the greedy adaptive routing, sustaining a Packet Delivery Ratio (PDR) of 0.95 at 40% fault density compared to 0.66 for the greedy algorithm. Furthermore, the RL approach achieves higher effective delivery rates than the greedy adaptive routing, particularly under a low network load of 20% (0.57 vs. 0.43), showing greater proficiency in managing congestion and validating its efficacy in stochastic, fault-prone topologies.
Authors:Jannes Nys, Juan Carrasquilla
Abstract:
We introduce fermionic neural Gibbs states (fNGS), a variational framework for modeling finite-temperature properties of strongly interacting fermions. fNGS starts from a reference mean-field thermofield-double state and uses neural-network transformations together with imaginary-time evolution to systematically build strong correlations. Applied to the doped Fermi-Hubbard model, a minimal lattice model capturing essential features of strong electronic correlations, fNGS accurately reproduces thermal energies over a broad range of temperatures, interaction strengths, even at large dopings, for system sizes beyond the reach of exact methods. These results demonstrate a scalable route to studying finite-temperature properties of strongly correlated fermionic systems beyond one dimension with neural-network representations of quantum states.
Authors:Mojtaba Fanoodi, Farzaneh Abdollahi, Mahdi Aliyari Shoorehdeli
Abstract:
This paper presents a novel fault-tolerant control framework for steam temperature regulation in Heat Recovery Steam Generators (HRSGs) subject to actuator faults. Addressing the critical challenge of valve degradation in superheater spray attemperators, we propose a synergistic architecture comprising three components: (1) a Sliding Mode Observer (SMO) for estimation of unmeasured thermal states, (2) a Physics-Informed Neural Network (PINN) for estimating multiplicative actuator faults using physical laws as constraints, and (3) a one-sided Sliding Mode Controller (SMC) that adapts to the estimated faults while minimizing excessive actuation. The key innovation lies in the framework of closed-loop physics-awareness, where the PINN continuously informs both the observer and controller about fault severity while preserving thermodynamic consistency. Rigorous uniform ultimate boundedness (UUB) is established via Lyapunov analysis under practical assumptions. Validated on real HRSG operational data, the framework demonstrates effective fault adaptation, reduces temperature overshoot, and maintains steam temperature within 1°C of the setpoint under valve effectiveness loss. This work bridges control theory and physics-guided machine learning to deliver a practically deployable solution for power plant resilience, with extensions applicable to thermal systems subject to multiplicative faults.
Authors:Minseong Kweon, Janghyun Kim, Ukcheol Shin, Jinsun Park
Abstract:
Recent advances in Neural Radiance Fields (NeRFs) and 3D Gaussian Splatting (3DGS) have achieved considerable performance in RGB scene reconstruction. However, multi-modal rendering that incorporates thermal infrared imagery remains largely underexplored. Existing approaches tend to neglect distinctive thermal characteristics, such as heat conduction and the Lambertian property. In this study, we introduce MrGS, a multi-modal radiance field based on 3DGS that simultaneously reconstructs both RGB and thermal 3D scenes. Specifically, MrGS derives RGB- and thermal-related information from a single appearance feature through orthogonal feature extraction and employs view-dependent or view-independent embedding strategies depending on the degree of Lambertian reflectance exhibited by each modality. Furthermore, we leverage two physics-based principles to effectively model thermal-domain phenomena. First, we integrate Fourier's law of heat conduction prior to alpha blending to model intensity interpolation caused by thermal conduction between neighboring Gaussians. Second, we apply the Stefan-Boltzmann law and the inverse-square law to formulate a depth-aware thermal radiation map that imposes additional geometric constraints on thermal rendering. Experimental results demonstrate that the proposed MrGS achieves high-fidelity RGB-T scene reconstruction while reducing the number of Gaussians.
Authors:Camille Dionne-Pierre, Samuel Foucher, Jérôme Théau, Jérôme Lemaître, Patrick Charbonneau, Maxime Brousseau, Mathieu Varin
Abstract:
Efficient wildlife monitoring methods are necessary for biodiversity conservation and management. The combination of remote sensing, aerial imagery and deep learning offers promising opportunities to renew or improve existing survey methods. The complementary use of visible (VIS) and thermal infrared (TIR) imagery can add information compared to a single-source image and improve results in an automated detection context. However, the alignment and fusion process can be challenging, especially since visible and thermal images usually have different fields of view (FOV) and spatial resolutions. This research presents a case study on the great blue heron (Ardea herodias) to evaluate the performance of synchronous aerial VIS and TIR imagery to automatically detect individuals and nests using a YOLO11n model. Two VIS-TIR fusion methods were tested and compared: an early fusion approach and a late fusion approach, to determine if the addition of the TIR image gives any added value compared to a VIS-only model. VIS and TIR images were automatically aligned using a deep learning model. A principal component analysis fusion method was applied to VIS-TIR image pairs to form the early fusion dataset. A classification and regression tree was used to process the late fusion dataset, based on the detection from the VIS-only and TIR-only trained models. Across all classes, both late and early fusion improved the F1 score compared to the VIS-only model. For the main class, occupied nest, the late fusion improved the F1 score from 90.2% (VIS-only) to 93.0%. This model was also able to identify false positives from both sources with 90% recall. Although fusion methods seem to give better results, this approach comes with a limiting TIR FOV and alignment constraints that eliminate data. Using an aircraft-mounted very high-resolution visible sensor could be an interesting option for operationalizing surveys.
Authors:Shailendra K. Rathor, Lina Jaurigue, Martin Ziegler, Jörg Schumacher
Abstract:
Reservoir computing (RC) is a powerful framework for predicting nonlinear dynamical systems, yet the role of reservoir topology, particularly symmetry in connectivity and weights, remains inadequately understood. This work investigates how the structure of the network influences the performance of RC in four systems of increasing complexity: the Mackey-Glass system with delayed feedback, two low-dimensional thermal convection models, and a three-dimensional shear flow model exhibiting transition to turbulence. Using five reservoir topologies in which connectivity patterns and edge weights are controlled independently, we evaluate both direct- and cross-prediction tasks. The results show that symmetric reservoir networks substantially improve prediction accuracy for the convection-based systems, especially when the input dimension is smaller than the number of degrees of freedom. In contrast, the shear-flow model displays almost no sensitivity to topological symmetry due to its strongly chaotic high-dimensional dynamics. These findings reveal how structural properties of reservoir networks affect their ability to learn complex dynamics and provide guidance for designing more effective RC architectures.
Authors:Xinyuan Liao, Shaowei Chen, Shuai Zhao
Abstract:
Accurate and efficient thermal dynamics models of permanent magnet synchronous motors are vital to efficient thermal management strategies. Physics-informed methods combine model-based and data-driven methods, offering greater flexibility than model-based methods and superior explainability compared to data-driven methods. Nonetheless, there are still challenges in balancing real-time performance, estimation accuracy, and explainability. This paper presents a hardware-efficient complex neural dynamics model achieved through the linear decoupling, diagonalization, and reparameterization of the state-space model, introducing a novel paradigm for the physics-informed method that offers high explainability and accuracy in electric motor temperature estimation tasks. We validate this physics-informed method on an NVIDIA A800 GPU using the JAX machine learning framework, parallel prefix sum algorithm, and Compute Unified Device Architecture (CUDA) platform. We demonstrate its superior estimation accuracy and parallelizable hardware acceleration capabilities through experimental evaluation on a real electric motor.
Authors:Xuxin Yang, Xue Yuan, Donghan Feng, Siru Chen, Yuanhao Feng
Abstract:
Constructing a clean, low-carbon rural integrated energy system (RIES) is a fundamental requirement for supporting China's rural modernization and new-type urbanization. Existing research on RIES decarbonization primarily focuses on the optimal low-carbon operation of system-level energy devices at the macro level, while the synergistic carbon-reduction effects of demand-side flexible loads and external carbon trading mechanisms have not been fully explored. Meanwhile, at the micro level, the carbon sensitivity of device parameters and their potential contribution to emission reduction remain insufficiently investigated. To address these gaps, this study integrates macro- and micro-level analyses. At the macro level, a multi-energy-coupled low-carbon optimal operation framework is developed, incorporating coordinated electric-thermal demand response (DR) and carbon trading. At the micro level, a carbon emission model for RIES components is established, and sensitivity analysis is conducted on 28 carbon-related parameters to identify highly sensitive determinants of emission reduction. Case studies based on typical operation data from a rural region in northern China demonstrate that coordinated electric-thermal DR and carbon trading can achieve maximum carbon-reduction potential. Furthermore, the identified high-sensitivity parameters provide essential theoretical guidance for enhancing the decarbonization potential of RIES.
Authors:Marco Kurzynski, Shaizeen Aga, Di Wu
Abstract:
GPU systems are increasingly powering modern datacenters at scale. Despite being highly performant, GPU systems suffer from performance variation at the node and cluster levels. Such performance variation significantly impacts both high-performance computing and artificial intelligence workloads, such as cutting-edge large language models (LLMs). We analyze the performance of a single-node multi-GPU system running LLM training, and observe that the kernel-level performance variation is highly correlated with concurrent computation communication (C3), a technique to overlap computation and communication across GPUs for performance gains. We then take a further step to reason that thermally induced straggling, coupled with C3, impacts performance variation, an effect we coin Lit Silicon. Lit Silicon describes how, in a multi-GPU node, thermal imbalance across GPUs introduces node-level straggler GPUs, which in turn slow down the leader GPUs. Lit Silicon leads to node-level performance variation and inefficiency, impacting the entire datacenter from the bottom up. We propose analytical performance and power models for Lit Silicon, to understand the potential system-level gains. We further design simple detection and mitigation techniques to effectively address the Lit Silicon problem, and evaluate three different power management solutions, including power optimization under GPU thermal design power, performance optimization under node-level GPU power capping, and performance optimization under node-level CPU power sloshing. We conduct experiments on two workloads on two AMD Instinct MI300X GPU systems under two LLM training frameworks, and observe up to 6% performance and 4% power improvements, potentially saving hundreds of millions of dollars in datacenters. Our solution is almost a free lunch and can be effortlessly adopted in datacenters as a new node-level power management layer.
Authors:Theron Guo, Kento Kaneko, Claude Le Bris, Anthony T. Patera
Abstract:
We consider the thermal dunking problem, in which a solid body is suddenly immersed in a fluid of different temperature, and study both the temporal evolution of the solid and the associated Biot number -- a non-dimensional heat transfer coefficient characterizing heat exchange across the solid-fluid interface. We focus on the small-Biot-number regime. The problem is accurately described by the conjugate heat transfer (CHT) formulation, which couples the Navier-Stokes and energy equations in the fluid with the heat equation in the solid through interfacial continuity conditions. Because full CHT simulations are computationally expensive, simplified models are often used in practice. Starting from the coupled equations, we systematically reduce the formulation to the lumped-capacitance model, a single ordinary differential equation with a closed-form solution, based on two assumptions: time scale separation and a spatially uniform solid temperature. The total modeling error is decomposed into time homogenization and lumping contributions. We derive an asymptotic error bound for the lumping error, valid for general heterogeneous solids and spatially varying heat transfer coefficients. Building on this theoretical result, we introduce a computable upper bound expressed in measurable quantities for practical evaluation. Time scale separation is analyzed theoretically and supported by physical arguments and simulations, showing that large separation yields small time homogenization errors. In practice, the Biot number must be estimated from so-called empirical correlations, which are typically limited to specific canonical geometries. We propose a data-driven framework that extends empirical correlations to a broader range of geometries through learned characteristic length scales. All results are validated by direct numerical simulations up to Reynolds numbers of 10,000.
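The lumped-capacitance model named in this abstract can be illustrated with its textbook form: an energy balance rho * c_p * V * dT/dt = -h * A * (T - T_inf), whose closed-form solution is an exponential decay toward the fluid temperature. The sketch below is a minimal illustration under that standard form and is not taken from the paper; the function names and the numerical values (a small metal cube, a constant heat transfer coefficient) are assumptions chosen only for the example.

```python
import math

def biot_number(h, char_length, k_solid):
    """Biot number Bi = h * L_c / k (dimensionless)."""
    return h * char_length / k_solid

def time_constant(h, area, rho, c_p, volume):
    """Lumped time constant tau = rho * c_p * V / (h * A), in seconds."""
    return rho * c_p * volume / (h * area)

def lumped_temperature(t, T0, T_inf, tau):
    """Closed-form solution T(t) = T_inf + (T0 - T_inf) * exp(-t / tau)."""
    return T_inf + (T0 - T_inf) * math.exp(-t / tau)

# Hypothetical example values (assumptions, not from the paper):
# a 1 cm cube with metal-like properties dunked in a cooler fluid.
side = 0.01                                  # m
volume, area = side ** 3, 6 * side ** 2      # m^3, m^2
h, k_solid, rho, c_p = 50.0, 45.0, 7800.0, 500.0

Bi = biot_number(h, volume / area, k_solid)  # characteristic length V / A
tau = time_constant(h, area, rho, c_p, volume)

if __name__ == "__main__":
    print(f"Bi = {Bi:.5f}, tau = {tau:.1f} s")
    for t in (0.0, tau, 3 * tau):
        print(t, lumped_temperature(t, T0=100.0, T_inf=20.0, tau=tau))
```

The characteristic length V / A used here is one common convention; the abstract's data-driven framework learns characteristic length scales instead, which this sketch does not attempt.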
Authors:Sanghyeon Chang, Srikar Arani, Nishant Sai Nuthalapati, Youngjoon Suh, Nicholas Choi, Siavash Khodakarami, Md Rakibul Hasan Roni, Nenad Miljkovic, Aparna Chandramowlishwaran, Yoonjin Won
Abstract:
Flow boiling is an efficient heat transfer mechanism capable of dissipating high heat loads with minimal temperature variation, making it an ideal thermal management method. However, sudden shifts between flow regimes can disrupt thermal performance and system reliability, highlighting the need for accurate and low-latency real-time monitoring. Conventional optical imaging methods are limited by high computational demands and insufficient temporal resolution, making them inadequate for capturing transient flow behavior. To address this, we propose a real-time framework based on signals from neuromorphic sensors for flow regime classification. Neuromorphic sensors detect changes in brightness at individual pixels, which typically correspond to motion at edges, enabling fast and efficient detection without full-frame reconstruction and providing event-based information. We develop five classification models using both traditional image data and event-based data, demonstrating that models leveraging event data outperform frame-based approaches due to their sensitivity to dynamic flow features. Among these models, the event-based long short-term memory model provides the best balance between accuracy and speed, achieving 97.6% classification accuracy with a processing time of 0.28 ms. Our asynchronous processing pipeline supports continuous, low-latency predictions and delivers stable output through a majority voting mechanism, enabling reliable real-time feedback for experimental control and intelligent thermal management.
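The majority voting step mentioned at the end of this abstract can be sketched as a sliding-window vote over the most recent per-frame predictions. The abstract does not describe the window size or tie handling, so the window length and the regime labels below ("bubbly", "slug") are hypothetical choices for illustration only.

```python
from collections import Counter, deque

def majority_vote_stream(predictions, window=5):
    """Smooth a stream of class labels with a sliding-window majority vote.

    Each output is the most frequent label among the last `window` inputs.
    Ties are resolved in favor of the label seen first within the window.
    """
    recent = deque(maxlen=window)
    smoothed = []
    for label in predictions:
        recent.append(label)
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed

if __name__ == "__main__":
    # Hypothetical classifier outputs with one isolated misclassification.
    raw = ["bubbly", "bubbly", "slug", "bubbly", "bubbly"]
    print(majority_vote_stream(raw, window=3))
```

A longer window suppresses more isolated errors but delays the response to a genuine regime change, so the window length trades stability against latency.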
Authors:Rahul Mishra, Sudhanshu Kumar Jha, Omar Faruq Osama, Bishnu Bhusal, Sneha Sudhakaran, Naresh Kshetri
Abstract:
Wireless Sensor Networks form the backbone of modern cyber-physical systems used in various applications such as environmental monitoring, healthcare monitoring, industrial automation, and smart infrastructure. Ensuring the reliability of data collected through these networks is essential, as these data may contain anomalies due to sensor faults, environmental disturbances, or malicious intrusions. This paper proposes a lightweight and interpretable anomaly detection framework based on a first-order Markov chain model. The method discretizes continuous sensor readings into finite states and models the temporal dynamics of sensor transitions through a transition probability matrix. Anomalies are detected when observed transitions occur with probabilities below a computed threshold, allowing for real-time detection without labeled data or intensive computation. The proposed framework was validated using the Intel Berkeley Research Lab dataset; this case study on indoor environmental monitoring demonstrates its capability to identify thermal spikes, voltage-related faults, and irregular temperature fluctuations with high precision. Comparative analysis with Z-score, Hidden Markov Model, and autoencoder-based methods shows that the proposed Markov-based framework achieves a balanced trade-off between accuracy (F1 score of 0.86), interoperability, and computational efficiency. The system's scalability and low resource footprint highlight its suitability for large-scale and real-time anomaly detection in WSN deployments.
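The pipeline described in this abstract (discretize readings, estimate a transition probability matrix, flag low-probability transitions) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the equal-width binning, the additive smoothing, the fixed threshold passed in by the caller, and all numerical values are assumptions, and the abstract does not say how its threshold is computed.

```python
def discretize(value, lo, hi, n_states):
    """Map a continuous reading to a state index in [0, n_states - 1]
    using equal-width bins over [lo, hi] (assumed binning scheme)."""
    if value <= lo:
        return 0
    if value >= hi:
        return n_states - 1
    return int((value - lo) / (hi - lo) * n_states)

def fit_transition_matrix(states, n_states, smoothing=1.0):
    """Estimate first-order transition probabilities from a state sequence.
    Additive smoothing keeps unseen transitions at a small nonzero value."""
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]

def flag_anomalies(states, P, threshold):
    """Return indices of readings reached by a transition with P < threshold."""
    return [i + 1 for i, (a, b) in enumerate(zip(states, states[1:]))
            if P[a][b] < threshold]

if __name__ == "__main__":
    # Hypothetical temperature readings in degrees Celsius.
    train = [21, 22, 23, 24, 26, 24, 22, 21, 22, 23]
    test = [22, 23, 34, 22]                      # 34 is an injected spike
    to_state = lambda v: discretize(v, lo=15.0, hi=35.0, n_states=4)
    P = fit_transition_matrix([to_state(v) for v in train], n_states=4)
    print(flag_anomalies([to_state(v) for v in test], P, threshold=0.1))
```

Because the model only stores an n_states by n_states matrix and performs one lookup per reading, it fits the low-resource, label-free setting the abstract targets.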
Authors:Matsive Ali, Blake Gassen, Sen Liu
Abstract:
This paper presents an integrated robotic fused deposition modeling additive manufacturing system featuring closed-loop thermal control and intelligent in-situ defect correction using a 6-degree-of-freedom robotic arm and an Oak-D camera. The robot arm end effector was modified to mount an E3D hotend thermally regulated by an IoT microcontroller, enabling precise temperature control through real-time feedback. The filament extrusion system was synchronized with robotic motion, coordinated via ROS2, ensuring consistent deposition along complex trajectories. A vision system based on OpenCV detects layer-wise defect positions and commands autonomous re-extrusion at the identified sites. Experimental validation demonstrated successful defect mitigation in printing operations. The integrated system effectively addresses the challenges of real-time quality assurance. Inverse kinematics were used for motion planning, while homography transformations corrected camera perspectives for accurate defect localization. The intelligent system successfully mitigated surface anomalies without interrupting the print process. By combining real-time thermal regulation, motion control, and intelligent defect detection and correction, this architecture establishes a scalable and adaptive robotic additive manufacturing framework suitable for aerospace, biomedical, and industrial applications.
Authors:Rezvan Alamian, Sören Müller, Uwe Steinmetz, Christian Henrich, Stefan Goetz
Abstract:
This paper suggests a novel rotor-cooling shaft concept for high-performance electric motors that increases the effectiveness of cooling and is yet simple and cost-effective to manufacture. We investigate the thermal performance of four shaft geometries for rotor cooling in automotive applications. The proposed tooth-guided liquid-cooling shaft design aims to solve the high churning loss of conventional cooled rotor shafts due to internal vortex formation and their still limited heat transfer. Therefore, we optimize heat transfer efficiency and pressure management by incorporating cold-formed internal channels that restrict vortex formation beyond a degree that improves heat transfer. We evaluated key performance metrics, including heat transfer rate, outlet temperature, pressure drop, and velocity profiles, under varying rotational speeds, inlet flow rates, and coolant temperatures. Computational fluid analysis demonstrates that the tooth-guided design outperforms conventional hollow shafts and achieves up to 110% higher cooling efficiency at low rotational speeds, while it maintains comparable pressure levels. These findings provide practical insight into geometry-driven thermal optimization and offer a path toward improving the performance and durability of electric motors.
Authors:Victor Oliveira Ferreira, Wiebke Mainville, Vincent Raymond, Jean-Michel Lamarre, Antoine Hamel, Mikael Vaillant, Moncef Chioua, Bruno Blais
Abstract:
We present an experimental unit that realizes the "multi-input, multi-output manifold" thermal management technology proposed by Lamarre & Raymond (2023). The proposed setup can be used for experiments aimed at controlling spatiotemporal temperature distribution. Temperature control is achieved by impinging coolant fluid jets, leveraging a manifold of channels targeted to the surface. The direction of the fluid is controlled by shifting the role of channels between inputs, outputs, or closing them. Files associated with this work include Computer-Aided Design (CAD) STEP files, Gerber files to manufacture a Printed Circuit Board (PCB), and a Graphical User Interface (GUI) written in Python. We provide a step-by-step guide to assemble the experimental setup. We also provide instructions to interact with the setup through the GUI, which allows for real-time tracking of sample temperature and flow rates per flow control device. Additionally, we provide examples of usage of the setup, including system characterization with step response, Proportional-Integral-Derivative performance tracking, and disturbance rejection in a coupled system. Extending the application is accessible through the files provided in the open repository associated with this work. The active cooling device presents a safe, flexible, and complete design, allowing for lab-scale assessment of the performance of custom temperature control strategies using enclosed impinging jets.
Authors:Akhila Kambhatla, Ahmed R Khaled
Abstract:
Thermal weapon segmentation is crucial for surveillance and security applications, enabling robust detection under low-light and visually obscured conditions where RGB-based systems fail. While convolutional neural networks (CNNs) dominate thermal segmentation literature, their ability to capture long-range dependencies and fine structural details is limited. Vision Transformers (ViTs), with their global context modeling capabilities, have achieved state-of-the-art results in RGB segmentation tasks, yet their potential in thermal weapon segmentation remains underexplored. This work adapts and evaluates four transformer-based architectures (SegFormer, DeepLabV3+, SegNeXt, and Swin Transformer) for binary weapon segmentation on a custom thermal dataset comprising 9,711 images collected from real-world surveillance videos and automatically annotated using SAM2. We employ standard augmentation strategies within the MMSegmentation framework to ensure robust model training and fair architectural comparison. Experimental results demonstrate significant improvements in segmentation performance: SegFormer-b5 achieves the highest mIoU (94.15%) and Pixel Accuracy (97.04%), while SegFormer-b0 provides the fastest inference speed (98.32 FPS) with competitive mIoU (90.84%). SegNeXt-mscans offers balanced performance with 85.12 FPS and 92.24% mIoU, and DeepLabV3+ R101-D8 reaches 92.76% mIoU at 29.86 FPS. The transformer architectures demonstrate robust generalization capabilities for weapon detection in low-light and occluded thermal environments, with flexible accuracy-speed trade-offs suitable for diverse real-time security applications.
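The two metrics reported in this abstract, mIoU and Pixel Accuracy, have simple definitions that a short sketch makes concrete. The code below assumes the usual definitions (per-class intersection over union averaged across classes, and the fraction of correctly labeled pixels); it is not the MMSegmentation implementation, and the tiny masks are made-up illustration data.

```python
def class_iou(pred, target, cls):
    """Intersection over union for one class on 2D label masks."""
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += (p == cls and t == cls)
            union += (p == cls or t == cls)
    return inter / union if union else None  # None: class absent from both

def mean_iou(pred, target, classes=(0, 1)):
    """Average the per-class IoU over the classes that actually occur."""
    values = [class_iou(pred, target, c) for c in classes]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else float("nan")

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the target."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    return sum(p == t for p, t in pairs) / len(pairs)

if __name__ == "__main__":
    # Hypothetical 2x2 binary masks: 1 = weapon, 0 = background.
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(mean_iou(pred, target), pixel_accuracy(pred, target))
```

Pixel accuracy can look high when the background dominates the image, which is one reason segmentation papers report mIoU alongside it.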
Authors:Divya Bhardwaj, Arnav Ramamoorthy, Poonam Goyal
Abstract:
Concealed weapon detection aims at detecting weapons hidden beneath a person's clothing or luggage. Various imaging modalities like Millimeter Wave, Microwave, Terahertz, Infrared, etc., are exploited for the concealed weapon detection task. These imaging modalities have their own limitations, such as poor resolution in microwave imaging, privacy concerns in millimeter wave imaging, etc. To provide a real-time, 24 x 7 surveillance, low-cost, and privacy-preserved solution, we opted for thermal imaging in spite of the lack of a benchmark dataset. We propose a novel approach and a dataset for concealed weapon detection in thermal imagery. Our YOLO-based architecture, DEF-YOLO, is built with key enhancements in YOLOv8 tailored to the unique challenges of concealed weapon detection in thermal vision. We adopt deformable convolutions at the SPPF layer to exploit multi-scale features, and at the backbone and neck layers to extract low-, mid-, and high-level features, enabling DEF-YOLO to adaptively focus on localization around the objects in thermal homogeneous regions, without sacrificing much of the speed and throughput. In addition to these simple yet effective key architectural changes, we introduce a new, large-scale Thermal Imaging Concealed Weapon dataset, TICW, featuring a diverse set of concealed weapons and capturing a wide range of scenarios. To the best of our knowledge, this is the first large-scale contributed dataset for this task. We also incorporate focal loss to address the significant class imbalance inherent in the concealed weapon detection task. The efficacy of the proposed work establishes a new benchmark through extensive experimentation for concealed weapon detection in thermal imagery.
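The focal loss mentioned in this abstract is commonly written, for a binary label, as FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below assumes that standard form; the abstract does not state which variant or hyperparameters DEF-YOLO uses, so gamma = 2 and alpha = 0.25 are only the commonly cited defaults, and the function name is hypothetical.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one prediction.

    p: predicted probability of the positive class, strictly in (0, 1).
    y: ground-truth label, 1 (positive) or 0 (negative).
    gamma: focusing parameter; gamma = 0 recovers weighted cross-entropy.
    alpha: weight of the positive class (1 - alpha for the negative class).
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    easy = binary_focal_loss(0.95, 1)   # confident, correct prediction
    hard = binary_focal_loss(0.10, 1)   # confident, wrong prediction
    print(easy, hard, hard / easy)
```

The (1 - p_t)^gamma factor shrinks the contribution of well-classified examples, so the abundant easy background samples no longer dominate the gradient over the rare weapon samples.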
Authors:Connor G. McMahan, Gavin Chang, Raymond Nguyen, Souren Soukiazian, David A. Smith, Tobias Schaedler, David Shahan
Abstract:
In this study, we demonstrate the first realization of wireless strain and temperature sensing within 3D-printed metallic structures using standard electromagnetic inspection hardware. This establishes a path toward need-based parts maintenance driven by accurate damage assessments instead of relying on regularly scheduled maintenance teardowns, extending the service intervals of structures operating in harsh environments. To this end, we encapsulate magnetoelastic and thermomagnetic materials inside microtubes and embed the sensing elements during additive manufacturing. Mechanical and thermal stimuli affect the magnetic permeability of the embedded materials, which modulates the impedance of a coil placed on or near the surface of the printed part. We demonstrate strain sensing accurate to $\pm 27 \times 10^{-6}$ over at least a $6 \times 10^{-4}$ strain range, and temperature sensing accurate to $\pm 0.75$°C over a 70°C range, both to a 95% confidence interval. We highlight these sensors' capabilities by detecting the onset of plasticity and fatigue-driven crack growth thousands of cycles before critical failure. This extends non-destructive eddy-current damage detection to accurate, real-time strain and temperature monitoring within metallic structures.
Authors:Yibo Chen, Eren Kurshan, Dave Motschman, Charles Johnson, Yuan Xie
Abstract:
3-D integrated circuits (3-D ICs) offer performance advantages due to their increased bandwidth and reduced wire-length enabled by through-silicon-via structures (TSVs). Traditionally, TSVs have been considered to improve the thermal conductivity in the vertical direction. However, the lateral thermal blockage effect becomes increasingly important for TSV via farms (a cluster of TSV vias used for signal bus connections between layers) because the TSV size and pitch continue to scale in the μm range and the metal-to-insulator ratio becomes smaller. Consequently, dense TSV farms can create lateral thermal blockages in the thinned silicon substrate and exacerbate local hotspots. In this paper, we propose a thermal-aware via farm placement technique for 3-D ICs to minimize lateral heat blockages caused by dense signal bus TSV structures.
Authors:Maksym Szemer, Szymon Buchaniec, Grzegorz Brus
Abstract:
The microstructure critically governs the properties of materials used in energy and chemical engineering technologies, from catalysts and filters to thermal insulators and sensors. Therefore, accurate design is based on quantitative descriptors of microstructural features. Here we show that eight key descriptors can be extracted by a single workflow that fuses computational topology with assembly-learning-based regression. First, 1312 synthetic three-dimensional microstructures were generated and evaluated using established algorithms, and a labeled data set of ground-truth parameters was built. Converting every structure into a persistence image allowed us to train a deep neural network that predicts the eight descriptors. In an independent test set, the model achieved on average R^2 ~ 0.84 and Pearson r ~ 0.92, demonstrating both precision and generality. The approach provides a unified and scalable tool for rapid characterization of functional porous materials.
Authors:Sabin Roman, Gregor Skok, Ljupco Todorovski, Saso Dzeroski
Abstract:
This article explores novel data-driven modeling approaches for analyzing and approximating the Universal Thermal Climate Index (UTCI), a physiologically-based metric integrating multiple atmospheric variables to assess thermal comfort. Given the nonlinear, multivariate structure of UTCI, we investigate symbolic and sparse regression techniques as tools for interpretable and efficient function approximation. In particular, we highlight the benefits of using orthogonal polynomial bases, such as Legendre polynomials, in sparse regression frameworks, demonstrating their advantages in stability, convergence, and hierarchical interpretability compared to standard polynomial expansions. We demonstrate that our models achieve significantly lower root-mean-squared losses than the widely used sixth-degree polynomial benchmark while using the same or fewer parameters. By leveraging Legendre polynomial bases, we construct models that efficiently populate a Pareto front of accuracy versus complexity and exhibit stable, hierarchical coefficient structures across varying model capacities. Training on just 20% of the data, our models generalize robustly to the remaining 80%, with consistent performance under bootstrapping. The decomposition effectively approximates the UTCI as a Fourier-like expansion in an orthogonal basis, yielding results near the theoretical optimum in the L2 (least squares) sense. We also connect these findings to the broader context of equation discovery in environmental modeling, referencing probabilistic grammar-based methods that enforce domain consistency and compactness in symbolic expressions. Taken together, these results illustrate how combining sparsity, orthogonality, and symbolic structure enables robust, interpretable modeling of complex environmental indices like UTCI, and significantly outperforms the state-of-the-art approximation in both accuracy and efficiency.
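As a rough illustration of sparse regression in a Legendre basis, the sketch below fits a one-dimensional toy function (not the UTCI, which is multivariate) on 20% of the samples and evaluates on the remaining 80%. The target function, the degree, the hard-thresholding rule, and the helper names `fit_legendre`, `sparsify`, and `rmse` are all assumptions chosen for illustration; the abstract does not state which sparsification procedure the authors use.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def fit_legendre(x, y, degree):
    """Least-squares coefficients of y in the Legendre basis P_0..P_degree."""
    design = leg.legvander(x, degree)  # shape (n_samples, degree + 1)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def sparsify(coef, keep):
    """Keep the `keep` largest-magnitude coefficients and zero the rest."""
    out = np.zeros_like(coef)
    largest = np.argsort(np.abs(coef))[-keep:]
    out[largest] = coef[largest]
    return out


def rmse(x, y, coef):
    """Root-mean-squared error of the Legendre series `coef` on (x, y)."""
    return float(np.sqrt(np.mean((leg.legval(x, coef) - y) ** 2)))


# Toy target on [-1, 1]; this is a stand-in, not the UTCI.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 2000)
y = np.exp(x) * np.sin(3.0 * x)

# Train on 20% of the samples, test on the remaining 80%.
n_train = len(x) // 5
x_train, y_train = x[:n_train], y[:n_train]
x_test, y_test = x[n_train:], y[n_train:]

coef_full = fit_legendre(x_train, y_train, degree=10)
coef_sparse = sparsify(coef_full, keep=6)

rmse_full = rmse(x_test, y_test, coef_full)
rmse_sparse = rmse(x_test, y_test, coef_sparse)
print(f"full model   (11 terms): test RMSE = {rmse_full:.2e}")
print(f"sparse model ( 6 terms): test RMSE = {rmse_sparse:.2e}")
```

Sweeping `keep` from 1 to 11 and recording the test RMSE at each value would trace out an accuracy-versus-complexity curve of the kind the abstract calls a Pareto front.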
Authors:Bernardino D'Amico, Francesco Pomponi, Jay H. Arehart, Lina Khaddour
Abstract:
Reducing domestic energy demand is central to climate mitigation and fuel poverty strategies, yet the impact of energy efficiency interventions is highly heterogeneous. Using a causal machine learning model trained on nationally representative data of the English housing stock, we estimate average and conditional treatment effects of wall insulation on gas consumption, focusing on distributional effects across energy burden subgroups. While interventions reduce gas demand on average (by as much as 19 percent), low energy burden groups achieve substantial savings, whereas those experiencing high energy burdens see little to no reduction. This pattern reflects a behaviourally-driven mechanism: households constrained by high costs-to-income ratios (e.g. more than 0.1) reallocate savings toward improved thermal comfort rather than lowering consumption. Far from wasteful, such responses represent rational adjustments in contexts of prior deprivation, with potential co-benefits for health and well-being. These findings call for a broader evaluation framework that accounts for both climate impacts and the equity implications of domestic energy policy.
Authors:Dylan Stow, Russell Barnes, Eren Kurshan, Yuan Xie
Abstract:
Side-channel attacks are important security challenges as they reveal sensitive information about on-chip activities. Among such attacks, the thermal side-channel has been shown to disclose the activities of key functional blocks and even encryption keys. This paper proposes a novel approach to proactively conceal critical activities in the functional layers while minimizing the power dissipation by (i) leveraging inherent characteristics of 3D integration to protect from side-channel attacks and (ii) dynamically generating custom activity patterns to match the activity to be concealed in the functional layers. Experimental analysis shows that 3D technology combined with the proposed run-time algorithm effectively reduces the Side channel vulnerability Factor (SVF) below 0.05 and the Spatial Thermal Side-channel Factor (STSF) below 0.59.
Authors:Yasser Ashraf, Ahmed Sharshar, Velibor Bojkovic, Bin Gu
Abstract:
Spike cameras, bio-inspired vision sensors, asynchronously fire spikes by accumulating light intensities at each pixel, offering ultra-high energy efficiency and exceptional temporal resolution. Unlike event cameras, which record changes in light intensity to capture motion, spike cameras provide even finer spatiotemporal resolution and a more precise representation of continuous changes. In this paper, we introduce the first video action recognition (VAR) dataset using a spike camera, alongside synchronized RGB and thermal modalities, to enable comprehensive benchmarking for Spiking Neural Networks (SNNs). By preserving the inherent sparsity and temporal precision of spiking data, our three datasets offer a unique platform for exploring multimodal video understanding and serve as a valuable resource for directly comparing spiking, thermal, and RGB modalities. This work contributes a novel dataset that will drive research in energy-efficient, ultra-low-power video understanding, specifically for action recognition tasks using spike-based data.
Authors:Ado Farsi, Nacime Bouziani, David A Ham
Abstract:
Although many problems in science and engineering are modelled by well-established PDEs, they often involve unknown or incomplete relationships, such as material constitutive laws or thermal response, that limit accuracy and generality. Existing surrogate-modelling approaches directly approximate PDE solutions but remain tied to a specific geometry, boundary conditions, and set of physical constraints. To address these limitations, we introduce a fully differentiable finite element-based machine learning (FEBML) framework that embeds trainable operators for unknown physics within a state-of-the-art, general FEM solver, enabling true end-to-end differentiation. At its core, FEBML represents each unknown operator as an encode-process-decode pipeline over finite-element degrees of freedom: field values are projected to nodal coefficients, transformed by a neural network, and then lifted back to a continuous FE function, ensuring the learned physics respects the variational structure. We demonstrate its versatility by recovering nonlinear stress-strain laws from laboratory tests, applying the learned model to a new mechanical scenario without retraining, and identifying temperature-dependent conductivity in transient heat flow.
Authors:Mikael Vaillant, Victor Oliveira Ferreira, Wiebke Mainville, Jean-Michel Lamarre, Vincent Raymond, Moncef Chioua, Bruno Blais
Abstract:
This study presents a surrogate model designed to predict the Nusselt number distribution in enclosed impinging jet arrays, where each jet functions independently and where jets can be transformed from inlets to outlets, leading to a vast number of possible flow arrangements. While computational fluid dynamics (CFD) simulations can model heat transfer with high fidelity, their cost prohibits real-time applications such as model-based temperature control. To address this, we generate a CNN-based surrogate model that can predict the Nusselt distribution in real time. We train it with data from implicit large eddy computational fluid dynamics simulations (Re < 2,000). We train two distinct models, one for a five by one array of jets (83 simulations) and one for a three by three array of jets (100 simulations). We introduce a method to extrapolate predictions to higher Reynolds numbers (Re < 10,000) using a correlation-based scaling. The surrogate models achieve high accuracy, with a normalized mean average error below 2% on validation data for the five by one surrogate model and 0.6% for the three by three surrogate model. Experimental validation confirms the model's predictive capabilities. This work provides a foundation for model-based control strategies in advanced thermal management applications.
Authors:Manuel Kollmar, Adrian Bürger, Markus Bohlayer, Angelika Altmann-Dieses, Marco Braun, Moritz Diehl
Abstract:
Fifth generation district heating and cooling (5GDHC) networks accelerate the use of renewable energies in the heating sector and enable flexible, efficient and future-proof heating and cooling supply via a single network. Due to their low temperature level and high integration of renewables, 5GDHC systems pose new challenges for the modeling of these networks in order to simulate and test operational strategies. A particular feature is the use of uninsulated pipes, which allow energy exchange with the surrounding ground. Accurate modeling of this interaction is essential for reliable simulation and optimization. This paper presents a thermo-physical model of the pipe connections, the surrounding soil, a latent heat storage in the form of an ice storage as a seasonal heat storage and the house transfer stations. The model is derived from mass and energy balances leading to ordinary differential equations (ODEs). Validation is performed using field data from the 5GDHC network in Gutach-Bleibach, Germany, which supplies heating and cooling to 30 modern buildings. With an average model deviation of 4.5 % in the normalized mean bias error (NMBE) and 15.9 % in the coefficient of the variation of the root mean square error (CVRMSE), the model's accuracy is validated against the available temperature measurements. The realistic representation of the thermal-hydraulic interactions between soil and pipes, as well as the heat flow within the network, confirms the accuracy of the model and its applicability for the simulation of 5GDHC systems. The model is made openly accessible under an open-source license.
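The two validation metrics named above can be sketched as follows. This assumes one common convention (residual taken as measured minus simulated, normalization by the mean of the measurements, no degrees-of-freedom correction); the abstract does not say which variant the authors use, and the sample values are made up purely to exercise the functions.

```python
import numpy as np


def nmbe(measured, simulated):
    """Normalized mean bias error in percent (assumed convention: measured - simulated)."""
    m = np.asarray(measured, dtype=float)
    s = np.asarray(simulated, dtype=float)
    return 100.0 * np.sum(m - s) / (m.size * np.mean(m))


def cvrmse(measured, simulated):
    """Coefficient of variation of the RMSE in percent, normalized by the measured mean."""
    m = np.asarray(measured, dtype=float)
    s = np.asarray(simulated, dtype=float)
    return 100.0 * np.sqrt(np.mean((m - s) ** 2)) / np.mean(m)


# Hypothetical values, not data from the paper.
measured = [10.0, 12.0, 14.0, 16.0]
simulated = [11.0, 12.0, 13.0, 18.0]

print(f"NMBE   = {nmbe(measured, simulated):.2f} %")
print(f"CVRMSE = {cvrmse(measured, simulated):.2f} %")
```

Because positive and negative residuals cancel in the NMBE but not in the CVRMSE, a model can show a small bias and a noticeably larger CVRMSE at the same time.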
Authors:M. Bernaschi, L. A. Fernandez, I. González-Adalid Pemartín, E. Marinari, V. Martin-Mayor, G. Parisi, F. Ricci-Tersenghi, J. J. Ruiz-Lorenzo, D. Yllanes
Abstract:
Numerical simulations of models and theories that describe complex experimental systems (in fields like high-energy and condensed-matter physics) are becoming increasingly important. Examples include lattice gauge theories, which can describe, among others, quantum chromodynamics (the Standard Model description of strong interactions between elementary particles), and spin-glass systems. Beyond fundamental research, these computational methods also find practical applications, among many others, in optimization, finance, and complex biological problems. However, Monte Carlo simulations, an important subcategory of these methods, are plagued by a major drawback: they are extremely greedy for (pseudo) random numbers. The total fraction of computer time dedicated to random-number generation increases as the hardware grows more sophisticated, and can get prohibitive for special-purpose computing platforms. We propose here a general-purpose microcanonical simulated annealing (MicSA) formalism that dramatically reduces such a burden. The algorithm is fully adapted to a massively parallel computation, as we show in the particularly demanding benchmark of the three-dimensional Ising spin glass. We carry out very stringent numerical tests of the new algorithm by comparing our results, obtained on GPUs, with high-precision standard (i.e., random-number-greedy) simulations performed on the Janus II custom-built supercomputer. In those cases where thermal equilibrium is reachable (i.e., in the paramagnetic phase), both simulations reach compatible values. More significantly, barring short-time corrections, a simple time rescaling suffices to map the MicSA off-equilibrium dynamics onto the results obtained with standard simulations.
Authors:Mehmet Ozgur Turkoglu, Selene Ledain, Helge Aasen
Abstract:
Crop type classification using optical satellite time series remains limited in its ability to generalize across seasons, particularly when crop phenology shifts due to inter-annual weather variability. This hampers real-world applicability in scenarios where current-year labels are unavailable. In addition, uncertainty quantification is often overlooked, which reduces the reliability of such approaches for operational crop monitoring. Inspired by ecophysiological principles of plant growth, we propose a simple, model-agnostic Thermal-Time-based Temporal Sampling (T3S) method that replaces calendar time with thermal time. By subsampling time series in this biologically meaningful way, our method highlights key periods within the growing season while reducing temporal redundancy and noise. We evaluate the T3S on a multi-year Sentinel-2 dataset covering the entirety of Switzerland, which allows us to assess all applied methods on unseen years. Compared to state-of-the-art baselines, our approach yields substantial improvements in classification accuracy and, critically, provides well-calibrated uncertainty estimates. Moreover, the T3S method excels in low-data regimes and enables significantly more accurate early-season classification. With just 10% of the training labels, it outperforms the current baseline in both accuracy and uncertainty calibration, and by the end of June, it achieves a performance similar to the full-season baseline model.
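The idea of replacing calendar time with thermal time can be sketched with accumulated growing degree days. The snippet below is an illustrative assumption and not the authors' T3S implementation: the base temperature, the synthetic temperature curve, the 5-day acquisition schedule, and the nearest-neighbour selection rule are hypothetical choices, as are the function names `thermal_time` and `sample_by_thermal_time`.

```python
import numpy as np


def thermal_time(daily_mean_temp_c, base_temp_c=0.0):
    """Cumulative growing degree days: running sum of max(T_mean - T_base, 0)."""
    temps = np.asarray(daily_mean_temp_c, dtype=float)
    return np.cumsum(np.maximum(temps - base_temp_c, 0.0))


def sample_by_thermal_time(acquisition_days, cumulative_gdd, n_samples):
    """Pick acquisitions closest to n_samples equally spaced thermal-time targets."""
    acquisition_days = np.asarray(acquisition_days)
    tt = cumulative_gdd[acquisition_days]  # thermal time at each acquisition
    targets = np.linspace(tt[0], tt[-1], n_samples)
    nearest = np.abs(tt[None, :] - targets[:, None]).argmin(axis=1)
    return acquisition_days[nearest]


# Synthetic year: cool winter, warm summer (hypothetical temperatures in deg C).
days = np.arange(365)
temps = 12.0 - 10.0 * np.cos(2.0 * np.pi * days / 365.0)

cumulative_gdd = thermal_time(temps, base_temp_c=0.0)
acquisition_days = np.arange(0, 365, 5)  # one satellite image every 5 days

selected = sample_by_thermal_time(acquisition_days, cumulative_gdd, n_samples=12)
print("selected days of year:", selected.tolist())
```

Since thermal time accumulates faster in warm periods, equal thermal-time steps place more of the selected acquisitions in the warm part of the season than equal calendar steps would.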
Authors:Sindhu Boddu, Arindam Mukherjee
Abstract:
This paper presents the deployment and performance evaluation of a quantized YOLOv4-Tiny model for real-time object detection in aerial emergency imagery on a resource-constrained edge device, the Raspberry Pi 5. The YOLOv4-Tiny model was quantized to INT8 precision using TensorFlow Lite post-training quantization techniques and evaluated for detection speed, power consumption, and thermal feasibility under embedded deployment conditions. The quantized model achieved an inference time of 28.2 ms per image with an average power consumption of 13.85 W, demonstrating a significant reduction in power usage compared to its FP32 counterpart. Detection accuracy remained robust across key emergency classes such as Ambulance, Police, Fire Engine, and Car Crash. These results highlight the potential of low-power embedded AI systems for real-time deployment in safety-critical emergency response applications.
Authors:Qianchao Wang, Peng Sha, Leena Heistrene, Yuxuan Ding, Yaping Du
Abstract:
Data-driven soft sensors have been widely applied in complex industrial processes. However, interpretable spatio-temporal feature extraction by soft sensors remains a challenge. In this light, this work introduces a novel method termed spatio-temporal consistent and interpretable model (STCIM). First, temporal and spatial features are captured and aligned by a far topological spatio-temporal consistency extraction block. Then, the features are mapped into an interpretable latent space for further prediction by explicitly giving physical meanings to latent variables. The efficacy of the proposed STCIM is demonstrated through the modeling of two generated datasets and a real-life dataset of coal-fired power plants. The corresponding experiments show: 1) The generalization of STCIM outperforms other methods, especially in different operation situations. 2) The far topological spatio-temporal consistency is vital for feature alignment. 3) The hyper-parameters of physics-informed interpretable latent space loss decide the performance of STCIM.
Authors:An Zou, Yuankai Xu, Yinchen Ni, Jintao Chen, Yehan Ma, Jing Li, Christopher Gill, Xuan Zhang, Yier Jin
Abstract:
Accelerator-based heterogeneous architectures, such as CPU-GPU, CPU-TPU, and CPU-FPGA systems, are widely adopted to support the popular artificial intelligence (AI) algorithms that demand intensive computation. When deployed in real-time applications, such as robotics and autonomous vehicles, these architectures must meet stringent timing constraints. To summarize these achievements, this article presents a comprehensive survey of real-time scheduling techniques for accelerator-based heterogeneous platforms. It highlights key advancements from the past ten years, showcasing how proposed solutions have evolved to address the distinct challenges and requirements of these systems.
This survey begins with an overview of the hardware characteristics and common task execution models used in accelerator-based heterogeneous systems. It then categorizes the reviewed works based on soft and hard deadline constraints. For soft real-time approaches, we cover real-time scheduling methods supported by hardware vendors and strategies focusing on timing-critical scheduling, energy efficiency, and thermal-aware scheduling. For hard real-time approaches, we first examine support from processor vendors. We then discuss scheduling techniques that guarantee hard deadlines (with strict response time analysis). After reviewing general soft and hard real-time scheduling methods, we explore application- or scenario-driven real-time scheduling techniques for accelerator-enabled heterogeneous computing platforms. Finally, the article concludes with a discussion of open issues and challenges within this research area.
Authors:Shun Wang, Shun-Li Shang, Zi-Kui Liu, Wenrui Hao
Abstract:
Traditional entropy-based methods, such as cross-entropy loss in classification problems, have long been essential tools for quantifying uncertainty and disorder in data and developing artificial intelligence algorithms. However, the rapid growth of data across various domains has introduced new challenges, particularly the integration of heterogeneous datasets with intrinsic disparities. In this paper, we extend zentropy theory into the data science domain by introducing intrinsic entropy, enabling more effective learning from heterogeneous data sources. We propose a zentropy-enhanced neural network (ZENN) that simultaneously learns both energy and intrinsic entropy components, capturing the underlying structure of multi-source data. To support this, we redesign the neural network architecture to better reflect the intrinsic properties and variability inherent in diverse datasets. We demonstrate the effectiveness of ZENN on classification tasks and energy landscape reconstructions, showing its superior generalization capabilities and robustness, particularly in predicting high-order derivatives. As a practical application, we employ ZENN to reconstruct the Helmholtz energy landscape of Fe3Pt using data generated from DFT and capture key material behaviors, including negative thermal expansion and the critical point in the temperature-pressure space. Overall, our study introduces a novel approach for data-driven machine learning grounded in zentropy theory, highlighting ZENN as a versatile and robust deep learning framework for scientific problems involving complex, heterogeneous datasets.
Authors:Qianxi Fu, Youngjoon Suh, Xiaojing Zhang, Yoonjin Won
Abstract:
Phase change plays a critical role in thermal management systems, yet quantitative characterization of multiphase heat transfer remains limited by the challenges of measuring temperature fields in chaotic, rapidly evolving flow regimes. While computational methods offer spatiotemporal resolution in idealized cases, replicating complex experimental conditions remains prohibitively difficult. Here, we present a data-driven framework that leverages a conditional generative adversarial network (CGAN) to infer temperature fields from geometric phase contours in a canonical pool boiling configuration where advanced data collection techniques are restricted. Using high-speed imaging data and simulation-informed training, our model demonstrates the ability to reconstruct temperature fields with errors below 6%. We further show that standard data augmentation strategies are effective in enhancing both accuracy and physical plausibility of the predicted maps across both simulation and experimental datasets when precise physical constraints are not applicable. Our results highlight the potential of deep generative models to bridge the gap between observable multiphase phenomena and underlying thermal transport, offering a powerful approach to augment and interpret experimental measurements in complex two-phase systems.
Authors:Jeesuk Shin, Cheolwoong Kim, Sunwoong Yang, Minseo Lee, Sung Joong Kim, Joongoo Jeon
Abstract:
Severe accidents (SAs) in nuclear power plants have been analyzed using thermal-hydraulic (TH) system codes such as MELCOR and MAAP. These codes efficiently simulate the progression of SAs, while they still have inherent limitations due to their inconsistent finite difference schemes. The use of empirical schemes incorporating both implicit and explicit formulations inherently induces unidirectional coupling in multi-physics analyses. The objective of this study is to develop a novel numerical method for TH system codes using physics-informed neural networks (PINNs). PINNs have shown strength in solving multi-physics problems due to an innate feature of neural networks: automatic differentiation. We propose a node-assigned PINN (NA-PINN) that is suitable for the control volume approach-based system codes. NA-PINN addresses the issue of spatial governing equation variation by assigning an individual network to each nodalization of the system code, such that spatial information is excluded from both the input and output domains, and each subnetwork learns to approximate a purely temporal solution. In this phase, we evaluated the accuracy of the PINN methods for the hydrodynamic module. In the 6 water tank simulation, PINN and NA-PINN showed maximum absolute errors of 1.678 and 0.007, respectively. It should be noted that only NA-PINN demonstrated acceptable accuracy. To the best of the authors' knowledge, this is the first study to successfully implement a system code using PINN. Our future work involves extending NA-PINN to a multi-physics solver and developing it in a surrogate manner.
Authors:Martin Cooney, Fernando Alonso-Fernandez
Abstract:
Crime is a critical problem -- which often takes place behind closed doors, posing additional difficulties for investigators. To bring hidden truths to light, evidence at indoor crime scenes must be documented before any contamination or degradation occurs. Here, we address this challenge from the perspective of artificial intelligence (AI), computer vision, and robotics: Specifically, we explore the use of a blimp as a "floating camera" to drift over and record evidence with minimal disturbance. Adopting a rapid prototyping approach, we develop a proof-of-concept to investigate capabilities required for manual or semi-autonomous operation. Consequently, our results demonstrate the feasibility of equipping indoor blimps with various components (such as RGB and thermal cameras, LiDARs, and WiFi, with 20 minutes of battery life). Moreover, we confirm the core premise: that such blimps can be used to observe crime scene evidence while generating little airflow. We conclude by proposing some ideas related to detection (e.g., of bloodstains), mapping, and path planning, with the aim of stimulating further discussion and exploration.
Authors:Daiyaan Arfeen, Dheevatsa Mudigere, Ankit More, Bhargava Gopireddy, Ahmet Inci, Gregory R. Ganger
Abstract:
LLM training is scaled up to 10Ks of GPUs by a mix of data- (DP) and model-parallel (MP) execution. Critical to achieving efficiency is tensor-parallel (TP; a form of MP) execution within tightly-coupled subsets of GPUs, referred to as a scale-up domain, and the larger the scale-up domain the better the performance. New datacenter architectures are emerging with more GPUs able to be tightly-coupled in a scale-up domain, such as moving from 8 GPUs to 72 GPUs connected via NVLink. Unfortunately, larger scale-up domains increase the blast-radius of failures, with a failure of a single GPU potentially impacting TP execution on the full scale-up domain, which can degrade overall LLM training throughput dramatically. With as few as 0.1% of GPUs being in a failed state, a high TP-degree job can experience nearly 10% reduction in LLM training throughput. We propose nonuniform-tensor-parallelism (NTP) to mitigate this amplified impact of GPU failures. In NTP, a DP replica that experiences GPU failures operates at a reduced TP degree, contributing throughput equal to the percentage of still-functional GPUs. We also propose a rack-design with improved electrical and thermal capabilities in order to sustain power-boosting of scale-up domains that have experienced failures; combined with NTP, this can allow the DP replica with the reduced TP degree (i.e., with failed GPUs) to keep up with the others, thereby achieving near-zero throughput loss for large-scale LLM training.
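The blast-radius effect can be illustrated with a deliberately simplified back-of-the-envelope model. It assumes independent GPU failures, that a uniform-TP job loses an entire scale-up domain whenever any GPU in it has failed, and that under NTP a domain contributes throughput proportional to its surviving GPUs. These are assumptions made for illustration only; the sketch is not intended to reproduce the paper's measured figures, and the function names are hypothetical.

```python
def uniform_tp_loss(gpu_failure_rate, domain_size):
    """Expected fraction of scale-up domains lost when any single failure
    takes the whole domain out of service (simplifying assumption)."""
    return 1.0 - (1.0 - gpu_failure_rate) ** domain_size


def ntp_loss(gpu_failure_rate):
    """Expected throughput loss when each domain keeps running on its
    surviving GPUs, so the loss equals the failed-GPU fraction."""
    return gpu_failure_rate


failure_rate = 0.001  # 0.1% of GPUs in a failed state
for domain_size in (8, 72):
    loss_uniform = uniform_tp_loss(failure_rate, domain_size)
    loss_ntp = ntp_loss(failure_rate)
    print(f"domain of {domain_size:2d} GPUs: "
          f"uniform TP loses {100 * loss_uniform:.2f}%, "
          f"NTP loses {100 * loss_ntp:.2f}%")
```

Under these assumptions the uniform-TP loss grows roughly linearly with domain size while the NTP loss stays at the raw failure fraction, which is the amplification the abstract describes.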
Authors:Emma Hannula, Arttu Häkkinen, Antti Solonen, Felipe Uribe, Jana de Wiljes, Lassi Roininen
Abstract:
Improving the energy efficiency of building heating systems is crucial for reducing global energy consumption and greenhouse gas emissions. Traditional control methods rely on static heating curves that are based solely on outdoor temperature, neglecting system state measurements, such as indoor temperature, and free heat sources, such as solar gain. A more effective strategy is model predictive control (MPC), which optimizes heating control by incorporating system state predictions based on weather forecasts, among other factors. However, current industrial MPC solutions often employ simplified physics-inspired indoor temperature models, sacrificing accuracy for robustness and interpretability. To bridge this gap, we propose a partially stochastic deep learning (DL) architecture for building-specific indoor temperature modeling. Unlike most studies that evaluate model performance through simulations or limited test buildings, our experiments across a large dataset of 100 real-world buildings, covering various heating season conditions, demonstrate that the proposed model outperforms a widely used industrial physics-based model in predictive accuracy. The proposed DL architecture shows significant potential to improve thermal comfort and energy efficiency in heating MPC solutions. Although its computational cost is higher than that of the reference model, we discuss why this trade-off is manageable, even in large-scale applications. Unlike deterministic black-box approaches, the partially stochastic DL model offers a critical advantage by enabling pre-assessment of model feasibility through predictive uncertainty quantification. This work advances heating MPC, particularly for buildings with comprehensive datasets on their thermal behavior under various weather conditions.
Authors:Janghyun Kim, Minseong Kweon, Jinsun Park, Ukcheol Shin
Abstract:
Depth completion, which estimates dense depth from sparse LiDAR and RGB images, has demonstrated outstanding performance in well-lit conditions. However, due to the limitations of RGB sensors, existing methods often struggle to achieve reliable performance in harsh environments, such as heavy rain and low-light conditions. Furthermore, we observe that ground truth depth maps often suffer from large missing measurements in adverse weather conditions such as heavy rain, leading to insufficient supervision. In contrast, thermal cameras are known for providing clear and reliable visibility in such conditions, yet research on thermal-LiDAR depth completion remains underexplored. Moreover, the characteristics of thermal images, such as blurriness, low contrast, and noise, bring unclear depth boundary problems. To address these challenges, we first evaluate the feasibility and robustness of thermal-LiDAR depth completion across diverse lighting (e.g., well-lit, low-light), weather (e.g., clear-sky, rainy), and environment (e.g., indoor, outdoor) conditions, by conducting extensive benchmarks on the MS$^2$ and ViViD datasets. In addition, we propose a framework that utilizes COntrastive learning and Pseudo-Supervision (COPS) to enhance depth boundary clarity and improve completion accuracy by leveraging a depth foundation model in two key ways. First, COPS enforces a depth-aware contrastive loss between different depth points by mining positive and negative samples using a monocular depth foundation model to sharpen depth boundaries. Second, it mitigates the issue of incomplete supervision from ground truth depth maps by leveraging foundation model predictions as dense depth priors. We also provide in-depth analyses of the key challenges in thermal-LiDAR depth completion to aid in understanding the task and encourage future research.
Authors:Simone Fasolato, Anirudh Allam, Simona Onori, Davide M. Raimondo
Abstract:
In parallel-connected cells, cell-to-cell (CtC) heterogeneities can lead to current and thermal gradients that may adversely impact the battery performance and aging. Sources of CtC heterogeneity include manufacturing process tolerances, poor module configurations, and inadequate thermal management. Understanding which CtC heterogeneity sources most significantly impact battery performance is crucial, as it can provide valuable insights. In this study, we use an experimentally validated electrochemical battery model to simulate hundreds of battery configurations, each consisting of four cells in parallel. We conduct a statistical analysis to evaluate the relative importance of key cell-level parameters, interconnection resistance, cell spacing, and location on performance and aging. The analysis reveals that heterogeneities in electrode active material volume fractions primarily impact module capacity, energy, and cell current, leading to substantial thermal gradients. However, to fully capture the output behavior, interconnection resistance, state of charge gradients and the effect of the temperature on parameter values must also be considered. Additionally, module design configurations, particularly cell location, exacerbate thermal gradients, accelerating long-term module degradation. This study also offers insights into optimizing cell arrangement during module design to reduce thermal gradients and enhance overall battery performance and longevity. Simulation results with four cells indicate a reduction of 51.8% in thermal gradients, leading to a 5.2% decrease in long-term energy loss.
Authors:Alexandra Watkins, Ritam Ghosh, Evan Chow, Nilanjan Sarkar
Abstract:
In augmented reality (AR), where digital content is overlaid onto the real world, realistic thermal feedback has been shown to enhance immersion. Yet current thermal feedback devices, heavily influenced by the needs of virtual reality, often hinder physical interactions and are ineffective for immersion in AR. To bridge this gap, we have identified three design considerations relevant for AR thermal feedback: indirect feedback to maintain dexterity, thermal passthrough to preserve real-world temperature perception, and spatiotemporal rendering for dynamic sensations. We then created a unique and innovative thermal feedback device that satisfies these criteria. Human subject experiments assessing perceptual sensitivity, object temperature matching, spatial pattern recognition, and moving thermal stimuli demonstrated the impact of our design, enabling realistic temperature discrimination, virtual object perception, and enhanced immersion. These findings demonstrate that carefully designed thermal feedback systems can bridge the sensory gap between physical and virtual interactions, enhancing AR realism and usability.
Authors:Roozbeh Siyadatzadeh, Mohsen Ansari, Muhammad Shafique, Alireza Ejlali
Abstract:
Embedded systems power many modern applications and must often meet strict reliability, real-time, thermal, and power requirements. Task replication can improve reliability by duplicating a task's execution to handle transient and permanent faults, but blindly applying replication often leads to excessive overhead and higher temperatures. Existing design-time methods typically choose the number of replicas based on worst-case conditions, which can waste resources under normal operation. In this paper, we present RL-TIME, a reinforcement learning-based approach that dynamically decides the number of replicas according to actual system conditions. By considering both the reliability target and a core-level Thermal Safe Power (TSP) constraint at run-time, RL-TIME adapts the replication strategy to avoid unnecessary overhead and overheating. Experimental results show that, compared to state-of-the-art methods, RL-TIME reduces power consumption by 63%, increases schedulability by 53%, and respects TSP 72% more often.
Authors:Myisha A. Chowdhury, Qiugang Lu
Abstract:
Accurate state of temperature (SOT) estimation for batteries is crucial for regulating their temperature within a desired range to ensure safe operation and optimal performance. Existing measurement-based methods often generate noisy signals and do not scale to large battery packs. Electrochemical model-based methods, in contrast, offer high accuracy but are computationally expensive. To tackle these issues, inspired by the equivalent-circuit voltage model for batteries, this paper presents a novel equivalent-circuit electro-thermal model (ECTM) for modeling battery surface temperature. By approximating the complex heat generation inside batteries with data-driven nonlinear (polynomial) functions of key measurable parameters such as state-of-charge (SOC), current, and terminal voltage, our ECTM is simplified into a linear form that admits rapid solutions. The simplified ECTM can be readily identified from the data of a single (one-shot) cycle. The proposed model is extensively validated with benchmark NASA, MIT, and Oxford battery datasets. Simulation results verify that the model, despite being identified with one-shot cycle data, robustly predicts battery temperatures under different battery degradation states and ambient conditions.
Authors:Chinmay Patwardhan, Jonas Kusch
Abstract:
Thermal radiative transfer models physical phenomena ranging from supernovas in astrophysics to radiation from a hohlraum striking a fusion target in plasma physics. Transport and absorption of particles in radiative transfer at different rates lead to a complex interaction between the material and particles that involves highly varying time scales. Resolving these effects can require prohibitively small step sizes, which, combined with nonlinear effects and the particle density's high-dimensional phase space, render conventional numerical methods computationally expensive. This work presents an asymptotic-preserving, mass conservative, rank-adaptive, and parallel integrator for a macro-micro decomposition-based dynamical low-rank approximation of the thermal radiative transfer equations. The proposed integrator efficiently incorporates reflection-transmission type boundary conditions in the low-rank factors. It captures the nonlinear effects of thermal radiation and is energy stable with the step size restriction capturing both hyperbolic and parabolic CFL conditions. The efficacy of the proposed integrator is demonstrated with numerical experiments.
Authors:David Elkouss, Ananda G. Maity, Aditya Nema, Sergii Strelchuk
Abstract:
The majorization relation has found numerous applications in mathematics, quantum information and resource theory, and quantum thermodynamics, where it describes the allowable transitions between two physical states. In many cases, when state vector $x$ does not majorize state vector $y$, it is nevertheless possible to find a catalyst - another vector $z$ such that $x \otimes z$ majorizes $y \otimes z$. Determining the feasibility of such catalytic transformation typically involves checking an infinite set of inequalities. Here, we derive a finite sufficient set of inequalities that imply catalysis. Extending this framework to thermodynamics, we also establish a finite set of sufficient conditions for catalytic state transformations under thermal operations. For novel examples, we provide a software toolbox implementing these conditions.
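As a small illustration of the majorization relation described above, the following sketch checks whether one probability vector majorizes another by comparing partial sums of the entries sorted in decreasing order, and then tests a candidate catalyst via the tensor (Kronecker) product. The helper names `majorizes` and `kron` are hypothetical and are not taken from the authors' toolbox; the numerical example is the well-known textbook case of catalysis, not a result from this paper. The sketch only verifies a given catalyst and does not implement the finite sufficient conditions derived in the work.

```python
from itertools import accumulate

def majorizes(x, y, tol=1e-12):
    """Return True if vector x majorizes vector y.

    Both vectors must have the same length and the same total.
    x majorizes y when every partial sum of x (sorted in decreasing
    order) is at least the corresponding partial sum of y.
    """
    if len(x) != len(y) or abs(sum(x) - sum(y)) > tol:
        return False
    xs = accumulate(sorted(x, reverse=True))
    ys = accumulate(sorted(y, reverse=True))
    return all(a >= b - tol for a, b in zip(xs, ys))

def kron(x, z):
    """Entries of the tensor product x (x) z as a flat list."""
    return [a * b for a in x for b in z]

# Illustrative vectors (assumed for this example only).
x = [0.5, 0.25, 0.25, 0.0]
y = [0.4, 0.4, 0.1, 0.1]
z = [0.6, 0.4]  # candidate catalyst

print(majorizes(x, y))                    # False: x alone does not majorize y
print(majorizes(kron(x, z), kron(y, z)))  # True: with the catalyst it does
```

Sorting makes the check independent of the order in which the entries are listed, and the small tolerance guards against floating-point round-off when partial sums are equal.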
Authors:Ruizhe Yang, Zhongkai Yi, Ying Xu, Guiyu Chen, Haojie Yang, Rong Yi, Tongqing Li, Miaozhe Shen, Jin Li, Haoxiang Gao, Hongyu Duan
Abstract:
The traditional heat-load generation pattern of combined heat and power generators has become a problem leading to renewable energy source (RES) power curtailment in cold regions, motivating the proposal of a planning model for alternative heat sources. The model aims to identify non-dominated capacity allocation schemes for heat pumps, thermal energy storage, electric boilers, and combined storage heaters to construct a Pareto front, considering both economic and sustainable objectives. The integration of various heat sources from both generation and consumption sides enhances flexibility in utilization. The study introduces a novel optimization algorithm, the adaptive multi-objective Bayesian optimization (AMBO). Compared to other widely used multi-objective optimization algorithms, AMBO eliminates predefined parameters that may introduce subjectivity from planners. Beyond the algorithm, the proposed model incorporates a noise term to account for inevitable simulation deviations, enabling the identification of better-performing planning results that meet the unique requirements of cold regions. Moreover, the characteristics of electric-thermal coupling scenarios are captured and reflected in the operation simulation model to keep the simulation close to reality. Numerical simulation verifies the superiority of the proposed approach in generating a more diverse and evenly distributed Pareto front in a sample-efficient manner, providing comprehensive and objective planning choices.
Authors:Àlex Solé, Albert Mosella-Montoro, Joan Cardona, Silvia Gómez-Coca, Daniel Aravena, Eliseo Ruiz, Javier Ruiz-Hidalgo
Abstract:
In diffraction-based crystal structure analysis, thermal ellipsoids, quantified via Anisotropic Displacement Parameters (ADPs), are critical yet challenging to determine. ADPs capture atomic vibrations, reflecting thermal and structural properties, but traditional computation is often expensive. This paper introduces CartNet, a novel graph neural network (GNN) for efficiently predicting crystal properties by encoding atomic geometry into Cartesian coordinates alongside the crystal temperature. CartNet integrates a neighbour equalization technique to emphasize covalent and contact interactions, and a Cholesky-based head to ensure valid ADP predictions. We also propose a rotational SO(3) data augmentation strategy during training to handle unseen orientations. An ADP dataset with over 200,000 experimental crystal structures from the Cambridge Structural Database (CSD) was curated to validate the approach. CartNet significantly reduces computational costs and outperforms existing methods in ADP prediction by 10.87%, while delivering a 34.77% improvement over theoretical approaches. We further evaluated CartNet on other datasets covering formation energy, band gap, total energy, energy above the convex hull, bulk moduli, and shear moduli, achieving 7.71% better results on the Jarvis Dataset and 13.16% on the Materials Project Dataset. These gains establish CartNet as a state-of-the-art solution for diverse crystal property predictions. Project website and online demo: https://www.ee.ub.edu/cartnet
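To illustrate why a Cholesky-based output head guarantees valid ADPs, the sketch below maps six unconstrained numbers to a symmetric positive-definite 3x3 matrix U = L Lᵀ, where L is lower triangular with a strictly positive diagonal. The function name `adp_from_raw`, the ordering of the six inputs, and the use of softplus on the diagonal are assumptions made for this example; the abstract does not specify CartNet's exact parameterization.

```python
import math

def softplus(v):
    """Smooth map from any real number to a strictly positive one."""
    return math.log1p(math.exp(v))

def adp_from_raw(raw):
    """Build a symmetric positive-definite 3x3 matrix from 6 raw values.

    raw = (d0, d1, d2, o10, o20, o21): three diagonal and three
    off-diagonal entries of a lower-triangular Cholesky factor L.
    The diagonal is passed through softplus so that L is invertible,
    which makes U = L @ L.T positive definite by construction.
    """
    d0, d1, d2, o10, o20, o21 = raw
    L = [
        [softplus(d0), 0.0, 0.0],
        [o10, softplus(d1), 0.0],
        [o20, o21, softplus(d2)],
    ]
    # U[i][j] = sum_k L[i][k] * L[j][k]
    return [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

U = adp_from_raw([-1.0, 0.5, 2.0, -3.0, 0.7, 1.2])
for row in U:
    print(row)
```

Because any raw output of the network maps to a valid matrix, no projection or rejection step is needed after prediction.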
Authors:Zhengwen Shen, Yulian Li, Han Zhang, Yuchen Weng, Jun Wang
Abstract:
RGB and thermal image fusion has great potential to improve semantic segmentation in low-illumination conditions. Existing methods typically employ a two-branch encoder framework for multimodal feature extraction and design complicated fusion strategies for multimodal semantic segmentation. However, these methods require massive parameter updates and computational effort during feature extraction and fusion. To address this issue, we propose a novel multimodal fusion network (EFNet) based on an early fusion strategy and a simple but effective feature clustering for training efficient RGB-T semantic segmentation. In addition, we propose a lightweight and efficient multi-scale feature aggregation decoder based on Euclidean distance. We validate the effectiveness of our method on different datasets and outperform previous state-of-the-art methods with fewer parameters and less computation.
Authors:Edward J. Oughton, Dennies K. Bor, Michael Wiltberger, Robert Weigel, C. Trevor Gaunt, Ridvan Dogan, Liling Huang
Abstract:
There is growing concern about our vulnerability to space weather hazards and the disruption critical infrastructure failures could cause to society and the economy. However, the socio-economic impacts of space weather hazards, such as from geomagnetic storms, remain under-researched. This study introduces a novel framework to estimate the economic impacts of electricity transmission infrastructure failure due to space weather. By integrating existing geophysical and geomagnetically induced current (GIC) estimation models with a newly developed geospatial model of the Continental United States power grid, GIC vulnerabilities are assessed for a range of space weather scenarios. The approach evaluates multiple power network architectures, incorporating input-output economic modeling to translate business and population disruptions into macroeconomic impacts from GIC-related thermal heating failures. The results indicate a daily GDP loss from 6 billion USD to over 10 billion USD. Even under conservative GIC thresholds (75 A/ph) aligned with thermal withstand limits from the North American Electric Reliability Corporation (NERC), significant economic disruptions are evident. This study is limited by its restriction to thermal heating analysis, though GICs can also affect the grid through other pathways, such as voltage instability and harmonic distortions. Addressing these other failure mechanisms needs to be the focus of future research.
Authors:Stephen Whitelam, Corneel Casert
Abstract:
We present the design for a thermodynamic computer that can perform arbitrary nonlinear calculations in or out of equilibrium. Simple thermodynamic circuits, fluctuating degrees of freedom in contact with a thermal bath and confined by a quartic potential, display an activity that is a nonlinear function of their input. Such circuits can therefore be regarded as thermodynamic neurons, and can serve as the building blocks of networked structures that act as thermodynamic neural networks, universal function approximators whose operation is powered by thermal fluctuations. We simulate a digital model of a thermodynamic neural network, and show that its parameters can be adjusted by genetic algorithm to perform nonlinear calculations at specified observation times, regardless of whether the system has attained thermal equilibrium. This work expands the field of thermodynamic computing beyond the regime of thermal equilibrium, enabling fully nonlinear computations, analogous to those performed by classical neural networks, at specified observation times.
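The claim that a quartic-confined degree of freedom responds nonlinearly to its input can be checked with a short equilibrium calculation. The sketch below computes the thermal average of x under the Boltzmann weight exp(-V(x)/kT) by trapezoidal quadrature. The specific potential V(x) = x⁴ - I·x, the unit temperature, and the function name `mean_activity` are assumptions chosen for illustration; the abstract does not give the exact circuit potential, and this equilibrium average says nothing about the out-of-equilibrium operation the paper emphasizes.

```python
import math

def mean_activity(I, kT=1.0, x_max=6.0, n=2401):
    """Equilibrium average <x> for the assumed potential V(x) = x**4 - I*x.

    Integrates x * exp(-V/kT) and exp(-V/kT) over [-x_max, x_max] with
    the trapezoidal rule and returns their ratio.
    """
    h = 2.0 * x_max / (n - 1)
    num = 0.0
    den = 0.0
    for k in range(n):
        x = -x_max + k * h
        w = math.exp(-(x ** 4 - I * x) / kT)
        if k == 0 or k == n - 1:
            w *= 0.5  # trapezoid end-point weights
        num += x * w
        den += w
    return num / den

for I in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(I, mean_activity(I))
```

The response is odd in the input and grows more slowly than linearly at large input, which is the kind of saturating nonlinearity that lets such a unit play the role of a neuron.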
Authors:Saakaar Bhatnagar, Andrew Comerford, Zelu Xu, Simone Reitano, Luigi Scrimieri, Luca Giuliano, Araz Banaeizadeh
Abstract:
Thermal runaway in lithium-ion batteries is a critical safety concern for the battery industry due to its potential to cause uncontrolled temperature rises and subsequent fires that can engulf the battery pack and its surroundings. Modeling and simulation offer cost-effective tools for designing strategies to mitigate thermal runaway. Accurately simulating the chemical kinetics of thermal runaway, commonly represented by systems of Arrhenius-based Ordinary Differential Equations (ODEs), requires fitting kinetic parameters to experimental calorimetry data, such as Accelerating Rate Calorimetry (ARC) measurements. However, existing fitting methods often rely on empirical assumptions and simplifications that compromise generality or require manual tuning during the fitting process. Particle Swarm Optimization (PSO) offers a promising approach for directly fitting kinetic parameters to experimental data. Yet, for systems created by multiple Arrhenius ODEs, the computational cost of fitting using a brute-force approach that searches the entire parameter space simultaneously can become prohibitive. This work introduces a divide-and-conquer approach based on PSO to fit N-equation Arrhenius ODE models to ARC data. The proposed method achieves more accurate parameter fitting compared to the brute-force method while maintaining low computational costs. The method is analyzed using two distinct ARC datasets, and the resulting models are further validated through simulations of 3D ARC and oven tests, showing excellent agreement with experimental data and alignment with expected trends.
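For readers unfamiliar with Arrhenius-based kinetics, the sketch below integrates a single first-order Arrhenius reaction under adiabatic conditions, which is the simplest instance of the N-equation systems mentioned above. All parameter values (pre-exponential factor, activation energy, adiabatic temperature rise) are made up for illustration, the function names are hypothetical, and the forward-Euler scheme is chosen only for brevity; none of this reproduces the authors' PSO-based fitting procedure.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)

def arrhenius_rate(T, A, Ea):
    """Rate constant k(T) = A * exp(-Ea / (R * T)), with T in kelvin."""
    return A * math.exp(-Ea / (R_GAS * T))

def simulate_adiabatic(T0, A, Ea, dT_ad, dt, steps):
    """Forward-Euler integration of one self-heating reaction.

    dc/dt = -k(T) * c          (c: normalized reactant fraction, 1 -> 0)
    dT/dt = dT_ad * k(T) * c   (dT_ad: total adiabatic temperature rise)
    Returns a list of (time, temperature, reactant fraction) tuples.
    """
    c, T = 1.0, T0
    history = [(0.0, T, c)]
    for i in range(1, steps + 1):
        consumed = min(arrhenius_rate(T, A, Ea) * c * dt, c)  # never below zero
        c -= consumed
        T += dT_ad * consumed
        history.append((i * dt, T, c))
    return history

# Illustrative (assumed) parameters, not fitted to any ARC data set.
trace = simulate_adiabatic(T0=400.0, A=1e12, Ea=1.2e5, dT_ad=200.0, dt=1.0, steps=5000)
t_end, T_end, c_end = trace[-1]
print(t_end, T_end, c_end)
```

Fitting then amounts to choosing A, Ea, and the heat-release parameters so that the simulated temperature trace matches the measured calorimetry curve, which is the optimization problem the paper addresses.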
Authors:Bryce Hopkins, Leo ONeill, Michael Marinaccio, Eric Rowell, Russell Parsons, Sarah Flanary, Irtija Nazim, Carl Seielstad, Fatemeh Afghah
Abstract:
The increasing accessibility of radiometric thermal imaging sensors for unmanned aerial vehicles (UAVs) offers significant potential for advancing AI-driven aerial wildfire management. Radiometric imaging provides per-pixel temperature estimates, a valuable improvement over non-radiometric data that requires irradiance measurements to be converted into visible images using RGB color palettes. Despite its benefits, this technology has been underutilized largely due to a lack of available data for researchers. This study addresses this gap by introducing methods for collecting and processing synchronized visual spectrum and radiometric thermal imagery using UAVs at prescribed fires. The included imagery processing pipeline drastically simplifies and partially automates each step from data collection to neural network input. Further, we present the FLAME 3 dataset, the first comprehensive collection of side-by-side visual spectrum and radiometric thermal imagery of wildland fires. Building on our previous FLAME 1 and FLAME 2 datasets, FLAME 3 includes radiometric thermal Tag Image File Format (TIFFs) and nadir thermal plots, providing a new data type and collection method. This dataset aims to spur a new generation of machine learning models utilizing radiometric thermal imagery, potentially trivializing tasks such as aerial wildfire detection, segmentation, and assessment. A single-burn subset of FLAME 3 for computer vision applications is available on Kaggle with the full 6 burn set available to readers upon request.
Authors:Zhipeng Lyu, Jinrong Su, Zhe Li, Xiang Li, Hanghang Yan, Lei Chen
Abstract:
Hybrid battery thermal management systems (HBTMS) combining active liquid cooling and passive phase change materials (PCM) cooling have shown a potential for the thermal management of lithium-ion batteries. However, the fill volume of coolant and PCM in hybrid cooling systems is limited by the size and weight of the HBTMS at high charge/discharge rates. These limitations result in reduced convective heat transfer from the coolant during discharge. The liquefaction rate of PCM is accelerated and the passive cooling effect is reduced. In this paper, we propose a compact hybrid cooling system with multi-inlet U-shaped microchannels for which the gap between channels is embedded by PCM/aluminum foam for compactness. Nanofluid cooling (NC) technology with better thermal conductivity is used. A pulsed flow function is further developed for enhanced cooling (EC) with reduced power consumption. An experimentally validated thermal-fluid dynamics model is developed to optimize operating conditions including coolant type, cooling direction, channel height, inlet flow rate, and cooling scheme. The results show that the hybrid cooling solution of NC+PCM+EC adopted by HBTMS further reduces the maximum temperature of the Li-ion battery by 3.44°C under a discharge rate of 1C at room temperature of 25°C with only a 5% increase in power consumption, compared to the conventional liquid cooling method for electric vehicles (EV). The average number of battery charges has increased by about 6 to 15 percent. The results of this study can help improve the range as well as driving safety of new energy EV.
Authors:Tengji Xu, Zeyu Luo, Shaojie Liu, Li Fan, Qiarong Xiao, Benshan Wang, Dongliang Wang, Chaoran Huang
Abstract:
AI models are essential in science and engineering, but recent advances are pushing the limits of traditional digital hardware. To address these limitations, physical neural networks (PNNs), which use physical substrates for computation, have gained increasing attention. However, developing effective training methods for PNNs remains a significant challenge. Current approaches, whether offline or online, suffer from significant accuracy loss. Offline training is hindered by imprecise modeling, while online training yields device-specific models that cannot be transferred to other devices due to manufacturing variances. Both methods face challenges from perturbations after deployment, such as thermal drift or alignment errors, which invalidate trained models and require retraining. Here, we address the challenges of both offline and online training through a novel technique called Sharpness-Aware Training (SAT), which leverages the geometry of the loss landscape to tackle the problems in training physical systems. SAT enables accurate training using efficient backpropagation algorithms, even with imprecise models. PNNs trained by SAT offline even outperform those trained online, despite modeling and fabrication errors. SAT also overcomes online training limitations by enabling reliable transfer of models between devices. Finally, SAT is highly resilient to perturbations after deployment, allowing PNNs to continuously operate accurately under perturbations without retraining. We demonstrate SAT across three types of PNNs, showing it is universally applicable, regardless of whether the models are explicitly known. This work offers a transformative, efficient approach to training PNNs, addressing critical challenges in analog computing and enabling real-world deployment.
Authors:Jerome Gilles, Stephane Landeau, Tristan Dagobert, Philippe Chevalier, Christian Bolut
Abstract:
This paper deals with the problem of infrared image database generation for ATR assessment purposes. Huge databases are required for quantitative and objective performance evaluations. We propose a method that superimposes targets and occultants on backgrounds under image quality metric constraints to generate realistic images. We also propose a method to generate target signatures with intrinsic thermal variability based on 3D models textured with real infrared imagery.
Authors:Jerome Gilles, Stephane Landeau, Tristan Dagobert, Philippe Chevalier, Christian Bolut
Abstract:
In this communication, we propose a method for simulating images of targets in infrared imagery by superimposing vehicle signatures on backgrounds, possibly with occultants. We develop a principle that allows us to generate different thermal configurations of target signatures. This method enables us to easily generate huge datasets for evaluating the performance of ATR algorithms.
Authors:Nikos Sakellariou, Antonios Lalas, Konstantinos Votis, Dimitrios Tzovaras
Abstract:
The unique cost, flexibility, speed, and efficiency of modern UAVs make them an attractive choice in many applications in contemporary society. This, however, causes an ever-increasing number of reported malicious or accidental incidents, rendering the need for the development of UAV detection and classification mechanisms essential. We propose a methodology for developing a system that fuses already processed multi-sensor data into a new Deep Neural Network to increase its classification accuracy towards UAV detection. The DNN model fuses high-level features extracted from individual object detection and classification models associated with thermal, optronic, and radar data. Additionally, emphasis is given to the model's Convolutional Neural Network (CNN) based architecture that combines the features of the three sensor modalities by stacking the extracted image features of the thermal and optronic sensor achieving higher classification accuracy than each sensor alone.
Authors:Daniel Andrés Arcones, Martin Weiser, Phaedon-Stelios Koutsourelakis, Jörg F. Unger
Abstract:
A key factor in ensuring the accuracy of computer simulations that model physical systems is the proper calibration of their parameters based on real-world observations or experimental data. Inevitably, uncertainties arise, and Bayesian methods provide a robust framework for quantifying and propagating these uncertainties to model predictions. Nevertheless, Bayesian methods paired with inexact models usually produce predictions unable to represent the observed datapoints. Additionally, the quantified uncertainties of these overconfident models cannot be propagated to other Quantities of Interest (QoIs) reliably. A promising solution involves embedding a model inadequacy term in the inference parameters, allowing the quantified model form uncertainty to influence non-observed QoIs. This paper introduces a more interpretable framework for embedding the model inadequacy compared to existing methods. To overcome the limitations of current approaches, we adapt the existing likelihood models to properly account for noise in the measurements and propose two new formulations designed to address their shortcomings. Moreover, we evaluate the performance of this inadequacy-embedding approach in the presence of discrepancies between measurements and model predictions, including noise and outliers. Particular attention is given to how the uncertainty associated with the model inadequacy term propagates to the QoIs, enabling a more comprehensive statistical analysis of the predictions' reliability. Finally, the proposed approach is applied to estimate the uncertainty in the predicted heat flux from a transient thermal simulation using temperature observations.
Authors:Sijie Yang, Adrian Chong, Pengyuan Liu, Filip Biljecki
Abstract:
In response to climate change and urban heat island effects, enhancing human thermal comfort in cities is crucial for sustainable urban development. Traditional methods for investigating the urban thermal environment and corresponding human thermal comfort level are often resource intensive, inefficient, and limited in scope. To address these challenges, we (1) introduce a new concept named thermal affordance, which formalizes the integrated inherent capacity of a streetscape to influence human thermal comfort based on its visual and physical features; and (2) propose an efficient method to evaluate it (visual assessment of thermal affordance, VATA), which combines street view imagery (SVI), online and in-field surveys, and statistical learning algorithms. VATA extracts five categories of image features from SVI data and establishes 19 visual-perceptual indicators for streetscape visual assessment. Using a multi-task neural network and elastic net regression, we model their chained relationship to predict and comprehend thermal affordance for Singapore. VATA predictions are validated with field-investigated OTC data, providing a cost-effective, scalable, and transferable method to assess the thermal comfort potential of urban streetscapes. Moreover, we demonstrate its utility by generating a geospatially explicit mapping of thermal affordance, outlining a model update workflow for long-term urban-scale analysis, and implementing a two-stage prediction and inference approach (IF-VPI-VATA) to guide future streetscape improvements. This framework can inform streetscape design to support sustainable, liveable, and resilient urban environments.
Authors:Lucia Gordon, Nico Lang, Catherine Ressijac, Andrew Davies
Abstract:
Multimodal aerial data are used to monitor natural systems, and machine learning can significantly accelerate the classification of landscape features within such imagery to benefit ecology and conservation. It remains under-explored, however, how these multiple modalities ought to be fused in a deep learning model. As a step towards filling this gap, we study three strategies (Early fusion, Late fusion, and Mixture of Experts) for fusing thermal, RGB, and LiDAR imagery using a dataset of spatially-aligned orthomosaics in these three modalities. In particular, we aim to map three ecologically-relevant biophysical landscape features in African savanna ecosystems: rhino middens, termite mounds, and water. The three fusion strategies differ in whether the modalities are fused early or late, and if late, whether the model learns fixed weights per modality for each class or generates weights for each class adaptively, based on the input. Overall, the three methods have similar macro-averaged performance with Late fusion achieving an AUC of 0.698, but their per-class performance varies strongly, with Early fusion achieving the best recall for middens and water and Mixture of Experts achieving the best recall for mounds.
Authors:Shuwei Xing, Derek W. Cool, David Tessier, Elvis C. S. Chen, Terry M. Peters, Aaron Fenster
Abstract:
Liver tumor ablation procedures require accurate placement of the needle applicator at the tumor centroid. The lower-cost and real-time nature of ultrasound (US) has advantages over computed tomography (CT) for applicator guidance; however, in some patients, liver tumors may be occult on US and tumor mimics can make lesion identification challenging. Image registration techniques can aid in interpreting anatomical details and identifying tumors, but their clinical application has been hindered by the tradeoff between alignment accuracy and runtime performance, particularly when compensating for liver motion due to patient breathing or movement. Therefore, we propose a 2D-3D US registration approach to enable intra-procedural alignment that mitigates errors caused by liver motion. Specifically, our approach can correlate imbalanced 2D and 3D US image features and use continuous 6D rotation representations to enhance the model's training stability. The dataset was divided into 2388, 196 and 193 image pairs for training, validation and testing, respectively. Our approach achieved a mean Euclidean distance error of 2.28 mm $\pm$ 1.81 mm and a mean geodesic angular error of 2.99$^{\circ}$ $\pm$ 1.95$^{\circ}$, with a runtime of 0.22 seconds per 2D-3D US image pair. These results demonstrate that our approach can achieve accurate alignment and clinically acceptable runtime, indicating potential for clinical translation.
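The continuous 6D rotation representation and the geodesic angular error reported above can both be written in a few lines. The sketch below converts two 3-vectors into a rotation matrix by Gram-Schmidt orthonormalization, which is the commonly used construction for this representation, and measures the angle between two rotations from the trace of their relative rotation. The function names are hypothetical, and the abstract does not describe how the authors' network produces or consumes the six numbers, so this is only an assumed, generic version.

```python
import math

def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotation_from_6d(a1, a2):
    """Map two 3-vectors (six numbers) to a 3x3 rotation matrix.

    b1 is a1 normalized, b2 is a2 with its b1 component removed and
    then normalized, and b3 = b1 x b2. The columns of the result are
    b1, b2, b3. a1 and a2 must not be parallel or zero.
    """
    b1 = _normalize(a1)
    dot = sum(p * q for p, q in zip(b1, a2))
    b2 = _normalize([q - dot * p for p, q in zip(b1, a2)])
    b3 = _cross(b1, b2)
    return [[b1[i], b2[i], b3[i]] for i in range(3)]

def geodesic_angle_deg(Ra, Rb):
    """Angle in degrees of the relative rotation Ra^T Rb."""
    trace = sum(Ra[k][i] * Rb[k][i] for i in range(3) for k in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp round-off
    return math.degrees(math.acos(cos_angle))

identity = rotation_from_6d([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
quarter_turn_z = rotation_from_6d([0.0, 1.0, 0.0], [-1.0, 0.0, 0.0])
print(geodesic_angle_deg(identity, quarter_turn_z))  # approximately 90
```

Unlike Euler angles or quaternions, this mapping has no discontinuities, which is the usual reason given for its more stable training behavior.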
Authors:Baohe Zhang, Lilli Frison, Thomas Brox, Joschka Bödecker
Abstract:
Constrained Reinforcement Learning (RL) has emerged as a significant research area within RL, where integrating constraints with rewards is crucial for enhancing safety and performance across diverse control tasks. In the context of building heating systems, optimizing energy efficiency while maintaining residents' thermal comfort can be intuitively formulated as a constrained optimization problem. However, solving it with RL may require a large amount of data, so an accurate and versatile simulator is desirable. In this paper, we propose a novel building simulator, I4B, which provides interfaces for different usages, and apply a model-free constrained RL algorithm, constrained Soft Actor-Critic with Linear Smoothed Log Barrier function (CSAC-LB), to the heating optimization problem. Benchmarking against baseline algorithms demonstrates CSAC-LB's efficiency in data exploration, constraint satisfaction and performance.
Authors:Zhaojun Ruan, Libao Shi
Abstract:
Semidefinite programming (SDP) is widely acknowledged as one of the most effective methods for deriving the tightest lower bounds of the optimal power flow (OPF) problems. In this paper, an enhanced semidefinite relaxation model is presented that integrates tighter λ-based quadratic convex relaxation, valid inequalities, and optimality-based bound tightening algorithms, derived in accordance with the branch thermal limit boundary surface, into the SDP framework. The model further tightens the lower bounds of the feasible region of OPF problems, effectively combining the advantages of these recent advancements. Additionally, the utilization of chordal decomposition in the complex matrix formulation of SDP can significantly reduce the solution time. Notably, for the same SDP problem, different chordal decompositions can result in varying solution times. To address this problem, this paper proposes a clique graph merging strategy within the complex matrix SDP framework, which assesses clique sizes and the computational burden on interior-point solvers, reduces the need for hyperparameter tuning, and further enhances the solution efficiency. Finally, the proposed hybrid relaxation model is evaluated using MATPOWER and PGLib-OPF test cases, demonstrating its effectiveness in reducing the optimality gap and validating its computational performance on test cases with up to 13,659 nodes.
Authors:Anirudh Tunga, Jordan Heim, Michael Mueterthies, Thomas Gruenwald, Jonathan Nistor
Abstract:
Accurately capturing the three dimensional power distribution within a reactor core is vital for ensuring the safe and economical operation of the reactor, compliance with Technical Specifications, and fuel cycle planning (safety, control, and performance evaluation). Offline (that is, during cycle planning and core design), a three dimensional neutronics simulator is used to estimate the reactor's power, moderator, void, and flow distributions, from which margin to thermal limits and fuel exposures can be approximated. Online, this is accomplished with a system of local power range monitors (LPRMs) designed to capture enough neutron flux information to infer the full nodal power distribution. Certain problems with this process, ranging from measurement and calibration to the power adaption process, pose challenges to operators and limit the ability to design reload cores economically (e.g., engineering in insufficient margin or more margin than required). Artificial intelligence (AI) and machine learning (ML) are being used to solve the problems to reduce maintenance costs, improve the accuracy of online local power measurements, and decrease the bias between offline and online power distributions, thereby leading to a greater ability to design safe and economical reload cores. We present ML models trained from two deep neural network (DNN) architectures, SurrogateNet and LPRMNet, that demonstrate a testing error of 1 percent and 3 percent, respectively. Applications of these models can include virtual sensing capability for bypassed or malfunctioning LPRMs, on demand virtual calibration of detectors between successive calibrations, highly accurate nuclear end of life determinations for LPRMs, and reduced bias between measured and predicted power distributions within the core.
Authors:Ryan L. Mann, Gabriel Waite
Abstract:
We establish efficient algorithms for weakly-interacting quantum spin systems at arbitrary temperature. In particular, we obtain a fully polynomial-time approximation scheme for the partition function and an efficient approximate sampling scheme for the thermal distribution over a classical spin space. Our approach is based on the cluster expansion method and a standard reduction from approximate sampling to approximate counting.
Authors:Xuguang Zhang, Hexiang Zhang, Hanqing Liu, Xiaoli Li, Mu Ying, Yutian Yang, Marilyn L. Minus, Ming Su, Yi Zheng
Abstract:
Passive daytime radiative cooling (PDRC) provides an energy-free approach to suppress surface temperatures by reflecting solar irradiation while emitting thermal radiation through the mid-infrared atmospheric window. Despite rapid progress in optical performance, most PDRC systems remain limited by rigid, fragile, or planar substrates, restricting their use on flexible, curved, or wearable surfaces. Here, we report a biocompatible and structurally robust PDRC system integrated onto a commercial rapid-curing fiberglass cast, a conformal substrate widely used in orthopedic and industrial applications. The cooling architecture adopts a bilayer polymer design consisting of a polyvinyl alcohol (PVA) adhesion layer and a polymethyl methacrylate (PMMA) protective layer, both embedded with calcium pyrophosphate (CPP) ceramic particles derived from processed animal bone waste. The bio-derived CPP simultaneously enables broadband solar scattering and high mid-infrared emittance, while offering sustainability and biocompatibility advantages. The resulting composite exhibits over 90% solar reflectance and achieves up to 15 C sub-ambient cooling under direct outdoor sunlight.
Authors:Xuguang Zhang, Michael C. Halbig, Amjad Almansour, Mrityunjay Singh, Meelad Ranaiefar, Yi Zheng
Abstract:
Efficient thermal management is critical for ensuring the safety, performance, and durability of lithium ion pouch cells (LIPCs), particularly under high power operating conditions where conventional battery thermal management systems (BTMS) struggle to balance cooling effectiveness, structural simplicity, and weight. Here, we report a lightweight hybrid BTMS that synergistically integrates active liquid cooling with composite phase change material (CPCM) based thermal buffering through a 3D printed hexagonal architecture. The system is fabricated via a two step additive manufacturing process that enables sealed CPCM encapsulation and isolated liquid cooling pathways within a single carbon fiber reinforced nylon module, effectively eliminating leakage risks while allowing precise geometric control. Hexagonally partitioned CPCM cavities maximize the CPCM wall interfacial area and shorten internal conduction paths, accelerating latent heat absorption, while embedded serpentine liquid channels provide continuous convective heat removal and prevent CPCM saturation. A nanocarbon enhanced CPCM is employed to overcome the intrinsic low thermal conductivity of conventional paraffin based materials.
Authors:Muhammad Ibrahim Khan, Bivin Pradeep, James Brusey
Abstract:
Typical domestic immersion water heater systems are often operated continuously during winter, heating quickly rather than efficiently and ignoring predictable demand windows and ambient losses. We study deadline-aware control, where the aim is to reach a target temperature at a specified time while minimising energy consumption. We introduce an efficient Gymnasium environment that models an immersion hot water heater with first-order thermal losses and discrete on and off actions of 0 W and 6000 W applied every 120 seconds. Methods include a time-optimal bang-bang baseline, a zero-shot Monte Carlo Tree Search planner, and a Proximal Policy Optimisation policy. We report total energy consumption in watt-hours under identical physical dynamics. Across sweeps of initial temperature from 10 to 30 degrees Celsius, deadline from 30 to 90 steps, and target temperature from 40 to 80 degrees Celsius, PPO achieves the most energy-efficient performance at a 60-step horizon of 2 hours, using 3.23 kilowatt-hours, compared to 4.37 to 10.45 kilowatt-hours for bang-bang control and 4.18 to 6.46 kilowatt-hours for MCTS. This corresponds to energy savings of 26 percent at 30 steps and 69 percent at 90 steps. In a representative trajectory with a 50 kg water mass, 20 degrees Celsius ambient temperature, and a 60 degrees Celsius target, PPO consumes 54 percent less energy than bang-bang control and 33 percent less than MCTS. These results show that learned deadline-aware control reduces energy consumption under identical physical assumptions, while planners provide partial savings without training and learned policies offer near-zero inference cost once trained.
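The heater dynamics described in this abstract can be illustrated with a minimal sketch. It is not the authors' Gymnasium environment; it only assumes the stated setup (0 W or 6000 W actions applied every 120 seconds, first-order thermal losses, a 50 kg water mass, and a 20 degrees Celsius ambient temperature). The loss coefficient LOSS_W_PER_K, the function names, and the thermostat-style heat_early policy are hypothetical choices made for illustration.

```python
# Illustrative sketch only: a first-order immersion heater model.
# Values marked "from the abstract" come from the text; the rest are assumptions.
WATER_SPECIFIC_HEAT = 4186.0  # J/(kg*K), physical constant for liquid water
MASS_KG = 50.0                # from the abstract (representative trajectory)
AMBIENT_C = 20.0              # from the abstract (representative trajectory)
POWER_W = 6000.0              # from the abstract (heater "on" action)
DT_S = 120.0                  # from the abstract (control interval)
LOSS_W_PER_K = 5.0            # hypothetical loss coefficient, not from the source


def step_temperature(temp_c, heater_on):
    """One explicit Euler step of m*c*dT/dt = P*u - k*(T - T_ambient)."""
    heat_in_w = POWER_W if heater_on else 0.0
    loss_w = LOSS_W_PER_K * (temp_c - AMBIENT_C)
    return temp_c + DT_S * (heat_in_w - loss_w) / (MASS_KG * WATER_SPECIFIC_HEAT)


def heat_early(temp_c, step_index, target_c, n_steps):
    """Hypothetical baseline: heat whenever the water is below the target."""
    return temp_c < target_c


def run_policy(policy, initial_c, target_c, n_steps):
    """Roll a policy forward; return final temperature and energy in watt-hours."""
    temp_c = initial_c
    energy_wh = 0.0
    for step_index in range(n_steps):
        heater_on = policy(temp_c, step_index, target_c, n_steps)
        if heater_on:
            energy_wh += POWER_W * DT_S / 3600.0  # 200 Wh per "on" step
        temp_c = step_temperature(temp_c, heater_on)
    return temp_c, energy_wh


if __name__ == "__main__":
    final_c, used_wh = run_policy(heat_early, 20.0, 60.0, 60)
    print(f"final temperature: {final_c:.2f} C, energy: {used_wh:.0f} Wh")
```

Under these assumptions, 60 steps of 120 seconds span the 2-hour horizon mentioned in the abstract, and every "on" step costs 200 Wh. A deadline-aware policy would aim to delay heating so that less energy is lost to the ambient before the deadline.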
Authors:Jacob Linden, Travis Askham, Jeremy Hoskins
Abstract:
We present a boundary integral formulation of the Helmholtz equation with visco-thermal boundary conditions, in two dimensions. Such boundary conditions allow for the accurate simulation of viscous and thermal losses in the vicinity of the boundary, which are particularly relevant in acoustic devices with narrow features. Using cancellations between hyper-singular operators, a variant of the method of images technique, and analytic pre-conditioners, we derive integral equations that are Fredholm second-kind, up to the application of a boundedly invertible operator. This approach allows for the fast and accurate solution of acoustics problems with boundary layers.
Authors:Yuji Sakamoto, Masaki Aoi, Sho Suzuki, Takumi Haga, Shumpei Hosokawa, Yuma Abe, Yuya Tasaki, Tsuyoshi Totani, Sou Nakamura, Masaharu Uchiumi, Shinya Fujita
Abstract:
This paper describes the system design methodology derived from the development and evaluation tests of deployable solar panels to be mounted on a 3U CubeSat. The study mainly includes structural analysis, thermal analysis, and a review of vibration test results. Hokkaido University is developing the 3U CubeSat HOKUSHIN-1 in collaboration with Tohoku University and Muroran Institute of Technology. Deployable solar panels are a key technology for future planned lunar exploration missions, as they enable power-intensive communication and propulsion required for orbit control. The satellite also demonstrates a newly developed compact and efficient propulsion system. The satellite has dimensions of approximately 10x10x34 cm, a mass of 3.99 kg, and will be deployed into a circular orbit at an altitude of about 400 km with an orbital inclination of 51.6 degrees from the International Space Station.
Authors:Qi He, Chunyu Qu
Abstract:
AI data-center expansion is increasingly constrained by the coupled availability of deliverable electricity and heat-rejection (cooling) capacity. We propose and evaluate an integrated Waste-to-Energy-AI Data Center configuration that treats cooling as a first-class energy service rather than an unavoidable electricity burden. The coupled system is modeled as an input-output 'black box' with transparent boundaries and a standalone benchmark in which mechanical chilling is powered by grid electricity. The central mechanism is energy-grade matching: low-grade WtE thermal output drives absorption cooling to deliver chilled service, thereby displacing baseline cooling electricity. We show that thermoeconomic superiority is governed by three first-order determinants, (i) cooling coverage of IT heat load, (ii) parasitic electricity for transport and auxiliaries, and (iii) distance-driven delivery decay, yielding a break-even corridor beyond which net benefits vanish. Comparative statics characterize sensitivity to IT utilization, feedstock quality (waste LHV and throughput), climate parameterization, and corridor distance. We translate these accounting gains into decision language through a computable prototype for Levelized Cost of Computing (LCOC) and an ESG valuation channel grounded in measurable mechanisms, without re-deriving full lifecycle inventories. The framework provides siting-ready feasibility conditions for WtE-AIDC coupling in urban AI corridors under grid stress.
Authors:Lupiao Hu, Fasheng Wang, Fangmei Chen, Fuming Sun, Haojie Li
Abstract:
Existing RGB-T salient object detection methods predominantly rely on manually aligned and annotated datasets, struggling to handle real-world scenarios with raw, unaligned RGB-T image pairs. In practical applications, due to significant cross-modal disparities such as spatial misalignment, scale variations, and viewpoint shifts, the performance of current methods drastically deteriorates on unaligned datasets. To address this issue, we propose an efficient RGB-T SOD method for real-world unaligned image pairs, termed Thin-Plate Spline-driven Semantic Correlation Learning Network (TPS-SCL). We employ a dual-stream MobileViT as the encoder, combined with efficient Mamba scanning mechanisms, to effectively model correlations between the two modalities while maintaining low parameter counts and computational overhead. To suppress interference from redundant background information during alignment, we design a Semantic Correlation Constraint Module (SCCM) to hierarchically constrain salient features. Furthermore, we introduce a Thin-Plate Spline Alignment Module (TPSAM) to mitigate spatial discrepancies between modalities. Additionally, a Cross-Modal Correlation Module (CMCM) is incorporated to fully explore and integrate inter-modal dependencies, enhancing detection performance. Extensive experiments on various datasets demonstrate that TPS-SCL attains state-of-the-art (SOTA) performance among existing lightweight SOD methods and outperforms mainstream RGB-T SOD approaches.
Authors:Krishna Chaitanya Sunkara, Rambabu Konakanchi
Abstract:
GPU-centric AI data centers have adopted liquid cooling to handle extreme heat loads, but coolant leaks result in substantial energy loss through unplanned shutdowns and extended repair periods. We present a proof-of-concept smart IoT monitoring system combining LSTM neural networks for probabilistic leak forecasting with Random Forest classifiers for instant detection. Tested on synthetic data aligned with ASHRAE 2021 standards, our approach achieves 96.5% detection accuracy and 87% forecasting accuracy at 90% probability within plus or minus 30-minute windows. Analysis demonstrates that humidity, pressure, and flow rate deliver strong predictive signals, while temperature exhibits minimal immediate response due to thermal inertia in server hardware. The system employs MQTT streaming, InfluxDB storage, and Streamlit dashboards, forecasting leaks 2-4 hours ahead while identifying sudden events within 1 minute. For a typical 47-rack facility, this approach could prevent roughly 1,500 kWh of annual energy waste through proactive maintenance rather than reactive emergency procedures. While validation remains synthetic-only, results establish feasibility for future operational deployment in sustainable data center operations.
Authors:Sudheer Mishra, Sundararajan Natarajan, Natarajan E
Abstract:
This work presents a new conforming stabilized virtual element method for the generalized Boussinesq equation with temperature-dependent viscosity and thermal conductivity. A gradient-based local projection stabilization method is introduced in the discrete formulation to circumvent the violation of the discrete inf-sup condition. The well-posedness of the continuous problem is established under sufficiently small datum. We derive a stabilized virtual element problem for the Boussinesq equation using equal-order virtual element approximations. The proposed method has several advantages, such as being more straightforward to implement, free from higher-order derivative terms, providing separate stabilization terms without introducing coupling between solution components, and minimizing the number of globally coupled degrees of freedom. The existence of a discrete solution to the stabilized virtual element problem is demonstrated using the Brouwer fixed-point theorem. The error estimates are derived in the energy norm. Additionally, several numerical examples are presented to show the efficiency and robustness of the proposed method, confirming the theoretical results.
Authors:Nidhi Malhotra, Amber K. Rothe, Revanth Konda, Jaydev P. Desai
Abstract:
Robotically steerable compliant surgical tools offer several advantages over rigid tools, including enhanced dexterity, reduced tissue damage, and the ability to generate non-linear trajectories in minimally invasive neurosurgical procedures. Many existing robotic neurosurgical tools are designed using stainless steel or nitinol materials. Using polymer-based materials instead can offer advantages such as reduced interference in magnetic resonance imaging, enhanced safety for guiding electrically powered instruments, and reduced tissue damage due to inherent compliance. Several polymer materials have been used in robotic surgical applications, such as polyimide, polycarbonate, and elastic resin. Various fabrication strategies have also been proposed, including standard microfabrication techniques, thermal drawing, and 3-D printing. In our previous work, a tendon-driven notched tube was designed for several neurosurgical robotic tools, utilizing laser micromachining to reduce the stiffness of the tube in certain directions. This fabrication method is desirable because it is a single-step process, offers high precision, and does not require a cleanroom or harsh chemicals. Past studies have explored laser micromachining of polymer material for surgical applications such as stent fabrication. In this work, we explore extending the use of the laser micromachining approach to the fabrication of polyimide (PI) robotically steerable cannulas for neurosurgical applications. Utilizing the method presented in this work, we fabricated joints as small as 1.5 mm outer diameter (OD). Multiple joints were fabricated using PI tubes of different ODs, and the loading behavior of the fabricated joints was experimentally characterized.
Authors:Jacqueline Borgstedt, Jake Bhattacharyya, Matteo Iovino, Frank E. Pollick, Stephen Brewster
Abstract:
Zoomorphic Socially Assistive Robots (SARs) offer an alternative source of social touch for individuals who cannot access animal companionship. However, current SARs provide only limited, passive touch-based interactions and lack the rich haptic cues, such as warmth, heartbeat or purring, that are characteristic of human-animal touch. This limits their ability to evoke emotionally engaging, life-like physical interactions. We present a multimodal tactile prototype, which was used to augment the established PARO robot, integrating thermal and vibrotactile feedback to simulate feeling biophysiological signals. A flexible heating interface delivers body-like warmth, while embedded actuators generate heartbeat-like rhythms and continuous purring sensations. These cues were iteratively designed and calibrated with input from users and haptics experts. We outline the design process and offer reproducible guidelines to support the development of emotionally resonant and biologically plausible touch interactions with SARs.
Authors:Jon Muhovič, Janez Perš
Abstract:
Unmanned surface vehicles can encounter a number of varied visual circumstances during operation, some of which can be very difficult to interpret. While most cases can be solved using only color camera images, some weather and lighting conditions require additional information. To expand the available maritime data, we present a novel multimodal maritime dataset MULTIAQUA (Multimodal Aquatic Dataset). Our dataset contains synchronized, calibrated and annotated data captured by sensors of different modalities, such as RGB, thermal, IR, LIDAR, etc. The dataset is aimed at developing supervised methods that can extract useful information from these modalities in order to provide a high quality of scene interpretation regardless of potentially poor visibility conditions. To illustrate the benefits of the proposed dataset, we evaluate several multimodal methods on our difficult nighttime test set. We present training approaches that enable multimodal methods to be trained in a more robust way, thus enabling them to retain reliable performance even in near-complete darkness. Our approach allows for training a robust deep neural network using only daytime images, thus significantly simplifying data acquisition, annotation, and the training process.
Authors:Enrique Rodríguez-Miranda, Pablo Otálora, José González-Hernández, José Luis Guzmán, Manuel Berenguel
Abstract:
This paper presents a benchmarking framework to evaluate process control strategies in outdoor microalgae raceway reactors, integrating four key control regulation tasks: pH, dissolved oxygen (DO), culture volume through coordinated harvest-dilution actions, and temperature via a sump-mounted spiral heat exchanger. The benchmark is built upon a high-fidelity, experimentally calibrated dynamic model that captures the strongly coupled thermal, physicochemical, and biological processes governing industrial-scale open raceway ponds. A closed-loop simulation environment is provided, featuring realistic actuator constraints, gas transport delays, stiff integration, and a fully specified scenario based on multi-day outdoor disturbances (irradiance, temperature, wind, and humidity). Four user-replaceable controllers define the manipulation of CO2 injection, air bubbling, harvest/dilution sequencing, and heat-exchanger operation. The platform computes a unified global performance index, in addition to individual metrics for each control problem, combining tracking error, gas and energy usage, and biomass productivity, enabling consistent and quantitative comparison of alternative control strategies. Baseline regulatory architectures (On/Off, PI/PID, and Economic Model Predictive Control (EMPC)) are included to illustrate the benchmark use for classical and advanced control methods. By providing an openly specified, reproducible, and computationally tractable benchmark with well-defined function interfaces, this work aims to bridge control methodology and outdoor algal bioprocess engineering, and to support the development of multivariable control strategies for disturbance-rich environmental systems.
Authors:Elizaveta Prozorova, Anton Konev, Vladimir Faerman
Abstract:
The article analyzes the use of thermal imaging technologies for biometric identification based on facial thermograms. It presents a comparative analysis of infrared spectral ranges (NIR, SWIR, MWIR, and LWIR). The paper also defines key requirements for thermal cameras used in biometric systems, including sensor resolution, thermal sensitivity, and a frame rate of at least 30 Hz. Siamese neural networks are proposed as an effective approach for automating the identification process. In experiments conducted on a proprietary dataset, the proposed method achieved an accuracy of approximately 80%. The study also examines the potential of hybrid systems that combine visible and infrared spectra to overcome the limitations of individual modalities. The results indicate that thermal imaging is a promising technology for developing reliable security systems.
Authors:Eric J. Elias, Michael Esswein, Jonathan P. How, David W. Miller
Abstract:
As the popularity of on-orbit operations grows, so does the need for precise navigation around unknown resident space objects (RSOs) such as other spacecraft, orbital debris, and asteroids. The use of Simultaneous Localization and Mapping (SLAM) algorithms is often studied as a method to map out the surface of an RSO and find the inspector's relative pose using a lidar or conventional camera. However, conventional cameras struggle during eclipse or shadowed periods, and lidar, though robust to lighting conditions, tends to be heavier, bulkier, and more power-intensive. Thermal-infrared cameras can track the target RSO throughout difficult illumination conditions without these limitations. While useful, thermal-infrared imagery lacks the resolution and feature-richness of visible cameras. In this work, images of a target satellite in low Earth orbit are photo-realistically simulated in both visible and thermal-infrared bands. Pixel-level fusion methods are used to create visible/thermal-infrared composites that leverage the best aspects of each camera. Navigation errors from a monocular SLAM algorithm are compared between visible, thermal-infrared, and fused imagery in various lighting and trajectories. Fused imagery yields substantially improved navigation performance over visible-only and thermal-only methods.
Authors:Hamidreza Moradi, Melika Filvantorkaman
Abstract:
Wearable biosensors increasingly require continuous and battery-free power sources, but conventional skin-mounted thermoelectric generators are limited by the small temperature differences available in real environments. This work introduces a hybrid thermoplasmonic and thermoelectric energy harvester that combines multiband plasmonic absorption with machine-learning-guided optimization to improve on-body energy conversion. A broadband metasurface made of cross-bowtie nanoantennas is designed to absorb infrared radiation across the 2 to 12 micron range, capturing human body emission, ambient infrared radiation, and near-infrared sunlight. Electromagnetic simulations show strong field enhancement in nanoscale antenna gaps, producing localized thermoplasmonic heating directly above flexible Bi2Te3 thermoelectric junctions. Coupled optical, thermal, and electrical modeling indicates that this localized heating increases the effective temperature difference from the typical 3 to 4 degrees C of standard wearable thermoelectric generators to approximately 13 degrees C. This results in a power density of about 0.15 mW per cm^2 under indoor-relevant infrared flux, representing a four- to six-fold improvement over existing flexible devices. A machine-learning surrogate model trained on multiphysics data predicts temperature rise and electrical output with high accuracy (R^2 greater than 0.92) and identifies optimal device geometries through Pareto-front analysis. The proposed hybrid thermoplasmonic, thermoelectric, and machine-learning framework provides a scalable route toward more efficient, compact, and flexible energy harvesters for autonomous and long-term wearable physiological monitoring.
Authors:Yueran Zhao, Chang-Sheng Mei, Nathan J. McDannold, Shenyan Zong, Guofeng Shen
Abstract:
Background: Accurate proton resonance frequency (PRF) MR thermometry is essential for monitoring temperature rise during thermal ablation with high intensity focused ultrasound (FUS). Conventional referenceless methods such as complex field estimation (CFE) and phase finite difference (PFD) tend to exhibit errors when susceptibility-induced phase discontinuities occur at tissue interfaces.
Authors:Rosana Caro, Lorena Cruz, Arturo Martinez, Pablo S. Naharro, Santiago Muelas, Kevin King Sancho, Elena Cuerda, Maria del Mar Barbero-Barrera, Antonio LaTorre
Abstract:
Designing Zero-Emissions Buildings (ZEBs) involves balancing numerous complex objectives that traditional methods struggle to address. Fenestration, encompassing façade openings and shading systems, plays a critical role in ZEB performance due to its high thermal transmittance and solar radiation admission. This paper presents a novel simulation-based optimization method for fenestration designed for practical application. It uses a hybrid metaheuristic algorithm and relies on rules and an updatable catalog to fully automate the design process, create a highly diverse search space, minimize biases, and generate detailed solutions ready for architectural prescription. Nineteen fenestration variables, over which architects have design flexibility, were optimized to reduce heating demand, cooling demand, and thermal discomfort in residential buildings. The method was tested across three Spanish climate zones. Results demonstrate that the considered optimization algorithm significantly outperforms the baseline Genetic Algorithm in both quality and robustness, with these differences shown to be statistically significant. Furthermore, the findings offer valuable insights for ZEB design, highlighting challenges in reducing cooling demand in warm climates, and showcasing the superior efficiency of automated movable shading systems compared to fixed solutions.
Authors:Christian Mollière, Iker Cumplido, Marco Zeulner, Lukas Liesenhoff, Matthias Schubert, Julia Gottfriedsen
Abstract:
The rapid growth of data from satellite-based Earth observation (EO) systems poses significant challenges in data transmission and storage. We evaluate the potential of task-specific learned compression algorithms in this context to reduce data volumes while retaining crucial information. In detail, we compare traditional compression (JPEG 2000) with a learned compression approach (Discretized Mixed Gaussian Likelihood) on three EO segmentation tasks: fire, cloud, and building detection. Learned compression notably outperforms JPEG 2000 for large-scale, multi-channel optical imagery in both reconstruction quality (PSNR) and segmentation accuracy. However, traditional codecs remain competitive on smaller, single-channel thermal infrared datasets due to limited data and architectural constraints. Additionally, joint end-to-end optimization of compression and segmentation models does not improve performance over standalone optimization.
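The abstract reports reconstruction quality as PSNR. As a reminder of how that metric is usually defined, the following is a minimal sketch of the standard formula, 10 * log10(MAX^2 / MSE), applied to flat sequences of pixel values. The function name psnr and the default max_value of 255 (8-bit imagery) are assumptions made for illustration; the paper's data may use a different dynamic range.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    max_value is the largest possible pixel value (255 assumes 8-bit data).
    """
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between original and decompressed pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    decoded = [11, 19, 30, 42]
    print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

Higher PSNR means the decompressed image is closer to the original. It measures pixel fidelity only, which is why the abstract reports segmentation accuracy separately.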
Authors:Ali Waseem, Malcolm Mielle
Abstract:
Inverse heat problems refer to the estimation of material thermophysical properties given observed or known heat diffusion behaviour. Inverse heat problems have wide-ranging uses, but a critical application lies in quantifying how building facade renovation reduces thermal transmittance, a key determinant of building energy efficiency. However, solving inverse heat problems with non-invasive data collected in situ is error-prone due to environmental variability or deviations from theoretically assumed conditions. Hence, current methods for measuring thermal conductivity are either invasive, require lengthy observation periods, or are sensitive to environmental and experimental conditions. Here, we present a PINN-based iterative framework to estimate the thermal conductivity k of a wall from a set of thermographs; our framework alternates between estimating the forward heat problem with a PINN for a fixed k, and optimizing k by comparing the thermographs and surface temperatures predicted by the PINN, repeating until the estimated k converges. Using both environmental data captured by a weather station and data generated from Finite-Volume-Method software simulations, we accurately predict k across different environmental conditions and data collection sampling times, provided that the temperature profile of the wall at dawn is close to steady state. Although violating the steady-state assumption impacts the accuracy of the estimate of k, we show that our proposed framework still exhibits a maximum MAE of only 4.0851. Our work demonstrates the potential of PINN-based methods for reliable estimation of material properties in situ and under realistic conditions, without lengthy measurement campaigns. Given the lack of research on using machine learning, and more specifically PINNs, for solving in-situ inverse problems, we expect our work to be a starting point for more research on the topic.
Authors:Yifan Sun, Zhi Li, Tetsuya Imamura, Yuji Ohishi, Chris Wolverton, Ken Kurosaki
Abstract:
Thermoelectrics (TEs) are promising candidates for energy harvesting, with performance quantified by the figure of merit, $ZT$. To accelerate the discovery of high-$ZT$ materials, efforts have focused on identifying compounds with low thermal conductivity $κ$. Using a curated dataset of 71,913 entries, we show that high-$ZT$ materials reside not only in the low-$κ$ regime but also cluster near a lattice-to-total thermal conductivity ratio ($κ_\mathrm{L}/κ$) of approximately 0.5, consistent with the phonon-glass electron-crystal (PGEC) design concept. Building on this insight, we construct a framework consisting of two machine learning models for the lattice and electronic components of thermal conductivity that jointly provide both $κ$ and $κ_\mathrm{L}/κ$ for screening and guiding the optimization of TE materials. Among 104,567 compounds screened, our models identify 2,522 ultralow-$κ$ candidates. Follow-up case studies demonstrate that this framework can reliably provide optimization strategies by suggesting new dopants and alloys that shift pristine materials toward the regime where $κ_\mathrm{L}/κ$ approaches 0.5. Ultimately, by integrating rapid screening with PGEC-guided optimization, our data-driven framework effectively bridges the critical gap between materials discovery and performance enhancement.
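The screening idea in this abstract can be sketched in a few lines, assuming the usual decomposition of total thermal conductivity into lattice and electronic parts, κ = κ_L + κ_e. The sketch below is not the authors' pipeline: the class and function names, the example compounds, and the thresholds (max_kappa, ratio_tolerance) are hypothetical, and the two conductivity components are taken as given inputs rather than predicted by machine learning models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A compound with (hypothetically predicted) conductivity components in W/(m*K)."""
    name: str
    kappa_lattice: float
    kappa_electronic: float

    @property
    def kappa_total(self):
        # Total thermal conductivity as the sum of lattice and electronic parts.
        return self.kappa_lattice + self.kappa_electronic

    @property
    def lattice_ratio(self):
        # Lattice-to-total ratio; undefined when the total is zero.
        if self.kappa_total == 0.0:
            raise ValueError("total thermal conductivity must be non-zero")
        return self.kappa_lattice / self.kappa_total


def screen(candidates, max_kappa=1.0, target_ratio=0.5, ratio_tolerance=0.1):
    """Keep low-kappa candidates whose lattice ratio lies near the target.

    All three thresholds are illustrative assumptions, not values from the source.
    Results are sorted by closeness of the ratio to the target.
    """
    kept = [
        c for c in candidates
        if c.kappa_total <= max_kappa
        and abs(c.lattice_ratio - target_ratio) <= ratio_tolerance
    ]
    return sorted(kept, key=lambda c: abs(c.lattice_ratio - target_ratio))


if __name__ == "__main__":
    pool = [
        Candidate("compound_a", 0.4, 0.4),   # low kappa, ratio 0.5
        Candidate("compound_b", 0.9, 0.05),  # low kappa, lattice-dominated
        Candidate("compound_c", 2.0, 2.0),   # ratio 0.5 but kappa too high
    ]
    for c in screen(pool):
        print(c.name, c.kappa_total, c.lattice_ratio)
```

Filtering on both quantities reflects the abstract's observation that low κ alone is not the whole picture; the ratio criterion adds the second condition the authors associate with high-$ZT$ materials.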
Authors:Mingxuan Tian, Haochen Mu, Donghong Ding, Mengjiao Li, Yuhan Ding, Jianping Zhao
Abstract:
With the development of digital twins and smart manufacturing systems, there is an urgent need for real-time distortion field prediction to control defects in metal Additive Manufacturing (AM). However, numerical simulation methods suffer from high computational cost and long run-times that prevent real-time use, while conventional Machine Learning (ML) models struggle to extract spatiotemporal features for long-horizon prediction and fail to decouple thermo-mechanical fields. This paper proposes a Physics-informed Neural Operator (PINO) to predict z- and y-direction distortion over the next 15 s. Our method, Physics-informed Deep Operator Network-Recurrent Neural Network (PIDeepONet-RNN), employs trunk and branch networks to process temperature history and encode distortion fields, respectively, enabling decoupling of thermo-mechanical responses. By incorporating the heat conduction equation as a soft constraint, the model ensures physical consistency and suppresses unphysical artifacts, thereby establishing a more physically consistent mapping between the thermal history and distortion. This is important because such a basis function, grounded in physical laws, provides a robust and interpretable foundation for predictions. The proposed models are trained and tested using datasets generated from experimentally validated Finite Element Method (FEM) simulations. Evaluation shows that the model achieves high accuracy, low error accumulation, and time efficiency. The max absolute errors in the z- and y-directions are as low as 0.9733 mm and 0.2049 mm, respectively. The error distribution shows high errors in the molten pool but low gradient norms in the deposited and key areas. The performance of the PINO surrogate model highlights its potential for real-time long-horizon physics field prediction in controlling defects.
Authors:Vinit Mehta, Charu Sharma, Karthick Thiyagarajan
Abstract:
With the rapid advancement of artificial intelligence and robotics, the integration of Large Language Models (LLMs) with 3D vision is emerging as a transformative approach to enhancing robotic sensing technologies. This convergence enables machines to perceive, reason and interact with complex environments through natural language and spatial understanding, bridging the gap between linguistic intelligence and spatial perception. This review provides a comprehensive analysis of state-of-the-art methodologies, applications and challenges at the intersection of LLMs and 3D vision, with a focus on next-generation robotic sensing technologies. We first introduce the foundational principles of LLMs and 3D data representations, followed by an in-depth examination of 3D sensing technologies critical for robotics. The review then explores key advancements in scene understanding, text-to-3D generation, object grounding and embodied agents, highlighting cutting-edge techniques such as zero-shot 3D segmentation, dynamic scene synthesis and language-guided manipulation. Furthermore, we discuss multimodal LLMs that integrate 3D data with touch, auditory and thermal inputs, enhancing environmental comprehension and robotic decision-making. To support future research, we catalog benchmark datasets and evaluation metrics tailored for 3D-language and vision tasks. Finally, we identify key challenges and future research directions, including adaptive model architectures, enhanced cross-modal alignment and real-time processing capabilities, which pave the way for more intelligent, context-aware and autonomous robotic sensing systems.
Authors:Maharshi Pathak, SungKu Kang, Vanessa C. Whittem, Katherine Bassett, Michael B. Kane, David J. Fannon
Abstract:
The expansion of renewable electricity generation, growing demands due to electrification, greater prevalence of working from home, and increasing frequency and severity of extreme weather events will place new demands on the electric supply and distribution grid. Broader adoption of demand response programs (DRPs) for the residential sector may help meet these challenges; however, experience shows that occupant overrides in DRPs compromise their effectiveness. There is a lack of formal understanding of how discomfort, routines, and other motivations affect DRP overrides and other related human building interactions (HBI). This paper reports preliminary findings from a study of 20 households in Colorado and Massachusetts, US over three months. Participants responded to ecological momentary assessments (EMA) triggered by thermostat interactions and at random times throughout the day. EMAs included Likert-scale questions on thermal preference, preference intensity, and changes to 7 different activity types that could affect thermal comfort, and an open-ended question about the motivations for such actions. Twelve tags were developed to categorize motivation responses and analyzed statistically to identify associations between motivations, preferences, and HBI actions. Reactions to changes in the thermal environment were the most frequently observed motivation, accounting for 118 of 220 responses. On the other hand, 47% of responses were at least partially motivated by non-thermal factors, suggesting limited utility for occupant behavior models founded solely on thermal comfort. Changes in activity level and clothing were less likely to be reported when EMAs were triggered by thermostat interactions, while fan interactions were more likely. Windows, shades, and portable heater interactions had no significant dependence on how the EMA was triggered.
Authors:Florian Ebmeier, Nicole Ludwig, Jannik Thuemmel, Georg Martius, Volker H. Franz
Abstract:
Solar thermal systems (STS) present a promising avenue for low-carbon heat generation, with a well-running system providing heat at minimal cost and carbon emissions. However, STS can exhibit faults due to improper installation, maintenance, or operation, often resulting in a substantial reduction in efficiency or even damage to the system. As monitoring at the individual level is economically prohibitive for small-scale systems, automated monitoring and fault detection should be used to address such issues. Recent advances in data-driven anomaly detection, particularly in time series analysis, offer a cost-effective solution by leveraging existing sensors to identify abnormal system states. Here, we propose a probabilistic reconstruction-based framework for anomaly detection. We evaluate our method on the publicly available PaSTS dataset of operational domestic STS, which features real-world complexities and diverse fault types. Our experiments show that reconstruction-based methods can detect faults in domestic STS both qualitatively and quantitatively, while generalizing to previously unseen systems. We also demonstrate that our model outperforms both simple and more complex deep learning baselines. Additionally, we show that heteroscedastic uncertainty estimation is essential to fault detection performance. Finally, we discuss the engineering overhead required to unlock these improvements and make a case for simple deep learning models.
Authors:Tomáš Bezděk, Haomu Yuan, Vojtěch Novák, Silvie Illésová, Martin Beseda
Abstract:
This study systematically benchmarks classical optimization strategies for the Quantum Approximate Optimization Algorithm when applied to Generalized Mean-Variance Problems under near-term Noisy Intermediate-Scale Quantum conditions. We evaluate Dual Annealing, Constrained Optimization by Linear Approximation, and the Powell Method across noiseless, sampling noise, and two thermal noise models. Our Cost Function Landscape Analysis revealed that the Quantum Approximate Optimization Algorithm angle parameters $γ$ were largely inactive in the noiseless regime. This insight motivated a parameter-filtered optimization approach, in which we focused the search space exclusively on the active $β$ parameters. This filtering substantially improved parameter efficiency for fast optimizers like Constrained Optimization by Linear Approximation (reducing evaluations from 21 to 12 in the noiseless case) and enhanced robustness, demonstrating that leveraging structural insights is an effective architecture-aware noise mitigation strategy for Variational Quantum Algorithms.
Authors:Anna Van Boven, Kyri Baker
Abstract:
Wholesale power markets often use linear approximations of power system constraints. Because AC power flow does not consider inequality constraints, using it for feasibility post-processing can violate bounds on reactive power, voltage magnitudes, or thermal limits. There remains a need for a streamlined analytical approach that can guarantee AC feasibility while adhering to variable bounds. This paper suggests an augmented implementation of AC power flow that uses an additional two bus types (PQV and P) to help resolve voltage bound violations present in the traditional approach. The proposed method sacrifices the voltage setpoint at a generator in exchange for fixing the voltage at a load bus, thereby moving a degree of freedom around the network. Results on the IEEE 14-bus, 57-bus, and 300-bus test cases demonstrate how switching bus types can reduce overall network violations and help find feasible power system setpoints.
Authors:Rune Rost, Lorenzo Branca, Tobias Buck
Abstract:
Radiative transfer is a fundamental process in astrophysics, essential for both interpreting observations and modeling thermal and dynamical feedback in simulations via ionizing radiation and photon pressure. However, numerically solving the underlying radiative transfer equation is computationally intensive due to the complex interaction of light with matter and the disparity between the speed of light and the typical gas velocities in astrophysical environments, making it particularly expensive to include the effects of on-the-fly radiation in hydrodynamic simulations. This motivates the development of surrogate models that can significantly accelerate radiative transfer calculations while preserving high accuracy. We present a surrogate model based on a Fourier Neural Operator architecture combined with U-Nets. Our model approximates three-dimensional, monochromatic radiative transfer in time-dependent regimes, in absorption-emission approximation, achieving speedups of more than 2 orders of magnitude while maintaining an average relative error below 3%, demonstrating our approach's potential to be integrated into state-of-the-art hydrodynamic simulations.
Authors:Patrik Valábek, Michaela Horváthová, Martin Klaučo
Abstract:
This paper presents a deep Koopman-based Economic Model Predictive Control (EMPC) for efficient operation of a laboratory-scale pasteurization unit (PU). The method uses Koopman operator theory to transform the complex, nonlinear system dynamics into a linear representation, enabling the application of convex optimization while representing the complex PU accurately. The deep Koopman model utilizes neural networks to learn the linear dynamics from experimental data, achieving a 45% improvement in open-loop prediction accuracy over conventional N4SID subspace identification. Both analyzed models were employed in the EMPC formulation that includes interpretable economic costs, such as energy consumption, material losses due to inadequate pasteurization, and actuator wear. The feasibility of EMPC is ensured using slack variables. The deep Koopman EMPC and N4SID EMPC are numerically validated on a nonlinear model of a multivariable PU under external disturbances. The disturbances include a feed pump fail-to-close scenario and the introduction of a cold batch to be pasteurized. These results demonstrate that the deep Koopman EMPC achieves a 32% reduction in total economic cost compared to the N4SID baseline. This improvement is mainly due to the reductions in material losses and energy consumption. Furthermore, the steady-state operation via Koopman-based EMPC requires 10.2% less electrical energy. The results highlight the practical advantages of integrating deep Koopman representations with economic optimization to achieve resource-efficient control of thermal-intensive plants.
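The idea of lifting nonlinear dynamics into a linear representation can be illustrated on a toy system. The sketch below is not the pasteurization unit model from the abstract: the two-state system, its coefficients `a`, `b`, `c`, and the hand-picked observables in `lift` are all assumptions chosen because they admit an exact finite-dimensional lift. A deep Koopman model would instead learn the observables with a neural network.

```python
import numpy as np

# Hypothetical toy system (assumed for illustration only):
#   x1[k+1] = a * x1[k]
#   x2[k+1] = b * x2[k] + c * x1[k]**2
a, b, c = 0.9, 0.5, 0.3


def nonlinear_step(x):
    """One step of the original nonlinear dynamics."""
    x1, x2 = x
    return np.array([a * x1, b * x2 + c * x1**2])


def lift(x):
    """Hand-picked observables z = [x1, x2, x1^2] (learned in a deep Koopman model)."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])


# In lifted coordinates the dynamics are exactly linear: z[k+1] = A @ z[k].
A = np.array([
    [a,   0.0, 0.0],
    [0.0, b,   c],
    [0.0, 0.0, a**2],
])
# Output matrix that recovers the original state from the lifted state.
C = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])


def rollout_nonlinear(x0, steps):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = nonlinear_step(x)
    return x


def rollout_lifted(x0, steps):
    z = lift(np.asarray(x0, dtype=float))
    for _ in range(steps):
        z = A @ z  # purely linear prediction
    return C @ z


if __name__ == "__main__":
    x0 = [1.5, -0.7]
    print(rollout_nonlinear(x0, 20))
    print(rollout_lifted(x0, 20))
```

Because the lifted model is linear, a predictive controller built on it can be posed as a convex program, which is the property the abstract relies on.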
Authors:Samarth Toolhally, Joeri Roelofs, Siep Weiland, Amritam Das
Abstract:
In inkjet printing, optimal paper moisture is crucial for print quality, achieved through hot-air impingement in the fixation unit. This paper presents a modular digital twin of the fixation unit, modeling the thermo-fluidic drying process and monitoring its spatio-temporal performance. The novel approach formulates the digital twin as an infinite-dimensional state estimator that infers fixation states from limited sensor data, while remaining robust to disturbances. Modularity is achieved through a graph-theoretic model, where each node represents thermo-fluidic dynamics in different sections of the fixation unit. Evaporation is modeled as a nonlinear boundary effect coupled with node dynamics via Linear Fractional Representation. Using the Partial Integral Equation (PIE) framework, we develop a unified approach for stability, input-output analysis, simulation, and rapid prototyping, validated with operational data from a commercial printer. An $\mathcal{H}_{\infty}$-optimal Luenberger state estimator is then synthesized to estimate thermal states from available sensor data, enabling real-time monitoring of spatio-temporal thermal effects on paper sheets.
Authors:Morteza Sadeghi, Hadi Keramati, Sajjad Bigham
Abstract:
We present a Bézier-based Multi-Fidelity Thermal Optimization Framework, which is a computationally efficient methodology for the global optimization of 3D heat sinks. The flexible Bézier-parameterized fin geometries and the adopted multi-fidelity pseudo-3D thermal modeling strategy strike a balance between accuracy and computational cost. In this method, the smooth and compact Bézier representation of fins defines the design space from which diverse topologies can be generated with minimal design variables. A global optimizer, the Covariance Matrix Adaptation Evolution Strategy, minimizes the pressure drop subject to a given surface-average temperature constraint. In the framework, the pseudo-3D model couples two thermally interacting 2D layers: a thermofluid layer representing the fluid domain passing through the fins, and a conductive base plate representing the surface where excessive average temperature is to be avoided. Both layers are coupled with calibrated heat transfer coefficients obtained from high-fidelity 3D simulations. For several fin geometries, the proposed framework has been validated by comparing the pseudo-3D results with those of full 3D simulations, which yielded good agreement in terms of temperature distribution and pressure drops while the computational cost was reduced by several orders of magnitude. Optimization results show that it attains up to 50% pressure loss reduction compared to conventional straight-fin configurations, and it reveals a clear trade-off between thermal performance and hydraulic efficiency. Thus, the proposed method forms a new basis for fast, geometry-flexible, and optimized heat sink design, enabling efficient exploration of complex geometries.
Authors:Niklas Wölki, Lukas Kondmann, Christian Mollière, Martin Langer, Julia Gottfriedsen, Martin Werner
Abstract:
Onboard cloud segmentation is a critical yet underexplored task in thermal Earth observation (EO), particularly for CubeSat missions constrained by limited hardware and spectral information. CubeSats often rely on a single thermal band and lack sufficient labeled data, making conventional cloud masking techniques infeasible. This work addresses these challenges by applying transfer learning to thermal cloud segmentation for the FOREST-2 CubeSat, using a UNet with a lightweight MobileNet encoder. We pretrain the model on the public Landsat-7 Cloud Cover Assessment Dataset and fine-tune it with a small set of mission-specific samples in a joint-training setup, improving the macro F1 from 0.850 to 0.877 over FOREST-2-only baselines. We convert the model to a TensorRT engine and demonstrate full-image inference in under 5 seconds on an NVIDIA Jetson Nano. These results show that leveraging public datasets and lightweight architectures can enable accurate, efficient thermal-only cloud masking on-orbit, supporting real-time decision-making in data-limited EO missions.
Authors:Ehsan Ghaderi, Mohamad Ali Bijarchi, Siamak Kazemzadeh Hannani, Ali Nouri Boroujerdi
Abstract:
In this study, the capabilities of the Physics-Informed Neural Network (PINN) method are investigated for three major tasks: modeling, simulation, and optimization in the context of the heat conduction problem. In the modeling phase, the governing equation of heat transfer by conduction is reconstructed through equation discovery using fractional-order derivatives, enabling the identification of the fractional derivative order that best describes the physical behavior. In the simulation phase, the thermal conductivity is treated as a physical parameter, and a parametric simulation is performed to analyze its influence on the temperature field. In the optimization phase, the focus is placed on the inverse problem, where the goal is to infer unknown physical properties from observed data. The effectiveness of the PINN approach is evaluated across these three fundamental engineering problem types and compared against conventional numerical methods. The results demonstrate that although PINNs may not yet outperform traditional numerical solvers in terms of speed and accuracy for forward problems, they offer a powerful and flexible framework for parametric simulation, optimization, and equation discovery, making them highly valuable for inverse and data-driven modeling applications.
Authors:Kaichen Ouyang, Yezhi Xia
Abstract:
Synthetic Benchmark Problems (SBPs) are commonly used to evaluate the performance of metaheuristic algorithms. However, these SBPs often contain various unrealistic properties, potentially leading to underestimation or overestimation of algorithmic performance. While several benchmark suites comprising real-world problems have been proposed for various types of metaheuristics, a notable gap exists for Constrained Multi-objective Optimization Problems (CMOPs) derived from practical engineering applications, particularly in the domain of Battery Thermal Management System (BTMS) design. To address this gap, this study develops and presents a specialized benchmark suite for multi-objective optimization in BTMS. This suite comprises a diverse collection of real-world constrained problems, each defined via accurate surrogate models based on recent research to efficiently represent complex thermal-fluid interactions. The primary goal of this benchmark suite is to provide a practical and relevant testing ground for evolutionary algorithms and optimization methods focused on energy storage thermal management. Future work will involve establishing comprehensive baseline results using state-of-the-art algorithms, conducting comparative analyses, and developing a standardized ranking scheme to facilitate robust performance assessment.
Authors:Zhenglai Shen, Hongyu Zhou
Abstract:
Compounding climate hazards, such as wildfire-induced outages and urban heatwaves, challenge the stability and equity of cities. We present a Hazard-Responsive Digital Twin (H-RDT) that combines physics-informed neural network modeling, multimodal data fusion, and equity-aware risk analytics for urban-scale response. In a synthetic district with diverse building archetypes and populations, a simulated wildfire-outage-heatwave cascade shows that H-RDT maintains stable indoor temperature predictions (approximately 31 to 33 C) under partial sensor loss, reproducing outage-driven surges and recovery. The reinforcement learning based fusion module adaptively reweights IoT, UAV, and satellite inputs to sustain spatiotemporal coverage, while the equity-adjusted mapping isolates high-vulnerability clusters (schools, clinics, low-income housing). Prospective interventions, such as preemptive cooling-center activation and microgrid sharing, reduce population-weighted thermal risk by 11 to 13 percent, shrink the 95th-percentile (tail) risk by 7 to 17 percent, and cut overheating hours by up to 9 percent. Beyond the synthetic demonstration, the framework establishes a transferable foundation for real-city implementation, linking physical hazard modeling with social equity and decision intelligence. The H-RDT advances digital urban resilience toward adaptive, learning-based, and equity-centered decision support for climate adaptation.
Authors:Takahiro Kushida, Kenichiro Tanaka
Abstract:
This paper introduces a novel method for detailed 3D shape reconstruction utilizing thermal polarization cues. Unlike state-of-the-art methods, the proposed approach is independent of illumination and material properties. In this paper, we formulate a general theory of polarization observation and show that long-wave infrared (LWIR) polarimetric imaging is free from the ambiguities that affect visible polarization analyses. Subsequently, we propose a method for recovering detailed 3D shapes using multi-view thermal polarimetric images. Experimental results demonstrate that our approach effectively reconstructs fine details in transparent, translucent, and heterogeneous objects, outperforming existing techniques.
Authors:Nika Mlinarič Hribar, Matjaž Depolli, Gregor Kosec
Abstract:
This study investigates the impact of wind velocity averaging on Dynamic Thermal Rating (DTR) calculations. It is based on high-temporal-resolution (1 second) wind measurements obtained from a transmission line in Slovenia, Europe. Wind speed and direction variability are analysed, and two averaging methods are employed: vector averaging, where velocity is averaged as a vector, and hybrid averaging, where speed is averaged as a scalar. DTR calculations are performed on both high-resolution data and averaged data (5 minute averaging window). It is demonstrated that averaging has a significant effect on both Nusselt number and ampacity, and the effect exhibits a strong angular dependency on the relative angle of the wind to the line. Therefore, two limit cases are studied: in the case of parallel wind, averaged data underestimates the ampacity, and there is a significant number of cases where the underestimation is larger than 10 %. In the case of perpendicular wind, the two averaging methods affect the results in different ways, but both result in a substantial number of cases where ampacity is overestimated, potentially leading to unsafe operation. The main takeaway of the study is that averaging wind velocity has a significant impact on DTR results, and special emphasis should be given to the averaging method, as different methods affect the results in different ways.
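The difference between the two averaging methods can be made concrete with a minimal sketch. The function names and the sample data are hypothetical. The abstract states only that hybrid averaging treats speed as a scalar, so taking the hybrid direction from the mean unit vector is an assumption made here.

```python
import numpy as np


def vector_average(speed, direction_deg):
    """Average the wind as a vector: mean the Cartesian components, then convert back."""
    speed = np.asarray(speed, dtype=float)
    theta = np.deg2rad(np.asarray(direction_deg, dtype=float))
    u = np.mean(speed * np.cos(theta))
    v = np.mean(speed * np.sin(theta))
    return np.hypot(u, v), np.rad2deg(np.arctan2(v, u)) % 360.0


def hybrid_average(speed, direction_deg):
    """Average the speed as a scalar; direction from the mean unit vector (assumption)."""
    speed = np.asarray(speed, dtype=float)
    theta = np.deg2rad(np.asarray(direction_deg, dtype=float))
    mean_dir = np.arctan2(np.mean(np.sin(theta)), np.mean(np.cos(theta)))
    return np.mean(speed), np.rad2deg(mean_dir) % 360.0


if __name__ == "__main__":
    # Invented samples: constant 2 m/s wind whose direction swings between 80 and 100 degrees.
    speed = [2.0, 2.0]
    direction = [80.0, 100.0]
    print(vector_average(speed, direction))  # speed shrinks because components partly cancel
    print(hybrid_average(speed, direction))  # speed stays at the scalar mean
```

By the triangle inequality the vector-averaged speed can never exceed the scalar-averaged speed, so the two methods feed different effective wind speeds into any downstream convective cooling calculation whenever the direction fluctuates within the averaging window.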
Authors:Abdollah Rahimi, Mehdi Jafari Shahbazzadeh, Amid Khatibi
Abstract:
Wireless Body Area Networks (WBANs) have gained significant attention due to their applications in healthcare monitoring, sports, military communication, and remote patient care. These networks consist of wearable or implanted sensors that continuously collect and transmit physiological data, requiring efficient and reliable communication. However, WBANs face challenges such as limited energy, dynamic topology, and sensitivity to node temperature, which demand specialized routing strategies. Traditional shortest-path routing often causes congestion and overheating in specific nodes, leading to early failures. To address these problems, this paper proposes an intelligent temperature-aware and reliability-based routing approach that enhances WBAN performance. The proposed method works in two phases: (1) network setup and intelligent path selection, and (2) dynamic traffic management and hotspot avoidance. In the first phase, nodes share information such as residual energy, temperature, link reliability, and delay to build an optimized topology using a multi-criteria decision algorithm. The second phase continuously monitors real-time conditions and reroutes traffic away from overheated or depleted nodes. Simulation results show that the proposed approach improves throughput by 13 percent, reduces end-to-end delay by 10 percent, decreases energy consumption by 25 percent, and lowers routing load by 30 percent compared to existing methods.
Authors:Siva Teja Kakileti, Bharath Govindaraju, Sudhakar Sampangi, Geetha Manjunath
Abstract:
Mammography, the current standard for breast cancer screening, has reduced sensitivity in women with dense breast tissue, contributing to missed or delayed diagnoses. Thermalytix, an AI-based thermal imaging modality, captures functional vascular and metabolic cues that may complement mammographic structural data. This study investigates whether a breast density-informed multi-modal AI framework can improve cancer detection by dynamically selecting the appropriate imaging modality based on breast tissue composition. A total of 324 women underwent both mammography and thermal imaging. Mammography images were analyzed using a multi-view deep learning model, while Thermalytix assessed thermal images through vascular and thermal radiomics. The proposed framework utilized Mammography AI for fatty breasts and Thermalytix AI for dense breasts, optimizing predictions based on tissue type. This multi-modal AI framework achieved a sensitivity of 94.55% (95% CI: 88.54-100) and specificity of 79.93% (95% CI: 75.14-84.71), outperforming standalone mammography AI (sensitivity 81.82%, specificity 86.25%) and Thermalytix AI (sensitivity 92.73%, specificity 75.46%). Importantly, the sensitivity of Mammography dropped significantly in dense breasts (67.86%) versus fatty breasts (96.30%), whereas Thermalytix AI maintained high and consistent sensitivity in both (92.59% and 92.86%, respectively). This demonstrates that a density-informed multi-modal AI framework can overcome key limitations of unimodal screening and deliver high performance across diverse breast compositions. The proposed framework is interpretable, low-cost, and easily deployable, offering a practical path to improving breast cancer screening outcomes in both high-resource and resource-limited settings.
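The density-informed selection rule described above (mammography AI for fatty breasts, thermal AI for dense breasts) can be sketched as a simple routing function. The `Case` record, its field names, and the four toy cases are invented for illustration and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical per-patient record; all field names are assumptions."""
    dense_breast: bool
    mammo_ai_positive: bool
    thermal_ai_positive: bool
    has_cancer: bool


def density_informed_call(case):
    """Use the thermal model for dense breasts and the mammography model otherwise."""
    return case.thermal_ai_positive if case.dense_breast else case.mammo_ai_positive


def sensitivity_specificity(cases, predict):
    """Return (sensitivity, specificity) of `predict` over a list of cases."""
    tp = sum(1 for c in cases if predict(c) and c.has_cancer)
    fn = sum(1 for c in cases if not predict(c) and c.has_cancer)
    tn = sum(1 for c in cases if not predict(c) and not c.has_cancer)
    fp = sum(1 for c in cases if predict(c) and not c.has_cancer)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy cases, chosen so that each single modality makes one error.
TOY_CASES = [
    Case(dense_breast=True,  mammo_ai_positive=False, thermal_ai_positive=True,  has_cancer=True),
    Case(dense_breast=False, mammo_ai_positive=True,  thermal_ai_positive=True,  has_cancer=True),
    Case(dense_breast=False, mammo_ai_positive=False, thermal_ai_positive=True,  has_cancer=False),
    Case(dense_breast=True,  mammo_ai_positive=False, thermal_ai_positive=False, has_cancer=False),
]

if __name__ == "__main__":
    print(sensitivity_specificity(TOY_CASES, density_informed_call))
    print(sensitivity_specificity(TOY_CASES, lambda c: c.mammo_ai_positive))
    print(sensitivity_specificity(TOY_CASES, lambda c: c.thermal_ai_positive))
```

The toy numbers only show how routing by tissue type can avoid the characteristic error of each modality; they say nothing about the reported clinical performance.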
Authors:Mehran Ebrahimi, Masayuki Yano
Abstract:
We present an online-adaptive hyperreduced reduced basis element method for model order reduction of parameterized, component-based nonlinear systems. The method, in the offline phase, prepares a library of hyperreduced archetype components of various fidelity levels and, in the online phase, assembles the target system using instantiated components whose fidelity is adaptively selected to satisfy a user-prescribed system-level error tolerance. To achieve this, we introduce a hierarchical error estimation framework that compares solutions at successive fidelity levels and drives a local refinement strategy based on component-wise error indicators. We also provide an efficient estimator for the system-level error to ensure that the adaptive strategy meets the desired accuracy. Component-wise hyperreduction is performed using an empirical quadrature procedure, with the training accuracy guided by the Brezzi--Rappaz--Raviart theorem. The proposed method is demonstrated on a family of nonlinear thermal fin systems comprising up to 225 components and 68 parameters. Numerical results show that the hyperreduced basis element model achieves O(100) computational reduction at 1% error level relative to the truth finite-element model. In addition, the adaptive refinement strategy provides more effective error control than uniform refinement by selectively enriching components with higher local errors.
Authors:Sojun Ono, Kazuyuki Sugimura
Abstract:
We present a neural-network emulator for the thermal and chemical evolution in Population III star formation. The emulator accurately reproduces the thermochemical evolution over a wide density range spanning 21 orders of magnitude (10$^{-3}$-10$^{18}$ cm$^{-3}$), tracking six primordial species: H, H$_2$, e$^{-}$, H$^{+}$, H$^{-}$, and H$_2^{+}$. To handle the broad dynamic range, we partition the density range into five subregions and train separate deep operator networks (DeepONets) in each region. When applied to randomly sampled thermochemical states, the emulator achieves relative errors below 10% in over 90% of cases for both temperature and chemical abundances (except for the rare species H$_2^{+}$). The emulator is roughly ten times faster on a CPU and more than 1000 times faster for batched predictions on a GPU, compared with conventional numerical integration. Furthermore, to ensure robust predictions under many iterations, we introduce a novel timescale-based update method, where a short-timestep update of each variable is computed by rescaling the predicted change over a longer timestep equal to its characteristic variation timescale. In one-zone collapse calculations, the results from the timescale-based method agree well with traditional numerical integration even with many iterations at a timestep as short as 10$^{-4}$ of the free-fall time. This proof-of-concept study suggests the potential for neural network-based chemical emulators to accelerate hydrodynamic simulations of star formation.
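One possible reading of the timescale-based update is sketched below. The abstract does not give the formula, so the linear rescaling by `dt / horizon`, the fallback to a direct prediction when a timescale is shorter than the step, and the `predict(x, horizon)` interface standing in for the trained emulator are all assumptions.

```python
import numpy as np


def timescale_based_update(x, dt, tau, predict):
    """Advance the state x by a short step dt.

    For each variable i, query the predictor over a horizon equal to that
    variable's characteristic timescale tau[i], then rescale the predicted
    change linearly down to dt (assumed rescaling rule).

    predict(x, horizon) is a hypothetical stand-in for the emulator and must
    return the full state after `horizon`.
    """
    x = np.asarray(x, dtype=float)
    x_new = np.empty_like(x)
    for i, tau_i in enumerate(tau):
        # Assumed fallback: never use a horizon shorter than the step itself.
        horizon = max(float(tau_i), dt)
        change = predict(x, horizon)[i] - x[i]
        x_new[i] = x[i] + change * (dt / horizon)
    return x_new


if __name__ == "__main__":
    # Toy predictor: independent exponential decays with rates k (invented).
    k = np.array([1.0, 0.1])

    def toy_predict(x, horizon):
        return x * np.exp(-k * horizon)

    state = np.array([1.0, 2.0])
    for _ in range(100):
        state = timescale_based_update(state, dt=0.01, tau=1.0 / k, predict=toy_predict)
    print(state)
```

The appeal of this scheme is that the predictor is only ever asked for changes large enough to be resolved reliably, while the integration itself proceeds in small steps.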
Authors:Zhong Guo, Prabir Barooah
Abstract:
We describe a framework for modeling a central chilled water plant (CCWP) that consists of an aggregate cooling coil, a number of heterogeneous chillers and cooling towers, and a chilled water-based thermal energy storage system. We improve upon existing component models from the open literature using a constrained optimization-based framework to ensure that the models respect capacities of all the heat exchangers (cooling coils, chillers, and cooling towers) irrespective of the inputs provided. As a result, the proposed model has a wider range of validity compared to existing models; the latter can produce highly erroneous outputs when inputs are not within normal operating range. This feature is essential for training learning-based controllers that can choose inputs beyond normal operating conditions and is lacking in currently available models. The overall plant model is implemented in Matlab and is made publicly available. Simulation of a CCWP with closed loop control is provided as an illustration.
Authors:Cheng Li, Pengfei Dang, Yuehui Xian, Yumei Zhou, Bofeng Shi, Xiangdong Ding, Jun Sun, Dezhen Xue
Abstract:
The design of shape memory alloys (SMAs) with high transformation temperatures and large mechanical work output remains a longstanding challenge in functional materials engineering. Here, we introduce a data-driven framework based on generative adversarial network (GAN) inversion for the inverse design of high-performance SMAs. By coupling a pretrained GAN with a property prediction model, we perform gradient-based latent space optimization to directly generate candidate alloy compositions and processing parameters that satisfy user-defined property targets. The framework is experimentally validated through the synthesis and characterization of five NiTi-based SMAs. Among them, the Ni$_{49.8}$Ti$_{26.4}$Hf$_{18.6}$Zr$_{5.2}$ alloy achieves a high transformation temperature of 404 $^\circ$C, a large mechanical work output of 9.9 J/cm$^3$, a transformation enthalpy of 43 J/g, and a thermal hysteresis of 29 $^\circ$C, outperforming existing NiTi alloys. The enhanced performance is attributed to a pronounced transformation volume change and a fine dispersion of Ti$_2$Ni-type precipitates, enabled by sluggish Zr and Hf diffusion, and semi-coherent interfaces with localized strain fields. This study demonstrates that GAN inversion offers an efficient and generalizable route for the property-targeted discovery of complex alloys.
Authors:Shreshth A. Malik, Tiarnan A. S. Doherty, Benjamin Colmey, Stephen J. Roberts, Yarin Gal, Paul A. Midgley
Abstract:
High-fidelity electron microscopy simulations required for quantitative crystal structure refinements face a fundamental challenge: while physical interactions are well-described theoretically, real-world experimental effects are challenging to model analytically. To address this gap, we present a novel hybrid physics-machine learning framework that integrates differentiable physical simulations with neural networks. By leveraging automatic differentiation throughout the simulation pipeline, our method enables gradient-based joint optimization of physical parameters and neural network components representing experimental variables, offering superior scalability compared to traditional second-order methods. We demonstrate this framework through application to three-dimensional electron diffraction (3D-ED) structure refinement, where our approach learns complex thickness distributions directly from diffraction data rather than relying on simplified geometric models. This method achieves state-of-the-art refinement performance across synthetic and experimental datasets, recovering atomic positions, thermal displacements, and thickness profiles with high fidelity. The proposed modular architecture can naturally be extended to accommodate additional physical phenomena and to other electron microscopy techniques. This establishes differentiable hybrid modeling as a powerful new paradigm for quantitative electron microscopy, where experimental complexities have historically limited analysis.
Authors:Yinan Yu, Alex Gonzalez-Caceres, Samuel Scheidegger, Sanjay Somanath, Alexander Hollberg
Abstract:
Renovating existing buildings is essential for climate impact. Early-phase renovation planning requires simulations based on thermal 3D models at Level of Detail (LoD) 3, which include features like windows. However, scalable and accurate identification of such features remains a challenge. This paper presents the Scalable Image-to-3D Facade Parser (SI3FP), a pipeline that generates LoD3 thermal models by extracting geometries from images using both computer vision and deep learning. Unlike existing methods relying on segmentation and projection, SI3FP directly models geometric primitives in the orthographic image plane, providing a unified interface while reducing perspective distortions. SI3FP supports both sparse (e.g., Google Street View) and dense (e.g., hand-held camera) data sources. Tested on typical Swedish residential buildings, SI3FP achieved approximately 5% error in window-to-wall ratio estimates, demonstrating sufficient accuracy for early-stage renovation analysis. The pipeline facilitates large-scale energy renovation planning and has broader applications in urban development and planning.
Authors:Stephanie Wohlfahrt, Christoph Praschl, Horst Leitner, Wolfram Jantsch, Julia Konic, Silvio Schueler, Andreas Stöckl, David C. Schedl
Abstract:
We use unmanned aerial drones to estimate wildlife density in southeastern Austria and compare these estimates to camera trap data. Traditional methods like capture-recapture, distance sampling, or camera traps are well-established but labour-intensive or spatially constrained. Using thermal (IR) and RGB imagery, drones enable efficient, non-intrusive animal counting. Our surveys were conducted during the leafless period on single days in October and November 2024 in three areas of a sub-Illyrian hill and terrace landscape. Flight transects were based on predefined launch points using a 350 m grid and an algorithm that defined the direction of systematically randomized transects. This setup allowed surveying large areas in one day using multiple drones, minimizing double counts. Flight altitude was set at 60 m to avoid disturbing roe deer (Capreolus capreolus) while ensuring detection. Animals were manually annotated in the recorded imagery and extrapolated to densities per square kilometer. We applied three extrapolation methods with increasing complexity: naive area-based extrapolation, bootstrapping, and zero-inflated negative binomial modelling. For comparison, a Random Encounter Model (REM) estimate was calculated using camera trap data from the flight period. The drone-based methods yielded similar results, generally showing higher densities than REM, except in one area in October. We hypothesize that drone-based density reflects daytime activity in open and forested areas, while REM estimates average activity over longer periods within forested zones. Although both approaches estimate density, they offer different perspectives on wildlife presence. Our results show that drones offer a promising, scalable method for wildlife density estimation.
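The two simpler extrapolation methods named above, naive area-based extrapolation and bootstrapping, can be sketched as follows. The per-transect counts and areas are invented, and resampling whole transects with replacement is an assumed bootstrap design. The zero-inflated negative binomial model is not covered here.

```python
import numpy as np


def naive_density(counts, areas_km2):
    """Animals per square kilometre: total count divided by total surveyed area."""
    return float(np.sum(counts) / np.sum(areas_km2))


def bootstrap_density(counts, areas_km2, n_boot=2000, seed=0):
    """Resample transects with replacement (assumed design) and return
    (mean density, 2.5th percentile, 97.5th percentile)."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    areas = np.asarray(areas_km2, dtype=float)
    n = len(counts)
    idx = rng.integers(0, n, size=(n_boot, n))  # one row of transect indices per replicate
    samples = counts[idx].sum(axis=1) / areas[idx].sum(axis=1)
    lo, hi = np.percentile(samples, [2.5, 97.5])
    return float(samples.mean()), float(lo), float(hi)


if __name__ == "__main__":
    # Invented example: animals counted on 8 transects, each covering 0.1 km^2.
    counts = [3, 0, 1, 0, 2, 0, 0, 1]
    areas = [0.1] * 8
    print(naive_density(counts, areas))
    print(bootstrap_density(counts, areas))
```

The bootstrap adds an uncertainty interval to the same point estimate, which is useful when many transects contain no animals at all.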
Authors:Chunlin Wu, Liangliang Zhang, Tengxiang Wang, Huiming Yin
Abstract:
This paper proposes a single-domain dual-reciprocity inclusion-based boundary element method (DR-iBEM) for a three-dimensional fully bonded bi-layered composite embedded with ellipsoidal inhomogeneities under transient/harmonic thermal loads. The heat equation is interpreted as a static one containing time- and frequency-dependent nonhomogeneous source terms, which is similar to eigen-fields but is transformed into a boundary integral by the dual-reciprocity method. Using the steady-state bimaterial Green's function, boundary integral equations are proposed to take into account continuity conditions of temperature and heat flux, which avoids setting up any continuity equations at the bimaterial interface. Eigen-temperature-gradients and eigen-heat-source are introduced to simulate the material mismatch in thermal conductivity and heat capacity, respectively. The DR-iBEM algorithm is particularly suitable for investigating the transient and harmonic thermal behaviors of bi-layered composites and is verified by the finite element method (FEM). Numerical comparison with the FEM demonstrates its robustness and accuracy. The method has been applied to a functionally graded material as a bimaterial with graded particle distributions, where particle size and gradation effects are evaluated.
Authors:James Rhodes, Lawrence Ong, Duy T. Ngo
Abstract:
Monitoring respiratory health with the use of channel state information (CSI) has shown promising results. Many existing methods focus on monitoring only the respiratory rate, while others focus on monitoring the motion of the chest as a patient breathes, which is referred to as the respiratory waveform. This paper presents WiRM, a two-staged approach to contactless respiration monitoring. In the first stage, WiRM improves upon existing respiratory rate estimation techniques by using conjugate multiplication for phase sanitisation and adaptive multi-trace carving (AMTC) for tracing how the respiratory rate changes over time. When compared against three state-of-the-art methods, WiRM has achieved an average reduction of $38\%$ in respiratory rate root mean squared error (RMSE). In the second stage, WiRM uses this improved respiratory rate estimate to inform the decomposition and selection of the respiratory waveform from the CSI data. Remarkably, WiRM delivers a $178.3\%$ improvement in average absolute correlation with the ground truth respiratory waveform. Within the literature, it is difficult to compare the robustness of existing algorithms in noisy environments. In this paper, we develop a purpose-built simulation toolkit to evaluate the robustness of respiration monitoring solutions under various noise conditions, including thermal, multiplicative, and phase noise. Our results show that WiRM demonstrates improved or comparable resilience to these common noise sources.
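The idea behind conjugate multiplication for phase sanitisation can be sketched as follows. The sketch assumes two receive antennas that share one oscillator, so that each packet carries the same random phase offset on both antennas. The signal model, amplitudes, and rates are hypothetical and are not taken from WiRM.

```python
import cmath
import math
import random


def conjugate_multiply(csi_a, csi_b):
    """Multiply one antenna's CSI by the conjugate of the other's, packet by packet.

    A phase offset common to both antennas cancels in the product.
    """
    return [a * b.conjugate() for a, b in zip(csi_a, csi_b)]


rng = random.Random(0)
n_packets = 200
breathing_hz = 0.25    # hypothetical breathing rate (15 breaths per minute)
packet_rate_hz = 20.0  # hypothetical CSI sampling rate

# Phase modulation caused by chest motion, seen on antenna A only in this toy model.
true_phase = [
    0.5 * math.sin(2 * math.pi * breathing_hz * k / packet_rate_hz)
    for k in range(n_packets)
]
# Random per-packet phase offset shared by both antennas.
offsets = [rng.uniform(-math.pi, math.pi) for _ in range(n_packets)]

csi_a = [cmath.rect(1.0, p + o) for p, o in zip(true_phase, offsets)]
csi_b = [cmath.rect(0.8, o) for o in offsets]

recovered = [cmath.phase(z) for z in conjugate_multiply(csi_a, csi_b)]
max_error = max(abs(r - p) for r, p in zip(recovered, true_phase))
print(f"largest phase error after sanitisation: {max_error:.2e} rad")
```

The raw phase of `csi_a` is dominated by the random offsets, while the phase of the product follows the breathing signal.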
Authors:Bingjia Xiao, Tao Chen, Wenbin Zhang, Xin Qian, Puqing Jiang
Abstract:
Frequency-domain thermoreflectance (FDTR) is a widely used technique for characterizing thermal properties of multilayer thin films. However, extracting multiple parameters from FDTR measurements presents a nonlinear inverse problem due to its high dimensionality and multimodal, non-convex solution space. This study evaluates four popular global optimization algorithms: Genetic Algorithm (GA), Quantum Genetic Algorithm (QGA), Particle Swarm Optimization (PSO), and Fireworks Algorithm (FWA), for extracting parameters from FDTR measurements of a GaN/Si heterostructure. However, none achieve reliable convergence within 60 seconds. To improve convergence speed and accuracy, we propose an AI-driven hybrid optimization framework that combines each global algorithm with a Quasi-Newton local refinement method, resulting in four hybrid variants: HGA, HQGA, HPSO, and HFWA. Among these, HPSO outperforms all other methods, with 80% of trials reaching the target fitness value within 60 seconds, showing greater robustness and a lower risk of premature convergence. In contrast, only 30% of HGA and HQGA trials and 20% of HFWA trials achieve this threshold. We then evaluate the worst-case performance across 100 independent trials for each algorithm when the time is extended to 1000 seconds. Only HPSO, PSO, and HGA consistently reach the target accuracy, with HPSO converging five times faster than the others. HPSO provides a general-purpose solution for inverse problems in thermal metrology and can be readily extended to other model-fitting techniques.
Authors:Sheikh Md Shakeel Hassan, Xianwei Zou, Akash Dhruv, Vishwanath Ganesan, Aparna Chandramowlishwaran
Abstract:
Modeling boiling (an inherently chaotic, multiphase process central to energy and thermal systems) remains a significant challenge for neural PDE surrogates. Existing models require future input (e.g., bubble positions) during inference because they fail to learn nucleation from past states, limiting their ability to autonomously forecast boiling dynamics. They also fail to model flow boiling velocity fields, where sharp interface-momentum coupling demands long-range and directional inductive biases. We introduce Bubbleformer, a transformer-based spatiotemporal model that forecasts stable and long-range boiling dynamics including nucleation, interface evolution, and heat transfer without dependence on simulation data during inference. Bubbleformer integrates factorized axial attention, frequency-aware scaling, and conditions on thermophysical parameters to generalize across fluids, geometries, and operating conditions. To evaluate physical fidelity in chaotic systems, we propose interpretable physics-based metrics that evaluate heat-flux consistency, interface geometry, and mass conservation. We also release BubbleML 2.0, a high-fidelity dataset that spans diverse working fluids (cryogens, refrigerants, dielectrics), boiling configurations (pool and flow boiling), flow regimes (bubbly, slug, annular), and boundary conditions. Bubbleformer sets new benchmark results in both prediction and forecasting of two-phase boiling flows.
Authors:Mufakir Qamar Ansari, Mudabir Qamar Ansari
Abstract:
The paradigm shift towards multi-core and heterogeneous computing, driven by the fundamental power and thermal limits of single-core processors, has established energy efficiency as a first-class design constraint in high-performance computing (HPC). Heterogeneous systems, integrating traditional multi-core CPUs with specialized accelerators like discrete (dGPU) and integrated (iGPU) graphics processing units, offer a compelling path to navigating the trade-offs between performance and power. However, quantifying these trade-offs on widely accessible hardware remains a critical area of study. This paper presents a direct, empirical measurement of the performance and energy-to-solution of a canonical HPC workload -- a 4096x4096 matrix-matrix multiplication -- on three distinct compute architectures within a single consumer-grade laptop: a multi-core AMD Ryzen 7 5800H CPU, a discrete NVIDIA GeForce GTX 1650 GPU, and an integrated AMD Radeon Vega GPU. Using standard, validated, and minimally intrusive tools such as Linux perf and nvidia-smi, we find that the discrete GPU is not only the performance leader, achieving a 93.5x speedup over the CPU, but is also the most energy-efficient, consuming only 2% of the energy used by the CPU, resulting in a 50-fold improvement in energy efficiency. These findings provide a practical demonstration of the "race to idle" principle and offer clear, quantitative guidance on architectural choices for energy-aware software development.
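The arithmetic linking the quoted figures can be checked with a short sketch. It assumes energy-to-solution is average power times runtime and that the speedup and energy figures describe the same runs. The function names and the 45 W example are hypothetical, and the quoted inputs are rounded, so the derived power ratio is approximate.

```python
def energy_to_solution(avg_power_w: float, runtime_s: float) -> float:
    """Energy in joules for one run: average power (W) times wall-clock time (s)."""
    return avg_power_w * runtime_s


def efficiency_gain(baseline_energy_j: float, candidate_energy_j: float) -> float:
    """How many times less energy the candidate needs for the same work."""
    return baseline_energy_j / candidate_energy_j


# Hypothetical reading: 45 W average over a 2 s run.
example_j = energy_to_solution(45.0, 2.0)

# Figures quoted in the abstract (rounded).
speedup = 93.5          # discrete GPU versus CPU
energy_fraction = 0.02  # GPU energy as a fraction of CPU energy

cpu_energy = 1.0  # normalised CPU energy-to-solution
gpu_energy = energy_fraction * cpu_energy
gain = efficiency_gain(cpu_energy, gpu_energy)

# E = P * t, so E_gpu / E_cpu = (P_gpu / P_cpu) / speedup.
implied_power_ratio = energy_fraction * speedup

print(f"example run: {example_j:.0f} J")
print(f"energy-efficiency gain: {gain:.0f}x")
print(f"implied average power ratio (GPU/CPU): about {implied_power_ratio:.2f}")
```

Under these assumptions the GPU would draw somewhat more average power than the CPU but finish so much sooner that its total energy is far lower, which is the "race to idle" effect the abstract refers to.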
Authors:Leo Guo, Adwait Inamdar, Willem D. van Driel, GuoQi Zhang
Abstract:
Solder joint reliability related to failures due to thermomechanical loading is a critically important yet physically complex engineering problem. As a result, simulated behavior is oftentimes computationally expensive. In an increasingly data-driven world, the usage of efficient data-driven design schemes is a popular choice. Among them, Bayesian optimization (BO) with Gaussian process regression is one of the most important representatives. The authors argue that computational savings can be obtained from exploiting thorough surrogate modeling and selecting a design candidate based on multiple acquisition functions. This is feasible due to the relatively low computational cost, compared to the expensive simulation objective. This paper addresses the shortcomings in the adjacent literature by providing and implementing a novel heuristic framework to perform BO with adaptive hyperparameters across the various optimization iterations. Adaptive BO is subsequently compared to regular BO when faced with synthetic objective minimization problems. The results show the efficiency of adaptive BO when compared with the worst-performing regular Bayesian schemes. As an engineering use case, the solder joint reliability problem is tackled by minimizing the accumulated non-linear creep strain under a cyclic thermal load. Results show that adaptive BO outperforms regular BO by 3% on average at any given computational budget threshold, critically saving half of the computational expense budget. This practical result underlines the methodological potential of the adaptive Bayesian data-driven methodology to achieve better results and cut optimization-related expenses. Lastly, in order to promote the reproducibility of the results, the data-driven implementations are made available on an open-source basis.
Authors:Jan M. Nordbotten, Martin A. Fernø, Bernd Flemisch, Anthony R. Kovscek, Knut-Andreas Lie, Jakub W. Both, Olav Møyner, Tor Harald Sandve, Etienne Ahusborde, Sebastian Bauer, Zhangxing Chen, Holger Class, Chaojie Di, Didier Ding, David Element, Abbas Firoozabadi, Eric Flauraud, Jacques Franc, Firdovsi Gasanzade, Yousef Ghomian, Marie Ann Giddins, Christopher Green, Bruno R. B. Fernandes, George Hadjisotiriou, Glenn Hammond, Hai Huang, Dickson Kachuma, Michel Kern, Timo Koch, Prasanna Krishnamurthy, Kjetil Olsen Lye, David Landa-Marbán, Michael Nole, Paolo Orsini, Nicolas Ruby, Pablo Salinas, Mohammad Sayyafzadeh, Jakub Solovský, Jakob Torben, Adam Turner, Denis V. Voskov, Kai Wendel, AbdAllah A. Youssef
Abstract:
The 11th Society of Petroleum Engineers Comparative Solution Project (shortened SPE11 herein) benchmarked simulation tools for geological carbon dioxide (CO$_2$) storage. A total of 45 groups from leading research institutions and industry across the globe signed up to participate, with 18 ultimately contributing valid results that were included in the comparative study reported here.
This paper summarizes the SPE11. A comprehensive introduction and qualitative discussion of the submitted data are provided, together with an overview of online resources for accessing the full depth of data. A global metric for analyzing the relative distance between submissions is proposed and used to conduct a quantitative analysis of the submissions. This analysis attempts to statistically resolve the key aspects influencing the variability between submissions.
The study shows that the major qualitative variation between the submitted results is related to thermal effects, dissolution-driven convective mixing, and resolution of facies discontinuities. Moreover, a strong dependence on grid resolution is observed across all three versions of the SPE11. However, our quantitative analysis suggests that the observed variations are predominantly influenced by factors not documented in the technical responses provided by the participants. We therefore identify that unreported variations due to human choices within the process of setting up, conducting, and reporting on the simulations underlying each SPE11 submission are at least as impactful as the computational choices reported.
Authors:Satwik Dutta, Shruthigna Chandupatla, John Hansen
Abstract:
Reliance on cloud providers for ASR inference to support child-centered voice-based applications is becoming challenging due to regulatory and privacy concerns. Motivated by a privacy-preserving design, this study aims to develop a lightweight and efficient Whisper ASR system capable of running on a Raspberry Pi. Upon evaluation on the MyST corpus and by examining various filtering strategies to fine-tune the 'tiny.en' model, a Word Error Rate (WER) of 15.9% was achieved (11.8% filtered). A low-rank compression reduces the encoder size by 0.51M with 1.26x faster inference on GPU, at the cost of an 11% relative WER increase. During inference on the Pi, the compressed version required ~2 GFLOPS fewer computations. The RTF for both models ranged from 0.23 to 0.41 for various input audio durations. Analyzing the RAM usage and CPU temperature showed that the Pi was capable of handling both tiny models; however, small models incurred additional overhead/thermal throttling.
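The two metrics reported above follow standard definitions, sketched below: WER is the word-level edit distance divided by the number of reference words, and the real-time factor (RTF) is processing time divided by audio duration. The function names and the example sentences and timings are hypothetical. This is not the evaluation code used in the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_s: float, audio_s: float) -> float:
    """Values below 1 mean the system transcribes faster than real time."""
    return processing_s / audio_s


# Hypothetical example: one substitution and one deletion over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
rtf = real_time_factor(processing_s=2.3, audio_s=10.0)
print(f"WER = {wer:.3f}, RTF = {rtf:.2f}")
```

An RTF in the reported 0.23 to 0.41 range means the device needs roughly a quarter to two fifths of the audio's duration to transcribe it.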
Authors:Shantanav Chakraborty, Soonwon Choi, Soumik Ghosh, Tudor Giurgică-Tiron
Abstract:
Deep thermalization refers to the emergence of Haar-like randomness from quantum systems upon partial measurements. As a generalization of quantum thermalization, it is often associated with high complexity and entanglement. Here, we introduce computational deep thermalization and construct the fastest possible dynamics exhibiting it at infinite effective temperature. Our circuit dynamics produce quantum states with low entanglement in polylogarithmic depth that are indistinguishable from Haar random states to any computationally bounded observer. Importantly, the observer is allowed to request many copies of the same residual state obtained from partial projective measurements on the state -- this condition is beyond the standard settings of quantum pseudorandomness, but natural for deep thermalization. In cryptographic terms, these states are pseudorandom, pseudoentangled, and crucially, retain these properties under local measurements. Our results demonstrate a new form of computational thermalization, where thermal-like behavior arises from structured quantum states endowed with cryptographic properties, instead of from highly unstructured ensembles. The low resource complexity of preparing these states suggests scalable simulations of deep thermalization using quantum computers. Our work also motivates the study of computational quantum pseudorandomness beyond BQP observers.
Authors:Vivek Teja Tanjavooru, Prashant Pant, Thomas Hamacher, Holger Hesse
Abstract:
This paper presents a mixed-integer, nonlinear, multi-objective optimization strategy for optimal power allocation among parallel strings in Battery Energy Storage Systems (BESS). High-fidelity control is achieved by co-simulating the optimizer with a BESS electro-thermal simulation that models spatial thermal dynamics of the battery, providing real-time State of Charge (SOC) and temperature feedback. The optimizer prioritizes reliability by enforcing power availability as a hard constraint and penalizing battery thermal derating. Within these bounds, the controller performs a Pareto sweep on the relative weights of inverter and battery losses to balance the trade-off between inverter efficiency and battery efficiency. The inverter loss model is based on an empirical lookup table (LUT) derived from a commercial inverter system, while the battery thermal loss model uses SOC and temperature-dependent internal resistance, with electric current computed from the battery Equivalent Circuit Model (ECM). When the optimization was applied to a two-string BESS, the competing effects of inverter and battery losses on system availability and thermal derating were observed. The balanced operation yielded improvements of 1% in battery efficiency, 1.5% in inverter efficiency, and 2% in derating efficiency, while maintaining higher availability. Additionally, a 5 °C reduction in BESS peak temperature suggests reduced thermal stress without compromising availability.
Authors:Duong Nguyen-Ngoc Tran, Long Hoang Pham, Chi Dai Tran, Quoc Pham-Nam Ho, Huy-Hung Nguyen, Jae Wook Jeon
Abstract:
Multi-Object Tracking in thermal images is essential for surveillance systems, particularly in challenging environments where RGB cameras struggle due to low visibility or poor lighting conditions. Thermal sensors enhance recognition tasks by capturing infrared signatures, but a major challenge is their low-level feature representation, which makes it difficult to accurately detect and track pedestrians. To address this, the paper introduces a novel tuning method for pedestrian tracking, specifically designed to handle the complex motion patterns in thermal imagery. The proposed framework optimizes two stages, ensuring that each stage is tuned with the most suitable hyperparameters to maximize tracking performance. By fine-tuning hyperparameters for real-time tracking, the method achieves high accuracy without relying on complex reidentification or motion models. Extensive experiments on the PBVS Thermal MOT dataset demonstrate that the approach is highly effective across various thermal camera conditions, making it a robust solution for real-world surveillance applications.
Authors:Aaron C. Davis, Siting Zhang, Adalyn Meeks, Diya Sakhrani, Luis Carlos Sanjuan Acosta, D. Ethan Kelley, Emma Caldwell, Luis Solorio, Craig J. Goergen, David J. Cappelleri
Abstract:
This paper presents innovative designs for 3D-printed tumbling microrobots, specifically engineered for targeted in vivo drug delivery applications. The microrobot designs, created using stereolithography 3D printing technologies, incorporate permanent micro-magnets to enable actuation via a rotating magnetic field actuator system. The experimental framework encompasses a series of locomotion characterization tests to evaluate microrobot performance under various conditions. Testing variables include variations in microrobot geometries, actuation frequencies, and environmental conditions, such as dry and wet environments, and temperature changes. The paper outlines designs for three drug loading methods, along with comprehensive assessments of thermal drug release using a focused ultrasound system, as well as biocompatibility tests. Animal model testing involves tissue phantoms and in vivo rat models, ensuring a thorough evaluation of the microrobots' performance and compatibility. The results highlight the robustness and adaptability of the proposed microrobot designs, showcasing the potential for efficient and targeted in vivo drug delivery. This novel approach addresses current limitations in existing tumbling microrobot designs and paves the way for advancements in targeted drug delivery within the large intestine.
Authors:Tingting Zhou, Feng Zhang, Haoyang Fu, Baoxiang Pan, Renhe Zhang, Feng Lu, Zhixin Yang
Abstract:
The visible light reflectance data from geostationary satellites is crucial for meteorological observations and plays an important role in weather monitoring and forecasting. However, due to the lack of visible light at night, it is impossible to conduct continuous all-day weather observations using visible light reflectance data. This study pioneers the use of generative diffusion models to address this limitation. Based on the multi-band thermal infrared brightness temperature data from the Advanced Geostationary Radiation Imager (AGRI) onboard the Fengyun-4B (FY4B) geostationary satellite, we developed a high-precision visible light reflectance generative model, called Reflectance Diffusion (RefDiff), which enables nighttime generation of visible light reflectance in the 0.47 μm, 0.65 μm, and 0.825 μm bands. Compared to the classical models, RefDiff not only significantly improves accuracy through ensemble averaging but also provides uncertainty estimation. Specifically, the SSIM index of RefDiff can reach 0.90, with particularly significant improvements in areas with complex cloud structures and thick clouds. The model's nighttime generation capability was validated using the VIIRS nighttime product, demonstrating comparable performance to its daytime counterpart. In summary, this research has made substantial progress in the ability to generate visible light reflectance at night, with the potential to expand the application of nighttime visible light data.
Authors:Di Zhang, Ligang Liu
Abstract:
We present a rigorous asymptotic analysis framework for investigating the thermal conductivity of shell lattice metamaterials, extending prior work from mechanical stiffness to heat transfer. Central to our analysis is a new metric, the asymptotic directional conductivity (ADC), which captures the leading-order influence of the middle surface geometry on the effective thermal conductivity in the vanishing-thickness limit. A convergence theorem is established for evaluating ADC, along with a sharp upper bound and the necessary and sufficient condition for achieving this bound. These results provide the first theoretical justification for the optimal thermal conductivity of triply periodic minimal surfaces. Furthermore, we show that ADC yields a third-order approximation to the effective conductivity of shell lattices at low volume fractions. To support practical design applications, we develop a discrete algorithm for computing and optimizing ADC over arbitrary periodic surfaces. Numerical results confirm the theoretical predictions and demonstrate the robustness and effectiveness of the proposed optimization algorithm.
Authors:Augustine Twumasi, Prokash Chandra Roy, Zixun Li, Soumya Shouvik Bhattacharjee, Zhengtao Gan
Abstract:
Laser powder bed fusion (L-PBF) is a widely recognized additive manufacturing technology for producing intricate metal components with exceptional accuracy. A key challenge in L-PBF is the formation of complex microstructures affecting product quality. We propose a physics-guided, machine-learning approach to optimize scan paths for desired microstructure outcomes, such as equiaxed grains. We utilized a phase-field method (PFM) to model crystalline grain structure evolution. To reduce computational costs, we trained a surrogate machine learning model, a 3D U-Net convolutional neural network, using single-track phase-field simulations with various laser powers to predict crystalline grain orientations based on initial microstructure and thermal history. We investigated three scanning strategies across various hatch spacings within a square domain, achieving a two-orders-of-magnitude speedup using the surrogate model. To reduce trial and error in designing laser scan toolpaths, we used deep reinforcement learning (DRL) to generate optimized scan paths for a target microstructure. Results from three cases demonstrate the DRL approach's effectiveness. We integrated the surrogate 3D U-Net model into our DRL environment to accelerate the reinforcement learning training process. The reward function minimizes both aspect ratio and grain volume of the predicted microstructure from the agent's scan path. The reinforcement learning algorithm was benchmarked against a conventional zigzag approach for smaller and larger domains, showing machine learning methods' potential to enhance microstructure control and computational efficiency in L-PBF optimization.
Authors:Cyrill Bösch, Geoffrey Roeder, Marc Serra-Garcia, Ryan P. Adams
Abstract:
We show that the out-of-equilibrium driving protocol of score-based generative models (SGMs) can be learned via local learning rules. The gradient with respect to the parameters of the driving protocol is computed directly from force measurements or from observed system dynamics. As a demonstration, we implement an SGM in a network of driven, nonlinear, overdamped oscillators coupled to a thermal bath. We first apply it to the problem of sampling from a mixture of two Gaussians in 2D. Finally, we train a 12x12 oscillator network on the MNIST dataset to generate images of handwritten digits 0 and 1.
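For intuition about the two-Gaussian sampling task, the sketch below runs overdamped Langevin dynamics in 2D using the analytic score of an equal-weight mixture. It does not implement the learned driving protocol or the local learning rules described above. The mixture means, width, step size, and step count are all hypothetical choices.

```python
import math
import random

MEANS = [(-2.0, 0.0), (2.0, 0.0)]  # hypothetical mode centres
SIGMA = 0.5                        # hypothetical isotropic standard deviation


def score(x: float, y: float):
    """Gradient of the log density of an equal-weight isotropic Gaussian mixture."""
    logits = [-((x - mx) ** 2 + (y - my) ** 2) / (2 * SIGMA ** 2) for mx, my in MEANS]
    shift = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in logits]
    total = sum(weights)
    sx = sum(w * (mx - x) for w, (mx, _) in zip(weights, MEANS)) / (total * SIGMA ** 2)
    sy = sum(w * (my - y) for w, (_, my) in zip(weights, MEANS)) / (total * SIGMA ** 2)
    return sx, sy


def langevin_sample(rng: random.Random, n_steps: int = 500, dt: float = 0.01):
    """Euler-Maruyama integration of dx = score(x) dt + sqrt(2 dt) * noise."""
    x, y = rng.gauss(0.0, 3.0), rng.gauss(0.0, 3.0)  # broad initial condition
    noise_scale = math.sqrt(2 * dt)
    for _ in range(n_steps):
        sx, sy = score(x, y)
        x += sx * dt + noise_scale * rng.gauss(0.0, 1.0)
        y += sy * dt + noise_scale * rng.gauss(0.0, 1.0)
    return x, y


rng = random.Random(0)
samples = [langevin_sample(rng) for _ in range(200)]
left_fraction = sum(1 for x, _ in samples if x < 0) / len(samples)
print(f"fraction of samples in the left mode: {left_fraction:.2f}")
```

In a score-based generative model the score is not known in closed form; it is the quantity that has to be learned, which is the part addressed by the local learning rules in the abstract.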
Authors:Dan Sturm, Marzieyh Rezaei, Alana Dee, Sajjad Moazeni
Abstract:
Co-packaged optics (CPO) has emerged as a promising solution for achieving the ultra-high bandwidths, shoreline densities, and energy efficiencies required by future GPUs and network switches for AI. Microring modulators (MRMs) are well suited for transmitters due to their compact size, high energy efficiency, and natural compatibility with dense wavelength-division multiplexing (DWDM). However, extending beyond the recently demonstrated 200 Gb/s will require more advanced modulation formats, such as higher-order coherent modulation (e.g., QAM-16).
In this work, we show how MRMs can be efficiently used to implement phase-constant amplitude modulators and form the building blocks of a transmitter for offset QAM-16, which has been shown to simplify carrier-phase recovery relative to conventional QAM. We simulate and evaluate the performance of our proposed MRM-based coherent CPO (C2PO) transmitters using a foundry-provided commercial silicon photonics process, demonstrating an input-normalized electric field amplitude contrast of 0.64 per dimension. Through full link-level bit error rate modeling, we show that our design achieves 400 Gb/s using offset QAM-16 at a total optical laser power of 9.65 dBm, comparable to that required by conventional QAM-16 MZI-based links, despite using 10-100x less area. We further conduct a thermal simulation to assess the transmitter's thermal stability at the MRM input optical power required to meet a target BER at the desired data rates. Finally, as a proof of concept, we demonstrate 25 Gb/s MRM-based offset QAM-4 modulation with a chip fabricated in the GlobalFoundries 45 nm monolithic silicon photonics process.
Authors:Mostafa A. Atalla, Jelte Nieuwenhuis, Alan Martin, Xuan Wang, Ahranee Canden, Matt J. Carré, Roger Lewis, Aimée Sakes, Michaël Wiertlewski
Abstract:
Transluminal minimally invasive surgery uses natural orifices and small incisions to access internal anatomical structures, promoting quicker recovery and reduced morbidity. However, navigating instruments--catheters and endoscopes--through anatomical pathways creates frictional interactions with luminal walls, risking complications such as perforation, poor haptic feedback, and instrument buckling. In this paper, we present a new approach to actively lubricate transluminal instruments and dynamically reduce friction with surrounding tissues. This approach employs ultrasonic vibrations, at the instrument surface, to generate a pressurized fluid layer at the contact interface, lubricating the interface and thereby reducing friction. We implemented this approach in a prototype catheter, which we validated under dry and liquid-lubricated conditions, across rigid and soft interfaces, and along varied anatomical curvatures. In a cardiac catheter use case, active lubrication reduced friction by up to 42% on ex-vivo porcine aorta tissue and 82% on rigid substrates, denoting its potential performance on healthy and calcified tissue, respectively. Thermal imaging confirmed that temperature at the tissue-catheter interface remained within safe limits. Additionally, the system effectively prevented buckling during a catheter insertion experiment, further showcasing its potential. By minimizing injury risk and enhancing procedural stability, active lubrication can drastically enhance the safety and efficacy of transluminal interventions.
Authors:Aditi Tiwari, Farzaneh Masoud, Dac Trong Nguyen, Jill Kraft, Heng Ji, Klara Nahrstedt
Abstract:
Modern AI systems struggle most in environments where reliability is critical - scenes with smoke, poor visibility, and structural deformation. Each year, tens of thousands of firefighters are injured on duty, often due to breakdowns in situational perception. We introduce Fire360, a benchmark for evaluating perception and reasoning in safety-critical firefighting scenarios. The dataset includes 228 360-degree videos from professional training sessions under diverse conditions (e.g., low light, thermal distortion), annotated with action segments, object locations, and degradation metadata. Fire360 supports five tasks: Visual Question Answering, Temporal Action Captioning, Object Localization, Safety-Critical Reasoning, and Transformed Object Retrieval (TOR). TOR tests whether models can match pristine exemplars to fire-damaged counterparts in unpaired scenes, evaluating transformation-invariant recognition. While human experts achieve 83.5% on TOR, models like GPT-4o lag significantly, exposing failures in reasoning under degradation. By releasing Fire360 and its evaluation suite, we aim to advance models that not only see, but also remember, reason, and act under uncertainty. The dataset is available at: https://uofi.box.com/v/fire360dataset.
Authors:Sonakshi Gupta, Akhlak Mahmood, Shivank Shukla, Rampi Ramprasad
Abstract:
Machine learning has revolutionized polymer science by enabling rapid property prediction and generative design. Large language models (LLMs) offer further opportunities in polymer informatics by simplifying workflows that traditionally rely on large labeled datasets, handcrafted representations, and complex feature engineering. LLMs leverage natural language inputs through transfer learning, eliminating the need for explicit fingerprinting and streamlining training. In this study, we finetune general purpose LLMs -- open-source LLaMA-3-8B and commercial GPT-3.5 -- on a curated dataset of 11,740 entries to predict key thermal properties: glass transition, melting, and decomposition temperatures. Using parameter-efficient fine-tuning and hyperparameter optimization, we benchmark these models against traditional fingerprinting-based approaches -- Polymer Genome, polyGNN, and polyBERT -- under single-task (ST) and multi-task (MT) learning. We find that while LLM-based methods approach traditional models in performance, they generally underperform in predictive accuracy and efficiency. LLaMA-3 consistently outperforms GPT-3.5, likely due to its tunable open-source architecture. Additionally, ST learning proves more effective than MT, as LLMs struggle to capture cross-property correlations, a key strength of traditional methods. Analysis of molecular embeddings reveals limitations of general purpose LLMs in representing nuanced chemo-structural information compared to handcrafted features and domain-specific embeddings. These findings provide insight into the interplay between molecular embeddings and natural language processing, guiding LLM selection for polymer informatics.
Authors:Yanpei Shi, Bo Feng, Yuxin Zhong, Haochen Guo, Bangcheng Han, Rui Feng
Abstract:
Thermally induced laser noise poses a critical limitation to the sensitivity of quantum sensor arrays employing ultra-stable amplified lasers, primarily stemming from nonlinear gain-temperature coupling effects in tapered amplifiers (TAs). To address this challenge, we present a robust intelligent control strategy that synergistically integrates an encoder-decoder physics-informed gated recurrent unit (PI-GRU) network with a model predictive control (MPC) framework. Our methodology incorporates physical soft constraints into the neural network architecture, yielding a predictive model with enhanced physical consistency that demonstrates robust extrapolation capabilities beyond the training data distribution. Leveraging the PI-GRU model's accurate multi-step predictive performance, we implement a hierarchical parallel MPC architecture capable of real-time thermal instability compensation. This hybrid approach achieves cross-domain consistent thermal stabilization in TAs under diverse laser power operations. Remarkably, while trained exclusively on low-power operational data, our system demonstrates exceptional generalization, improving prediction accuracy by 58.2% and temperature stability by 69.1% in previously unseen high-power operating regimes, as experimentally validated. The novel synchronization of physics-informed neural networks with advanced MPC frameworks presented in this work establishes a groundbreaking paradigm for addressing robustness challenges in cross-domain predictive control applications, overcoming conventional modeling limitations.
Authors:Juan Angelo Vargas-Fajardo, Diana Manvelyan-Stroot, Catharina Czech, Pietro Botazzoli, Fabian Duddeck
Abstract:
High temperatures and structural deformations can compromise the functionality and reliability of new components for mechatronic systems. Therefore, high-fidelity simulations (HFS) are employed during the design process, as they enable a detailed analysis of the thermal and structural behavior of the system. However, such simulations are both computationally expensive and tedious, particularly during iterative optimization procedures. Establishing a parametric reduced order model (pROM) can accelerate the design's optimization if the model can accurately predict the behavior over a wide range of material and geometric properties. However, many existing methods exhibit limitations when applied to wide design ranges.
In this work, we introduce the parametric Box Reduction (pBR) method, a matrix interpolation technique that minimizes the non-physical influence of training points due to the large parameter ranges. For this purpose, we define a new interpolation function that computes a local weight for each design variable and integrates them into the global function. Furthermore, we develop an intuitive clustering technique to select the training points for the model, avoiding numerical artifacts from distant points. Additionally, these two strategies do not require normalizing the parameter space and handle every property equally. The effectiveness of the pBR method is validated through two physical applications: structural deformation of a cantilever Timoshenko beam and heat transfer of a power module of a power converter. The results demonstrate that the pBR approach can accurately capture the behavior of mechatronic components across large parameter ranges without sacrificing computational efficiency.
Authors:A. Ashok, A. Cabrera, S. Baje, A. Zambanini, K. Allinger, A. Bahr, S. van Waasen
Abstract:
A universal quantum computer (QC), though promising groundbreaking solutions to complex problems, still faces several challenges with respect to scalability. Current state-of-the-art QCs use a large number of cables to connect the physical qubits, located at cryogenic temperatures, to room-temperature electronics. Integrating cryogenic electronics with semiconductor spin qubits is one step closer to scalability. Such a scalable quantum computer can have the qubits and the control electronics at the 4K stage. At 4K, more thermal dissipation is allowed without overloading the cooling capability of the fridge. Still, control and power circuitry is expected to be highly efficient. While commercial CMOS technologies are found to be operable at mK temperatures, the lack of reliable cryogenic models during design and the increased mismatch at cryogenic temperatures make the design challenging and risky. Using an FDSOI technology with back-gate biasing to compensate for the threshold voltage drift that occurs at cryogenic temperatures (compensating around 200mV), together with digital circuitry, is a way to address this challenge. In this work, a self-clocked digital low-dropout regulator (DLDO) is designed in FDSOI as a highly power-efficient, variation-tolerant regulator to supply cryogenic circuits for quantum computing. The proposed digital LDO is more resilient to mismatch, and its self-clocking and close and fine loops address power efficiency and provide a faster transient response.
Authors:Xue Cui, Vincent Gbouna Zakka, Minhyun Lee
Abstract:
Occupancy plays an essential role in influencing the energy consumption and operation of heating, ventilation, and air conditioning (HVAC) systems. Traditional HVAC systems typically operate on fixed schedules without considering occupancy. Advanced occupant-centric control (OCC) uses occupancy status to regulate HVAC operations. RGB images combined with computer vision (CV) techniques are widely used for occupancy detection; however, the detailed facial and body features they capture raise significant privacy concerns. Low-resolution thermal images offer a non-invasive solution that mitigates privacy issues. The study developed an occupancy detection model utilizing low-resolution thermal images and CV techniques, where transfer learning was applied to fine-tune the You Only Look Once version 5 (YOLOv5) model. The developed model ultimately achieved satisfactory performance, with precision, recall, mAP50, and mAP50 values approaching 1.000. The contributions of this model lie not only in mitigating privacy concerns but also in reducing computing resource demands.
Authors:Ruiyue Huang, Claire E. Heaney, Maarten van Reeuwijk
Abstract:
The Neural Networks for Partial Differential Equations (NN4PDEs) approach is used to determine the parameters of a simple land-surface model using PyTorch's backpropagation engine. In order to test the inverse model, a synthetic dataset is created by running the model in forward mode with known parameter values to create soil temperature time series that can be used as observations for the inverse model. We show that it is not possible to obtain a reliable parameter estimation using a single observed soil temperature time series. Using measurements at two depths, reliable parameter estimates can be obtained although it is not possible to differentiate between latent and sensible heat fluxes. We apply the inverse model to urban flux tower data in Phoenix, United States, and show that the thermal conductivity, volumetric heat capacity, and the combined sensible-latent heat transfer coefficient can be reliably estimated using an observed value for the effective surface albedo. The resulting model accurately predicts the outgoing longwave radiation, conductive soil fluxes and the combined sensible-latent heat fluxes.
Authors:Weihua Yang, Yicong Zhou
Abstract:
Visible images provide rich details and color information only under well-lighted conditions while infrared images effectively highlight thermal targets under challenging conditions such as low visibility and adverse weather. Infrared-visible image fusion aims to integrate complementary information from infrared and visible images to generate a high-quality fused image. Existing methods exhibit critical limitations such as neglecting color structure information in visible images and performance degradation when processing low-quality color-visible inputs. To address these issues, we propose a quaternion infrared-visible image fusion (QIVIF) framework to generate high-quality fused images completely in the quaternion domain. QIVIF proposes a quaternion low-visibility feature learning model to adaptively extract salient thermal targets and fine-grained texture details from input infrared and visible images respectively under diverse degraded conditions. QIVIF then develops a quaternion adaptive unsharp masking method to adaptively improve high-frequency feature enhancement with balanced illumination. QIVIF further proposes a quaternion hierarchical Bayesian fusion model to integrate infrared saliency and enhanced visible details to obtain high-quality fused images. Extensive experiments across diverse datasets demonstrate that our QIVIF surpasses state-of-the-art methods under challenging low-visibility conditions.
Authors:Michael Marinaccio, Fatemeh Afghah
Abstract:
High-fidelity wildfire monitoring using Unmanned Aerial Vehicles (UAVs) typically requires multimodal sensing - especially RGB and thermal imagery - which increases hardware cost and power consumption. This paper introduces SAM-TIFF, a novel teacher-student distillation framework for pixel-level wildfire temperature prediction and segmentation using RGB input only. A multimodal teacher network trained on paired RGB-Thermal imagery and radiometric TIFF ground truth distills knowledge to a unimodal RGB student network, enabling thermal-sensor-free inference. Segmentation supervision is generated using a hybrid approach of segment anything (SAM)-guided mask generation, and selection via TOPSIS, along with Canny edge detection and Otsu's thresholding pipeline for automatic point prompt selection. Our method is the first to perform per-pixel temperature regression from RGB UAV data, demonstrating strong generalization on the recent FLAME 3 dataset. This work lays the foundation for lightweight, cost-effective UAV-based wildfire monitoring systems without thermal sensors.
Authors:Youngkyu Kim, Byounghyun Yoo, Ji Young Yun, Hyeokmin Lee, Sehyeon Park, Jin Woo Moon, Eun Ji Choi
Abstract:
Achieving thermal comfort while maintaining energy efficiency is a critical objective in building system control. Conventional thermal comfort models, such as the Predicted Mean Vote (PMV), rely on both environmental and personal variables. However, the use of fixed-location sensors limits the ability to capture spatial variability, which reduces the accuracy of occupant-specific comfort estimation. To address this limitation, this study proposes a new PMV estimation method that incorporates spatial environmental data reconstructed using the Gappy Proper Orthogonal Decomposition (Gappy POD) algorithm. In addition, a group PMV-based control framework is developed to account for the thermal comfort of multiple occupants. The Gappy POD method enables fast and accurate reconstruction of indoor temperature fields from sparse sensor measurements. Using these reconstructed fields and occupant location data, spatially resolved PMV values are calculated. Group-level thermal conditions are then derived through statistical aggregation methods and used to control indoor temperature in a multi-occupant living lab environment. Experimental results show that the Gappy POD algorithm achieves an average relative error below 3% in temperature reconstruction. PMV distributions varied by up to 1.26 scale units depending on occupant location. Moreover, thermal satisfaction outcomes varied depending on the group PMV method employed. These findings underscore the importance of adaptive thermal control strategies that incorporate both spatial and individual variability, offering valuable insights for future occupant-centric building operations.
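The Gappy POD reconstruction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the POD modes are already available (for example, extracted from simulation snapshots) and fits the mode coefficients to the sparse sensor readings by least squares. The function names, the toy two-mode basis, and the sensor layout are illustrative assumptions, not details from the paper.

```python
def solve_linear(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gappy_pod_reconstruct(modes, sensor_idx, sensor_vals):
    """Reconstruct a full field from sparse sensors.

    modes       -- list of r POD basis vectors, each of length n (assumed precomputed)
    sensor_idx  -- grid indices where measurements exist
    sensor_vals -- measured values at those indices
    """
    r = len(modes)
    # Restrict every mode to the sensor locations (the "gappy" mask).
    masked = [[mode[i] for i in sensor_idx] for mode in modes]
    # Normal equations of the least-squares fit on the masked modes.
    gram = [[sum(p * q for p, q in zip(masked[a], masked[b])) for b in range(r)]
            for a in range(r)]
    rhs = [sum(p * y for p, y in zip(masked[a], sensor_vals)) for a in range(r)]
    coeffs = solve_linear(gram, rhs)
    # Expand the fitted coefficients on the full (unmasked) modes.
    n = len(modes[0])
    return [sum(coeffs[k] * modes[k][i] for k in range(r)) for i in range(n)]


# Toy example: a mean mode plus a linear gradient mode on a 5-point grid.
modes = [[1.0, 1.0, 1.0, 1.0, 1.0], [-2.0, -1.0, 0.0, 1.0, 2.0]]
field = gappy_pod_reconstruct(modes, sensor_idx=[0, 3], sensor_vals=[20.0, 21.5])
```

With two sensors and two modes the fit is exact; with more sensors than modes the same code returns the least-squares estimate.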
Authors:Samuel Olivier, James S. Warsa, HyeongKae Park
Abstract:
The design of efficient numerical methods for modeling thermal radiative transfer (TRT) is challenging due to the stiff, nonlinear coupling between radiation and material energies, especially at the time scales of interest in high energy density physics and astrophysics. Here, we investigate the use of the Second Moment Method (SMM) to accelerate absorption-emission within the context of the multigroup, Discrete Ordinates transport equations with discontinuous Galerkin spatial discretization. SMM employs a reduced-dimensional, diffusion-based model of radiation transport that, when coupled with suitable discrete closures, serves as a proxy for the transport equation, isolating the transport equation from the stiff absorption-emission physics. We use a gray low-order system to reduce the cost of solving the low-order system and leverage SMM low-order discretizations specifically designed to be scalably solvable with existing linear solver technology. Our algorithm robustly resolves the nonlinear TRT system while only relying on transport sweeps, linearly solving symmetric and positive definite, gray diffusion systems, and nonlinearly solving the spatially pointwise energy balance equation. This algorithm is used as a vehicle to compare the efficacy of low-order discretizations developed for steady-state, linear transport on gray and multigroup TRT problems in one and two spatial dimensions.
Authors:Aditi Nachnani, Kai K. Li-Caldwell, Saptarshi Biswas, Prince Sharma, Gaoyuan Ouyang, Prashant Singh
Abstract:
We present a machine-learning guided approach to predict saturation magnetization (MS) and coercivity (HC) in Fe-rich soft magnetic alloys, particularly Fe-Si-B systems. ML models trained on experimental data reveal that increasing Si and B content reduces MS from 1.81 T (DFT ~2.04 T) to ~1.54 T (DFT ~1.56 T) in Fe-Si-B, which is attributed to decreased magnetic density and structural modifications. Experimental validation of ML-predicted magnetic saturation on Fe-1Si-1B (2.09 T), Fe-5Si-5B (2.01 T) and Fe-10Si-10B (1.54 T) alloy compositions further supports our findings. These trends are consistent with density functional theory (DFT) predictions, which link increased electronic disorder and band broadening to lower MS values. Experimental validation on selected alloys confirms the predictive accuracy of the ML model, with good agreement across compositions. Beyond predictive accuracy, detailed uncertainty quantification and model interpretability, including feature importance and partial dependence analysis, reveal that MS is governed by a nonlinear interplay between Fe content, early transition metal ratios, and annealing temperature, while HC is more sensitive to processing conditions such as ribbon thickness and thermal treatment windows. The ML framework was further applied to Fe-Si-B/Cr/Cu/Zr/Nb alloys in a pseudo-quaternary compositional space, which shows comparable magnetic properties to NANOMET (Fe84.8Si0.5B9.4Cu0.8 P3.5C1), FINEMET (Fe73.5Si13.5B9 Cu1Nb3), NANOPERM (Fe88Zr7B4Cu1), and HITPERM (Fe44Co44Zr7B4Cu1). Our findings demonstrate the potential of the ML framework for the accelerated search of high-performance, Co- and Ni-free, soft magnetic materials.
Authors:Dinan Li, Panagiotis Kakosimos
Abstract:
The number of electrified powertrains is steadily increasing on the way to a more sustainable future; thus, it is essential that unwanted failures are prevented and reliable operation is secured. Monitoring the internal temperatures of motors and keeping them under their thresholds is an important first step. Conventional modeling methods require expert knowledge and complicated mathematical approaches. With all the data a modern electric drive collects during system operation, it is feasible to apply data-driven approaches for estimating thermal behaviors. In this paper, multiple machine-learning methods are investigated for their capability to approximate the temperatures of the stator winding and bearing in induction motors. The explored algorithms range from linear models to neural networks. For this purpose, experimental lab data have been captured from a powertrain under predetermined operating conditions. For each approach, a hyperparameter search is then performed to find the optimal configuration. All the models are evaluated by various metrics, and it has been found that neural networks perform satisfactorily even under transient conditions.
Authors:Dinan Li, Panagiotis Kakosimos, Luca Peretti
Abstract:
The recent technological advances in digitalization have revolutionized the industrial sector. Leveraging data analytics has now enabled the collection of deep insights into the performance and, as a result, the optimization of assets. Industrial drives, for example, already accumulate all the necessary information to control electric machines. These signals include but are not limited to currents, frequency, and temperature. Integrating machine learning (ML) models responsible for predicting the evolution of those directly collected or implicitly derived parameters enhances the smartness of industrial systems even further. In this article, data already residing in most modern electric drives has been used to develop a data-driven thermal model of a power module. A test bench has been designed and used specifically for training and validating the thermal digital twin undergoing various static and dynamic operating profiles. Different approaches, from traditional linear models to deep neural networks, have been implemented to identify the best ML model for estimating the case temperature of a power module. Several evaluation metrics were then used to assess the investigated methods' performance and implementation in industrial embedded systems.
Authors:Panagiotis Kakosimos, Alireza Nemat Saberi, Luca Peretti
Abstract:
This study explores alternative framework configurations for adapting thermal machine learning (ML) models for power converters by combining transfer learning (TL) and federated learning (FL) in a piecewise manner. This approach inherently addresses challenges such as varying operating conditions, data sharing limitations, and security implications. The framework starts with a base model that is incrementally adapted by multiple clients via adapting three state-of-the-art domain adaptation techniques: Fine-tuning, Transfer Component Analysis (TCA), and Deep Domain Adaptation (DDA). The Flower framework is employed for FL, using Federated Averaging for aggregation. Validation with field data demonstrates that fine-tuning offers a straightforward TL approach with high accuracy, making it suitable for practical applications. Benchmarking results reveal a comprehensive comparison of these methods, showcasing their respective strengths and weaknesses when applied in different scenarios. Locally hosted FL enhances performance when data aggregation is not feasible, while cloud-based FL becomes more practical with a significant increase in the number of clients, addressing scalability and connectivity challenges.
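The Federated Averaging aggregation named in this abstract can be sketched in a few lines. This is a generic, framework-free illustration that assumes each client reports a flat list of model parameters together with its local sample count; it does not use the Flower API, and the function and variable names are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client parameters, weighting each client by its local sample count.

    client_weights -- list of flat parameter lists, one per client (same length)
    client_sizes   -- number of local training samples per client
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients: the second holds three times as much data,
# so the aggregate sits closer to its parameters.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a setting like the one described, the server would redistribute the aggregated parameters to the clients, which then continue adapting them locally (for instance by fine-tuning) before the next round.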
Authors:Jean-Luc Feugeas, Julien Mathiaud, Luc Mieussens, Thomas Vigier
Abstract:
The M1 moment model for electronic transport is commonly used to describe non-local thermal transport effects in laser-plasma simulations. In this article, we propose a new asymptotic-preserving scheme based on the Unified Gas Kinetic Scheme (UGKS) for this model in two-dimensional space. This finite volume kinetic scheme follows the same approach as in our previous article and relies on a moment closure, at the numerical scale, of the microscopic flux of UGKS. The method is developed for both structured and unstructured meshes, and several techniques are introduced to ensure accurate fluxes in the diffusion limit. A second-order extension is also proposed. Several test cases validate the different aspects of the scheme and demonstrate its efficiency in multiscale simulations. In particular, the results demonstrate that this method accurately captures non-local thermal effects.
Authors:Marie-Hélène Azam, Julien Berger, Edouard Walther, Sihem Guernouti
Abstract:
Numerical simulation is a powerful tool for assessing the causes of an Urban Heat Island (UHI) effect or quantifying the impact of mitigation solutions on outdoor and indoor thermal comfort. For that purpose, several models have been developed at the district scale. At this scale, the outside surface energy budget is detailed; however, building models are highly simplified and treated as a boundary condition of the district-scale model. This shortcoming prevents investigation of the effect of the urban microclimate on the conditions inside buildings. The aim of this work is to improve the representation of the physical phenomena involved in the building models of a district model. For that purpose, the model integrates fully detailed long-wave radiative fluxes both inside and outside. The numerical model is based on finite differences to solve conduction through all the surfaces and the radiosity method to solve long-wave radiative heat fluxes inside and outside. Calculated temperatures and heat fluxes are evaluated against in situ measurements from an experimental demonstrator over 14 sensors and a 24-day period. Results are also compared with state-of-the-art simulation tools and show an improvement in the RMSE of 0.9 °C to 2.1 °C on the modeled surface temperature.
Authors:Sanath Keshav, Julius Herb, Felix Fritzen
Abstract:
Heterogeneous materials are crucial to producing lightweight components, functional components, and structures composed of them. A crucial step in the design process is the rapid evaluation of their effective mechanical, thermal, or, in general, constitutive properties. The established procedure is to use forward models that accept microstructure geometry and local constitutive properties as inputs. The classical simulation-based approach, which uses, e.g., finite elements and FFT-based solvers, can require substantial computational resources. At the same time, simulation-based models struggle to provide gradients with respect to the microstructure and the constitutive parameters. Such gradients are, however, of paramount importance for microstructure design and for inverting the microstructure-property mapping. Machine learning surrogates can excel in these situations. However, they can lead to unphysical predictions that violate essential bounds on the constitutive response, such as the upper (Voigt-like) or the lower (Reuss-like) bound in linear elasticity. Therefore, we propose a novel spectral normalization scheme that a priori enforces these bounds. The approach is fully agnostic with respect to the chosen microstructural features and the utilized surrogate model. All of these will automatically and strictly predict outputs that obey the upper and lower bounds by construction. The technique can be used for any constitutive tensor that is symmetric and where upper and lower bounds (in the Löwner sense) exist, i.e., for permeability, thermal conductivity, linear elasticity, and many more. We demonstrate the use of spectral normalization in the Voigt-Reuss net using a simple neural network. Numerical examples on truly extensive datasets illustrate the improved accuracy, robustness, and independence of the type of input features in comparison to much-used neural networks.
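The idea of enforcing bounds by construction can be illustrated in the simplest scalar setting. The sketch below is a simplified analogue, not the tensor-valued spectral normalization proposed in the abstract: it assumes an isotropic effective thermal conductivity of a multi-phase material, computes the Voigt (arithmetic mean) and Reuss (harmonic mean) bounds, and squashes an unconstrained surrogate output into that interval. All function names are hypothetical.

```python
import math


def voigt_bound(fractions, conductivities):
    """Upper bound: volume-weighted arithmetic mean of the phase conductivities."""
    return sum(f * k for f, k in zip(fractions, conductivities))


def reuss_bound(fractions, conductivities):
    """Lower bound: volume-weighted harmonic mean of the phase conductivities."""
    return 1.0 / sum(f / k for f, k in zip(fractions, conductivities))


def bounded_prediction(raw_output, lower, upper):
    """Map an unconstrained surrogate output into [lower, upper].

    The sigmoid is written with tanh so that very large |raw_output| cannot overflow.
    """
    s = 0.5 * (1.0 + math.tanh(0.5 * raw_output))
    return lower + s * (upper - lower)


# Two-phase example with equal volume fractions.
fractions, conductivities = [0.5, 0.5], [1.0, 3.0]
k_lower = reuss_bound(fractions, conductivities)
k_upper = voigt_bound(fractions, conductivities)
k_pred = bounded_prediction(0.0, k_lower, k_upper)  # midpoint of the admissible interval
```

Whatever the surrogate emits as `raw_output`, the prediction cannot leave the admissible interval, which is the property the abstract describes for the full tensor case.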
Authors:Leo Tunkle, Kamal Abdulraheem, Linyu Lin, Majdi I. Radaideh
Abstract:
The economic feasibility of nuclear microreactors will depend on minimizing operating costs through advancements in autonomous control, especially when these microreactors are operating alongside other types of energy systems (e.g., renewable energy). This study explores the application of deep reinforcement learning (RL) for real-time drum control in microreactors, exploring performance in regard to load-following scenarios. By leveraging a point kinetics model with thermal and xenon feedback, we first establish a baseline using a single-output RL agent, then compare it against a traditional proportional-integral-derivative (PID) controller. This study demonstrates that RL controllers, including both single- and multi-agent RL (MARL) frameworks, can achieve similar or even superior load-following performance as traditional PID control across a range of load-following scenarios. In short transients, the RL agent was able to reduce the tracking error rate in comparison to PID. Over extended 300-minute load-following scenarios in which xenon feedback becomes a dominant factor, PID maintained better accuracy, but RL still remained within a 1% error margin despite being trained only on short-duration scenarios. This highlights RL's strong ability to generalize and extrapolate to longer, more complex transients, affording substantial reductions in training costs and reduced overfitting. Furthermore, when control was extended to multiple drums, MARL enabled independent drum control as well as maintained reactor symmetry constraints without sacrificing performance -- an objective that standard single-agent RL could not learn. We also found that, as increasing levels of Gaussian noise were added to the power measurements, the RL controllers were able to maintain lower error rates than PID, and to do so with less control effort.
Authors:Zhanat Karashbayeva, Julien Berger, Helcio R. B. Orlande, Marie-Hélène Azam
Abstract:
Urbanization is the key contributor to climate change. An increasing urbanization rate causes an urban heat island (UHI) effect, which strongly depends on the short- and long-wave radiation balance heat flux between the surfaces. To calculate this heat flux accurately, the surface temperature must be assessed, which in turn requires knowledge of the thermal properties and the surface heat transfer coefficients in the heat transfer problem. The aim of this paper is to estimate the thermal properties of the ground and the time-varying surface heat transfer coefficient by solving an inverse problem. The Dufort--Frankel scheme is applied for solving the unsteady heat transfer problem. For the inverse problem, a Markov chain Monte Carlo method is used to estimate the posterior probability density function of unknown parameters within the Bayesian framework of statistics, by applying the Metropolis-Hastings algorithm for random sample generation. Actual temperature measurements available at different ground depths were used for the solution of the inverse problem. Different time discretizations were examined for the transient heat transfer coefficient at the ground surface, which then involved different prior distributions. Results of different case studies show that the estimated values of the unknown parameters were in accordance with literature values. Moreover, with the present solution of the inverse problem the temperature residuals were smaller than those obtained by using literature values for the unknowns.
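The Metropolis-Hastings sampling used for the inverse problem can be sketched on a toy case. The example below assumes a one-parameter linear forward model `y = k * x` as a stand-in for the heat transfer solver, a Gaussian likelihood with known noise level, and a uniform prior; the model, the parameter range, and all names are illustrative assumptions rather than the setup of the paper.

```python
import math
import random


def log_posterior(k, xs, ys, sigma, k_min=0.0, k_max=5.0):
    """Log posterior (up to a constant): uniform prior on [k_min, k_max], Gaussian likelihood."""
    if not (k_min <= k <= k_max):
        return -math.inf
    sq = sum((y - k * x) ** 2 for x, y in zip(xs, ys))
    return -0.5 * sq / sigma ** 2


def metropolis_hastings(xs, ys, sigma, k0=1.0, step=0.1, n_steps=5000, seed=0):
    """Random-walk Metropolis-Hastings; returns the full chain of sampled k values."""
    rng = random.Random(seed)
    k, lp = k0, log_posterior(k0, xs, ys, sigma)
    chain = []
    for _ in range(n_steps):
        cand = k + rng.gauss(0.0, step)           # symmetric Gaussian proposal
        lp_cand = log_posterior(cand, xs, ys, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            k, lp = cand, lp_cand
        chain.append(k)
    return chain


# Synthetic "measurements" generated with k = 2.0.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x for x in xs]
chain = metropolis_hastings(xs, ys, sigma=0.5)
samples = chain[1000:]                             # discard burn-in
k_mean = sum(samples) / len(samples)
```

The retained samples approximate the posterior density of the unknown parameter; in a real application each posterior evaluation would call the numerical heat transfer solver instead of the linear stand-in.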
Authors:Maarten Vlaswinkel, Duarte Antunes, Frank Willems
Abstract:
Decarbonization of the transport sector sets increasingly strict demands to maximize thermal efficiency and minimize greenhouse gas emissions of Internal Combustion Engines. This has led to complex engines with a surge in the number of corresponding tunable parameters in actuator set points and control settings. Automated calibration is therefore essential to keep development time and costs at acceptable levels. In this work, an innovative self-learning calibration method is presented based on in-cylinder pressure curve shaping. This method combines Principal Component Decomposition with constrained Bayesian Optimization. To realize maximal thermal engine efficiency, the optimization problem aims at minimizing the difference between the actual in-cylinder pressure curve and an Idealized Thermodynamic Cycle. By continuously updating a Gaussian Process Regression model of the pressure's Principal Components weights using measurements of the actual operating conditions, the mean in-cylinder pressure curve as well as its uncertainty bounds are learned. This information drives the optimization of calibration parameters, which are automatically adapted while dealing with the risks and uncertainties associated with operational safety and combustion stability. This data-driven method does not require prior knowledge of the system. The proposed method is successfully demonstrated in simulation using a Reactivity Controlled Compression Ignition engine model. The difference between the Gross Indicated Efficiency of the optimal solution found and the true optimum is 0.017%. For this complex engine, the optimal solution was found after 64.4s, which is relatively fast compared to conventional calibration methods.
Authors:Mehdi Moshtaghi, Siavash H. Khajavi, Joni Pajarinen
Abstract:
We introduce RGB-Th-Bench, the first benchmark designed to evaluate the ability of Vision-Language Models (VLMs) to comprehend RGB-Thermal image pairs. While VLMs have demonstrated remarkable progress in visual reasoning and multimodal understanding, their evaluation has been predominantly limited to RGB-based benchmarks, leaving a critical gap in assessing their capabilities in infrared vision tasks. Existing visible-infrared datasets are either task-specific or lack high-quality annotations necessary for rigorous model evaluation. To address these limitations, RGB-Th-Bench provides a comprehensive evaluation framework covering 14 distinct skill dimensions, with a total of 1,600+ expert-annotated Yes/No questions. The benchmark employs two accuracy metrics: a standard question-level accuracy and a stricter skill-level accuracy, which evaluates model robustness across multiple questions within each skill dimension. This design ensures a thorough assessment of model performance, including resilience to adversarial and hallucinated responses. We conduct extensive evaluations on 19 state-of-the-art VLMs, revealing significant performance gaps in RGB-Thermal understanding. Our results show that even the strongest models struggle with thermal image comprehension, with performance heavily constrained by their RGB-based capabilities. Additionally, the lack of large-scale application-specific and expert-annotated thermal-caption-pair datasets in pre-training is an important reason for the observed performance gap. RGB-Th-Bench highlights the urgent need for further advancements in multimodal learning to bridge the gap between visible and thermal image understanding. The dataset is available through this link, and the evaluation code will also be made publicly available.
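The two accuracy metrics can be contrasted with a short sketch. The abstract does not give the exact definition of skill-level accuracy, so the code assumes one plausible reading: a group of Yes/No questions sharing the same image pair and skill counts as correct only if every question in the group is answered correctly. The record layout and function names are hypothetical.

```python
from collections import defaultdict


def question_level_accuracy(records):
    """Fraction of individual Yes/No questions answered correctly."""
    return sum(r["correct"] for r in records) / len(records)


def skill_level_accuracy(records):
    """Fraction of (image, skill) groups in which every question is answered correctly.

    Assumed definition: a single wrong answer invalidates the whole group.
    """
    groups = defaultdict(list)
    for r in records:
        groups[(r["image"], r["skill"])].append(r["correct"])
    return sum(all(answers) for answers in groups.values()) / len(groups)


# Hypothetical results for one image pair and two skills.
records = [
    {"image": "pair_001", "skill": "counting", "correct": True},
    {"image": "pair_001", "skill": "counting", "correct": True},
    {"image": "pair_001", "skill": "presence", "correct": True},
    {"image": "pair_001", "skill": "presence", "correct": False},
]
q_acc = question_level_accuracy(records)
s_acc = skill_level_accuracy(records)
```

Under this reading, a model that guesses consistently "Yes" can score well at question level but is penalized at skill level whenever a group mixes Yes and No answers.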
Authors:Hanshuo Qiu, Jie Jiang, Ruoli Yang, Lixin Zhan, Jizhao Liu
Abstract:
RGB-T road scene semantic segmentation enhances visual scene understanding in complex environments characterized by inadequate illumination or occlusion by fusing information from RGB and thermal images. Nevertheless, existing RGB-T semantic segmentation models typically depend on simple addition or concatenation strategies or ignore the differences between information at different levels. To address these issues, we proposed a novel RGB-T road scene semantic segmentation network called Brain-Inspired Multi-Iteration Interaction Network (BIMII-Net). First, to meet the requirements of accurate texture and local information extraction in road scenarios like autonomous driving, we proposed a deep continuous-coupled neural network (DCCNN) architecture based on a brain-inspired model. Second, to enhance the interaction and expression capabilities among multi-modal information, we designed a cross explicit attention-enhanced fusion module (CEAEF-Module) in the feature fusion stage of BIMII-Net to effectively integrate features at different levels. Finally, we constructed a complementary interactive multi-layer decoder structure, incorporating the shallow-level feature iteration module (SFI-Module), the deep-level feature iteration module (DFI-Module), and the multi-feature enhancement module (MFE-Module) to collaboratively extract texture details and global skeleton information, with multi-module joint supervision further optimizing the segmentation results. Experimental results demonstrate that BIMII-Net achieves state-of-the-art (SOTA) performance in the brain-inspired computing domain and outperforms most existing RGB-T semantic segmentation methods. It also exhibits strong generalization capabilities on multiple RGB-T datasets, proving the effectiveness of brain-inspired computer models in multi-modal image segmentation tasks.
Authors:Konstantinos Tsoupos, Stylianos Tzelepis, Georgios Sklavenitis, Dimitrios Stoupis, Grigorios Pavlakis, Panagiotis Bountzioukas, Christina Athanasiadou, Lily Ha, David Palma, Loris Franchi, Alkis Hatzopoulos
Abstract:
AcubeSAT is an open-source CubeSat mission aiming to explore the effects of microgravity and radiation on eukaryotic cells using a compact microfluidic lab-on-a-chip platform. It is developed by SpaceDot, a volunteer, interdisciplinary student team at the Aristotle University of Thessaloniki and supported by the "Fly Your Satellite! 3" program of the European Space Agency (ESA) Education Office.
The nanosatellite features an in-house designed on-board computer subsystem responsible for telecommand execution, telemetry fetching, onboard time synchronization, in-orbit patching, and fault recovery. The subsystem is designed on one PC/104 standard compatible Printed Circuit Board (PCB) that hosts the On-board Computer (OBC) on the one side and the Attitude and Orbit Control Subsystem (AOCS) on the other, and it is compatible with the LibreCube standard. The hosted subsystems are functionally isolated and feature an ARM Cortex-M7, radiation-tolerant microcontroller each.
Thorough testing is required before anything is sent to space; in particular, the on-board computer board underwent vibration and thermal cycling tests to ensure nominal operation in all conditions.
This paper aims to elucidate the decision-making process, design iterations, and development stages of the custom board and accompanying in-house software. Insights garnered from the initial partially successful environmental test campaign at the ESA CubeSat Support Facility will be shared, along with the ensuing preparations, results, and lessons learned from subsequent testing endeavors in April 2024. Furthermore, the current developmental status will be discussed alongside future electromagnetic compatibility testing, the integration plan on a FlatSat, and prospects for the open-source design as a cost-effective and modular solution that can be tailored with little effort for upcoming missions.
Authors:Partho Bhoumik, Christopher Bailey, Krishnendu Chakrabarty
Abstract:
Fan-out wafer-level packaging (FOWLP) addresses the demand for higher interconnect densities by offering reduced form factor, improved signal integrity, and enhanced performance. However, FOWLP faces manufacturing challenges such as coefficient of thermal expansion (CTE) mismatch, warpage, die shift, and post-molding protrusion, causing misalignment and bonding issues during redistribution layer (RDL) buildup. Moreover, the organic nature of the package exposes it to severe thermo-mechanical stresses during fabrication and operation. In order to address these challenges, we propose a comprehensive defect analysis and testing framework for FOWLP interconnects. We use Ansys Q3D to map defects to equivalent electrical circuit models and perform fault simulations to investigate the impacts of these defects on chiplet functionality. Additionally, we present a built-in self-test (BIST) architecture to detect stuck-at and bridging faults while accurately diagnosing the fault type and location. Our simulation results demonstrate the efficacy of the proposed BIST solution and provide critical insights for optimizing design decisions in packages, balancing fault detection and diagnosis with the cost of testability insertion.
Authors:Aman Singh, Bhavya Giri Goswami, Ketan Nehete, Shishir N. Y. Kolathaya
Abstract:
This paper introduces a chain-driven, sandwich-legged, mid-size quadruped robot designed as an accessible research platform. The design prioritizes enhanced locomotion capabilities, improved reliability and safety of the actuation system, and simplified, cost-effective manufacturing processes. Locomotion performance is optimized through a sandwiched leg design and a dual-motor configuration, reducing leg inertia for agile movements. Reliability and safety are achieved by integrating robust cable strain reliefs, efficient heat sinks for motor thermal management, and mechanical limits to restrict leg motion. Simplified design considerations include a quasi-direct drive (QDD) actuator and the adoption of low-cost fabrication techniques, such as laser cutting and 3D printing, to minimize cost and ensure rapid prototyping. The robot weighs approximately 25 kg and is developed at a cost under \$8000, making it a scalable and affordable solution for robotics research. Experimental validations demonstrate the platform's capability to execute trot and crawl gaits on flat terrain and slopes, highlighting its potential as a versatile and reliable quadruped research platform.
Authors:Meng Yuan, Adam Burman, Changfu Zou
Abstract:
The proper disposal and repurposing of end-of-life electric vehicle batteries are critical for maximizing their environmental benefits. This study introduces a robust model predictive control (MPC) framework designed to optimize the battery discharging process during pre-treatment, ensuring both efficiency and safety. The proposed method explicitly incorporates temperature constraints to prevent overheating and potential hazards. By leveraging a control-oriented equivalent circuit model integrated with thermal dynamics, the MPC algorithm dynamically adjusts the discharging profile to maintain safe operating temperatures. Additionally, the robust controller is designed to account for model mismatches between the nonlinear battery dynamics and the linearized model, ensuring reliable performance under varying conditions. The effectiveness of this approach is demonstrated through simulations comparing the robust MPC method with conventional discharging strategies, including constant current-constant voltage (CC-CV) and constant current-constant temperature (CC-CT) methods. Results indicate that the robust MPC framework significantly reduces discharging time while adhering to safety constraints, offering a promising solution for the recycling and second-life applications of lithium-ion batteries.
Authors:Yiqing Guo, Nagur Cherukuru, Eric Lehmann, Xiubin Qi, Mark Doubelld, S. L. Kesav Unnithan, Ming Feng
Abstract:
Sea surface temperature (SST) is a fundamental physical parameter characterising the thermal state of the sea surface. Due to the intricate thermal interactions between land, sea, and atmosphere, the spatial gradients of SST in coastal waters often appear at finer spatial scales than those in open ocean waters. The Thermal Infrared Sensor (TIRS) onboard Landsat-8, with its 100-meter spatial resolution, offers a unique opportunity to uncover fine-scale coastal SST patterns that would otherwise be overlooked by coarser-resolution thermal sensors. In this study, we first analysed the spatiotemporal patterns of SST in South Australia's temperate coastal waters from 2014 to 2023 by developing an operational approach for SST retrieval from the Landsat-8 TIRS sensor. A buoy was deployed off the coast of Port Lincoln, South Australia, to validate the quality of SST retrievals. Then the daily baseline climatology of SST with 100 m resolution was constructed, which allowed for the detection and analysis of anomalous SST events. Our results suggest the following: (1) the satellite-derived SST data aligned well with the in-situ measured SST values; (2) the semi-enclosed, shallow regions of Upper Spencer Gulf and Upper St Vincent Gulf showed higher temperatures during summer and cooler temperatures during winter than waters closer to the open ocean, resulting in a higher seasonal variation in SST; (3) the near-shore shallow areas in Spencer Gulf and St Vincent Gulf, and regions surrounding Kangaroo Island, were identified to have a higher probability of SST anomalies compared to the rest of the study area; and (4) anomalous SST events were more likely to happen during the warm months than the cool months. We hope these findings will be helpful in supporting the fishing and aquaculture industries in the coastal waters of South Australia.
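The climatology-then-anomaly step described above can be illustrated with a minimal sketch for a single pixel. The function names, the per-day mean as baseline, and the fixed threshold are assumptions made for illustration; the abstract does not state how the baseline is smoothed or how an anomalous event is defined.

```python
from collections import defaultdict
from statistics import mean


def daily_climatology(observations):
    """Average SST per day of year.

    observations: iterable of (day_of_year, sst_celsius) pairs for one pixel.
    Returns a dict mapping day_of_year to the mean SST seen on that day.
    """
    by_day = defaultdict(list)
    for day, sst in observations:
        by_day[day].append(sst)
    return {day: mean(values) for day, values in by_day.items()}


def find_anomalies(observations, climatology, threshold=2.0):
    """Return observations deviating from the baseline by more than threshold.

    threshold (in degrees Celsius) is a hypothetical value, not from the paper.
    Days without a baseline value are skipped.
    """
    return [
        (day, sst)
        for day, sst in observations
        if day in climatology and abs(sst - climatology[day]) > threshold
    ]
```

In practice the same computation would run per 100 m pixel, and a real baseline would likely need smoothing because Landsat-8 revisits each location only every 16 days.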
Authors:Amir Jahangiri, Tatiana Agback, Ulrika Brath, Vladislav Orekhov
Abstract:
In multidimensional NMR spectroscopy, practical resolution is defined as the ability to distinguish and accurately determine signal positions against a background of overlapping peaks, thermal noise, and spectral artifacts. In the pursuit of ultimate resolution, we introduce Peak Probability Presentations ($P^3$), a statistical spectral representation that assigns a probability to each spectral point, indicating the likelihood of a peak maximum occurring at that location. The mapping between the spectrum and $P^3$ is achieved using MR-Ai, a physics-inspired deep learning neural network architecture designed to handle multidimensional NMR spectra. Furthermore, we demonstrate that MR-Ai enables coprocessing of multiple spectra, facilitating direct information exchange between datasets. This feature significantly enhances spectral quality, particularly in cases of highly sparse sampling. The performance of MR-Ai and the high value of $P^3$ are demonstrated on synthetic data and on spectra of Tau, MATL1, Calmodulin, and several other proteins.
Authors:Sudheer Mishra, Natarajan E
Abstract:
In this work, we present and analyze a novel stabilized virtual element formulation for the coupled Stokes-Temperature equation on polygonal meshes, employing equal-order element pairs where viscosity depends on temperature. The main objective of the proposed virtual elements is to develop a stabilized virtual element problem that avoids higher-order derivative terms and bilinear forms involving velocity, pressure and temperature, thereby avoiding the coupling between virtual element pairs. Moreover, it also reduces the violation of divergence-free constraints and offers reasonable control over the gradient of temperature. We derive the stability of the continuous solution using the Banach fixed-point theorem under sufficiently small data. The stabilized coupled virtual element problem is formulated using the local projection-based stabilization methods. We demonstrate the existence and uniqueness of the stabilized discrete solution using the Brouwer fixed-point theorem and the contraction theorem under the assumption of sufficiently small data by showing the well-posedness of the stabilized decoupled virtual element problems. Furthermore, we derive the error estimates with optimal convergence rates in the energy norms. We present several numerical examples to confirm the theoretical findings. Additionally, the numerical behavior of the proposed stabilized method is shown to be robust with respect to linear and non-linear thermal conductivity.
Authors:Gregg Rabideau, Joseph Russino, Andrew Branch, Nihal Dhamani, Tiago Stegun Vaquero, Steve Chien, Jean-Pierre de la Croix, Federico Rossi
Abstract:
NASA's Cooperative Autonomous Distributed Robotic Exploration (CADRE) mission, slated for flight to the Moon's Reiner Gamma region in 2025/2026, is designed to demonstrate multi-agent autonomous exploration of the Lunar surface and sub-surface. A team of three robots and a base station will autonomously explore a region near the lander, collecting the data required for 3D reconstruction of the surface with no human input; and then autonomously perform distributed sensing with multi-static ground penetrating radars (GPR), driving in formation while performing coordinated radar soundings to create a map of the subsurface. At the core of CADRE's software architecture is a novel autonomous, distributed planning, scheduling, and execution (PS&E) system. The system coordinates the robots' activities, planning and executing tasks that require multiple robots' participation while ensuring that each individual robot's thermal and power resources stay within prescribed bounds, and respecting ground-prescribed sleep-wake cycles. The system uses a centralized-planning, distributed-execution paradigm, and a leader election mechanism ensures robustness to failures of individual agents. In this paper, we describe the architecture of CADRE's PS&E system; discuss its design rationale; and report on verification and validation (V&V) testing of the system on CADRE's hardware in preparation for deployment on the Moon.
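The abstract does not say how the leader election works. As a generic illustration of the idea, the sketch below picks the lowest-named agent whose heartbeat is still fresh, so a silent leader is replaced automatically. The agent names, the heartbeat table, and the timeout are all hypothetical and are not taken from CADRE's software.

```python
def elect_leader(last_heartbeat, now, timeout):
    """Pick a leader among agents that were heard from recently.

    last_heartbeat: dict mapping agent name to the time of its last heartbeat.
    now, timeout:   current time and maximum allowed silence, in the same units.
    Returns the smallest agent name that is still alive, or None if none are.
    """
    alive = [agent for agent, seen in last_heartbeat.items() if now - seen <= timeout]
    return min(alive) if alive else None
```

Because every agent applies the same deterministic rule to the same heartbeat table, all agents that agree on who is alive also agree on the leader without further negotiation.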
Authors:Daan de Bos, Marc Serra-Garcia
Abstract:
We introduce a network of coupled oscillators that can learn to solve a classification task from a set of examples -- performing both training and inference through the nonlinear evolution of the system. We accomplish this by combining three key elements to achieve learning: A long-term memory that stores learned responses, analogous to the synapses in biological brains; a short-term memory that stores the neural activations, similar to the firing patterns of neurons; and an evolution law that updates the synapses in response to novel examples, inspired by synaptic plasticity. Achieving all three elements in wave-based information processors such as metamaterials is a significant challenge. Here, we solve it by leveraging the material multistability to implement long-term memory, and harnessing symmetries and thermal noise to realize the learning rule. Our analysis reveals that the learning mechanism, although inspired by synaptic plasticity, also shares parallelisms with bacterial evolution strategies, where mutation rates increase in the presence of noxious stimuli.
Authors:Xuguang Zhang, Hexiang Zhang, Amjad Almansour, Mrityunjay Singh, Hengling Zhu, Michael C. Halbig, Yi Zheng
Abstract:
Effective thermal management is critical for lithium-ion battery packs' safe and efficient operations, particularly in applications such as drones, where compact designs and varying airflow conditions present unique challenges. This study investigates the thermal performance of a 16-cell lithium-ion battery pack by optimizing cooling airflow configurations and integrating phase change materials (PCMs) for enhanced heat dissipation. Seven geometric configurations were evaluated under airflow speeds ranging from 0 to 15 m/s, reflecting the operational conditions of civilian drones. A comprehensive 3D simulation approach was used to analyze the effects of inlet and outlet configurations, airflow dynamics, and PCM phase transition behavior. Results indicate that the trapezoidal (wide-base) configuration, paired with a 5-inlet and 1-outlet setup, achieves the most balanced performance, effectively maintaining optimal operating temperatures across low and high-speed airflow conditions. PCM integration further stabilized thermal behavior, with phase change durations extending to 12.5 min under tested conditions. These findings highlight the importance of geometric optimization and material integration in advancing compact and reliable thermal management systems for energy-dense battery packs. This study provides a foundation for designing efficient cooling strategies tailored to lightweight applications such as drones and portable energy storage systems.
Authors:Jacob Thrän, Tim C. Green, Robert Shorten
Abstract:
To make well-informed investment decisions, energy system stakeholders require reliable cost frameworks for demand response (DR) and storage technologies. While the levelised cost of storage (LCOS) permits comprehensive cost comparisons between different storage technologies, no generic cost measure for the comparison of different DR schemes exists. This paper introduces the levelised cost of demand response (LCODR) which is an analogous measure to the LCOS but crucially differs from it by considering consumer reward payments. Additionally, the value factor from cost estimations of variable renewable energy is adapted to account for the variable availability of DR. The LCODRs for four direct load control (DLC) schemes and twelve storage applications are estimated and contrasted against LCOS literature values for the most competitive storage technologies. The DLC schemes are vehicle-to-grid, smart charging, smart heat pumps, and heat pumps with thermal storage. The results show that only heat pumps with thermal storage consistently outcompete storage technologies with EV-based DR schemes being competitive for some applications. The results and the underlying methodology offer a tool for energy system stakeholders to assess the competitiveness of DR schemes even with limited user data.
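A minimal sketch of the levelised-cost idea, assuming the usual discounted form in which lifetime costs are divided by lifetime delivered energy. Treating the consumer reward payments as an extra yearly cost term is an assumption based on the abstract; the paper's exact LCODR definition and its value-factor adjustment are not reproduced here, and all names are hypothetical.

```python
def levelised_cost(costs, energy, rate):
    """Discounted lifetime cost divided by discounted lifetime energy.

    costs[t] and energy[t] are the cost and delivered energy in year t
    (t = 0 is the investment year); rate is the annual discount rate.
    """
    disc_cost = sum(c / (1 + rate) ** t for t, c in enumerate(costs))
    disc_energy = sum(e / (1 + rate) ** t for t, e in enumerate(energy))
    if disc_energy == 0:
        raise ValueError("no energy delivered over the lifetime")
    return disc_cost / disc_energy


def levelised_cost_of_dr(costs, rewards, energy, rate):
    """Same ratio with yearly consumer reward payments added to the costs.

    This additive form is an assumption made for illustration.
    """
    total = [c + r for c, r in zip(costs, rewards)]
    return levelised_cost(total, energy, rate)
```

With a zero discount rate, costs of 100, 10 and 10 and energy of 0, 50 and 50 give a levelised cost of 1.2 per unit of energy; adding rewards of 5 in each operating year raises it to 1.3.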
Authors:Yuzhuo Li, Yunwei Li
Abstract:
As AI-driven computing infrastructures rapidly scale, discussions around data center design often emphasize energy consumption, water and electricity usage, workload scheduling, and thermal management. However, these perspectives often overlook the critical interplay between AI-specific load transients and power electronics. This paper addresses that gap by examining how large-scale AI workloads impose unique demands on power conversion chains and, in turn, how the power electronics themselves shape the dynamic behavior of AI-based infrastructure. We illustrate the fundamental constraints imposed by multi-stage power conversion architectures and highlight the key role of final-stage modules in defining realistic power slew rates for GPU clusters. Our analysis shows that traditional designs, optimized for slower-varying or CPU-centric workloads, may not adequately accommodate the rapid load ramps and drops characteristic of AI accelerators. To bridge this gap, we present insights into advanced converter topologies, hierarchical control methods, and energy buffering techniques that collectively enable robust and efficient power delivery. By emphasizing the bidirectional influence between AI workloads and power electronics, we hope this work can set a good starting point and offer practical design considerations to ensure future exascale-capable data centers can meet the stringent performance, reliability, and scalability requirements of next-generation AI deployments.
Authors:Tuna Erdoğan, Shi-Yuan Wang, Shang-Jen Su, Matthieu Bloch
Abstract:
We consider a joint communication and sensing problem in an optical link in which a low-power transmitter attempts to communicate with a receiver while simultaneously identifying the range of a defect creating a backscattered signal. We model the system as a lossy thermal noise bosonic channel in which the location of the target, modeled as a beamsplitter, affects the timing of the backscattered signal. Motivated by the envisioned deployment of entanglement sharing quantum networks, we allow the transmitter to exploit entanglement to assist its sensing and communication. Since entanglement is known to enhance sensing, as shown by quantum illumination, and to increase communication rates, as shown by the characterization of the entanglement-assisted capacity, the transmitter is faced with a trade-off and must judiciously allocate its entanglement resources. Our main result is a characterization of the trade-offs incurred in the form of an achievable rate/error-exponent region which can beat time-sharing in certain cases. The proof of our result relies on technical results of independent interest, by which we carefully show how to extend the known asymptotic characterization of the multi-hypothesis testing Chernoff exponent in finite-dimensional spaces to infinite-dimensional spaces and provide a characterization of phase shift keying modulated displaced thermal states in the Fock basis.
Authors:Viktor Kozák, Karel Košnar, Jan Chudoba, Miroslav Kulich, Libor Přeučil
Abstract:
Inspection systems utilizing unmanned aerial vehicles (UAVs) equipped with thermal cameras are increasingly popular for the maintenance of photovoltaic (PV) power plants. However, automation of the inspection task is a challenging problem as it requires precise navigation to capture images from optimal distances and viewing angles.
This paper presents a novel localization pipeline that directly integrates PV module detection with UAV navigation, allowing precise positioning during inspection. Detections are used to identify the power plant structures in the image and associate these with the power plant model. We define visually recognizable anchor points for the initial association and use object tracking to discern global associations. We present three distinct methods for visual segmentation of PV modules based on traditional computer vision, deep learning, and their fusion, and we evaluate their performance in relation to the proposed localization pipeline.
The presented methods were verified and evaluated using custom aerial inspection data sets, demonstrating their robustness and applicability for real-time navigation. Additionally, we evaluate the influence of the power plant model's precision on the localization methods.
Authors:Guillermo Federico Umbricht, Domingo Alberto Tarzia, Diana Rubio
Abstract:
In this work, a thermal energy transfer problem in a one-dimensional multilayer body is theoretically analyzed, considering diffusion, advection, internal heat generation or loss linearly dependent on temperature in each layer, as well as heat generation due to external sources. Additionally, the thermal contact resistance at the interfaces between each pair of materials is taken into account. The problem is mathematically modeled, and explicit analytical solutions are derived using Fourier techniques. A convergent finite difference scheme is also formulated to simulate specific cases. The solution is consistent with previous results. A numerical example is provided, demonstrating the coherence between the obtained results and the physical behavior of the problem. This work was recently published for a two-layer body; the generalization to m-layer bodies allows for conclusions that enhance the theoretical understanding of heat transfer in multilayer materials and may contribute to improving the thermal design of multilayer engineering systems.
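For orientation, a model of the kind described above could be written as follows. This is a sketch under stated assumptions, not the paper's formulation: the symbols $\alpha_i$ (diffusivity), $\beta_i$ (advection velocity), $\nu_i$ (linear generation or loss rate), $s_i$ (external source), $k_i$ (conductivity), and $R_i$ (contact resistance at the interface $x = x_i$) are introduced here for illustration only.

```latex
\begin{align}
\frac{\partial T_i}{\partial t}
  &= \alpha_i \frac{\partial^2 T_i}{\partial x^2}
   - \beta_i \frac{\partial T_i}{\partial x}
   + \nu_i T_i + s_i(x,t),
  \qquad x_{i-1} < x < x_i,\quad i = 1,\dots,m, \\
-k_i \frac{\partial T_i}{\partial x}\bigg|_{x = x_i}
  &= -k_{i+1} \frac{\partial T_{i+1}}{\partial x}\bigg|_{x = x_i}
   = \frac{T_i(x_i,t) - T_{i+1}(x_i,t)}{R_i},
  \qquad i = 1,\dots,m-1.
\end{align}
```

The second line states that the heat flux is continuous across each interface while the temperature jumps in proportion to that flux; letting $R_i \to 0$ recovers perfect thermal contact.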
Authors:Pegah Eshraghi, Arman Nikkhah Dehnavi, Maedeh Mirdamadi, Riccardo Talami, Zahra-Sadat Zomorodian
Abstract:
As urbanization accelerates, open spaces are increasingly recognized for their role in enhancing sustainability and well-being, yet they remain underexplored compared to built spaces. This study introduces an AI-driven framework that integrates machine learning models (MLMs) and explainable AI techniques to optimize Sky View Factor (SVF) and visibility, key spatial metrics influencing thermal comfort and perceived safety in urban spaces. Unlike global optimization methods, which are computationally intensive and impractical for localized adjustments, this framework supports incremental design improvements with lower computational costs and greater flexibility. The framework employs SHapley Additive exPlanations (SHAP) to analyze feature importance and Counterfactual Explanations (CFXs) to propose minimal design changes. Simulations tested five MLMs, identifying XGBoost as the most accurate, with building width, park area, and heights of surrounding buildings as critical for SVF, and distances from southern buildings as key for visibility. Compared to Genetic Algorithms, which required approximately 15/30 minutes across 3/4 generations to converge, the tested CFX approach achieved optimized results in 1 minute with a 5% RMSE, demonstrating significantly faster performance and suitability for scalable retrofitting strategies. This interpretable and computationally efficient framework advances urban performance optimization, providing data-driven insights and practical retrofitting solutions for enhancing usability and environmental quality across diverse urban contexts.
Authors:Shuo Tong, Han Liu, Runyuan Guo, Xueqiong Tian, Wenqing Wang, Ding Liu, Youmin Zhang
Abstract:
Data-driven soft sensors (DDSS) have become mainstream methods for predicting key performance indicators in process industries. However, DDSS development requires complex and costly customized designs tailored to various tasks during the modeling process. Moreover, DDSS are constrained to a single structured data modality, limiting their ability to incorporate additional contextual knowledge. Furthermore, DDSSs' limited representation learning leads to weak predictive performance with scarce data. To address these challenges, we propose a general framework named LLM-TKESS (large language model for text-based knowledge-embedded soft sensing), harnessing the powerful general problem-solving capabilities, cross-modal knowledge transfer abilities, and few-shot capabilities of LLM for enhanced soft sensing modeling. Specifically, an auxiliary variable series encoder (AVS Encoder) is proposed to unleash LLM's potential for capturing temporal relationships within series and spatial semantic relationships among auxiliary variables. Then, we propose a two-stage fine-tuning alignment strategy: in the first stage, employing parameter-efficient fine-tuning through autoregressive training adjusts LLM to rapidly accommodate process variable data, resulting in a soft sensing foundation model (SSFM). Subsequently, by training adapters, we adapt the SSFM to various downstream tasks without modifying its architecture. Then, we propose two text-based knowledge-embedded soft sensors, integrating new natural language modalities to overcome the limitations of pure structured data models. Furthermore, benefiting from LLM's pre-existing world knowledge, our model demonstrates outstanding predictive capabilities in small sample conditions. Using the thermal deformation of air preheater rotor as a case study, we validate through extensive experiments that LLM-TKESS exhibits outstanding performance.
Authors:Giuseppe Nicoletta, Mauro Daniel Luigi Bruno, Peng Yu, Zhiming Wang, Maria Penelope De Santo, Roberto Caputo, Antonio Ferraro
Abstract:
Counterfeiting poses an ever-growing challenge, driving the need for innovative and sophisticated anti-counterfeiting strategies and technologies. Many solutions focus on tags characterized by optical features that are partially or completely camouflaged to the human eye, thus discouraging scammers. In this paper, a QR code is laser printed on a thin plastic foil previously coated by a specific nanocavity consisting of a metal/insulator/metal/insulator (MIMI) multilayer. This metamaterial possesses unique features in terms of light transmission that are due to the specific design. A thin layer of polymer dispersed liquid crystals, fabricated by incorporating specific nematic liquid crystals in a polymer matrix, camouflages the QR code, which then becomes readable only under specific thermal conditions. Three anti-counterfeiting tags were fabricated, each using a distinct LC with its own nematic-isotropic transition temperature. The peculiar combination of the unique optical properties of nematic liquid crystals and optical nanocavities results in the creation of a novel type of tag showing two different encoding levels. Stress tests including water immersion, bending, and prolonged heating have been performed, ensuring the long-term stability of the tags. The realized two security-level anti-counterfeiting tags are cost-effective, straightforward to manufacture and, thanks to their flexibility, can be easily integrated into packaging and products.
Authors:Mehran Ebrahimi, Masayuki Yano
Abstract:
We introduce a hyperreduced reduced basis element method for model reduction of parameterized, component-based systems in continuum mechanics governed by nonlinear partial differential equations. In the offline phase, the method constructs, through a component-wise empirical training, a library of archetype components defined by a component-wise reduced basis and hyperreduced quadrature rules with varying hyperreduction fidelities. In the online phase, the method applies an online adaptive scheme informed by the Brezzi-Rappaz-Raviart theorem to select an appropriate hyperreduction fidelity for each component to meet the user-prescribed error tolerance at the system level. The method accommodates the rapid construction of hyperreduced models for large-scale component-based nonlinear systems and enables model reduction of problems with many continuous and topology-varying parameters. The efficacy of the method is demonstrated on a two-dimensional nonlinear thermal fin system that comprises up to 225 components and 68 independent parameters.
Authors:Chengzhong Zhang, Hongyu Zhao, Wenjie Zhang
Abstract:
Effective early-stage detection of internal short circuits in lithium-ion batteries is crucial to preventing thermal runaway. This report proposes an effective approach to address this challenging issue, in which the current change, state of charge and resistance are considered simultaneously to depict the voltage differential envelope curve. The envelope naturally utilizes the inherent physical information of the battery and accounts for error interference, providing a high-precision range for battery voltage fluctuations under any operating conditions. This study validates the algorithm using data from 10 fault intervals under dynamic operating conditions. The results demonstrate that the algorithm achieves 100% accuracy and responds rapidly, enabling timely detection of early-stage internal short circuit faults in batteries. Compared to signal processing-based and neural network methods, the proposed approach offers significant advantages in both accuracy and practicality, making it highly relevant for the safe application and widespread adoption of lithium-ion batteries.
Authors:Nicolas Delaissé, Peyman Havaej, Dieter Fauconnier, Joris Degroote
Abstract:
This paper presents a new solver developed in OpenFOAM for the modeling of lubricant in the narrow gap between two surfaces inducing hydrodynamic pressures up to a few gigapascals. Cavitation is modeled using the homogeneous equilibrium model. The mechanical and thermodynamic constitutive behavior of the lubricant is accurately captured by inclusion of compressibility, lubricant rheology and thermal effects. Different constitutive models can be selected at run time, through the adoption of the modular approach of OpenFOAM. By combining the lubricant solver with a structural solver using a coupling tool, elastohydrodynamically lubricated contacts can be accurately simulated in a partitioned way. The solution approach is validated and examples with different slip conditions are included. The benefit of this work for the OpenFOAM community is the creation of a new solver for lubricant flow in challenging conditions and, at the same time, the illustration of combining OpenFOAM solvers with other open-source software packages.
Authors:Indu Kant Deo, Youngsoo Choi, Saad A. Khairallah, Alexandre Reikher, Maria Strantza
Abstract:
In Laser Powder Bed Fusion (LPBF), the applied laser energy produces high thermal gradients that lead to unacceptable final part distortion. Accurate distortion prediction is essential for optimizing the 3D printing process and manufacturing a part that meets geometric accuracy requirements. This study introduces data-driven parameterized reduced-order models (ROMs) to predict distortion in LPBF across various machine process settings. We propose a ROM framework that combines Proper Orthogonal Decomposition (POD) with Gaussian Process Regression (GPR) and compare its performance against a deep-learning based parameterized graph convolutional autoencoder (GCA). The POD-GPR model demonstrates high accuracy, predicting distortions within $\pm 0.001$ mm, and delivers a computational speed-up of approximately 1800x.
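The POD-GPR combination named above usually follows the generic pattern below. The notation is assumed for illustration ($\mu$ for the process parameters, $u(\mu)$ for the distortion field, $r$ for the number of retained modes) and the paper's specific choices, such as kernel and truncation criterion, are not given in the abstract.

```latex
\begin{align}
S &= \big[\, u(\mu_1) \;\; u(\mu_2) \;\; \cdots \;\; u(\mu_N) \,\big]
   = U \Sigma V^{\top},
  & \Phi &= U_{[:,\,1:r]}, \\
a(\mu_n) &= \Phi^{\top} u(\mu_n),
  & \hat{a}_j(\mu) &\sim \mathcal{GP}\big(m_j(\mu),\, k_j(\mu,\mu')\big),
  \quad j = 1,\dots,r, \\
u(\mu^{*}) &\approx \Phi\, \hat{a}(\mu^{*}).
\end{align}
```

The expensive simulations are only needed to build the snapshot matrix $S$; a prediction at a new setting $\mu^{*}$ then costs $r$ Gaussian process evaluations and one matrix-vector product, which is where a speed-up of this kind typically comes from.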
Authors:Tuğçe Gökdemir, Jakub Rydzewski
Abstract:
In molecular dynamics (MD) simulations, transitions between states are often rare events due to energy barriers that exceed the thermal energy. Because of their infrequent occurrence and the huge number of degrees of freedom in molecular systems, understanding the physical properties that drive rare events is immensely difficult. A common approach to this problem is to propose a collective variable (CV) that describes this process by a simplified representation. However, choosing CVs is not easy, as it often relies on physical intuition. Machine learning (ML) techniques provide a promising approach for effectively extracting optimal CVs from MD data. Here, we provide a note on a recent unsupervised ML method called spectral map, which constructs CVs by maximizing the timescale separation between slow and fast variables in the system.
Authors:Chintan Jansari, Stéphane P. A. Bordas, Marco Montemurro, Elena Atroshchenko
Abstract:
The thermal conductivity of Functionally Graded Materials (FGMs) can be efficiently designed through topology optimization to obtain thermal meta-structures that actively steer the heat flow. Compared to conventional analytical design methods, topology optimization allows handling arbitrary geometries, boundary conditions and design requirements; and producing alternate designs for non-unique problems. Additionally, as far as the design of meta-structures is concerned, topology optimization does not need intuition-based coordinate transformation or the form invariance of governing equations, as in the case of transformation thermotics. We explore isogeometric density-based topology optimization in the continuous setting, which perfectly aligns with FGMs. In this formulation, the density field, geometry and solution of the governing equations are parameterized using non-uniform rational basis spline entities. Accordingly, the heat conduction problem is solved using Isogeometric Analysis. We design various 2D & 3D thermal meta-structures under different design scenarios to showcase the effectiveness and versatility of our approach. We also design thermal meta-structures based on architected cellular materials, a special class of FGMs, using their empirical material laws calculated via numerical homogenization.
Authors:Yutong Chen, Daisuke Sumiyoshi, Riki Sakai, Takahiro Yamamoto, Takahiro Ueno, Jewon Oh
Abstract:
In response to the substantial energy consumption in buildings, the Japanese government initiated the BI-Tech (Behavioral Insights X Technology) project in 2019, aimed at promoting voluntary energy-saving behaviors through the utilization of AI and IoT technologies. Our study, aimed at small and medium-sized office buildings, introduces a cost-effective IoT-based BI-Tech system, utilizing the Raspberry Pi 4B+ platform for real-time monitoring of indoor thermal conditions and air conditioner (AC) set-point temperature. Employing machine learning and image recognition, the system analyzes data to calculate the predicted mean vote (PMV) index and predict energy consumption changes due to temperature adjustments. Mobile and desktop applications convey this information to users, encouraging energy-efficient behavior modifications. The machine learning model achieved an R2 value of 97%, demonstrating the system's efficiency in promoting energy-saving habits among users.
Authors:Demetrius Gulewicz, Uduak Inyang-Udoh, Trevor Bird, Neera Jain
Abstract:
Model predictive control has gained popularity for its ability to satisfy constraints and guarantee robustness for certain classes of systems. However, for systems whose dynamics are characterized by a high state dimension, substantial nonlinearities, and stiffness, suitable methods for online nonlinear MPC are lacking. One example of such a system is a vehicle thermal management system (TMS) with integrated thermal energy storage (TES), also referred to as a hybrid TMS. Here, hybrid refers to the ability to achieve cooling through a conventional heat exchanger or via melting of a phase change material, or both. Given increased electrification in vehicle platforms, more stringent performance specifications are being placed on TMS, in turn requiring more advanced control methods. In this paper, we present the design and real-time implementation of a nonlinear model predictive controller with 77 states on an experimental hybrid TMS testbed. We show how, in spite of high-dimension and stiff dynamics, an explicit integration method can be obtained by linearizing the dynamics at each time step within the MPC horizon. This integration method further allows the first-order gradients to be calculated with minimal additional computational cost. Through simulated and experimental results, we demonstrate the utility of the proposed solution method and the benefits of TES for mitigating highly transient heat loads achieved by actively controlling its charging and discharging behavior.
Authors:Kota Nishida, Yoshihiro Midoh, Noriyuki Miura, Satoshi Kawakami, Jun Shiomi
Abstract:
Silicon Photonics-based AI Accelerators (SPAAs) have been considered as promising AI accelerators achieving high energy efficiency and low latency. While many researchers focus on improving SPAAs' energy efficiency and latency, their physical security has not been sufficiently studied. This paper first proposes a threat of thermal fault injection attacks on SPAAs based on Vector-Matrix Multipliers (VMMs) utilizing Mach-Zehnder Interferometers. This paper then proposes SecONN, an optical neural network framework that is capable of not only inferences but also concurrent detection of the attacks. In addition, this paper introduces a concept of Wavelength Division Perturbation (WDP) where wavelength dependent VMM results are utilized to increase detection accuracy. Simulation results show that the proposed method achieves 88.7% attack-caused average misprediction recall.
Authors:Xujun Wei, Feng Zhang, Renhe Zhang, Wenwen Li, Cuiping Liu, Bin Guo, Jingwei Li, Haoyang Fu, Xu Tang
Abstract:
In the past few years, Artificial Intelligence (AI)-based weather forecasting methods have widely demonstrated strong competitiveness among the weather forecasting systems. However, these methods are insufficient for high-spatial-resolution short-term nowcasting within 6 hours, which is crucial for warning of short-duration, mesoscale and small-scale weather events. Geostationary satellite remote sensing provides detailed, high spatio-temporal and all-day observations, which can address the above limitations of existing methods. Therefore, this paper proposes an advanced data-driven thermal infrared cloud image forecasting model, "DaYu." Unlike existing data-driven weather forecasting models, DaYu is specifically designed for geostationary satellite observations, with a temporal resolution of 0.5 hours and a spatial resolution of $0.05^\circ \times 0.05^\circ$. DaYu is based on a large-scale transformer architecture, which enables it to capture fine-grained cloud structures and learn fast-changing spatio-temporal evolution features effectively. Moreover, its attention mechanism design achieves a balance in computational complexity, making it practical for applications. DaYu not only achieves accurate forecasts up to 3 hours with a correlation coefficient higher than 0.9, 6 hours higher than 0.8, and 12 hours higher than 0.7, but also detects short-duration, mesoscale, and small-scale weather events with enhanced detail, effectively addressing the shortcomings of existing methods in providing detailed short-term nowcasting within 6 hours. Furthermore, DaYu has significant potential in short-term climate disaster prevention and mitigation.
Authors:Zheng Liu, Yuan Jiang, Yumeng Li, Pingfeng Wang
Abstract:
With the popularity of electric vehicles, the demand for lithium-ion batteries is increasing. Temperature significantly influences the performance and safety of batteries. Battery thermal management systems can effectively control the temperature of batteries; therefore, the performance and safety can be ensured. However, the development process of battery thermal management systems is time-consuming and costly due to the extensive training dataset needed by data-driven models requiring enormous computational costs for finite element analysis. Therefore, a new approach to constructing surrogate models is needed in the era of AI. Physics-informed machine learning enforces the physical laws in surrogate models, making it the perfect candidate for estimating battery pack temperature distribution. In this study, we first developed a 21700 battery pack indirect liquid cooling system with cold plates on the top and bottom with thermal paste surrounding the battery cells. Then, the simplified finite element model was built based on experiment results. Due to the high coolant flow rate, the cold plates can be considered as constant temperature boundaries, while battery cells are the heat sources. The physics-informed convolutional neural network served as a surrogate model to estimate the temperature distribution of the battery pack. The loss function was constructed considering the heat conduction equation based on the finite difference method. The physics-informed loss function helped the convergence of the training process with less data. As a result, the physics-informed convolutional neural network showed more than 15 percent improvement in accuracy compared to the data-driven method with the same training data.
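The finite-difference physics loss mentioned in this abstract can be illustrated with a minimal sketch. It assumes steady-state 2D heat conduction, k * laplacian(T) + q = 0, on a uniform grid with a five-point stencil. The abstract does not state the exact equation form, grid, or loss weighting, so the function name `heat_residual_loss` and the parameters `k`, `h`, and `q` are illustrative assumptions, not the authors' implementation.

```python
def heat_residual_loss(T, q, k=1.0, h=1.0):
    """Mean squared residual of k * laplacian(T) + q = 0 over interior nodes.

    T and q are equally sized 2D lists (temperature and volumetric heat source).
    Assumption: steady-state conduction on a uniform grid with spacing h.
    """
    rows, cols = len(T), len(T[0])
    total, count = 0.0, 0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Five-point finite-difference Laplacian at node (i, j).
            lap = (T[i + 1][j] + T[i - 1][j] + T[i][j + 1] + T[i][j - 1]
                   - 4.0 * T[i][j]) / (h * h)
            residual = k * lap + q[i][j]
            total += residual * residual
            count += 1
    return total / count


# Example: T = x^2 has a Laplacian of 2, so a uniform source q = -2 (with k = 1)
# satisfies the discrete equation exactly and the loss vanishes.
n = 5
T = [[float(j * j) for j in range(n)] for _ in range(n)]
q = [[-2.0] * n for _ in range(n)]
loss = heat_residual_loss(T, q)
```

In a physics-informed network, a term of this kind would typically be added to the data-fitting loss, with `T` being the predicted temperature field, so that predictions violating the conduction equation are penalized even where no labeled data exist.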
Authors:Shunjing Zhao, Hanlun Lei, Xian Shi
Abstract:
Surface temperature distribution is crucial for thermal property-based studies about irregular asteroids in our Solar System. While direct numerical simulations could model surface temperatures with high fidelity, they often take a significant amount of computational time, especially for problems where temperature distributions are required to be repeatedly calculated. To this end, deep operator neural network (DeepONet) provides a powerful tool due to its high computational efficiency and generalization ability. In this work, we applied DeepONet to the modelling of asteroid surface temperatures. Results show that the trained network is able to predict temperature with an accuracy of ~1% on average, while the computational cost is five orders of magnitude lower, hence enabling thermal property analysis in a multidimensional parameter space. As a preliminary application, we analyzed the orbital evolution of asteroids through direct N-body simulations embedded with instantaneous Yarkovsky effect inferred by DeepONet-based thermophysical modelling. Taking asteroids (3200) Phaethon and (89433) 2001 WM41 as examples, we show the efficacy and efficiency of our AI-based approach.
Authors:E. García-Macías, Z. D. Harris, E. Martínez-Pañeda
Abstract:
We present TDS Simulator, a new software tool aimed at modelling thermal desorption spectroscopy (TDS) experiments. TDS is a widely used technique for quantifying key characteristics of hydrogen-material interactions, such as diffusivity and trapping. However, interpreting the output of TDS experiments is non-trivial and requires appropriate post-processing tools. This work introduces the first software tool capable of simulating TDS curves for arbitrary choices of material parameters and hydrogen trap characteristics, using the primary hydrogen diffusion and trapping models (Oriani, McNabb-Foster). Moreover, TDS Simulator contains a specific functionality for loading experimental TDS data and conducting the inverse calibration of a selected transport model, providing automatic estimates of the density and binding energy of each hydrogen trap type in the material. In its first version, TDS Simulator is provided as a MATLAB App, which is made freely available to the community and provides a simple graphical user interface (GUI) to make use of TDS Simulator straightforward. As reported in the present manuscript, the outputs of TDS Simulator have been extensively validated against literature data. Demonstrations of automatic determination of trap characteristics from experimental data through the optimisation tool are also provided. The present work enables an efficient and straightforward characterisation of hydrogen-material characteristics relevant to multiple applications, from nuclear fusion to the development of hydrogen-compatible materials for the hydrogen economy. TDS Simulator can be downloaded from https://mechmat.web.ox.ac.uk/codes.
Authors:Xiaoqi Ling, Cheng Cai, Demin Kong, Zhisheng Wei, Jing Wu, Lei Wang, Zhaohong Deng
Abstract:
Computational protein design (CPD) refers to the use of computational methods to design proteins. Traditional methods relying on energy functions and heuristic algorithms for sequence design are inefficient and do not meet the demands of the big data era in biomolecules, with their accuracy limited by the energy functions and search algorithms. Existing deep learning methods are constrained by the learning capabilities of the networks, failing to extract effective information from sparse protein structures, which limits the accuracy of protein design. To address these shortcomings, we developed an Efficient attention-based Models for Computational Protein Design using amino acid microenvironment (EMOCPD). It aims to predict the category of each amino acid in a protein by analyzing the three-dimensional atomic environment surrounding the amino acids, and optimize the protein based on the predicted high-probability potential amino acid categories. EMOCPD employs a multi-head attention mechanism to focus on important features in the sparse protein microenvironment and utilizes an inverse residual structure to optimize the network architecture. The proposed EMOCPD achieves over 80% accuracy on the training set and 68.33% and 62.32% accuracy on two independent test sets, respectively, surpassing the best comparative methods by over 10%. In protein design, the thermal stability and protein expression of the predicted mutants from EMOCPD show significant improvements compared to the wild type, effectively validating EMOCPD's potential in designing superior proteins. Furthermore, the predictions of EMOCPD are influenced positively, negatively, or have minimal impact based on the content of the 20 amino acids, categorizing amino acids as positive, negative, or neutral. Research findings indicate that EMOCPD is more suitable for designing proteins with lower contents of negative amino acids.
Authors:Weidong Wu, Yong Zhang, Lili Hao, Yang Chen, Xiaoyan Sun, Dunwei Gong
Abstract:
Physics-Informed Neural Operators provide efficient, high-fidelity simulations for systems governed by partial differential equations (PDEs). However, most existing studies focus only on multi-scale, multi-physics systems within a single spatial region, neglecting the case with multiple interconnected sub-regions, such as gas and thermal systems. To address this, this paper proposes a Physics-Informed Partitioned Coupled Neural Operator (PCNO) to enhance the simulation performance of such networks. Compared to the existing Fourier Neural Operator (FNO), this method designs a joint convolution operator within the Fourier layer, enabling global integration capturing all sub-regions. Additionally, grid alignment layers are introduced outside the Fourier layer to help the joint convolution operator accurately learn the coupling relationship between sub-regions in the frequency domain. Experiments on gas networks demonstrate that the proposed operator not only accurately simulates complex systems but also shows good generalization and low model complexity.
Authors:Mikhail Khrenov, Moon Tan, Lauren Fitzwater, Michelle Hobdari, Sneha Prabha Narra
Abstract:
Metal additive manufacturing (AM) opens the possibility for spatial control of as-fabricated microstructure and properties. However, since the solid state diffusional transformations that drive microstructure outcomes are governed by nonlinear ODEs in terms of temperature, which is itself governed by PDEs over the entire part domain, solving for the system inputs needed to achieve desired microstructure distributions has proven difficult. In this work, we present a trajectory optimization approach for spatial control of microstructure in metal AM, which we demonstrate by controlling the hardness of a low-alloy steel in electron beam powder bed fusion (EB-PBF). To this end, we present models for thermal and microstructural dynamics. Next, we use experimental data to identify the parameters of the microstructure transformation dynamics. We then pose spatial microstructure control as a finite-horizon optimal control problem. The optimal power field trajectory is computed using an augmented Lagrangian differential dynamic programming (AL-DDP) method with GPU acceleration. The resulting time-varying power fields are then realized on an EB-PBF machine through an approximation scheme. Measurements of the resultant hardness show that the optimized power field trajectory is able to closely produce the desired hardness distribution.
Authors:Arash Baharvandi, Duong Tung Nguyen
Abstract:
This paper examines the integrated generation and transmission expansion planning problem to address the growing challenges associated with increasing power network loads. The proposed approach optimizes the operation and investment costs for new generation units and transmission lines, while also considering the environmental benefits of integrating renewable energy sources (RES) and the impact of electric vehicle (EV) charging on the grid. The inherent uncertainties in demand, EV charging loads, and RES generation are managed using a hybrid stochastic-robust optimization approach. Additionally, the model integrates Dynamic Thermal Line Rating (DTLR) to improve the efficiency and resilience of transmission lines. The framework also tackles the uncertainty related to DTLR, incorporating a heuristic linearization technique to reduce model complexity. The effectiveness of the proposed model and techniques is evaluated through simulations conducted on two case studies: the modified IEEE 6-bus system and the IEEE 24-bus Reliability Test System.
Authors:Alessandro Montanari, Ashok Thangarajan, Khaldoon Al-Naimi, Andrea Ferlini, Yang Liu, Ananta Narayanan Balaji, Fahim Kawsar
Abstract:
Sensory earables have evolved from basic audio enhancement devices into sophisticated platforms for clinical-grade health monitoring and wellbeing management. This paper introduces OmniBuds, an advanced sensory earable platform integrating multiple biosensors and onboard computation powered by a machine learning accelerator, all within a real-time operating system (RTOS). The platform's dual-ear symmetric design, equipped with precisely positioned kinetic, acoustic, optical, and thermal sensors, enables highly accurate and real-time physiological assessments. Unlike conventional earables that rely on external data processing, OmniBuds leverage real-time onboard computation to significantly enhance system efficiency, reduce latency, and safeguard privacy by processing data locally. This capability includes executing complex machine learning models directly on the device. We provide a comprehensive analysis of OmniBuds' design, hardware and software architecture demonstrating its capacity for multi-functional applications, accurate and robust tracking of physiological parameters, and advanced human-computer interaction.
Authors:Yifu Ding, Jansen Wong, Serena Patel, Dharik Mallapragada, Guiyan Zang, Robert Stoner
Abstract:
India aims to achieve net-zero emissions by 2070 and has set an ambitious target of 500 GW of renewable power generation capacity by 2030. Coal plants contributed more than 60% of India's electricity generation in 2022. Upgrading and decarbonizing high-emission coal plants has become a pressing energy issue. A key technical parameter for coal plants is the operating station heat rate (SHR), which represents the thermal efficiency of a coal plant. Yet, the operating SHR of Indian coal plants varies and is not comprehensively documented. This study extends several existing databases and creates an SHR dataset for 806 Indian coal plant units using machine learning (ML), presenting the most comprehensive coverage to date. Additionally, it incorporates environmental factors such as water stress risk and coal prices as prediction features to improve accuracy. This dataset, easily downloadable from our visualization platform, could inform energy and environmental policies for India's coal power generation as the country transitions towards its renewable energy targets.
Authors:Dhruv Suri, Praneet Dutta, Flora Xue, Ines Azevedo, Ravi Jain
Abstract:
As Chile's electric power sector advances toward a future powered by renewable energy, accurate forecasting of renewable generation is essential for managing grid operations. The integration of renewable energy sources is particularly challenging due to the operational difficulties of managing their power generation, which is highly variable compared to fossil fuel sources, delaying the availability of clean energy. To mitigate this, we quantify the impact of increasing intermittent generation from wind and solar on thermal power plants in Chile and introduce a hybrid wind speed forecasting methodology which combines two custom ML models for Chile. The first model is based on TiDE, an MLP-based ML model for short-term forecasts, and the second is based on a graph neural network, GraphCast, for medium-term forecasts up to 10 days. Our hybrid approach outperforms the most accurate operational deterministic systems by 4-21% for short-term forecasts and 5-23% for medium-term forecasts and can directly lower the impact of wind generation on thermal ramping, curtailment, and system-level emissions in Chile.
Authors:Eléa Prat, Pierre Pinson, Richard M. Lusby, Riwal Plougonven, Jordi Badosa, Philippe Drobinski
Abstract:
As seasonal thermal energy storage emerges as an efficient solution to reduce CO2 emissions of buildings, challenges appear related to its optimal operation. In a system including short-term electricity storage, long-term heat storage, and where electricity and heat networks are connected through a heat pump, it becomes crucial to operate the system on two time scales. Based on real data from a university building, we simulate the operation of such a system over a year, comparing different strategies based on model predictive control (MPC). The first objective of this paper is to determine the minimum prediction horizon to retrieve the results of the full-horizon operation problem with cost minimization. The second objective is to evaluate a method that combines MPC with setting targets on the heat storage level at the end of the prediction horizon, based on historical data. For a prediction horizon of 6 days, the suboptimality gap with the full-horizon results is 4.31%, compared to 11.42% when using a prediction horizon of 42 days and fixing the final level to be equal to the initial level, which is a common approach.
Authors:Tim Hageman, Jessica Mejía, Ravindra Duddu, Emilio Martínez-Pañeda
Abstract:
Full thickness crevasses can transport water from the glacier surface to the bedrock where high water pressures can open kilometre-long cracks along the basal interface, which can accelerate glacier flow. We present a first computational modelling study that describes time-dependent fracture propagation in an idealised glacier causing rapid supraglacial lake drainage. A novel two-scale numerical method is developed to capture the elastic and viscoelastic deformations of ice along with crevasse propagation. The fluid-conserving thermo-hydro-mechanical model incorporates turbulent fluid flow and accounts for melting/refreezing in fractures. Applying this model to observational data from a 2008 rapid lake drainage event indicates that viscous deformation exerts a much stronger control on hydrofracture propagation compared to thermal effects. This finding contradicts the conventional assumption that elastic deformation is adequate to describe fracture propagation in glaciers over short timescales (minutes to several hours) and instead demonstrates that viscous deformation must be considered to reproduce observations of lake drainage rate and local ice surface elevation change. As supraglacial lakes continue expanding inland and as Greenland Ice Sheet temperatures become warmer than -8 °C, our results suggest rapid lake drainages are likely to occur without refreezing, which has implications for the rate of sea level rise.
Authors:Savvas Panagi, Chrysovalantis Spanias, Petros Aristidou
Abstract:
The growing electrification of transportation and heating through Electric Vehicles (EVs) and Heat Pumps (HPs) introduces both flexibility and complexity to Active Distribution Networks (ADNs). These resources provide substantial operational flexibility but also create tightly coupled thermal-electrical dynamics that challenge conventional network management. This paper proposes a unified co-optimization framework that integrates a calibrated 3R2C grey-box building thermal model into a network-constrained Optimal Power Flow (OPF). The framework jointly optimizes EVs, HPs, and photovoltaic systems while explicitly enforcing thermal comfort, Distributed Energy Resource (DER) limits, and full power flow physics. To maintain computational tractability, Second-Order Cone Programming (SOCP) relaxations are evaluated on a realistic low-voltage feeder. The analysis shows that, despite network heterogeneity violating some theoretical exactness conditions, the relaxation remains exact in practice. Comparative assessments of convex DistFlow, bus injection, and branch flow formulations reveal that convex DistFlow achieves sub-second runtimes and near-optimal performance even at high DER penetration levels. Simulations confirm the effectiveness of coordinated scheduling, yielding reductions of 41% in transformer aging, 54% in losses, and complete elimination of voltage violations, demonstrating the value of integrated thermal-electrical coordination in future smart grids.
Authors:Philip Tobuschat, Simon Duenser, Markus Bambach, Ivo Aschwanden
Abstract:
Researchers have identified various sources of tool positioning errors for articulated industrial robots and have proposed dedicated compensation strategies. However, these typically require individual, specialized experiments with separate models and identification procedures. This article presents a unified approach to the static calibration of industrial robots that identifies a robot model, including geometric and non-geometric effects (compliant bending, thermal deformation, gear transmission errors), using only a single, straightforward experiment for data collection. The model augments the kinematic chain with virtual joints for each modeled effect and realizes the identification using Gauss-Newton optimization with analytic gradients. Fisher information spectra show that the estimation is well-conditioned and the parameterization near-minimal, whereas systematic temporal cross-validation and model ablations demonstrate robustness of the model identification. The resulting model is very accurate and its identification robust, achieving a mean position error of 26.8 μm on a KUKA KR30 industrial robot compared to 102.3 μm for purely geometric calibration.
Authors:Zhaoqi Su, Shihai Chen, Xinyan Lin, Liqin Huang, Zhipeng Su, Xiaoqiang Lu
Abstract:
Multi-modal scene reconstruction integrating RGB and thermal infrared data is essential for robust environmental perception across diverse lighting and weather conditions. However, extending 3D Gaussian Splatting (3DGS) to multi-spectral scenarios remains challenging. Current approaches often struggle to fully leverage the complementary information of multi-modal data, typically relying on mechanisms that either tend to neglect cross-modal correlations or leverage shared representations that fail to adaptively handle the complex structural correlations and physical discrepancies between spectrums. To address these limitations, we propose ThermoSplat, a novel framework that enables deep spectral-aware reconstruction through active feature modulation and adaptive geometry decoupling. First, we introduce a Cross-Modal FiLM Modulation mechanism that dynamically conditions shared latent features on thermal structural priors, effectively guiding visible texture synthesis with reliable cross-modal geometric cues. Second, to accommodate modality-specific geometric inconsistencies, we propose a Modality-Adaptive Geometric Decoupling scheme that learns independent opacity offsets and executes an independent rasterization pass for the thermal branch. Additionally, a hybrid rendering pipeline is employed to integrate explicit Spherical Harmonics with implicit neural decoding, ensuring both semantic consistency and high-frequency detail preservation. Extensive experiments on the RGBT-Scenes dataset demonstrate that ThermoSplat achieves state-of-the-art rendering quality across both visible and thermal spectrums.
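The general idea behind FiLM (feature-wise linear modulation) named in this abstract is a per-channel scale and shift predicted from a conditioning signal. The sketch below shows only that generic operation. The dimensions, the weights `W_gamma` and `W_beta`, and the use of single dense layers are hypothetical choices for illustration and are not taken from ThermoSplat.

```python
def film(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift each channel."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]


def linear(x, weights, bias):
    """Plain dense layer, used here to predict gamma or beta from the prior."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


# Hypothetical 2-dim thermal prior conditioning a 3-channel shared feature.
thermal_prior = [0.5, -1.0]
shared_feature = [1.0, 2.0, 3.0]
W_gamma = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_beta = [[0.0, 0.0], [2.0, 0.0], [0.0, -1.0]]

gamma = linear(thermal_prior, W_gamma, [1.0, 1.0, 1.0])  # [1.5, 0.0, 0.5]
beta = linear(thermal_prior, W_beta, [0.0, 0.0, 0.0])    # [0.0, 1.0, 1.0]
modulated = film(shared_feature, gamma, beta)            # [1.5, 1.0, 2.5]
```

Because `gamma` and `beta` depend on the thermal prior, the same shared feature is rescaled differently for different thermal inputs, which is what "conditioning" means in this context.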
Authors:Jingwei Dong, André M. H. Teixeira
Abstract:
This paper studies the design of detection observers against stealthy bias injection attacks in stochastic linear systems under Gaussian noise, considering adversaries that exploit noise and inject crafted bias signals into a subset of sensors in a slow and coordinated manner, thereby achieving malicious objectives while remaining stealthy. To address such attacks, we formulate the observer design as a max-min optimization problem to enhance the detectability of worst-case BIAs, which attain a prescribed attack impact with the least detectability evaluated via Kullback-Leibler divergence. To reduce the computational complexity of the derived non-convex design problem, we consider the detectability of worst-case BIAs at three specific time instants: attack onset, one step after attack occurrence, and the steady state. We prove that the Kalman filter is optimal for maximizing the BIA detectability at the attack onset, regardless of the subset of attacked sensors. For the one-step and steady-state cases, the observer design problems are approximated by bi-convex optimization problems, which can be efficiently solved using alternating optimization and alternating direction method of multipliers. Moreover, more tractable linear matrix inequality relaxations are developed. Finally, the effectiveness of the proposed stealth-aware detection framework is demonstrated through an application to a thermal system.
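To make the detectability measure concrete, the sketch below evaluates the Kullback-Leibler divergence between two scalar Gaussians, standing in for a detector residual with and without a bias. The scalar setting and the numerical values are simplifying assumptions for illustration; the paper treats multi-sensor stochastic linear systems and the observer design itself is not reproduced here.

```python
import math


def kl_gaussian(mu1, sigma1, mu0, sigma0):
    """KL( N(mu1, sigma1^2) || N(mu0, sigma0^2) ) for scalar Gaussians."""
    return (math.log(sigma0 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu0) ** 2) / (2.0 * sigma0 ** 2)
            - 0.5)


# Nominal residual ~ N(0, 1); a bias shifts the mean and keeps the variance.
no_attack = kl_gaussian(0.0, 1.0, 0.0, 1.0)   # 0.0
small_bias = kl_gaussian(0.5, 1.0, 0.0, 1.0)  # 0.125
large_bias = kl_gaussian(2.0, 1.0, 0.0, 1.0)  # 2.0
```

With equal variances the divergence reduces to (mean shift)^2 / (2 * sigma^2), so a small, slowly injected bias yields a low divergence and is hard to detect, which is the stealthiness trade-off the abstract describes.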
Authors:Władysław Skarbek, Michał Salomonowicz, Michał Król
Abstract:
Estimating the position and orientation of a camera with respect to an observed scene is one of the central problems in computer vision, particularly in the context of camera calibration and multi-sensor systems. This paper addresses the planar Perspective--$n$--Point problem, with special emphasis on the initial estimation of the pose of a calibration object. As a solution, we propose the \texttt{PnP-ProCay78} algorithm, which combines the classical quadratic formulation of the reconstruction error with a Cayley parameterization of rotations and least-squares optimization. The key component of the method is a deterministic selection of starting points based on an analysis of the reconstruction error for two canonical vectors, allowing costly solution-space search procedures to be avoided. Experimental validation is performed using data acquired also from high-resolution RGB cameras and very low-resolution thermal cameras in an integrated RGB--IR setup. The results demonstrate that the proposed algorithm achieves practically the same projection accuracy as optimal \texttt{SQPnP} and slightly higher than \texttt{IPPE}, both prominent \texttt{PnP-OpenCV} procedures. However, \texttt{PnP-ProCay78} maintains a significantly simpler algorithmic structure. Moreover, the analysis of optimization trajectories in Cayley space provides an intuitive insight into the convergence process, making the method attractive also from a didactic perspective. Unlike existing PnP solvers, the proposed \texttt{PnP-ProCay78} algorithm combines projection error minimization with an analytically eliminated reconstruction-error surrogate for translation, yielding a hybrid cost formulation that is both geometrically transparent and computationally efficient.
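The Cayley parameterization referred to in this abstract maps an unconstrained 3-vector to a rotation matrix without trigonometric functions. The sketch below uses the standard closed form; the function name `cayley_to_rotation` is an assumption, and the way \texttt{PnP-ProCay78} embeds this map in its cost function is not shown in the abstract and is not reproduced here.

```python
def cayley_to_rotation(c):
    """Rotation matrix for Cayley parameters c = (x, y, z).

    Equivalent to R = (I - S)^-1 (I + S), where S is the skew-symmetric
    cross-product matrix of c, and c = tan(theta / 2) * axis.
    """
    x, y, z = c
    s = 1.0 + x * x + y * y + z * z
    return [
        [(1 + x*x - y*y - z*z) / s, 2 * (x*y - z) / s, 2 * (x*z + y) / s],
        [2 * (x*y + z) / s, (1 - x*x + y*y - z*z) / s, 2 * (y*z - x) / s],
        [2 * (x*z - y) / s, 2 * (y*z + x) / s, (1 - x*x - y*y + z*z) / s],
    ]


# c = (0, 0, 1) means tan(theta / 2) = 1, i.e. a 90-degree turn about z.
R = cayley_to_rotation((0.0, 0.0, 1.0))
```

Every real 3-vector yields a valid rotation, so a least-squares optimizer can work in these three parameters without orthogonality constraints. The known limitation is that rotations by exactly 180 degrees correspond to parameters at infinity.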
Authors:Abdullah Jirjees, Ryan Myers, Muhammad Haris Ikram, Mohamed H. Zaki
Abstract:
Detecting vulnerable road users (VRUs), particularly children and adolescents, in low light and adverse weather conditions remains a critical challenge in computer vision, surveillance, and autonomous vehicle systems. This paper presents a purpose-built lightweight object detection model designed to identify young pedestrians in various environmental scenarios. To address these challenges, our approach leverages thermal imaging from long-wave infrared (LWIR) cameras, which enhances detection reliability in conditions where traditional RGB cameras operating in the visible spectrum fail. Based on the YOLO11 architecture and customized for thermal detection, our model, termed LTV-YOLO (Lightweight Thermal Vision YOLO), is optimized for computational efficiency, accuracy and real-time performance on edge devices. By integrating depthwise separable convolutions and a feature pyramid network (FPN), LTV-YOLO achieves strong performance in detecting small-scale, partially occluded, and thermally distinct VRUs while maintaining a compact architecture. This work contributes a practical and scalable solution to improve pedestrian safety in intelligent transportation systems, particularly in school zones, autonomous navigation, and smart city infrastructure. Unlike prior thermal detectors, our contribution is task-specific: a thermal-only, edge-capable design for young and small VRUs (children and distant adults). Although FPN and depthwise separable convolutions are standard components, their integration into a thermal-only pipeline optimized for short/occluded VRUs under adverse conditions is, to the best of our knowledge, novel.
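Why depthwise separable convolutions keep a detector compact can be seen from a weight count. The sketch below compares a standard convolution with its depthwise separable counterpart. The layer sizes are hypothetical and not taken from LTV-YOLO, and bias terms are ignored for simplicity.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k convolution mapping c_in to c_out channels."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
full = standard_conv_params(3, 64, 128)             # 73728
separable = depthwise_separable_params(3, 64, 128)  # 8768
reduction = full / separable                        # about 8.4x fewer weights
```

The saving grows with the number of output channels and the kernel size, which is why this factorization is common in models targeting edge devices.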
Authors:Ju-Hong Oh, Seon-In Kim, Eui-Jong Kim
Abstract:
Thermal energy storage (TES) systems coupled with heat pumps offer significant potential for improving building energy efficiency by shifting electricity demand to off-peak hours. However, conventional operating strategies maintain conservatively low chilled water temperatures throughout the cooling season, a practice that results in suboptimal heat pump performance. This study proposes a physics-based integrated simulation framework to determine the maximum feasible chilled water supply temperature while ensuring cooling stability. The framework integrates four sub-models: relative humidity prediction, dynamic cooling load estimation, cooling coil performance prediction, and TES discharge temperature prediction. Validation against measured data from an office building demonstrates reliable accuracy across all sub-models (e.g., CVRMSE of 9.3% for cooling load and $R^2$ of 0.91 for peak-time discharge temperature). The integrated simulation reveals that the proposed framework can increase the daily initial TES charging temperature by an average of 2.55 °C compared to conventional fixed-temperature operation, enabling the heat pump to operate at a higher coefficient of performance. This study contributes a practical methodology for optimizing TES charging temperatures in building heating, ventilation, and air conditioning (HVAC) systems while maintaining indoor setpoint temperatures.
Authors:Diogo D. Carvalho, Pablo J. Bilbao, Warren B. Mori, Luis O. Silva, E. Paulo Alves
Abstract:
We propose a methodology to infer collision operators from phase space data of plasma dynamics. Our approach combines a differentiable kinetic simulator, whose core component in this work is a differentiable Fokker-Planck solver, with a gradient-based optimisation method to learn the collisional operators that best describe the phase space dynamics. We test our method using data from two-dimensional Particle-in-Cell simulations of spatially uniform thermal plasmas, and learn the collision operator that captures the self-consistent electromagnetic interaction between finite-size charged particles over a wide variety of simulation parameters. We demonstrate that the learned operators are more accurate than alternative estimates based on particle tracks, while making no prior assumptions about the relevant time-scales of the processes and significantly reducing memory requirements. We find that the retrieved operators, obtained in the non-relativistic regime, are in excellent agreement with theoretical predictions derived for electrostatic scenarios. Our results show that differentiable simulators offer a powerful and computationally efficient approach to infer novel operators for a wide range of problems, such as electromagnetically dominated collisional dynamics and stochastic wave-particle interactions.
Authors:Giuseppe Romano, Rodrigo Arrieta, Steven G. Johnson
Abstract:
A key challenge in topology optimization (TopOpt) is that manufacturable structures, being inherently binary, are non-differentiable, creating a fundamental tension with gradient-based optimization. The subpixel-smoothed projection (SSP) method addresses this issue by smoothing sharp interfaces at the subpixel level through a first-order expansion of the filtered field. However, SSP does not guarantee differentiability under topology changes, such as the merging of two interfaces, and therefore violates the convergence guarantees of many popular gradient-based optimization algorithms. We overcome this limitation by regularizing SSP with the Hessian of the filtered field, resulting in a twice-differentiable projected density during such transitions, while still guaranteeing an almost-everywhere binary structure. We demonstrate the effectiveness of our second-order SSP (SSP2) methodology on both thermal and photonic problems, showing that SSP2 has faster convergence than SSP for connectivity-dominant cases -- where frequent topology changes occur -- while exhibiting comparable performance otherwise. Beyond improving convergence guarantees for CCSA optimizers, SSP2 enables the use of a broader class of optimization algorithms with stronger theoretical guarantees, such as interior-point methods. Since SSP2 adds minimal complexity relative to SSP or traditional projection schemes, it can be used as a drop-in replacement in existing TopOpt codes.
Authors:Ming Liu, Yosuke Hasegawa
Abstract:
Conjugate scalar transport with interfacial jump conditions on complex interfacial geometries is common in thermal and chemical processes, but simulating it accurately and efficiently remains challenging. In the present study, we develop a novel treatment of a two-phase interface in the volume penalization method, a kind of immersed boundary method, for solving conjugate scalar transport with general interfacial boundary conditions. We first propose an interfacial treatment for solving an advection-diffusion equation with a Neumann boundary condition, and then extend it to general conjugate scalar transport with both interfacial flux and scalar jumps. A one-dimensional diffusion problem is solved to verify the present scheme and to demonstrate its advantages in improving accuracy and in unifying the governing equations in the two phases with an additional source term representing the local jump condition of the interfacial scalar flux. Then, the present scheme is further applied to fluid-solid coupled scalar diffusion and advection-diffusion problems with jumps in the scalar and its flux across the interface. The simulation results of the present scheme generally show good agreement with reference results obtained by body-fitted mesh simulations, with average relative deviations of less than 3.0%.
Authors:Yerzhan Mustafa, Selçuk Köse
Abstract:
Interface circuits are the key components that enable the hybrid integration of superconductor and semiconductor digital electronics. The design requirements of superconductor-semiconductor interface circuits vary depending on the application, such as high-performance classical computing, superconducting quantum computing, and digital signal processing. In this survey, various interface circuits are categorized based on the working principle and structure. The superconducting output drivers are explored, which are capable of converting and amplifying, e.g., single flux quantum (SFQ) voltage pulses, to voltage levels that semiconductor circuits can process. Several trade-offs between circuit- and system-level design parameters are examined. Accordingly, parameters such as the data rate, output voltage, power dissipation, layout area, thermal/heat load of cryogenic cables, and bit-error rate are considered.
Authors:Nardos Belay Abera, Yize Chen
Abstract:
AI datacenters are currently being deployed on a large scale to support the training and deployment of power-intensive large language models (LLMs). The extensive computation and cooling required in these datacenters raise concerns about their energy use and carbon emissions. Although the current state of the art has examined the energy efficiency of LLM inference, most prior research has focused on optimizing compute-side scheduling without considering thermal objectives or constraints. Since GPU-intensive inference generates substantial heat that can degrade datacenter performance, ignoring thermal effects can increase total energy consumption and reduce the efficiency of LLM serving. To fill this gap, we profile the characteristics of GPU servers under varying cooling conditions and AI jobs, and develop a joint cooling and computing modeling approach for AI datacenters. Built upon such workload and thermal dynamics models, a novel hierarchical control framework is proposed to co-optimize computing and thermal management by identifying the optimal GPU parallelism, frequency (DVFS), and cooling control knobs. Using real Azure inference traces and detailed GPU profiling, our model balances serving latency and thermal constraints in AI datacenters while significantly improving their energy efficiency.
Authors:Shreyas Rajeev, Karthik Mudenahalli Ashoka, Amit Mallappa Tiparaddi
Abstract:
Accurately forecasting long-term atmospheric variables remains a defining challenge in meteorological science due to the chaotic nature of atmospheric systems. Temperature data represents a complex superposition of deterministic cyclical climate forces and stochastic, short-term fluctuations. While planetary mechanics drive predictable seasonal periodicities, rapid meteorological changes such as thermal variations, pressure anomalies, and humidity shifts introduce nonlinear volatilities that defy simple extrapolation. Historically, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model has been the standard for modeling historical weather data, prized for capturing linear seasonal trends. However, SARIMA operates under strict assumptions of stationarity, failing to capture abrupt, nonlinear transitions. This leads to systematic residual errors, manifesting as the under-prediction of sudden spikes or the over-smoothing of declines. Conversely, Deep Learning paradigms, specifically Long Short-Term Memory (LSTM) networks, demonstrate exceptional efficacy in handling intricate time-series data. By utilizing memory gates, LSTMs learn complex nonlinear dependencies. Yet, LSTMs face instability in open-loop forecasting; without ground truth feedback, minor deviations compound recursively, causing divergence. To resolve these limitations, we propose a Hybrid SARIMA-LSTM architecture. This framework employs a residual-learning strategy to decompose temperature into a predictable climate component and a nonlinear weather component. The SARIMA unit models the robust, long-term seasonal trend, while the LSTM is trained exclusively on the residuals, i.e., the nonlinear errors that SARIMA fails to capture. By fusing statistical stability with neural plasticity, this hybrid approach minimizes error propagation and enhances long-horizon accuracy.
Authors:Sonia Yeh, Rishabh Ghotge, Yujia Shi, Luka de Koe
Abstract:
We introduce a stressor-agnostic Resilience Key Performance Indicator (Resilience KPI) for megawatt charging stations (MSC) serving heavy-duty vehicles. Beyond routine performance statistics (e.g., availability, throughput), the KPI quantifies a site's ability to anticipate, operate under degradation, and recover from disruptions using observable signals already in the framework: ride-through capability, restoration speed, service under N-1, expected unserved charging energy, and queue impacts. The headline score is normalised to 0-100 for fair cross-site and cross-vendor benchmarking, with optional stressor-specific breakouts (grid, ICT, thermal, flooding, on-site incidents) for diagnostics and robustness checks. DATEX II provides a solid baseline for resilience KPIs centred on infrastructure inventory, status, and pricing, while additional KPIs, especially around grid capacity, on-site flexibility, heavy-vehicle geometry, environmental hardening, maintenance, and market exposure, are essential for a complete resilience picture and will require extensions or complementary data sources. The KPI is designed for monthly/quarterly reporting to support design and operational decisions and cost-benefit assessment of mitigations (e.g., backup power, spares, procedures). It offers a consistent, transparent methodology that consolidates heterogeneous logs and KPIs into a single, auditable indicator, making resilience comparable across sites, vendors, and jurisdictions.
Authors:Kun-Chih Chen, Chia-Hsin Chen, Lei-Qi Wang, Chun-Chieh Wang
Abstract:
This paper addresses the challenges of thermal sensor allocation and full-chip temperature reconstruction in multi-core systems by leveraging an entropy-based sensor placement strategy and an adaptive compressive sensing approach. By selecting sensor locations that capture diverse thermal behaviors and dynamically adjusting the measurement matrix, our method significantly enhances the accuracy of the full-chip temperature reconstruction. Experimental results demonstrate that our approach reduces full-chip temperature reconstruction error by 18% to 95%. In addition to the full-chip temperature reconstruction efficiency enhancement, our proposed method improves hardware efficiency by 5% to 514% over the related works. These findings highlight the potential of our method for more effective dynamic temperature management in future high-performance multi-core systems.
Authors:Jidu Yu, Jidong Zhao
Abstract:
In the Material Point Method (MPM), accurately imposing Neumann-type thermal boundary conditions, particularly convective heat flux boundaries, remains a significant challenge due to the inherent nonconformity between complex evolving material boundaries and the fixed background grid. This paper introduces a novel Virtual Heat Flux Method (VHFM) to overcome this limitation. The core idea is to construct a virtual flux field on an auxiliary domain surrounding the physical boundary, which exactly satisfies the prescribed boundary condition. This transforms the surface integral in the weak form into an equivalent, and easily computed, volumetric integral. Consequently, VHFM eliminates the need for explicit boundary tracking, specialized boundary particles, or complex surface reconstruction. A unified formulation is presented, demonstrating the method's straightforward extension to general scalar, vector, and tensor Neumann conditions. The accuracy, robustness, and convergence of VHFM are rigorously validated through a series of numerical benchmarks, including 1D transient analysis, 2D and 3D curved boundaries, and problems with large rotations and complex moving geometries. The results show that VHFM achieves accuracy comparable to conforming node-based imposition and significantly outperforms conventional particle-based approaches. Its simplicity, computational efficiency, and robustness make it an attractive solution for integrating accurate thermal boundary conditions into thermo-mechanical and other multiphysics MPM frameworks.
Authors:Lauritz Zendel, Chiara Springer, Frank Dammel, Peter Stephan
Abstract:
Latent Thermal Energy Storages (LTES) can store thermal energy in a narrow temperature range. Therefore, they are favorable for integration into Rankine-based Carnot Batteries. For the design of such systems, simulations based on accurate models are desirable. However, physical phenomena such as natural convection in LTES units cannot be modeled directly in transient system models. Simplified models are required. Therefore, the objective of this work is to derive simplified LTES unit models for use in system models. In transient simulations the state of charge of the LTES influences its temperature profile. The temperature profile depends on the geometry of the LTES unit. Therefore, the geometry must be considered to model the transient behavior of an LTES unit. The LTES unit under investigation has a shell and tube heat exchanger structure. The phase change material (PCM) is located between the hexagonal fins and in the space between the finned tubes. Aluminum fins are used. They have a high thermal conductivity and thus compensate for the low thermal conductivity of the sodium nitrate used as PCM. The interaction between fins and PCM is complex. Therefore, a numerical approach can be used to gain insight into the behavior of the LTES unit. To transfer the results of a complex model to a simplified model where fins and PCM are not considered individually, the effective thermal conductivity of a single finned tube can be used to approximate the performance of the LTES unit. In this study, a model of a section with a single finned tube is developed using the COMSOL software. The effective thermal conductivity of the system is determined by varying the effective thermal conductivity in a simplified model and comparing the results with reference cases based on a complex modeling approach. The results can serve as model input for simplified system models of Carnot Batteries, among others.
Authors:Jingbo Qu, Yijie Wang, Yujie Fu, Putai Zhang, Weihan Li, Mian Li
Abstract:
Battery Energy Storage Systems (BESSs) are increasingly critical to power-system stability, yet their operation and maintenance remain dominated by reactive, expert-dependent diagnostics. While cell-level inconsistencies provide early warning signals of degradation and safety risks, the lack of scalable and interpretable decision-support frameworks prevents these signals from being effectively translated into operational actions. Here we introduce an inconsistency-driven operation and maintenance paradigm for large-scale BESSs that systematically transforms routine monitoring data into explainable, decision-oriented guidance. The proposed framework integrates multi-dimensional inconsistency evaluation with large language model-based semantic reasoning to bridge the gap between quantitative diagnostics and practical maintenance decisions. Using eight months of field data from an in-service battery system comprising 3,564 cells, we demonstrate how electrical, thermal, and aging-related inconsistencies can be distilled into structured operational records and converted into actionable maintenance insights through a multi-agent framework. The proposed approach enables accurate and explainable responses to real-world operation and maintenance queries, reducing response time and operational cost by over 80% compared with conventional expert-driven practices. These results establish a scalable pathway for intelligent operation and maintenance of battery energy storage systems, with direct implications for reliability, safety, and cost-effective integration of energy storage into modern power systems.
Authors:Shrenik Jadhav, Zheng Liu
Abstract:
Effective data center cooling is crucial for reliable operation; however, cooling systems often exhibit inefficiencies that result in excessive energy consumption. This paper presents a three-stage, physics-guided machine learning framework for identifying and reducing cooling energy waste in high-performance computing facilities. Using one year of 10-minute resolution operational data from the Frontier exascale supercomputer, we first train a monotonicity-constrained gradient boosting surrogate that predicts facility accessory power from coolant flow rates, temperatures, and server power. The surrogate achieves a mean absolute error of 0.026 MW and predicts power usage effectiveness within 0.01 of measured values for 98.7% of test samples. In the second stage, the surrogate serves as a physics-consistent baseline to quantify excess cooling energy, revealing approximately 85 MWh of annual inefficiency concentrated in specific months, hours, and operating regimes. The third stage evaluates guardrail-constrained counterfactual adjustments to supply temperature and subloop flows, demonstrating that up to 96% of identified excess can be recovered through small, safe setpoint changes while respecting thermal limits and operational constraints. The framework yields interpretable recommendations, supports counterfactual analyses such as flow reduction during low-load periods and redistribution of thermal duty across cooling loops, and provides a practical pathway toward quantifiable reductions in accessory power. The developed framework is readily compatible with model predictive control and can be extended to other liquid-cooled data centers with different configurations and cooling requirements.
Authors:Sunki Hong, Jisoo Lee
Abstract:
Accurate grid load forecasting is safety-critical: under-predictions risk supply shortfalls, while symmetric error metrics can mask this operational asymmetry. We introduce an operator-legible evaluation framework -- Under-Prediction Rate (UPR), tail Reserve$_{99.5}^{\%}$ requirements, and explicit inflation diagnostics (Bias$_{24h}$/OPR) -- to quantify one-sided reliability risk beyond MAPE. Using this framework, we evaluate state space models (Mamba variants) and strong baselines on a weather-aligned California Independent System Operator (CAISO) dataset spanning Nov 2023--Nov 2025 (84,498 hourly records across 5 regional transmission areas) under a rolling-origin walk-forward backtest. We develop and evaluate thermal-lag-aligned weather fusion strategies for these architectures. Our results demonstrate that standard accuracy metrics are insufficient proxies for operational safety: models with comparable MAPE can imply materially different tail reserve requirements (Reserve$_{99.5}^{\%}$). We show that explicit weather integration narrows error distributions, reducing the impact of temperature-driven demand spikes. Furthermore, while probabilistic calibration reduces large-error events, it can induce systematic schedule inflation. We introduce Bias/OPR-constrained objectives to enable auditable trade-offs between minimizing tail risk and preventing trivial over-forecasting.
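To make the asymmetry argument in the abstract above concrete, the sketch below contrasts a symmetric metric (MAPE) with a one-sided one. It assumes that the Under-Prediction Rate is the fraction of time steps at which the forecast falls below the actual load; this definition and the function names are illustrative assumptions, not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (symmetric in sign)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def under_prediction_rate(actual, forecast):
    """Fraction of time steps where the forecast is below the actual load.

    Assumed definition for illustration only.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    under = sum(1 for a, f in zip(actual, forecast) if f < a)
    return under / len(actual)


# Two hypothetical forecasts with the same MAPE but opposite one-sided risk.
actual_load = [100.0, 100.0, 100.0, 100.0]
always_low = [95.0, 95.0, 95.0, 95.0]
always_high = [105.0, 105.0, 105.0, 105.0]
```

Under these assumptions both forecasts have a MAPE of about 5%, yet the first under-predicts at every step and the second never does, which is the kind of difference a symmetric metric cannot show.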
Authors:Alireza Moayedikia, Sattar Dorafshan
Abstract:
Deteriorating civil infrastructure requires automated inspection techniques that overcome the limitations of visual assessment. While Ground Penetrating Radar and Infrared Thermography enable subsurface defect detection, single-modal approaches face complementary constraints: radar struggles with moisture and shallow defects, while thermography exhibits weather dependency and limited depth. This paper presents a multi-modal attention network fusing radar temporal patterns with thermal spatial signatures for bridge deck delamination detection. Our architecture introduces temporal attention for radar processing, spatial attention for thermal features, and cross-modal fusion with learnable embeddings that discover complementary defect patterns invisible to individual sensors. We incorporate uncertainty quantification through Monte Carlo dropout and learned variance estimation, decomposing uncertainty into epistemic and aleatoric components for safety-critical decisions. Experiments on five bridge datasets reveal that on balanced to moderately imbalanced data, our approach substantially outperforms baselines in accuracy and AUC, a meaningful improvement over single-modal and concatenation-based fusion. Ablation studies demonstrate that cross-modal attention provides critical gains beyond within-modality attention, while multi-head mechanisms achieve improved calibration. Uncertainty quantification reduces calibration error, enabling selective prediction by rejecting uncertain cases. However, under extreme class imbalance, attention mechanisms show vulnerability to majority-class collapse. These findings provide actionable guidance: the attention-based architecture performs well across typical scenarios, while extreme imbalance requires specialized techniques. Our system maintains deployment efficiency, enabling real-time inspection with characterized capabilities and limitations.
Authors:C. Ramírez-Dolores, J. C. Zamora-Luria, J. A. Altamirano-Acosta, L. Sarao-Cruz, P. Jiménez-Palma, J. Moreno-Falconi
Abstract:
The use of Earth-Air-Water Heat Exchangers (EAWHE) for sustainable air conditioning has not been widely studied. Due to their experimental nature, methods of characterizing internal thermal air distribution impose high dependence on instrumentation by sensors and entail data acquisition and computational costs. This document presents an alternative method that estimates air temperature distribution while minimizing the need for a dense network of sensors in the experimental system. The proposed model, DARL (Data of Air and Random Length), can predict the temperature of air circulating inside EAWHEs. DARL is a significant methodological advance that integrates experimental data from boundary conditions with simulations based on pseudo-random numbers (PRNs). These PRNs are generated using Fermat's prime numbers as seeds to initialize the generator. Ordinary linear regressions and robust statistical validations, including the Shapiro-Wilk test and root mean square error, have demonstrated that the model can estimate the thermal distribution of air at different lengths with a relative error of less than 6.2%. These results demonstrate the model's efficiency, predictive capacity, and potential to reduce dependence on sensors.
Authors:Mamoru Saita, Yutaka Hori
Abstract:
Temperature is a fundamental regulator of chemical and biochemical kinetics, yet capturing nonlinear thermal effects directly from experimental data remains a major challenge due to limited throughput and model flexibility. Recent advances in machine learning have enabled flexible modeling beyond conventional physical laws, but most existing strategies remain confined to surrogate models of end-point yields rather than full kinetic dynamics. Consequently, an end-to-end framework that unifies systematic kinetic data acquisition with machine learning based modeling has been lacking. In this paper, we present a unified framework that integrates droplet microfluidics with machine learning for the systematic analysis of temperature-dependent reaction kinetics. The platform is specifically designed to enable stable immobilization and long-term time-lapse imaging of thousands of droplets under dynamic thermal gradients. This configuration yields massively parallel time-resolved datasets across diverse temperature conditions that capture transient kinetics and provides particularly suitable inputs for training machine-learning models of reaction dynamics. Leveraging these datasets, we train Neural ODE models, which embed neural networks within differential equations to flexibly represent nonlinear temperature dependencies beyond conventional formulations. We demonstrate accurate prediction of enzymatic kinetics across diverse thermal environments, highlighting the robustness and versatility of the approach. Our framework bridges high-throughput experimental data acquisition with data-driven modeling, establishing a versatile foundation for enhanced predictive ability and rational analysis and design of temperature-sensitive biochemical processes.
Authors:Agrippina Mwangi, León Navarro-Hilfiker, Lukasz Brewka, Mikkel Gryning, Elena Fumagalli, Madeleine Gibescu
Abstract:
Stochastic disruptions such as flash events arising from benign traffic bursts and switch thermal fluctuations are major contributors to intermittent service degradation in software-defined industrial networks. These events violate IEC~61850-derived quality-of-service requirements and user-defined service-level agreements, hindering the reliable and timely delivery of control, monitoring, and best-effort traffic in IEC~61400-25-compliant wind power plants. Failure to maintain these requirements often results in delayed or lost control signals, reduced operational efficiency, and increased risk of wind turbine generator downtime. To address these challenges, this study proposes a threshold-triggered Deep Q-Network self-healing agent that autonomically detects, analyzes, and mitigates network disruptions while adapting routing behavior and resource allocation in real time. The proposed agent was trained, validated, and tested on an emulated tri-clustered switch network deployed in a cloud-based proof-of-concept testbed. Simulation results show that the proposed agent improves disruption recovery performance by 53.84% compared to a baseline shortest-path and load-balanced routing approach and outperforms state-of-the-art methods, including the Adaptive Network-based Fuzzy Inference System by 13.1% and the Deep Q-Network and traffic prediction-based routing optimization method by 21.5%, in a super-spine leaf data-plane architecture. Additionally, the agent maintains switch thermal stability by proactively initiating external rack cooling when required. These findings highlight the potential of deep reinforcement learning in building resilience in software-defined industrial networks deployed in mission-critical, time-sensitive application scenarios.
Authors:Matei Drilea, Alexander Dijkshoorn, Gusthavo Ribeiro Salomão, Stefano Stramigioli, Gijs Krijnen
Abstract:
The excellent structural and piezoresistive properties of continuous carbon fiber make it suitable for both structural and sensing applications. This work studies the use of 3D printed, continuous carbon fiber reinforced beams as self-sensing structures. It is demonstrated how the sensitivity of these carbon fiber strain gauges can be increased irreversibly by means of a pretreatment, ``breaking-in'' the sensors with a large compressive bending load. The increase in the gauge factor is attributed to local progressive fiber failure, due to the combination of the thermal residual stress from the printing process and external loading. The coextrusion of conductive filament around the carbon fibers is demonstrated as a means of improving the reliability, noise and electrical connection of the sensors. A micrograph of the sensor cross section shows that the conductive filament contacts the various carbon fiber bundles. All in all, the use of ``breaking-in'' carbon fiber strain gauges in combination with coextrusion of conductive filament holds promise for 3D printed structural sensors with a high sensitivity.
Authors:Ercan Erkalkan, Vedat Topuz, Ayça Ak
Abstract:
This study introduces a lightweight perimeter tracking method designed for micro UAV teams operating over wildfire environments under limited bandwidth conditions. Thermal image frames generate coarse hot-region masks through adaptive thresholding and morphological refinement, while RGB frames contribute edge cues and suppress texture-related false detections using gradient-based filtering. A rule-level merging strategy selects boundary candidates and simplifies them via the Ramer-Douglas-Peucker algorithm. The system incorporates periodic beacons and an inertial feedback loop that maintains trajectory stability in the presence of GPS degradation. The guidance loop targets sub-50 ms latency on embedded System on Chip (SoC) platforms by constraining per-frame pixel operations and precomputing gradient tables. Small-scale simulations demonstrate reductions in average path length and boundary jitter compared to a pure edge tracking baseline, while maintaining environmental coverage measured through intersection merge analysis. Battery consumption and computational utilization confirm the feasibility of achieving 10 to 15 m/s forward motion on standard micro platforms. This approach enables rapid deployment in the field, requiring robust sensing and minimal communications for emergency reconnaissance applications.
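The boundary simplification step named in the abstract above can be sketched as follows. This is a generic, minimal implementation of Ramer-Douglas-Peucker polyline simplification, not the authors' code; the function names and the tolerance `epsilon` are illustrative, and the distance is measured to the chord segment, a common variant of the perpendicular distance to the infinite line.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rdp(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        # Every interior point is within tolerance: keep only the endpoints.
        return [first, last]
    # Otherwise split at the farthest point and simplify both halves.
    left = rdp(points[: index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point


# Usage: small wiggles are removed, the sharp peak at (4, 2) is preserved.
boundary = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (4, 2), (5, 0)]
simplified = rdp(boundary, 0.5)
```

A larger `epsilon` yields fewer vertices, which reduces the data that has to be sent over a limited-bandwidth link at the cost of boundary detail.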
Authors:Stefan Matthes, Markus Schramm
Abstract:
Parabolic trough Concentrating Solar Power (CSP) plants operate large hydraulic networks of collector loops that must deliver a uniform outlet temperature despite spatially heterogeneous optical performance, heat losses, and pressure drops. While loop temperatures are measured, loop-level mass flows and receiver heat-loss parameters are unobserved, making it impossible to diagnose hydraulic imbalances or receiver degradation using standard monitoring tools. We present a physics-informed learning framework that infers (i) loop-level mass-flow ratios and (ii) time-varying receiver heat-transfer coefficients directly from routine operational data. The method exploits nocturnal homogenization periods -- when hot oil is circulated through a non-irradiated field -- to isolate hydraulic and thermal-loss effects. A differentiable conjugate heat-transfer model is discretized and embedded into an end-to-end learning pipeline optimized using historical plant data from the 50 MW Andasol 3 solar field. The model accurately reconstructs loop temperatures (RMSE $<2^\circ$C) and produces physically meaningful estimates of loop imbalances and receiver heat losses. Comparison against drone-based infrared thermography (QScan) shows strong correspondence, correctly identifying all areas with high-loss receivers. This demonstrates that noisy real-world CSP operational data contain enough information to recover latent physical parameters when combined with appropriate modeling and differentiable optimization.
Authors:Julius Mercz, Philipp Reiss, Christian Reiter
Abstract:
For a sustained human presence on the Moon, robust in-situ resource utilisation supply chains to provide consumables and propellant are necessary. A promising process is molten salt electrolysis (MSE), which typically requires temperatures in excess of 900°C. Fission reactors do not depend on solar irradiance and are thus well suited for power generation on the Moon, especially during the 14-day lunar night. As of now, fission reactors have only been considered for electric power generation, but the reactor coolant could also be used directly to heat those processes to their required temperatures. In this work, a concept for a co-generation fission power plant on the Moon that can directly heat an MSE plant to the required temperatures and provide a surplus of electrical energy for the lunar base is presented. The neutron transport code Serpent 2 is used to model a ceramic core, gas-cooled very-high-temperature microreactor design and estimate its lifetime with a burnup simulation in hot conditions with an integrated step-wise criticality search. Calculations show a neutronically feasible operation time of at least 10 years at 100 kW thermal power. The obtained power distributions lay a basis for further thermal-hydraulic studies on the technical feasibility of the reactor design and the power plant.
Authors:Vakhtang Chulukhadze, Zihuan Liu, Ziqian Yao, Lezli Matto, Tzu-Hsuan Hsu, Nishanth Ravi, Xiaoyu Niu, Michael E. Liao, Mark S. Goorsky, Neal Hall, Ruochen Lu
Abstract:
Piezoelectric micromachined ultrasonic transducers (PMUTs) are widely used in applications that demand mechanical resilience, thermal stability, and compact form factors. Lead zirconate titanate (PZT) and aluminum nitride (AlN) active layers are used in PMUTs to enable acoustic actuation, sensing, or bidirectional operation. These platforms rely on bimorph films to maximize electromechanical coupling ($k^2$) through thin-film deposition, which uses intermediate electrode layers to establish opposing electric fields. Consequently, incumbent PMUT platforms are limited in achievable film thickness and feature material interfaces that compromise mechanical integrity and thermal performance. Combined with the intrinsic limitations of PZT and AlN, these factors motivate exploration of alternative PMUT material platforms. Recent efforts have sought to demonstrate that single-crystal lithium niobate (LN) is a promising candidate, offering substantially higher $k^2$ and bidirectional performance. Advances in LN film transfer technology have enabled the formation of periodically poled piezoelectric (P3F) LN, facilitating a bimorph stack without intermediate electrodes. In this work, we showcase bimorph PMUTs incorporating a mechanically robust, 20 micron thick P3F LN active layer. We establish the motivation for LN PMUTs through a material comparison, followed by extensive membrane geometry optimization and subsequent enhancement of the PMUT's $k^2$. We demonstrate a 775 kHz flexural mode device with a quality factor (Q) of 200 and an extracted $k^2$ of 6.4%, yielding a high transmit efficiency of 65 nm/V with a mechanically robust active layer. We leverage the high performance to demonstrate extreme-temperature resilience, showcasing stable device operation up to 600 degrees C and survival up to 900 degrees C, highlighting LN's potential as a resilient PMUT platform.
Authors:Antonio Varagnolo, Giuseppe Romano, Raphaël Pestourie
Abstract:
Designing materials with controlled heat flow at the nano-scale is central to advances in microelectronics, thermoelectrics, and energy-conversion technologies. At these scales, phonon transport follows the Boltzmann Transport Equation (BTE), which captures non-diffusive (ballistic) effects but is too costly to solve repeatedly in inverse-design loops. Existing surrogate approaches trade speed for accuracy: fast macroscopic solvers can overestimate conductivities by hundreds of percent, while recent data-driven operator learners often require thousands of high-fidelity simulations. This creates a need for a fast, data-efficient surrogate that remains reliable across ballistic and diffusive regimes. We introduce a Physics-Enhanced Deep Surrogate (PEDS) that combines a differentiable Fourier solver with a neural generator and couples it with uncertainty-driven active learning. The Fourier solver acts as a physical inductive bias, while the network learns geometry-dependent corrections and a mixing coefficient that interpolates between macroscopic and nano-scale behavior. PEDS reduces training-data requirements by up to 70% compared with purely data-driven baselines, achieves roughly 5% fractional error with only 300 high-fidelity BTE simulations, and enables efficient design of porous geometries spanning 12-85 W m$^{-1}$ K$^{-1}$ with average design errors of 4%. The learned mixing parameter recovers the ballistic-diffusive transition and improves out-of-distribution robustness. These results show that embedding simple, differentiable low-fidelity physics can dramatically increase surrogate data-efficiency and interpretability, making repeated PDE-constrained optimization practical for nano-scale thermal-materials design.
Authors:Sang woo Ham, Donghun Kim
Abstract:
The gray-box modeling approach, which uses a semi-physical thermal network model, has been widely used in building prediction applications, such as model predictive control (MPC). However, unmeasured disturbances, such as occupants, lighting, and in/exfiltration loads, make it challenging to apply this approach to practical buildings. In this study, we propose a hybrid modeling approach that integrates the gray-box model with a model for unmeasured disturbance. After reviewing several system identification approaches, we systematically designed the unmeasured disturbance model with a model selection process based on statistical tests to make it robust. We generated data based on the building model calibrated by real operational data and then trained the hybrid model for two different weather conditions. The hybrid model approach reduces RMSE by approximately 0.2-0.9 °C and 0.3-2 °C on 1-day-ahead temperature prediction compared to the conventional approach for mild (Berkeley, CA) and cold (Chicago, IL) climates, respectively. In addition, this approach was applied to experimental data obtained from a laboratory building to be used for the MPC application, showing superior prediction performance.
Authors:Mark Moussa, Andre Williams, Seth Roffe, Douglas Morton
Abstract:
Rapid and accurate wildfire detection is crucial for emergency response and environmental management. In airborne and spaceborne missions, real-time algorithms must distinguish between no fire, active fire, and post-fire conditions, and estimate fire intensity. Multispectral and hyperspectral thermal imagers provide rich spectral information, but high data dimensionality and limited onboard resources make real-time processing challenging. As wildfires increase in frequency and severity, the need for low-latency and computationally efficient onboard detection methods is critical. We present a systematic evaluation of multiple deep learning architectures, including custom Convolutional Neural Networks (CNNs) and Transformer-based models, for multi-class fire classification. We also introduce PyroFocus, a two-stage pipeline that performs fire classification followed by fire radiative power (FRP) regression or segmentation to reduce inference time and computational cost for onboard deployment. Using data from NASA's MODIS/ASTER Airborne Simulator (MASTER), which is similar to a next-generation fire detection sensor, we compare accuracy, inference latency, and resource efficiency. Experimental results show that the proposed two-stage pipeline achieves strong trade-offs between speed and accuracy, demonstrating significant potential for real-time edge deployment in future wildfire monitoring missions.
Authors:Wenhao Sha, Tienchong Chang
Abstract:
Heat transfer in semiconductor devices is dominated by chip and substrate assemblies, where heat generated within a finite chip layer dissipates into a semi-infinite substrate with much higher thermophysical properties. This mismatch produces steep interfacial temperature gradients, making the transient thermal response highly sensitive to the interface. Conventional numerical solvers require excessive discretization to resolve these dynamics, while physics-informed neural networks (PINNs) often exhibit unstable convergence and loss of physical consistency near the material interface. To address these challenges, we introduce HeatTransFormer, a physics-guided Transformer architecture for interface-dominated diffusion problems. The framework integrates physically informed spatiotemporal sampling, a Laplace-based activation emulating analytical diffusion solutions, and a mask-free attention mechanism supporting bidirectional spatiotemporal coupling. These components enable the model to resolve steep gradients, maintain physical consistency, and remain stable where PINNs typically fail. HeatTransFormer produces coherent temperature fields across the interface when applied to a finite layer and semi-infinite substrate configuration. Coupled with a physics-constrained inverse strategy, it further enables reliable identification of three unknown thermal properties simultaneously using only external measurements. Overall, this work demonstrates that physics-guided Transformer architectures provide a unified framework for forward and inverse modeling in interface-dominated thermal systems.
Authors:Haoxiang Zhang, Ruihao Yuan, Lihui Zhang, Yushi Luo, Qiang Zhang, Pan Ding, Xiaodong Ren, Weijie Xing, Niu Gao, Jishan Chen, Chubo Zhang
Abstract:
The industrial adoption of Artificial Intelligence for Engineering (AI4E) faces two fundamental bottlenecks: scarce high-quality data and the lack of interpretability in black-box models, the latter particularly critical in safety-sensitive sectors like aerospace. We present an explainable, few-shot AI4E framework that is systematically informed by physics and expert knowledge throughout its architecture. Starting from only 32 experimental samples in an aerial K439B superalloy castings repair welding case, we first augment physically plausible synthetic data through a three-stage protocol: differentiated noise injection calibrated to process variabilities, enforcement of hard physical constraints, and preservation of inter-parameter relationships. We then employ a nested optimization strategy for constitutive model discovery, where symbolic regression explores equation structures while differential evolution optimizes parameters, followed by intensive parameter refinement using hybrid global-local optimization. The resulting interpretable constitutive equation achieves 88% accuracy in predicting hot-cracking tendency. This equation not only provides quantitative predictions but also delivers explicit physical insight, revealing how thermal, geometric, and metallurgical mechanisms couple to drive cracking, thereby advancing engineers' cognitive understanding of the process. Furthermore, the constitutive equation serves as a multi-functional tool for process optimization and high-fidelity virtual data generation, enabling accuracy improvements in other data-driven models. Our approach provides a general blueprint for developing trustworthy AI systems that embed engineering domain knowledge directly into their architecture, enabling reliable adoption in high-stakes industrial applications where data is limited but physical understanding is available.
Authors:Euzeli C. dos Santos, Josinaldo Lopes Araujo Rocha, Anielson dos Santos Souza, Isaac Soares de Freitas, Hudson E. Alencar Menezes
Abstract:
In tropical semiarid regions, prickly pear cactus has emerged as a vital forage resource due to its high drought tolerance and minimal water requirements. However, even limited weed infestation can severely compromise cactus productivity, as the species is highly sensitive to competition for essential resources, including water, mineral nutrients, and sun exposure. Conventional herbicide-based weed control strategies face growing limitations due to resistance development and environmental concerns, underscoring the need for sustainable alternatives. This study revisits the historically underexplored application of linear Fresnel lenses (LFLs) for thermal weed control and establishes the technical feasibility of a contemporary autonomous weed management system that incorporates LFL technology within an unmanned ground vehicle platform. Leveraging real-time image processing, georeferencing, and mechanical actuation, the system can perform a two-phase operation: weed mapping during non-optimal solar hours and targeted solar termination during peak irradiance. Analytical modeling quantifies the effective area coverage and time constraints imposed by the solar window. Preliminary results indicate that, while unsuitable for dense weed infestations, the system presents a viable solution for precision, post-emergent weed control in sparse infestations. The favorable solar geometry in tropical zones, especially in the Brazilian semiarid region, and the targeted nature of the approach make this technology particularly well-suited for sustainable agriculture in under-resourced regions.
Authors:Reihaneh Jahedan, Satya Peddada, Mark Jennings, Sunil Katragadda, James Allison, Nenad Miljkovic
Abstract:
As the automotive industry moves towards vehicle electrification, designing and optimizing thermal management systems (TMSs) for Battery Electric Vehicles (BEVs) has become a critical focus in recent years. The dependence of battery performance on operating temperature, the lack of waste combustion heat, and the significant effect of TMS energy consumption on driving range make the design of BEV TMSs highly complicated compared to conventional vehicles. Although prior research has focused on optimizing the configuration of thermal systems for varying ambient conditions, a holistic approach to studying the full potential of reconfigurable TMS architectures has not yet been fully explored. The complex design landscape of multi-mode reconfigurable systems is difficult to navigate. Relying solely on expert intuition and creativity to identify new architectures both restricts progress and leaves significant performance improvements unrealized. In this study, using graph modelling of TMS architectures, we propose a systematic method to automatically enumerate and simulate reconfigurable architectures for a TMS, given the desired operating modes, along with a framework to conduct transient performance analysis and optimization-based trade-off studies among system performance, energy consumption, and complexity. We explored more than 150 operating mode sequences, retaining 39 unique architectures for further evaluation. MATLAB Simscape models of these architectures were automatically created and their performance evaluated. The multi-objective optimization results provide decision support for selecting the best architecture based on user priorities.
Authors:Yang Yang, Ye Ji, Matthias Möller, Can Ayas
Abstract:
Thermal modeling of Laser Powder Bed Fusion (LPBF) is challenging due to steep, rapidly moving thermal gradients induced by the laser, which are difficult to resolve accurately with conventional Finite Element Methods (FEM). Highly refined, dynamically adaptive spatial discretization is typically required, leading to prohibitive computational costs. Semi-analytical approaches mitigate this by decomposing the temperature field into an analytical point-source solution and a complementary numerical field that enforces boundary conditions. However, state-of-the-art implementations either necessitate extensive mesh refinement near boundaries or rely on restrictive image source techniques, limiting their efficiency and applicability to complex geometries. This study presents a novel reformulation of the semi-analytical framework using Isogeometric Analysis (IGA). The laser heat input is captured by the analytical point-source solution, while the complementary correction field, which imposes boundary conditions, is solved using a spline-based IGA discretization. The governing heat equation for the correction field is cast in a weak form, discretized with NURBS basis functions, and advanced in time using an implicit $\theta$-scheme. This approach leverages IGA's key advantages: exact geometry representation, higher-order continuity, and superior accuracy per degree of freedom. These features unlock efficient thermal modeling of realistic parts with complex contours. Our strategy eliminates the need for scan-wise remeshing and robustly handles intricate geometric features like sharp corners and varying cross-sections. Numerical examples demonstrate that the proposed semi-analytical IGA method delivers accurate temperature predictions and achieves substantial computational efficiency gains compared to standard FEM, establishing it as a powerful new tool for high-fidelity thermal simulation in LPBF.
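To make the time-stepping idea concrete, the following is a minimal sketch of an implicit $\theta$-scheme for the 1D heat equation $u_t = \alpha u_{xx}$ with homogeneous Dirichlet boundaries. It deliberately swaps the paper's NURBS-based IGA discretization for a plain second-order finite-difference operator, so it illustrates only the time integrator, not the authors' method. All names (`theta_step_matrices`, `solve_heat`) and parameter values are assumptions chosen for illustration.

```python
import numpy as np


def theta_step_matrices(n_interior, h, dt, alpha, theta):
    """Build (I - theta*dt*alpha*L) and (I + (1-theta)*dt*alpha*L).

    L is the finite-difference second-derivative operator on the interior
    nodes, assuming zero Dirichlet values at both ends (an assumption here).
    """
    off = np.ones(n_interior - 1)
    L = (np.diag(-2.0 * np.ones(n_interior)) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    I = np.eye(n_interior)
    lhs = I - theta * dt * alpha * L
    rhs = I + (1.0 - theta) * dt * alpha * L
    return lhs, rhs


def solve_heat(u0, h, dt, alpha, theta, n_steps):
    """Advance interior values u0 by n_steps of the theta-scheme.

    theta = 1 gives backward Euler, theta = 0.5 gives Crank-Nicolson.
    """
    lhs, rhs = theta_step_matrices(len(u0), h, dt, alpha, theta)
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(n_steps):
        # Solve lhs @ u_new = rhs @ u_old at every step.
        u = np.linalg.solve(lhs, rhs @ u)
    return u


# Usage: decay of a sine mode, which has the exact solution
# exp(-alpha * pi^2 * t) * sin(pi * x) on the unit interval.
x = np.linspace(0.0, 1.0, 51)[1:-1]  # interior nodes only
u_final = solve_heat(np.sin(np.pi * x), x[1] - x[0], 1e-3, 1.0, 0.5, 100)
```

For $\theta \ge 1/2$ the scheme is unconditionally stable for this linear problem, which is the usual reason for preferring an implicit choice when steep gradients force fine spatial resolution.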
Authors:Alex S. C. Maia, John B. Hall, Hugo F. M. Milan, Izabelle A. M. A. Teixeira
Abstract:
Advances in technology are transforming sustainable cattle farming practices, with electronic feeding systems generating big longitudinal datasets on individual animal feed intake, offering the possibility for autonomous precision livestock systems. However, the literature still lacks a methodology that fully leverages these longitudinal big data to accurately predict feed intake accounting for environmental conditions. To fill this gap, we developed an AI-based framework to accurately predict feed intake at both the individual-animal and pen level. Data from 19 experiments (>16.5M samples; 2013-2024) conducted at Nancy M. Cummings Research Extension & Education Center (Carmen, ID) feedlot facility and environmental data from AgriMet Network weather stations were used to develop two novel environmental indices: InComfort-Index, based solely on meteorological variables, showed good predictive capability for thermal comfort but had limited ability to predict feed intake; EASI-Index, a hybrid index integrating environmental variables with feed intake behavior, performed well in predicting feed intake but was less effective for thermal comfort. Together with the environmental indices, machine learning models were trained, and the best-performing model (XGBoost) achieved an RMSE of 1.38 kg/day at the animal level and only 0.14 kg/(day-animal) at the pen level. This approach provides a robust AI-based framework for predicting feed intake in individual animals and pens, with potential applications in precision management of feedlot cattle, through feed waste reduction, resource optimization, and climate-adaptive livestock management.
Authors:Johannes Nicklaus, Lea Brass, Gunnar Schubert
Abstract:
We study linear policy approximations for the risk-conscious operation of an industrial energy system with uncertain wind power, significant and variable electricity demand, and high thermal output, as found in a modern foundry. The system incorporates thermal storage and operates under rolling forecasts, leading to a sequential decision-making framework. To address uncertainty in key parameters, we formulate chance-constrained optimization problems that limit the probability of critical constraint violations, such as unmet demand requirements or the exceedance of system boundaries. To reduce computational effort, we replace direct uncertainty handling with a parameter-modified cost function that approximates the underlying risk structure. We validate our method through a numerical case study, demonstrating the trade-offs between operational efficiency and reliability in a stochastic environment.
Authors:Jin Ye, Lingmei Wang, Shujian Zhang, Haihang Wu
Abstract:
With the global energy transition and rapid development of renewable energy, the scheduling optimization challenge for combined power-heat systems under new energy integration and multiple uncertainties has become increasingly prominent. Addressing this challenge, this study proposes an intelligent scheduling method based on an improved Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, referred to as PVTD3. System optimization is achieved by introducing a penalty term for grid power purchase variations. Simulation results demonstrate that under three typical scenarios (10%, 20%, and 30% renewable penetration), the PVTD3 algorithm reduces the system's comprehensive cost by 6.93%, 12.68%, and 13.59%, respectively, compared to the traditional TD3 algorithm. Concurrently, it reduces the average fluctuation amplitude of grid power purchases by 12.8%. Regarding energy storage management, the PVTD3 algorithm reduces the end-time state values of low-temperature thermal storage tanks by 7.67-17.67 units while maintaining high-temperature tanks within the 3.59-4.25 safety operating range. Multi-scenario comparative validation demonstrates that the proposed algorithm not only excels in economic efficiency and grid stability but also exhibits superior sustainable scheduling capabilities in energy storage device management.
Authors:Pol Benítez, Cibrán López, Edgardo Saucedo, Teruyasu Mizoguchi, Claudio Cazorla
Abstract:
Machine learning (ML) methods have become powerful tools for predicting material properties with near first-principles accuracy and vastly reduced computational cost. However, the performance of ML models critically depends on the quality, size, and diversity of the training dataset. In materials science, this dependence is particularly important for learning from low-symmetry atomistic configurations that capture thermal excitations, structural defects, and chemical disorder, features that are ubiquitous in real materials but underrepresented in most datasets. The absence of systematic strategies for generating representative training data may therefore limit the predictive power of ML models in technologically critical fields such as energy conversion and photonics. In this work, we assess the effectiveness of graph neural network (GNN) models trained on two fundamentally different types of datasets: one composed of randomly generated atomic configurations and another constructed using physically informed sampling based on lattice vibrations. As a case study, we address the challenging task of predicting electronic and mechanical properties of a prototypical family of optoelectronic materials under realistic finite-temperature conditions. We find that the phonon-informed model consistently outperforms the randomly trained counterpart, despite relying on fewer data points. Explainability analyses further reveal that high-performing models assign greater weight to chemically meaningful bonds that control property variations, underscoring the importance of physically guided data generation. Overall, this work demonstrates that larger datasets do not necessarily yield better GNN predictive models and introduces a simple and general strategy for efficiently constructing high-quality training data in materials informatics.
Authors:Parya Dolatyabi, Ali Farajzadeh Bavil, Mahdi Khodayar
Abstract:
Restoring power distribution systems (PDS) after large-scale outages requires sequential switching operations that reconfigure feeder topology and coordinate distributed energy resources (DERs) under nonlinear constraints such as power balance, voltage limits, and thermal ratings. These challenges make conventional optimization and value-based RL approaches computationally inefficient and difficult to scale. This paper applies a Heterogeneous-Agent Reinforcement Learning (HARL) framework, instantiated through Heterogeneous-Agent Proximal Policy Optimization (HAPPO), to enable coordinated restoration across interconnected microgrids. Each agent controls a distinct microgrid with different loads, DER capacities, and switch counts, introducing practical structural heterogeneity. Decentralized actor policies are trained with a centralized critic to compute advantage values for stable on-policy updates. A physics-informed OpenDSS environment provides full power flow feedback and enforces operational limits via differentiable penalty signals rather than invalid action masking. The total DER generation is capped at 2400 kW, and each microgrid must satisfy local supply-demand feasibility. Experiments on the IEEE 123-bus and IEEE 8500-node systems show that HAPPO achieves faster convergence, higher restored power, and smoother multi-seed training than DQN, PPO, MAES, MAGDPG, MADQN, Mean-Field RL, and QMIX. Results demonstrate that incorporating microgrid-level heterogeneity within the HARL framework yields a scalable, stable, and constraint-aware solution for complex PDS restoration.
Authors:Farideh Abdollahi, Kourosh Malek, Thomas Kadyk, Nadiia Kulyk, Christophe Gerling, Michael H. Eikerling
Abstract:
Prognostics and Health Management is crucial for the reliability and lifetime assessment of Polymer Electrolyte Fuel Cells (PEFCs). Here, we review the current advances on this topic, focusing mainly on key degradation mechanisms and methodologies such as physics-aware, data-driven, and hybrid modeling approaches. Key open challenges are analyzed, including the need for more accurate degradation modeling, effective management of multi-stack systems, and advancements in the currently underdeveloped action phase, in which diagnostic and prognostic insights are translated into real-time system responses, such as dynamic load derating, thermal-management adjustments, or automated maintenance triggers, to prevent failures and extend PEFC life. While notable strides have been made in recent years in diagnostics and remaining useful life estimation, it remains challenging to seamlessly integrate these insights into actionable strategies. Future directions highlight the need to address data scarcity and advance interdisciplinary research. Key focus areas include sensor integration, artificial intelligence, and digital twins. Additionally, material innovations play a crucial role in bridging existing gaps. This work, therefore, intends to map the further development of Prognostics and Health Management systems toward ensuring the viability of PEFCs in practical applications.
Authors:Siddhesh Pimpale, Sagar Mahadik
Abstract:
Multi-phase inverters have become more complex, particularly in Electric Vehicle (EV) powertrains, and therefore require a robust fault protection system. This work proposes active short circuit and safe discharge mechanisms dedicated to multi-phase converters under failure conditions. With silicon carbide (SiC) power modules increasingly used in high-efficiency and high-power applications, reliability under fault conditions is an extremely important factor. Cascading failures and permanent damage will occur in multi-phase inverter systems if short circuit faults are not prevented. The proposed method combines centralized short circuit detection, active phase shorting, and controlled discharge to make these structures more robust. The on-chip active short circuit mechanism quickly isolates the affected phases, preventing faults from spreading to other areas of the inverter, and the safe discharge mechanism controls the energy discharged in fault scenarios, which reduces the thermal stress placed on essential components. The experimental results show that the proposed mechanisms effectively enhance fault detection performance, system response during faults, and overall operation under faults compared with several existing methods. These mechanisms are demonstrated to be very important for enhancing the safety and reliability of multi-phase inverters, especially in critical applications such as EVs, where high operational security is required.
Authors:Choon-Jie Wong, Adam A. Larkin, Jie Bao, Maria Skyllas-Kazacos, Barry J. Welch, Nadia Ahli, Maitha Faraj, Mohamed Mahmoud
Abstract:
Aluminium is manufactured through the Hall-Héroult process, which is very energy intensive. Power modulation, as an industrial-scale demand-side power management approach, allows aluminium smelters to operate with variable power consumption rates and as such be powered by renewable energy sources. In this way, aluminium smelting cells can be used as a large virtual energy storage to balance power demand-supply and stabilise electrical grids. This paper studies the potential optimal power modulation operating conditions, including time-varying line current and anode-cathode distance (ACD) profiles to maximise the aluminium reduction cell profitability subject to constraints on the cell thermal balance. To deal with the complex cell dynamics which are spatially distributed and multi-timescale, a novel optimisation approach that utilises both reduced-order and detailed models is developed. The results yield insight into the optimal line current and ACD profiles for different power modulation scenarios including the time of use electricity tariff and spot price. These results can form the foundation for further studies into online control policies of aluminium reduction cells.
Authors:Lars Olt, Luis Diego Fonseca Flores, Ian Mckinley
Abstract:
Thermal Desktop (TD) is an industry-standard thermal analysis tool used to create and analyze thermal models for landers, rovers, spacecraft, and instrument payloads. Currently, limited software exists to extract and visualize metrics relevant to heat flow within TD, impeding thermal engineers from analyzing their results quickly. This paper discusses a graphical user interface (GUI) built in MATLAB and C++ which uses TD's application programming interface (API), OpenTD, and a custom parser to address this void. Specifically, we present a method for efficiently loading temperature, conductance, and submodel metrics using a side effect of TD's Compressed Solution Results (CSR) files. This approach can reduce the runtime for correlating model nodes and conductors with submodel IDs by orders of magnitude. Lastly, we reflect on the shortcomings of this method for reading data, consider the future of the GUI, and provide recommendations for subsequent OpenTD releases.
Authors:Giorgio M. Cavallazzi, Miguel Perez Cuadrado, Alfredo Pinelli
Abstract:
Neural operators have emerged as powerful tools for learning solution operators of partial differential equations (PDEs). However, standard spectral methods based on Fourier transforms struggle with problems involving discontinuous coefficients due to the Gibbs phenomenon and poor representation of sharp interfaces. We introduce the Walsh-Hadamard Neural Operator (WHNO), which leverages Walsh-Hadamard transforms (a spectral basis of rectangular wave functions naturally suited for piecewise constant fields) combined with learnable spectral weights that transform low-sequency Walsh coefficients to capture global dependencies efficiently. We validate WHNO on three problems: steady-state Darcy flow (preliminary validation), heat conduction with discontinuous thermal conductivity, and the 2D Burgers equation with discontinuous initial conditions. In controlled comparisons with Fourier Neural Operators (FNO) under identical conditions, WHNO demonstrates superior accuracy with better preservation of sharp solution features at material interfaces. Critically, we discover that weighted ensemble combinations of WHNO and FNO achieve substantial improvements over either model alone: for both heat conduction and Burgers equation, optimal ensembles reduce mean squared error by 35-40 percent and maximum error by up to 25 percent compared to individual models. This demonstrates that Walsh-Hadamard and Fourier representations capture complementary aspects of discontinuous PDE solutions, with WHNO excelling at sharp interfaces while FNO captures smooth features effectively.
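The transform underlying this abstract can be sketched as a standard in-place fast Walsh-Hadamard transform. The sketch below is only the textbook butterfly in natural (Hadamard) ordering; it omits the learnable spectral weights and any sequency reordering that WHNO may use, and the function name `fwht` is an assumption for illustration.

```python
import numpy as np


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    The input length must be a power of two. Applying the transform twice
    returns the input scaled by its length.
    """
    a = np.asarray(x, dtype=float).copy()
    n = a.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine entries that are h apart.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a


# A two-level step signal needs only one nonzero Walsh-Hadamard coefficient.
step = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
coeffs = fwht(step)
```

The step signal above concentrates in a single coefficient, whereas a truncated Fourier series of the same signal would exhibit Gibbs oscillations near the jump; this is the property the abstract appeals to for piecewise constant fields.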
Authors:Mehmet Turker Takci, James Day, Meysam Qadrdan
Abstract:
The rapid growth of data centres poses an evolving challenge for power systems with high variable renewable energy. Traditionally operated as passive electrical loads, data centres have the potential to become active participants that provide flexibility to the grid. However, quantifying and utilising this flexibility have not yet been fully explored. This paper presents an integrated, whole-facility optimisation model to investigate the least-cost operating schedule of data centres and characterise the aggregate flexibility available from data centres to the power system. The model accounts for IT workload shifting, UPS energy storage, and the cooling system. Motivated by the need to alleviate the increasing strain on power systems while leveraging their untapped flexibility potential, this study makes two primary contributions: (i) an operational optimisation model that integrates IT scheduling, UPS operation, and cooling dynamics to establish a cost-optimal baseline operation, and (ii) a duration-aware flexibility assessment that, for any given start time and power deviation, computes the maximum feasible duration from this baseline while respecting all operational, thermal, and recovery constraints. This method characterises the aggregate flexibility envelope. Results reveal a clear temporal structure and a notable asymmetry in flexibility provision: upward flexibility (electricity load reduction) is driven by deferring IT workload, which allows for a secondary reduction in cooling power. Downward flexibility (electricity load increase) relies on increasing power consumption of the cooling system, supported by the thermal energy storage (TES) buffer, and charging the UPS. This framework translates abstract flexibility potential into quantified flexibility magnitude and duration that system operators could investigate for use in services such as reserve, frequency response, and price-responsive demand.
Authors:Jose Marie Antonio Minoza, Rex Gregor Laylo, Christian F Villarin, Sebastian C. Ibanez
Abstract:
Machine learning inference occurs at a massive scale, yet its environmental impact remains poorly quantified, especially on low-resource hardware. We present ML-EcoLyzer, a cross-framework tool for measuring the carbon, energy, thermal, and water costs of inference across CPUs, consumer GPUs, and datacenter accelerators. The tool supports both classical and modern models, applying adaptive monitoring and hardware-aware evaluation. We introduce the Environmental Sustainability Score (ESS), which quantifies the number of effective parameters served per gram of CO$_2$ emitted. Our evaluation covers over 1,900 inference configurations, spanning diverse model architectures, task modalities (text, vision, audio, tabular), hardware types, and precision levels. These rigorous and reliable measurements demonstrate that quantization enhances ESS, huge accelerators can be inefficient for lightweight applications, and even small models may incur significant costs when implemented suboptimally. ML-EcoLyzer sets a standard for sustainability-conscious model selection and offers an extensive empirical evaluation of environmental costs during inference.
Authors:Rong Wu, Yim-Sang Yu
Abstract:
Medical image segmentation has been significantly advanced by deep learning architectures, notably U-Net variants. However, existing models struggle to simultaneously achieve efficient global context modeling and long-range dependency reasoning under practical computational budgets. In this work, we propose a novel hybrid architecture utilizing U-Mamba with the Heat Conduction Equation. Our model combines Mamba-based state-space modules for efficient long-range reasoning with Heat Conduction Operators (HCOs) in the bottleneck layers, simulating frequency-domain thermal diffusion for enhanced semantic abstraction. Experimental results on multimodal abdominal CT and MRI datasets demonstrate that the proposed model consistently outperforms strong baselines, validating its effectiveness and generalizability. These results suggest that blending state-space dynamics with heat-based global diffusion offers a scalable and interpretable solution for medical segmentation tasks.
Authors:Isabela Suaza-Sierra, Hernan A. Moreno, Luis A De la Fuente, Thomas M. Neeson
Abstract:
Accurate prediction of Reservoir Water Temperature (RWT) is vital for sustainable water management, ecosystem health, and climate resilience. Yet, prediction alone offers limited insight into the governing physical processes. To bridge this gap, we integrated explainable machine learning (ML) with symbolic modeling to uncover the drivers of RWT dynamics across ten reservoirs in the Red River Basin, USA, using over 10,000 depth-resolved temperature profiles. We first employed ensemble and neural models, including Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP), achieving high predictive skill (best RMSE = 1.20 °C, R^2 = 0.97). Using SHAP (SHapley Additive exPlanations), we quantified the contribution of physical drivers such as air temperature, depth, wind, and lake volume, revealing consistent patterns across reservoirs. To translate these data-driven insights into compact analytical expressions, we developed Kolmogorov-Arnold Networks (KANs) to symbolically approximate RWT. Ten progressively complex KAN equations were derived, improving from R^2 = 0.84 using a single predictor (7-day antecedent air temperature) to R^2 = 0.92 with ten predictors, though gains diminished beyond five, highlighting a balance between simplicity and accuracy. The resulting equations, dominated by linear and rational forms, incrementally captured nonlinear behavior while preserving interpretability. Depth consistently emerged as a secondary but critical predictor, whereas precipitation had limited effect. By coupling predictive accuracy with explanatory power, this framework demonstrates how KANs and explainable ML can transform black-box models into transparent surrogates that advance both prediction and understanding of reservoir thermal dynamics.
Authors:SiWoo Kim, JhongHyun An
Abstract:
Robust perception at night remains challenging for thermal-infrared detection: low contrast and weak high-frequency cues lead to duplicate, overlapping boxes, missed small objects, and class confusion. Prior remedies either translate TIR to RGB and hope pixel fidelity transfers to detection -- making performance fragile to color or structure artifacts -- or fuse RGB and TIR at test time, which requires extra sensors, precise calibration, and higher runtime cost. Both lines can help in favorable conditions, but do not directly shape the thermal representation used by the detector. We keep mono-modality inference and tackle the root causes during training. Specifically, we introduce training-only objectives that sharpen instance-level decision boundaries by pulling together features of the same class and pushing apart those of different classes -- suppressing duplicate and confusing detections -- and that inject cross-modal semantic priors by aligning the student's multi-level pyramid features with an RGB-trained teacher, thereby strengthening texture-poor thermal features without visible input at test time. In experiments, our method outperformed prior approaches and achieved state-of-the-art performance.
Authors:ShivKishan Dubey, Rohit Sharma
Abstract:
As temperature drops, molecular systems may undergo spontaneous ordering, moving from random behavior to orderly structure. This research demonstrates a direct analogy between this type of thermodynamic ordering in molecular systems and the development of coherent logic in computationally complex problem sets. We have proposed a mapping of Boolean SAT problem instances to pairwise Ising Hamiltonian models. Using simulated annealing, we then applied phenomenal cooling to the system through thermal evolution from high entropy random assignment to lower entropy, ordered assignments (the energy minima) using molecular cooling analogs. This indicated that there was a rapid "first-order" or "logical crystallization" of satisfiable logical configurations. The degree of backbone rigidity did not strongly correlate with the level of physical ordering observed in the system; thus, it appears that there is primarily a local alignment of constraint satisfaction occurring in the system. Thus, we have provided empirical evidence that satisfiable logical configurations are analogous to the low energy crystalline states observed in molecular systems and provide evidence for a unified thermodynamic view of computational coherence and complexity.
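The cooling procedure described above can be sketched with a plain simulated annealing loop over spin-valued assignments. This is a simplified illustration, not the authors' method: the energy used here is the number of violated clauses (a many-body function), not the pairwise Ising Hamiltonian of the paper, and the function names, the geometric cooling schedule, and the small CNF instance are all assumptions made for this example.

```python
import math
import random


def violated(clauses, spins):
    """Energy: number of clauses with no satisfied literal.

    A literal is a non-zero int: +v means variable v is true, -v means false.
    spins[v] is +1 (true) or -1 (false).
    """
    return sum(
        1
        for clause in clauses
        if not any(spins[abs(lit)] == (1 if lit > 0 else -1) for lit in clause)
    )


def anneal(clauses, n_vars, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Single-spin-flip Metropolis annealing; returns (best assignment, its energy)."""
    rng = random.Random(seed)
    spins = {v: rng.choice((-1, 1)) for v in range(1, n_vars + 1)}
    energy = violated(clauses, spins)
    best, best_energy = dict(spins), energy
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        v = rng.randint(1, n_vars)
        spins[v] = -spins[v]
        new_energy = violated(clauses, spins)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            energy = new_energy
            if energy < best_energy:
                best, best_energy = dict(spins), energy
        else:
            spins[v] = -spins[v]  # rejected move: undo the flip
    return best, best_energy


# Hypothetical instance: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
clauses = [[1, 2], [-1, 3], [-2, -3]]
assignment, energy = anneal(clauses, n_vars=3)
print(assignment, energy)
```

An energy of zero corresponds to a satisfying assignment; tracking how the energy drops as the temperature is lowered is one way to look for the abrupt ordering the abstract describes.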
Authors:Paul Seurin, Anuradha Annaswamy, Linyu Lin
Abstract:
Integrated energy systems (IES) are complex heterogeneous architectures that typically encompass power sources, hydrogen electrolyzers, energy storage, and heat exchangers. This integration is achieved through operating control strategy optimization. However, the lack of physical understanding as to how these systems evolve over time introduces uncertainties that hinder reliable application thereof. Techniques that can accommodate such uncertainties are fundamental for ensuring proper operation of these systems. Unfortunately, no unifying methodology exists for accommodating uncertainties in this regard. That being said, adaptive control (AC) is a discipline that may allow for accommodating such uncertainties in real-time. In the present work, we derive an AC formulation for linear systems in which all states are observable and apply it to the control of a glycol heat exchanger (GHX) in an IES. Based on prior research in which we quantified the uncertainties of the GHX's system dynamics, we introduced an error of 50% on four terms of the nominal model. In the case where a linear quadratic regulator is used as the nominal control for the reference system, we found that employing AC can reduce the mean absolute error and integral time absolute error by 30%-75%. This reduction is achieved with minimal computing overhead and control infrastructure, thus underscoring the strength of AC. However, the control effort induced is significant, therefore warranting further study in order to estimate its impact on a physical system. To address further challenges, including partially observable and non-linear dynamics, enhancements of the linear formulation are currently being developed.
Authors:Bai Li, Achilleas Kourtellis, Rong Cao, Joseph Post, Brian Porter, Yu Zhang
Abstract:
Rapid and reliable incident detection is critical for reducing crash-related fatalities, injuries, and congestion. However, conventional methods, such as closed-circuit television, dashcam footage, and sensor-based detection, separate detection from verification, suffer from limited flexibility, and require dense infrastructure or high penetration rates, restricting adaptability and scalability to shifting incident hotspots. To overcome these challenges, we developed DARTS, a drone-based, AI-powered real-time traffic incident detection system. DARTS integrates drones' high mobility and aerial perspective for adaptive surveillance, thermal imaging for better low-visibility performance and privacy protection, and a lightweight deep learning framework for real-time vehicle trajectory extraction and incident detection. The system achieved 99% detection accuracy on a self-collected dataset and supports simultaneous online visual verification, severity assessment, and incident-induced congestion propagation monitoring via a web-based interface. In a field test on Interstate 75 in Florida, DARTS detected and verified a rear-end collision 12 minutes earlier than the local transportation management center and monitored incident-induced congestion propagation, suggesting potential to support faster emergency response and enable proactive traffic control to reduce congestion and secondary crash risk. Crucially, DARTS's flexible deployment architecture reduces dependence on frequent physical patrols, indicating potential scalability and cost-effectiveness for use in remote areas and resource-constrained settings. This study presents a promising step toward a more flexible and integrated real-time traffic incident detection system, with significant implications for the operational efficiency and responsiveness of modern transportation management.
Authors:Zainab Akhtar, Eunice Jengo, Björn Haßler
Abstract:
This study presents a lightweight, domain-informed AI model for predicting indoor temperatures in naturally ventilated schools and homes in Sub-Saharan Africa. The model extends the Temp-AI-Estimator framework, trained on Tanzanian school data, and evaluated on Nigerian schools and Gambian homes. It achieves robust cross-country performance using only minimal accessible inputs, with mean absolute errors of 1.45°C for Nigerian schools and 0.65°C for Gambian homes. These findings highlight AI's potential for thermal comfort management in resource-constrained environments.
Authors:Yannick Hollenweger, Dennis M. Kochmann, Burigede Liu
Abstract:
Neural network surrogate models for constitutive laws in computational mechanics have been in use for some time. In plasticity, these models often rely on gated recurrent units (GRUs) or long short-term memory (LSTM) cells, which excel at capturing path-dependent phenomena. However, they suffer from long training times and time-resolution-dependent predictions that extrapolate poorly. Moreover, most existing surrogates for macro- or mesoscopic plasticity handle only relatively simple material behavior. To overcome these limitations, we introduce the Temperature-Aware Recurrent Neural Operator (TRNO), a time-resolution-independent neural architecture. We apply the TRNO to model the temperature-dependent plastic response of polycrystalline magnesium, which shows strong plastic anisotropy and thermal sensitivity. The TRNO achieves high predictive accuracy and generalizes effectively across diverse loading cases, temperatures, and time resolutions. It also outperforms conventional GRU and LSTM models in training efficiency and predictive performance. Finally, we demonstrate multiscale simulations with the TRNO, yielding a speedup of at least three orders of magnitude over traditional constitutive models.
Authors:Hao Chen, Fang Qiu, Li An, Douglas Stow, Eve Bohnett, Haitao Lyu, Shuang Tian
Abstract:
Wildlife and human activities are key components of landscape systems. Understanding their spatial distribution is essential for evaluating human-wildlife interactions and informing effective conservation planning. This study performs multiperspective monitoring of wildlife and human activities by combining camera traps and drone imagery, capturing the spatial patterns of their distributions, which allows the identification of overlapping activity zones and the assessment of the degree of human-wildlife conflict. The study was conducted in Chitwan National Park (CNP), Nepal, and adjacent regions. Images collected by visible and near-infrared camera traps and thermal infrared drones from February to July 2022 were processed to create training and testing datasets, which were used to build deep learning models to automatically identify wildlife and human activities. Drone-collected thermal imagery was used for detecting targets to provide an additional monitoring perspective. Spatial pattern analysis was performed to identify animal and resident activity hotspots and delineate potential human-wildlife conflict zones. Among the deep learning models tested, YOLOv11s achieved the highest performance with a precision of 96.2%, recall of 92.3%, mAP50 of 96.7%, and mAP50-95 of 81.3%, making it the most effective for detecting objects in camera trap imagery. Drone-based thermal imagery, analyzed with an enhanced Faster R-CNN model, added a complementary aerial viewpoint to camera trap detections. Spatial pattern analysis identified clear hotspots for both wildlife and human activities, and their overlapping patterns within certain areas in the CNP and buffer zones indicate potential conflict. This study reveals human-wildlife conflicts within the conserved landscape. Integrating multiperspective monitoring with automated object detection enhances wildlife surveillance and landscape management.
Authors:Yongxiang Liu, Yuchun Ma, Eren Kurshan, Glenn Reinman, Jason Cong
Abstract:
Most previous 3D IC research focused on stacking traditional 2D silicon layers, so the interconnect reduction is limited to inter-block delays. In this paper, we propose techniques that enable efficient exploration of the 3D design space where each logical block can span more than one silicon layer. Although further power and performance improvement is achievable through fine-grain 3D integration, the necessary modeling and tool infrastructure has been mostly missing. We develop a cube packing engine which can simultaneously optimize physical and architectural design for effective utilization of 3D in terms of performance, area and temperature. Our experimental results using a design driver show 36% performance improvement (in BIPS) over 2D and 14% over 3D with single-layer blocks. Additionally, multi-layer blocks can provide up to 30% reduction in power dissipation compared to the single-layer alternatives. Peak temperature of the design is kept within limits as a result of thermal-aware floorplanning and thermal via insertion techniques.
Authors:Mathis Rezzouk, Fabrice Gagnon, Alyson Champagne, Mathieu Roy, Philippe Albouy, Michel-Pierre Coll, Cem Subakan
Abstract:
EEG-based analysis of pain perception, enhanced by machine learning, reveals how the brain encodes pain by identifying neural patterns evoked by noxious stimulation. However, a major challenge that remains is the generalization of machine learning models across individuals, given the high cross-participant variability inherent to EEG signals and the limited focus on direct pain perception identification in current research. In this study, we systematically evaluate the cross-participant generalization of a wide range of models, including traditional classifiers and deep neural classifiers, for identifying the sensory modality of thermal pain and aversive auditory stimulation from EEG recordings. Using a novel dataset of EEG recordings from 108 participants, we benchmark model performance under both within- and cross-participant evaluation settings. Our findings show that traditional models suffered the largest drop from within- to cross-participant performance, while deep learning models proved more resilient, underscoring their potential for subject-invariant EEG decoding. Even though performance variability remained high, the strong results of the graph-based model highlight its potential to capture subject-invariant structure in EEG signals. We also share the preprocessed dataset used in this study, providing a standardized benchmark for evaluating future algorithms under the same generalization constraints.
Authors:Johannes F. Hevler, Shivam Verma, Mirat Soijtra, Carolyn R. Bertozzi
Abstract:
Thermal Tracks is a Python-based statistical framework for analyzing protein thermal stability data that overcomes key limitations of existing thermal proteome profiling (TPP) workflows. Unlike standard approaches that assume sigmoidal melting curves and are constrained by empirical null distributions (limiting significant hits to approximately 5% of data), Thermal Tracks uses Gaussian Process (GP) models with squared-exponential kernels to flexibly model any melting curve shape while generating unbiased null distributions through kernel priors. This framework is particularly valuable for analyzing proteome-wide perturbations that significantly alter protein thermal stability, such as pathway inhibitions, genetic modifications, or environmental stresses, where conventional TPP methods may miss biologically relevant changes due to their statistical constraints. Furthermore, Thermal Tracks excels at analyzing proteins with unconventional melting profiles, including phase-separating proteins and membrane proteins, which often exhibit complex, non-sigmoidal thermal stability behaviors. Thermal Tracks is freely available from GitHub and is implemented in Python, providing an accessible and flexible tool for proteome-wide thermal profiling studies.
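As a rough illustration of how a GP with a squared-exponential kernel can follow a melting curve without assuming a sigmoid, the sketch below computes the posterior mean of a zero-mean GP in pure Python. It is not the Thermal Tracks API: the function names, the kernel hyperparameters (length scale 5, variance 1), the noise level, and the melting data are all made up for this example, and the null-distribution step of the framework is not reproduced.

```python
import math


def sq_exp(a, b, length=5.0, variance=1.0):
    """Squared-exponential (RBF) kernel between two scalar temperatures."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_mean(train_x, train_y, query_x, noise=1e-6):
    """Posterior mean of a zero-mean GP with observation noise variance `noise`."""
    k = [
        [sq_exp(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    alpha = solve(k, train_y)  # alpha = (K + noise * I)^-1 y
    return [sum(sq_exp(q, x) * w for x, w in zip(train_x, alpha)) for q in query_x]


# Made-up, deliberately non-sigmoidal soluble fractions at seven temperatures (degrees C).
temps = [37.0, 42.0, 47.0, 52.0, 57.0, 62.0, 67.0]
soluble = [1.00, 0.95, 0.60, 0.70, 0.40, 0.15, 0.05]
fit = gp_mean(temps, soluble, temps)
between = gp_mean(temps, soluble, [49.5])
print(fit, between)
```

Because the kernel only encodes smoothness, the fitted curve passes through the bump at 52 degrees instead of forcing a monotone sigmoid through it.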
Authors:Huan Zhang, Hui Zhang, Yan Wang, Yingxiang Xu
Abstract:
Heat transfer in composites is critical in engineering, where imperfect layer contact causes thermal contact resistance (TCR), leading to interfacial temperature discontinuity. We propose solving this numerically using the optimized Schwarz method (OSM), which decouples the heterogeneous problem into homogeneous subproblems. This avoids ill-conditioned systems from monolithic solving due to high contrast and interface jumps. Both an energy estimate and Fourier analysis are used to prove the convergence of this algorithm when the standard Robin condition is applied to transmit information between subdomains. To achieve fast convergence, the scaled Robin transmission condition is proposed in place of the standard one, and the involved free parameter is rigorously optimized. The results reveal several new findings due to the presence of TCR: first, the larger the TCR, the faster the OSM converges; second, mesh-independent convergence is achieved in the asymptotic sense, in contrast to the mesh-dependent results without TCR; and last, the heterogeneity contrast benefits the convergence, with a larger contrast leading to faster convergence. Interestingly, different from the case without TCR, the thermal conductivity also benefits the convergence, similar to the effect of heterogeneity. Numerical experiments confirm the theoretical findings and demonstrate the method's potential for nonlinear problems on irregular domains.
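For orientation, a common textbook way to write a two-layer conduction problem with thermal contact resistance is sketched below. The symbols ($\kappa_i$, $u_i$, $f_i$, $R$, $\Gamma$), the steady-state setting, and the sign convention are assumptions for this illustration; the paper's own formulation may differ.

```latex
% Assumed setting: steady heat conduction in two layers \Omega_1, \Omega_2
% separated by an interface \Gamma with contact resistance R >= 0.
% n is the unit normal on \Gamma pointing from \Omega_1 into \Omega_2.
\begin{align}
  -\nabla\cdot(\kappa_i \nabla u_i) &= f_i
    && \text{in } \Omega_i,\ i = 1, 2, \\
  \kappa_1\,\partial_n u_1 &= \kappa_2\,\partial_n u_2
    && \text{on } \Gamma \quad \text{(continuity of heat flux)}, \\
  u_1 - u_2 &= -R\,\kappa_1\,\partial_n u_1
    && \text{on } \Gamma \quad \text{(temperature jump)}.
\end{align}
```

With $R = 0$ the jump condition reduces to $u_1 = u_2$, the perfect-contact case that the abstract contrasts against.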
Authors:Tai Hyoung Rhee, Dong-guw Lee, Ayoung Kim
Abstract:
Thermal infrared (TIR) imaging exhibits considerable potential for robotic perception tasks, especially in environments with poor visibility or challenging lighting conditions. However, TIR images typically suffer from heavy non-uniform fixed-pattern noise, complicating tasks such as object detection, localization, and mapping. To address this, we propose a diffusion-based TIR image denoising framework leveraging latent-space representations and wavelet-domain optimization. Utilizing a pretrained stable diffusion model, our method fine-tunes the model via a novel loss function combining latent-space and discrete wavelet transform (DWT) / dual-tree complex wavelet transform (DTCWT) losses. Additionally, we implement a cascaded refinement stage to enhance fine details, ensuring high-fidelity denoising results. Experiments on benchmark datasets demonstrate superior performance of our approach compared to state-of-the-art denoising methods. Furthermore, our method exhibits robust zero-shot generalization to diverse and challenging real-world TIR datasets, underscoring its effectiveness for practical robotic deployment.
Authors:Daniel Andrés López, Vincent Weber, Severin Zentgraf, Barlo Hillen, Perikles Simon, Elmar Schömer
Abstract:
Infrared thermography is emerging as a powerful tool in sports medicine, allowing assessment of thermal radiation during exercise and analysis of anatomical regions of interest, such as the well-exposed calves. Building on our previous advanced automatic annotation method, we aimed to transfer the stereo- and multimodal-based labeling approach from treadmill running to ergometer cycling. Therefore, the training of the semantic segmentation network with automatic labels and fine-tuning on high-quality manually annotated images has been examined and compared in different data set combinations. The results indicate that fine-tuning with a small fraction of manual data is sufficient to improve the overall performance of the deep neural network. Finally, combining automatically generated labels with small manually annotated data sets accelerates the adaptation of deep neural networks to new use cases, such as the transition from treadmill to bicycle.
Authors:Raul Castilla-Arquillo, Carlos Perez-del-Pulgar, Levin Gerdes, Alfonso Garcia-Cerezo, Miguel A. Olivares-Mendez
Abstract:
Robot navigation in unstructured environments requires multimodal perception systems that can support safe navigation. Multimodality enables the integration of complementary information collected by different sensors. However, this information must be processed by machine learning algorithms specifically designed to leverage heterogeneous data. Furthermore, it is necessary to identify which sensor modalities are most informative for navigation in the target environment. In Martian exploration, thermal imagery has proven valuable for assessing terrain safety due to differences in thermal behaviour between soil types. This work presents OmniUnet, a transformer-based neural network architecture for semantic segmentation using RGB, depth, and thermal (RGB-D-T) imagery. A custom multimodal sensor housing was developed using 3D printing and mounted on the Martian Rover Testbed for Autonomy (MaRTA) to collect a multimodal dataset in the Bardenas semi-desert in northern Spain. This location serves as a representative environment of the Martian surface, featuring terrain types such as sand, bedrock, and compact soil. A subset of this dataset was manually labeled to support supervised training of the network. The model was evaluated both quantitatively and qualitatively, achieving a pixel accuracy of 80.37% and demonstrating strong performance in segmenting complex unstructured terrain. Inference tests yielded an average prediction time of 673 ms on a resource-constrained computer (Jetson Orin Nano), confirming its suitability for on-robot deployment. The software implementation of the network and the labeled dataset have been made publicly available to support future research in multimodal terrain perception for planetary robotics.
Authors:Shahriar Kabir, Istiak Ahmmed Rifti, H. M. Shadman Tabib, Mushfiqur Rahman, Sadatul Islam Sadi, Hasnaen Adil, Ahmed Mahir Sultan Rumi, Ch Md Rakin Haider
Abstract:
The proliferation of drones in civilian airspace has raised urgent security concerns, necessitating robust real-time surveillance systems. In response to the 2025 VIP Cup challenge tasks - drone detection, tracking, and payload identification - we propose a dual-stream drone monitoring framework. Our approach deploys independent You Only Look Once v11-nano (YOLOv11n) object detectors on parallel infrared (thermal) and visible (RGB) data streams, deliberately avoiding early fusion. This separation allows each model to be specifically optimized for the distinct characteristics of its input modality, addressing the unique challenges posed by small aerial objects in diverse environmental conditions. We customize data preprocessing and augmentation strategies per domain - such as limiting color jitter for IR imagery - and fine-tune training hyperparameters to enhance detection performance under conditions of heavy noise, low light, and motion blur. The resulting lightweight YOLOv11n models demonstrate high accuracy in distinguishing drones from birds and in classifying payload types, all while maintaining real-time performance. This report details the rationale for a dual-modality design, the specialized training pipelines, and the architectural optimizations that collectively enable efficient and accurate drone surveillance across RGB and IR channels.
Authors:Deepak Joshi, Mayukha Pal
Abstract:
Accurate detection of defects such as hotspots and snail trails in photovoltaic modules is essential for maintaining energy efficiency and system reliability. This work presents a supervised deep learning framework for segmenting thermal infrared images of PV panels, using a dataset of 277 aerial thermographic images captured by a Zenmuse XT infrared camera mounted on a DJI Matrice 100 drone. The preprocessing pipeline includes image resizing, CLAHE-based contrast enhancement, denoising, and normalisation. A lightweight semantic segmentation model based on SegFormer is developed, featuring a customised Transformer encoder and streamlined decoder, and fine-tuned on annotated images with manually labeled defect regions. To evaluate performance, we benchmark our model against U-Net, DeepLabV3, PSPNet, and Mask2Former using consistent preprocessing and augmentation. Evaluation metrics include per-class Dice score, F1-score, Cohen's kappa, mean IoU, and pixel accuracy. The SegFormer-based model outperforms baselines in accuracy and efficiency, particularly for segmenting small and irregular defects. Its lightweight design enables real-time deployment on edge devices and seamless integration with drone-based systems for automated inspection of large-scale solar farms.
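Two of the metrics listed above, per-class Dice score and IoU, can be computed from pixel counts as in the sketch below. The function name, the convention for a class absent from both maps, and the toy label maps are assumptions for this example and are not taken from the paper's evaluation code.

```python
def dice_and_iou(pred, target, cls):
    """Per-class Dice score and IoU for two equally sized label maps (lists of rows)."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == cls and t == cls:
                tp += 1  # pixel correctly labeled as cls
            elif p == cls:
                fp += 1  # predicted cls, ground truth differs
            elif t == cls:
                fn += 1  # ground truth cls missed by the prediction
    if tp + fp + fn == 0:
        # Class absent from both maps; treated as a perfect match here (a convention).
        return 1.0, 1.0
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return dice, iou


# Toy label maps: 0 = background, 1 = hotspot (hypothetical labels).
pred = [[0, 1, 1], [0, 1, 0]]
target = [[0, 1, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, target, cls=1)
print(dice, iou)
```

For a single class treated as a binary problem, the Dice score coincides with the pixel-level F1-score, and the two metrics are related by Dice = 2 IoU / (1 + IoU).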
Authors:Hamza Mettali, Rousset François, Eric Bideaux, Clausse Marc
Abstract:
The integration of renewable sources is essential for decarbonizing heat production in district energy networks. Beyond biomass-based solutions, solar thermal energy, with or without heat pumps, presents a significant opportunity. However, system performance is highly dependent on outdoor and setpoint temperatures. This study aims to optimize system design using a multi-criteria approach that considers techno-economic and environmental (CO2) factors. A Mixed-Integer Linear Programming (MILP) model is developed, incorporating temperature discretization for problem linearization and capturing key dynamic characteristics of heat generators. The model improves convergence, reducing a 19% MIP gap in 26 hours to 10% in 12 hours by dissipating 6% excess solar heat. A multi-scenario analysis under two carbon taxation levels and different CO2 emission cases revealed solar integration up to 11,932 m$^2$ but increased gas reliance (50%) and TES losses (49%). Wood boiler inclusion reduced solar dependency, covering 45% of heat, lowered LCOH, but limited renewable penetration. Higher carbon taxes boosted solar adoption but faced storage inefficiencies, while biomass enhanced cost efficiency and system stability.
Authors:Tianyuan Wang, Mark A Post, Mathieu Deremetz
Abstract:
The use of autonomous robots in space is an essential part of the "New Space" commercial ecosystem of assembly and re-use of space hardware components in Earth orbit and beyond. The STARFAB project aims to create a ground demonstration of an orbital automated warehouse as a hub for sustainable commercial operations and servicing. A critical part of this fully-autonomous robotic facility will be the capability to monitor, inspect, and assess the condition of both the components stored in the warehouse and the STARFAB facility itself. This paper introduces ongoing work on the STARFAB Mobile Inspection Module (MIM). The MIM uses Standard Interconnects (SI) so that it can be carried by Walking Manipulators (WM) as an independently-mobile robot, and multiple MIMs can be stored and retrieved as needed for operations on STARFAB. The MIM carries high-resolution cameras, a 3D profilometer, and a thermal imaging sensor, with the capability to add other modular sensors. A grasping tool and torque wrench are stored within the modular body for use by an attached WM for maintenance operations. Implementation and testing are still ongoing at the time of writing. This paper details the concept of operations for the MIM as an on-orbit autonomous inspection and maintenance system, the mechanical and electronic design of the MIM, and the sensor package used for non-destructive testing.
Authors:Marwan Hassini, Colette Mintsa-Eya, Eduardo Redondo-Iglesias, Pascal Venet
Abstract:
Understanding how batteries perform after automotive use is crucial to determining their potential for reuse. This article presents experimental results aimed at advancing knowledge of retired battery performance. Three modules extracted from electric vehicles were tested. Their performance was assessed, and the results were analyzed statistically using analysis of variance (ANOVA). The 36 retired cells exhibited a high level of performance, albeit with significant variation. On average, the cells had a 95% state of health capacity with a dispersion of 2.4%. The ANOVA suggests that cell performance is not correlated with cell position inside the module. These results demonstrate the need to evaluate dispersion within retired batteries and to develop thermal management and balancing systems for second-life batteries.
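The kind of test described above, checking whether a cell-level metric differs between groups such as positions in a module, can be sketched as a one-way ANOVA F statistic. The function name, the grouping into "edge", "middle", and "centre" positions, and the state-of-health values are made up for this example and are not data from the article.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA for a list of groups (each a list of numbers)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical state-of-health values (%) grouped by cell position in a module.
soh_by_position = {
    "edge": [94.1, 96.3, 95.0, 97.2],
    "middle": [95.5, 93.8, 96.9, 94.4],
    "centre": [96.0, 94.7, 95.2, 93.9],
}
f_stat = one_way_anova_f(list(soh_by_position.values()))
print(f_stat)
```

The F statistic is then compared against an F distribution with (k - 1, n - k) degrees of freedom; a small value relative to that distribution is consistent with position having no detectable effect.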
Authors:Nicholas Kirschbaum, Nathaniel Wood, Chang-Eun Kim, Thejaswi U. Tumkur, Chinedum Okwudire
Abstract:
Laser powder bed fusion (LPBF) is an additive manufacturing technique that has gained popularity thanks to its ability to produce geometrically complex, fully dense metal parts. However, these parts are prone to internal defects and geometric inaccuracies, stemming in part from variations in the melt pool. This paper proposes a novel vector-level feedforward control framework for regulating melt pool area in LPBF. By decoupling part-scale thermal behavior from small-scale melt pool physics, the controller provides a scale-agnostic prediction of melt pool area and efficient optimization over it. This is done by operating on two coupled lightweight models: a finite-difference thermal model that efficiently captures vector-level temperature fields and a reduced-order, analytical melt pool model. Each model is calibrated separately with minimal single-track and 2D experiments, and the framework is validated on a complex 3D geometry in both Inconel 718 and 316L stainless steel. Results showed that feedforward vector-level laser power scheduling reduced geometric inaccuracy in key dimensions by 62%, overall porosity by 16.5%, and photodiode variation by 6.8% on average. Overall, this modular, data-efficient approach demonstrates that proactively compensating for known thermal effects can significantly improve part quality while remaining computationally efficient and readily extensible to other materials and machines.
Authors:Aon Safdar, Usman Akram, Waseem Anwar, Basit Malik, Mian Ibad Ali
Abstract:
Automatic Target Detection (ATD) and Recognition (ATR) from Thermal Infrared (TI) imagery in the defense and surveillance domain is a challenging computer vision (CV) task in comparison to the commercial autonomous vehicle perception domain. Limited datasets, peculiar domain-specific and TI modality-specific challenges, i.e., limited hardware, scale invariance issues due to greater distances, deliberate occlusion by tactical vehicles, lower sensor resolution and resultant lack of structural information in targets, effects of weather, temperature, and time of day variations, and varying target to clutter ratios all result in increased intra-class variability and higher inter-class similarity, making accurate real-time ATR a challenging CV task. Resultantly, contemporary state-of-the-art (SOTA) deep learning architectures underperform in the ATR domain. We propose a modified anchor-based single-stage detector, called YOLOatr, based on a modified YOLOv5s, with optimal modifications to the detection heads, feature fusion in the neck, and a custom augmentation profile. We evaluate the performance of our proposed model on a comprehensive DSIAC MWIR dataset for real-time ATR over both correlated and decorrelated testing protocols. The results demonstrate that our proposed model achieves state-of-the-art ATR performance of up to 99.6%.
Authors:Michael Ryan, Mohammad Hassan Baqershahi, Hessamoddin Moshayedi, Elyas Ghafoori
Abstract:
Wire-arc directed energy deposition (DED) has emerged as a promising additive manufacturing (AM) technology for large-scale structural engineering applications. However, the complex thermal dynamics inherent to the process present challenges in ensuring structural integrity and mechanical properties of fabricated thick walls and plates. While finite element method (FEM) simulations have been conventionally employed to predict thermal history during deposition, their computational demand remains prohibitively high for actual large-scale applications. Given the necessity of multiple repetitive simulations for heat management and the determination of an optimal printing strategy, FEM simulation quickly becomes entirely infeasible. Instead, advancements have been made in using trained neural networks as surrogate models for rapid prediction. However, traditional data-driven approaches necessitate large amounts of relevant and verifiable external data during the training and validation of the neural network. Regarding large-scale wire-arc DED, none of these data sources are readily available in quantities sufficient for an accurate surrogate. The introduction of physics-informed neural networks (PINNs) has opened up an alternative simulation strategy by leveraging the existing physical knowledge of the phenomena with advanced machine learning methods. Despite their theoretical advantages, PINNs have seen limited application in the context of large-scale wire-arc DED for structural engineering. This study investigates the scalability of PINNs, focusing on efficient collocation point sampling, a critical factor controlling both the training time and model performance. Results show PINNs can reduce computational time and effort by up to 98.6%, while maintaining the desired accuracy and offering "super-resolution". Future directions for enhancing PINN performance in metal AM are discussed.
Authors:Luiz Aldeia Machado, Victor Coppo Leite, Elia Merzari, Arthur Motta, Roberto Ponciroli, Lander Ibarra, Lise Charlot
Abstract:
Proactive maintenance strategies, such as Predictive Maintenance (PdM), play an important role in the operation of Nuclear Power Plants (NPPs), particularly due to their capacity to reduce offline time by preventing unexpected shutdowns caused by component failures.
In this work, we explore the use of a Convolutional Neural Network (CNN) architecture combined with a computational thermomechanical model to calculate the temperature, stress, and strain of a Pressurized Water Reactor (PWR) fuel rod during operation. This estimation relies on a limited number of temperature measurements from the cladding's outer surface. This methodology can potentially aid in developing PdM tools for nuclear reactors by enabling real-time monitoring of such systems.
The training, validation, and testing datasets were generated through coupled simulations involving BISON, a finite element-based nuclear fuel performance code, and the MOOSE Thermal-Hydraulics Module (MOOSE-THM). We conducted eleven simulations, varying the peak linear heat generation rates. Of these, eight were used for training, two for validation, and one for testing.
The CNN was trained for over 1,000 epochs without signs of overfitting, achieving highly accurate temperature distribution predictions. These were then used in a thermomechanical model to determine the stress and strain distribution within the fuel rod.
Authors:Soroush Shahi, Farzad Shahabi, Rama Nabulsi, Glenn Fernandes, Aggelos Katsaggelos, Nabil Alshurafa
Abstract:
Wearable cameras are increasingly used as an observational and interventional tool for human behaviors by providing detailed visual data of hand-related activities. This data can be leveraged to facilitate memory recall for logging of behavior or timely interventions aimed at improving health. However, continuous processing of RGB images from these cameras consumes significant power impacting battery lifetime, generates a large volume of unnecessary video data for post-processing, raises privacy concerns, and requires substantial computational resources for real-time analysis. We introduce THOR, a real-time adaptive spatio-temporal RGB frame sampling method that leverages thermal sensing to capture hand-object patches and classify them in real-time. We use low-resolution thermal camera data to identify moments when a person switches from one hand-related activity to another, and adjust the RGB frame sampling rate by increasing it during activity transitions and reducing it during periods of sustained activity. Additionally, we use the thermal cues from the hand to localize the region of interest (i.e., the hand-object interaction) in each RGB frame, allowing the system to crop and process only the necessary part of the image for activity recognition. We develop a wearable device to validate our method through an in-the-wild study with 14 participants and over 30 activities, and further evaluate it on Ego4D (923 participants across 9 countries, totaling 3,670 hours of video). Our results show that using only 3% of the original RGB video data, our method captures all the activity segments, and achieves hand-related activity recognition F1-score (95%) comparable to using the entire RGB video (94%). Our work provides a more practical path for the longitudinal use of wearable cameras to monitor hand-related activities and health-risk behaviors in real time.
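The adaptive sampling idea described above can be sketched as follows. This is not the THOR algorithm itself: the change score (mean absolute difference between consecutive low-resolution thermal frames normalized to [0, 1]), the threshold, and the two frame rates are all assumptions chosen for illustration.

```python
def mean_abs_diff(prev_frame, cur_frame):
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_prev, row_cur in zip(prev_frame, cur_frame):
        for a, b in zip(row_prev, row_cur):
            total += abs(a - b)
            count += 1
    return total / count


def choose_rgb_fps(thermal_change, threshold=0.2, low_fps=1.0, high_fps=15.0):
    """Pick an RGB sampling rate from a thermal change score.

    A large change is treated as a likely activity transition (sample
    faster); a small change as sustained activity (sample slower).
    The threshold and rates are hypothetical values.
    """
    return high_fps if thermal_change > threshold else low_fps


# Two tiny hypothetical thermal frames with values normalized to [0, 1].
previous = [[0.0, 0.0], [0.0, 0.0]]
current = [[1.0, 1.0], [0.0, 0.0]]
fps = choose_rgb_fps(mean_abs_diff(previous, current))
```

A real system would likely smooth the change score over time to avoid rapid switching between rates; that refinement is omitted here.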
Authors:Roham Maiti, Debasmita Bhoumik
Abstract:
The brain plays a crucial role in regulating body functions and cognitive processes, and brain tumors pose significant risks to human health. Precise and prompt detection is a key factor in proper treatment and better patient outcomes. Traditional methods for detecting brain tumors, which include biopsies, MRI, and CT scans, often face challenges due to their high costs and the need for specialized medical expertise. Recent developments in machine learning (ML) and deep learning (DL) have exhibited strong capabilities in automating the identification and categorization of brain tumors from medical images, especially MRI scans. However, these classical ML models have limitations, such as high computational demands, the need for large datasets, and long training times, which hinder their accessibility and efficiency. Our research uses the MobileNet model for efficient detection of these tumors. The novelty of this project lies in building an accurate tumor detection model that uses fewer computing resources and runs in less time, followed by efficient decision making through the use of image processing techniques for accurate results. The suggested method attained an average accuracy of 98.5%.
Authors:J. de Curtò, Cristina LiCalzi, Julien Tubiana Warin, Jack Gehlert, Brian Langbein, Alexandre Gamboa, Chris Sixbey, William Maguire, Santiago Fernández, Álvaro Maestroarena, Alex Brenchley, Logan Maroclo, Philemon Mercado, Joshua DeJohn, Cesar Velez, Ethan Dahmus, Taylor Steinys, David Fritz, I. de Zarzà
Abstract:
This paper presents innovative solutions to critical challenges in planetary and deep-space exploration electronics. We synthesize findings across diverse mission profiles, highlighting advances in: (1) MARTIAN positioning systems with dual-frequency transmission to achieve $\pm$1m horizontal accuracy; (2) artificial reef platforms for Titan's hydrocarbon seas utilizing specialized sensor arrays and multi-stage communication chains; (3) precision orbital rendezvous techniques demonstrating novel thermal protection solutions; (4) miniaturized CubeSat architectures for asteroid exploration with optimized power-to-mass ratios; and (5) next-generation power management systems for MARS rovers addressing dust accumulation challenges. These innovations represent promising directions for future space exploration technologies, particularly in environments where traditional Earth-based electronic solutions prove inadequate. The interdisciplinary nature of these developments highlights the critical intersection of aerospace engineering, electrical engineering, and planetary science in advancing human exploration capabilities beyond Earth orbit.
Authors:Cyrus Addy, Ajay Kumar Gurumadaiah, Yixiang Gao, Kwame Awuah-Offei
Abstract:
Underground mining operations face significant safety challenges that make emergency response capabilities crucial. While robots have shown promise in assisting with search and rescue operations, their effectiveness depends on reliable miner detection capabilities. Deep learning algorithms offer potential solutions for automated miner detection, but require comprehensive training datasets, which are currently lacking for underground mining environments. This paper presents a novel thermal imaging dataset specifically designed to enable the development and validation of miner detection systems for potential emergency applications. We systematically captured thermal imagery of various mining activities and scenarios to create a robust foundation for detection algorithms. To establish baseline performance metrics, we evaluated several state-of-the-art object detection algorithms including YOLOv8, YOLOv10, YOLO11, and RT-DETR on our dataset. While not exhaustive of all possible emergency situations, this dataset serves as a crucial first step toward developing reliable thermal-based miner detection systems that could eventually be deployed in real emergency scenarios. This work demonstrates the feasibility of using thermal imaging for miner detection and establishes a foundation for future research in this critical safety application.
Authors:Graydon Schulze-Kalt, Robert Pitu, Spencer Shelton, Catherine Todd, Zane Ebel, Ian Goldberg, Leon Gold, Henry Czarnecki, Mason McCormack, Larry Li, Zumi Riekse, Brian Yu, Akash Piya, Vidya Suri, Dylan Hu, Colleen Kim, John Baird, Seth Knights, Logan Hanssler, Michael Lembeck, Tian Zhong
Abstract:
The undergraduate-led Polarization-modUlated Laser Satellite Experiment (PULSE-A) at the University of Chicago seeks to demonstrate the feasibility of circular polarization shift keyed satellite-to-ground laser communication. PULSE-A's low-cost open-source bus serves as the backbone of the mission and has been designed in tandem with the Payload, with design driven by strict requirements for pointing accuracy, component alignment, power demand, and thermal stability. This work presents the design and testing of the PULSE-A bus.
The spacecraft bus was designed to fill two major needs: (1) to meet the requirements of the PULSE-A mission, and (2) to be easily configurable for future missions that desire enhanced capabilities over other low-cost open-source designs. At its core, the bus features dual BeagleBone Black Industrial compute units, selected for their flight heritage, integrated via a PC/104 header standard. PULSE-A implements Goddard Space Flight Center's core Flight System (cFS), which takes a modular software architecture approach and is built in C. The use of C as the primary language aligns with the expertise of the University of Chicago's Computer Science department, allowing for ease of development by PULSE-A's undergraduate flight software team.
The CubeSat structure utilizes Gran Systems' 3U frame, modified to accommodate openings for various ports and deployable components. Inside, the avionics stack uses the PC/104 standard quad rails, which terminate in PULSE-A's custom-designed Payload Box that houses all of the Payload components and optical fiber runs. This work also covers the techniques and iterative engineering processes used to develop the thermal control and dissipation mechanisms for the specific requirements, under volume, mass, and temperature-range constraints.
Authors:Ján Boldocký, Shahriar Dadras Javan, Martin Gulan, Martin Mönnigmann, Ján Drgoňa
Abstract:
We propose a novel approach to solving input- and state-constrained parametric mixed-integer optimal control problems using Differentiable Predictive Control (DPC). Our approach follows the differentiable programming paradigm by learning an explicit neural policy that maps control parameters to integer- and continuous-valued decision variables. This policy is optimized via stochastic gradient descent by differentiating the quadratic model predictive control objective through the closed-loop finite-horizon response of the system dynamics. To handle integrality constraints, we incorporate three differentiable rounding strategies. The approach is evaluated on a conceptual thermal energy system, comparing its performance with the optimal solution for different lengths of the prediction horizon. The simulation results indicate that our self-supervised learning approach can achieve near-optimal control performance while significantly reducing inference time by avoiding online optimization, thus implying its potential for embedded deployment even on edge devices.
Authors:Mahdi Falaki, Maria A. Amer
Abstract:
Single-modality object tracking (e.g., RGB-only) encounters difficulties in challenging imaging conditions, such as low illumination and adverse weather conditions. To solve this, multimodal tracking (e.g., RGB-T models) aims to leverage complementary data such as thermal infrared features. While recent Vision Transformer-based multimodal trackers achieve strong performance, they are often computationally expensive due to large model sizes. In this work, we propose a novel lightweight RGB-T tracking algorithm based on Mobile Vision Transformers (MobileViT). Our tracker introduces a progressive fusion framework that jointly learns intra-modal and inter-modal interactions between the template and search regions using separable attention. This design produces effective feature representations that support more accurate target localization while achieving a small model size and fast inference speed. Compared to state-of-the-art efficient multimodal trackers, our model achieves comparable accuracy while offering significantly lower parameter counts (less than 4 million) and the fastest GPU inference speed of 122 frames per second. This paper is the first to propose a tracker using Mobile Vision Transformers for RGB-T tracking and multimodal tracking at large. Tracker code and model weights will be made publicly available upon acceptance.
Authors:Maoyuan Li, Sihong Li, Guancheng Shen, Yun Zhang, Huamin Zhou
Abstract:
To address the challenges of untimely detection and online monitoring lag in injection molding quality anomalies, this study proposes a mixed feature attention-artificial neural network (MFA-ANN) model for high-precision online prediction of product weight. By integrating mechanism-based with data-driven analysis, the proposed architecture decouples time series data (e.g., melt flow dynamics, thermal profiles) from non-time series data (e.g., mold features, pressure settings), enabling hierarchical feature extraction. A self-attention mechanism is strategically embedded during cross-domain feature fusion to dynamically calibrate inter-modality feature weights, thereby emphasizing critical determinants of weight variability. The results demonstrate that the MFA-ANN model achieves an RMSE of 0.0281 with 0.5 g weight fluctuation tolerance, outperforming conventional benchmarks: a 25.1% accuracy improvement over non-time series ANN models, 23.0% over LSTM networks, 25.7% over SVR, and 15.6% over RF models. Ablation studies quantitatively validate the synergistic enhancement derived from the integration of mixed feature modeling (contributing 22.4%) and the attention mechanism (contributing 11.2%), significantly enhancing the model's adaptability to varying working conditions and its resistance to noise. Moreover, critical sensitivity analyses further reveal that data resolution significantly impacts prediction reliability: low-fidelity sensor inputs degrade performance by 23.8% RMSE compared to high-precision measurements. Overall, this study provides an efficient and reliable solution for the intelligent quality control of injection molding processes.
Authors:Ali Chouman, Peter Riederer, Frédéric Wurtz
Abstract:
Climate change poses a serious threat to the Earth's ecosystems, fueled primarily by escalating greenhouse gas emissions. Among the main contributors, the building sector stands out due to its significant energy demand. Addressing this challenge requires innovative techniques in the control of energy systems in buildings. This paper deals with the formulation of a methodology designed to evaluate the performance of these controllers. The evaluation process involves the establishment of a comprehensive test protocol and a diverse set of scenarios to evaluate the controllers. Key performance indicators are used to quantify their effectiveness based on the test results. A practical case study is presented as an application to introduce this methodology, focusing on the integration of Model Predictive Controllers (MPCs) with the Dimosim thermal simulation platform. The digital twin of the Greener building in Grenoble is used as a model for emulation. The paper demonstrates the ability of the proposed methodology to test and rank MPCs in different test scenarios, providing valuable feedback on their performance capabilities. The paper highlights the importance of the developed approach in systematically evaluating and ranking MPCs for optimized building energy management.
Authors:Alborz Jelvani, Richard P Martin, Santosh Nagarakatte
Abstract:
Processors with dynamic power management provide a variety of settings to control energy efficiency. However, tuning these settings does not achieve optimal energy savings. We highlight how existing power capping mechanisms can address these limitations without requiring any changes to current power governors. We validate this approach using system measurements across a month-long data acquisition campaign from SPEC CPU 2017 benchmarks on a server-class system equipped with dual Intel Xeon Scalable processors. Our results indicate that setting a simple power cap can improve energy efficiency by up to 25% over traditional energy-saving system configurations with little performance loss, as most default settings focus on thermal regulation and performance rather than compute efficiency. Power capping is very accessible compared to other approaches, as it can be implemented with a single Linux command. Our results point to programmers and administrators using power caps as a primary mechanism to maintain significant energy efficiency while retaining acceptable performance, as opposed to deploying complex DVFS algorithms.
Authors:Wouter J. Schuttert, Mohammed Iqbal Abdul Rasheed, Bojana Rosić
Abstract:
Anisotropic material properties, such as the thermal conductivities of engineering composites, exhibit variability due to inherent material heterogeneity and manufacturing-related uncertainties. Mathematically, these properties are modeled as symmetric positive definite (SPD) tensors, which reside on a curved Riemannian manifold. Extending this description to a stochastic framework requires preserving both the SPD structure and the underlying spatial symmetries of the tensors. This is achieved through the spectral decomposition of tensors, which enables the parameterization of uncertainties into scale (strength) and rotation (orientation) components. To quantify the impact of strength and orientation uncertainties on the thermal behaviour of the composite, the stochastic material tensor must be propagated through a physics-based forward model. This process necessitates computationally efficient surrogate models, for which a feedforward neural network (FNN) is employed. However, conventional FNN architectures are not well-suited for SPD tensors, as directly using tensor components as input features fails to preserve their underlying geometric structure, often leading to suboptimal performance. To address this issue, we introduce the Constitutive Manifold Neural Network (CMNN), which incorporates input layers that map SPD tensors from the curved manifold to the local tangent space (a flat vector space), thus preserving the statistical and geometric information in the dataset. A case study involving steady-state heat conduction with stochastic anisotropic conductivity demonstrates that the geometry-preserving neural network significantly enhances learning performance compared to conventional multilayer perceptrons (MLPs). These findings underscore the importance of manifold-aware methods when working with tensor-valued data in engineering applications.
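One common way to map an SPD tensor to a flat tangent space is the matrix logarithm, computed through the spectral decomposition into eigenvalues (strength) and a rotation (orientation). The sketch below does this for a 2x2 conductivity tensor. It assumes the tangent space at the identity; the abstract does not specify the mapping used by the CMNN input layer, so the actual layer may differ. The function name is hypothetical.

```python
import math


def spd_log_2x2(a, b, c):
    """Matrix logarithm of the SPD matrix [[a, b], [b, c]].

    Returns the three independent entries (la, lb, lc) of the symmetric
    result [[la, lb], [lb, lc]].
    """
    half_trace = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam1 = half_trace + radius   # larger eigenvalue (strength)
    lam2 = half_trace - radius   # smaller eigenvalue (strength)
    if lam2 <= 0.0:
        raise ValueError("matrix is not positive definite")
    # Rotation angle of the eigenvector basis (orientation).
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    l1, l2 = math.log(lam1), math.log(lam2)
    # Reassemble Q diag(log lam) Q^T with Q = [[cs, -sn], [sn, cs]].
    la = l1 * cs * cs + l2 * sn * sn
    lb = (l1 - l2) * cs * sn
    lc = l1 * sn * sn + l2 * cs * cs
    return la, lb, lc


# Hypothetical anisotropic conductivity tensor [[2, 1], [1, 2]].
tangent_features = spd_log_2x2(2.0, 1.0, 2.0)
```

The returned entries are unconstrained real numbers, so they can be fed to an ordinary dense layer without risking a violation of positive definiteness.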
Authors:Ngoc Tuyen Do, Tri Nhu Do
Abstract:
In the surveillance and defense domain, multi-target detection and classification (MTD) is considered essential yet challenging due to heterogeneous inputs from diverse data sources and the computational complexity of algorithms designed for resource-constrained embedded devices, particularly for AI-based solutions. To address these challenges, we propose a feature fusion and knowledge-distilled framework for multi-modal MTD that leverages data fusion to enhance accuracy and employs knowledge distillation for improved domain adaptation. Specifically, our approach utilizes both RGB and thermal image inputs within a novel fusion-based multi-modal model, coupled with a distillation training pipeline. We formulate the problem as a posterior probability optimization task, which is solved through a multi-stage training pipeline supported by a composite loss function. This loss function effectively transfers knowledge from a teacher model to a student model. Experimental results demonstrate that our student model achieves approximately 95% of the teacher model's mean Average Precision while reducing inference time by approximately 50%, underscoring its suitability for practical MTD deployment scenarios.
Authors:Joseph Sullivan, Ian Good, Samuel A. Burden, Jeffrey Ian Lipton
Abstract:
Energy efficiency is critical to the success of legged robotics. Efficiency is lost through wasted energy during locomotion and standing. Including elastic elements has been shown to reduce movement costs, while including brakes can reduce standing costs. However, adding separate elements for each increases the mass and complexity of a leg, reducing overall system performance. Here we present a novel compliant mechanism using a Handed Shearing Auxetic (HSA) that acts as a spring and brake in a monopod hopping robot. The HSA acts as a parallel elastic actuator, reducing electrical power for dynamic hopping and matching the efficiency of state-of-the-art compliant hoppers. The HSA's auxetic behavior enables dual functionality. During static tasks, it locks under large forces with minimal input power by blocking deformation, creating high friction similar to a capstan mechanism. This allows the leg to support heavy loads without motor torque, addressing thermal inefficiency. The multi-functional design enhances both dynamic and static performance, offering a versatile solution for robotic applications.
Authors:Carlos A. Vargas Venegas, Daning Huang, Patrick Blonigan, John Tencer
Abstract:
This work presents a physics-infused reduced-order modeling (PIROM) framework for efficient and accurate prediction of transient thermal behavior in multi-layered hypersonic thermal protection systems (TPS). The PIROM architecture integrates a reduced-physics backbone, based on the lumped-capacitance model (LCM), with data-driven correction dynamics formulated via a coarse-graining approach rooted in the Mori-Zwanzig formalism. While the LCM captures the dominant heat transfer mechanisms, the correction terms compensate for residual dynamics arising from higher-order non-linear interactions and heterogeneities across material layers. The proposed PIROM is benchmarked against two non-intrusive reduced-order models (ROMs): Operator Inference (OpInf) and Neural Ordinary Differential Equations (NODE). The PIROM consistently achieves errors below 1% for a wide range of extrapolative settings involving time- and space-dependent boundary conditions and temperature-varying material property perturbations. In contrast, OpInf exhibits moderate degradation, and NODE suffers substantial loss in accuracy due to its lack of embedded physics. Despite higher training costs, PIROM delivers online evaluations two orders of magnitude faster than the full-order model. These results demonstrate that PIROM effectively reconciles the trade-offs between accuracy, generalizability, and efficiency, providing a robust framework for thermal modeling of TPS under diverse operating conditions.
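For readers unfamiliar with the lumped-capacitance model, the sketch below shows its textbook single-node form, dT/dt = -(T - T_inf)/tau with tau = rho V c_p / (h A), integrated with explicit Euler and compared against the closed-form solution. This is only the generic LCM with constant properties and a convective boundary; the multi-layer backbone and the learned correction terms of PIROM are not reproduced, and all parameter values are hypothetical.

```python
import math


def lcm_time_constant(rho, volume, c_p, h, area):
    """Thermal time constant tau = rho * V * c_p / (h * A), in seconds."""
    return rho * volume * c_p / (h * area)


def lcm_temperature(t, initial_temp, ambient, tau):
    """Closed-form LCM temperature at time t."""
    return ambient + (initial_temp - ambient) * math.exp(-t / tau)


def lcm_euler(t_end, dt, initial_temp, ambient, tau):
    """Integrate dT/dt = -(T - ambient) / tau with explicit Euler steps."""
    temp = initial_temp
    steps = int(round(t_end / dt))
    for _ in range(steps):
        temp += dt * (-(temp - ambient) / tau)
    return temp


# Hypothetical single-layer parameters in SI units.
tau = lcm_time_constant(rho=1000.0, volume=0.001, c_p=500.0, h=50.0, area=0.1)
exact = lcm_temperature(100.0, initial_temp=300.0, ambient=1000.0, tau=tau)
approx = lcm_euler(100.0, 0.01, initial_temp=300.0, ambient=1000.0, tau=tau)
```

The single exponential mode is what makes the LCM cheap to evaluate, and also why additional correction dynamics are needed when gradients inside a layer matter.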
Authors:Hanseong Jo, Pavel Shafirin, Christopher Le, Caden Chan, Artur Davoyan
Abstract:
Soft electrothermal actuators are of great interest in diverse application domains for their simplicity, compliance, and ease of control. However, the very nature of thermally induced mechanical actuation sets inherent operation constraints: unidirectional motion, environmental sensitivity, and slow response times limited by passive cooling. To overcome these constraints, we propose a meta-actuator architecture, which uses engineered heat transfer in thin films to achieve multifunctional operation. We demonstrate electrically selectable bidirectional motion with large deflection ($ \geq $28% of actuator length at 0.75 W), suppressed thermal sensitivity to ambient temperature changes when compared to conventional actuators (>100$ \times $ lower), and actively forced return to the rest state, which is 10 times faster than that with passive cooling. We further show that our meta-actuator approach enables extended ranges of motions for manipulating complex objects. Versatile soft gripper operations highlight the meta-actuator's potential for soft robotics and devices.
Authors:Zakia Tamanna Tisha, Ujjwal Guin
Abstract:
The modern semiconductor industry requires memory solutions that can keep pace with the high-speed demands of high-performance computing. Embedded non-volatile memories (eNVMs) address these requirements by offering faster access to stored data at an improved computational throughput and efficiency. Furthermore, these technologies offer numerous appealing features, including limited area-energy-runtime budget and data retention capabilities. Among these, the data retention feature of eNVMs has garnered particular interest within the semiconductor community. Although this property allows eNVMs to retain data even in the absence of a continuous power supply, it also introduces some vulnerabilities, prompting security concerns. These concerns have sparked increased interest in examining the broader security implications associated with eNVM technologies. This paper examines the security aspects of eNVMs by discussing the reasons for vulnerabilities in specific memories from an architectural point of view. Additionally, this paper extensively reviews eNVM-based security primitives, such as physically unclonable functions and true random number generators, as well as techniques like logic obfuscation. The paper also explores a broad spectrum of security threats to eNVMs, including physical attacks such as side-channel attacks, fault injection, and probing, as well as logical threats like information leakage, denial-of-service, and thermal attacks. Finally, the paper presents a study of publication trends in the eNVM domain since the early 2000s, reflecting the rising momentum and research activity in this field.
Authors:A. A. Solovykh, N. E. Rybin, I. S. Novikov, A. V. Shapeev
Abstract:
Accounting for nuclear quantum effects (NQEs) can significantly alter material properties at finite temperatures. Atomic modeling using the path-integral molecular dynamics (PIMD) method can fully account for such effects, but requires computationally efficient and accurate models of interatomic interactions. Empirical potentials are fast but may lack sufficient accuracy, whereas quantum-mechanical calculations are highly accurate but computationally expensive. Machine-learned interatomic potentials offer a solution to this challenge, providing near-quantum-mechanical accuracy while maintaining high computational efficiency compared to density functional theory (DFT) calculations. In this context, an interface was developed to integrate moment tensor potentials (MTPs) from the MLIP-2 software package into PIMD calculations using the i-PI software package. This interface was then applied to active learning of potentials and to investigate the influence of NQEs on material properties, namely the temperature dependence of lattice parameters and thermal expansion coefficients, as well as radial distribution functions, for lithium hydride (LiH) and silicon (Si) systems. The results were compared with experimental data, quasi-harmonic approximation calculations, and predictions from the universal machine learning force field MatterSim. These comparisons demonstrated the high accuracy and effectiveness of the MTP-PIMD approach.
Authors:Atsuya Kusui, Susumu Hirai, Asuka Takai
Abstract:
To reproduce natural standing-up motion, recent studies have emphasized the importance of coordination between the assisting robot and the human. However, many non-wearable assistive devices have struggled to replicate natural motion trajectories. While wearable devices offer better coordination with the human body, they present challenges in completely isolating mechanical and electrical hazards. To address this, we developed a novel standing-assist robot that integrates features of both wearable and non-wearable systems, aiming to achieve high coordination while maintaining safety. The device employs a four-link mechanism aligned with the human joint structure, designed to reproduce the S-shaped trajectory of the hip and the arc trajectory of the knee during natural standing-up motion. Subject-specific trajectory data were obtained using a gyroscope, and the link lengths were determined to drive the seat along the optimal path. A feedforward speed control using a stepping motor was implemented, and the reproducibility of the trajectory was evaluated based on the geometric constraints of the mechanism. A load-bearing experiment with weights fixed to the seat was conducted to assess the trajectory accuracy under different conditions. Results showed that the reproduction errors for the hip and knee trajectories remained within approximately 4 percent of the seat's total displacement, demonstrating high fidelity to the target paths. In addition, durability testing, thermal safety evaluation, and risk assessment confirmed the reliability and safety of the system for indoor use. These findings suggest that the proposed design offers a promising approach for developing assistive technologies that adapt to individual physical characteristics, with potential applications in elderly care and rehabilitation.
Authors:Jinke Li, Yue Wu, Xiaoyan Yang
Abstract:
Thermal Infrared (TIR) technology involves the use of sensors to detect and measure infrared radiation emitted by objects, and it is widely utilized across a broad spectrum of applications. The advancements in object detection methods utilizing TIR images have sparked significant research interest. However, most traditional methods lack the capability to effectively extract and fuse local-global information, which is crucial for TIR-domain feature attention. In this study, we present a novel and efficient thermal infrared object detection framework, known as CRT-YOLO, that is based on centralized feature regulation, enabling the establishment of global-range interaction on TIR information. Our proposed model integrates efficient multi-scale attention (EMA) modules, which adeptly capture long-range dependencies while incurring minimal computational overhead. Additionally, it leverages the Centralized Feature Pyramid (CFP) network, which offers global regulation of TIR features. Extensive experiments conducted on two benchmark datasets demonstrate that our CRT-YOLO model significantly outperforms conventional methods for TIR image object detection. Furthermore, the ablation study provides compelling evidence of the effectiveness of our proposed modules, reinforcing the potential impact of our approach on advancing the field of thermal infrared object detection.
Authors:Noboru Katayama, Rintaro Ishida
Abstract:
A fault detection method for power conversion circuits using thermal images and a convolutional autoencoder is presented. The autoencoder is trained on thermal images captured from a commercial power module at randomly varied load currents and on augmented images generated through image processing techniques such as resizing, rotation, perspective transformation, and brightness and contrast adjustment. Since the autoencoder is trained to output images identical to the input only for normal samples, it reconstructs images similar to normal ones even when the input images contain faults. A small heater is attached to the circuit board to simulate a fault on a power module, and thermal images are then captured from different angles and positions, as well as at various load currents, to test the trained autoencoder model. The areas under the curve (AUC) were obtained to evaluate the proposed method. The results show the autoencoder model can detect anomalies with 100% accuracy under the given conditions. The influence of hyperparameters such as the number of convolutional layers and image augmentation conditions on anomaly detection accuracy was also investigated.
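The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the anomaly score is the mean squared reconstruction error and that AUC is computed by pairwise comparison of fault and normal scores. The function names `reconstruction_error` and `roc_auc` are hypothetical.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between two equally sized, flattened images."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def roc_auc(normal_scores, fault_scores):
    """AUC as the probability that a random fault sample scores higher
    than a random normal sample (ties count as one half)."""
    wins = 0.0
    for f in fault_scores:
        for n in normal_scores:
            if f > n:
                wins += 1.0
            elif f == n:
                wins += 0.5
    return wins / (len(normal_scores) * len(fault_scores))
```

An AUC of 1.0 means every faulty image reconstructs worse than every normal image, which is the situation the abstract reports as 100% detection.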
Authors:Seokjun Kwon, Jeongmin Shin, Namil Kim, Soonmin Hwang, Yukyung Choi
Abstract:
In autonomous driving, thermal image semantic segmentation has emerged as a critical research area, owing to its ability to provide robust scene understanding under adverse visual conditions. In particular, unsupervised domain adaptation (UDA) for thermal image segmentation can be an efficient solution to address the lack of labeled thermal datasets. Nevertheless, since these methods do not effectively utilize the complementary information between RGB and thermal images, they significantly decrease performance during domain adaptation. In this paper, we present a comprehensive study on cross-spectral UDA for thermal image semantic segmentation. We first propose a novel masked mutual learning strategy that promotes complementary information exchange by selectively transferring results between each spectral model while masking out uncertain regions. Additionally, we introduce a novel prototypical self-supervised loss designed to enhance the performance of the thermal segmentation model in nighttime scenarios. This approach addresses the limitations of RGB pre-trained networks, which cannot effectively transfer knowledge under low illumination due to the inherent constraints of RGB sensors. In experiments, our method achieves higher performance over previous UDA methods and comparable performance to state-of-the-art supervised methods.
Authors:Tamilselvan Subramani, Sebastian Bartscher
Abstract:
Digital twins enable real-time simulation and prediction in engineering systems. This paper presents a novel framework for predictive digital twins of a headlamp heatsink, integrating physics-based reduced-order models (ROMs) from computational fluid dynamics (CFD) with supervised machine learning. A component-based ROM library, derived via proper orthogonal decomposition (POD), captures thermal dynamics efficiently. Machine learning models, including Decision Trees, k-Nearest Neighbors, Support Vector Regression (SVR), and Neural Networks, predict optimal ROM configurations, enabling rapid digital twin updates. The Neural Network achieves a mean absolute error (MAE) of 54.240, outperforming other models. Quantitative comparisons of predicted and original values demonstrate high accuracy. This scalable, interpretable framework advances thermal management in automotive systems, supporting robust design and predictive maintenance.
Authors:Dong Xing, Xianxun Zhu, Wei Zhou, Qika Lin, Hang Yang, Yuqing Wang
Abstract:
The recent Segment Anything Model (SAM) demonstrates strong instance segmentation performance across various downstream tasks. However, SAM is trained solely on RGB data, limiting its direct applicability to RGB-thermal (RGB-T) semantic segmentation. Given that RGB-T provides a robust solution for scene understanding in adverse weather and lighting conditions, such as low light and overexposure, we propose a novel framework, SARTM, which customizes the powerful SAM for RGB-T semantic segmentation. Our key idea is to unleash the potential of SAM while introducing semantic understanding modules for RGB-T data pairs. Specifically, our framework first involves fine-tuning the original SAM by adding extra LoRA layers, aiming at preserving SAM's strong generalization and segmentation capabilities for downstream tasks. Secondly, we introduce language information as guidance for training our SARTM. To address cross-modal inconsistencies, we introduce a Cross-Modal Knowledge Distillation (CMKD) module that effectively achieves modality adaptation while maintaining its generalization capabilities. This semantic module enables the minimization of modality gaps and alleviates semantic ambiguity, facilitating the combination of any modality under any visual conditions. Furthermore, we enhance the segmentation performance by adjusting the segmentation head of SAM and incorporating an auxiliary semantic segmentation head, which integrates multi-scale features for effective fusion. Extensive experiments are conducted across three multi-modal RGBT semantic segmentation benchmarks: MFNET, PST900, and FMB. Both quantitative and qualitative results consistently demonstrate that the proposed SARTM significantly outperforms state-of-the-art approaches across a variety of conditions.
Authors:Sarah Flanery, Anson Trapani, Christiana Chamon, Leyla Nazhandali
Abstract:
This study investigates a duality approach to information leak detection in the generalized Kirchhoff-Law-Johnson-Noise secure key exchange scheme proposed by Vadai, Mingesz, and Gingl (VMG-KLJN). While previous work by Chamon and Kish sampled voltages at zero-current instances, this research explores sampling currents at zero-voltage crossings. The objective is to determine if this dual approach can reveal information leaks in non-equilibrium KLJN systems. Our findings confirm that the duality method successfully detects information leaks, with results closely mirroring those of Chamon and Kish and showing comparable vulnerabilities in non-equilibrium conditions. These results further support the necessity of thermal equilibrium for unconditional security in the KLJN scheme.
Authors:Jerome Samuel S, Puneet Kumar Patra, Md Rushdie Ibne Islam
Abstract:
Many thermo-mechanical processes, such as thermal expansion and stress relaxation, originate at the atomistic scale. We develop a sequential multiscale approach to study thermally stressed superelastic polyimide to explore these effects. The continuum-scale smoothed particle hydrodynamics (SPH) model is coupled with atomistic molecular dynamics (MD) through constitutive modelling, where thermo-mechanical properties and equations of state are derived from MD simulations. The results are verified through benchmark problems of heat transfer. Finally, we analyse the insulating capabilities of superelastic polyimide by simulating the thermal response of an aluminium plate. The result shows a considerable reduction in the thermal stress, strain and temperature field development in the aluminium plate when superelastic polyimide is used as an insulator. The present work demonstrates the effectiveness of the multi-scale method in capturing thermo-mechanical interactions in superelastic polyimide.
Authors:Vallary Gupta, Ahana Sarkar, Chirag Deb, Arnab Jana
Abstract:
Energy-poor households often compromise their thermal comfort and refrain from operating mechanical cooling devices to avoid high electricity bills. This is compounded by certain behavioral practices like retention of older, less efficient appliances, resulting in missed energy savings. Thus, the need to enhance efficiency becomes critical in these households. However, due to a lack of comprehensive data in India, little is understood about their electricity consumption patterns and usage efficiency. Estimating inefficiency and assessing its determinants is crucial for improving their quality of life. This study measures the inefficiency in electricity consumption due to household practices and appliances in social housing in Mumbai, India. It considers technological determinants in addition to socio-economic variables. The study employs primary data collected from rehabilitation housing and slums in Mumbai. Stochastic frontier analysis, a parametric approach, is applied to estimate indicators of electricity consumption and inefficiency. While household size and workforce participation significantly affect consumption behavior in rehabilitation housing, it is limited to the workforce in slums. The ownership of appliances, except for washing machines in slums, also exhibits considerable impacts. The mean efficiency scores of 83% and 91% for rehabilitation housing and slums, respectively, empirically quantify the potential savings achievable. Factors that positively influence inefficiency include the duration of operating refrigerators, washing machines, iron, and AC. These results hold implications for enhancing the uptake of efficient appliances in addition to accelerating energy efficiency retrofits in the region. Policies should focus on awareness and the development of appliance markets through incentives.
Authors:Akshit Gupta, Remko Uijlenhoet
Abstract:
Healthy urban forests comprising diverse trees and shrubs play a crucial role in mitigating climate change. They offer several key advantages, such as providing shade for energy conservation and intercepting rainfall to reduce flood runoff and soil erosion. Traditional approaches for monitoring the health of urban forests require instrumented inspection techniques, often involving a high amount of human labor and subjective evaluations. As a result, they are not scalable for cities which lack extensive resources. Recent approaches involving multi-spectral imaging data based on terrestrial sensing and satellites are constrained, respectively, by challenges related to dedicated deployments and limited spatial resolutions. In this work, we propose an alternative approach for monitoring urban forests using simplified inputs: street view imagery, tree inventory data and meteorological conditions. We propose to use image-to-image translation networks to estimate two urban forest health parameters, namely, NDVI and CTD. Finally, we aim to compare the generated results with ground truth data using an onsite campaign utilizing handheld multi-spectral and thermal imaging sensors. With the advent and expansion of street view imagery platforms such as Google Street View and Mapillary, this approach should enable effective management of urban forests for the authorities in cities at scale.
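For readers unfamiliar with the two health parameters, the sketch below shows how they are commonly computed from sensor readings. NDVI follows its standard definition from near-infrared and red reflectance. The abstract does not expand CTD, so reading it as canopy temperature depression (air temperature minus canopy temperature) is an assumption here, as are the function and parameter names.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return 0.0  # undefined for a fully dark pixel; 0.0 is a placeholder
    return (nir - red) / denom


def canopy_temperature_depression(air_temp_c, canopy_temp_c):
    """Assumed definition: air temperature minus canopy temperature (deg C).
    Some sources use the opposite sign convention."""
    return air_temp_c - canopy_temp_c
```

NDVI lies in [-1, 1] for non-negative reflectances, with higher values generally indicating denser, healthier vegetation.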
Authors:Dekang Zhang, Dan Niu, Zhou Jin, Yichao Dong, Jingweijia Tan, Changyin Sun
Abstract:
In the post-Moore era, 2.5D chiplet-based ICs present significant challenges in thermal management due to increased power density and thermal hotspots. Neural network-based thermal prediction models can perform real-time predictions for many unseen new designs. However, existing CNN-based and GCN-based methods cannot effectively capture the global thermal features, especially for high-frequency components, hindering prediction accuracy enhancement. In this paper, we propose a novel frequency-spatial dual domain aware prediction network (FSA-Heat) for fast and high-accuracy thermal prediction in 2.5D ICs. It integrates high-to-low frequency and spatial domain encoder (FSTE) module with frequency domain cross-scale interaction module (FCIFormer) to achieve high-to-low frequency and global-to-local thermal dissipation feature extraction. Additionally, a frequency-spatial hybrid loss (FSL) is designed to effectively attenuate high-frequency thermal gradient noise and spatial misalignments. The experimental results show that the performance enhancements offered by our proposed method are substantial, outperforming the newly-proposed 2.5D method, GCN+PNA, by considerable margins (over 99% RMSE reduction, 4.23X inference time speedup). Moreover, extensive experiments demonstrate that FSA-Heat also exhibits robust generalization capabilities.
Authors:Xi Tong, Xing Luo, Jiangxin Yang, Yanpeng Cao
Abstract:
Multispectral imaging plays a critical role in a range of intelligent transportation applications, including advanced driver assistance systems (ADAS), traffic monitoring, and night vision. However, accurate visible and thermal (RGB-T) image registration poses a significant challenge due to the considerable modality differences. In this paper, we present a novel joint Self-Correlation and Cross-Correspondence Estimation Framework (SC3EF), leveraging both local representative features and global contextual cues to effectively generate RGB-T correspondences. For this purpose, we design a convolution-transformer-based pipeline to extract local representative features and encode global correlations of intra-modality for inter-modality correspondence estimation between unaligned visible and thermal images. After merging the local and global correspondence estimation results, we further employ a hierarchical optical flow estimation decoder to progressively refine the estimated dense correspondence maps. Extensive experiments demonstrate the effectiveness of our proposed method, outperforming the current state-of-the-art (SOTA) methods on representative RGB-T datasets. Furthermore, it also shows competitive generalization capabilities across challenging scenarios, including large parallax, severe occlusions, adverse weather, and other cross-modal datasets (e.g., RGB-N and RGB-D).
Authors:Mahdi Hasanzadeh, Kasem Khalil, Cynthia Sturton, Ahmad Patooghy
Abstract:
Multi-Processor System-on-Chips (MPSoCs) are highly vulnerable to thermal attacks that manipulate dynamic thermal management systems. To counter this, we propose an adaptive real-time monitoring mechanism that detects abnormal thermal patterns in chip tiles. Our design space exploration helped identify key thermal features for an efficient anomaly detection module to be implemented at routers of network-enabled MPSoCs. To minimize hardware overhead, we employ weighted moving average (WMA) calculations and bit-shift operations, ensuring a lightweight yet effective implementation. By defining a spectrum of abnormal behaviors, our system successfully detects and mitigates malicious temperature fluctuations, reducing severe cases from 3.00°C to 1.9°C. The anomaly detection module achieves up to 82% accuracy in detecting thermal attacks, which is only 10-15% less than top-performing machine learning (ML) models like Random Forest. However, our approach reduces hardware usage by up to 75% for logic resources and 100% for specialized resources, making it significantly more efficient than ML-based solutions. This method provides a practical, low-cost solution for resource-constrained environments, ensuring resilience against thermal attacks while maintaining system performance.
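The idea of a shift-based weighted moving average can be illustrated as follows. This is a sketch under stated assumptions, not the paper's design: it assumes integer fixed-point temperatures, a power-of-two weight so that the division becomes a right shift, and a simple deviation threshold for flagging. The names `wma_update`, `detect_anomalies`, `shift`, and `threshold` are hypothetical.

```python
def wma_update(avg, sample, shift=2):
    """avg <- avg + (sample - avg) / 2**shift, using only add, subtract, shift.
    With shift=2 the new sample gets weight 1/4 and the history 3/4."""
    return avg + ((sample - avg) >> shift)


def detect_anomalies(samples, threshold, shift=2):
    """Flag each sample whose deviation from the running average exceeds
    the threshold. Samples are integers, e.g. temperature in 1/16 deg C."""
    avg = samples[0]
    flags = []
    for s in samples:
        flags.append(abs(s - avg) > threshold)
        avg = wma_update(avg, s, shift)
    return flags
```

Because the update needs no multiplier or divider, it maps onto a few adders and wiring in hardware, which is the kind of saving the abstract attributes to bit-shift operations.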
Authors:Martin Kocur, Niels Henze
Abstract:
Understanding thermal regulation and subjective perception of temperature is crucial for improving thermal comfort and human energy consumption in times of global warming. Previous work shows that an environment's color temperature affects the experienced temperature. As virtual reality (VR) enables visual immersion, recent work suggests that a VR scene's color temperature also affects experienced temperature. In addition, virtual avatars representing thermal cues influence users' thermal perception and even the body temperature. As immersive technology becomes increasingly prevalent in daily life, leveraging thermal cues to enhance thermal comfort - without relying on actual thermal energy - presents a promising opportunity. Understanding these effects is crucial for optimizing virtual experiences and promoting sustainable energy practices. Therefore, we propose three controlled experiments to learn more about thermal effects caused by virtual worlds and avatars.
Authors:Bojja Venu, Adam Bosak, Juan Raul Padron-Griffe
Abstract:
Materials exhibit geometric structures across mesoscopic to microscopic scales, influencing macroscale properties such as appearance, mechanical strength, and thermal behavior. Capturing and modeling these multiscale structures is challenging but essential for computer graphics, engineering, and materials science. We present a framework inspired by hypertexture methods, using implicit functions and adaptive sphere tracing to synthesize multiscale structures on the fly without precomputation. This framework models volumetric materials with particulate, fibrous, porous, and laminar structures, allowing control over size, shape, density, distribution, and orientation. We enhance structural diversity by superimposing implicit periodic functions while improving computational efficiency. The framework also supports spatially varying particulate media, particle agglomeration, and piling on convex and concave structures, such as rock formations (mesoscale), without explicit simulation. We show its potential in the appearance modeling of volumetric materials and explore how spatially varying properties influence perceived macroscale appearance. Our framework enables seamless multiscale modeling, reconstructing procedural volumetric materials from image and signed distance field (SDF) synthetic exemplars using first-order and gradient-free optimization.
Authors:Athanasios Athanasopoulos, Matúš Mihalák, Marcin Pietrasik
Abstract:
One of the key safety considerations of battery manufacturing is thermal runaway, the uncontrolled increase in temperature which can lead to fires, explosions, and emissions of toxic gases. As such, development of automated systems capable of detecting such events is of considerable importance in both academic and industrial contexts. In this work, we investigate the use of deep learning for detecting thermal runaway in the battery production line of VDL Nedcar, a Dutch automobile manufacturer. Specifically, we collect data from the production line to represent both baseline (non-thermal-runaway) and thermal runaway conditions. Thermal runaway was simulated through the use of external heat and smoke sources. The data consisted of both optical and thermal images which were then preprocessed and fused before serving as input to our models. In this regard, we evaluated three deep-learning models widely used in computer vision, namely shallow convolutional neural networks, residual neural networks, and vision transformers, on two performance metrics. Furthermore, we evaluated these models using explainability methods to gain insight into their ability to capture the relevant feature information from their inputs. The obtained results indicate that the use of deep learning is a viable approach to thermal runaway detection in battery production lines.
Authors:Ramachandran Anantharaman, Carlos Gonzalez Rojas, Luna Artemis van Leeuwen, Leyla Özkan
Abstract:
Heat exchangers (HEXs) play a central role in process industries for thermal energy transfer. Fouling, the gradual accumulation of solids on heat transfer surfaces, causes a time-varying decrease in the overall heat transfer coefficient $U(t)$, significantly impacting the efficiency of heat transfer. Good estimation and modeling of fouling (that is, of the heat transfer coefficient) will lead to better fouling mitigation strategies. This study investigates the identifiability of the time-varying $U(t)$ in HEXs from closed-loop operational data, without external excitation of reference signals or knowledge of the controller parameters. We establish that while the complete system model cannot be identified under these constraints, the time-varying heat transfer coefficient $U(t)$ remains identifiable. Further, we propose a neural-network-based architecture, called Per-PINN, for estimating and modeling the heat transfer coefficient from the closed-loop system data. This Per-PINN model is shown to perform better than existing Physics-Informed Neural Network (PINN) based models for inverse parameter learning, as it inherently fixes the underlying physical equations and learns only the time-varying parameter $U(t)$.
Authors:Weizheng Zhang, Hao Pan, Lin Lu, Xiaowei Duan, Xin Yan, Ruonan Wang, Qiang Du
Abstract:
Heat exchangers are critical components in a wide range of engineering applications, from energy systems to chemical processing, where efficient thermal management is essential. The design objectives for heat exchangers include maximizing the heat exchange rate while minimizing the pressure drop, requiring both a large interface area and a smooth internal structure. State-of-the-art designs, such as triply periodic minimal surfaces (TPMS), have proven effective in optimizing heat exchange efficiency. However, TPMS designs are constrained by predefined mathematical equations, limiting their adaptability to freeform boundary shapes. Additionally, TPMS structures do not inherently control flow directions, which can lead to flow stagnation and undesirable pressure drops.
This paper presents DualMS, a novel computational framework for optimizing dual-channel minimal surfaces specifically for heat exchanger designs in freeform shapes. To the best of our knowledge, this is the first attempt to directly optimize minimal surfaces for two-fluid heat exchangers, rather than relying on TPMS. Our approach formulates the heat exchange maximization problem as a constrained connected maximum cut problem on a graph, with flow constraints guiding the optimization process. To address undesirable pressure drops, we model the minimal surface as a classification boundary separating the two fluids, incorporating an additional regularization term for area minimization. We employ a neural network that maps spatial points to binary flow types, enabling it to classify flow skeletons and automatically determine the surface boundary. DualMS demonstrates greater flexibility in surface topology compared to TPMS and achieves superior thermal performance, with lower pressure drops while maintaining a similar heat exchange rate under the same material cost.
Authors:Sara Ruiz-Moreno, Antonio J. Gallego, Manuel Macías, Eduardo F. Camacho
Abstract:
This paper presents a novel method to optimize thermal balance in parabolic trough collector (PTC) plants. It uses a market-based system to distribute flow among loops combined with an artificial neural network (ANN) to reduce computation and data requirements. This auction-based approach balances loop temperatures, accommodating varying thermal losses and collector efficiencies. Validation across different thermal losses, optical efficiencies, and irradiance conditions (sunny, partially cloudy, and cloudy) shows improved thermal power output and intercept factors compared to a no-allocation system. It demonstrates scalability and practicality for large solar thermal plants, enhancing overall performance. The method was first validated through simulations on a realistic solar plant model, then adapted and successfully tested in a 50 MW solar trough plant, demonstrating its advantages. Furthermore, the algorithms have been implemented and commissioned, and are currently operating in 13 commercial solar trough plants.
Authors:Mattia Scarpa, Francesco Pase, Ruggero Carli, Mattia Bruschetta, Franscesco Toso
Abstract:
Digital twins for power electronics require accurate knowledge of power losses, whose direct measurement is often impractical or impossible in real-world applications. This paper presents a novel hybrid framework that combines physics-based thermal modeling with data-driven techniques to identify and correct power losses accurately using only temperature measurements. Our approach leverages a cascaded architecture where a neural network learns to correct the outputs of a nominal power loss model by backpropagating through a reduced-order thermal model. We explore two neural architectures, a bootstrapped feedforward network and a recurrent neural network, demonstrating that the bootstrapped feedforward approach achieves superior performance while maintaining computational efficiency for real-time applications. At the interconnection between the neural network and the thermal model, we include normalization strategies and physics-guided training loss functions to preserve stability and ensure physical consistency. Experimental results show that our hybrid model reduces both temperature estimation errors (from 7.2±6.8°C to 0.3±0.3°C) and power loss prediction errors (from 5.4±6.6W to 0.2±0.3W) compared to traditional physics-based approaches, even in the presence of thermal model uncertainties. This methodology allows us to accurately estimate power losses without direct measurements, making it particularly helpful for real-time industrial applications where sensor placement is hindered by cost and physical limitations.
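The principle of correcting a power loss model using only temperature data can be shown on a toy problem. The sketch below is heavily simplified and is not the authors' architecture: a single scalar gain stands in for the neural network, a first-order RC model stands in for the reduced-order thermal model, and the gradient is obtained by a forward sensitivity recursion rather than reverse-mode backpropagation. All names and parameter values are illustrative assumptions.

```python
def simulate(powers, r_th, c_th, dt, t_amb):
    """First-order RC thermal model; returns the temperature after each step."""
    temps = []
    t = t_amb
    for p in powers:
        t = t + dt / c_th * (p - (t - t_amb) / r_th)
        temps.append(t)
    return temps


def fit_gain(p_nominal, t_measured, r_th, c_th, dt, t_amb, lr=1e-3, epochs=200):
    """Learn a gain so that gain * p_nominal explains the measured temperatures.
    The loss is the mean squared temperature error; its gradient with respect
    to the gain is propagated through the thermal model step by step."""
    gain = 1.0
    n = len(p_nominal)
    for _ in range(epochs):
        t = t_amb
        s = 0.0  # sensitivity dT/dgain, obeys the same recursion as T
        grad = 0.0
        for p, t_meas in zip(p_nominal, t_measured):
            t = t + dt / c_th * (gain * p - (t - t_amb) / r_th)
            s = s + dt / c_th * (p - s / r_th)
            grad += 2.0 * (t - t_meas) * s / n
        gain -= lr * grad
    return gain
```

With a nominal loss of 10 W that underestimates the true loss by a factor of 1.3, fitting against the simulated temperatures recovers a gain close to 1.3 without ever measuring power directly.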
Authors:Snehamoy Chatterjee, Greg Waite, Sidike Paheding, Luke Bowman
Abstract:
Forecasting volcanic activity is critical for hazard assessment and risk mitigation. Volcanic Radiative Power (VPR), derived from thermal remote sensing data, serves as an essential indicator of volcanic activity. In this study, we employ Bayesian Regularized Neural Networks (BRNN) to predict future VPR values based on historical data from Fuego Volcano, comparing its performance against Scaled Conjugate Gradient (SCG) and Levenberg-Marquardt (LM) models. The results indicate that BRNN outperforms SCG and LM, achieving the lowest mean squared error (1.77E+16) and the highest R-squared value (0.50), demonstrating its superior ability to capture VPR variability while minimizing overfitting. Despite these promising results, challenges remain in improving the model's predictive accuracy. Future research should focus on integrating additional geophysical parameters, such as seismic and gas emission data, to enhance forecasting precision. The findings highlight the potential of machine learning models, particularly BRNN, in advancing volcanic activity forecasting, contributing to more effective early warning systems for volcanic hazards.
Authors:Adrian Villalobos, Iban Barrutia, Rafael Pena-Alzola, Tomislav Dragicevic, Jose I. Aizpurua
Abstract:
Semiconductor devices, especially MOSFETs (metal-oxide-semiconductor field-effect transistors), are crucial in power electronics, but their reliability is affected by aging processes influenced by cycling and temperature. The primary aging mechanism in discrete semiconductors and power modules is bond wire lift-off, caused by crack growth due to thermal fatigue. The process is empirically characterized by exponential growth and an abrupt end of life, making long-term aging forecasts challenging. This research presents a comprehensive comparative assessment of different forecasting methods for MOSFET failure forecasting applications. Classical tracking, statistical forecasting and Neural Network (NN) based forecasting models are implemented along with novel Temporal Fusion Transformers (TFTs). A comprehensive comparison is performed assessing their MOSFET aging forecasting ability for different forecasting horizons. For short-term predictions, all algorithms produce acceptable results, with the best results produced by classical NN forecasting models at the expense of higher computational cost. For long-term forecasting, only the TFT is able to produce valid outcomes owing to its ability to integrate covariates from the expected future conditions. Additionally, TFT attention points identify key aging turning points, which indicate new failure modes or accelerated aging phases.
Authors:Abhijith M S, Sandra S
Abstract:
The study proposes a data-driven model which combines Dynamic Mode Decomposition (DMD) with multi-linear interpolation to predict the thermal fields of nanofluid flows at unseen Reynolds numbers (Re) and particle volume concentrations ($ε$). The flow considered for the study is laminar and incompressible. The study employs an in-house Fortran-based solver to predict the thermal fields of Al$_2$O$_3$-water nanofluid flow through a two-dimensional rectangular channel, with the bottom wall subjected to a uniform heat flux. The performance of two models operating in one- and two-dimensional parametric spaces is investigated. Initially, a DMD with linear interpolation (DMD-LI) based solver is used for prediction of the temperature of the nanofluid at any Re $>$ 100. The DMD-LI based model predicts temperature fields with a maximum percentage difference of just 0.0273\%, in comparison with the CFD-based solver, at Re = 960 and $ε$ = 1.0\%. The corresponding difference in the average Nusselt numbers is only 0.39\%. Following that, a DMD with bi-linear interpolation (DMD-BLI) based solver is used for prediction of the temperature of the nanofluid at any Re $>$ 100 and $ε$ $>$ 0.5\%. The performance of two different ways of stacking the data is also examined. When compared to the CFD-based model, the DMD-BLI-based model predicts the temperature fields with a maximum percentage difference of 0.21\%, at Re = 800 and $ε$ = 1.35\%. The corresponding percentage difference in the average Nusselt number prediction is 6.08\%. All the results are reported in detail. Alongside the important conclusions, the future scope of the study is also listed.
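As a rough illustration of the two ingredients named above (not the authors' exact formulation, which the abstract does not spell out), standard DMD fits a linear operator to pairs of consecutive snapshots, and linear interpolation blends predictions from two sampled Reynolds numbers. The symbols $X$, $X'$, $A$, $\Phi$, $\Lambda$, $b$, and $w$, and the choice to interpolate the temperature field $T$ itself, are assumptions made here for illustration:

$$X = [x_1, \dots, x_{m-1}], \quad X' = [x_2, \dots, x_m], \quad A \approx X' X^{+}, \quad x_k \approx \Phi \Lambda^{k-1} b, \quad b = \Phi^{+} x_1$$

$$T(\mathrm{Re}) \approx (1 - w)\, T(\mathrm{Re}_1) + w\, T(\mathrm{Re}_2), \quad w = \frac{\mathrm{Re} - \mathrm{Re}_1}{\mathrm{Re}_2 - \mathrm{Re}_1}, \quad \mathrm{Re}_1 \le \mathrm{Re} \le \mathrm{Re}_2$$

Here $X^{+}$ denotes the pseudoinverse, the columns of $\Phi$ are the DMD modes (eigenvectors of $A$), and $\Lambda$ is the diagonal matrix of the corresponding eigenvalues. A bi-linear variant would apply the same blending along $ε$ as well, using the four surrounding (Re, $ε$) sample points.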
Authors:Ana Sanz Cozcolluela, Yasemin Vardar
Abstract:
The growing adoption of extended reality, XR, has driven demand for wearable technologies that can replicate natural tactile sensations and allow users to interact freely with their surroundings using bare fingers. However, most existing wearable haptic technologies that support such free interactions can deliver sensations across limited tactile modalities. Here, we introduce a soft haptic ring and a data-driven rendering methodology to generate multimodal texture sensations. The device integrates pneumatic and hydraulic actuation to simulate roughness, thermal, and softness cues on the proximal phalanx, enabling users to explore surroundings naturally with their fingertips. The rendering methodology dynamically modulates those cues based on the user's exploratory actions. We validated our approach by conducting a user study with fifteen participants, who matched six virtual textures generated by the ring to their real counterparts and rated their perceived sensations. Participants achieved up to ninety percent accuracy in texture matching. The adjective ratings confirmed that the ring delivers distinct, perceptually rich stimuli across all rendered sensations. These findings highlight the ring's potential for immersive XR applications, offering diverse tactile feedback without restricting physical interaction.
Authors:Jun Li, Qifeng Xu, Yifan Lin, Nan Xie
Abstract:
The insufficient stability and reliability of optical voltage sensors is primarily caused by thermal-stress-induced birefringence. In this paper, a method based on arbitrary electric field direction modulation and isomerism electrodes is proposed to suppress or regulate it. With the aid of the multi-physics Finite Element Method, the Jones matrix and the theory of the photoelastic effect, it is found that metal or transparent isomerism electrodes can generate a special thermal stress distribution, which regulates the birefringence in the optical path and the measurement error it induces. The experiment is conducted on a 10 mm cubic bismuth germanate crystal, with cutting directions 110, -110 and 001. The experimental results show that Cu isomerism electrodes with an electric field angle of 59.9 degrees generate 37% less birefringence error than parallel plate electrodes in the temperature range from 25 degrees Celsius to 40 degrees Celsius. However, the Indium Tin Oxide electrodes with a field angle of 29.6 degrees produce approximately 7 times the error because of their poor ductility and thermal conduction. The proposed modeling and suppression method for birefringence is beneficial to the design of high-accuracy optical voltage sensors and electro-optical modulators.
Authors:Haoran Ma, Kaihan Zhang, Jiannan Cai
Abstract:
Heat exposure significantly influences pedestrian routing behaviors. Existing methods such as agent-based modeling (ABM) and empirical measurements fail to account for individual physiological variations and environmental perception mechanisms under thermal stress. This results in a lack of human-centred, heat-adaptive routing suggestions. To address these limitations, we propose a novel Vision Language Model (VLM)-driven Persona-Perception-Planning-Memory (PPPM) framework that integrates street view imagery and urban network topology to simulate heat-adaptive pedestrian routing. Through structured prompt engineering on the Gemini-2.0 model, eight distinct heat-sensitive personas were created to model mobility behaviors during heat exposure, with empirical validation through a questionnaire survey. Results demonstrate that simulation outputs effectively capture inter-persona variations, achieving highly significant congruence with observed route preferences and highlighting differences in the factors driving agents' decisions. Our framework is highly cost-effective, with simulations costing 0.006 USD and taking 47.81 s per route. This Artificial Intelligence-Generated Content (AIGC) methodology advances urban climate adaptation research by enabling high-resolution simulation of thermal-responsive mobility patterns, providing actionable insights for climate-resilient urban planning.
Authors:Bo Liu, Wei Wang, Charles Moulinec, Stefano Rolfo, Marion Samler, Ehimen Iyamabo, Constantinos Katsamis, Marc Chevalier
Abstract:
The aim of this work is to further expand the capability of the coarse-grid Computational Fluid Dynamics (CFD) approach, SubChCFD, to effectively simulate transient and buoyancy-influenced flows, which are critical in accident analyses of High-Temperature Gas-cooled Reactors (HTGRs). It has been demonstrated in our previous work that SubChCFD is highly adaptable to HTGR fuel designs and performs exceptionally well in modelling steady-state processes. In this study, the approach is extended to simulate a Loss of Flow Accident (LOFA) transient, where coolant circulation is disrupted, causing the transition from forced convection to buoyancy-driven natural circulation within the reactor core. To enable SubChCFD to capture the complex physics involved, corrections were introduced to the empirical correlations to account for the effects of flow unsteadiness, property variation and buoyancy.
A 1/12th sector of the reactor core, representing the smallest symmetric unit, was modelled using a coarse mesh of approximately 60 million cells. This mesh size is about 6% of that required for a Reynolds Averaged Navier Stokes (RANS) model, where mesh sizes can typically reach the order of 1 billion cells for such configurations. Simulation results show that SubChCFD effectively captures the thermal hydraulic behaviours of the reactor during a LOFA transient, producing predictions in good agreement with RANS simulations while significantly reducing computational cost.
Authors:Yudhishthira Kundu, Manroop Kaur, Tripty Wig, Kriti Kumar, Pushpanjali Kumari, Vivek Puri, Manish Arora
Abstract:
Cerebras' wafer-scale engine (WSE) technology merges multiple dies on a single wafer. It addresses the challenges of memory bandwidth, latency, and scalability, making it suitable for artificial intelligence. This work evaluates the WSE-3 architecture and compares it with leading GPU-based AI accelerators, notably Nvidia's H100 and B200. The work highlights the advantages of WSE-3 in performance per watt and memory scalability and provides insights into the challenges in manufacturing, thermal management, and reliability. The results suggest that wafer-scale integration can surpass conventional architectures in several metrics, though work is required to address cost-effectiveness and long-term viability.
Authors:E Harshith Kumar Yadav, Rahul Narava, Anshika, Shashi Shekher Jha
Abstract:
Managing equal charge levels in active cell balancing while charging a Li-ion battery is challenging. An imbalance in charge levels affects the state of health of the battery, along with the concerns of thermal runaway and fire hazards. Traditional methods focus on safety assurance as a trade-off between safety and charging time. Others deal with battery-specific conditions to ensure safety, therefore losing on the generalization of the control strategies over various configurations of batteries. In this work, we propose a method to learn safe battery charging actions by using a safety-layer as an add-on over a Deep Reinforcement Learning (RL) agent. The safety layer perturbs the agent's action to prevent the battery from encountering unsafe or dangerous states. Further, our Deep RL framework focuses on learning a generalized policy that can be effectively employed with varying configurations of batteries. Our experimental results demonstrate that the safety-layer based action perturbation incurs fewer safety violations by avoiding unsafe states along with learning a robust policy for several battery configurations.
Authors:Simon Malacek, José Portela, Yannick Marcus Werner, Sonja Wogrin
Abstract:
Despite various efforts, decarbonizing the heating sector remains a significant challenge. To tackle it by smart planning, the availability of highly resolved heating demand data is key. Several existing models provide heating demand only for specific applications. Typically, they either offer time series for a larger area or annual demand data on a building level, but not both simultaneously. Additionally, the diversity in heating demand across different buildings is often not considered. To address these limitations, this paper presents a novel method for generating temporally resolved heat demand time series at the building level using publicly available data. The approach integrates a thermal building model with stochastic occupancy simulations that account for variability in user behavior. As a result, the tool serves as a cost-effective resource for cross-sectoral energy system planning and policy development, particularly with a focus on the heating sector. The obtained data can be used to assess the impact of renovation and retrofitting strategies, or to analyze district heating expansion. To illustrate the potential applications of this approach, we conducted a case study in Puertollano (Spain), where we prepared a dataset of heating demand with hourly resolution for each of 9,298 residential buildings. This data was then used to compare two different pathways for the thermal renovation of these buildings. By relying on publicly available data, this method can be adapted and applied to various European regions, offering broad usability in energy system optimization and analysis of decarbonization strategies.
Authors:Y A Rouzoumka, E Terreaux, C Morisseau, J. -P Ovarlez, C Ren
Abstract:
This paper presents a novel approach to radar target detection using Variational AutoEncoders (VAEs). Known for their ability to learn complex distributions and identify out-of-distribution samples, the proposed VAE architecture effectively distinguishes radar targets from various noise types, including correlated Gaussian and compound Gaussian clutter, often combined with additive white Gaussian thermal noise. Simulation results demonstrate that the proposed VAE outperforms classical adaptive detectors such as the Matched Filter and the Normalized Matched Filter, especially in challenging noise conditions, highlighting its robustness and adaptability in radar applications.
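For context on the classical baseline, here is a minimal sketch of the Normalized Matched Filter statistic. It is not the VAE detector from the paper. It assumes the data have already been whitened, so the covariance is the identity and the statistic reduces to $|p^H x|^2 / (\|p\|^2 \|x\|^2)$; the steering vector `p`, the function names and the threshold are hypothetical.

```python
def nmf_statistic(p, x):
    """Normalized matched filter statistic, assuming whitened data (identity covariance).

    p is the steering vector and x the received vector, both sequences of
    complex numbers of equal length. The result lies in [0, 1].
    """
    # Inner product p^H x: conjugate the steering vector entries.
    inner = sum(complex(pk).conjugate() * xk for pk, xk in zip(p, x))
    p_energy = sum(abs(pk) ** 2 for pk in p)
    x_energy = sum(abs(xk) ** 2 for xk in x)
    return abs(inner) ** 2 / (p_energy * x_energy)


def detect(p, x, threshold):
    """Declare a target when the statistic exceeds a user-chosen threshold."""
    return nmf_statistic(p, x) > threshold


if __name__ == "__main__":
    p = [1 + 0j, 1j, -1 + 0j]
    aligned = [(3 - 2j) * pk for pk in p]  # scaled copy of the steering vector
    print(nmf_statistic(p, aligned))
    print(detect(p, [1 + 0j, 1 + 0j, 1 + 0j], threshold=0.5))
```

Because the statistic is normalized by the energy of `x`, rescaling the received vector leaves it unchanged, which is why this detector is typically used when the clutter power is unknown.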
Authors:Zelin Meng, Takanori Fukao
Abstract:
Depth estimation in complex real-world scenarios is a challenging task, especially when relying solely on a single modality such as visible light or thermal infrared (THR) imagery. This paper proposes a novel multimodal depth estimation model, RTFusion, which enhances depth estimation accuracy and robustness by integrating the complementary strengths of RGB and THR data. The RGB modality provides rich texture and color information, while the THR modality captures thermal patterns, ensuring stability under adverse lighting conditions such as extreme illumination. The model incorporates a unique fusion mechanism, EGFusion, consisting of the Mutual Complementary Attention (MCA) module for cross-modal feature alignment and the Edge Saliency Enhancement Module (ESEM) to improve edge detail preservation. Comprehensive experiments on the MS2 and ViViD++ datasets demonstrate that the proposed model consistently produces high-quality depth maps across various challenging environments, including nighttime, rainy, and high-glare conditions. The experimental results highlight the potential of the proposed method in applications requiring reliable depth estimation, such as autonomous driving, robotics, and augmented reality.
Authors:Chengxin Zhang, Yujie Liu, Quan Chen
Abstract:
As the demand for computational power increases, high-bandwidth memory (HBM) has become a critical technology for next-generation computing systems. However, the widespread adoption of HBM presents significant thermal management challenges, particularly in multilayer through-silicon-via (TSV) stacked structures under varying thermal conditions, where accurate prediction of junction temperature and hotspot position is essential during the early design. This work develops a data-driven neural network model for the fast prediction of junction temperature and hotspot position in 3D HBM chiplets. The model, trained with a data set of $13,494$ different combinations of thermal condition parameters, sampled from a vast parameter space characterized by high-dimensional combination (up to $3^{27}$), can accurately and quickly infer the junction temperature and hotspot position for any thermal conditions in the parameter space. Moreover, it shows good generalizability for other thermal conditions not considered in the parameter space. The data set is constructed using accurate finite element solvers. This method not only minimizes the reliance on costly experimental tests and extensive computational resources for finite element analysis but also accelerates the design and optimization of complex HBM systems, making it a valuable tool for improving thermal management and performance in high-performance computing applications.
Authors:Oliver Krumpek, Ole Kroeger, Sebastian Mohr
Abstract:
This paper presents the design and evaluation of a physical support structure for the OptiTrack X22 tracking systems, constructed from carbon fiber-reinforced polymer (CFRP) and Invar steel. These materials were chosen for their low thermal expansion, ensuring geometric stability and rigidity necessary for accurate spatial measurements. The support system is scalable and adaptable for various applications and setups. The study further investigates the effects of camera placement and separation in near-parallel configurations on measurement accuracy and precision. Experimental results show a significant correlation between camera distance and measurement precision - closer camera setups yield higher precision. The optimized camera arrangement allowed the prototype to achieve accuracies of +/-0.74 mm along the camera's line of sight and +/-0.12 mm in orthogonal directions. The experiments show that the standard deviation of the noise on a single measurement plane orthogonal to the camera's line of sight varies between 0.02 and 0.07, indicating that the measurement noise is not constant for every point on that specific plane in the measurement space. Details of the system's design and validation are provided to enhance reproducibility and encourage further development in areas like industrial automation and medical device tracking. By delivering a modular solution with validated accuracy, this work aims to promote innovation and practical application in precision tracking technology, facilitating broader adoption and iterative improvements. This approach enhances the accessibility and versatility of high-precision tracking technology, supporting future progress in the field.
Authors:Rundi Lu, Hao-En Li, Zhengwei Liu, Jin-Peng Liu
Abstract:
We generalize the Linear Combination of Hamiltonian Simulation (LCHS) formula [An, Liu, Lin, Phys. Rev. Lett. 2023] to simulate time-evolution operators in infinite-dimensional spaces, including scenarios involving unbounded operators. This extension, named Inf-LCHS for short, bridges the gap between finite-dimensional quantum simulations and the broader class of infinite-dimensional quantum dynamics governed by partial differential equations (PDEs). Furthermore, we propose two sampling methods by integrating the infinite-dimensional LCHS with Gaussian quadrature schemes (Inf-LCHS-Gaussian) or Monte Carlo integration schemes (Inf-LCHS-MC). We demonstrate the applicability of the Inf-LCHS theorem to a wide range of non-Hermitian dynamics, including linear parabolic PDEs, queueing models (birth-or-death processes), Schrödinger equations with complex potentials, Lindblad equations, and black hole thermal field equations. Our analysis provides insights into simulating general linear dynamics using a finite number of quantum dynamics and includes cost estimates for the corresponding quantum algorithms.
Authors:Zhan Wang, Chen Weidong, Huang Zhifeng, Md Raisul Islam, Chua Kian Jon
Abstract:
In tropical countries with high humidity, air conditioning can account for up to 60% of a building's energy use. For commercial buildings with centralized systems, the efficiency of the chiller plant is vital, and model predictive control provides an effective strategy for optimizing operations through dynamic adjustments based on accurate load predictions. Artificial neural networks are effective for modelling nonlinear systems but are prone to overfitting due to their complexity. Effective feature engineering can mitigate this issue. While weather data are crucial for load prediction, they are often used as raw numerical inputs without advanced processing. Clustering features is a technique that can reduce model complexity and enhance prediction accuracy. Although previous studies have explored clustering algorithms for load prediction, none have applied them to multidimensional weather data, revealing a research gap. This study presents a cooling load prediction model that combines a neural network with Kalman filtering and K-means clustering. Applied to real world data from a commercial skyscraper in Singapore's central business district, the model achieved a 46.5% improvement in prediction accuracy. An optimal chiller sequencing strategy was also developed through genetic algorithm optimization of the predictive load, potentially saving 13.8% in energy. Finally, the study evaluated the integration of thermal energy storage into the chiller plant design, demonstrating potential reductions in capital and operational costs of 26% and 13%, respectively.
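The abstract does not describe how the Kalman filter is used in the prediction model. As one plausible reading, the sketch below applies a scalar Kalman filter with a random-walk state model to smooth a noisy load or sensor series before it is fed to a predictor. The function name, the variance parameters and the sample readings are all hypothetical.

```python
def kalman_filter_1d(measurements, process_var, measurement_var, x0, p0):
    """Scalar Kalman filter with a random-walk state model x_k = x_{k-1} + w_k.

    process_var is the variance of w_k, measurement_var the variance of the
    measurement noise, and (x0, p0) the initial estimate and its variance.
    Returns the filtered estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, so only the uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Made-up noisy readings around a slowly varying level.
    readings = [10.0, 12.0, 11.0, 13.0, 12.5, 11.5]
    print(kalman_filter_1d(readings, process_var=0.1, measurement_var=1.0, x0=10.0, p0=1.0))
```

A larger `measurement_var` relative to `process_var` gives heavier smoothing; how such a filter would be combined with K-means clustering of weather features is left open here.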
Authors:Florian Krause, Felix Schweizer, Alexandra Burger, Franziska Ludewig, Marcus Knips, Katharina Quade, Andreas Wuersig, Dirk Uwe Sauer
Abstract:
This work demonstrates the potential of fiber optic sensors for measuring thermal effects in lithium-ion batteries, using Optical Frequency Domain Reflectometry (OFDR) as the fiber optic measurement method. The innovative application of fiber sensors allows for spatially resolved temperature measurement, particularly emphasizing the importance of monitoring not just the exterior but also the internal conditions within battery cells. Utilizing inert glass fibers as sensors, which exhibit minimal sensitivity to electric fields, opens up new pathways for their implementation in a wide range of applications, such as battery monitoring. The sensors used in this work provide real-time information along the entire length of the fiber, unlike commonly used Fiber Bragg Grating (FBG) sensors. It is shown that the novel sensors presented here exhibit a linear thermal dependency with high sensitivity and a local resolution of a few centimeters over a temperature range of 0 to 80 degrees Celsius. Furthermore, this study presents preliminary findings on the potential application of fiber optic sensors in lithium-ion battery (LIB) cells, demonstrating that the steps required for battery integration do not impose any restrictive effects on thermal measurements.
Authors:Yuan Xinjie, Khalid M. Mosalam
Abstract:
Fire safety is crucial for ensuring the stability of building structures, yet evaluating whether a structure meets fire safety requirements is challenging. Fires can originate at any point within a structure, and simulating every potential fire scenario is both expensive and time-consuming. To address this challenge, we propose the concept of the Most Fire-Sensitive Point (MFSP) and an efficient machine learning framework for its identification. The MFSP is defined as the location at which a fire, if initiated, would cause the most severe detrimental impact on the building's stability, effectively representing the worst-case fire scenario. In our framework, a Graph Neural Network (GNN) serves as an efficient and differentiable agent for conventional Finite Element Analysis (FEA) simulators by predicting the Maximum Interstory Drift Ratio (MIDR) under fire, which then guides the training and evaluation of the MFSP predictor. Additionally, we enhance our framework with a novel edge update mechanism and a transfer learning-based training scheme. Evaluations on a large-scale simulation dataset demonstrate the good performance of the proposed framework in identifying the MFSP, offering a transformative tool for optimizing fire safety assessments in structural design. All developed datasets and code are open-sourced online.
Authors:Hesameddin Safari, Henning Wessels
Abstract:
Modeling plays a critical role in additive manufacturing (AM), enabling a deeper understanding of underlying processes. Parametric solutions for such models are of great importance, enabling the optimization of production processes and considerable cost reductions. However, the complexity of the problem and diversity of spatio-temporal scales involved in the process pose significant challenges for traditional numerical methods. Surrogate models offer a powerful alternative by accelerating simulations and facilitating real-time monitoring and control. The present study presents an operator learning approach that relies on the deep operator network (DeepONet) and physics-informed neural networks (PINN) to predict the three-dimensional temperature distribution during melting and consolidation in laser powder bed fusion (LPBF). Parametric solutions for both single-track and multi-track scenarios with respect to tool path are obtained. To address the challenges in obtaining parametric solutions for multi-track scenarios using DeepONet architecture, a sequential PINN approach is proposed to efficiently manage the increased training complexity inherent in those scenarios. The accuracy and consistency of the model are verified against finite-difference computations. The developed surrogate allows us to efficiently analyze the effect of scanning paths and laser parameters on the thermal history.
Authors:Romuald Ait-Bachir, Carlos Granero-Belinchon, Aurélie Michel, Julien Michel, Xavier Briottet, Lucas Drumetz
Abstract:
Due to the trade-off between the temporal and spatial resolution of thermal spaceborne sensors, super-resolution methods have been developed to provide fine-scale Land Surface Temperature (LST) maps. Most of them are trained at low resolution but applied at fine resolution, and so they require a scale-invariance hypothesis that is not always appropriate. The main contribution of this work is the introduction of a Scale-Invariance-Free approach for training Neural Network (NN) models, and the implementation of two NN models, called Scale-Invariance-Free Convolutional Neural Network for Super-Resolution (SIF-CNN-SR), for the super-resolution of MODIS LST products. The Scale-Invariance-Free approach consists in training the models to provide LST maps at high spatial resolution that recover the initial LST when they are degraded to low resolution and that contain fine-scale textures informed by the high-resolution NDVI. The second contribution of this work is the release of a test database with ASTER LST images concomitant with MODIS ones that can be used for evaluation of super-resolution algorithms. We compare the two proposed models, SIF-CNN-SR1 and SIF-CNN-SR2, with four state-of-the-art methods, Bicubic, DMS, ATPRK, Tsharp, and a CNN sharing the same architecture as SIF-CNN-SR but trained under the scale-invariance hypothesis. We show that SIF-CNN-SR1 outperforms the state-of-the-art methods and the other two CNN models as evaluated with LPIPS and Fourier space metrics focusing on the analysis of textures. These results and the available ASTER-MODIS database for evaluation are promising for future studies on super-resolution of LST.
Authors:Muhammad Ramzy Altahhan, Lynn Munday, Yousry Azmy
Abstract:
The Transatomic Power (TAP) reactor has an unusual design for a molten salt reactor technology, building upon the foundation laid by the Molten Salt Reactor Experiment (MSRE). This design introduces three key modifications to enhance efficiency and compactness: a revised fuel salt composition, an alternative moderator material, and moderator pins surrounded by the molten salt fuel. Unlike traditional solid-fueled reactors that rely on excess positive reactivity at the beginning of life, the TAP concept employs a dynamic approach. The core's design, featuring a cylindrical geometry with square assemblies of moderator rods surrounded by flowing fuel salt, provides flexibility in adjusting the moderator-to-fuel ratio during operation - using movable moderator rods - further adding criticality control capability in addition to the control rods system. Shape optimization of the core can play a crucial role in enhancing performance and efficiency. By applying multiphysics continuous shape optimization techniques to key components, such as the unit cells of the TAP reactor or its moderator assemblies, we can fine-tune the reactor's geometry to achieve optimal performance in key physics like neutronics and thermal hydraulics. We explore this aspect using the optimization module in the Multiphysics Object Oriented Simulation Environment (MOOSE) framework which allows for multiphysics continuous shape optimization. The results reported here illustrate the benefits of applying continuous shape optimization in the design of nuclear reactor components and can help in extending the TAP reactor's performance.
Authors:Karthik Reddy Lyathakula, Aseem Muhammad, Sevki Cesmeci
Abstract:
Thermal protection systems (TPS) of space vehicles are designed computationally rather than experimentally. They are validated using ground experiments, but not all aspects of the flight can be replicated on the ground. This ground-to-flight mapping introduces uncertainties which need to be accounted for while designing any thermal protection system. Thus, precise computational models along with uncertainty quantification in the models are required to design the TPS. The focus of this study is to estimate the thermal material parameters of the TPS based on the target reliability requirements using statistical methods. To perform uncertainty quantification (UQ) of a system, a simulated model of the system needs to be solved many times on statistical samples, increasing the computational time and cost of the overall process. A physics-informed neural network (PINN) model is used in the analysis instead of traditional physics-based numerical solutions. The accuracy of the PINN is comparable to that of the numerical solution. To find the parameter distribution, sampling of the parameter space is performed using the Sequential Monte Carlo (SMC) method. The sampling method is efficient as it generates samples based on the target distribution in parallel and it also generates diverse samples for proper UQ. By combining the PINN predictive model with SMC sampling, the framework can approximate the parameter distributions that satisfy the TPS design reliability constraints. The framework achieved remarkable increases in the speed of performing the reliability analysis of the TPS. This reliability analysis can be used for design optimization of the TPS based on risk analysis along with other systems of the vehicle.
Authors:Abdollah Hajalilou, Elahe Parvini, Tiago A. Morgado, Pedro Alhais Lopes, M. Estrela Melo Jorge, Marta Freitas, Mahmoud Tavakoli
Abstract:
Liquid metal (LM)-based composites hold promise for soft electronics due to their high conductivity and fluidic nature. However, the presence of α-Ga2O3 and GaOOH layers around LM droplets impairs conductivity and performance. We tackle this issue by replacing the oxide layer with conductive silver (Ag) using an ultrasonic-assisted galvanic replacement reaction. The Ag-coated nanoparticles form aggregated, porous microparticles that are mixed with styrene-isoprene-styrene (SIS) polymers, resulting in a digitally printable composite with superior electrical conductivity and electromechanical properties compared to conventional fillers. Adding more LM enhances these properties further. The composite achieves EMI shielding effectiveness (SE) exceeding 75 dB in the X-band frequency range, even at 200 per cent strain, meeting stringent military and medical standards. It is applicable in wireless communications and Bluetooth signal blocking and as a thermal interface material (TIM). Additionally, we highlight its recyclability using a biodegradable solvent, underscoring its eco-friendly potential. This composite represents a significant advancement in stretchable electronics and EMI shielding, with implications for wearable and bioelectronic applications.
Authors:Max Sibeijn, Saeed Ahmed, Mohammad Khosravi, Tamás Keviczky
Abstract:
In this paper, we propose an economic nonlinear model predictive control (MPC) algorithm for district heating networks (DHNs). The proposed method features prosumers, multiple producers, and storage systems, which are essential components of 4th generation DHNs. These networks are characterized by their ability to optimize their operations, aiming to reduce supply temperatures, accommodate distributed heat sources, and leverage the flexibility provided by thermal inertia and storage, all crucial for achieving a fossil-fuel-free energy supply. Developing a smart energy management system to accomplish these goals requires detailed models of highly complex nonlinear systems and computational algorithms able to handle large-scale optimization problems. To address this, we introduce a graph-based optimization-oriented model that efficiently integrates distributed producers, prosumers, storage buffers, and bidirectional pipe flows, such that it can be implemented in a real-time MPC setting. Furthermore, we conduct several numerical experiments to evaluate the performance of the proposed algorithms in closed-loop. Our findings demonstrate that the MPC methods achieved up to 9% cost improvement over traditional rule-based controllers while better maintaining system constraints.
Authors:Riccardo Talami, Jonathan Wright, Bianca Howard
Abstract:
The complexity of performance-based building design stems from the evaluation of numerous candidate design options, driven by the plethora of variables, objectives, and constraints inherent in multi-disciplinary projects. This necessitates optimization approaches to support the identification of well-performing designs while reducing the computational time of performance evaluation. In response, this paper proposes and evaluates a sequential approach for multi-objective design optimization of building geometry, fabric, HVAC system and controls for building performance. This approach involves sequential optimizations with optimal solutions from previous stages passed to the next. The performance of the sequential approach is benchmarked against a full factorial search, assessing its effectiveness in finding global optima, solution quality, reliability across scales and variations of problem formulations, and computational efficiency compared to the NSGA-II algorithm. 24 configurations of the sequential approach are tested on a multi-scale case study, simulating 874 to 4,147,200 design options for an office building, aiming to minimize energy demand while maintaining thermal comfort. A two-stage sequential process, (building geometry + fabric) followed by (HVAC system + controls), identified the same Pareto-optimal solutions as the full factorial search across all four scales and variations of problem formulations, demonstrating 100% effectiveness and reliability. This approach required 100,700 function evaluations, representing a 91.2% reduction in computational effort compared to the full factorial search. In contrast, NSGA-II achieved only 73.5% of the global optima with the same number of function evaluations. This research indicates that a sequential optimization approach is a highly efficient and robust alternative to the standard NSGA-II algorithm.
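The notion of Pareto-optimal solutions used above can be illustrated with a short non-dominated filter. This is a generic sketch for minimization problems, not the authors' sequential procedure; the objective pairs (energy demand, discomfort hours) in the usage lines are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical designs as (energy demand, discomfort hours).
    designs = [(100, 50), (120, 30), (110, 60), (90, 80), (130, 30)]
    print(pareto_front(designs))  # [(100, 50), (120, 30), (90, 80)]
```

In a staged scheme of the kind the abstract describes, a filter like this would be applied after each stage so that only non-dominated candidates are carried forward; the exact hand-off rule used in the paper is not given in the abstract.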
Authors:Mengfan Wu, Shenshen Yan, Jie Ren
Abstract:
Data-driven machine learning (ML) has demonstrated tremendous potential in material property predictions. However, the scarcity of materials data with costly property labels in the vast chemical space presents a significant challenge for ML in efficiently predicting properties and uncovering structure-property relationships. Here, we propose a novel hierarchy-boosted funnel learning (HiBoFL) framework, which is successfully applied to identify semiconductors with ultralow lattice thermal conductivity ($κ_\mathrm{L}$). By training on only a few hundred materials targeted by unsupervised learning from a pool of hundreds of thousands, we achieve efficient and interpretable supervised predictions of ultralow $κ_\mathrm{L}$, thereby circumventing large-scale brute-force \textit{ab initio} calculations without clear objectives. As a result, we provide a list of candidates with ultralow $κ_\mathrm{L}$ for potential thermoelectric applications and discover a new factor that significantly influences structural anharmonicity. This HiBoFL framework offers a novel practical pathway for accelerating the discovery of functional materials.
Authors:Meriem Chabekh, Nadhir Chougui, Delfim F. M. Torres
Abstract:
We conduct an analysis of a one-dimensional linear problem that describes the vibrations of a connected suspension bridge. In this model, the single-span roadbed is represented as a thermoelastic Shear beam without rotary inertia. We incorporate thermal dissipation into the transverse displacement equation, following Green and Naghdi's theory. Our work demonstrates the existence of a global solution by employing classical Faedo-Galerkin approximations and three a priori estimates. Furthermore, we establish exponential stability through the application of the energy method. For numerical study, we propose a spatial discretization using finite elements and a temporal discretization through an implicit Euler scheme. In doing so, we prove discrete stability properties and a priori error estimates for the discrete problem. To provide a practical dimension to our theoretical findings, we present a set of numerical simulations.
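To illustrate the kind of discrete stability property an implicit Euler scheme can provide, here is a toy sketch on a scalar damped oscillator x'' + c x' + k x = 0. This is not the thermoelastic beam model of the paper; the function names and parameter values are assumptions chosen only for illustration. For this toy system the discrete energy 0.5 v^2 + 0.5 k x^2 does not increase from one step to the next.

```python
def implicit_euler_oscillator(x0, v0, k, c, dt, steps):
    """Integrate x'' + c*x' + k*x = 0 with the implicit (backward) Euler scheme.

    Scheme: x_new = x + dt*v_new and v_new = v + dt*(-k*x_new - c*v_new).
    Because the system is linear, v_new can be solved for in closed form.
    """
    x, v = x0, v0
    history = [(x, v)]
    for _ in range(steps):
        v_new = (v - dt * k * x) / (1.0 + dt * c + dt * dt * k)
        x_new = x + dt * v_new
        x, v = x_new, v_new
        history.append((x, v))
    return history


def energy(x, v, k):
    """Discrete energy: kinetic plus elastic part."""
    return 0.5 * v * v + 0.5 * k * x * x


# Hypothetical parameters; a deliberately large time step is used.
trajectory = implicit_euler_oscillator(x0=1.0, v0=0.0, k=100.0, c=0.5, dt=0.1, steps=50)
energies = [energy(x, v, 100.0) for x, v in trajectory]
print(energies[0], energies[-1])
```

The energy decay holds for any positive time step here, which is the usual reason for preferring an implicit scheme on stiff dissipative problems.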
Authors:Zhongxuan Zhang, Bi Zeng, Xinyu Ni, Yimin Du
Abstract:
RGB-T tracking leverages the complementary strengths of RGB and thermal infrared (TIR) modalities to address challenging scenarios such as low illumination and adverse weather. However, existing methods often fail to effectively integrate temporal information and perform efficient cross-modal interactions, which constrain their adaptability to dynamic targets. In this paper, we propose BTMTrack, a novel framework for RGB-T tracking. The core of our approach lies in the dual-template backbone network and the Temporal-Modal Candidate Elimination (TMCE) strategy. The dual-template backbone effectively integrates temporal information, while the TMCE strategy focuses the model on target-relevant tokens by evaluating temporal and modal correlations, reducing computational overhead and avoiding irrelevant background noise. Building upon this foundation, we propose the Temporal Dual Template Bridging (TDTB) module, which facilitates precise cross-modal fusion through dynamically filtered tokens. This approach further strengthens the interaction between templates and the search region. Extensive experiments conducted on three benchmark datasets demonstrate the effectiveness of BTMTrack. Our method achieves state-of-the-art performance, with a 72.3% precision rate on the LasHeR test set and competitive results on RGBT210 and RGBT234 datasets.
Authors:Changyou Geng, Dezhi Ren, Enkai Mao, Changfu Zou, Mario Vašak, Xinyi Zheng, Weiji Han
Abstract:
Reconfigurable battery systems (RBSs) are emerging as a promising solution for improving fault tolerance, charge and thermal balance, energy delivery, etc. To optimize these performance metrics of RBSs, high-dimensional nonlinear integer programming problems need to be formulated and solved. To accomplish this, it is necessary to address several critical challenges stemming from nonlinear battery characteristics, discrete switch states, dynamic system configurations, as well as the curse of dimensionality inherent in large-scale RBSs. Thus, we propose a unified modeling framework to accommodate various possible configurations of an RBS and even to cover different RBS designs and their hybrid combinations, enabling the problem formulation for the RBS optimal control and facilitating the RBS topology design. Further, to solve the formulated RBS optimal control problems, the search space is narrowed to encompass only the feasible solutions, thereby ensuring safe battery connections while substantially curtailing search efforts. These proposed techniques, focusing on unifying the system modeling and narrowing the search space, lay a solid foundation for effectively formulating and efficiently solving RBS optimal control problems. The accuracy and effectiveness of the proposed techniques are demonstrated by both simulation and experimental tests.
Authors:Thomas J. Smart, Bilen Emek Abali, Hans Boschker, Wolfgang Braun
Abstract:
The modeling of deposition rates in Thermal Laser Epitaxy (TLE) is essential for the accurate prediction of the evaporation process and for improved dynamic process control. We demonstrate excellent agreement between experimental data and a model based on a finite element simulation that describes the temperature distribution of an elemental source when irradiated with continuous wave laser radiation. The simulation strongly depends on the thermophysical constants of the material, data for which are lacking for many elements. Effective values for the parameters may be determined with precision by means of an unambiguous reference provided by the melting point of the material, which is directly observed during the experiments. TLE may therefore be used to study the high temperature thermophysical and optical properties of the elements.
Authors:Grant Ruan, Munther A. Dahleh
Abstract:
The battery performance and lifespan of electric vehicles (EVs) degrade significantly in cold climates, requiring a considerable amount of energy to heat up the EV batteries. This paper proposes a novel technology, namely temperature-controlled smart charging, to coordinate the heating/charging power and reduce the total energy use of a solar-powered EV charging station. Instead of fixing the battery temperature setpoints, we analyze the thermal dynamics and inertia of EV batteries, and decide the optimal timing and proper amount of energy allocated for heating. In addition, a temperature-sensitive charging model is formulated with consideration of dynamic charging rates as well as battery health. We further tailor acceleration algorithms for large-scale EV charging, including the reduced-order dual decomposition and vehicle rescheduling. Simulation results demonstrate that the proposed temperature-controlled smart charging is superior in capturing the flexibility value of EV batteries and making full use of the rooftop solar energy. The proposed model typically achieves a 12.5--18.4% reduction in the charging cost and a 0.4--6.8% drop in the overhead energy use for heating.
Authors:Fatemeh Hossein-Khani, Omid Akbari
Abstract:
The increasing scale of manycore systems poses significant challenges in managing reliability while meeting performance demands. Simultaneously, these systems become more susceptible to different aging mechanisms such as negative-bias temperature instability (NBTI), hot carrier injection (HCI), and thermal cycling (TC), as well as the electromigration (EM) phenomenon. In this paper, we propose a reinforcement learning (RL)-based task mapping method to improve the reliability of manycore systems considering the aforementioned aging mechanisms, which consists of three steps: bin packing, task-to-bin mapping, and task-to-core mapping. In the initial step, a density-based spatial clustering of applications with noise (DBSCAN) method is employed to form clusters (bins) based on the core temperatures. Then, the Q-learning algorithm is used for the two latter steps, to map each arriving task to a core such that the thermal variation among all the bins is minimized. Compared to the state-of-the-art works, the proposed method runs at runtime without requiring any parameter to be calculated offline. The effectiveness of the proposed technique is evaluated on 16-, 32-, and 64-core systems using SPLASH2 and PARSEC benchmark suite applications. The results demonstrate up to a 27% increase in the mean time to failure (MTTF) compared to the state-of-the-art task mapping techniques.
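For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular update rule, Q(s, a) += alpha * (r + gamma * max Q(s', a') - Q(s, a)). The state labels, the reward value, and the idea of using a negative thermal variation as the reward are hypothetical illustrations; the abstract does not specify how the paper defines states, actions, or rewards.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move the current estimate toward the bootstrapped target.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical setup: actions are candidate core indices, and the reward
# is the negative of an assumed thermal-variation measure.
q_table = defaultdict(float)
cores = [0, 1, 2]
new_value = q_update(q_table, state="bin_hot", action=1, reward=-2.0,
                     next_state="bin_warm", actions=cores)
print(new_value)
```

A table keyed by (state, action) pairs keeps the sketch short; a real task mapper would also need an exploration policy such as epsilon-greedy action selection.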
Authors:Dmitriy Y. Anistratov, Terry S. Haut
Abstract:
Thermal radiative transfer (TRT) is an essential piece of physics in inertial confinement fusion, high-energy density physics, astrophysics, etc. The physical models of this type of problem are defined by strongly coupled differential equations describing multiphysics phenomena. This paper presents a new nonlinear multilevel iterative method with two photon energy grids for solving the multigroup radiative transfer equation (RTE) coupled with the material energy balance equation (MEB). The multilevel system of equations of the method is formulated by means of a nonlinear projection approach. The RTE is projected over elements of phase space to derive the low-order equations of different types. The hierarchy of equations consists of (1) multigroup weighted flux equations which can be interpreted as the multigroup RTE averaged over subintervals of angular range and (2) the effective grey (one-group) equations which are spectrum averaged low-order quasidiffusion (aka variable Eddington factor) equations. The system of RTE, low-order and MEB equations is approximated by the fully implicit Euler time-integration method in which the absorption coefficient and emission term are evaluated at the current time step. Numerical results are presented to demonstrate convergence of a multilevel iteration algorithm in the Fleck-Cummings test problem with a Marshak wave solved with a large number of photon energy groups.
Authors:Antonio Alcántara, Pablo Diaz-Cachinero, Alberto Sánchez-González, Carlos Ruiz
Abstract:
Concentrating Solar Power Tower (CSPT) plants rely on heliostat fields to focus sunlight onto a central receiver. Although simple aiming strategies, such as directing all heliostats to the receiver's equator, can maximize energy collection, they often result in uneven flux distributions that lead to hotspots, thermal stresses, and reduced receiver lifetimes. This paper presents a novel, data-driven approach that integrates constraint learning, neural network-based surrogates, and mathematical optimization to overcome these challenges. The methodology learns complex heliostat-to-receiver flux interactions from simulation data, constructing a surrogate model that is embedded into a tractable optimization framework. By maximizing a tailored quality score that balances energy collection and flux uniformity, the approach yields smoothly distributed flux profiles and mitigates excessive thermal peaks. An iterative refinement process, guided by the trust region and progressive data sampling, ensures the surrogate model improves the obtained solution by exploring new spaces during the iterations. Results from a real CSPT case study demonstrate that the proposed approach surpasses conventional heuristic methods, offering flatter flux distributions and safer thermal conditions without a substantial loss in overall energy capture.
Authors:Maohua Yan, Ruicheng Wang, Ke Liu
Abstract:
The trade-offs between different mechanical properties of materials pose fundamental challenges in engineering material design, such as balancing stiffness versus toughness, weight versus energy-absorbing capacity, and among the various elastic coefficients. Although gradient-based topology optimization approaches have been effective in finding specific designs and properties, they are not efficient tools for surveying the vast design space of metamaterials, and are thus unable to reveal the attainable bound of interdependent material properties. Other common methods, such as parametric design or data-driven approaches, are limited by either the lack of diversity in geometry or the difficulty of extrapolating from known data, respectively. In this work, we formulate the simultaneous exploration of multiple competing material properties as a multi-objective optimization (MOO) problem and employ a neuroevolution algorithm to efficiently solve it. Compositional Pattern-Producing Networks (CPPNs) are used as the generative model for unit cell designs, providing a very compact yet lossless encoding of geometry. A modified Neuroevolution of Augmenting Topologies (NEAT) algorithm is employed to evolve the CPPNs such that they create metamaterial designs on the Pareto front of the MOO problem, revealing empirical bounds of different combinations of elastic properties. Looking ahead, our method serves as a universal framework for the computational discovery of diverse metamaterials across a range of fields, including robotics, biomedicine, thermal engineering, and photonics.
Authors:Arturo Rodriguez, Ashesh Chattopadhyay, Piyush Kumar, Luis F. Rodriguez, Vinod Kumar
Abstract:
Physics-informed neural networks (PINNs) commonly address ill-posed inverse problems by uncovering unknown physics. This study presents a novel unsupervised learning framework that identifies spatial subdomains with specific governing physics. It uses the partition of unity networks (POUs) to divide the space into subdomains, assigning unique nonlinear model parameters to each, which are integrated into the physics model. A vital feature of this method is a physics residual-based loss function that detects variations in physical properties without requiring labeled data. This approach enables the discovery of spatial decompositions and nonlinear parameters in partial differential equations (PDEs), optimizing the solution space by dividing it into subdomains and improving accuracy. Its effectiveness is demonstrated through applications in porous media thermal ablation and ice-sheet modeling, showcasing its potential for tackling real-world physics challenges.
Authors:Yu-Xuan Chen, Jing Sun, Bo-Qi Meng
Abstract:
Conventional electromagnetic induction-based current transformers suffer from issues such as bulky and complex structures, slow response times, and low safety levels. Consequently, researchers have explored combining various sensing technologies with optical fibers to develop optical current transformers that could become the primary choice for power systems in the future. With the maturation of optoelectronic technology, optical current transformers have emerged. They offer outstanding advantages, including high sensitivity, integration, stability, and the ability to operate in complex environments. This review categorizes optical current transformers based on different principles, including all-fiber current transformers, those based on magnetostrictive effects, magneto-optic effects, and thermal effects. It also discusses their principles, structures, manufacturing techniques, and signal processing, while forecasting their future development trends.
Authors:Gaurav Sharma, P R Kumar
Abstract:
In this paper, we consider the problem of preferentially utilizing intermittent renewable power, such as wind, optimally to support thermal inertial loads in a microgrid environment. Thermal inertial loads can be programmed to preferentially consume from renewable sources. The flexibility in power consumption of inertial loads therefore can be used to absorb the fluctuations in intermittently available renewable power sources, and promote reduction of costly fossil-fuel-based non-renewable generation. Under a model which promotes renewable consumption by penalizing the non-renewable, but does not account for variations in the end-user requirements, the optimal solution leads to all the users' temperatures behaving in a lockstep fashion; that is, the power is allocated in such a fashion that all the temperatures are brought to a common value and kept the same after that point, resulting in synchronization among all the loads. In the first part, we showed that under a model which additionally penalizes comfort range violations, the optimal solution is in fact of a desynchronizing nature, where the temperatures are intentionally kept apart to avoid power surges resulting from simultaneous comfort violations by many loads.
Authors:N. Benjamin Murphy, Daniel Hallman, Elena Cherkaev, Kenneth M. Golden
Abstract:
We previously demonstrated that the bulk transport coefficients of uniaxial polycrystalline materials, including electrical and thermal conductivity, diffusivity, complex permittivity, and magnetic permeability, have Stieltjes integral representations involving spectral measures of self-adjoint random operators. The integral representations follow from resolvent representations of physical fields involving these self-adjoint operators, such as the electric field $\boldsymbol{E}$ and current density $\boldsymbol{J}$ associated with conductive media with local conductivity $\boldsymbol{\sigma}$ and resistivity $\boldsymbol{\rho}$ matrices. In this article, we provide a discrete matrix analysis of this mathematical framework which parallels the continuum theory. We show that discretizations of the operators yield real-symmetric random matrices which are composed of projection matrices. We derive discrete resolvent representations for $\boldsymbol{E}$ and $\boldsymbol{J}$ involving the matrices which lead to eigenvector expansions of $\boldsymbol{E}$ and $\boldsymbol{J}$. We derive discrete Stieltjes integral representations for the components of the effective conductivity and resistivity matrices, $\boldsymbol{\sigma}^*$ and $\boldsymbol{\rho}^*$, involving spectral measures for the real-symmetric random matrices, which are given explicitly in terms of their real eigenvalues and orthonormal eigenvectors. We provide a projection method that uses properties of the projection matrices to show that the spectral measure can be computed by much smaller matrices, which leads to a more efficient and stable numerical algorithm for the computation of bulk transport coefficients and physical fields. We demonstrate this algorithm by numerically computing the spectral measure and current density for model 2D and 3D isotropic polycrystalline media with checkerboard microgeometry.
Authors:Saksham Sharma, Akshit Raizada, Suresh Sundaram
Abstract:
Autonomous off-road navigation is required for applications in agriculture, construction, search and rescue, and defence. Traditional on-road autonomous methods struggle with dynamic terrains, leading to poor vehicle control in off-road conditions. Recent deep-learning models have used perception sensors along with kinesthetic feedback for navigation on such terrains. However, this approach has out-of-domain uncertainty. Factors like changes in time of day and weather impact the performance of the model. We propose a multimodal fusion network, "IRisPath", capable of using Thermal and RGB images to provide robustness against dynamic weather and light conditions. To aid further work in this domain, we also open-source a day-night dataset with Thermal and RGB images along with pseudo-labels for traversability. To co-register the inputs for the fusion model, we also develop a novel method for targetless extrinsic calibration of Thermal, LiDAR and RGB cameras with a translation accuracy of ±1.7 cm and a rotation accuracy of ±0.827 degrees.
Authors:Arijit Samal, Haroon R Lone
Abstract:
Non-invasive temperature monitoring of individuals plays a crucial role in identifying and isolating symptomatic individuals. Temperature monitoring becomes particularly vital in settings characterized by close human proximity, often referred to as dense settings. However, existing research on non-invasive temperature estimation using thermal cameras has predominantly focused on sparse settings. Unfortunately, the risk of disease transmission is significantly higher in dense settings like movie theaters or classrooms. Consequently, there is an urgent need to develop robust temperature estimation methods tailored explicitly for dense settings.
Our study proposes a non-invasive temperature estimation system that combines a thermal camera with an edge device. Our system employs YOLO models for face detection and utilizes a regression framework for temperature estimation. We evaluated the system on a diverse dataset collected in dense and sparse settings. Our proposed face detection model achieves an impressive mAP score of over 84 in both in-dataset and cross-dataset evaluations. Furthermore, the regression framework demonstrates remarkable performance with a mean square error of 0.18$^{\circ}$C and an impressive $R^2$ score of 0.96. Our experiments' results highlight the developed system's effectiveness, positioning it as a promising solution for continuous temperature monitoring in real-world applications. With this paper, we release our dataset and programming code publicly.
Authors:Simon Mielke, Anthony Stein
Abstract:
Animal excretions in the form of urine puddles and feces are a significant source of emissions in livestock farming. Automated detection of soiled floor in barns can contribute to improved management processes, and the derived information can also be used to model emission dynamics. Previous research approaches to determine the puddle area require manual detection of the puddle in the barn. While humans can detect animal excretions on thermal images of a livestock barn, automated approaches using thresholds fail due to other objects of the same temperature, such as the animals themselves. In addition, various parameters such as the type of housing, animal species, age, sex, weather and unknown factors can influence the type and shape of excretions. Due to this heterogeneity, a method for automated detection of excretions must therefore be not only accurate but also robust to varying conditions. These requirements can be met by using contemporary deep learning models from the field of artificial intelligence. This work is the first to investigate the suitability of different deep learning models for the detection of excretions in pigsties, thereby comparing established convolutional architectures with recent transformer-based approaches. The detection models Faster R-CNN, YOLOv8, DETR and DAB-DETR are compared and statistically assessed on two created training datasets representing two pig houses. We apply a method derived from nested cross-validation and report on the results in terms of eight common detection metrics. Our work demonstrates that all investigated deep learning models are generally suitable for reliably detecting excretions with an average precision of over 90%. The models also show robustness on out-of-distribution data that differs from the conditions in the training data, albeit with expected slight decreases in the overall detection performance.
Authors:L. Klochko, M. d'Aquin, A. Togo, L. Chaput
Abstract:
Machine learning promises to accelerate material discovery by enabling high-throughput prediction of desirable macro-properties from atomic-level descriptors or structures. However, the limited data available about precise values of these properties have been a barrier, leading to predictive models with limited precision or limited ability to generalize. This is particularly true of lattice thermal conductivity (LTC): existing datasets of precise (ab initio, DFT-based) computed values are limited to a few dozen materials with little variability. Based on such datasets, we study the impact of transfer learning on both the precision and generalizability of a deep learning model (ParAIsite). We start from an existing model (MEGNet~\cite{Chen2019}) and show that improvements are obtained by fine-tuning a pre-trained version on different tasks. Interestingly, we also show that a much greater improvement is obtained when first fine-tuning it on a large dataset of low-quality approximations of LTC (based on the AGL model) and then applying a second phase of fine-tuning with our high-quality, smaller-scale datasets. The promising results obtained pave the way not only towards a greater ability to explore large databases in search of low thermal conductivity materials but also to methods enabling increasingly precise predictions in areas where quality data are rare.
Authors:Mohammed Riadh Berramdane, Alexandre Battiston, Michele Bardi, Nicolas Blet, Benjamin Rémy, Matthieu Urbain
Abstract:
Facing the thermal management challenges of Wide Bandgap (WBG) semiconductors, this study highlights the use of ARX parametric models, which provide accurate temperature predictions without requiring detailed understanding of component thickness disparities or material physical properties, relying solely on experimental measurements. These parametric models emerge as a reliable alternative to FEM simulations and conventional thermal models, significantly simplifying system identification while ensuring high result accuracy.
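As a minimal sketch of how an ARX model can be identified from measurements alone, the snippet below fits a first-order ARX structure y[k] = a*y[k-1] + b*u[k-1] by ordinary least squares. The model order, the function names, and the synthetic input signal are all assumptions made for illustration; the abstract does not state which orders or signals the authors used.

```python
def fit_arx1(u, y):
    """Least-squares fit of y[k] = a*y[k-1] + b*u[k-1]; returns (a, b)."""
    # Accumulate the 2x2 normal equations for the regressors (y[k-1], u[k-1]).
    s_yy = s_yu = s_uu = r_y = r_u = 0.0
    for k in range(1, len(y)):
        y_prev, u_prev = y[k - 1], u[k - 1]
        s_yy += y_prev * y_prev
        s_yu += y_prev * u_prev
        s_uu += u_prev * u_prev
        r_y += y_prev * y[k]
        r_u += u_prev * y[k]
    det = s_yy * s_uu - s_yu * s_yu
    if abs(det) < 1e-12:
        raise ValueError("input is not informative enough to identify a and b")
    # Solve the 2x2 system with Cramer's rule.
    a = (r_y * s_uu - s_yu * r_u) / det
    b = (s_yy * r_u - s_yu * r_y) / det
    return a, b


def simulate_arx1(a, b, u, y0=0.0):
    """Generate an output sequence from the same first-order ARX model."""
    y = [y0]
    for k in range(1, len(u)):
        y.append(a * y[-1] + b * u[k - 1])
    return y


# Synthetic square-wave power input (hypothetical) and the resulting temperature rise.
u = [1.0 if (k // 5) % 2 == 0 else 0.0 for k in range(50)]
y = simulate_arx1(0.9, 0.5, u)
a_hat, b_hat = fit_arx1(u, y)
print(a_hat, b_hat)
```

With noise-free synthetic data the fit recovers the generating coefficients up to rounding; with real measurements the estimates would depend on noise, sampling rate, and the chosen model order.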
Authors:Nicholas E. Pacheco, Kang Zhang, Ashley S. Reyes, Christopher J. Pacheco, Lucas Burstein, Loris Fichera
Abstract:
This paper presents a computational model, based on the Finite Element Method (FEM), that simulates the thermal response of laser-irradiated tissue. This model addresses a gap in the current ecosystem of surgical robot simulators, which generally lack support for lasers and other energy-based end effectors. In the proposed model, the thermal dynamics of the tissue are calculated as the solution to a heat conduction problem with appropriate boundary conditions. The FEM formulation allows the model to capture complex phenomena, such as convection, which is crucial for creating realistic simulations. The accuracy of the model was verified via benchtop laser-tissue interaction experiments using agar tissue phantoms and ex-vivo chicken muscle. The results revealed an average root-mean-square error (RMSE) of less than 2 degrees Celsius across most experimental conditions.
Authors:Matteo Luigi De Pascali, Francesco Casella
Abstract:
The forthcoming energy transition calls for a new generation of thermal power generation systems with low- or zero-emission and highly flexible operation. Dynamic modelling and simulation is a key enabling factor in this field, as controlling such plants is a difficult task for which there is no previous experience and very short design times are expected. The steady-state initialization of those dynamic models is an essential step in the design process, but is unfortunately a difficult task which involves the numerical solution of large systems of nonlinear equations with iterative Newton methods, which is often prone to numerical failures.
In this work, several strategies and methodologies are discussed to successfully achieve steady-state initialization of first-principles equation-based, object-oriented models of advanced thermal power generation systems. These are presented in the context of the Modelica modelling language, but could be applied to other equation-based, object-oriented modelling and simulation environments.
Finally, the successful application of such strategies and methodologies to the SOS-CO2 advanced power generation system is presented.
Authors:Junlan Liu, Qian Yin, Mengshu He, Jun Zhou
Abstract:
The $\text{Cu}_7\text{P}\text{S}_6$ compound has garnered significant attention due to its potential in thermoelectric applications. In this study, we introduce a neuroevolution potential (NEP), trained on a dataset generated from ab initio molecular dynamics (AIMD) simulations, using the moment tensor potential (MTP) as a reference. The low root mean square errors (RMSEs) for total energy and atomic forces demonstrate the high accuracy and transferability of both the MTP and NEP. We further calculate the phonon density of states (DOS) and radial distribution function (RDF) using both machine learning potentials, comparing the results to density functional theory (DFT) calculations. While the MTP potential offers slightly higher accuracy, the NEP achieves a remarkable 41-fold increase in computational speed. These findings provide detailed microscopic insights into the dynamics and rapid Cu-ion diffusion, paving the way for future studies on Cu-based solid electrolytes and their applications in energy devices.
Authors:Miriam Asare-Baiden, Kathleen Jordan, Andrew Chung, Sharon Eve Sonenblum, Joyce C. Ho
Abstract:
Pressure injury (PI) detection is challenging, especially in dark skin tones, due to the unreliability of visual inspection. Thermography has been suggested as a viable alternative as temperature differences in the skin can indicate impending tissue damage. Although deep learning models have demonstrated considerable promise toward reliably detecting PI, the existing work fails to evaluate the performance on darker skin tones and varying data collection protocols. In this paper, we introduce a new thermal and optical imaging dataset of 35 participants focused on darker skin tones where temperature differences are induced through cooling and cupping protocols. We vary the image collection process to include different cameras, lighting, patient pose, and camera distance. We compare the performance of a small convolutional neural network (CNN) trained on either the thermal or the optical images on all skin tones. Our preliminary results suggest that a thermography-based CNN is robust to data collection protocols for all skin tones.
Authors:David Shulman, Itai Dattner
Abstract:
This paper introduces an adaptive physics-guided neural network (APGNN) framework for predicting quality attributes from image data by integrating physical laws into deep learning models. The APGNN adaptively balances data-driven and physics-informed predictions, enhancing model accuracy and robustness across different environments. Our approach is evaluated on both synthetic and real-world datasets, with comparisons to conventional data-driven models such as ResNet. For the synthetic data, 2D domains were generated using three distinct governing equations: the diffusion equation, the advection-diffusion equation, and the Poisson equation. Non-linear transformations were applied to these domains to emulate complex physical processes in image form.
In real-world experiments, the APGNN consistently demonstrated superior performance in the diverse thermal image dataset. On the cucumber dataset, characterized by low material diversity and controlled conditions, APGNN and PGNN showed similar performance, both outperforming the data-driven ResNet. However, in the more complex thermal dataset, particularly for outdoor materials with higher environmental variability, APGNN outperformed both PGNN and ResNet by dynamically adjusting its reliance on physics-based versus data-driven insights. This adaptability allowed APGNN to maintain robust performance across structured, low-variability settings and more heterogeneous scenarios. These findings underscore the potential of adaptive physics-guided learning to integrate physical constraints effectively, even in challenging real-world contexts with diverse environmental conditions.
Authors:Federico P. Cortese, Antonio Pievatolo
Abstract:
Thermal comfort is essential for well-being in urban spaces, especially as cities face increasing heat from urbanization and climate change. Existing thermal comfort models usually overlook temporal dynamics alongside spatial dependencies. We address this problem by introducing a spatio-temporal jump model that clusters data with persistence across both spatial and temporal dimensions. This framework enhances interpretability, minimizes abrupt state changes, and easily handles missing data. We validate our approach through extensive simulations, demonstrating its accuracy in recovering the true underlying partition. When applied to hourly environmental data gathered from a set of weather stations located across the city of Singapore, our proposal identifies meaningful thermal comfort regimes, demonstrating its effectiveness in dynamic urban settings and suitability for real-world monitoring. The comparison of these regimes with feedback on thermal preference indicates the potential of an unsupervised approach to avoid extensive surveys.
Authors:Taemin Heo, Ruaridh Macdonald
Abstract:
The increasing need for energy storage solutions to balance variable renewable energy sources has highlighted the potential of Pumped Thermal Electricity Storage (PTES). In this paper, we investigate the trade-offs between model accuracy and computational efficiency in PTES systems. We evaluate a range of PTES models, from physically detailed to simplified variants, focusing on their non-linear charging and discharging capabilities. Our results show that while detailed models provide the most accurate representation of PTES operation by considering mass flow rate ($\dot{m}$) and state of charge (SoC) dependencies, they come at the cost of increased computational complexity. In contrast, simplified models tend to produce overly optimistic predictions by disregarding capability constraints. Other approximated model variants offer a practical compromise, balancing computational efficiency with acceptable accuracy. In particular, models that disregard $\dot{m}$-dependency and approximate nonlinear SoC-dependency with a piecewise linear function achieve similar accuracy to more detailed models but with significantly faster computation times. Our findings offer guidance to modelers in selecting the appropriate PTES representation for their investment models.
Authors:Vibol Yem, Mattia Quartana, Zi Xin, Kazuhiro Fujitsuka, Tomohiro Amemiya
Abstract:
Relaxation is a critical counterbalance to the demands of modern business life. Footbaths, a simple yet highly effective therapeutic practice, have been used for centuries across various cultures to promote relaxation and overall well-being. This study presents a novel approach to simulating the experience of a public footbath through tactile and thermal stimulation, using airflow to the calf and stimulation on the soles of the feet. Our system aims to offer a realistic and immersive virtual footbath experience without the need for actual water, by controlling the temperature and airflow to mimic the sensation of soaking feet in water or a water wave. Without using actual water, our system can be more compact, highly responsive, and more reproducible. The layer of airflow is made as thin as possible by adjusting the air outlet, and the Coanda effect is also considered to generate a more realistic water surface. The system can provide a multi-sensory experience, including visual and audio feedback of water flow, enhancing the relaxation and therapeutic benefits of a footbath.
Authors:Akshar Ramkumar, Mehdi Soleimanifar
Abstract:
Providing evidence that quantum computers can efficiently prepare low-energy or thermal states of physically relevant interacting quantum systems is a major challenge in quantum information science. A newly developed quantum Gibbs sampling algorithm by Chen, Kastoryano, and Gilyén provides an efficient simulation of the detailed-balanced dissipative dynamics of non-commutative quantum systems. The running time of this algorithm depends on the mixing time of the corresponding quantum Markov chain, which has not been rigorously bounded except in the high-temperature regime. In this work, we establish a polylog(n) upper bound on its mixing time for various families of random n by n sparse Hamiltonians at any constant temperature. We further analyze how the choice of the jump operators for the algorithm and the spectral properties of these sparse Hamiltonians influence the mixing time. Our result places this method for Gibbs sampling on par with other efficient algorithms for preparing low-energy states of quantumly easy Hamiltonians.
Authors:Públio Elon Correa da Silva, Jurandy Almeida
Abstract:
Deep learning (DL) technologies can transform agriculture by improving crop health monitoring and management, thus improving food safety. In this paper, we explore the potential of edge computing for real-time classification of leaf diseases using thermal imaging. We present a thermal image dataset for plant disease classification and evaluate deep learning models, including InceptionV3, MobileNetV1, MobileNetV2, and VGG-16, on resource-constrained devices like the Raspberry Pi 4B. Using pruning and quantization-aware training, these models achieve inference times up to 1.48x faster on Edge TPU Max for VGG16, and up to 2.13x faster with precision reduction on Intel NCS2 for MobileNetV1, compared to high-end GPUs like the RTX 3090, while maintaining state-of-the-art accuracy.
Authors:Iñigo Delgado-Enales, Joshua Lizundia-Loiola, Patricia Molina-Costa, Javier Del Ser
Abstract:
The increasingly populated cities of the 21st Century face the challenge of being sustainable and resilient spaces for their inhabitants. However, climate change, among other problems, makes these objectives difficult to achieve. The Urban Heat Island (UHI) phenomenon that occurs in cities, increasing their thermal stress, is one of the stumbling blocks to achieving a more sustainable city. The ability to estimate temperatures with a high degree of accuracy allows for the identification of the highest priority areas in cities where urban improvements need to be made to reduce thermal discomfort. In this work we explore the usefulness of image-to-image deep neural networks (DNNs) for correlating spatial and meteorological variables of an urban area with street-level air temperature. The air temperature at street-level is estimated both spatially and temporally for a specific use case, and compared with existing, well-established numerical models. Based on the obtained results, deep neural networks are confirmed to be a faster and less computationally expensive alternative to numerical models for estimating ground-level air temperature.
Authors:Yuta Tanabe, Kentaro Yaji, Kuniharu Ushijima
Abstract:
This paper proposes a topology optimization method for non-thermal and thermal fluid problems using the Lattice Kinetic Scheme (LKS). LKS, which is derived from the Lattice Boltzmann Method (LBM), requires only macroscopic values, such as fluid velocity and pressure, whereas LBM requires velocity distribution functions, thereby reducing memory requirements. The proposed method computes design sensitivities based on the adjoint variable method, and the adjoint equation is solved in the same manner as LKS; thus, we refer to it as the Adjoint Lattice Kinetic Scheme (ALKS). A key contribution of this method is the proposed approximate treatment of boundary conditions for the adjoint equation, which is challenging to apply directly due to the characteristics of LKS boundary conditions. We demonstrate numerical examples for steady and unsteady problems involving non-thermal and thermal fluids, and the results are physically meaningful and consistent with previous research, exhibiting similar trends in parameter dependencies, such as the Reynolds number. Furthermore, the proposed method reduces memory usage by up to 75% compared to the conventional LBM in an unsteady thermal fluid problem.
Authors:Hakima Bessaih, Annie Millet
Abstract:
We prove that a semi-implicit time Euler scheme for the two-dimensional Bénard-Boussinesq model on the torus D converges. The rate of convergence in probability is almost 1/2 for a multiplicative noise; this relies on moment estimates in various norms for the processes and the scheme.
In the case of an additive noise, due to the coupling of the equations, provided that the difference in temperature between the top and bottom parts of the torus is not too big compared to the viscosity and thermal diffusivity, a strong polynomial rate of convergence (almost 1/2) is proven in $(L^2(D))^2$ for the velocity and in $L^2(D)$ for the temperature.
It depends on exponential moments of the scheme; due to linear terms involving the other quantity in both evolution equations, the proof has to be done simultaneously for both the velocity and the temperature. These rates in both cases are similar to that obtained for the Navier-Stokes equation.
Authors:Jun Takahashi, Sam Slezak, Elizabeth Crosson
Abstract:
Quantum Monte Carlo (QMC) methods have proven invaluable in condensed matter physics, particularly for studying ground states and thermal equilibrium properties of quantum Hamiltonians without a sign problem. Over the past decade, significant progress has also been made on their rigorous convergence analysis.
Heisenberg antiferromagnets (AFM) with bipartite interaction graphs are a popular target of computational QMC studies due to their physical importance, but despite the apparent empirical efficiency of these simulations it remains an open question whether efficient classical approximation of the ground energy is possible in general. In this work we introduce a ground state variant of the stochastic series expansion QMC method, and for the special class of AFM on interaction graphs with an $O(1)$-bipartite component (star-like), we prove rapid mixing of the associated QMC Markov chain (polynomial time in the number of qubits) by using Jerrum and Sinclair's method of canonical paths. This is the first Markov chain analysis of a practical class of QMC algorithms with the loop representation of Heisenberg models.
Our findings contribute to the broader effort to resolve the computational complexity of Heisenberg AFM on general bipartite interaction graphs.
Authors:Guangting Yu, Shiwei Lan, Kookjin Lee, Alex Mahalov
Abstract:
We present a novel method for reconstructing the thermal conductivity coefficient in 1D and 2D heat equations using moving sensors that dynamically traverse the domain to record sparse and noisy temperature measurements. We significantly reduce the computational cost associated with forward PDE evaluations by employing automatic differentiation, enabling a more efficient and scalable reconstruction process. This allows the inverse problem to be solved with fewer sensors and observations. Specifically, we demonstrate the successful reconstruction of thermal conductivity on the 1D circle and 2D torus, using one and four moving sensors, respectively, with their positions recorded over time. Our method incorporates sampling algorithms to compute confidence intervals for the reconstructed conductivity, improving robustness against measurement noise. Extensive numerical simulations of heat dynamics validate the efficacy of our approach, confirming both the accuracy and stability of the reconstructed thermal conductivity. Additionally, the method is thoroughly tested using large datasets from machine learning, allowing us to evaluate its performance across various scenarios and ensure its reliability. This approach provides a cost-effective and flexible solution for conductivity reconstruction from sparse measurements, making it a robust tool for solving inverse problems in complex domains.
Authors:Zhaohe Lv, Guoliang Zhao, Zhanbo Xu, Jiang Wu, Yadong Zhou, Kun Liu
Abstract:
A reliable comfort model is essential to improve occupant satisfaction and reduce building energy consumption. As two types of the most common and intuitive thermal adaptive behaviors, precise recognition of dressing and undressing can effectively support thermal comfort prediction. However, traditional activity recognition suffers from shortcomings in privacy, cost, and performance. To address the above issues, this study proposes a cross-domain transfer learning method for human dressing and undressing adaptive behavior recognition with WiFi. First, we determine the activity interval by calculating the sliding variance for denoised WiFi signals. Subsequently, short-time Fourier transform and discrete wavelet transform are performed to extract action information on the basis of time-frequency analysis. Ultimately, an efficient 1D CNN pre-trained model is integrated with the SVM algorithm as a hybrid model to enhance the identification robustness in new scenarios. Experimental results show that the hybrid model based on transfer learning provides a more accurate prediction for the adaptive behavior of target subjects, achieving 96.9% and 94.9% accuracy in two cases, respectively.
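The first step described above, locating the activity interval from the sliding variance of a denoised signal, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the window length, the threshold, the synthetic signal, and the function names `sliding_variance` and `activity_interval` are all assumptions chosen for the example.

```python
from statistics import pvariance


def sliding_variance(signal, window):
    """Population variance of every length-`window` slice of `signal`."""
    return [pvariance(signal[i:i + window])
            for i in range(len(signal) - window + 1)]


def activity_interval(signal, window, threshold):
    """Return (start, end) sample indices spanning all windows whose
    variance exceeds `threshold`, or None if no window does."""
    variances = sliding_variance(signal, window)
    active = [i for i, v in enumerate(variances) if v > threshold]
    if not active:
        return None
    # The last active window starts at active[-1] and covers `window` samples.
    return active[0], active[-1] + window - 1


# Synthetic stand-in for a denoised amplitude trace (hypothetical data):
# still, then motion, then still again.
quiet = [0.0] * 20
motion = [1.0, -1.0] * 10
signal = quiet + motion + quiet
interval = activity_interval(signal, window=4, threshold=0.1)
```

The detected interval is slightly wider than the true motion segment because any window that overlaps the motion counts as active; a real pipeline would tune the window and threshold to the sampling rate and noise level.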
Authors:Michael Huylo, Sina Taheri, Atila Novoselac
Abstract:
There is currently a large federal effort to decarbonize the country's electrical grid as part of the clean energy transition. The elimination of fossil fuel fired systems, and their replacement with intermittent renewable sources and other electric equipment will require better load management techniques to ensure a reliable grid. One strategy for maintaining electric grid reliability utilizes peak shaving. Buildings, accounting for 40% of energy use in the United States, can account for an even higher percentage of energy during peak periods driven by high air conditioning loads during the summer, especially in hotter climes such as Austin, Texas. Many previous studies have modeled the effectiveness of building HVAC demand response methods such as temperature setpoint manipulation, pre-cooling, ventilation scheduling, and thermal energy storage. Thermal storage systems, due to their larger energy capacities, have been shown to be most promising for peak shaving. However, there is a lack of work integrating chilled water energy storage models with validated microgrid-district energy system models to fully capture the dynamics of the proposed strategies. Previously, a validated system model for power generation and heating was developed for the University of Texas at Austin (UT Austin). A new validated model integrates the 65 MW combined heat and power plant (CHP), with the campus' 45,000 ton district cooling system, as well as two chilled water storage tanks. While the existing campus system currently utilizes an operator driven peak shaving strategy utilizing thermal storage, optimization results show that there is room for further improvement and energy savings. The presented results quantify the peak shaving in MW and provide a foundation for further analysis.
Authors:Chuxiao Meng, Conor Porter, Sina Malakpour, Garrett Mathesen, Seongyeon Yang
Abstract:
Pore formation during Laser Powder Bed Fusion (LPBF) has long posed challenges in metal 3D printing, significantly affecting the mechanical properties of the final product. Porosity frequently occurs because of unstable keyhole formation, triggered by excess laser energy. Traditional approaches for detecting pores rely heavily on CT scanning, a time-consuming and costly method unsuitable for large-scale production. In response to these limitations, we have developed a real-time pore detection method using thermal sensor data, offering a more efficient, cost-effective alternative for quality control during the LPBF process. Our method, validated against CT-scanned pore counts, provides a high degree of accuracy, achieving an R^2 value of 0.94 across eight sample prints. This approach also effectively tracks pore formation trends as the layer-wise printing pattern changes, providing timely insights into product quality, which may serve as important data points for real-time adaptive parameter optimization in the future. In contrast to prior machine learning-based techniques, which were limited by high computational costs and lacked a direct validation strategy, the method intr
Authors:Wei Liang, Yiting Zhang, Ji Zhang, Erica Cochran Hameen
Abstract:
Ensuring thermal comfort is essential for the well-being and productivity of individuals in built environments. Of the various thermal comfort indicators, the mean radiant temperature (MRT) is very challenging to measure. Most common measurement methodologies are time-consuming and not user-friendly. To address this issue, this paper proposes a novel MRT measurement framework that uses visual simultaneous localization and mapping (SLAM) and semantic segmentation techniques. The proposed approach follows the rule of thumb of the traditional MRT calculation method using surface temperature and view factors. However, it employs visual SLAM and creates a 3D thermal point cloud with enriched surface temperature information. The framework then implements Grounded SAM, a new object detection and segmentation tool to extract features with distinct temperature profiles on building surfaces. The detailed segmentation of thermal features not only reduces potential errors in the calculation of the MRT but also provides an efficient reconstruction of the spatial MRT distribution in the indoor environment. We also validate the calculation results with the reference measurement methodology. This data-driven framework offers faster and more efficient MRT measurements and spatial mapping than conventional methods. It can enable the direct engagement of researchers and practitioners in MRT measurements and contribute to research on thermal comfort and radiant cooling and heating systems.
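The traditional calculation that the abstract refers to combines surface temperatures and view factors. A common textbook form is $T_{mrt}^4 = \sum_i F_i T_i^4$ with absolute temperatures and view factors that sum to one. The sketch below implements that form only; it is an assumption that the paper uses exactly this expression, and the function name and sample values are hypothetical.

```python
def mean_radiant_temperature(surface_temps_c, view_factors):
    """Mean radiant temperature in degrees Celsius from surface temperatures
    (degrees Celsius) and the view factors from the occupant to each surface.

    Uses T_mrt^4 = sum_i F_i * T_i^4 with temperatures in kelvin.
    """
    if len(surface_temps_c) != len(view_factors):
        raise ValueError("one view factor is required per surface")
    if abs(sum(view_factors) - 1.0) > 1e-6:
        raise ValueError("view factors must sum to 1 for an enclosure")
    fourth_power_sum = sum(f * (t + 273.15) ** 4
                           for t, f in zip(surface_temps_c, view_factors))
    return fourth_power_sum ** 0.25 - 273.15


# Hypothetical room: a cool window wall and a warm radiant panel.
mrt_uniform = mean_radiant_temperature([22.0, 22.0, 22.0], [0.5, 0.3, 0.2])
mrt_mixed = mean_radiant_temperature([18.0, 30.0], [0.5, 0.5])
```

Because the weighting is on the fourth power of absolute temperature, the result for unequal surfaces sits slightly above the plain view-factor-weighted average of the surface temperatures.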
Authors:Stavros Kassinos, Alessio Alexiadis
Abstract:
Transformer Neural Networks are driving an explosion of activity and discovery in the field of Large Language Models (LLMs). In contrast, there have been only a few attempts to apply Transformers in engineering physics. Aiming to offer an easy entry point to physics-centric Transformers, we introduce a physics-informed Transformer model for solving the heat conduction problem in a 2D plate with Dirichlet boundary conditions. The model is implemented in the machine learning framework MLX and leverages the unified memory of Apple M-series processors. The use of MLX means that the models can be trained and perform predictions efficiently on personal machines with only modest memory requirements. To train, validate and test the Transformer model we solve the 2D heat conduction problem using central finite differences. Each finite difference solution in these sets is initialized with four random Dirichlet boundary conditions, a uniform but random internal temperature distribution and a randomly selected thermal diffusivity. Validation is performed in-line during training to monitor against over-fitting. The excellent performance of the trained model is demonstrated by predicting the evolution of the temperature field to steady state for the unseen test set of conditions.
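The training data described above come from a central finite difference solver for 2D heat conduction with Dirichlet boundaries. A minimal sketch of such a solver follows. The abstract does not state the time integration scheme, so explicit forward Euler is an assumption here, as are the grid size, the function names, and the sample boundary values; this is not the authors' MLX code.

```python
def make_plate(n, top, bottom, left, right, interior):
    """Build an n x n grid with fixed edge temperatures and a uniform interior."""
    grid = [[interior] * n for _ in range(n)]
    for j in range(n):
        grid[0][j] = top
        grid[n - 1][j] = bottom
    for i in range(n):
        grid[i][0] = left
        grid[i][n - 1] = right
    return grid


def step(grid, r):
    """Advance one explicit time step with central differences in space.

    r = alpha * dt / dx**2, where alpha is the thermal diffusivity.
    Edge values are never updated, which enforces the Dirichlet conditions.
    """
    if not 0.0 < r <= 0.25:
        raise ValueError("explicit scheme is unstable unless 0 < r <= 0.25")
    n = len(grid)
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            laplacian = (grid[i + 1][j] + grid[i - 1][j]
                         + grid[i][j + 1] + grid[i][j - 1]
                         - 4.0 * grid[i][j])
            new[i][j] = grid[i][j] + r * laplacian
    return new


def run_to_steady_state(grid, r, tol=1e-9, max_steps=100_000):
    """Iterate until the largest per-step change falls below `tol`."""
    for _ in range(max_steps):
        new = step(grid, r)
        change = max(abs(a - b)
                     for row_new, row_old in zip(new, grid)
                     for a, b in zip(row_new, row_old))
        grid = new
        if change < tol:
            break
    return grid


# Sanity case: all four edges held at 1.0, interior starting at 0.0.
plate = make_plate(8, top=1.0, bottom=1.0, left=1.0, right=1.0, interior=0.0)
steady = run_to_steady_state(plate, r=0.2)
```

With every edge at the same temperature the interior must relax to that temperature, which makes this a convenient check before moving on to four different random boundary values as in the abstract.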
Authors:Julie Keisler, Margaux Bregere
Abstract:
Electricity is difficult to store, except at prohibitive cost, and therefore the balance between generation and load must be maintained at all times. Electricity is traditionally managed by anticipating demand and intermittent production (wind, solar) and matching flexible production (hydro, nuclear, coal and gas). Accurate forecasting of electricity load and renewable production is therefore essential to ensure grid performance and stability. Both are highly dependent on meteorological variables (temperature, wind, sunshine). These dependencies are complex and difficult to model. On the one hand, spatial variations do not have a uniform impact because population, industry, and wind and solar farms are not evenly distributed across the territory. On the other hand, temporal variations can have delayed effects on load (due to the thermal inertia of buildings). With access to observations from different weather stations and simulated data from meteorological models, we believe that both phenomena can be modeled together. In today's state-of-the-art load forecasting models, the spatio-temporal modeling of the weather is fixed. In this work, we aim to take advantage of the automated representation and spatio-temporal feature extraction capabilities of deep neural networks to improve spatio-temporal weather modeling for load forecasting. We compare our deep learning-based methodology with the state-of-the-art on French national load. This methodology could also be fully adapted to forecasting renewable energy production.
Authors:Youssef Mohamed, Severin Lemaignan, Arzu Guneysu, Patric Jensfelt, Christian Smith
Abstract:
Accurate recognition of human emotions is a crucial challenge in affective computing and human-robot interaction (HRI). Emotional states play a vital role in shaping behaviors, decisions, and social interactions. However, emotional expressions can be influenced by contextual factors, leading to misinterpretations if context is not considered. Multimodal fusion, combining modalities like facial expressions, speech, and physiological signals, has shown promise in improving affect recognition. This paper proposes a transformer-based multimodal fusion approach that leverages facial thermal data, facial action units, and textual context information for context-aware emotion recognition. We explore modality-specific encoders to learn tailored representations, which are then fused using additive fusion and processed by a shared transformer encoder to capture temporal dependencies and interactions. The proposed method is evaluated on a dataset collected from participants engaged in a tangible tabletop Pacman game designed to induce various affective states. Our results demonstrate the effectiveness of incorporating contextual information and multimodal fusion for affective state recognition.
Authors:Juan Gamero-Salinas, Jesús López-Fidalgo
Abstract:
Response Surface Methodology (RSM) and desirability functions were employed in a case study to optimize the thermal and daylight performance of a computational model of a tropical housing typology. Specifically, this approach simultaneously optimized Indoor Overheating Hours (IOH) and Useful Daylight Illuminance (UDI) metrics through an Overall Desirability (D). The lack of significant association between IOH and other annual daylight metrics enabled a focused optimization of IOH and UDI. Each response required only 138 simulation runs (~30 hours for 276 runs) to determine the optimal values for passive strategies: window-to-wall ratio (WWR) and roof overhang depth across four orientations, totalling eight factors. First, initial screening based on $2_V^{8-2}$ fractional factorial design, identified four key factors using stepwise and Lasso regression, narrowed down to three: roof overhang depth on the south and west, WWR on the west, and WWR on the south. Then, RSM optimization yielded an optimal solution (roof overhang: 3.78 meters, west WWR: 3.76%, south WWR: 29.3%) with a D of 0.625 (IOH: 8.33%, UDI: 79.67%). Finally, robustness analysis with 1,000 bootstrap replications provided 95% confidence intervals for the optimal values. This study optimally balances thermal comfort and daylight with few experiments using a computationally-efficient multi-objective approach.
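The Overall Desirability (D) used above is, in the usual Derringer-Suich formulation, the geometric mean of per-response desirabilities that map each response onto [0, 1]. The sketch below assumes that formulation with linear (exponent 1) ramps. The lower and upper bounds are placeholders, since the abstract does not state them, so the computed value is not expected to reproduce the reported D of 0.625.

```python
import math


def desirability_minimize(y, low, high, weight=1.0):
    """Desirability for a response to minimize: 1 at or below `low`, 0 at or above `high`."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - low)) ** weight


def desirability_maximize(y, low, high, weight=1.0):
    """Desirability for a response to maximize: 0 at or below `low`, 1 at or above `high`."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** weight


def overall_desirability(values):
    """Geometric mean of the individual desirabilities."""
    return math.prod(values) ** (1.0 / len(values))


# Response values from the abstract; the 0-100 bounds are hypothetical.
d_ioh = desirability_minimize(8.33, low=0.0, high=100.0)
d_udi = desirability_maximize(79.67, low=0.0, high=100.0)
D = overall_desirability([d_ioh, d_udi])
```

The geometric mean is the key design choice: if any single response is fully undesirable, the overall score drops to zero regardless of the others.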
Authors:Dina E. Abdelaleem, Hassan M. Ahmed, M. Sami Soliman, Tarek M. Said
Abstract:
Daily activity monitoring systems used in households provide vital information for health status, particularly with aging residents. Multiple approaches have been introduced to achieve such goals, typically obtrusive and non-obtrusive. Amongst the obtrusive approaches are the wearable devices, and among the non-obtrusive approaches are the movement detection systems, including motion sensors and thermal sensor arrays (TSAs). TSA systems are advantageous for preserving a person's privacy while capturing their precise spatial location. In this study, human daily living activities were monitored day and night using a TSA system, and the corresponding activity time series and spatial probability distribution were constructed. The monitored activities are classified into two categories: sleeping and daily activity. Results showed the possibility of distinguishing between classes regardless of day and night. The obtained sleep activity duration was compared with previous research using the same raw data. Results showed that the duration of sleep activity, on average, was 9 hours/day, and daily life activity was 7 hours/day. The person's spatial probability distribution was determined using the bivariate distribution for the monitored location. In conclusion, the results showed that sleeping activity was dominant. Our study showed that TSAs were the optimum choice when monitoring human activity. Our proposed approach tackled limitations encountered by previous human activity monitoring systems, such as preserving a person's privacy while knowing their precise spatial location.
Authors:J. Garcia-Echeverria, D. Musat, A. Mahsafar, K. R. Mojaver, D. Rolston, G. Cowan, O. Liboiron-Ladouceur
Abstract:
This paper presents a microring resonator-based weight function for neuromorphic photonic applications achieving a record-high precision of 11.3 bits and accuracy of 9.3 bits for 2 Gbps input optical signals. The system employs an all-analog self-referenced proportional-integral-derivative (PID) controller to perform real-time temperature stabilization within a range of up to 60 degrees Celsius. A self-calibrated weight function is demonstrated for a range of 6 degrees Celsius with a single initial calibration and minimal accuracy and precision degradation. By monitoring the through and drop ports of the microring with variable gain transimpedance amplifiers, accurate and precise weight adjustment is achieved, ensuring optimal performance and reliability. These findings underscore the system's robustness to dynamic thermal environments, highlighting the potential for high-speed reconfigurable analog photonic networks.
Authors:Ye Guo, Chenge Gao, Cong Chen
Abstract:
In the global pursuit of carbon neutrality, the role of batteries is indispensable. They provide pivotal flexibilities to counter uncertainties from renewables, preferably by participating in electricity markets. Unlike thermal generators, however, the dominant type of cost for batteries is opportunity cost, which is more vague and challenging to represent through bids in stipulated formats. This article shows a surprising result to the contrary: the demand-supply function of an ideal battery, considering its opportunity cost, is a staircase function with no more than five segments, which is a perfect match with existing rules in many real electricity markets. The demand-supply function shifts horizontally with price forecasts and vertically with the initial SOC. These results can be generalized to imperfect batteries and numerous battery-like resources, including battery clusters, air-conditioners, and electric vehicle charging stations, although the number of segments may vary. These results pave the way for batteries to participate in electricity markets.
Authors:Khoi Phuong Dao, Juejun Hu
Abstract:
This paper proposes an inverse design scheme for resistive heaters. By adjusting the spatial distribution of a binary electrical resistivity map, the scheme enables objective-driven optimization of heaters to achieve pre-defined steady-state temperature profiles. The approach can be fully automated and is computationally efficient since it does not entail extensive iterative simulations of the entire heater structure. The design scheme offers a powerful solution for resistive heater device engineering in applications spanning electronics, photonics, and microelectromechanical systems.
Authors:Nicolas Rouquette, Alessandro Pinto, Inigo Incer
Abstract:
We present a compositional approach to early modeling and analysis of complex aerospace systems based on assume-guarantee contracts. Components in a system are abstracted into assume-guarantee specifications. Performing algebraic contract operations with Pacti allows us to relate local component specifications to that of the system. Applications to two aerospace case studies (the design of spacecraft to satisfy a rendezvous mission and the design of the thermal management system of a prototypical aircraft) show that this methodology provides engineers with an agile, early analysis and exploration process.
Authors:Pengyue Hou
Abstract:
Complex dynamical systems frequently encounter a recurrent structural instability: the collapse of the spectral gap, driving the system toward a low-dimensional "Zero-Mode Attractor" (e.g., spectral pile-up or over-smoothing). Building upon recent global well-posedness estimates [Hou, arXiv:2601.00638], this work generalizes the Multi-Scale Negative Coupled Information System (MNCIS) framework. We postulate that global stability requires an active topological operator - Adaptive Spectral Negative Coupling (ASNC) - functioning as a state-dependent high-pass filter that penalizes entropy accumulation at spectral boundaries. We validate this unified framework via three implementations: (1) Hydrodynamics: In 3D Navier-Stokes turbulence ($N=256^3$), ASNC acts as a global-enstrophy adaptive sub-grid scale (SGS) model, stabilizing the inviscid limit and preserving the Kolmogorov $-5/3$ inertial range without artificial hyper-viscosity. Crucially, we verify that the operator remains dormant ($γ\approx 0$) during the linear growth phase of physical instabilities, functioning strictly as a conditional topological clamp. (2) Artificial Intelligence: Addressing Over-smoothing in Graph Neural Networks (GNNs), we implement ASNC as a parameter-free topological constraint. Unlike baselines (e.g., DeepGCNs) relying on dense residual connections, our framework enables the training of ultra-deep 64-layer networks without residual connections, maintaining perfectly stationary feature variance ($σ^2 \equiv 1.0$) on the ogbn-arxiv benchmark. (3) Biological Physics: In reaction-diffusion morphogenesis, it stabilizes Turing patterns against diffusive washout in high-entropy regimes. Our results suggest that the MNCIS framework provides a base-independent topological condition for distinguishing viable complex systems from those collapsing into thermal equilibrium.
Authors:Krishna Chaitanya Sunkara
Abstract:
This work presents DCIM 3.0, a unified framework integrating semantic reasoning, predictive analytics, autonomous orchestration, and unified connectivity for next-generation AI data center management. The framework addresses critical challenges in infrastructure automation, sustainability, and digital-twin design through knowledge graph-based intelligence, thermal modeling, and the Unified Device Connectivity Protocol (UDCP).
Keywords: Data Center Infrastructure Management, DCIM, AI Data Centers, Knowledge Graphs, Digital Twin, Thermal Management, Infrastructure Automation, Sustainability, GPU Computing, Data Center
Authors:Mina S. Khalaf
Abstract:
Reliable temperature forecasting in Enhanced Geothermal Systems (EGS) is essential, yet petroleum-based decline curves and many machine-learning surrogates do not enforce geothermal heat transfer, while thermo-hydro-mechanical (THM) simulation remains computationally expensive. This study proposes a physics-consistent framework that advances both decline-curve analysis and surrogate modeling. The classical Arps decline family is generalized for geothermal use by introducing an equilibrium-temperature term motivated by Newton-type cooling, ensuring finite late-time temperature limits while reducing exactly to the conventional Arps forms when the equilibrium term is set to zero. The extended decline curves are validated against Utah FORGE downhole temperature measurements and then used to construct learning surrogates on a controlled THM dataset spanning fracture count, well spacing, fracture spacing, host-rock thermal conductivity, and circulation rate. An equation-informed neural network embeds the modified decline equations as differentiable internal computational layers to produce full 0-60 month temperature trajectories from design and operational inputs. A probabilistic Gaussian Process Regression surrogate is also developed for direct multi-horizon forecasting with calibrated uncertainty, while a direct XGBoost regression baseline provides a purely data-driven reference. Across the simulation dataset, the extended decline models reproduce temperature trajectories with near-perfect fidelity (median RMSE = 0.071 °C), and the equation-informed network achieves typical hold-out errors of MAE = 3.06 °C and RMSE = 4.49 °C. The Gaussian Process surrogate delivers the strongest predictive accuracy across 3-60 month horizons (RMSE = 3.39 °C; MAE = 2.34 °C) with well-calibrated uncertainty, whereas the XGBoost baseline exhibits higher errors.
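The abstract does not give the extended decline equations, so the sketch below shows only one plausible form that satisfies the two stated properties: a finite late-time limit and exact reduction to the conventional Arps curve when the equilibrium term is zero. The specific expression $T(t) = T_{eq} + (T_i - T_{eq}) / (1 + b D_i t)^{1/b}$, the function names, and the parameter values are assumptions for illustration and should not be read as the author's formulation.

```python
import math


def arps(t, q_i, d_i, b):
    """Conventional Arps decline: exponential for b == 0, otherwise
    hyperbolic (harmonic when b == 1)."""
    if b == 0:
        return q_i * math.exp(-d_i * t)
    return q_i / (1.0 + b * d_i * t) ** (1.0 / b)


def extended_decline(t, t_initial, t_equilibrium, d_i, b):
    """Hypothetical equilibrium-shifted decline: the excess temperature above
    `t_equilibrium` follows an Arps curve, so T approaches `t_equilibrium`
    at late time instead of zero."""
    return t_equilibrium + arps(t, t_initial - t_equilibrium, d_i, b)


# Hypothetical parameters: 200 degC initial, 150 degC equilibrium,
# nominal decline 0.05 per month, hyperbolic exponent 0.5.
months = list(range(0, 61, 12))
trajectory = [extended_decline(m, 200.0, 150.0, 0.05, 0.5) for m in months]
```

Setting `t_equilibrium` to zero recovers `arps` exactly, which is the reduction property the abstract describes; the shifted form also keeps the curve monotone between the initial and equilibrium temperatures.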
Authors:Sungwoo Kang
Abstract:
Predictive maintenance demands accurate anomaly detection and trustable explanations. Although multimodal fusion of sensor time-series and thermal imagery shows promise, we demonstrate that naive fusion strategies can paradoxically degrade performance. This paper introduces a Cascaded Anomaly Detection framework that decouples detection and localization. Stage 1 employs an LSTM-based sensor encoder with temporal attention for high-accuracy detection, while Stage 2 activates a CNN-based thermal encoder for post-detection fault localization. Our results reveal that sensor-only detection outperforms full fusion by 8.3 percentage points (93.08% vs. 84.79% F1-score), challenging the assumption that additional modalities invariably improve performance. We further contribute an explainability pipeline integrating SHAP, temporal/spatial attention, and gate weight analysis. This analysis uncovers a "modality bias" where fusion models assign 65-87% weight to the weaker thermal modality. Validated on a real-world bearing dataset (78,397 samples), our cascaded approach achieves state-of-the-art accuracy while providing actionable diagnostics for maintenance decision-making.
Authors:Jingming Li
Abstract:
A critical gap exists in LLM task-specific benchmarks. Thermal comfort, a sophisticated interplay of environmental factors and personal perceptions involving sensory integration and adaptive decision-making, serves as an ideal paradigm for evaluating real-world cognitive capabilities of AI systems. To address this, we propose TCEval, the first evaluation framework that assesses three core cognitive capacities of AI, cross-modal reasoning, causal association, and adaptive decision-making, by leveraging thermal comfort scenarios and large language model (LLM) agents. The methodology involves initializing LLM agents with virtual personality attributes, guiding them to generate clothing insulation selections and thermal comfort feedback, and validating outputs against the ASHRAE Global Database and Chinese Thermal Comfort Database. Experiments on four LLMs show that while agent feedback has limited exact alignment with humans, directional consistency improves significantly with a 1 PMV tolerance. Statistical tests reveal that LLM-generated PMV distributions diverge markedly from human data, and agents perform near-randomly in discrete thermal comfort classification. These results confirm the feasibility of TCEval as an ecologically valid Cognitive Turing Test for AI, demonstrating that current LLMs possess foundational cross-modal reasoning ability but lack precise causal understanding of the nonlinear relationships between variables in thermal comfort. TCEval complements traditional benchmarks, shifting AI evaluation focus from abstract task proficiency to embodied, context-aware perception and decision-making, offering valuable insights for advancing AI in human-centric applications like smart buildings.
Authors:Y. Sungtaek Ju
Abstract:
Reconstructing time-resolved flow fields from temporally sparse velocimetry measurements is critical for characterizing many complex thermal-fluid systems. We introduce a machine learning framework for uncertainty-aware flow reconstruction using sparse variational Gaussian processes in the Kolmogorov-Arnold network topology (SVGP-KAN). This approach extends the classical foundations of Linear Stochastic Estimation (LSE) and Spectral Analysis Modal Methods (SAMM) while enabling principled epistemic uncertainty quantification. We perform a systematic comparison of our framework with the classical reconstruction methods as well as Kalman filtering. Using synthetic data from pulsed impingement jet flows, we assess performance across fractional PIV sampling rates ranging from 0.5% to 10%. Evaluation metrics include reconstruction error, generalization gap, structure preservation, and uncertainty calibration. Our SVGP-KAN methods achieve reconstruction accuracy comparable to established methods, while also providing well-calibrated uncertainty estimates that reliably indicate when and where predictions degrade. The results demonstrate a robust, data-driven framework for flow field reconstruction with meaningful uncertainty quantification and offer practical guidance for experimental design in periodic flows.
Authors:Jose I. Aizpurua
Abstract:
The integration of physics-based knowledge with machine learning models is increasingly shaping the monitoring, diagnostics, and prognostics of electrical transformers. In this two-part series, the first paper introduced the foundations of Neural Networks (NNs) and their variants for health assessment tasks. This second paper focuses on integrating physics and uncertainty into the learning process. We begin with the fundamentals of Physics-Informed Neural Networks (PINNs), applied to spatiotemporal thermal modeling and solid insulation ageing. Building on this, we present Bayesian PINNs as a principled framework to quantify epistemic uncertainty and deliver robust predictions under sparse data. Finally, we outline emerging research directions that highlight the potential of physics-aware and trustworthy machine learning for critical power assets.
Authors:Alexander K. Chen
Abstract:
Practical utilization of large-scale machine learning requires a powerful compute setup, a necessity which poses a significant barrier to engagement with such artificial intelligence in more restricted system environments. While cloud computing offers a solution to weaker local environments, certain situations like training involving private or sensitive data, physical environments not available through the cloud, or higher anticipated usage costs, necessitate computing locally. We explore the potential to improve weaker local compute systems at zero additional cost by taking advantage of ubiquitous yet underutilized resources: mobile phones. Specifically, recent iOS phones are equipped with surprisingly powerful processors, but they also face limitations like memory constraints, thermal throttling, and OS sandboxing. We present a proof-of-concept system demonstrating a novel approach to harness an iOS device via distributed pipeline parallelism, achieving significant benefits in a lesser compute environment by accelerating modest model training, batch inference, and agentic LRM tool-usage. We discuss practical use-cases, limitations, and directions for future work. The findings of this paper highlight the potential for ever-improving, commonplace mobile devices to make greater contributions to machine learning.
Authors:Akshansh Mishra
Abstract:
Accurate prediction of temperature evolution is essential for understanding thermomechanical behavior in friction stir welding. In this study, molecular dynamics simulations were performed using LAMMPS to model aluminum friction stir welding at the atomic scale, capturing material flow, plastic deformation, and heat generation during tool plunge, traverse, and retraction. Atomic positions and velocities were extracted from simulation trajectories and transformed into physics-based two-dimensional spatial grids. These grids represent local height variation, velocity components, velocity magnitude, and atomic density, preserving spatial correlations within the weld zone. A two-dimensional convolutional neural network was developed to predict temperature directly from the spatially resolved atomistic data. Hyperparameter optimization was carried out to determine an appropriate network configuration. The trained model demonstrates strong predictive capability, achieving a coefficient of determination $R^2$ of 0.9439, a root mean square error of 14.94 K, and a mean absolute error of 11.58 K on unseen test data. Class Activation Map analysis indicates that the model assigns higher importance to regions near the tool-material interface, which are associated with intense deformation and heat generation in the molecular dynamics simulations. The results show that spatial learning from atomistic simulation data can accurately reproduce temperature trends in friction stir welding while remaining consistent with physical deformation and flow mechanisms observed at the atomic scale.
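As a rough illustration of turning per-atom data into two-dimensional grids, the sketch below bins atoms by their x-y position and accumulates two of the channels mentioned above (atomic density and mean velocity magnitude). The function name `atoms_to_grids`, the binning scheme, and the sample data are assumptions for illustration only; the abstract does not describe the authors' actual preprocessing.

```python
import math


def atoms_to_grids(positions, velocities, x_range, y_range, nx, ny):
    """Bin atoms onto an ny-by-nx grid; return (density, mean_speed) channels."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    density = [[0] * nx for _ in range(ny)]
    speed_sum = [[0.0] * nx for _ in range(ny)]
    for (x, y, _z), (vx, vy, vz) in zip(positions, velocities):
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue  # atom lies outside the gridded region
        # Clamp guards against floating-point rounding at the upper edge.
        i = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
        j = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
        density[j][i] += 1
        speed_sum[j][i] += math.sqrt(vx * vx + vy * vy + vz * vz)
    mean_speed = [
        [speed_sum[j][i] / density[j][i] if density[j][i] else 0.0 for i in range(nx)]
        for j in range(ny)
    ]
    return density, mean_speed


# Three hypothetical atoms on a 2x2 grid covering [0, 2) x [0, 2).
positions = [(0.5, 0.5, 0.0), (0.6, 0.4, 1.0), (1.5, 1.5, 0.0)]
velocities = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0), (1.0, 0.0, 0.0)]
density, mean_speed = atoms_to_grids(positions, velocities, (0.0, 2.0), (0.0, 2.0), 2, 2)
```

Stacking such channels gives an image-like tensor that a two-dimensional convolutional network can consume.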
Authors:Georgios Voulgaris
Abstract:
Modern deep learning models operating on multi-modal visual signals often rely on inductive biases that are poorly aligned with the physical processes governing signal formation, leading to brittle performance under cross-spectral and real-world conditions. In particular, approaches that prioritise direct thermal cues struggle to capture indirect yet persistent environmental alterations induced by sustained heat emissions. This work introduces a physics-aware representation learning framework that leverages multi-spectral information to model stable signatures of long-term physical processes. Specifically, a geological Short Wave Infrared (SWIR) ratio sensitive to soil property changes is integrated with Thermal Infrared (TIR) data through an intermediate fusion architecture, instantiated as FusionNet. The proposed backbone embeds trainable differential signal-processing priors within convolutional layers, combines mixed pooling strategies, and employs wider receptive fields to enhance robustness across spectral modalities. Systematic ablations show that each architectural component contributes to performance gains, with DGCNN achieving 88.7% accuracy on the SWIR ratio and FusionNet reaching 90.6%, outperforming state-of-the-art baselines across five spectral configurations. Transfer learning experiments further show that ImageNet pretraining degrades TIR performance, highlighting the importance of modality-aware training for cross-spectral learning. Evaluated on real-world data, the results demonstrate that combining physics-aware feature selection with principled deep learning architectures yields robust and generalisable representations, illustrating how first-principles signal modelling can improve multi-spectral learning under challenging conditions.
Authors:Moses Kiprono
Abstract:
Chronic wounds, including diabetic foot ulcers which affect up to one-third of people with diabetes, impose a substantial clinical and economic burden, with U.S. healthcare costs exceeding 25 billion dollars annually. Current wound assessment remains predominantly subjective, leading to inconsistent classification and delayed interventions. We present WoundNet-Ensemble, an Internet of Medical Things system leveraging a novel ensemble of three complementary deep learning architectures: ResNet-50, the self-supervised Vision Transformer DINOv2, and Swin Transformer, for automated classification of six clinically distinct wound types. Our system achieves 99.90 percent ensemble accuracy on a comprehensive dataset of 5,175 wound images spanning diabetic foot ulcers, pressure ulcers, venous ulcers, thermal burns, pilonidal sinus wounds, and fungating malignant tumors. The weighted fusion strategy demonstrates a 3.7 percent improvement over previous state-of-the-art methods. Furthermore, we implement a longitudinal wound healing tracker that computes healing rates, severity scores, and generates clinical alerts. This work demonstrates a robust, accurate, and clinically deployable tool for modernizing wound care through artificial intelligence, addressing critical needs in telemedicine and remote patient monitoring. The implementation and trained models will be made publicly available to support reproducibility.
Authors:Yufeng Xie
Abstract:
Infrared and visible image fusion is a pivotal technology in low-altitude UAV reconnaissance missions, providing high-quality data support for downstream tasks such as target detection and tracking by integrating thermal saliency with background texture details. However, traditional no-reference metrics, such as Entropy (EN) and Average Gradient (AG), fail in complex low-light environments. They often misinterpret high-frequency sensor noise as valid detail. This creates a "Noise Trap," paradoxically assigning higher scores to noisy images and misguiding fusion algorithms. To address this, we propose the Target-Background Contrast (TBC) metric. Inspired by Weber's Law, TBC focuses on the relative contrast of salient targets rather than global statistics. Unlike traditional metrics, TBC penalizes background noise and rewards target visibility. Experiments on the DroneVehicle dataset demonstrate that TBC aligns better with human perception and provides a reliable standard for low-altitude scenarios.
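The noise-trap effect can be reproduced on a toy image. The sketch below computes Entropy (EN) and Average Gradient (AG) in their common textbook forms, plus a plain Weber contrast between a known target box and the background. The Weber contrast here is only the classical quantity that the abstract says inspired TBC; it is not the paper's TBC formula, which the abstract does not give. The scene, box, and noise level are hypothetical.

```python
import math
import random

SIZE = 32
BOX = (12, 20, 12, 20)  # hypothetical target region: rows 12-19, columns 12-19


def make_scene(noise_amplitude, seed=0):
    """Bright target (200) on a dark background (20), optional background noise."""
    rng = random.Random(seed)
    y0, y1, x0, x1 = BOX
    return [[200 if (y0 <= y < y1 and x0 <= x < x1)
             else 20 + rng.randint(-noise_amplitude, noise_amplitude)
             for x in range(SIZE)] for y in range(SIZE)]


def entropy(img):
    """Shannon entropy (bits) of the gray-level histogram."""
    counts = {}
    for row in img:
        for v in row:
            counts[v] = counts.get(v, 0) + 1
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def average_gradient(img):
    """Mean of sqrt((dx^2 + dy^2) / 2) over forward differences."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            dx = img[y][x + 1] - img[y][x]
            dy = img[y + 1][x] - img[y][x]
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((h - 1) * (w - 1))


def weber_contrast(img, box):
    """(mean target - mean background) / mean background."""
    y0, y1, x0, x1 = box
    target, background = [], []
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            (target if (y0 <= y < y1 and x0 <= x < x1) else background).append(v)
    bg_mean = sum(background) / len(background)
    return (sum(target) / len(target) - bg_mean) / bg_mean


clean, noisy = make_scene(0), make_scene(15)
en_clean, en_noisy = entropy(clean), entropy(noisy)
ag_clean, ag_noisy = average_gradient(clean), average_gradient(noisy)
wc_clean, wc_noisy = weber_contrast(clean, BOX), weber_contrast(noisy, BOX)
```

In this toy scene both EN and AG rise when background noise is added, even though nothing about the target improved, while the target-to-background contrast stays close to its clean value.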
Authors:Neelakantan Padmanabhan
Abstract:
This work introduces an ensemble parameter estimation framework that enables the Lumped Parameter Linear Superposition (LPLSP) method to generate reduced order thermal models from a single transient dataset. Unlike earlier implementations that relied on multiple parametric simulations to excite each heat source independently, the proposed approach simultaneously identifies all model coefficients using fully transient excitations. Two estimation strategies, namely rank-reduction and two-stage decomposition, are developed to further reduce computational cost and improve scalability for larger systems. The proposed strategies yield ROMs with mean temperature-prediction errors within 5% of CFD simulations while reducing model-development times to O(10^0 s)-O(10^1 s). Once constructed, the ROM evaluates new transient operating conditions in O(10^0 s), enabling rapid thermal analysis and automated generation of digital twins for both simulated and physical systems.
Authors:Tessa Vu
Abstract:
Pedestrian heat exposure is a critical health risk in dense tropical cities, yet standard routing algorithms often ignore micro-scale thermal variation. Hot Hẻm is a GeoAI workflow that estimates and operationalizes pedestrian heat exposure in Hồ Chí Minh City (HCMC), Việt Nam, colloquially known as Sài Gòn. This spatial data science pipeline combines Google Street View (GSV) imagery, semantic image segmentation, and remote sensing. Two XGBoost models are trained to predict land surface temperature (LST) using a GSV training dataset in selected administrative wards, known as phường, and are deployed in a patchwork manner across all OSMnx-derived pedestrian network nodes to enable heat-aware routing. When deployed, the model provides a foundation for pinpointing where, and further understanding why, certain city corridors may experience disproportionately higher temperatures at an infrastructural scale.
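Heat-aware routing of this kind can be sketched as a shortest-path search in which each edge's length is inflated by a temperature penalty. The cost function, the `heat_weight` and `comfort_c` parameters, and the four-node graph below are all hypothetical; the abstract does not specify how predicted LST enters the routing cost, and a real pipeline would read edges from an OSMnx-derived network rather than a hand-written dictionary.

```python
import heapq


def heat_aware_route(graph, source, target, heat_weight=0.1, comfort_c=30.0):
    """Dijkstra search with cost = length * (1 + heat_weight * degrees above comfort_c).

    graph maps node -> list of (neighbor, length_m, lst_c) tuples.
    Returns the node list of the cheapest path, or None if unreachable.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length_m, lst_c in graph.get(node, []):
            penalty = 1.0 + heat_weight * max(0.0, lst_c - comfort_c)
            new_cost = cost + length_m * penalty
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


# Hypothetical network: A-B-D is short but hot, A-C-D is longer but cooler.
graph = {
    "A": [("B", 100.0, 45.0), ("C", 150.0, 32.0)],
    "B": [("D", 100.0, 45.0)],
    "C": [("D", 150.0, 32.0)],
    "D": [],
}
shortest = heat_aware_route(graph, "A", "D", heat_weight=0.0)
coolest = heat_aware_route(graph, "A", "D", heat_weight=0.1)
```

Setting `heat_weight` to zero recovers ordinary shortest-distance routing, so the same function can serve as its own baseline.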
Authors:Daniel Cavadia
Abstract:
Precise and real-time detection of gastrointestinal polyps during endoscopic procedures is crucial for early diagnosis and prevention of colorectal cancer. This work presents EndoSight AI, a deep learning architecture developed and evaluated independently to enable accurate polyp localization and detailed boundary delineation. Leveraging the publicly available Hyper-Kvasir dataset, the system achieves a mean Average Precision (mAP) of 88.3% for polyp detection and a Dice coefficient of up to 69% for segmentation, alongside real-time inference speeds exceeding 35 frames per second on GPU hardware. The training incorporates clinically relevant performance metrics and a novel thermal-aware procedure to ensure model robustness and efficiency. This integrated AI solution is designed for seamless deployment in endoscopy workflows, promising to advance diagnostic accuracy and clinical decision-making in gastrointestinal healthcare.
Authors:MD-Nazmus Sunbeam
Abstract:
Claims that humanoid robots achieve "human-level" actuation are common but rarely quantified. Peak torque or speed specifications tell us little about whether a joint can deliver the right combination of torque, power, and endurance at task-relevant postures and rates. We introduce a comprehensive framework that makes "human-level" measurable and comparable across systems. Our approach has three components. First, a kinematic DoF atlas standardizes joint coordinate systems and ranges of motion using ISB-based conventions, ensuring that human and robot joints are compared in the same reference frames. Second, Human-Equivalence Envelopes (HEE) define per-joint requirements by measuring whether a robot meets human torque and power simultaneously at the same joint angle and rate $(q,ω)$, weighted by positive mechanical work in task-specific bands (walking, stairs, lifting, reaching, and hand actions). Third, the Human-Level Actuation Score (HLAS) aggregates six physically grounded factors: workspace coverage (ROM and DoF), HEE coverage, torque-mode bandwidth, efficiency, and thermal sustainability. We provide detailed measurement protocols using dynamometry, electrical power monitoring, and thermal testing that yield every HLAS input from reproducible experiments. A worked example demonstrates HLAS computation for a multi-joint humanoid, showing how the score exposes actuator trade-offs (gearing ratio versus bandwidth and efficiency) that peak-torque specifications obscure. The framework serves as both a design specification for humanoid development and a benchmarking standard for comparing actuation systems, with all components grounded in published human biomechanics data.
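One way to read the HEE coverage idea is as a work-weighted fraction of operating points at which the robot satisfies both the torque and the power requirement. The sketch below follows that reading; the dictionary keys, the weighting by positive mechanical power times a time step, and the sample values are assumptions for illustration, not the paper's definition.

```python
def hee_coverage(samples, dt=1.0):
    """Work-weighted share of samples where torque AND power requirements are met.

    Each sample is a dict with hypothetical keys:
      human_torque (N*m), omega (rad/s), robot_torque_limit (N*m), robot_power_limit (W).
    Samples with negative mechanical power get zero weight.
    """
    covered = 0.0
    total = 0.0
    for s in samples:
        human_power = s["human_torque"] * s["omega"]
        weight = max(0.0, human_power) * dt  # positive mechanical work only
        meets_torque = s["robot_torque_limit"] >= abs(s["human_torque"])
        meets_power = s["robot_power_limit"] >= abs(human_power)
        total += weight
        if meets_torque and meets_power:
            covered += weight
    return covered / total if total > 0.0 else 0.0


samples = [
    {"human_torque": 50.0, "omega": 2.0, "robot_torque_limit": 60.0, "robot_power_limit": 120.0},
    {"human_torque": 80.0, "omega": 1.0, "robot_torque_limit": 70.0, "robot_power_limit": 200.0},
    {"human_torque": -30.0, "omega": 1.0, "robot_torque_limit": 60.0, "robot_power_limit": 120.0},
]
coverage = hee_coverage(samples)
```

Here the second sample fails on torque despite ample power, which is the kind of joint-level shortfall that a peak-power figure alone would hide.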
Authors:Shakil Ahmed
Abstract:
Quantum networks (QNs) supported by terahertz (THz) wireless links present a transformative alternative to fiber-based infrastructures, particularly in mobile and infrastructure-scarce environments. However, signal attenuation, molecular absorption, and severe propagation losses in THz channels pose significant challenges to reliable quantum state transmission and entanglement distribution. To overcome these limitations, we propose a dynamic reconfigurable intelligent surface (RIS)-assisted wireless QN architecture that leverages adaptive RIS elements capable of switching between active and passive modes based on the incident signal-to-noise ratio (SNR). These dynamic RIS elements enhance beamforming control over amplitude and phase, enabling robust redirection and compensation for THz-specific impairments. We develop a detailed analytical model that incorporates key physical layer phenomena in THz quantum links, including path loss, fading, thermal noise, and alignment variations. A secure optimization framework is formulated to jointly determine RIS placement and entanglement generation rate (EGR) allocation, while satisfying fidelity, security, and fairness constraints under diverse quality of service (QoS) demands. The model also includes an exploration of side-channel vulnerabilities arising from dynamic RIS switching patterns. Simulation results demonstrate that the proposed architecture yields up to 87% fidelity enhancement and 65% fairness improvement compared to static RIS baselines, while maintaining robustness under realistic THz channel conditions. These results underscore the promise of dynamic RIS technology in enabling scalable and adaptive quantum communications over wireless THz links.
Authors:Marcelo Cerda Castillo
Abstract:
Short-term forecasting of airport fog (visibility < 1.0 km) presents challenges in geographic generalization because many machine learning models rely on location-specific features and fail to transfer across sites. This study investigates whether fundamental thermodynamic and radiative processes can be encoded in a coordinate-free (location-independent) feature set to enable geographic transferability. A gradient boosting classifier (XGBoost) trained on Santiago, Chile (SCEL, 33S) data from 2002-2009 was evaluated on a 2010-2012 holdout set and under strict zero-shot tests at Puerto Montt (SCTE), San Francisco (KSFO), and London (EGLL). The model achieved AUC values of 0.923-0.947 across distances up to 11,650 km and different fog regimes (radiative, advective, marine). Consistent SHAP feature rankings show that visibility persistence, solar angle, and thermal gradients dominate predictions, suggesting the model learned transferable physical relationships rather than site-specific patterns. Results suggest that physics-informed, coordinate-free feature engineering can yield geographically transferable atmospheric forecasting tools.
Authors:Elijah Pelofske
Abstract:
This study numerically investigates the thermal sampling properties of QAOA, the Quantum Alternating Operator Ansatz, which was generalized from the original Quantum Approximate Optimization Algorithm. Specifically, we examine the ability of QAOA to sample from the Gibbs distribution (equivalently, the Boltzmann distribution) defined by a classical Ising model, namely a fully connected disordered spin glass (Sherrington-Kirkpatrick) model. We focus on two different QAOA mixers: the standard transverse field X mixer and the Grover mixer. At a QAOA depth of one we examine, for a single full QAOA parameter search space period, the energy landscape, the Shannon entropy landscape of the QAOA probability distribution, and the tradeoff between Boltzmann distribution sampling temperature and error rate (how close the QAOA distribution is to the true Boltzmann distribution). We find that at very high temperatures one-round Grover mixer QAOA can sample from the Boltzmann distribution more accurately than the standard X mixer QAOA at one round. Both X mixer and Grover mixer depth-one QAOA can serve as approximate Boltzmann distribution samplers, and the quality of this approximation depends heavily on the QAOA angle choice.
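The target of the sampling task can be made concrete by enumerating the exact Boltzmann distribution of a very small Sherrington-Kirkpatrick instance and measuring how far a candidate distribution is from it. In the sketch below the energy sign convention, the coupling scale, and the use of total variation distance as the error measure are assumptions; no QAOA circuit is simulated, and the uniform distribution merely stands in for a candidate sampler.

```python
import itertools
import math
import random


def sk_couplings(n, seed=0):
    """Gaussian couplings J_ij / sqrt(n) for every pair i < j."""
    rng = random.Random(seed)
    return {(i, j): rng.gauss(0.0, 1.0) / math.sqrt(n)
            for i in range(n) for j in range(i + 1, n)}


def energy(spins, couplings):
    """Ising energy E(s) = -sum_{i<j} J_ij s_i s_j (assumed sign convention)."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())


def boltzmann_distribution(couplings, n, beta):
    """Exact Boltzmann probabilities over all 2^n spin configurations."""
    states = list(itertools.product((-1, 1), repeat=n))
    weights = [math.exp(-beta * energy(s, couplings)) for s in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}


def total_variation(p, q):
    """Total variation distance between two distributions on the same states."""
    return 0.5 * sum(abs(p[s] - q[s]) for s in p)


n = 4
couplings = sk_couplings(n)
uniform = {s: 1.0 / 2 ** n for s in itertools.product((-1, 1), repeat=n)}
tv_hot = total_variation(boltzmann_distribution(couplings, n, beta=0.0), uniform)
tv_cold = total_variation(boltzmann_distribution(couplings, n, beta=2.0), uniform)
```

At infinite temperature (beta = 0) the Boltzmann distribution is exactly uniform, so even a structureless sampler has zero error there; the error of any fixed candidate grows as the temperature drops and the distribution concentrates on low-energy states.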
Authors:Addina Rahaman
Abstract:
Changing climate conditions threaten the natural permafrost thaw-freeze cycle, leading to year-round soil temperatures above 0°C. In Alaska, the warming of the topmost permafrost layer, known as the active layer, signals elevated greenhouse gas release due to high carbon storage. Accurate soil temperature prediction is therefore essential for risk mitigation and stability assessment; however, many existing approaches overlook the numerous factors driving soil thermal dynamics. This study presents a proof-of-concept latitude-based deep learning pipeline for modeling yearly soil temperatures across multiple depths. The framework employs dynamic reanalysis feature data from the ERA5-Land dataset, static geologic and lithological features, sliding-window sequences for seasonal context, a derived scenario signal feature for long-term climate forcing, and latitude band embeddings for spatial sensitivity. Five deep learning models were tested: a Temporal Convolutional Network (TCN), a Transformer, a 1-Dimensional Convolutional Long-Short Term Memory (Conv1DLSTM), a Gated-Recurrent Unit (GRU), and a Bidirectional Long-Short Term Memory (BiLSTM). Results showed solid recognition of latitudinal and depth-wise temperature discrepancies, with the GRU performing best in sequential temperature pattern detection. Bias-corrected CMIP5 RCP data enabled recognition of sinusoidal temperature trends, though limited divergence between scenarios was observed. This study establishes an end-to-end framework for adopting deep learning in active layer temperature modeling, offering seasonal, spatial, and vertical temperature context without intrinsic restrictions on feature selection.
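Sliding-window sequence construction, which the pipeline above uses for seasonal context, can be sketched in a few lines. The function name `make_windows`, the `window` and `horizon` parameters, and the toy series are hypothetical; the abstract does not state the window length or forecast horizon used.

```python
def make_windows(series, window, horizon=1):
    """Split a time series into (input_window, target) pairs.

    Each input holds `window` consecutive values and the target is the value
    `horizon` steps after the end of that window.
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs


# Toy monthly temperatures; each sample sees three months and predicts the next.
pairs = make_windows([1, 2, 3, 4, 5, 6], window=3)
```

A sequence model such as a GRU would then be trained on the input windows to predict the paired targets.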
Authors:Mark M. Wilde
Abstract:
Quantum generalizations of the Fisher information are important in quantum information science, with applications in high energy and condensed matter physics and in quantum estimation theory, machine learning, and optimization. One can derive a quantum generalization of the Fisher information matrix in a natural way as the Hessian matrix arising in a Taylor expansion of a smooth divergence. Such an approach is appealing for quantum information theorists, given the ubiquity of divergences in quantum information theory. In contrast to the classical case, there is not a unique quantum generalization of the Fisher information matrix, similar to how there is not a unique quantum generalization of the relative entropy or the Rényi relative entropy. In this paper, I derive information matrices arising from the log-Euclidean, $α$-$z$, and geometric Rényi relative entropies, with the main technical tool for doing so being the method of divided differences for calculating matrix derivatives. Interestingly, for all non-negative values of the Rényi parameter $α$, the log-Euclidean Rényi relative entropy leads to the Kubo-Mori information matrix, and the geometric Rényi relative entropy leads to the right-logarithmic derivative Fisher information matrix. Thus, the resulting information matrices obey the data-processing inequality for all non-negative values of the Rényi parameter $α$ even though the original quantities do not. Additionally, I derive and establish basic properties of $α$-$z$ information matrices resulting from the $α$-$z$ Rényi relative entropies. For parameterized thermal states and time-evolved states, I establish formulas for their $α$-$z$ information matrices and hybrid quantum-classical algorithms for estimating them, with applications in quantum Boltzmann machine learning.
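For orientation, the classical version of the Hessian construction mentioned above can be written out. This is the standard textbook statement for a smooth divergence $D$ (for example the Kullback-Leibler divergence) under the usual regularity conditions, not a formula taken from the paper; the symbols $p_\theta$, $\varepsilon$, and $I_F$ are notation chosen here.

```latex
D\!\left(p_\theta \,\middle\|\, p_{\theta+\varepsilon}\right)
  = \frac{1}{2} \sum_{i,j} \varepsilon_i \, \varepsilon_j \, [I_F(\theta)]_{ij}
    + o\!\left(\|\varepsilon\|^2\right),
\qquad
[I_F(\theta)]_{ij}
  = \left. \frac{\partial^2}{\partial \varepsilon_i \, \partial \varepsilon_j}
    D\!\left(p_\theta \,\middle\|\, p_{\theta+\varepsilon}\right) \right|_{\varepsilon = 0}.
```

The zeroth- and first-order terms vanish because the divergence is zero and minimized at $\varepsilon = 0$, so the Hessian is the leading term. In the quantum setting different divergences generally give different Hessians, which is why the generalization is not unique.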
Authors:Stephen Whitelam
Abstract:
We show how to adjust the parameters of a thermodynamic computer by gradient descent in order to perform a desired computation at a specified observation time. Within a digital simulation of a thermodynamic computer, training proceeds by maximizing the probability with which the computer would generate an idealized dynamical trajectory. The idealized trajectory is designed to reproduce the activations of a neural network trained to perform the desired computation. This teacher-student scheme results in a thermodynamic computer whose finite-time dynamics enacts a computation analogous to that of the neural network. The parameters identified in this way can be implemented in the hardware realization of the thermodynamic computer, which will perform the desired computation automatically, driven by thermal noise. We demonstrate the method on a standard image-classification task, and estimate the thermodynamic advantage -- the ratio of energy costs of the digital and thermodynamic implementations -- to exceed seven orders of magnitude. Our results establish gradient descent as a viable training method for thermodynamic computing, enabling application of the core methodology of machine learning to this emerging field.
Authors:Yuhong Lu
Abstract:
Unified multi-modal encoders that bind vision, audio, and other sensors into a shared embedding space are attractive building blocks for robot perception and decision-making. However, on-robot deployment exposes the vision branch to adversarial and natural corruptions, making robustness a prerequisite for safety. Prior defenses typically align clean and adversarial features within CLIP-style encoders and overlook broader cross-modal correspondence, yielding modest gains and often degrading zero-shot transfer. We introduce RLBind, a two-stage adversarial-invariant cross-modal alignment framework for robust unified embeddings. Stage 1 performs unsupervised fine-tuning on clean-adversarial pairs to harden the visual encoder. Stage 2 leverages cross-modal correspondence by minimizing the discrepancy between clean/adversarial features and a text anchor, while enforcing class-wise distributional alignment across modalities. Extensive experiments on Image, Audio, Thermal, and Video data show that RLBind consistently outperforms the LanguageBind backbone and standard fine-tuning baselines in both clean accuracy and norm-bounded adversarial robustness. By improving resilience without sacrificing generalization, RLBind provides a practical path toward safer multi-sensor perception stacks for embodied robots in navigation, manipulation, and other autonomy settings.
Authors:Alejandro D. Mousist
Abstract:
This paper presents ASTREA, the first agentic system deployed on flight-heritage hardware (TRL 9) for autonomous spacecraft operations. Using thermal control as a representative use case, we integrate a resource-constrained Large Language Model (LLM) agent with a reinforcement learning controller in an asynchronous architecture tailored for space-qualified platforms. Ground experiments show that LLM-guided supervision improves thermal stability and reduces violations, confirming the feasibility of combining semantic reasoning with adaptive control under hardware constraints. However, on-orbit validation aboard the International Space Station (ISS) reveals performance degradation caused by inference latency mismatched with the rapid thermal cycles characteristic of Low Earth Orbit (LEO) satellites. These results highlight both the opportunities and current limitations of agentic LLM-based systems in real flight environments, providing practical design guidelines for future space autonomy.
Authors:Eric Guiffo Kaigom
Abstract:
Robots are unrelentingly used to achieve operational efficiency in Industry 4.0 along with symbiotic and sustainable assistance for the work-force in Industry 5.0. As resilience, robustness, and well-being are required in anti-fragile manufacturing and human-centric societal tasks, autonomous anticipation of and adaptation to thermal saturation and burns due to motor overheating become instrumental for human safety and robot availability. Robots are thereby expected to self-sustain their performance and deliver user experience, in addition to communicating their capability to other agents in advance to ensure fully automated thermally feasible tasks, and prolong their lifetime without human intervention. However, the traditional robot shutdown, when facing an imminent thermal saturation, inhibits productivity in factories and comfort in society, while cooling strategies are hard to implement after the robot acquisition. In this work, smart digital twins endowed with generative AI, i.e., variational autoencoders, are leveraged to manage thermally anomalous robot states and generate uncritical ones. The notion of thermal difficulty is derived from the reconstruction error of variational autoencoders. A robot can use this score to predict, anticipate, and share the thermal feasibility of desired motion profiles to meet requirements from emerging applications in Industry 6.0 and Society 6.0.
Authors:Aryan Gupta
Abstract:
Accurate and fast thermophysical models are needed to embed vapor-liquid equilibrium (VLE) calculations in design, optimization, and control loops for cryogenic mixtures. This study asks whether a structure-aware graph neural network (GNN; DimeNet++) trained on GERG-2008/CoolProp data can act as a practical surrogate for an equation of state (EoS). We generate a ternary dataset over 90-200 K and pressures to 100 bar, curate it with a 15% density filter (reducing 5,200 states to 1,516), and pair each state with a lightweight molecular-dynamics snapshot to supply structural features. The model is trained in two stages (pretraining on residual Helmholtz energy, followed by pressure fine-tuning with a stability penalty) and evaluated via single-phase interpolation tests, solver-free derivative-quality diagnostics, an audited VLE driver, and a latency benchmark. Within its regime, the GNN interpolates single-phase properties reasonably well; however, the VLE driver accepts no GNN equilibria on tested binaries (all plotted VLE points are CoolProp fallback or the solver fails), and diagnostic probes reveal jagged P(V|T) paths and thermal-stability flags concentrated in dense/cold regions, indicating insufficient derivative smoothness/consistency for robust equilibrium solving. An end-to-end timing comparison shows no single-phase speed advantage relative to CoolProp (tens of milliseconds vs sub-millisecond). We conclude that, as configured, the surrogate in this study is not solver-ready for VLE and offers no runtime benefit; its value is methodological, delineating failure modes and pointing to remedies such as physics-informed training signals and targeted coverage near phase boundaries.
Authors:Seyd Teymoor Seydi
Abstract:
Accurate and timely mapping of burned areas is crucial for environmental monitoring, disaster management, and assessment of climate change. This study presents a novel approach to automated burned area mapping using the AlphaEarth dataset combined with the Siamese U-Net deep learning architecture. The AlphaEarth dataset, comprising high-resolution optical and thermal infrared imagery with comprehensive ground-truth annotations, provides an unprecedented resource for training robust burned area detection models. We trained our model with the Monitoring Trends in Burn Severity (MTBS) dataset in the contiguous US and evaluated it on 17 regions across Europe. Our experimental results demonstrate that the proposed ensemble approach achieves superior performance with an overall accuracy of 95%, IoU of 0.6, and F1-score of 74% on the test dataset. The model successfully identifies burned areas across diverse ecosystems with complex backgrounds, showing particular strength in detecting partially burned vegetation and fire boundaries, as well as transferability and strong generalization in burned area mapping. This research contributes to the advancement of automated fire damage assessment and provides a scalable solution for global burn area monitoring using the AlphaEarth dataset.
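The IoU and F1 figures quoted above are related quantities. The following is a minimal sketch, assuming both are computed for the burned class from pixel-level true positives, false positives, and false negatives; the counts are hypothetical and are not taken from the paper. For a single binary mask, F1 (the Dice score) is a fixed function of IoU, so an IoU near 0.6 corresponds to an F1 near 0.75. Metrics averaged over many regions need not satisfy this identity exactly.

```python
from typing import Tuple


def iou_and_f1(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (IoU, F1) for the positive class from pixel counts."""
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


def f1_from_iou(iou: float) -> float:
    """For one binary mask, F1 (Dice) follows directly from IoU."""
    return 2 * iou / (1 + iou)


# Hypothetical pixel counts chosen only to illustrate the relationship.
iou, f1 = iou_and_f1(tp=600, fp=200, fn=200)
print(f"IoU = {iou:.3f}, F1 = {f1:.3f}")
print(f"F1 implied by IoU = 0.6: {f1_from_iou(0.6):.3f}")
```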
Authors:Ioannis Krikidis
Abstract:
This letter investigates a novel wireless-powered quantum optical communication system, in which a batteryless quantum transmitter harvests energy from a classical radio-frequency source to transmit quantum coherent states. The transmission employs M-ary phase shift keying (M-PSK) modulation over an optical channel impaired by thermal noise, and the fundamental detection performance is evaluated using the Helstrom bound. An optimization framework is proposed that jointly determines the optimal quantum measurement and the energy-harvesting time fraction to maximize the effective rate under a block time constraint. Analytical expressions are derived for special cases, while semidefinite programming techniques are employed for the general M-PSK scenario. Numerical results validate the unimodal nature of the effective rate function and demonstrate the impact of the optimal design parameters.
Authors:Serra Aksoy
Abstract:
Artificial intelligence deployment for automated photovoltaic (PV) monitoring faces interpretability barriers that limit adoption in energy infrastructure applications. While deep learning achieves high accuracy in thermal fault detection, validation that model decisions align with thermal physics principles remains lacking, creating deployment hesitancy where understanding model reasoning is critical. This study provides a systematic comparison of convolutional neural networks (ResNet-18, EfficientNet-B0) and vision transformers (ViT-Tiny, Swin-Tiny) for thermal PV fault detection, using XRAI saliency analysis to assess alignment with thermal physics principles. This represents the first systematic comparison of CNNs and vision transformers for thermal PV fault detection with physics-validated interpretability. Evaluation on 20,000 infrared images spanning normal operation and 11 fault categories shows that Swin Transformer achieves the highest performance (94% binary accuracy; 73% multiclass accuracy) compared to CNN approaches. XRAI analysis reveals that models learn physically meaningful features, such as localized hotspots for cell defects, linear thermal paths for diode failures, and thermal boundaries for vegetation shading, consistent with expected thermal signatures. However, performance varies significantly across fault types: electrical faults achieve strong detection (F1-scores >0.90) while environmental factors like soiling remain challenging (F1-scores 0.20-0.33), indicating limitations imposed by thermal imaging resolution. The thermal physics-guided interpretability approach provides methodology for validating AI decision-making in energy monitoring applications, addressing deployment barriers in renewable energy infrastructure.
Authors:Mahmoud Dhimish
Abstract:
Thermal anomaly detection in solar photovoltaic (PV) systems is essential for ensuring operational efficiency and reducing maintenance costs. In this study, we developed and named HOTSPOT-YOLO, a lightweight artificial intelligence (AI) model that integrates an efficient convolutional neural network backbone and attention mechanisms to improve object detection. This model is specifically designed for drone-based thermal inspections of PV systems, addressing the unique challenges of detecting small and subtle thermal anomalies, such as hotspots and defective modules, while maintaining real-time performance. Experimental results demonstrate a mean average precision of 90.8%, reflecting a significant improvement over baseline object detection models. With a reduced computational load and robustness under diverse environmental conditions, HOTSPOT-YOLO offers a scalable and reliable solution for large-scale PV inspections. This work highlights the integration of advanced AI techniques with practical engineering applications, revolutionizing automated fault detection in renewable energy systems.
Authors:Theodore V. Gortsas
Abstract:
The Fisher-KPP partial differential equation has been employed in science to model various biological, chemical, and thermal phenomena. Time fractional extensions of Fisher's equation have also appeared in the literature, aiming to model systems with memory. The solution of the time fractional Fisher-KPP equation is challenging due to the interplay between the nonlinearity and the nonlocality imposed by the fractional derivatives. An accurate method for the solution of time fractional diffusion problems is the Boundary Element Method (BEM). The conventional BEM has a high computational cost and memory requirements since it leads to dense coefficient matrices. For nonlinear transient problems, its efficiency is further reduced due to the appearance of volume integrals. In the present work an extension of the recently proposed Local Domain Boundary Element Method (LD-BEM) is presented for the solution of nonlinear time fractional Fisher-KPP problems. The implemented numerical method is used to examine various two-dimensional problems related to the Fisher-KPP equation using different definitions of the fractional derivative.
Authors:Orhan Gazi
Abstract:
In this paper we investigate the effects of the thermal noise of the base resistance of a common emitter amplifier (CEA) on the output SNR, and we show that a first-order Butterworth filter at the output of the CEA significantly improves the output SNR and surpasses the performance of higher-order Butterworth, Chebyshev I, Chebyshev II, and elliptic filters. We propose a formula for selecting the cut-off frequency of analog filters of a given order to achieve significant SNR improvement at the CEA output. Considering the filter complexity and the output SNR improvement, we conclude that the first-order Butterworth filter outperforms Chebyshev I, Chebyshev II, and elliptic filters.
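As background for the noise argument, the sketch below computes the Johnson-Nyquist thermal noise voltage of a resistor, sqrt(4 k T R B), and the equivalent noise bandwidth of an ideal Butterworth low-pass filter. It is a simplified illustration only: the resistance, temperature, and bandwidths are hypothetical, the noise is treated as white, the signal is assumed to lie entirely in the passband, and nothing here reproduces the paper's cut-off selection formula or its filter comparison.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def thermal_noise_vrms(r_ohm: float, temp_k: float, bandwidth_hz: float) -> float:
    """RMS Johnson-Nyquist noise voltage of a resistor over a noise bandwidth."""
    return math.sqrt(4 * K_B * temp_k * r_ohm * bandwidth_hz)


def butterworth_enbw(f_c: float, order: int) -> float:
    """Equivalent noise bandwidth of an ideal Butterworth low-pass filter."""
    x = math.pi / (2 * order)
    return f_c * x / math.sin(x)


def snr_gain_db(b_unfiltered: float, b_filtered: float) -> float:
    """SNR gain from narrowing the noise bandwidth, signal power unchanged."""
    return 10 * math.log10(b_unfiltered / b_filtered)


# Hypothetical values: 100 ohm base resistance, 300 K, 1 MHz unfiltered
# noise bandwidth, first-order filter with a 10 kHz cut-off.
r_b, temp, b_wide, f_c = 100.0, 300.0, 1e6, 10e3
b_narrow = butterworth_enbw(f_c, order=1)
print(thermal_noise_vrms(r_b, temp, b_wide))
print(thermal_noise_vrms(r_b, temp, b_narrow))
print(snr_gain_db(b_wide, b_narrow))
```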
Authors:Sebastian Barros Elgueta
Abstract:
In 2023, satellite and mobile networks crossed a historic threshold: standard smartphones, using unmodified 3GPP protocols, connected directly to low Earth orbit (LEO) satellites. This first wave of direct-to-device (D2D) demonstrations validated the physical feasibility of satellite-based mobile access. However, these systems remain fallback-grade: rural-only, bandwidth-limited, and fully dependent on Earth-based mobile cores for identity, session, and policy control. This paper asks a more ambitious question: Can a complete mobile network, including radio access, core functions, traffic routing, and content delivery, operate entirely from orbit? And can it deliver sustained, urban-grade service in the world's densest cities? We present the first end-to-end system architecture for a fully orbital telco, integrating electronically steered phased arrays with 1000-beam capacity, space-based deployment of 5G core functions (UPF, AMF), and inter-satellite laser mesh backhaul. We analyze spectral efficiency, beam capacity, and link budgets under dense urban conditions, accounting for path loss, Doppler, and multipath. Simulations show that rooftop and line-of-sight users can sustain 64-QAM throughput, while street-level access is feasible with relay or assisted beam modes. The paper outlines the remaining constraints (power, thermal dissipation, compute radiation hardening, and regulatory models) and demonstrates that these are engineering bottlenecks, not physical limits. Finally, we propose a staged 15-year roadmap from today's fallback D2D systems to autonomous orbital overlays delivering 50-100 Mbps to handhelds in megacities, with zero reliance on terrestrial infrastructure.
Authors:Eldar Knar
Abstract:
Amid accelerated digitalization, not only is the scale of data processing and storage increasing, but so too is the associated infrastructure load on the climate. Current climate models and environmental protocols almost entirely overlook the impact of information and communication technologies on the thermal and energy balance of the biosphere.
This paper proposes the theory of information and climate feedback (ICF) as a new nonlinear model describing the loop of digitalization, energy consumption, the thermal footprint, the climatic response, and the vulnerability of digital infrastructure. The system is formalized via differential equations with delays and parameters of sensitivity, greenness, and phase stability.
A multiscenario numerical analysis, phase reconstructions, and thermal cartography were conducted. Critical regimes, including digital overheating, fluctuational instability, and infrastructural collapse in the absence of adaptive measures, were identified.
The paper concludes with the proposal of an international agreement titled the Green Digital Accord and a set of metrics for sustainable digitalization. This work integrates climatology, information technologies, and the political economy of sustainability.
Authors:Roy Elkayam
Abstract:
Accurate prediction of effluent temperature in recharge basins is essential for optimizing the Soil Aquifer Treatment (SAT) process, as temperature directly influences water viscosity and infiltration rates. This study develops and evaluates predictive models for effluent temperature in the upper recharge layer of a Shafdan SAT system recharge basin using ambient meteorological data. Multiple linear regression (MLR), neural networks (NN), and random forests (RF) were tested for their predictive accuracy and interpretability. The MLR model, preferred for its operational simplicity and robust performance, achieved high predictive accuracy (R2 = 0.86-0.87) and was used to estimate effluent temperatures over a 10-year period. Results highlight pronounced seasonal temperature cycles and the importance of topsoil temperature in governing the thermal profile of the infiltrating effluent. The study provides practical equations for real-time monitoring and long-term planning of SAT operations.
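To make the multiple linear regression step concrete, here is a minimal sketch that fits an ordinary least-squares model and reports R². The predictors (air temperature and topsoil temperature), the coefficients, and the synthetic data are all assumptions made for illustration; they are not the paper's variables, equations, or measurements.

```python
import numpy as np


def fit_mlr(features: np.ndarray, target: np.ndarray):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    X = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((target - target.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot


# Synthetic one-year daily series (hypothetical, for illustration only).
rng = np.random.default_rng(0)
days = np.arange(365)
air_temp = 20 + 8 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 1, days.size)
topsoil_temp = air_temp + 2 + rng.normal(0, 0.5, days.size)
effluent_temp = 3.0 + 0.3 * air_temp + 0.6 * topsoil_temp + rng.normal(0, 0.5, days.size)

coef, r2 = fit_mlr(np.column_stack([air_temp, topsoil_temp]), effluent_temp)
print("intercept, air, topsoil coefficients:", coef)
print("R^2:", r2)
```

The fitted coefficients give a closed-form prediction equation of the kind the abstract calls practical for real-time monitoring.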
Authors:Ali Peivandizadeh
Abstract:
The explosive growth of artificial intelligence has created gigawatt-scale data centers that fundamentally challenge power system operation, exhibiting power fluctuations exceeding 500 MW within seconds and millisecond-scale variations of 50-75% of thermal design power. This paper presents a comprehensive theoretical framework that reconceptualizes Virtual Power Plants (VPPs) to accommodate these extreme dynamics through a four-layer hierarchical control architecture operating across timescales from 100 microseconds to 24 hours.
We develop control mechanisms and stability criteria specifically tailored to converter-dominated systems with pulsing megawatt-scale loads. We prove that traditional VPP architectures, designed for aggregating distributed resources with response times of seconds to minutes, cannot maintain stability when confronted with AI data center dynamics exhibiting slew rates exceeding 1,000 MW/s at gigawatt scale.
Our framework introduces: (1) a sub-millisecond control layer that interfaces with data center power electronics to actively dampen power oscillations; (2) new stability criteria incorporating protection system dynamics, demonstrating that critical clearing times reduce from 150 ms to 83 ms for gigawatt-scale pulsing loads; and (3) quantified flexibility characterization showing that workload deferability enables 30% peak reduction while maintaining AI service availability above 99.95%.
This work establishes the mathematical foundations necessary for the stable integration of AI infrastructure that will constitute 50-70% of data center electricity consumption by 2030.
Authors:Shahbaz Hussain
Abstract:
The economic dispatch of generators is a major concern in thermal power plants that governs the share of each generating unit with an objective of minimizing fuel cost by fulfilling load demand. This problem is not as simple as it looks because of system constraints that cannot be neglected practically. Moreover, increased awareness of clean technology imposes another important limit on the emission of pollutants obtained from burning of fossil fuels. Classical optimization methods lack the ability of solving such a complex and multi-objective problem. Hence, various modern artificial intelligence (AI) techniques based on evolution and social behaviour of organisms are being used to solve such problems because they are easier to implement, give accurate results and take less computational time. In this work, a study is done on most of the contemporary basic AI techniques being used in literature for power systems in general and combined economic emission dispatch (CEED) in particular. The dispatch problem is implemented on IEEE 30-bus benchmarked system in MATLAB for different load demands considering all gases (COX, NOX and SOX) using particle swarm optimization (PSO) and genetic algorithm (GA) and their results are compared with each other.
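The sketch below illustrates particle swarm optimization on a deliberately small dispatch problem: two generating units with quadratic fuel-cost curves and a fixed demand. It is written in Python rather than MATLAB, the cost coefficients and limits are hypothetical, and it covers only the fuel-cost term, not emissions or the IEEE 30-bus system. The demand constraint is enforced by construction (the second unit supplies whatever the first does not), so the swarm searches over a single variable and the result can be checked against the equal-incremental-cost solution.

```python
import random

# Hypothetical quadratic cost coefficients (a, b, c): cost = a + b*P + c*P^2
UNIT1 = (100.0, 2.0, 0.01)
UNIT2 = (120.0, 1.5, 0.02)


def unit_cost(coeffs, p):
    a, b, c = coeffs
    return a + b * p + c * p * p


def total_cost(p1, demand):
    """Fuel cost when unit 1 supplies p1 and unit 2 supplies the remainder."""
    return unit_cost(UNIT1, p1) + unit_cost(UNIT2, demand - p1)


def analytic_p1(demand):
    """Unconstrained optimum from equal incremental costs."""
    (_, b1, c1), (_, b2, c2) = UNIT1, UNIT2
    return (b2 - b1 + 2 * c2 * demand) / (2 * (c1 + c2))


def pso_dispatch(demand, lo, hi, n_particles=20, n_iter=200, seed=0):
    """Basic global-best PSO over p1 within [lo, hi]."""
    rng = random.Random(seed)
    inertia, c_cog, c_soc = 0.7, 1.5, 1.5
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]
    best_cost = [total_cost(p, demand) for p in pos]
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g], best_cost[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (inertia * vel[i]
                      + c_cog * r1 * (best_pos[i] - pos[i])
                      + c_soc * r2 * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # clamp to unit limits
            cost = total_cost(pos[i], demand)
            if cost < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i], cost
                if cost < g_cost:
                    g_pos, g_cost = pos[i], cost
    return g_pos, demand - g_pos, g_cost


p1, p2, cost = pso_dispatch(demand=300.0, lo=50.0, hi=250.0)
print(p1, p2, cost, analytic_p1(300.0))
```

A combined economic emission dispatch would add an emission term, typically weighted by a price penalty factor, to `total_cost`; that extension is not shown here.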
Authors:Atahan Karagoz
Abstract:
This paper introduces Energentic Intelligence, a class of autonomous systems defined not by task performance, but by their capacity to sustain themselves through internal energy regulation. Departing from conventional reward-driven paradigms, these agents treat survival (maintaining functional operation under fluctuating energetic and thermal conditions) as the central objective. We formalize this principle through an energy-based utility function and a viability-constrained survival horizon, and propose a modular architecture that integrates energy harvesting, thermal regulation, and adaptive computation into a closed-loop control system. A simulated environment demonstrates the emergence of stable, resource-aware behavior without external supervision. Together, these contributions provide a theoretical and architectural foundation for deploying autonomous agents in resource-volatile settings where persistence must be self-regulated and infrastructure cannot be assumed.
Authors:Panagiotis Kakosimos
Abstract:
The electrification of powertrains is rising as the objective for a more viable future is intensified. To ensure continuous and reliable operation without undesirable malfunctions, it is essential to monitor the internal temperatures of machines and keep them within safe operating limits. Conventional modeling methods can be complex and usually require expert knowledge. With the amount of data collected these days, it is possible to use information models to assess thermal behaviors. This paper investigates artificial intelligence techniques for monitoring the cooling efficiency of induction machines. Experimental data was collected under specific operating conditions, and three machine-learning models have been developed. The optimal configuration for each approach was determined through rigorous hyperparameter searches, and the models were evaluated using a variety of metrics. The three solutions performed well in monitoring the condition of the machine even under transient operation, highlighting the potential of data-driven methods in improving the thermal management.
Authors:John J. Bird
Abstract:
Long range flight by fixed-wing aircraft without propulsion systems can be accomplished by "soaring", that is, exploiting randomly located updrafts to gain altitude which is expended in gliding flight. As the location of updrafts is uncertain and cannot be determined except through in situ observation, aircraft exploiting this energy source are at risk of failing to find a subsequent updraft. Determining when an updraft must be exploited to continue flight is essential to managing risk and optimizing speed. Graph percolation offers a theoretical explanation for this risk, and a framework for evaluating it using information available to the operator of a soaring aircraft in flight. The utility of graph percolation as a risk measure is examined by analyzing flight logs from human soaring pilots. This analysis indicates that in sport soaring, pilots rarely operate in a condition which does not satisfy graph percolation, identifies an apparent desired minimum node degree, and shows that pilots accept reduced climb rates in order to maintain percolation.
Authors:Tomáš Roubíček
Abstract:
The fully-implicit time discretization (i.e. the backward Euler formula) is applied to compressible nonlinear dynamical models of thermo-viscoelastic solids in the Eulerian description, i.e. in the actual deforming configuration, formulated in terms of rates. The Kelvin-Voigt rheology or also, in the deviatoric part, the Jeffreys rheology (covering creep or plasticity) are considered, using the additive Green-Naghdi's decomposition of total strain into the elastic and the inelastic strains formulated in terms of (objective) rates exploiting the Zaremba-Jaumann time derivative. A linearized convective model at large displacements is considered, focusing on the case where the internal energy additively splits the (convex) mechanical and the thermal parts. The time-discrete suitably regularized scheme is devised. The numerical stability and, considering the multipolar 2nd-grade viscosity, also convergence towards weak solutions are proved, exploiting the convexity of the kinetic energy when written in terms of linear momentum instead of velocity and estimating the temperature gradient from the entropy-like inequality.
Authors:Hassan Irshad Bhatti
Abstract:
Precise temperature measurement at micro/nanoscale is crucial across various domains including physical sciences, chemical processes, industrial production, medical diagnosis, weather forecasting, electronics, and biology. Micro/nanoscale thermal mapping requires precise techniques such as thermocouples, resistance-based devices, infrared thermography, optical interferometry, Raman thermometry, and the time-domain thermoreflectance (TDTR) method. Each method has its advantages and limitations, emphasizing the importance of selecting the appropriate technique. Among these methods, micro-thin film thermocouples (TFTCs) offer a compelling solution due to their direct contact-based temperature measurements, minimal surface preparation requirements, lower cost, and robustness against environmental factors. Thermocouples work on the well-established Seebeck effect, where a voltage is generated proportional to the temperature difference between two points. However, at micro/nanoscale, the Seebeck coefficients of thermocouples differ from those in bulk materials, requiring experimental calibration for precise measurements. To address this, we introduce an on-chip characterization platform with a differential temperature measurement setup on a borosilicate glass substrate. This platform utilizes a microheater as a localized heat source to elevate the temperature at the hot junction of the TFTC while maintaining the cold junction at ambient conditions. Numerical simulations are employed to engineer both the microheater and TFTC junction for precise temperature control. The functionality of this platform is validated by fabricating TFTCs using standard fabrication processes and measuring the TFTC response to determine the differential Seebeck coefficient of a Platinum-Chromium TFTC junction. The calculated sensitivity of Pt/Cr TFTCs using this calibration method is 19.23 ± 0.405 μV/C.
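The calibration idea can be sketched as a straight-line fit: under the Seebeck relation V = S ΔT, the differential Seebeck coefficient is the slope of measured voltage against the hot-to-cold junction temperature difference. The data below are synthetic, generated around the sensitivity quoted in the abstract with an assumed noise level; the function name and the measurement grid are hypothetical and do not represent the authors' procedure.

```python
import numpy as np


def fit_seebeck(delta_t_c: np.ndarray, voltage_uv: np.ndarray):
    """Least-squares line through (delta T, V); returns (slope in uV/C, offset in uV)."""
    slope, offset = np.polyfit(delta_t_c, voltage_uv, 1)
    return float(slope), float(offset)


# Synthetic calibration sweep: the temperature steps and noise are assumptions.
rng = np.random.default_rng(1)
delta_t = np.linspace(0.0, 50.0, 11)            # hot minus cold junction, in C
true_sensitivity = 19.23                        # uV/C, value quoted in the abstract
voltage = true_sensitivity * delta_t + rng.normal(0.0, 1.0, delta_t.size)

sensitivity, offset = fit_seebeck(delta_t, voltage)
print(f"fitted sensitivity: {sensitivity:.2f} uV/C, offset: {offset:.2f} uV")
```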
Authors:Alexei V. Tkachenko
Abstract:
While Landauer's principle sets a fundamental energy limit for irreversible digital computation, we show that Deep Neural Networks (DNNs) implemented on analog physical substrates can operate under markedly different thermodynamic constraints. We distinguish between two classes of analog systems: dynamic and quasi-static. In dynamic systems, energy dissipation arises from neuron resets, with a lower bound governed by Landauer's principle. To analyse a quasi-static analog platform, we construct an explicit mapping of a generic feedforward DNN onto a physical system described by a model Hamiltonian. In this framework, inference can proceed reversibly, with no minimum free energy cost imposed by thermodynamics. We further analyze the training process in quasi-static analog networks and derive a fundamental lower bound on its energy cost, rooted in the interplay between thermal and statistical noise. Our results suggest that while analog implementations can outperform digital ones during inference, the thermodynamic cost of training scales similarly in both paradigms.
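For scale, Landauer's principle bounds the energy dissipated per irreversibly erased bit by k_B T ln 2. The short sketch below evaluates that bound; the temperature of 300 K is an illustrative choice, not a value from the paper.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_limit_joules(temp_k: float, bits: float = 1.0) -> float:
    """Minimum energy dissipated when erasing `bits` bits at temperature temp_k."""
    return bits * K_B * temp_k * math.log(2)


# Room temperature is an illustrative assumption.
print(landauer_limit_joules(300.0))        # energy per bit, in joules
print(landauer_limit_joules(300.0, 1e9))   # energy for 1e9 erased bits
```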
Authors:Fabien Casenave
Abstract:
In an industrial group like Safran, numerical simulations of physical phenomena are integral to most design processes. At Safran's corporate research center, we enhance these processes by developing fast and reliable surrogate models for various physics. We focus here on two technologies developed in recent years. The first is a physical reduced-order modeling method for non-linear structural mechanics and thermal analysis, used for calculating the lifespan of high-pressure turbine blades and performing heat analysis of high-pressure compressors. The second technology involves learning physics simulations with non-parameterized geometrical variability using classical machine learning tools, such as Gaussian process regression. Finally, we present our contributions to the open-source and open-data community.
Authors:Karthik Reddy Lyathakula
Abstract:
Estimating the material properties of thermal protection films is crucial for their effective design and application, particularly in high-temperature environments. This work presents a novel approach to determine the properties using uncertainty quantification simulations. We quantify uncertainty in the material properties for effective insulation by proposing a Bayesian distribution for them. Sampling from this distribution is performed using Monte Carlo simulations, which require repeatedly solving the predictive thermal model. To address the computational inefficiency of conventional numerical simulations, we develop a parametric Physics-Informed Neural Network (PINN) to solve the heat transfer problem. The proposed PINN significantly reduces computational time while maintaining accuracy, as verified against traditional numerical solutions. Additionally, we used the Sequential Monte Carlo (SMC) method to enable vectorized and parallel computations, further enhancing computational speedup. Our results demonstrate that integrating MCMC with PINN decreases computational time substantially compared to using standard numerical methods. Moreover, combining the SMC method with PINN yields multifold computational speedup, making this approach highly effective for the rapid and accurate estimation of material properties.
Authors:Qasim Khan
Abstract:
Nonlinear thermoelastic systems play a crucial role in understanding thermal conductivity, stresses, elasticity, and temperature interactions. This research focuses on finding solutions to these systems in their fractional forms, which is a significant aspect of the study. We consider various proposed models related to fractional thermoelasticity and derive results through sophisticated methodologies. Numerical simulations are conducted for both fractional and integer order thermoelastic coupled systems, with results presented in tables and graphs. The graphs indicate a close correspondence between the approximate and exact solutions. The solutions obtained demonstrate convergence for both fractional and integer order problems, ensuring accurate modeling. Furthermore, the tables confirm that greater accuracy can be achieved by increasing the number of terms in the series of solutions.
Authors:David J Poland
Abstract:
This paper presents a novel predictive maintenance framework centered on Enhanced Quantile Regression Neural Networks (EQRNNs) for anticipating system failures in industrial robotics. We address the challenge of early failure detection through a hybrid approach that combines advanced neural architectures. The system leverages dual computational stages: first implementing an EQRNN optimized for processing multi-sensor data streams including vibration, thermal, and power signatures, followed by an integrated Spiking Neural Network (SNN) layer that enables microsecond-level response times. This architecture achieves notable accuracy rates of 92.3% in component failure prediction with a 90-hour advance warning window. Field testing conducted on an industrial scale with 50 robotic systems demonstrates significant operational improvements, yielding a 94% decrease in unexpected system failures and 76% reduction in maintenance-related downtimes. The framework's effectiveness in processing complex, multi-modal sensor data while maintaining computational efficiency validates its applicability for Industry 4.0 manufacturing environments.
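Quantile regression networks are commonly trained with the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically. The abstract does not state which loss the EQRNN uses, so the following is a generic sketch of the standard loss rather than the authors' objective; the sample values are hypothetical.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau: float) -> float:
    """Mean pinball loss for quantile level tau in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction (diff > 0) costs tau per unit,
    # over-prediction (diff < 0) costs (1 - tau) per unit.
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


# Hypothetical remaining-useful-life values in hours.
observed = [90.0, 120.0, 60.0]
predicted = [100.0, 100.0, 100.0]
print(pinball_loss(observed, predicted, tau=0.1))
print(pinball_loss(observed, predicted, tau=0.9))
```

Choosing a low quantile level makes over-prediction expensive, which biases a remaining-life estimate toward the conservative side.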
Authors:Dev Shah
Abstract:
Touch is one of the most intuitive ways for humans to interact with the world, and as we advance toward a ubiquitous computing environment where technology seamlessly integrates into daily life, natural interaction methods are essential. This paper introduces UbiTouch, a system leveraging thermal imaging to detect touch interactions on arbitrary surfaces. By employing a single thermal camera, UbiTouch differentiates between hovering and touch, detects multi-finger input, and completes trajectory tracking. Our approach emphasizes the use of lightweight, low-computation algorithms that maintain robust detection accuracy through innovative vision-based processing. UbiTouch aims to enable scalable, sustainable, and adaptable interaction systems for diverse applications, particularly with regards to on-human sensing.
Authors:Osama A. Marzouk
Abstract:
This study explores the suitability of hydrogen-based plasma in direct power extraction (DPE) as a non-conventional electricity generation method. We apply computational modeling and principles in physics and chemistry to estimate different thermal and electric properties of a water-vapor/nitrogen/cesium-vapor (H2O/N2/Cs) gas mixture with different levels of cesium (Cs) at a fixed temperature of 2300 K (2026.85 °C). This gas mixture and temperature are selected because they resemble the stoichiometric combustion of hydrogen with air, followed by the addition of the alkali metal element cesium to allow ionization, thus converting the gas mixture into electrically conducting plasma. We vary the cesium mole fraction in the gas mixture by two orders of magnitude, from a minute amount of 0.0625% (1/1600) to a major amount of 16% (0.16). We use these results to further estimate the theoretical upper limit of the electric power output from a unit volume of a high-speed magnetohydrodynamic (MHD) channel, with the plasma accelerated inside it to twice the local speed of sound (Mach number 2) while subject to an applied magnetic field of 5 T (5 teslas). We report that there is an optimum cesium mole fraction of 3%, at which the power output is maximized. Per 1 m3 of plasma volume, the estimated theoretical electric power generation at 1 atm (101.325 kPa) pressure of the hydrogen-combustion mixture is extraordinarily high at 360 MW/m3, and the plasma electric conductivity is 17.5 S/m. This estimated power generation even reaches an impressive level of 1.15 GW/m3 (1150 MW/m3) if the absolute pressure can be decreased to 0.0625 atm (6.333 kPa), at which the electric conductivity exceeds 55 S/m (more than 10 times the electric conductivity of seawater).
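A rough cross-check of these figures is possible with the textbook expression for the power density of an ideal Faraday-type MHD generator, P = σ u² B² K (1 − K), which peaks at σ u² B² / 4 for a load factor K of 0.5. Whether the paper uses exactly this expression is an assumption; the sketch simply backs out the plasma speed implied by the quoted conductivity, field, and power density at 1 atm.

```python
import math


def mhd_power_density(sigma: float, u: float, b: float, load_factor: float = 0.5) -> float:
    """Ideal Faraday generator power density in W/m^3 (sigma in S/m, u in m/s, b in T)."""
    return sigma * u ** 2 * b ** 2 * load_factor * (1.0 - load_factor)


def implied_velocity(power_density: float, sigma: float, b: float,
                     load_factor: float = 0.5) -> float:
    """Plasma speed in m/s that would yield the given power density."""
    return math.sqrt(power_density / (sigma * b ** 2 * load_factor * (1.0 - load_factor)))


# Values quoted in the abstract for the 1 atm case.
sigma_1atm = 17.5        # S/m
b_field = 5.0            # T
p_1atm = 360e6           # W/m^3

u = implied_velocity(p_1atm, sigma_1atm, b_field)
print(f"implied plasma speed: {u:.0f} m/s")
print(f"round-trip power density: {mhd_power_density(sigma_1atm, u, b_field) / 1e6:.0f} MW/m^3")
```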
Authors:Ranran Yang
Abstract:
In the context of high fossil fuel consumption and inefficiency within China's energy systems, effective demand-side management is essential. This study examines the thermal characteristics of various building types across different functional areas, utilizing the concept of body coefficient to integrate their unique structural and energy use traits into a demand response framework supported by real-time pricing. We developed a Stackelberg game-based bi-level optimization model that captures the dynamic interplay of costs and benefits between integrated energy providers and users. This model is formulated into a Mixed Integer Linear Programming (MILP) problem using Karush-Kuhn-Tucker (KKT) conditions and linearized with the Big M method, subsequently solved using MATLAB and CPLEX. This approach enables distinctive management of heating loads in public and residential areas, optimizing energy efficiency while balancing the interests of both providers and users. Furthermore, the study explores how the proportion of different area types affects the potential for reducing heat loads, providing insights into the scalability and effectiveness of demand response strategies in integrated energy systems. This analysis not only highlights the economic benefits of such strategies but also their potential in reducing dependency on traditional energy sources, thus contributing to more sustainable energy system practices.
Authors:Ali Safa
Abstract:
This letter provides what is, to the best of our knowledge, a first study on the applicability of ultra-low-resolution thermal cameras for providing rotational odometry measurements to navigational devices such as rovers and drones. Our use of an ultra-low-resolution thermal camera instead of other modalities such as an RGB camera is motivated by its robustness to lighting conditions, while being one order of magnitude less cost-expensive compared to higher-resolution thermal cameras. After setting up a custom data acquisition system and acquiring thermal camera data together with its associated rotational speed label, we train a small 4-layer Convolutional Neural Network (CNN) for regressing the rotational speed from the thermal data. Experiments and ablation studies are conducted for determining the impact of thermal camera resolution and the number of successive frames on the CNN estimation precision. Finally, our novel dataset for the study of low-resolution thermal odometry is openly released with the hope of benefiting future research.
Authors:Saurabh Dixit
Abstract:
A comprehensive 3-D finite element formulation for the coupled thermoelastic system is proposed based on the Total Lagrangian framework to study the thermoelastic damping (TED) in small scale structures. The proposed formulation takes into account geometric nonlinearity because of large deformation and material nonlinearity where material parameters are functions of temperature and strain field. Using the proposed finite element formulation, the TED quality factor is obtained for a 1-D rod undergoing longitudinal vibrations using eigenvalue analysis. We first validate the accuracy of the finite element implementation with previously known theoretical and numerical results. Subsequently we demonstrate the utility of the proposed numerical framework to study the effect of geometric nonlinearity and of temperature- and strain-dependent material nonlinearity on the thermoelastic damping. In addition, the effect of internal/external heating and different thermal boundary conditions on TED is discussed.
Authors:Jakub Rydzewski
Abstract:
Understanding the behavior of complex molecular systems is a fundamental problem in physical chemistry. To describe the long-time dynamics of such systems, which is responsible for their most informative characteristics, we can identify a few slow collective variables (CVs) while treating the remaining fast variables as thermal noise. This enables us to simplify the dynamics and treat it as diffusion in a free-energy landscape spanned by slow CVs, effectively rendering the dynamics Markovian. Our recent statistical learning technique, spectral map [Rydzewski, J. Phys. Chem. Lett. 2023, 14, 22, 5216-5220], explores this strategy to learn slow CVs by maximizing a spectral gap of a transition matrix. In this work, we introduce several advancements into our framework, using a high-dimensional reversible folding process of a protein as an example. We implement an algorithm for coarse-graining Markov transition matrices to partition the reduced space of slow CVs kinetically and use it to define a transition state ensemble. We show that slow CVs learned by spectral map closely approach the Markovian limit for an overdamped diffusion. We demonstrate that coordinate-dependent diffusion coefficients only slightly affect the constructed free-energy landscapes. Finally, we present how spectral map can be used to quantify the importance of features and compare slow CVs with structural descriptors commonly used in protein folding. Overall, we demonstrate that a single slow CV learned by spectral map can be used as a physical reaction coordinate to capture essential characteristics of protein folding.