Generated: 2026-03-30 17:14:58 (UTC+8); arXiv release time: 2026-03-30 20:00 EDT (2026-03-31 08:00 UTC+8)

19 relevant papers today

Keyword: reinforcement learning

Empowering Epidemic Response: The Role of Reinforcement Learning in Infectious Disease Control

Chasing Autonomy: Dynamic Retargeting and Control Guided RL for Performant and Controllable Humanoid Running

Reinforcing Structured Chain-of-Thought for Video Understanding

Neuro-Cognitive Reward Modeling for Human-Centered Autonomous Vehicle Control

AutoB2G: A Large Language Model-Driven Agentic Framework For Automated Building-Grid Co-Simulation

Designing Fatigue-Aware VR Interfaces via Biomechanical Models

Hierarchical Control Framework Integrating LLMs with RL for Decarbonized HVAC Operation

Dynamic Tokenization via Reinforcement Patching: End-to-end Training and Zero-shot Transfer

Rethinking Recommendation Paradigms: From Pipelines to Agentic Recommender Systems

Beyond Where to Look: Trajectory-Guided Reinforcement Learning for Multimodal RLVR

Knowledge Distillation for Efficient Transformer-Based Reinforcement Learning in Hardware-Constrained Energy Management Systems

Topology-Aware Graph Reinforcement Learning for Energy Storage Systems Optimal Dispatch in Distribution Networks

Dynamic Token Compression for Efficient Video Understanding through Reinforcement Learning

120 Minutes and a Laptop: Minimalist Image-goal Navigation via Unsupervised Exploration and Offline RL

Automatic feature identification in least-squares policy iteration using the Koopman operator framework

CR-Eyes: A Computational Rational Model of Visual Sampling Behavior in Atari Games

Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling

VLA-OPD: Bridging Offline SFT and Online RL for Vision-Language-Action Models via On-Policy Distillation

Keyword: diffusion policy

DiffusionAnything: End-to-End In-context Diffusion Learning for Unified Navigation and Pre-Grasp Motion
