Pubs | Jiwen's Website

Feasible Constraint Policy Optimization for Safe Reinforcement Learning

We introduce Feasible Constraint Policy Optimization (FCPO), which combines penalty and trust-region methods to address policy feasibility while ensuring stability and performance.

PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing

We introduce PLM, a Peripheral Language Model developed through a co-design process that jointly optimizes model architecture and edge system constraints.