Restricted Research - Award List, Note/Discussion Page

Fiscal Year: 2023

2404  The University of Texas at San Antonio  (144292)

Principal Investigator: Cao, Yongcan

Total Amount of Contract, Award, or Gift (Annual before 2011): $ 130,000

Exceeds $250,000 (Is it flagged?): No

Start and End Dates: 7/15/22 - 7/14/26

Restricted Research: YES

Academic Discipline: none

Department, Center, School, or Institute: CTR EXCEL ENGR RESEARCH & EDUC

Title of Contract, Award, or Gift: A Systematic Study of Learning From Failure

Name of Granting or Contracting Agency/Entity: Office of Naval Research
CFDA Link: DOD 12.300

Program Title: none
CFDA Linked: Basic and Applied Scientific Research

Note:

SAMs 1.1.1 Developing robots that can learn from (limited) past experiences to gain high-level, human-like decision intelligence is essential to their application in complex environments, especially for modern DoD systems whose decision cycles are inevitably short. One important limitation of existing robotic systems is the lack of a “learning from failure” capability, a unique feature of humans’ high-level decision intelligence. One key challenge is the design of basic principles and algorithms that let robots learn from limited (yet valuable) failure experiences, which are largely ignored in current robotic decision-making strategies. Another key challenge is addressing data efficiency and robustness so that non-expert users can guide robotic decision learning with their (perhaps limited) domain knowledge. The goals of the project are to overcome these two challenges by developing a new “learning from failure” approach that provides theoretical and algorithmic solutions for reward and policy learning when the value of failure is explicitly harnessed, and then developing new data-efficient and noise-resilient reward and policy learning approaches. Specifically, the project will focus on four essential thrusts:

(1) Multi-Class Reward Learning: learn reward functions from multiple classes of experiences, including failure, success, and others;
(2) Failure-Guided Policy Learning: create new reinforcement learning algorithms that can learn control policies from failure;
(3) Data Efficiency and Robustness/Sensitivity: study the value of data and the effect of data quantity/quality on learning control policies from failure; and
(4) Testing and Evaluation: conduct case studies to verify and evaluate the proposed methods and algorithms in both simulated and real-world environments.

The success of the project is expected to fulfill the needs of the U.S. Navy by providing human-like, intelligent, and robust robotic systems that can learn from both success and failure. The novelty of this project is the synthesis of approaches from artificial intelligence, computational, learning, and decision/control sciences to create a new “learning from failure” approach that offers diversity, efficiency, robustness, and responsiveness.
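One way to picture the “multi-class reward learning” idea above is to fit a classifier on experiences labeled success versus failure and use its success log-odds as a learned reward signal. The sketch below is a minimal, hypothetical illustration under that assumption; the feature dimensions, toy data, and the `learned_reward` helper are invented for illustration and are not the project's actual method.

```python
# Hypothetical sketch: learn a reward proxy from success- and
# failure-labeled experiences (a stand-in for multi-class reward learning).
import numpy as np

rng = np.random.default_rng(0)

# Toy experience set: 2-D features per step; label 1 = success, 0 = failure.
X_success = rng.normal(loc=+1.0, scale=0.5, size=(50, 2))
X_failure = rng.normal(loc=-1.0, scale=0.5, size=(50, 2))
X = np.vstack([X_success, X_failure])
y = np.concatenate([np.ones(50), np.zeros(50)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit a logistic model by plain gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

def learned_reward(features):
    """Reward proxy: log-odds that the experience belongs to the success class."""
    return features @ w + b

# Success-like states should score higher than failure-like ones.
print(learned_reward(np.array([1.0, 1.0])) > learned_reward(np.array([-1.0, -1.0])))
```

A policy-learning stage (thrust 2) could then optimize against such a learned signal, which is why data quality and robustness (thrust 3) matter: a reward fit on few or noisy failure examples propagates its errors into the policy.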

Discussion: No discussion notes

 
