Intelligent inverse treatment planning via deep reinforcement learning, a proof-of-principle study in high dose-rate brachytherapy for cervical cancer.

In inverse treatment planning, the objective function and constraints consist of multiple terms designed for different clinical and practical considerations. Weighting factors for these terms are needed to define the optimization problem. While a treatment planning system can solve the optimization problem for a given set of weights, adjusting the weights to obtain high-quality plans is typically done by human planners. This weight-tuning task is labor-intensive and time-consuming, and it critically affects the final plan quality. An automatic weight-tuning approach is therefore highly desirable. The weight-tuning procedure for improving plan quality is essentially a decision-making problem. Motivated by the tremendous success of deep learning in decision making with human-level intelligence, we propose a novel framework to adjust the weights in a human-like manner. Using treatment planning in high-dose-rate brachytherapy for cervical cancer as an example, we develop a weight-tuning policy network (WTPN) that observes dose-volume histograms and outputs an action to adjust the weights, similar to the behavior of human planners. We train the WTPN via end-to-end deep reinforcement learning. Experience replay is performed with the epsilon-greedy algorithm. After training is completed, we apply the trained WTPN to guide treatment planning for five test patient cases. The trained WTPN successfully learns the treatment planning goals to guide the weight-tuning process. On average, the quality score of plans generated under the WTPN's guidance...
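The two training ingredients named above, epsilon-greedy action selection and experience replay, can be illustrated with a minimal Python sketch. The abstract does not give the action set, the network architecture, or the reward, so everything concrete here is a hypothetical assumption for illustration: the three weights, the scaling factor of 1.5, and the names `ACTIONS`, `epsilon_greedy`, `apply_action`, and `ReplayBuffer`. In the study, the action values would come from the WTPN evaluated on dose-volume histograms; here they are passed in as a plain list.

```python
import random
from collections import deque

# Hypothetical action set (assumption, not from the abstract): each action
# scales exactly one weight up or down by a fixed factor.
N_WEIGHTS = 3
ACTIONS = [(i, f) for i in range(N_WEIGHTS) for f in (1.5, 1 / 1.5)]


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def apply_action(weights, action_index):
    """Return a new weight list with one weight scaled by the chosen factor."""
    i, factor = ACTIONS[action_index]
    new_weights = list(weights)
    new_weights[i] *= factor
    return new_weights


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        # Sample without replacement; never ask for more than is stored.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)
```

In a training loop built on this sketch, each planning step would select an action with `epsilon_greedy`, update the weights with `apply_action`, re-run the plan optimization, push the resulting transition into the buffer, and then update the network on a batch drawn with `sample`. Drawing random batches from stored transitions, rather than learning only from the most recent one, is what reduces the correlation between consecutive training samples.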
Source: Physics in Medicine and Biology | Category: Physics | Tags: Phys Med Biol | Source Type: research