Publications
2024
- Learning to optimize contextually constrained problems for real-time decision-generation. Aaron Babier, Timothy C Y Chan, Adam Diamant, and Rafid Mahmood. Management Science, 2024
The topic of learning to solve optimization problems has received interest from both the operations research and machine learning communities. In this work, we combine techniques from both fields to address the problem of learning to generate decisions to instances of continuous optimization problems where the feasible set varies with contextual features. We propose a novel framework for training a generative model to estimate optimal decisions by combining interior point methods and adversarial learning, which we further embed within a data generation algorithm. Decisions generated by our model satisfy in-sample and out-of-sample optimality guarantees. Finally, we investigate case studies in portfolio optimization and personalized treatment design, demonstrating that our approach yields advantages over predict-then-optimize and supervised deep learning techniques, respectively.
- Knowledge-based planning for Gamma Knife. Binghao Zhang, Aaron Babier, Mark Ruschin, and Timothy C. Y. Chan. Medical Physics, 2024
Purpose: To develop a novel knowledge-based planning (KBP) pipeline, using inverse optimization with 3D dose predictions for Gamma Knife (GK). Methods: Data were obtained for 349 patients from Sunnybrook Health Sciences Centre. A 3D dose prediction model was trained using 322 patients, based on a previously published deep learning methodology, and dose predictions were generated for the remaining 27 out-of-sample patients. A generalized inverse optimization (IO) model was developed to learn objective function weights from dose predictions. These weights were then used in an inverse planning model to generate deliverable treatment plans. A dose mimicking (DM) model was also implemented for comparison. The quality of the resulting plans was compared to their clinical counterparts using standard GK quality metrics. The performance of the models was also characterized with respect to the dose predictions. Results: Across all quality metrics, plans generated using the IO pipeline performed as well as or better than the respective clinical plans. The average conformity and gradient indices of IO plans were 0.737 ± 0.158 and 3.356 ± 1.030, respectively, compared to 0.713 ± 0.124 and 3.452 ± 1.123 for the clinical plans. IO plans also performed better than DM plans for five of the six quality metrics. Plans generated using IO also have average treatment times comparable to those of clinical plans. With regard to the dose predictions, predictions with higher conformity tend to result in higher quality KBP plans. Conclusions: Plans resulting from an IO KBP pipeline are, on average, of equal or superior quality compared to those obtained through manual planning. The results demonstrate the potential for the use of KBP to generate GK treatment with minimal human intervention.
2023
- 3D dose prediction for Gamma Knife radiosurgery using deep learning and data modification. Binghao Zhang, Aaron Babier, Timothy C.Y. Chan, and Mark Ruschin. Physica Medica, Feb 2023
Purpose: To develop a machine learning-based, 3D dose prediction methodology for Gamma Knife (GK) radiosurgery. The methodology accounts for cases involving targets of any number, size, and shape. Methods: Data from 322 GK treatment plans was modified by isolating and cropping the contoured MRI and clinical dose distributions based on tumor location, then scaling the resulting tumor spaces to a standard size. An accompanying 3D tensor was created for each instance to account for tumor size. The modified dataset for 272 patients was used to train both a generative adversarial network (GAN-GK) and a 3D U-Net model (U-Net-GK). Unmodified data was used to train equivalent baseline models. All models were used to predict the dose distribution of 50 out-of-sample patients. Prediction accuracy was evaluated using gamma, with criteria of 4%/2mm, 3%/3mm, 3%/1mm and 1%/1mm. Prediction quality was assessed using coverage, selectivity, and conformity indices. Results: The predictions resulting from GAN-GK and U-Net-GK were similar to their clinical counterparts, with average gamma (4%/2mm) passing rates of 84.9 and 83.1, respectively. In contrast, the gamma passing rates of the baseline models were significantly worse than those of their respective GK-specific models (p < 0.001) at all criterion levels. The quality of GK-specific predictions was also similar to that of clinical plans. Conclusion: Deep learning models can use GK-specific data modification to predict 3D dose distributions for GKRS plans with a large range in size, shape, or number of targets. Standard deep learning models applied to unmodified GK data generated poorer predictions.
2022
- OpenKBP-Opt: an international and reproducible evaluation of 76 knowledge-based planning pipelines. Aaron Babier, Rafid Mahmood, Binghao Zhang, Victor G.L. Alves, Ana Maria Barragán-Montero, and 54 more authors. Physics in Medicine and Biology, Sep 2022
Objective. To establish an open framework for developing plan optimization models for knowledge-based planning (KBP). Approach. Our framework includes radiotherapy treatment data (i.e. reference plans) for 100 patients with head-and-neck cancer who were treated with intensity-modulated radiotherapy. That data also includes high-quality dose predictions from 19 KBP models that were developed by different research groups using out-of-sample data during the OpenKBP Grand Challenge. The dose predictions were input to four fluence-based dose mimicking models to form 76 unique KBP pipelines that generated 7600 plans (76 pipelines × 100 patients). The predictions and KBP-generated plans were compared to the reference plans via: the dose score, which is the average mean absolute voxel-by-voxel difference in dose; the deviation in dose-volume histogram (DVH) points; and the frequency of clinical planning criteria satisfaction. We also performed a theoretical investigation to justify our dose mimicking models. Main results. The range in rank order correlation of the dose score between predictions and their KBP pipelines was 0.50-0.62, which indicates that the quality of the predictions was generally positively correlated with the quality of the plans. Additionally, compared to the input predictions, the KBP-generated plans performed significantly better (P < 0.05; one-sided Wilcoxon test) on 18 of 23 DVH points. Similarly, each optimization model generated plans that satisfied a higher percentage of criteria than the reference plans, which satisfied 3.5% more criteria than the set of all dose predictions. Lastly, our theoretical investigation demonstrated that the dose mimicking models generated plans that are also optimal for an inverse planning model. Significance. This was the largest international effort to date for evaluating the combination of KBP prediction and optimization models. We found that the best performing models significantly outperformed the reference dose and dose predictions. In the interest of reproducibility, our data and code are freely available.
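As a rough illustration of the dose score described in the abstract above (the average, over patients, of the mean absolute voxel-by-voxel difference in dose), the following minimal Python sketch computes it on toy data. The function names, the flat-list representation of a dose distribution, and the numbers are assumptions made for illustration only; this is not the OpenKBP-Opt code.

```python
from statistics import fmean


def mean_absolute_dose_difference(reference, candidate):
    """Mean absolute voxel-by-voxel difference between two dose lists (Gy)."""
    if len(reference) != len(candidate):
        raise ValueError("dose lists must have the same number of voxels")
    return fmean(abs(r - c) for r, c in zip(reference, candidate))


def dose_score(reference_plans, candidate_plans):
    """Average the per-patient mean absolute differences over all patients."""
    if len(reference_plans) != len(candidate_plans):
        raise ValueError("need one candidate dose per reference dose")
    return fmean(
        mean_absolute_dose_difference(r, c)
        for r, c in zip(reference_plans, candidate_plans)
    )


# Toy example (hypothetical values): two patients, four voxels each.
reference_plans = [[70.0, 60.0, 20.0, 5.0], [66.0, 50.0, 10.0, 0.0]]
candidate_plans = [[68.0, 61.0, 22.0, 4.0], [66.0, 52.0, 9.0, 1.0]]

score = dose_score(reference_plans, candidate_plans)
print(score)  # per-patient errors are 1.5 and 1.0, so this prints 1.25
```

A lower score means the candidate dose is closer to the reference dose on average.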
- Advising student-driven analytics projects: a summary of experiences and lessons learned. Aaron Babier, Craig Fernandes, and Ian Y. Zhu. INFORMS Transactions on Education, Aug 2022
In this paper, we describe a course project in which teams of undergraduate students propose and execute an end-to-end analytics project to solve a real-world problem. The project challenges students to implement machine learning, optimization, simulation, or a combination of these three techniques on real-world data that they collect. A designated project advisor helps each team refine its project and assesses the quality of the resulting work. In our analysis of 58 past projects, we show that students developed solutions for a wide range of topics by employing various methodologies. However, most teams encountered similar challenges that project advisors helped them overcome with tailored feedback. Based on feedback from 106 previous students, the project experience was largely positive and helped them prepare for their future careers. We believe that this type of hands-on project is conducive to the development of important data analytics skills.
2021
- OpenKBP: The open-access knowledge-based planning grand challenge and dataset. Aaron Babier, Binghao Zhang, Rafid Mahmood, Kevin L. Moore, Thomas G. Purdie, and 2 more authors. Medical Physics, Sep 2021
Purpose: To advance fair and consistent comparisons of dose prediction methods for knowledge-based planning (KBP) in radiation therapy research. Methods: We hosted OpenKBP, a 2020 AAPM Grand Challenge, and challenged participants to develop the best method for predicting the dose of contoured computed tomography (CT) images. The models were evaluated according to two separate scores: (a) dose score, which evaluates the full three-dimensional (3D) dose distributions, and (b) dose-volume histogram (DVH) score, which evaluates a set of DVH metrics. We used these scores to quantify the quality of the models based on their out-of-sample predictions. To develop and test their models, participants were given the data of 340 patients who were treated for head-and-neck cancer with radiation therapy. The data were partitioned into training (n = 200), validation (n = 40), and testing (n = 100) datasets. All participants performed training and validation with the corresponding datasets during the first (validation) phase of the Challenge. In the second (testing) phase, the participants used their model on the testing data to quantify the out-of-sample performance, which was hidden from participants and used to determine the final competition ranking. Participants also responded to a survey to summarize their models. Results: The Challenge attracted 195 participants from 28 countries, and 73 of those participants formed 44 teams in the validation phase, which received a total of 1750 submissions. The testing phase garnered submissions from 28 of those teams, which represents 28 unique prediction methods. On average, over the course of the validation phase, participants improved the dose and DVH scores of their models by a factor of 2.7 and 5.7, respectively. In the testing phase, one model achieved the best dose score (2.429) and DVH score (1.478), which were both significantly better than the dose score (2.564) and the DVH score (1.529) that were achieved by the runner-up models. Lastly, many of the top performing teams reported that they used generalizable techniques (e.g., ensembles) to achieve higher performance than their competition. Conclusion: OpenKBP is the first competition for knowledge-based planning research. The Challenge helped launch the first platform that enables researchers to compare KBP prediction methods fairly and consistently using a large open-source dataset and standardized metrics. OpenKBP has also democratized KBP research by making it accessible to everyone, which should help accelerate the progress of KBP research. The OpenKBP datasets are available publicly to help benchmark future KBP research.
- An ensemble learning framework for model fitting and evaluation in inverse linear optimization. Aaron Babier, Timothy C.Y. Chan, Taewoo Lee, Rafid Mahmood, and Daria Terekhov. INFORMS Journal on Optimization, Feb 2021
We develop a generalized inverse optimization framework for fitting the cost vector of a single linear optimization problem given multiple observed decisions. This setting is motivated by ensemble learning, where building consensus from base learners can yield better predictions. We unify several models in the inverse optimization literature under a single framework and derive assumption-free and exact solution methods for each one. We extend a goodness-of-fit metric previously introduced for the problem with a single observed decision to this new setting and demonstrate several important properties. Finally, we demonstrate our framework in a novel inverse optimization-driven procedure for automated radiation therapy treatment planning. Here, the inverse optimization model leverages an ensemble of dose predictions from different machine learning models to construct a consensus treatment plan that outperforms baseline methods. The consensus plan yields better trade-offs between the competing clinical criteria used for plan evaluation.
2020
- AutoAudio: deep learning for automatic audiogram interpretation. Matthew G. Crowson, Jong Wook Lee, Amr Hamour, Rafid Mahmood, Aaron Babier, and 3 more authors. Journal of Medical Systems, Aug 2020
Hearing loss is the leading human sensory system loss, and one of the leading causes of years lived with disability, with significant effects on quality of life, social isolation, and overall health. Coupled with a forecast of increased hearing loss burden worldwide, national and international health organizations have urgently recommended that access to hearing evaluation be expanded to meet demand. The objective of this study was to develop ’AutoAudio’, a novel deep learning proof-of-concept model that accurately and quickly interprets diagnostic audiograms. Adult audiogram reports representing normal, conductive, mixed and sensorineural morphologies were used to train different neural network architectures. Image augmentation techniques were used to increase the training image set size. Classification accuracy on a separate test set was used to assess model performance. The architecture with the highest out-of-training set accuracy was ResNet-101 at 97.5%. Neural network training time varied between 2 and 7 h depending on the depth of the neural network architecture. Each neural network architecture produced misclassifications that arose from failures of the model to correctly label the audiogram with the appropriate hearing loss type. The most commonly misclassified hearing loss type was mixed loss. Re-engineering the process of hearing testing with a machine learning innovation may help enhance access to the growing worldwide population that is expected to require audiologist services. Our results suggest that deep learning may be a transformative technology that enables automatic and accurate audiogram interpretation.
- A contemporary review of machine learning in otolaryngology-head and neck surgery. Matthew G. Crowson, Jonathan Ranisau, Antoine Eskander, Aaron Babier, Bin Xu, and 3 more authors. The Laryngoscope, Jan 2020
One of the key challenges with big data is leveraging the complex network of information to yield useful clinical insights. The confluence of massive amounts of health data and a desire to draw inferences and insights from these data has produced a substantial amount of interest in machine-learning analytic methods. There has been a drastic increase in the otolaryngology literature volume describing novel applications of machine learning within the past 5 years. In this timely contemporary review, we provide an overview of popular machine-learning techniques, and review recent machine-learning applications in otolaryngology-head and neck surgery including neurotology, head and neck oncology, laryngology, and rhinology. Investigators have realized significant success in validated models with model sensitivities and specificities approaching 100%. Challenges remain in the implementation of machine-learning algorithms. This may be due in part to the unfamiliarity of these techniques to clinician leaders on the front lines of patient care. Spreading awareness and confidence in machine learning will follow with further validation and proof-of-value analyses that demonstrate model performance superiority over established methods. We are poised to see a greater influx of machine-learning applications to clinical problems in otolaryngology-head and neck surgery, and it is prudent for providers to understand the potential benefits and limitations of these technologies. Laryngoscope, 130:45-51, 2020.
- Knowledge-based automated planning with three-dimensional generative adversarial networks. Aaron Babier, Rafid Mahmood, Andrea L. McNiven, Adam Diamant, and Timothy C.Y. Chan. Medical Physics, Feb 2020
Purpose: To develop a knowledge-based automated planning pipeline that generates treatment plans without feature engineering, using deep neural network architectures for predicting three-dimensional (3D) dose. Methods: Our knowledge-based automated planning (KBAP) pipeline consisted of a knowledge-based planning (KBP) method that predicts dose for a contoured computed tomography (CT) image followed by two optimization models that learn objective function weights and generate fluence-based plans, respectively. We developed a novel generative adversarial network (GAN)-based KBP approach, a 3D GAN model, which predicts dose for the full 3D CT image at once and accounts for correlations between adjacent CT slices. Baseline comparisons were made against two state-of-the-art deep learning-based KBP methods from the literature. We also developed an additional benchmark, a two-dimensional (2D) GAN model which predicts dose to each axial slice independently. For all models, we investigated the impact of multiplicatively scaling the predictions before optimization, such that the predicted dose distributions achieved all target clinical criteria. Each KBP model was trained on 130 previously delivered oropharyngeal treatment plans. Performance was tested on 87 out-of-sample previously delivered treatment plans. All KBAP plans were evaluated using clinical planning criteria and compared to their corresponding clinical plans. KBP prediction quality was assessed using dose-volume histogram (DVH) differences from the corresponding clinical plans. Results: The best performing KBAP plans were generated using predictions from the 3D GAN model that were multiplicatively scaled. These plans satisfied 77% of all clinical criteria, compared to the clinical plans, which satisfied 67% of all criteria. In general, multiplicatively scaling predictions prior to optimization increased the fraction of clinical criteria satisfaction by 11% relative to the plans generated with nonscaled predictions. Additionally, these KBAP plans satisfied the same criteria as the clinical plans 84% and 8% more frequently as compared to the two benchmark methods, respectively. Conclusion: We developed the first knowledge-based automated planning framework using a 3D generative adversarial network for prediction. Our results, based on 217 oropharyngeal cancer treatment plans, demonstrated superior performance in satisfying clinical criteria and generated more realistic plans as compared to the previous state-of-the-art approaches.
- The importance of evaluating the complete automated knowledge-based planning pipeline. Aaron Babier, Rafid Mahmood, Andrea L. McNiven, Adam Diamant, and Timothy C.Y. Chan. Physica Medica, Apr 2020
We determine how prediction methods combine with optimization methods in two-stage knowledge-based planning (KBP) pipelines to produce radiation therapy treatment plans. We trained two dose prediction methods, a generative adversarial network (GAN) and a random forest (RF), with the same 130 treatment plans. The models were applied to 87 out-of-sample patients to create two sets of predicted dose distributions that were used as input to two optimization models. The first optimization model, inverse planning (IP), estimates weights for dose-objectives from a predicted dose distribution and generates new plans using conventional inverse planning. The second optimization model, dose mimicking (DM), minimizes the sum of one-sided quadratic penalties between the predictions and the generated plans using several dose-objectives. Altogether, four KBP pipelines (GAN-IP, GAN-DM, RF-IP, and RF-DM) were constructed and benchmarked against the corresponding clinical plans using clinical criteria; the error of both prediction methods was also evaluated. The best performing plans were GAN-IP plans, which satisfied the same criteria as their corresponding clinical plans (78%) more often than any other KBP pipeline. However, GAN did not necessarily provide the best prediction for the second-stage optimization models. Specifically, both the RF-IP and RF-DM plans satisfied the same criteria as the clinical plans 25% and 15% more often than GAN-DM plans (the worst performing plans), respectively. GAN predictions also had a higher mean absolute error (3.9 Gy) than those from RF (3.6 Gy). We find that state-of-the-art prediction methods, when paired with different optimization algorithms, produce treatment plans with considerable variation in quality.
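To make the phrase "sum of one-sided quadratic penalties" in the abstract above concrete, here is a minimal Python sketch of such a penalty between a plan's dose and a predicted dose. The function name, the `side` parameter, and the toy doses are hypothetical, and the choice of which direction to penalize for a given structure is an assumption for illustration; the abstract does not specify the paper's exact dose-objectives.

```python
def one_sided_quadratic_penalty(plan_dose, predicted_dose, side="over"):
    """Sum of squared one-sided deviations between plan and predicted dose.

    side="over"  penalizes voxels where the plan exceeds the prediction.
    side="under" penalizes voxels where the plan falls below the prediction.
    Both conventions are illustrative assumptions, not the paper's model.
    """
    if side not in ("over", "under"):
        raise ValueError("side must be 'over' or 'under'")
    if len(plan_dose) != len(predicted_dose):
        raise ValueError("dose lists must have the same number of voxels")
    total = 0.0
    for planned, predicted in zip(plan_dose, predicted_dose):
        gap = planned - predicted if side == "over" else predicted - planned
        total += max(0.0, gap) ** 2  # deviations in the allowed direction cost nothing
    return total


# Toy example (hypothetical values, in Gy).
predicted = [12.0, 20.0, 30.0]
plan = [10.0, 23.0, 30.0]

over_penalty = one_sided_quadratic_penalty(plan, predicted, side="over")
under_penalty = one_sided_quadratic_penalty(plan, predicted, side="under")
print(over_penalty, under_penalty)  # 9.0 4.0
```

In a dose mimicking model, a penalty of this kind would be minimized over the plan's decision variables; the sketch only evaluates it for a fixed plan.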
2018
- Automated treatment planning in radiation therapy using generative adversarial networks. Rafid Mahmood, Aaron Babier, Andrea McNiven, Adam Diamant, and Timothy C.Y. Chan. In Proceedings of the 3rd Machine Learning for Healthcare Conference, Aug 2018
Knowledge-based planning (KBP) is an automated approach to radiation therapy treatment planning that involves predicting desirable treatment plans before they are then corrected to deliverable ones. We propose a generative adversarial network (GAN) approach for predicting desirable 3D dose distributions that eschews the previous paradigms of site-specific feature engineering and predicting low-dimensional representations of the plan. Experiments on a dataset of oropharyngeal cancer patients show that our approach significantly outperforms previous methods on several clinical satisfaction criteria and similarity metrics.
- Knowledge-based automated planning for oropharyngeal cancer. Aaron Babier, Justin J. Boutilier, Andrea L. McNiven, and Timothy C.Y. Chan. Medical Physics, Jul 2018
Purpose: The purpose of this study was to automatically generate radiation therapy plans for oropharynx patients by combining knowledge-based planning (KBP) predictions with an inverse optimization (IO) pipeline. Methods: We developed two KBP approaches, the bagging query (BQ) method and the generalized principal component analysis-based (gPCA) method, to predict achievable dose-volume histograms (DVHs). These approaches generalize existing methods by predicting physically feasible organ-at-risk (OAR) and target DVHs in sites with multiple targets. Using leave-one-out cross validation, we applied both models to a large dataset of 217 oropharynx patients. The predicted DVHs were input into an IO pipeline that generated treatment plans (BQ and gPCA plans) via an intermediate step that estimated objective function weights for an inverse planning model. The KBP predictions were compared to the clinical DVHs for benchmarking. To assess the complete pipeline, we compared the BQ and gPCA plans to both the predictions and clinical plans. To isolate the effect of the KBP predictions, we put clinical DVHs through the IO pipeline to produce clinical inverse optimized (CIO) plans. This approach also allowed us to estimate the complexity of the clinical plans. The BQ and gPCA plans were benchmarked against the CIO plans using DVH differences and clinical planning criteria. Iso-complexity plans (relative to CIO) were also generated and evaluated. Results: The BQ method tended to predict that less dose is delivered than what was observed in the clinical plans while the gPCA predictions were more similar to clinical DVHs. Both populations of KBP predictions were reproduced with inverse plans to within a median DVH difference of 3 Gy. Clinical planning criteria for OARs were satisfied most frequently by the BQ plans (74.4%), 6.3 percentage points more than the clinical plans. Meanwhile, target criteria were satisfied most frequently by the gPCA plans (90.2%), 21.2 percentage points more than the clinical plans. However, once the complexity of the plans was constrained to that of the CIO plans, the performance of the BQ plans degraded significantly. In contrast, the gPCA plans still satisfied more clinical criteria than both the clinical and CIO plans, with the most notable improvement being in target criteria. Conclusion: Our automated pipeline can successfully use DVH predictions to generate high-quality plans without human intervention. Between the two KBP methods, gPCA plans tend to achieve performance comparable to that of clinical plans, even when controlling for plan complexity, whereas BQ plans tended to underperform.
- Inverse optimization of objective function weights for treatment planning using clinical dose-volume histograms. Aaron Babier, Justin J. Boutilier, Michael B. Sharpe, Andrea L. McNiven, and Timothy C.Y. Chan. Physics in Medicine and Biology, May 2018
We developed and evaluated a novel inverse optimization (IO) model to estimate objective function weights from clinical dose-volume histograms (DVHs). These weights were used to solve a treatment planning problem to generate ’inverse plans’ that had similar DVHs to the original clinical DVHs. Our methodology was applied to 217 clinical head and neck cancer treatment plans that were previously delivered at Princess Margaret Cancer Centre in Canada. Inverse plan DVHs were compared to the clinical DVHs using objective function values, dose-volume differences, and frequency of clinical planning criteria satisfaction. Median differences between the clinical and inverse DVHs were within 1.1 Gy. For most structures, the difference in clinical planning criteria satisfaction between the clinical and inverse plans was at most 1.4%. For structures where the two plans differed by more than 1.4% in planning criteria satisfaction, the difference in average criterion violation was less than 0.5 Gy. Overall, the inverse plans were very similar to the clinical plans. Compared with a previous inverse optimization method from the literature, our new inverse plans typically satisfied the same or more clinical criteria, and had consistently lower fluence heterogeneity. Overall, this paper demonstrates that DVHs, which are essentially summary statistics, provide sufficient information to estimate objective function weights that result in high quality treatment plans. However, as with any summary statistic that compresses three-dimensional dose information, care must be taken to avoid generating plans with undesirable features such as hotspots; our computational results suggest that such undesirable spatial features were uncommon. Our IO-based approach can be integrated into the current clinical planning paradigm to better initialize the planning process and improve planning efficiency. It could also be embedded in a knowledge-based planning or adaptive radiation therapy framework to automatically generate a new plan given a predicted or updated target DVH, respectively.