Three-dimensional (3D) semantic segmentation is a powerful tool for the radiomic characterisation of lesions in cross-sectional imaging; however, its use is limited by the extensive manual effort required to accurately delineate regions of interest. Artificial intelligence methods based on ‘deep learning’ have had enormous success at automating semantic segmentation tasks, but they rely on large datasets with high-quality manual segmentations from which to learn.1 The scarcity of publicly available CT and MRI segmentation datasets has slowed the development of automatic segmentation systems for 3D medical imaging data. The 2019 Kidney and Kidney Tumor Segmentation challenge2 (KiTS19) was an international competition held in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) that sought to stimulate progress on this automatic segmentation frontier.
This competition was based around a first-of-its-kind dataset of 300 patients with preoperative CT imaging and corresponding high-quality 3D segmentation labels for the depicted kidneys and tumours. Of these cases, 210 were publicly released at the start of the challenge for teams to develop their systems. The imaging alone for the remaining 90 cases was released prior to a 2-week submission period, during which teams used their systems to automatically segment the kidney and tumour regions. Those automatic segmentations were then submitted to an online platform, which measured each team’s performance in terms of the Sørensen–Dice coefficient between their segmentations and the manually produced ones, and ranked the teams on a public leader board. A total of 106 teams from five continents competed for a US$5,000 cash prize sponsored by Intuitive Surgical Inc. Teams were also required to submit a manuscript detailing the methods they used to build and train their systems. These manuscripts, in conjunction with the objective and standardised comparison of performance, allow for rapid progress in artificial intelligence research and democratise the systems it produces.
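For readers unfamiliar with the metric, the Sørensen–Dice coefficient between a predicted region A and a reference region B is 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical regions). The following is a minimal sketch of that calculation on toy data, not the challenge's evaluation code: the function name, the flattened label maps, the label encoding (0 background, 1 kidney, 2 tumour) and the treatment of two empty regions as perfect agreement are all assumptions made for illustration.

```python
from typing import Iterable, Set


def dice_coefficient(predicted: Iterable[int],
                     reference: Iterable[int],
                     labels: Set[int]) -> float:
    """Sørensen–Dice overlap, 2|A ∩ B| / (|A| + |B|), where A and B are the
    voxels whose label is in `labels` in the predicted and reference maps."""
    pred = list(predicted)
    ref = list(reference)
    if len(pred) != len(ref):
        raise ValueError("label maps must contain the same number of voxels")

    # Boolean membership masks for the region of interest in each map.
    in_pred = [p in labels for p in pred]
    in_ref = [r in labels for r in ref]

    overlap = sum(p and r for p, r in zip(in_pred, in_ref))
    total = sum(in_pred) + sum(in_ref)

    # Assumed convention: two empty regions count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy flattened label maps (hypothetical encoding: 0 background, 1 kidney, 2 tumour).
reference = [0, 1, 1, 1, 2, 2, 0, 0]
predicted = [0, 1, 1, 2, 2, 0, 0, 0]

kidney_dice = dice_coefficient(predicted, reference, {1})
tumour_dice = dice_coefficient(predicted, reference, {2})
print(f"kidney: {kidney_dice:.3f}, tumour: {tumour_dice:.3f}")
```

In this toy case the tumour regions share one of two voxels each, so a small boundary disagreement already halves the score. This sensitivity is one reason small structures such as tumours tend to score lower than large organs under this metric.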
The winning method,3 led by researchers at the German Cancer Research Center, Heidelberg, Germany, achieved a Sørensen–Dice coefficient of 0.974 for the kidney regions and 0.851 for the tumours, approaching human inter-annotator agreement on kidneys (0.983) but falling short of it on tumours (0.923). The challenge has now entered an ‘open leader board’ phase, in which researchers can continue to develop new systems and submit them to the online platform to be ranked on the leader board.
The results of the KiTS19 challenge show that deep learning methods are capable of reliable segmentation of kidneys and kidney tumours. Fully segmented kidneys and tumours allow for automated extraction of all types of nephrometry scores,4 as well as tumour textural features, and offer an opportunity to accelerate the discovery of new predictive features important for personalised medicine and the accurate prediction of patient-relevant outcomes. The KiTS19 challenge attracted the highest number of submissions at MICCAI 2019 and serves as an important and challenging benchmark in automatic 3D segmentation.