Hello everyone, I am a computer science student and new to radiomics and medical science.
I want to build a radiomics feature-based classifier, and I have to work with CT scans in DICOM format.
First, I found the NSCLC-Radiomics dataset on The Cancer Imaging Archive (TCIA), where different regions of the body are labeled, such as the left and right lung, esophagus, spine, and the abnormal tissue, which is the tumor itself. From all these segments I chose the tumor segmentation and calculated the features. But these features are for cancer patients only. For non-cancer patients, I could not work out what the ROI should be. Obviously, a non-cancer CT scan has no tumor to serve as an ROI.
Is it possible to classify cancer and non-cancer patients with radiomics? If yes, what should the ROI be for a non-cancer patient, and which dataset would be suitable for this?
After a long search, I came across the dataset "Data from The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans" (LIDC-IDRI) on The Cancer Imaging Archive (TCIA), where they mention that they have classified the nodules (abnormal tissue) based on their size. I read the following in their article, "The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A Completed Reference Database of Lung Nodules on CT Scans" (PMC):
The Database contains 7371 lesions marked “nodule” by at least one radiologist. 2669 of these lesions were marked “nodule≥3 mm” by at least one radiologist, of which 928 (34.7%) received such marks from all four radiologists. These 2669 lesions include nodule outlines and subjective nodule characteristic ratings.
They also rate the nodule size on a scale of 1 to 5. I will call ratings 1 and 2 (small nodules) non-cancer, and treat ratings 4 and 5 (large nodules) as cancer data.
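To make that labeling rule concrete, this is a rough sketch of what I have in mind. The function name `rating_to_label` is my own, and taking the median across radiologists and dropping the ambiguous middle rating are my assumptions, not something taken from the LIDC documentation.

```python
from statistics import median
from typing import Optional, Sequence


def rating_to_label(ratings: Sequence[int]) -> Optional[int]:
    """Map the 1-5 ratings given to one nodule to a binary label.

    Returns 0 ("non-cancer") when the median rating is 2 or lower,
    1 ("cancer") when it is 4 or higher, and None when the nodule
    falls in between or has no ratings at all.
    """
    if not ratings:
        return None
    consensus = median(ratings)  # assumption: median over radiologists
    if consensus <= 2:
        return 0
    if consensus >= 4:
        return 1
    return None  # ambiguous nodules would be left out of training


# Example: ratings from the radiologists who marked the same nodule
print(rating_to_label([1, 2, 2]))
print(rating_to_label([4, 5, 5, 4]))
print(rating_to_label([2, 4]))
```

Nodules that return `None` would simply be excluded, so the classifier only sees the two clear-cut groups.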
I hoped that I had finally found the right dataset, but it is very confusing.
I used their Python package pylidc for preprocessing. After running it, I got multiple NumPy arrays, which I don't know how to use in pyradiomics.
As far as I can tell, pyradiomics only accepts an image together with its mask. It is so confusing for me.
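My best guess so far, which may well be wrong, is that the arrays have to be wrapped as SimpleITK images before pyradiomics will take them. Below is a minimal sketch under these assumptions: the volume comes from `scan.to_volume()` in (row, column, slice) order, the mask from `ann.boolean_mask()` is cropped to the bounding box returned by `ann.bbox()`, and the spacing can be read from `scan.pixel_spacing` and `scan.slice_spacing`. The helper names `full_size_mask`, `to_sitk_pair`, and `extract_features` are made up for illustration.

```python
import numpy as np


def full_size_mask(volume_shape, bbox, cropped_mask):
    """Place a bounding-box-cropped boolean mask back into a full-size volume."""
    mask = np.zeros(volume_shape, dtype=np.uint8)
    mask[bbox] = cropped_mask.astype(np.uint8)  # 1 inside the nodule, 0 elsewhere
    return mask


def to_sitk_pair(volume, mask, spacing_xyz):
    """Wrap (row, col, slice) NumPy arrays as a SimpleITK image and mask."""
    import SimpleITK as sitk  # imported here so the helper above needs only NumPy

    # SimpleITK reads arrays as (z, y, x), so the slice axis has to come first.
    image = sitk.GetImageFromArray(np.ascontiguousarray(np.transpose(volume, (2, 0, 1))))
    label = sitk.GetImageFromArray(np.ascontiguousarray(np.transpose(mask, (2, 0, 1))))
    # Both images need the same physical spacing, given in (x, y, z) order.
    image.SetSpacing(spacing_xyz)
    label.SetSpacing(spacing_xyz)
    return image, label


def extract_features(scan, ann):
    """Compute pyradiomics features for one pylidc annotation of a scan."""
    from radiomics import featureextractor

    volume = scan.to_volume()
    mask = full_size_mask(volume.shape, ann.bbox(), ann.boolean_mask())
    # Assumption: in-plane spacing is the same in x and y.
    spacing = (float(scan.pixel_spacing),
               float(scan.pixel_spacing),
               float(scan.slice_spacing))
    image, label = to_sitk_pair(volume, mask, spacing)
    extractor = featureextractor.RadiomicsFeatureExtractor()  # default settings
    return extractor.execute(image, label, label=1)
```

The idea would be to call `extract_features(scan, ann)` for each annotation of a scan loaded with pylidc and collect the returned feature dictionaries into one table. I am not sure the axis order and spacing are right, so corrections are welcome.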
I don't know where to find a suitable dataset for this. I have spent almost a month reading about it and have tried many datasets, but found nothing relevant to my project. I would really appreciate any help.