5 – The Data

So let us talk about deep learning for skin cancer classification. My students, Brett and Andre, and our collaborators in the School of Medicine collected roughly 130,000 images of skin conditions from various data sources, including the Web. Those include the images they are going to use for the competition. These images came with disease labels. All of these conditions were biopsied, so someone had actually cut out the condition and made a definitive diagnosis. That means we knew the ground-truth classification for every one of these images. Unfortunately, it was not just cancer or non-cancer. The labels covered 2,000 different diseases, from inflammatory diseases, to rashes, to lesions, to all kinds of stuff. So we manually built a classification tree of the different types of skin disease, with 2,000 nodes at the end. Here is a small version of it. Skin disease is shown in blue. We have different types of diseases: benign diseases, non-neoplastic diseases, and malignant diseases. And even among the malignant diseases, there are many, many different classes of carcinomas and melanomas. The melanomas are this little black dot over here, and that's the one we're really after, because it is the most lethal of all of these cancers.
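
To make the idea of the classification tree concrete, here is a minimal sketch in Python. It is not the tree the team built. It only uses the handful of category names mentioned above, and the `Node` class and the `path_to` and `coarse_class` helpers are hypothetical names introduced for illustration. The real tree has 2,000 leaf nodes; this toy version has four.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a toy disease taxonomy (hypothetical structure)."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List["Node"]:
        # A node without children is a leaf, i.e. a concrete disease label.
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def path_to(root: Node, target: str, trail: Tuple[str, ...] = ()) -> Optional[Tuple[str, ...]]:
    """Return the names from the root down to `target`, or None if absent."""
    trail = trail + (root.name,)
    if root.name == target:
        return trail
    for child in root.children:
        found = path_to(child, target, trail)
        if found is not None:
            return found
    return None


def coarse_class(root: Node, label: str) -> str:
    """Map a label to its top-level branch directly under the root."""
    path = path_to(root, label)
    if path is None:
        raise KeyError(label)
    # path[0] is the root itself; path[1] is the top-level branch, if any.
    return path[1] if len(path) > 1 else path[0]


# Toy taxonomy using only the categories named in the talk.
taxonomy = Node("skin disease", [
    Node("benign"),
    Node("non-neoplastic"),
    Node("malignant", [
        Node("carcinoma"),
        Node("melanoma"),
    ]),
])

print([leaf.name for leaf in taxonomy.leaves()])
print(path_to(taxonomy, "melanoma"))
print(coarse_class(taxonomy, "melanoma"))
```

A structure like this lets a fine-grained label such as melanoma be rolled up to a coarser class such as malignant, which is one way a 2,000-way labeling could be related back to the cancer or non-cancer question.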
