General Purpose X-ray

46 minutes to read

DICOM format

35 (+10) minutes to read

Modalities

X-rays can be used to produce several kinds of images, such as tomosynthesis or CT, both of which will be touched upon later in this course. But the most widespread are so-called plain X-ray images, produced by conventional X-ray machines. When plain X-ray images are stored in the DICOM format, they carry one of three modality tags:

  • CR - Computed Radiography

    What is CR technology?

    CR file structure

  • DX - Digital Radiography (DX stands for Digital X-ray)

    DX file structure

  • RG - Radiographic imaging (conventional film/screen). If you have to work with these files, they have already been digitized but should still be marked as RG.

    The RG DICOM file structure may differ from the CR and DX ones, but most tags, and thus the preprocessing, are generally the same. For that reason, this modality is not mentioned separately in the next section.

Typical image preprocessing

First, we need to understand what a DICOM LUT (LookUp Table) is:

The general formula for the DICOM X-ray preprocessing is the following:

Note

nice_looking_image = inversion[optional]( VOI_LUT( Modality_LUT(pixel_data) ) )

Each function in the formula is described separately below.

Modality LUT is used to obtain physically meaningful values (e.g., Hounsfield Units (HU) or Optical Density (OD)).

VOI (Value Of Interest) LUT is used to obtain pictures where a certain type of tissue looks nice (well-contrasted).
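As a concrete sketch, the two LUT steps can be approximated with plain NumPy in the common case where both are linear (the `slope`, `intercept`, `center`, and `width` parameters below stand in for the Rescale Slope/Intercept and Window Center/Width tags; real files may instead carry explicit lookup tables, which pydicom's `apply_modality_lut` / `apply_voi_lut` helpers handle for you):

```python
import numpy as np

def modality_lut(stored, slope=1.0, intercept=0.0):
    # Linear Modality LUT: maps stored pixel values to physically
    # meaningful units (Rescale Slope / Rescale Intercept tags).
    return stored * slope + intercept

def voi_lut(values, center, width):
    # Simplified linear VOI windowing (Window Center / Window Width tags):
    # keep the value range of interest and rescale it to [0, 1].
    lo, hi = center - width / 2.0, center + width / 2.0
    return np.clip((values - lo) / (hi - lo), 0.0, 1.0)

def nice_looking_image(stored, slope, intercept, center, width, invert=False):
    # The full pipeline from the formula above; `invert` covers the
    # optional inversion step (e.g., for MONOCHROME1 images).
    img = voi_lut(modality_lut(stored, slope, intercept), center, width)
    return 1.0 - img if invert else img
```

Note that the exact windowing formula in the DICOM standard is slightly more elaborate; this linear version is enough to illustrate the order of operations.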

Warning

A single DICOM file can contain more than one VOI LUT!

The Photometric Interpretation tag specifies how the pixel values should be interpreted.

In the case of X-rays, there are two possible values of the Photometric Interpretation tag: MONOCHROME1 (dense tissues are dark) and MONOCHROME2 (dense tissues are bright). The latter seems to be the more common one.

Usually, you will need to feed the model images with one pre-selected Photometric Interpretation, so invert the image explicitly when needed. For example, given a pydicom dataset ds and its pixel array img:

if ds.PhotometricInterpretation == 'MONOCHROME1':
    img = img.max() - img  # flip to the MONOCHROME2 convention (dense tissues bright)

Pydicom tutorial on how to work with DICOM pixel data.

Warning

The tutorial above shows only what you can do with pixel data. It does not describe the image preparation sequence.

[optional] A complete description of how to preprocess DICOM pixel data into a properly rendered image can be found in the Pixel Transformation Sequence section of the official DICOM standard documentation.

Image formats

14 minutes to read

Fairly often, medical datasets are stored in widespread image file formats like png, jpg/jpeg, tiff, etc. Though such images are much easier to read with Python, it is usually harder to combine these datasets with others. First, those image formats can introduce their own artifacts (like jpeg compression). Second, the authors of a dataset can apply any preprocessing they like (perhaps skipping or modifying some of the steps described in the previous section, or introducing new ones). As a result, image statistics can vary significantly between sources. It is great when dataset authors describe their preprocessing in detail, but unfortunately that is not always the case.

Typical image preprocessing serves multiple purposes:

  • Improve image statistics (to ease training)

  • Increase visual contrast (disputable, since NNs can learn contrast-enhancing filters themselves)

  • Equalize input image statistics across different datasets (the most important one)

This is usually done with either histogram equalization or CLAHE (example notebook).
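A minimal NumPy sketch of global histogram equalization may help make the idea concrete (CLAHE itself is more involved, operating on local tiles with a clip limit; in practice you would typically reach for an existing implementation such as OpenCV's `cv2.createCLAHE` rather than writing your own):

```python
import numpy as np

def equalize_histogram(img, n_bins=256):
    # Map intensities through the normalized cumulative histogram (CDF)
    # so the output intensity distribution is approximately uniform.
    hist, bin_edges = np.histogram(img.ravel(), bins=n_bins)
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]  # normalize the CDF to [0, 1]
    # Interpolate each pixel's value against the CDF to get its new intensity.
    return np.interp(img.ravel(), bin_edges[:-1], cdf).reshape(img.shape)
```

The output lands in [0, 1] regardless of the input range, which is also convenient for equalizing images coming from different sources.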

How to split the data

1 minute to read

As usual, we use the standard train/validation/test split. A good practice is to also have an out-of-domain test set (e.g., from a different source/device/dataset) to measure the final performance of the model.

The details of how to split your data depend on the task, but usually images are split by patient ID. This is required because, if you are classifying pathologies, for example, the appearance of X-ray images of the same person with the same pathology in different data splits would introduce a data leak: the network can simply memorize that this patient always has this pathology.

In the case of DICOM datasets, you can use the PatientID tag if it is not empty. Otherwise (or in the case of an image-format dataset), the only information you have is what the dataset authors provided.
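A patient-level split can be sketched in pure Python along these lines (the function name and interface are illustrative, not from any particular library): group sample indices by patient ID, shuffle the patients, and only then cut the list, so that no patient ends up in both splits.

```python
import random
from collections import defaultdict

def split_by_patient(patient_ids, val_frac=0.2, seed=0):
    # Group sample indices by patient so that all images of one
    # patient land in the same split (avoids patient-level leakage).
    groups = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        groups[pid].append(idx)
    pids = sorted(groups)
    random.Random(seed).shuffle(pids)
    n_val = max(1, int(len(pids) * val_frac))
    val_pids = set(pids[:n_val])
    train = [i for pid in pids[n_val:] for i in groups[pid]]
    val = [i for pid in val_pids for i in groups[pid]]
    return train, val
```

scikit-learn's GroupShuffleSplit implements the same idea if you prefer a library solution.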

Exercise

3(+2) days of work

Task: train a classification model on the PNG dataset and make it work on the DICOM dataset.

  • Try different preprocessing techniques; enable/disable some of the preprocessing steps.

  • Which steps are crucial and which are not helpful at all? Why?

  • [optional] Try to solve the inverse task: train on a DCM dataset and make it work on a PNG dataset.