
Hacking classical DICOM : A Hello World primer

This weekend, I am at PennApps, a fantastic hackathon event in Philadelphia. It got me thinking about sharing some simple steps to hit the ground running with classic DICOM (notwithstanding all of the amazing advances of #DICOMweb).

DISCLAIMER: This information is provided as-is, the content might be incorrect or out of date, and could harm kittens or even worse if used improperly.

I will also continue to update this post as necessary.

Step 0 : Understanding enough about DICOM to start

Skip this section and come back to it as you follow the steps below. Below you’ll find some basic explainers of DICOM concepts. These are over-simplified to the point of technical inaccuracy – but I’ve added some DICOM-geek-spoilers to help technically clarify statements. If you’re just starting out, ignore those notes in red.

  • A medical image file can be referred to as a DICOM file.
    • Technically, the fact that we have actual files is just a matter of convenience / an implementation detail – DICOM is more about specifying the file structure and the transportation layer, and leaves “how you save the files to your disk” up to you.
  • A DICOM file is very similar to a JPG or PNG that you get from your camera. Unlike most image files, though, double-clicking a DICOM file won’t open the image – there isn’t a viewer available by default on computers today.
  • A DICOM file has headers and pixel data.
    • Headers are made up of a series of “parameters” (or tags, or attributes), each with a “parameter type” (like string or integer) and a value / set of values. See here for a general list of tags (easy to use browser find to look at tag names).
  • A piece of medical equipment that captures medical images is sometimes called a “modality”. In most modalities, multiple images are taken as part of a medical exam. Here are a few examples:
    • X-rays are taken when you have fractured a bone. There are usually a few different views taken.
    • Ultrasounds are taken when looking at softer tissue. Sometimes they are a series of snapshots taken over the course of the exam; other times they are a series of images in “real-time” (like watching the heart).
    • CTs and MRIs are used to see inside the body – and they are stored as “slices”. Think of your body being cut horizontally, millimetre by millimetre – each one of these is an image. There could be 64 slices, or there could be 10,000 slices or more.
  • Individual images that were captured as a group are collected into a “series”, for example all the CT slices that make up a brain volume. A number of series can then be grouped into a “study”, which is what a doctor orders, for example a PET series and a CT series in a combined PET/CT scan of a tumor.
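
To make the header and study / series / image ideas above a bit more concrete, here is a rough JavaScript sketch. The sample values are made up, and the plain objects are only a stand-in for whatever your DICOM library hands back; the keywords (PatientName, StudyInstanceUID, SeriesInstanceUID, SOPInstanceUID, Modality) are standard DICOM attribute names.

```javascript
// One header element: a tag, a type (DICOM calls it a "VR") and a value.
// The patient name here is invented for illustration.
const exampleElement = { tag: '(0010,0010)', keyword: 'PatientName', vr: 'PN', value: 'DOE^JANE' };

// Hypothetical, hand-written headers standing in for values read from three files.
const headers = [
  { StudyInstanceUID: '1.2.3', SeriesInstanceUID: '1.2.3.1', SOPInstanceUID: '1.2.3.1.1', Modality: 'CT' },
  { StudyInstanceUID: '1.2.3', SeriesInstanceUID: '1.2.3.1', SOPInstanceUID: '1.2.3.1.2', Modality: 'CT' },
  { StudyInstanceUID: '1.2.3', SeriesInstanceUID: '1.2.3.2', SOPInstanceUID: '1.2.3.2.1', Modality: 'PT' },
];

// Build study -> series -> list of image (SOP instance) UIDs.
function groupByStudyAndSeries(list) {
  const studies = {};
  for (const h of list) {
    const study = studies[h.StudyInstanceUID] || (studies[h.StudyInstanceUID] = {});
    const series = study[h.SeriesInstanceUID] ||
      (study[h.SeriesInstanceUID] = { modality: h.Modality, images: [] });
    series.images.push(h.SOPInstanceUID);
  }
  return studies;
}

const tree = groupByStudyAndSeries(headers);
console.log(exampleElement.keyword + ' is stored as ' + exampleElement.vr);
console.log(JSON.stringify(tree, null, 2));
```

The sample data mimics the PET/CT case: one study holding a CT series with two images and a PET series (modality code PT) with one.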

Step 1 : Get some images

There are a lot of places to grab images from. Google terms like “DICOM Sample Images” to find some. Here’s a couple of places to get you started:

These sites typically package DICOM images into a ZIP file for download, containing a set of folders (one per series), with each folder containing a set of DCM files. If the ZIP is a set of studies, there might be an additional layer (a directory of studies, each containing directories of series, each containing DCM files).
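
If you want a quick inventory of what you just downloaded, a small Node.js sketch like the one below can walk the unzipped folders. It assumes the files end in .dcm (not every sample set follows that convention), and the function name listDicomFiles is just something made up for this example.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect .dcm files, keyed by the folder that contains them.
function listDicomFiles(rootDir, found = {}) {
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const fullPath = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      listDicomFiles(fullPath, found);
    } else if (entry.name.toLowerCase().endsWith('.dcm')) {
      (found[rootDir] = found[rootDir] || []).push(entry.name);
    }
  }
  return found;
}

// Usage: node list-dicom.js <unzipped-folder>
if (process.argv[2]) {
  const byFolder = listDicomFiles(process.argv[2]);
  for (const folder of Object.keys(byFolder)) {
    console.log(folder + ': ' + byFolder[folder].length + ' file(s)');
  }
}
```

Each folder that shows up in the output is most likely one series.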

So, let’s assume you have some DICOM. Now, we need to have a look inside. Unzip them somewhere and remember where you saved them.
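
As a first peek inside, you can check whether a file really is a DICOM file, whatever its extension says. DICOM files saved to disk normally start with a 128-byte preamble followed by the four characters DICM. The Node.js sketch below looks for that marker; looksLikeDicom is a made-up name, and a false result is only a hint, since some older files were written without the preamble.

```javascript
const fs = require('fs');

// Returns true when the file has the "DICM" marker that follows the
// 128-byte preamble of a DICOM file.
function looksLikeDicom(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const marker = Buffer.alloc(4);
    // Read 4 bytes starting at byte offset 128.
    const bytesRead = fs.readSync(fd, marker, 0, 4, 128);
    return bytesRead === 4 && marker.toString('latin1') === 'DICM';
  } finally {
    fs.closeSync(fd);
  }
}

// Usage: node check-dicom.js <path-to-file>
if (process.argv[2]) {
  console.log(looksLikeDicom(process.argv[2]) ? 'Looks like DICOM' : 'No DICM marker found');
}
```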

Step 2: Have a look at the images

For this step, you need an image viewer. There are a number of open-source viewers out there – here’s a couple :

Once you have installed viewer software, you can open up the DCM files you have saved from the previous step. Typically, viewers allow you to open up a “directory”, which will discover all related images in a series or study.

Once you’ve opened up a study, have a look around! Medical imaging is a fascinating and beautiful thing to behold.

Okay – so, now you have the data – but how do you unlock that information?

Step 3: Getting at the header and pixel data

Now we get to the programming aspect. For this, you need a library. There are a number of libraries available, for all sorts of languages. Server-side, I’ve been brought up on Java, so I use a library called DCM4CHE (start with the BIN download). It also has command line tools, so it is worthwhile even if ultimately you don’t want to use Java.

DCM4CHE ships with a number of pre-compiled command line tools. Two very quick wins with these tools are getting the header data (in XML or JSON) and getting the image housed within the file.

Getting the header data

Use the command dcm2xml. You can pass in a DICOM file and it will spit out XML of all of the DICOM tags. It works very simply:

dcm2xml <path-to-dicom-file>/<file-name>

This writes the XML to the console. There is a lot of very useful information that can be gleaned by exploring the header data, such as learning about the patient and about the study being performed.
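
If you would rather grab that XML from a script than from the console, one option is to call the tool from Node.js. This is only a sketch: it assumes dcm2xml is on your PATH, the function name dumpHeader is made up, and on Windows the launcher is typically a batch script, which may need the shell: true option.

```javascript
const { execFileSync } = require('child_process');

// Runs a command-line tool on one DICOM file and returns whatever it prints
// to the console. `command` defaults to 'dcm2xml', assumed to be on PATH.
function dumpHeader(dicomPath, command = 'dcm2xml') {
  return execFileSync(command, [dicomPath], { encoding: 'utf8' });
}

// Usage: node dump-header.js <path-to-dicom-file>/<file-name>
if (process.argv[2]) {
  console.log(dumpHeader(process.argv[2]));
}
```

From there you can hand the returned string to whichever XML parser you prefer.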

Although not documented on the site, I have seen a JSON rendition of this tool as well.

Getting the image (pixel data)

Use the command dcm2jpg. You can pass in a DICOM file and it will spit out a JPG for all your viewing needs. It also works very simply:

dcm2jpg <path-to-dicom-file>/<dicom-file-name> <path-to-jpg-file>/<dicom-jpg-name>

And this will drop the file as an image. Now, depending on your version of Java, you may need a JPG library that can actually encode the output as JPG.
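
Since a series can hold hundreds of slices, you will probably want to convert files in a loop. Here is a hedged Node.js sketch that calls dcm2jpg once per file, using the two-argument form shown above. It assumes dcm2jpg is on your PATH; jpgPathFor and convertAll are made-up helper names.

```javascript
const path = require('path');
const { execFileSync } = require('child_process');

// Builds the output path: same base name as the DICOM file, .jpg extension.
function jpgPathFor(dicomPath, outDir) {
  const base = path.basename(dicomPath, path.extname(dicomPath));
  return path.join(outDir, base + '.jpg');
}

// Converts each file in turn; `command` is assumed to be dcm2jpg on PATH.
function convertAll(dicomPaths, outDir, command = 'dcm2jpg') {
  return dicomPaths.map(function (dicomPath) {
    const jpgPath = jpgPathFor(dicomPath, outDir);
    execFileSync(command, [dicomPath, jpgPath], { stdio: 'inherit' });
    return jpgPath;
  });
}

// Usage: node convert.js <output-folder> <dicom-file> [<dicom-file> ...]
if (process.argv.length > 3) {
  console.log(convertAll(process.argv.slice(3), process.argv[2]).join('\n'));
}
```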

Other DCM4CHE notes

  • When you actually use the image library inside of Java, you can use buffers and streams to more efficiently work with the pixel data (and header data, for what it’s worth).
  • There are many other tools to explore from here, including creating DICOM files, manipulating them, and transmitting them.

Other library options

There are other libraries like DCM4CHE depending on your needs and preferences. DVTk is another example.

Step 4: Do Awesome

Now that you have the basic building blocks of a DICOM image file, you can begin to create imaging magic. It only gets better from here. Some resources to keep you moving forward:


Quick and dirty JavaScript DICOMweb UID generator

I thought I’d share this tidbit of code to create UIDs (Study Instance UID, SOP Instance UID, etc.) using JavaScript, primarily for use with standards like DICOMweb STOW-RS and UPS-RS.

WARNING: This is not a proper implementation, and could result in UID collisions in your repositories. Do not use in production – you should use proper UID management procedures defined in DICOM PS3.5 B.1.

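var dicomUID = '2.25.yxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
   // Each placeholder becomes one decimal digit; the leading 'y' is 1-9 so the number has no leading zero.
   var v = c == 'x' ? Math.random()*10|0 : 1 + (Math.random()*9|0);
   return v.toString(10);
});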

I put a functional example on jsFiddle.

See DICOM PS3.5 B.2 for the specification reference. The JavaScript inspiration came from an answer on StackOverflow.
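
For something a little closer to what B.2 describes, here is a hedged sketch that takes a real random UUID and writes its 128-bit value as one decimal number after the 2.25. prefix. It assumes Node.js 14.17 or later (for crypto.randomUUID and BigInt); uuidToDicomUid and newDicomUid are made-up names, and the warning above about proper UID management still applies.

```javascript
const crypto = require('crypto');

// Converts a UUID string into a "2.25." UID by writing the UUID's
// 128-bit value as a single decimal integer.
function uuidToDicomUid(uuid) {
  const hex = uuid.replace(/-/g, '');
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error('Not a UUID: ' + uuid);
  }
  return '2.25.' + BigInt('0x' + hex).toString(10);
}

// Generates a fresh random UUID and converts it.
function newDicomUid() {
  return uuidToDicomUid(crypto.randomUUID());
}

console.log(newDicomUid());
```

Because a 128-bit number has at most 39 decimal digits, the result always fits within the 64-character limit for a DICOM UID.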