Create standard terminology for new segmentation project

The key is to use a project-independent, universal, archival quality segmentation file format. For example, .seg.nrrd file with terminology or DICOM Segmentation Object can both fulfill this role.

You can perform a very simple, fully automatic normalization step (e.g., the one implemented in the slicerio Python package) to convert the universal segmentation files to project-specific NRRD files, in which the same label value is used for the same structure in all labelmap images. This allows compiling data from many collections: it does not matter what label values are used in each collection, whether different internal names are used, or whether some collections have extra segments. These automatically derived, normalized segmentation files are considered temporary files that are only used for training and should not be shared or archived (to avoid redundant storage and backup, complex administration of the licenses of combined collections, etc.).
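To illustrate what such a normalization step does, here is a minimal sketch. It is not the slicerio API: the function name, the flat list standing in for a labelmap, and the project-wide label values are all assumptions made for the example. Segments are matched by terminology code rather than by name or order:

```python
def normalize_labels(voxels, file_segments, project_labels, background=0):
    """Remap the label values of one file to project-wide label values.

    voxels: flat sequence of integer label values (a real labelmap is a 3D array)
    file_segments: {label value in this file: (coding scheme, code value)}
    project_labels: {(coding scheme, code value): project-wide label value}
    Segments that the project does not use are mapped to background.
    """
    lookup = {
        file_label: project_labels.get(code, background)
        for file_label, code in file_segments.items()
    }
    return [lookup.get(value, background) for value in voxels]


# Hypothetical project-wide convention, keyed by terminology code
project_labels = {("UBERON", "0011639"): 1, ("UBERON", "0004743"): 2}

# Two collections that store the same structures in a different order;
# the second one also has an extra segment that the project does not need
file_a = {1: ("UBERON", "0011639"), 2: ("UBERON", "0004743")}
file_b = {1: ("UBERON", "0001688"), 2: ("UBERON", "0004743"), 3: ("UBERON", "0011639")}

normalized_a = normalize_labels([0, 1, 2, 2], file_a, project_labels)
normalized_b = normalize_labels([0, 3, 2, 1], file_b, project_labels)
```

After normalization, the same structure has the same label value in both outputs, regardless of how each source file numbered its segments.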

I’ve provided a bit more detail here.

Exactly. We are starting a new project that will involve segmenting a lot of different organisms using a consistent terminology. But we have to build our own terminology from scratch, so any pointers on how to go about it correctly are much appreciated.

Will it be possible to embed custom terminology in seg.nrrd? How is this different from current practice?

From here

I did not understand this sentence:

The segmentation (.seg.nrrd) files may have the segments in different order, therefore different label values may be used for the same segment in each file.

Is this meant to be a warning about the ordering?

Also, for the labels.csv that needs to be created for the slicerio conversion, it is not clear to me whether the header is fixed or whether we create our own custom header. Most of these columns do not fit our terminology:

LabelValue,Name,SegmentedPropertyCategoryCodeSequence.CodingSchemeDesignator,SegmentedPropertyCategoryCodeSequence.CodeValue,SegmentedPropertyCategoryCodeSequence.CodeMeaning,SegmentedPropertyTypeCodeSequence.CodingSchemeDesignator,SegmentedPropertyTypeCodeSequence.CodeValue,SegmentedPropertyTypeCodeSequence.CodeMeaning,SegmentedPropertyTypeModifierCodeSequence.CodingSchemeDesignator,SegmentedPropertyTypeModifierCodeSequence.CodeValue,SegmentedPropertyTypeModifierCodeSequence.CodeMeaning,AnatomicRegionSequence.CodingSchemeDesignator,AnatomicRegionSequence.CodeValue,AnatomicRegionSequence.CodeMeaning,AnatomicRegionModifierSequence.CodingSchemeDesignator,AnatomicRegionModifierSequence.CodeValue,AnatomicRegionModifierSequence.CodeMeaning

The codes are embedded in the seg.nrrd file. The terminology must be an external file (you don’t want to redefine the entire terminology in each segmentation file).

The current practice unfortunately is that people leave the terminology code at the general “tissue” default, and use the segment name to identify segments.

There are 2 main hierarchy levels: category and type (with an optional modifier). In addition to this, you can specify anatomical region (with an optional modifier). This is universal terminology, so it should be applicable to anything in biomedical computing. We’ll simplify the column names to the ones described here, because I agree that the current column names are too long and confusing.

You don’t have to use SNOMED CT, the scheme is compatible with any terminology, such as TA2, FMA, etc.

Is this meant to be a warning about the ordering?

It is just an example of why the same segment may end up with different label values in different files. Maybe not the best example. I need to clarify it more.

We are using UBERON which provides the largest consolidated terminology for vertebrates. Here are some terms we are likely to include (all of them are bones). How would you go about building a terminology from these?

UBERON:0011639 (ebi.ac.uk)

UBERON:0004743 (ebi.ac.uk)

UBERON:0001688 (ebi.ac.uk)

UBERON:0008194 (ebi.ac.uk)

UBERON is a great choice. It has a wide scope and it is freely usable, including the ontology.

The coding scheme designator is UBERON, code value is for example 0011639, and code meaning is frontoparietal bone.
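For example, an identifier copied from the UBERON browser could be split into these three fields as follows. This is a small sketch: the helper name is made up for illustration, and the white default color is an arbitrary choice.

```python
def uberon_type_entry(curie, label, rgb=(255, 255, 255)):
    """Build a terminology 'Type' entry from an identifier such as 'UBERON:0011639'."""
    scheme, _, code_value = curie.partition(":")
    if scheme != "UBERON" or not code_value.isdigit():
        raise ValueError(f"Not an UBERON identifier: {curie!r}")
    return {
        "CodingSchemeDesignator": scheme,   # which terminology the code comes from
        "CodeValue": code_value,            # numeric part of the identifier, kept as text
        "CodeMeaning": label,               # human-readable name
        "recommendedDisplayRGBValue": list(rgb),
    }


entry = uberon_type_entry("UBERON:0011639", "frontoparietal bone")
```

Keeping the code value as a string preserves the leading zeros of the identifier.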

You can put all the codes you want to use in a .term.json file and drag-and-drop it onto the Slicer window. The codes will be imported automatically and will be available in the terminology popup.

Since UBERON is open, we could try to format the entire terminology (all the anatomical structures) as a .term.json file and see how well Slicer’s terminology selector can cope with it.

Uberon has thousands of terms. If we fully incorporate it, I am worried that it might be too big to manage and navigate (having to search for terms for a segment wouldn’t be conducive to using terminology).

I have an abbreviated version of some common skeletal terms here

Can you take a look and comment on it? It seems to work with terminology (i.e., it is a valid JSON file), but I am not sure if we organized it correctly. (BTW, the assigned UBERON numbers are fake. I didn’t try to query and pull from it.)

I did a quick test, downloaded uberon-full.json and wrote this little script to convert all anatomical structures into a Slicer terminology json file.

import json
import random

# Input: UBERON release in JSON format; output: Slicer terminology file
uberonFile = "path/to/uberon-full.json"
terminologyFile = "path/to/uberon.term.json"

with open(uberonFile, 'r', encoding='utf-8') as file_object:
    uberon = json.load(file_object)

types = []
for node in uberon["graphs"][0]["nodes"]:
    if not node["id"].startswith("http://purl.obolibrary.org/obo/UBERON_"):
        continue
    if not node.get("lbl"):
        # deprecated
        continue
    types.append({
        "CodingSchemeDesignator": "UBERON",
        "CodeValue": node["id"].split("_")[-1],
        "CodeMeaning": node["lbl"],
        "recommendedDisplayRGBValue": [random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)]
        })

terminology = {
  "SegmentationCategoryTypeContextName": "Segmentation category and type - Uberon anatomical structures",
  "@schema": "https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/segment-context-schema.json#",
  "SegmentationCodes": {
    "Category": [
      {
        "CodingSchemeDesignator": "SCT", "CodeValue": "123037004", "CodeMeaning": "Anatomical Structure",
        "showAnatomy": False,
        "Type": types
      }
    ]
  }
}

with open(terminologyFile, "w", encoding="utf-8") as f:
    json.dump(terminology, f, indent=4)

The result is a small file containing 15574 structures. You can download it from here. You can drag-and-drop it into Slicer and you have all UBERON terms readily selectable - no chance for typos, no need to manually copy codes, etc.

The Slicer terminology selector was a bit slow (filtering by name took a few seconds), but after some optimization it is now very fluid. I’ve submitted a pull request with the changes; it should be available in the Slicer Preview Release within a few days.

There are a few limitations of this approach:

  • The terminology browser in Slicer currently does not use an ontology (shows just a flat list, not possible to jump to parent or get a list of children), so it may be difficult to find the right term. However, you can use the online UBERON browser to explore the various ontologies and find the right term, which you can then very easily find by name in Slicer.
  • I’m not sure if you need modifiers. If yes, then you need to find an automated way to figure out what modifiers are relevant for what structures. Maybe something simple like adding left and right modifier to every type that does not have left or right in the name already could cover what is needed.
  • Color is currently generated randomly. It could make sense to derive a generic color from the ontology (e.g., bones are white/yellow, muscles red, arteries deep red, veins blue, etc., maybe with some randomization).
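The "something simple" idea for modifiers in the list above could be sketched like this. It assumes the type entries have the same shape as in the conversion script; the function name and the example entries (with an `EXAMPLE` scheme and placeholder codes) are hypothetical. The two laterality codes are the SNOMED CT codes for Left and Right.

```python
import re

# SNOMED CT laterality codes
LATERALITY_MODIFIERS = [
    {"CodingSchemeDesignator": "SCT", "CodeValue": "7771000", "CodeMeaning": "Left"},
    {"CodingSchemeDesignator": "SCT", "CodeValue": "24028007", "CodeMeaning": "Right"},
]


def add_laterality_modifiers(types):
    """Return a copy of the type list in which types without a side get Left/Right modifiers."""
    has_side = re.compile(r"\b(left|right)\b", re.IGNORECASE)
    result = []
    for type_entry in types:
        type_entry = dict(type_entry)  # shallow copy, the input list is left untouched
        if not has_side.search(type_entry["CodeMeaning"]):
            type_entry["Modifier"] = [dict(m) for m in LATERALITY_MODIFIERS]
        result.append(type_entry)
    return result


# Placeholder entries, not real UBERON codes
example_types = [
    {"CodingSchemeDesignator": "EXAMPLE", "CodeValue": "1", "CodeMeaning": "femur"},
    {"CodingSchemeDesignator": "EXAMPLE", "CodeValue": "2", "CodeMeaning": "left femur"},
]
with_modifiers = add_laterality_modifiers(example_types)
```

Note that this naive rule would also attach Left/Right to unpaired midline structures, so it would probably need an exclusion list in practice.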

To make the selection easier, it may make sense to also create smaller terminology files for specific projects that contain a subset of this full list. Such a subset file would have a unique SegmentationCategoryTypeContextName, for example SlicerMorph Dentition. I would keep the Anatomical Structure category for DICOM compatibility. showAnatomy has to be set to false for anatomical structures (an anatomical region selector only makes sense if the type is not an anatomical structure, e.g., if the type is a needle then you can specify the anatomical region it is in). I don’t think you need to use modifiers (the UBERON codes usually seem to include them). I would drop all the attributes that you don’t use (context group name, cid, etc.).

It would look something like this:

{
    "SegmentationCategoryTypeContextName": "SlicerMorph Dentition",
    "@schema": "https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/schemas/segment-context-schema.json#",
    "SegmentationCodes": {
        "Category": [
            {
                "CodingSchemeDesignator": "SCT",
                "CodeValue": "123037004",
                "CodeMeaning": "Anatomical Structure",
                "showAnatomy": false,
                "Type": [
                    {
                        "recommendedDisplayRGBValue": [ 0, 179, 92 ],
                        "CodeValue": "100022",
                        "CodingSchemeDesignator": "UBERON",
                        "CodeMeaning": "Maxillary Molar 1",
                        "3dSlicerLabel": "maxillary molar 1",
                        "Modifier": [
                            {
                                "recommendedDisplayRGBValue": [ 0, 179, 92 ],
                                "CodingSchemeDesignator": "SCT",
                                "CodeValue": "24028007",
                                "CodeMeaning": "Right",
                                "3dSlicerLabel": "right maxillary molar 1"
                            },
...
                      ]
                    }
                ]
            }
        ]
    }
}
...
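A subset file of this kind could also be derived from the full list programmatically instead of being written by hand. The following is a rough sketch under the assumption that the full terminology has the structure produced by the conversion script above; the function name and the placeholder code meanings are made up for the example.

```python
import copy
import json


def make_subset_terminology(full_terminology, context_name, code_values):
    """Copy a terminology, keep only the listed type code values, and rename the context."""
    subset = copy.deepcopy(full_terminology)
    subset["SegmentationCategoryTypeContextName"] = context_name
    wanted = set(code_values)
    for category in subset["SegmentationCodes"]["Category"]:
        category["Type"] = [t for t in category["Type"] if t["CodeValue"] in wanted]
    return subset


# Tiny stand-in for the full terminology (only the first code meaning is taken from this thread)
full = {
    "SegmentationCategoryTypeContextName": "Full list",
    "SegmentationCodes": {"Category": [{
        "CodingSchemeDesignator": "SCT", "CodeValue": "123037004",
        "CodeMeaning": "Anatomical Structure", "showAnatomy": False,
        "Type": [
            {"CodingSchemeDesignator": "UBERON", "CodeValue": "0011639",
             "CodeMeaning": "frontoparietal bone"},
            {"CodingSchemeDesignator": "UBERON", "CodeValue": "0004743",
             "CodeMeaning": "placeholder name"},
        ],
    }]},
}

subset = make_subset_terminology(full, "My project subset", ["0011639"])
subset_json = json.dumps(subset, indent=4)  # text to save as a .term.json file
```

The unique context name is what distinguishes the subset from the full list when both are loaded.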

It would be great if we could pull in community contributed standards like the one Jiami has done:
http://www.graysvertebrateanatomy.com/work/colorsofskullanatomy/

But these should be selectable - sometimes you’ll want all the bones to be the same and sometimes you’ll want to subdivide them.


Dear Andras, thank you so much for doing this. I downloaded the file you generated.

This is important to optimize, because if search takes longer than manually renaming a segment, it is unlikely that people will bother to use the terminologies.

At this point, everything seems to be listed under “anatomical structure”. We should divide this into other contexts, such as skeletal system, muscles, organs, etc., to make navigating the list easier and faster.
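One possible way to get such groups automatically would be to follow the ontology's relationships from a chosen parent term. The sketch below assumes the edges are available as a list of `sub`/`pred`/`obj` records with `is_a` marking subclass links (as I understand the UBERON JSON export to be organized; please verify against the actual file). The identifiers in the example are invented.

```python
from collections import defaultdict


def descendants(edges, root_id, predicates=("is_a",)):
    """Collect the ids of all nodes that reach root_id by following the given edge types."""
    children = defaultdict(set)
    for edge in edges:
        if edge["pred"] in predicates:
            children[edge["obj"]].add(edge["sub"])
    found, stack = set(), [root_id]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# Invented identifiers, only to show the traversal
example_edges = [
    {"sub": "EX:long_bone", "pred": "is_a", "obj": "EX:bone"},
    {"sub": "EX:femur", "pred": "is_a", "obj": "EX:long_bone"},
    {"sub": "EX:biceps", "pred": "is_a", "obj": "EX:muscle"},
]
bone_ids = descendants(example_edges, "EX:bone")
```

Each group of ids could then be written into its own category or its own terminology file.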

The only modifier we need is L/R, and for some of the skeletal terms Uberon seems to provide left and right as separate structures. So at this point I can’t think of any other use, but I will check with my colleagues.

We have about 20 visually distinct colors. I think the idea is to use these colors for structures that are close by or in articulation (e.g., cranial bones), and then recycle them for different anatomical regions. Assigning the same color to the patella and the scapula wouldn’t hurt the visual representation, as they are not near each other. Most of our segmentations are bones, so having a fixed color for bone types will not really work for us.
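Roughly, the recycling scheme could be expressed like this. It is only a sketch: the function name, the region grouping, and the three-color palette are placeholders, not our actual color list.

```python
def assign_palette_colors(structures_by_region, palette):
    """Give structures within a region distinct colors, restarting the palette for each region."""
    colors = {}
    for region, names in structures_by_region.items():
        if len(names) > len(palette):
            raise ValueError(f"{region}: more structures than palette colors")
        for name, rgb in zip(names, palette):
            colors[name] = list(rgb)
    return colors


# Placeholder palette and grouping
palette = [(230, 25, 75), (60, 180, 75), (0, 130, 200)]
regions = {
    "knee": ["patella", "femur", "tibia"],
    "shoulder": ["scapula", "clavicle"],
}
colors = assign_palette_colors(regions, palette)
```

Here the patella and the scapula end up with the same color, while the bones within each region stay distinct.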

For human anatomy you can look at the TA2 Viewer: ta2viewer.openanatomy.org

Best
Ron
