21 Oct

SfN 2015 Poster

Acceptable values of similarity coefficients in neuroanatomical labeling in MRI

Andrew Worth1, Jason Tourville2

1 Neuromorphometrics, Inc., Somerville, MA
2 Dept. of Speech, Language, and Hearing Sciences, Boston University, Boston, MA

Abstract (download poster, 4.3MB)

The delineation of neuroanatomy in magnetic resonance brain scans (known as segmentation, labeling or tracing) is commonly validated by comparison with a manually created gold standard.  This is done using spatial overlap statistics such as the Dice index and Jaccard coefficient.  We examine published similarity values for particular anatomical regions to determine what is generally acceptable and compare this with results obtained by manually labeling repeat scans of 20 subjects.  Each subject was scanned twice, with some time between the two sessions, and the two scans were labeled independently.
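For readers unfamiliar with the two statistics: for a region labeled as voxel set A in one scan and voxel set B in the other, the Dice index is 2|A∩B| / (|A| + |B|) and the Jaccard coefficient is |A∩B| / |A∪B|. The Python sketch below illustrates the per-region computation. It is not the software used for the poster. The function name `overlap_by_label`, the flat-list representation of a label volume, and the use of 0 as the background label are assumptions made for illustration. It also assumes the two labelings have already been brought onto the same voxel grid, a step the abstract does not describe.

```python
from collections import Counter


def overlap_by_label(labels_a, labels_b, background=0):
    """Return {label: (dice, jaccard)} for two equally sized label sequences.

    Each sequence holds one integer label per voxel, in the same voxel order.
    The background label (assumed here to be 0) is skipped.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same voxels")

    size_a = Counter()  # voxels per label in the first labeling
    size_b = Counter()  # voxels per label in the second labeling
    both = Counter()    # voxels given the same label in both labelings

    for a, b in zip(labels_a, labels_b):
        size_a[a] += 1
        size_b[b] += 1
        if a == b:
            both[a] += 1

    results = {}
    for label in sorted(set(size_a) | set(size_b)):
        if label == background:
            continue
        intersection = both[label]
        total = size_a[label] + size_b[label]
        union = total - intersection
        dice = 2 * intersection / total
        jaccard = intersection / union
        results[label] = (dice, jaccard)
    return results


# Toy example: eight voxels, two regions (labels 1 and 2), background 0.
scan_1 = [0, 1, 1, 1, 2, 2, 0, 0]
scan_2 = [0, 1, 1, 2, 2, 2, 0, 0]
toy_result = overlap_by_label(scan_1, scan_2)
```

In the toy example each region shares two voxels between the labelings and covers three voxels in one and two in the other, so both regions have a Dice index of 0.8 and a Jaccard coefficient of 2/3. The two measures are monotonically related by J = D / (2 − D), so they rank regions identically, but Jaccard values are always the lower of the two. This matters when comparing published numbers that report one statistic or the other.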

MRI brain scans were obtained from the “reliability” set in the Open Access Series of Imaging Studies (OASIS).  A single highly trained technician used custom software “NVM” to create closed borders around regions with isointensity contours and editing.  Labeling was performed using protocols that precisely define the landmarks, borders and methods for delineating anatomical regions.  “SegMentor” scripts helped ensure accuracy and documented adherence to the labeling protocols by embedding them into the software.  A total of 72 regions comprehensively covered the brain, including cerebral and cerebellar gray and white matter, ventricles, brain stem, accumbens, amygdala, caudate, hippocampus, pallidum, putamen, and thalamus according to the “General Segmentation” protocol defined by the MGH Center for Morphometric Analysis, and the cortex was parcellated into 51 units based on 36 sulci according to the BrainColor protocol.  An anatomist with years of labeling experience checked all results and the technician made corrections as necessary.

We demonstrate that manual results can have the best possible overlap metrics, but only after spending a sufficient (and many would say excessive) amount of effort.  The similarity coefficients we present thus represent a kind of upper bound on what is currently attainable.  Similarity values occur over a range because specific anatomical regions have differing amounts of anatomical variation and are affected differently by scanning artifacts: some regions are harder to label than others because the boundaries are less apparent.

Manual methods have the advantage that anything that can be seen by the human visual system can be labeled.  But manual labeling is tedious, requires considerable expertise, and is prone to errors due to fatigue and to variation in how the labeling protocol is applied.  Automated methods require less human time and cost but fail on unfamiliar anatomy and need to be checked and corrected.  We conclude with an argument as to why an optimal system involves interactive automation: a combination of algorithms along with manual checks and corrections.

Nov. 19, 2015 Update: labels were edited again and all overlap numbers are given here.
