How to make proper landmark folder

hajime_osaki · October 17, 2020, 6:33pm

Hello.
I have put landmarks on some models created with the segmentation editor. I’m trying to load them into GPA, but they aren’t recognized when I select the landmark folder. I think it’s probably a storage format issue… Please let me know how to do it. Any help on this is greatly appreciated.

muratmaga · October 17, 2020, 6:39pm

Landmarks need to be saved in fcsv format. If you are using a recent version, the default is json. Make sure you change that to fcsv when you are saving, or use “Export as” from the Data module.

hajime_osaki · October 18, 2020, 2:31am

Thank you for the explanation! I changed the format to FCSV and saved, but the message “warning: fcsv file format only stores control point coordinates and a limited set of display properties” is displayed. Then, when I select the landmark folder in GPA, the fcsv files in the folder are still not recognized…

muratmaga · October 18, 2020, 3:40am

You can ignore that warning message. I am not entirely sure what you mean by “not recognized”.

What is not recognized? Are you getting an error message (Ctrl + 0)? You need to have multiple fcsv files (belonging to multiple specimens) in that folder for GPA to work. You may want to review the instructions for GPA here.
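A quick way to check the folder contents before running GPA is sketched below. It is only an illustration and not part of SlicerMorph: the function name `summarize_landmark_folder` is hypothetical, and the sketch simply groups files by extension and flags non-ASCII file names.

```python
from pathlib import Path


def summarize_landmark_folder(folder):
    """Group landmark files in a folder by format and flag non-ASCII names.

    Hypothetical helper for illustration only; it does not reproduce
    what the GPA module does internally.
    """
    fcsv_files, json_files, non_ascii = [], [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        name = path.name
        if name.lower().endswith(".fcsv"):
            fcsv_files.append(name)
        elif name.lower().endswith(".mrk.json"):
            json_files.append(name)
        else:
            continue  # ignore files that are not landmark files
        if not name.isascii():
            non_ascii.append(name)
    return {"fcsv": fcsv_files, "json": json_files, "non_ascii": non_ascii}
```

Calling `summarize_landmark_folder("path/to/landmarks")` on the folder selected in GPA should list at least two names under "fcsv" (one per specimen) if the files were saved in the expected format.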

hajime_osaki · October 21, 2020, 9:52am

Thanks a lot. I don’t know how to describe it…
This is the Python Interactor’s message when I try to run GPA.

muratmaga · October 21, 2020, 3:04pm

Again, this is only part of the log. In this window, pressing Ctrl+A selects all the text; you can then use Ctrl+C and Ctrl+V to copy and paste it. By the way, are you using non-English characters in your filenames?

@smrolfe can this be a non-ascii filename issue?

lassoan · October 21, 2020, 8:42pm

What do you see in the first few lines of the application log (menu: Help / Report a bug)? In particular, what is the content of the line starting with “Operating system”?

This is important because you need to use a recent Windows 10 version if you want to use unicode strings in text files or directory and file names that contain non-ASCII characters. If you cannot update to the latest Windows 10, then move your data into files and folders that do not have any special characters in their names.

smrolfe · October 22, 2020, 6:51pm

I think it may be due to a non-ascii character in the fcsv file. @hajime_osaki if you can share a sample landmark file in addition to the error log that would be helpful.

lassoan · October 23, 2020, 2:15am

@smrolfe you are right, the error occurs at the line “for row in datafile”, which means that either the file contains a string that does not use UTF-8 encoding (e.g., it may use some Japanese code page) or the application does not support the UTF-8 code page because it is not recent enough.
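To find out which lines of a landmark file are the problem, something like the following sketch could be used. The function name is hypothetical; the file is read as raw bytes so that the check itself cannot fail on decoding.

```python
def find_non_utf8_lines(filename):
    """Return (line_number, raw_bytes) for every line that is not valid UTF-8."""
    bad_lines = []
    # Read in binary mode so that no decoding happens implicitly
    with open(filename, "rb") as f:
        for line_number, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                bad_lines.append((line_number, raw))
    return bad_lines
```

An empty result means the whole file decodes as UTF-8; otherwise the reported lines (typically those with labels or descriptions typed in a local code page) are the ones to re-save or rename.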

This is yet another example of why we need to get rid of this old .fcsv format. This error could not have happened with the new .json file format:

Text encoding is specified in json standard (it is required to be UTF-8), therefore it is always known how to decode the file content.
The 10-line handcrafted Python code that parses the csv file is relatively complex and very fragile. In contrast, parsing the json file is a single command and you can get any content from the file incredibly easily.

Read and parse markup json file:

import json

# Markups json files are required to be UTF-8, so state the encoding explicitly
with open('path/to/my.mrk.json', encoding='utf-8') as f:
    json_data = json.load(f)

The parser contains the comment “Imports the landmarks from a .fcsv file. Does not import sample if a landmark is -1000”. I think it is obvious just how bad this is. We should do better. And fortunately, we can: the markup json file includes a positionStatus property for each control point, which is accessible simply as controlPoint["positionStatus"].

Print position and status of each control point:

for controlPoint in json_data["markups"][0]["controlPoints"]:
  print("position: {0}, defined: {1}".format(controlPoint["position"], controlPoint["positionStatus"] == "defined"))
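Building on this, separating placed landmarks from missing ones could look like the sketch below. It uses a small in-memory example instead of a real file, the function name is hypothetical, and it assumes that a control point without a positionStatus entry should be treated as "defined".

```python
import json

# Small in-memory example; a real file would be loaded with json.load()
example_text = """
{"markups": [{"type": "Fiducial", "coordinateSystem": "LPS", "controlPoints": [
    {"label": "F-1", "position": [1.0, 2.0, 3.0], "positionStatus": "defined"},
    {"label": "F-2", "position": [0.0, 0.0, 0.0], "positionStatus": "undefined"},
    {"label": "F-3", "position": [4.0, 5.0, 6.0], "positionStatus": "defined"}
]}]}
"""


def split_by_status(json_data):
    """Return ({label: position} for defined points, [labels of the others])."""
    defined, missing = {}, []
    for control_point in json_data["markups"][0]["controlPoints"]:
        # Assumption: a point without positionStatus counts as defined
        if control_point.get("positionStatus", "defined") == "defined":
            defined[control_point["label"]] = control_point["position"]
        else:
            missing.append(control_point["label"])
    return defined, missing


defined, missing = split_by_status(json.loads(example_text))
print(defined)  # {'F-1': [1.0, 2.0, 3.0], 'F-3': [4.0, 5.0, 6.0]}
print(missing)  # ['F-2']
```

The landmark order and count stay intact in the file, and no magic coordinate value such as -1000 is needed to mark a missing point.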

@muratmaga @smrolfe Please move away from fragile and inflexible legacy formats, especially because we now support a modern and sustainable file format.

muratmaga · October 23, 2020, 3:43am

We will move to json, but not immediately.

For one, we (as in my lab) have thousands of datasets that are already collected and archived, and converting them takes time. I remember when Slicer 4 was released in 2012 and the decision was made that each fiducial file would be saved as a single acsv file; it took years to stabilize around fcsv, and we had to do a similar conversion.

As for designating -1000 as a missing landmark, that’s a legacy of Slicer not being able to generate a blank fiducial. That has been a request of mine since way before we started the SlicerMorph project, and we still can’t do it. Since we have to maintain landmark order and numbers across datasets, we had to invent a way to encode this, and we have to make that known. Switching to JSON will not fix this problem.

Landmarks are a very important part of morphometric analysis. FCSV is not perfect, but it is easy, and labs (not just mine) that use Slicer for data collection have been using it for a while. Forcing them to use JSON will make their historical data inaccessible to the GPA module.

lassoan · October 23, 2020, 4:19am

Many decisions in Slicer development turn out well, while some others don’t. We are not always smart enough and/or not always lucky - it is hard to predict what technologies become mainstream in 5-10 years.

We aim to preserve Slicer’s backward compatibility with hardware, software, and data formats for at least 5 years (so you don’t have to upgrade, if you don’t want to), which is consistent with industry standard for long-term support timeline (typically 3-5 years).

We have implemented all the infrastructure for this, and the only thing missing is exposing it on the GUI - about a few days of work. I thought your team was working on it and assumed that it was delayed because it was not that important. If you need help with this then let me know.

If we could demonstrate clear advantages of using a richer and more standard file format then I think users would transition voluntarily. Probably some more features (landmark placement templates with undefined point positions, etc.), better documentation, training, and examples (e.g., to show how easy it is to create data tables from json files) would help.

What is important is to provide as good support for json as for fcsv, in all modules. There should not be modules that can only accept fcsv format and not json.

I would also add standard csv export (single-line header, columns in first row) option for better compatibility with table-based analysis. This should also help in reducing usage of the proprietary fcsv format.

muratmaga · October 23, 2020, 4:37am

I am not saying json would be a poor decision. I think its benefits, particularly for measurement-type markups, will outweigh its complications in the long run. It will happen, but it will take time, as there will be inertia among groups due to concerns similar to mine. When the switch to json was first raised on Discourse earlier this year, I did probe the morphometrics community through the listserv, and there was pushback from other groups that use Slicer for types of research similar to ours. That’s mostly due to lack of familiarity with the format, and to JSON being quite verbose. Saving 3 control points in fcsv takes 6 lines; the same thing in json takes 74. To people who have been looking at flat files for a long time, that nested structure does appear confusing at first, and it seems like an unnecessarily complex format for extracting 9 coordinates.

That is far too short a time for the types of research (natural history) that our community does.

lassoan · October 23, 2020, 1:31pm

I think the main issue is that it is new and people just don’t know how much simpler and more efficient it actually is to work with json files than with csv. The main idea is that you can store all data in json (any number of markups, any metadata) and generate a table view from any parts of it, using a single line of code.

Yes, json is verbose, but it is not that bad. Many of the elements are optional (see schema). A minimal file consists of 1-2 lines of header, one row per fiducial, and one closing line, something like this:

{"@schema": "https://raw.githubusercontent.com/slicer/slicer/master/Modules/Loadable/Markups/Resources/Schema/markups-schema-v1.0.0.json#",
"markups": [{"type": "Fiducial", "coordinateSystem": "LPS", "controlPoints": [
    { "label": "F-1", "position": [-53.388409961685827, -73.33572796934868, 0.0] },
    { "label": "F-2", "position": [49.8682950191571, -88.58955938697324, 0.0] },
    { "label": "F-3", "position": [-25.22749042145594, 59.255268199233729, 0.0] }
]}]}

Manual editing is not much worse either, because all modern text editors help json editing with features such as syntax highlighting, automatic formatting, validation, etc.

Running analysis on a json file should be as simple as on a csv file, because you can read it into a dataframe using a single line of code. For example, to get a table of control point labels and positions using pandas (pip_install('pandas'); import pandas as pd):

controlPointsTable = pd.DataFrame.from_dict(pd.read_json(input_json_filename)['markups'][0]['controlPoints'])

Result:

>>> controlPointsTable
  label                                        position
0   F-1  [-53.388409961685824, -73.33572796934868, 0.0]
1   F-2     [49.8682950191571, -88.58955938697324, 0.0]
2   F-3   [-25.22749042145594, 59.255268199233726, 0.0]

Splitting the position vector column into separate x, y, z columns and writing the table to csv takes 3 lines:

controlPointsTable[['x','y','z']] = pd.DataFrame(controlPointsTable['position'].to_list())
del controlPointsTable['position']
controlPointsTable.to_csv(output_csv_filename)

Resulting csv file:

,label,x,y,z
0,F-1,-53.388409961685824,-73.33572796934868,0.0
1,F-2,49.8682950191571,-88.58955938697324,0.0
2,F-3,-25.22749042145594,59.255268199233726,0.0
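If pandas is not available, roughly the same table can be produced with the Python standard library alone. The sketch below is only an illustration with a hypothetical function name; unlike the pandas version, it does not write an index column.

```python
import csv
import io


def write_control_points_csv(json_data, out_file):
    """Write label,x,y,z rows for the first markup in a parsed markups json."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["label", "x", "y", "z"])
    for control_point in json_data["markups"][0]["controlPoints"]:
        x, y, z = control_point["position"]
        writer.writerow([control_point["label"], x, y, z])


# In-memory demonstration; for a real file pass open(name, "w", newline="")
example = {"markups": [{"controlPoints": [
    {"label": "F-1", "position": [1.5, -2.0, 0.0]},
]}]}
buffer = io.StringIO()
write_control_points_csv(example, buffer)
print(buffer.getvalue())
# label,x,y,z
# F-1,1.5,-2.0,0.0
```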

hherhold · October 23, 2020, 1:45pm

For what it’s worth, I had planned on taking half a day to convert my code that uses fcsv to using json. It took me about 15 minutes.

hherhold · October 23, 2020, 1:47pm

Oh, and that includes keeping in the old code - basically I check for a json file, if it exists, I use that, otherwise it parses the old fcsv file. (This is a script that runs through a hundred scans or so to tally what fiducials/markups have been placed on what scan, and where.)
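A rough sketch of that kind of fallback is shown below. It is not the actual script: the function name is hypothetical, the fcsv column order (id,x,y,z,ow,ox,oy,oz,vis,sel,lock,label,...) is assumed from the usual fcsv header, and coordinate system differences between the two formats are ignored.

```python
import csv
import json
from pathlib import Path


def read_landmarks(base_path):
    """Read (label, (x, y, z)) tuples, preferring .mrk.json over .fcsv.

    base_path is the file path without extension (hypothetical convention).
    """
    json_path = Path(str(base_path) + ".mrk.json")
    fcsv_path = Path(str(base_path) + ".fcsv")
    if json_path.exists():
        with open(json_path, encoding="utf-8") as f:
            data = json.load(f)
        return [(cp["label"], tuple(cp["position"]))
                for cp in data["markups"][0]["controlPoints"]]
    # Fall back to the legacy format; header lines start with '#'
    landmarks = []
    with open(fcsv_path, encoding="utf-8", newline="") as f:
        rows = csv.reader(line for line in f if not line.startswith("#"))
        for row in rows:
            if len(row) < 12:
                continue  # skip blank or truncated rows
            # Assumed columns: 1..3 are x, y, z and 11 is the label
            position = (float(row[1]), float(row[2]), float(row[3]))
            landmarks.append((row[11], position))
    return landmarks
```

Keeping both branches behind one function means the rest of a script does not need to know which format a given scan was saved in.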

muratmaga · October 23, 2020, 3:43pm

We maintain SlicerMorph primarily for 3D landmark digitization and for basic shape analysis and visualization. Most of our users end up using R shape analysis packages for the more complex analyses they want to do. It looks like it is going to be fairly straightforward to read coordinates directly from JSON as a data matrix, like we currently do with fcsv. We will show it in our next user check-in (this week). But again, it will take time.

Does this mean you guys are planning to switch to saving all markups generated in a scene to a single json file? Currently every node is saved as a distinct file. I would prefer keeping the current behavior.

Also I still do not see a units tag in the json.

muratmaga · October 23, 2020, 3:44pm

This is not a big deal for your own data, but it becomes complicated when you have collaborators, because everyone has to update their workflow at the same time.

hherhold · October 23, 2020, 3:54pm

Yeah, I’m pretty spoiled with a user base of… one. I didn’t mean to sound insensitive to issues of keeping many others happy.

lassoan · October 23, 2020, 3:57pm

We need to keep one-to-one mapping between MRML node and data file for scene saving, so we don’t plan to change the current save/load behavior. However, we can export an entire folder of markups into a single file, and import an entire group of markups from a single file (don’t try it, not implemented yet, but can be implemented anytime, with a few-hour effort).

Saving the units has not been implemented yet. I’ve added a ticket to make sure we don’t forget about it: Save length unit in markups files · Issue #5261 · Slicer/Slicer · GitHub

Again, the beauty of json is that we can make such changes in a clean and backward compatible way: we add a new “length unit” property, describe its type, valid values, etc. and the default value, so that we know how to interpret all those files which were created before this field was introduced. Such mechanisms can help with keeping all older mrk.json files valid and well defined, even as the format evolves. We can of course still make bad decisions and need to live with the consequences, but at least we have a number of mechanisms to properly handle some unavoidable mistakes.
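From the reader's side, such a default could be handled as in the sketch below. This is purely illustrative: the property name "lengthUnit" and the default value "mm" are assumptions, since the property does not exist in the format yet.

```python
def get_length_unit(markup, default="mm"):
    """Return the length unit of a markup, falling back to a default.

    "lengthUnit" is a hypothetical property name and "mm" an assumed default.
    """
    return markup.get("lengthUnit", default)


# A file written before the property was introduced has no such key
old_markup = {"type": "Fiducial", "controlPoints": []}
new_markup = {"type": "Fiducial", "lengthUnit": "um", "controlPoints": []}

print(get_length_unit(old_markup))  # mm
print(get_length_unit(new_markup))  # um
```

Older files keep working unchanged because the missing key is interpreted through the documented default rather than treated as an error.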

smrolfe · October 23, 2020, 6:52pm

Thanks @lassoan. We still plan to work on this although the project got moved back due to other deadlines.
