py_SubjectHierarchyGenericSelfTest started failing due to a failure of loading a scene. Digging into it, this error message (Error parsing XML in stream at line 38, column 151, byte index 10406: not well-formed (invalid token)) points to this line:
Specifically the quotation mark before “CodeMeaning”. As this attribute is added when loading a volume from DICOM, I suspect that now all saved scenes are invalid that contain a DICOM volume. It would be good either to revert this commit https://github.com/Slicer/Slicer/commit/9df49830897e2d0d6e458e7390eec7807a817b41
while fixing this, or fixing this quickly.
I’ll take care of this by this information into a member variable. We could encode the attribute to be able to store special characters, such as ", but we’ll use the unit for many things, so it’s better not to store it in just a generic metadata.
Scene corruption is a serious error that our development process should have captured. The pull request was open for a few days but just by looking at the code nobody noticed the error. SH test did capture the error (see failed test this morning, it even showed the relevant error message on the dashboard “Error parsing XML in stream at line 38, column 151, byte index 10404: not well-formed (invalid token)”) and the dashboard is normally green (we don’t have any failing test), so it should not be too difficult to spot new errors. @jcfr do you know why the tests were not executed automatically by the pull request (https://github.com/Slicer/Slicer/pull/750)? @Fedorov do you remember if you’ve run the automatic tests manually and if you saw the new test failure?
Do you think it would be worth sanitizing an attribute when setting (and restoring it when getting)? Users might add attributes with quotation marks and they might wonder what broke. Not sure if simply escaping the quotation mark would solve this, but if it does, then it’s a very simple solution.
Yes, currently special characters in any node properties (name, description, attributes, …) can lead to invalid scenes - see https://issues.slicer.org/view.php?id=3406. There is a function in MRML node already for encoding/decoding, we could probably use that in read/write XML of MRML node to fix this.