I know there have been previous discussions about using the coverage Python module with scripted modules (How to generate a test coverage report for scripted module?), but I was wondering whether there has been any additional thought on this.
I'm not aware of Slicer ever having quantified what the tests/selfTests/etc. actually cover. I'm sure other Slicer developers would agree that quantifying coverage would be valuable, both to see how well Slicer testing is actually doing and to find places in the code that are rarely executed.
@lassoan, @pieper, @jcfr Has the Slicer community ever gauged how well the core tests cover the code in a quantifiable metric?
It appears that @Alex_Vergara began adding code coverage to SlicerOpenDose. @Alex_Vergara, how well did that work out for your Slicer extension? Did you run into issues getting accurate coverage metrics?
Based on recent conversations about the Slicer Extensions Index (how to manage the growing list of submitted extensions and how to provide more timely reviews), code coverage could be used as one element of an extension score card, similar to ITK's remote module compliance level. With the ever-growing list of Slicer extensions, we could begin to rank extensions by compliance: an extension with no tests would get a low score, an extension with a single test a better score, and an extension whose tests demonstrate high code coverage an even higher score. Currently the Slicer Extensions Index relies on a self-score, which is rarely updated.
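To make the idea concrete, a toy scoring function along these lines could look like the sketch below. The tier values and the 0.8 coverage cutoff are purely illustrative assumptions on my part, not any agreed-upon policy:

```python
from typing import Optional

def extension_score(num_tests: int, line_coverage: Optional[float] = None) -> int:
    """Toy compliance tiers: 0 = no tests, 1 = has tests,
    2 = tests with demonstrated high coverage.
    The 0.8 threshold is an assumed, illustrative cutoff."""
    if num_tests == 0:
        return 0
    if line_coverage is not None and line_coverage >= 0.8:
        return 2
    return 1

print(extension_score(0))          # no tests -> 0
print(extension_score(1))          # a single test, coverage unknown -> 1
print(extension_score(12, 0.85))   # tests with measured high coverage -> 2
```

A real score card would presumably fold in more signals (documentation, CI status, recent maintenance), but even a coarse tiering like this would already separate untested extensions from well-tested ones.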
@lassoan, @pieper, @jcfr Do any of you know of past efforts to determine how well Slicer testing covers the code? Or of other groups that have successfully used quantifiable metrics for how well their testing covers their extensions?
I can’t think of anything other than what’s in the earlier posts. It would definitely be nice to have.
how well Slicer testing covered the code?
There are a few scenarios to consider:
1. an independent Python package (e.g. utility code)
2. a Python scripted module
3. C++ modules
Coverage for (1) and (3) can be collected by leveraging coverage and gcov, respectively.
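For scenario (1), coverage.py is the usual tool. As a dependency-free illustration of what line coverage actually measures, Python's stdlib trace module can record which lines a test executes; the segment function below is a made-up stand-in for module code under test:

```python
import trace

def segment(threshold):
    # Stand-in for module logic under test.
    if threshold > 0:
        return "above"
    return "below"  # this branch is never exercised by the call below

# count=True records per-line hit counts while the function runs.
tracer = trace.Trace(count=True, trace=False)
tracer.runfunc(segment, 5)  # exercises only the "above" branch

counts = tracer.results().counts  # maps (filename, lineno) -> hit count
here = segment.__code__.co_filename
hit = sorted(lineno for (fname, lineno) in counts if fname == here)

first = segment.__code__.co_firstlineno
missed_else_branch = (first + 4) not in hit  # the 'return "below"' line
```

coverage.py does essentially this (far more efficiently), then turns the per-file line sets into the percentage reports that could feed a dashboard.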
Both may also be reported to CDash (see here).
The challenge will be to instrument the infrastructure and then merge coverage results that involve both Python and C++ code (e.g. scripted modules, SubjectHierarchy/DICOM/SegmentEditor plugins, …).
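The merge step itself is conceptually simple once both tools' outputs are reduced to a common form. Assuming each report has been converted to a mapping from file name to the set of executed line numbers (an assumed intermediate format, not something either tool emits directly), a sketch of the union:

```python
def merge_coverage(*reports):
    """Union per-file executed-line sets from several coverage runs.
    Each report is a dict mapping filename -> set of executed line numbers."""
    merged = {}
    for report in reports:
        for filename, lines in report.items():
            merged.setdefault(filename, set()).update(lines)
    return merged

# Hypothetical results from a Python (coverage.py) run and a C++ (gcov) run
# of the same test; file names and line numbers are made up.
python_run = {"MyModule.py": {10, 11, 14}}
cpp_run = {"vtkMyFilter.cxx": {42, 43}, "MyModule.py": {10, 12}}
merged = merge_coverage(python_run, cpp_run)
```

The hard part in practice is not this union but the instrumentation: getting both tools to run against the same test invocation and normalizing their file paths so the sets line up.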
It would be interesting to get some coverage statistics.
That said, at the application level I would prefer that user requirements (what must work correctly for users) drive our testing efforts, rather than coverage percentages.
At the library level we may not have specific user requirements, so coverage testing may be much more useful there. We could start by trying to add coverage testing to vtkAddon and MRML Core.
Hi, coverage reports for my extension are only produced if I run the tests manually within Slicer. If I run the tests inside a Docker container without a GUI, the reports are not produced.
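One possible cause worth checking (an assumption on my part, not a confirmed diagnosis) is that the report-writing step never runs in the headless case: if the container exits or Slicer quits programmatically before the coverage object saves its data, no report appears. A defensive pattern is to flush results in a finally block, sketched here with the stdlib trace module standing in for coverage.py's start()/stop()/save():

```python
import trace

def run_selftests():
    # Hypothetical placeholder for invoking the extension's self-tests.
    return 2 + 2

tracer = trace.Trace(count=True, trace=False)
try:
    result = tracer.runfunc(run_selftests)
finally:
    # Always record results, even if the headless run raises or exits early.
    # With coverage.py this step would be cov.stop(); cov.save().
    lines_recorded = len(tracer.results().counts)
```

If the report does get flushed and still disappears, the next thing I would check is whether the container's working directory (where the data file is written) persists after the run.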