Here is the change:
It appears it doesn’t breaks anything in the build or while using Slicer.
It was implemented according to this (and it is said to have a faster convergence):
@pieper I would appreciate your opinion. VerySleepy said that Slicer used a lot of time on this function.
Nice work @mau_igna_06 That’s just the way it is meant to be done
@mau_igna_06 Could you submit a merge request against the upstream VTK project and reference the link here ?
While working on the merge request, considering including details like these ones in the commit description:
- performance improvement benchmarks
- tradeoff & limitation of the new approach (if any)
I did a pseudo benchmark with Slicer (when I have more free time I’ll create a standalone VTK app to test this) by building it with the “original” and “modified” version of vtkMath::JacobiN.
The pseudo benchmark test was set the scene with 500 points in one MarkupsFiducial and to turn on the auto rotate 3D view for 100 seconds, and by analyzing the call stack with VerySleepy.
These are the results:
As you can see the exclusive time of execution reduced from 0.21seconds to 0.14seconds (and it cannot be seen in the pictures but the execution count of these functions was approx 750 and 850 correspondingly).
These agrees with what is said on the book I linked in my first post:
GaussSeidel 5 iterations have the same accuracy than Jacobi 7 iterations.
Also, from this source:"Yang, King H. (2018). Basic Finite Element Method as Applied to Injury Biomechanics || Stepping Through Finite Element Analysis. , (), 281–308. doi:10.1016/B978-0-12-809831-8.00007-6 ": you can see the tables of the solution of a 3x3 linear system with Jacobi and GaussSeidel, my yellow highlight shows when solution reaches 1% accuracy:
The book says the only reason to give Jacobi preference from GaussSeidel is that Jacobi can be parallelized, so we could have it mind for some GPU implementation on the future. Because we are executing in CPU we should use GaussSeidel.
There is no final word yet because I have to create a pure VTK example to test this out, but I’ll do it, probably, this next weekend
Thank you for looking into this!
We do performance profiling from time to time and whenever any specific performance problem comes up, but I don’t recall any performance bottlenecks related to
vtkMath::JacobiN. Do you have a use case where this method plays a significant role?
Note that VerySleepy is a very simple, convenient tool that can give hints about performance bottlenecks, but it is not very specific or reliable (it may give misleading results, so never rely on it as a single source). I would recommend to use Visual Studio’s built in performance profiler in addition to VerySleepy.
I’ve also found profilers are great for pinpointing very signficiant bottlenecks, but once you fixed those then they start to give misleading results. Always check elapsed time for a complete end-user function to see if any performane optimization leads to actually faster performance. For some quantitive testing, you can use the FPS marker in 3D views, Performance Tests module, or Node Modified Statistics module in DeveloperTools extension.
Make sure to do performance profiling on an application built in RelWithDebInfo mode. Debug mode builds have very different performance bottlenecks; and Release mode without debug information does not give you specific information.
Even if the proposed change does not to lead to any user-perceivable performance improvements (performance would only improve if this method was used in a hot loop), it may still make sense to proceed with it, as it does not seem to have any disadvantages (it is not more complicated or less robust or less accurate as the current implementation).
I finally found a book that talks about calculating eigenvalues and eigenvectors with iterative methods
According to it:
one Gauss-Seidel step is worth two Jacobi steps
But now I have to go deep into implementation, it isn’t so simple as it appeared in the beginning but I’ll work more on it on the weekend
Before digging into this, I would recommend to verify that this will lead to perceivable speed improvement for users. For example, see if anything gets visibly faster if you manually limit the maximum number of iterations. If the method is only responsible for 0.1% of the computation time then even a 10x speed improvement might not worth a lot of development time investment. The same effort might be better spent on more impactful changes.