Issue1124

Title: Improve SoPlex performance
Priority: wish
Status: resolved
Superseder:
Nosy List: florian, jendrik, malte, silvan
Assigned To: florian
Keywords:
Optional summary:

Created on 2023-10-10.18:04:36 by florian, last changed by florian.

Messages
msg12029 (view) Author: florian Date: 2026-02-23.12:01:27
I added a message to issue1199 to keep an eye on the performance of SoPlex. I'd say if it improves by at least 13 tasks in the operator-counting configuration, we can consider this issue resolved (even though the improvement could have come from other changes between the versions).

I suggest we close this issue and reopen it if the performance in issue1199 is still bad and someone wants to follow up on it.
msg12027 (view) Author: malte Date: 2026-02-23.11:55:38
Now that I reread all messages of this issue more thoroughly, this makes sense. 

However, the question is whether we want to actively pursue this. It's been a bit over two years, which perhaps isn't that long compared to other issues, but at the same time for a "wish" issue, I think we should either commit to doing it or not keep it open.

Do you plan to work on this, or find someone to work on this, in the next months?
msg12026 (view) Author: florian Date: 2026-02-23.11:51:07
We created this issue after getting rid of OSI, because performance dropped and we wanted to see why. So I wouldn't say the issue is no longer relevant just because we no longer use OSI. However, the picture could have changed completely since the current SoPlex version is now 8.0.1. issue1199 is also testing old and new SoPlex versions, so we will probably see in those results whether the update to newer SoPlex versions brought back the 13 tasks we lost when dropping OSI.
msg11932 (view) Author: malte Date: 2025-12-25.14:51:09
@Florian: now that we don't use OSI any more, I assume this one is resolved. Is this right?
msg11462 (view) Author: florian Date: 2023-10-17.12:12:36
I ran experiments with diverse potential heuristics, an operator-counting
heuristic, and the landmark CP heuristic. For these heuristics, I'm comparing 5 versions:
* the old code running OSI and SoPlex 3.1.1,
* the current code running SoPlex 6.0.3*,
* a version running SoPlex 6.0.4* with the fix that avoids storing/clearing temporary information,
* the version above that additionally stores the basis after every solve and restores it afterwards, and
* one version that stores the basis and additionally switches off the scaler.

For the last two versions, the assumption was that whenever the dimensions of the LP match the previous solve, it is OK to recover the previous basis. Unfortunately, the landmark configuration produced segfaults and different expansion numbers in those cases, so this assumption may not hold. Switching off the scaler might still help, but we cannot evaluate it with this experiment if the basis restoration is broken.

For diverse potential heuristics, we already got a big performance improvement from switching off OSI (+24 tasks); the remaining changes have only a minor effect in this config. Coverage and time score actually go down by ~2 with the fix in SoPlex. This doesn't make sense for the fix alone, because it only avoids unnecessary effort, but there are other changes going from 6.0.3 to 6.0.4. This config is in general somewhat hard to predict because it only solves LPs at the beginning and there is a non-deterministic choice of which optimal LP solution is used as a potential function. The number of expansions and the search behavior thus differ between versions, which makes them harder to compare. Overall, I'd say with this config, both of the non-OSI versions are fine.
Report: https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1124/data/issue1124-v2-eval/report-diverse-potentials.html

For the landmark heuristic, I don't have results for the OSI version because of the name change of the heuristic. The SoPlex fix that avoids storing unnecessary data shows only a marginal improvement: coverage goes up by 2, but some domains gain one solved task and others lose one. The time score is roughly the same. The memory score improves, but this configuration doesn't normally run out of memory. Overall, this looks like noise to me.
Report: https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1124/data/issue1124-v2-eval/report-lm-cp.html

The operator-counting heuristic was the starting point for this issue, since we lost 13 tasks when switching off OSI. While memory performance improved a lot, this made little difference because this config also rarely runs out of memory. The lost tasks all timed out. With the SoPlex fix, things speed up somewhat and we recover 3 of the lost tasks. Storing the basis seemed to be OK here (same number of expansions everywhere) and increased the overall time score by almost 7 (unfortunately without an effect on coverage). Additionally switching off the scaler also has a positive effect on runtime (+2 total time score, +2 coverage). Interestingly, the time score of the final config (665.33) is higher than that of the OSI config (661.36), but its number of timeouts is also higher. My guess is that the impact of restoring the basis is very dependent on the domain. For example, it helps a lot in Miconic, where it increases the time score by 4 without solving additional tasks.
Report: https://ai.dmi.unibas.ch/_experiments/ai/downward/issue1124/data/issue1124-v2-eval/report-seq-lmcut.html
Plot showing the effect on runtime in Miconic:
https://ai.dmi.unibas.ch/_visualizer/?xattribute=total_time&yattribute=total_time&properties_file=https%3A%2F%2Fai.dmi.unibas.ch%2F_experiments%2Fai%2Fdownward%2Fissue1124%2Fdata%2Fissue1124-v2-eval%2Fproperties.tar.xz&groupby=domain&xsize=1000&ysize=1000&entries_list=soplex604x-basis-noscale-seq-lmcut+osi-seq-lmcut+seq-lmcut&y_range=%5B0.4580125258299728%2C+11.0%5D&x_range=%5B0.009000000000000001%2C+1100.0%5D&relative=true
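
To illustrate how a summed time score can be higher while more tasks time out, here is a minimal sketch. It assumes a per-task score that is 1 for runtimes up to 1 second, 0 for unsolved tasks, and decreases logarithmically in between. The function name, the 1-second and 1800-second bounds, and all runtimes are assumptions for illustration; none of them are taken from the reports above.

```python
import math


def time_score(runtime, min_time=1.0, time_limit=1800.0):
    """Log-scaled score in [0, 1]; runtime=None means the task was not solved.

    The bounds are assumed values, not taken from the experiment setup.
    """
    if runtime is None:
        return 0.0
    clamped = min(max(runtime, min_time), time_limit)
    return 1.0 - math.log(clamped / min_time) / math.log(time_limit / min_time)


def summarize(runtimes):
    """Return (coverage, summed time score) for a list of per-task runtimes."""
    coverage = sum(r is not None for r in runtimes)
    total_score = sum(time_score(r) for r in runtimes)
    return coverage, total_score


# Hypothetical config that solves every task, but slowly.
slow_but_complete = [600.0, 600.0, 600.0, 600.0]
# Hypothetical config that is much faster on three tasks and times out on one.
fast_with_timeout = [60.0, 60.0, 60.0, None]

if __name__ == "__main__":
    print("slow but complete:", summarize(slow_but_complete))
    print("fast with timeout:", summarize(fast_with_timeout))
```

Under these assumptions, large speedups on tasks that are solved anyway can outweigh the score lost on a task that times out, which would be consistent with the Miconic observation above.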
msg11448 (view) Author: florian Date: 2023-10-10.18:04:36
When profiling SoPlex in issue1076, we noticed that its performance degraded compared to the version using OSI. Together with the OSI developers, I traced this back to a couple of issues:

* The new code spends a lot of time in the "scaler" (a way to multiply all coefficients by a constant to get better numerical stability). This can be switched off and improved performance in the instance I used for profiling.

* SoPlex cleared some temporary data in cases where this was not necessary. This is fixed by a pull request (https://github.com/scipopt/soplex/pull/20) that is merged but not released yet.

* OSI (or the old SoPlex version) seems to warm-start in cases where the new SoPlex cannot do this. In particular, this happened in an operator-counting configuration with landmark constraints. From one solve to the next, bounds are changed (not an issue for warm starts), but we also remove all landmark constraints and add new ones. In the example I looked at, these landmark constraints often didn't change (so we added the same constraints we had just removed). The new SoPlex invalidates the current solution and starts from scratch, while the old one could solve the LP with 0 simplex iterations (i.e., the old basis immediately led to an optimal solution). A quick fix for this is to actively store the basis after every solve and restore it before the next solve if the dimensions of the problem still match. This also improved performance in the instance where I profiled this.
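
The store-and-restore quick fix from the last point can be sketched as follows. This is a minimal illustration and not the actual patch: `ToySolver` is a hypothetical stand-in that only mimics the behavior described above (changing constraints discards the internal basis), and none of its methods are SoPlex API calls. As the experiments in msg11462 suggest, matching dimensions alone may not be enough to guarantee that the old basis is still valid.

```python
class ToySolver:
    """Hypothetical LP solver stand-in; not the SoPlex interface."""

    def __init__(self, num_rows, num_cols):
        self.num_rows = num_rows
        self.num_cols = num_cols
        self.basis = None  # None means the next solve starts from scratch.

    def replace_constraints(self, num_rows):
        # Removing and re-adding constraints invalidates the internal basis,
        # even if the new constraints are identical to the old ones.
        self.num_rows = num_rows
        self.basis = None

    def set_basis(self, row_status, col_status):
        self.basis = (list(row_status), list(col_status))

    def get_basis(self):
        return self.basis

    def solve(self):
        kind = "warm" if self.basis is not None else "cold"
        # Dummy final basis: a real solver would report actual statuses.
        self.basis = (["basic"] * self.num_rows, ["at_lower"] * self.num_cols)
        return kind


class BasisCache:
    """Stores the basis after every solve and restores it before the next."""

    def __init__(self):
        self.saved = None

    def solve(self, solver):
        if self.saved is not None:
            rows, cols = self.saved
            # Only restore if the LP dimensions still match the stored basis.
            if len(rows) == solver.num_rows and len(cols) == solver.num_cols:
                solver.set_basis(rows, cols)
        kind = solver.solve()
        self.saved = solver.get_basis()
        return kind


if __name__ == "__main__":
    solver = ToySolver(num_rows=3, num_cols=2)
    cache = BasisCache()
    print(cache.solve(solver))      # first solve, nothing stored yet
    solver.replace_constraints(3)   # same constraints removed and re-added
    print(cache.solve(solver))      # dimensions match, basis is restored
    solver.replace_constraints(4)   # one more landmark constraint
    print(cache.solve(solver))      # dimensions differ, no restore
```

The dimension check is the only safeguard in this sketch; a more careful version would also have to verify that the constraints themselves are unchanged.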

I plan to run some experiments to see if the performance increase holds up.
History
Date                 User     Action  Args
2026-02-23 12:01:27  florian  set     messages: + msg12029; status: chatting -> resolved
2026-02-23 11:55:39  malte    set     messages: + msg12027
2026-02-23 11:51:07  florian  set     messages: + msg12026
2025-12-25 14:51:09  malte    set     messages: + msg11932
2023-10-17 12:12:36  florian  set     messages: + msg11462
2023-10-17 12:12:16  florian  set     messages: - msg11461
2023-10-17 12:12:00  florian  set     status: unread -> chatting; messages: + msg11461
2023-10-10 18:04:36  florian  create