msg9744 (view) |
Author: jendrik |
Date: 2020-10-23.14:07:30 |
|
I added a changelog entry for this issue on the main branch. Feel free to edit/expand.
|
msg9743 (view) |
Author: florian |
Date: 2020-10-22.15:29:35 |
|
Warnings like these depend on the compiler version, so it is normal that you don't get some warnings locally that we see on the buildbot (that is why we test the build with different compilers).
Looks like the buildbot is happy now, so I'll close the issue. Thanks again.
|
msg9742 (view) |
Author: rik |
Date: 2020-10-22.14:32:36 |
|
I already noticed the same problem and alleviated the problem and pushed a
minute ago. I tried compiling the code locally after fixing the style issues,
but didn't receive so much as a warning on my machine.
|
msg9741 (view) |
Author: florian |
Date: 2020-10-22.14:25:04 |
|
There is another error about an unused field "infinity". I think this is about the field in the class LinearProgram that we planned to add to avoid passing it in parallel all the time. We didn't do that in the end because it would have been a larger refactoring. Please just remove the field.
http://buildbot.fast-downward.org/#/builders/10/builds/598
|
msg9740 (view) |
Author: florian |
Date: 2020-10-20.21:44:43 |
|
one more problem just popped up: looks like the style test do not pass. Rik, can you run them locally with tox and fix the issue?
http://buildbot.fast-downward.org/#/builders/4/builds/288
|
msg9739 (view) |
Author: malte |
Date: 2020-10-20.21:43:40 |
|
It looks like the buildbot found a style violation. I did not look at the details.
|
msg9738 (view) |
Author: florian |
Date: 2020-10-20.21:42:56 |
|
Rik merged the change. There is one last small issue: the commit message of the merge was autogenerated by GitHub and broke our "[branchname] commit message" convention. I tried to fix it and force-pushed the change. If anyone pulled between 17:50 and now, please update your local copy. Rik, could you check the history in the main repository? If everything looks good to you, please make another update here to close the issue (set the field Status above to "resolved" and submit). Thanks to everyone involved in this. It'll be nice to have these debugging features available in the future.
|
msg9737 (view) |
Author: malte |
Date: 2020-10-20.17:18:51 |
|
Sounds good to me.
|
msg9736 (view) |
Author: rik |
Date: 2020-10-20.17:17:35 |
|
We have completed what this time really should be the last round of code
review and I implemented some minor stylistic changes suggested by Florian and
Malte. The code should now be ready to merge.
|
msg9735 (view) |
Author: florian |
Date: 2020-10-20.11:31:53 |
|
We discussed this live and decided not to care about the change in memory usage. Malte had some additional comments on the pull request (I added them in GitHub). After they are done and reviewed by Jendrik or me, we can merge.
|
msg9734 (view) |
Author: jendrik |
Date: 2020-10-20.10:46:43 |
|
I agree.
|
msg9733 (view) |
Author: rik |
Date: 2020-10-20.10:37:29 |
|
I have also tried to replicate memory issues by running the same
configuration locally. The memory usage was not consistent with the memory
usage on the grid. This goes for the new as well as the base version.
However, the memory usage on the grid as well as locally does seem to be
deterministic. I used version 0.107.9 of Osi and 3.1.1 of Soplex on both
machines.
Florian and I are not quite sure what to make of this, but feel like the
difference is minor enough and does not noticeably skew in one direction
that this could be merged regardless. Jendrik, would you agree to this?
|
msg9725 (view) |
Author: rik |
Date: 2020-09-29.11:43:12 |
|
I have rerun the experiments after incorporating all of the changes and the
results look much the same. Evaluations and expansions are the same for the
base and v3 version. Coverage once again went up slightly. Memory usage and
search time show the same patterns as in the previous experiment.
As announced, I ran a copy of the diverse potentials heuristic with soplex
and compared it with itself. The search time shows the same phenomenon where
most fall within 5% but a few instances are either almost twice as fast or
twice as slow, so this seems to have some cause that does not relate to this
issue. The memory usage is very consistent between the two copies however.
It seems like the variability is caused by the changes in this issue. The
versions with cplex as an lp solver does not show this variability in memory
usage, so I imagine it's something subtle that influences the memory usage
of soplex.
The full results can be found in issue960-v3-experiments.zip
|
msg9724 (view) |
Author: florian |
Date: 2020-09-22.18:40:02 |
|
I did another iteration through the changes and noticed a few more small things. I also saw a few comments by Jendrik that were still open. I think in the "conversation" tab they are hidden inside a "more comments" block. In the "Files changed" tab everything shows up as expected.
Once you resolved all open conversations, feel free to just run the experiments. No need to wait for an OK from us.
|
msg9723 (view) |
Author: rik |
Date: 2020-09-22.17:05:28 |
|
Thanks for the review Jendrik. I have addressed everything you brought up and
run all style checks. If you and Florian sign off on this version, I can rerun
the experiments to make sure this has not affected performance.
I will also run the soplex variant of the potential heuristic twice to see if
the variation in memory usage we see there is really just noise.
|
msg9719 (view) |
Author: jendrik |
Date: 2020-09-22.10:19:30 |
|
Ah, I forgot about that incompatibility. Tieing the names to the log level sounds like a good idea.
|
msg9718 (view) |
Author: florian |
Date: 2020-09-22.00:20:42 |
|
Thanks Jendrik. I don't think having names in debug mode would work because LP builds are incompatible with the debug build since the time we added the _GLIBCXX_DEBUG flag (the LP libraries are compiled without this flag which makes them incompatible on a binary level). We could tie it to the log-level in specific LP instances and have names only in non-silent log levels. But this shouldn't affect the current code.
|
msg9717 (view) |
Author: jendrik |
Date: 2020-09-21.23:54:42 |
|
I reviewed the code and left some comments.
On a more general note: it seems the current design assumes that users name LP variables and constraints only when they're debugging something and remove the names afterwards. I think we should encourage users to leave the names in permanently, but only in debug builds. On never knows when it's time for debugging again and this also helps documenting the LP. I don't think any code changes are necessary for this, but we could rephrase the comment that explicitly warns against using names in "production code". I think we should also add names (in debug mode) for at least one of the files that uses the LP code.
|
msg9716 (view) |
Author: florian |
Date: 2020-09-21.17:32:17 |
|
The memory usage looks fine for all configurations except for the SoPlex variant of potential heuristics. There, the results are a bit surprising: there are some tasks (mostly freecell and spider) where the memory usage increases a lot (e.g., 148MB -> 194MB) although the number of expansions is the same. At the same time, there are tasks where the usage decreases and there does not seem to be a trend in one direction or the other. I would have expected that the potential heuristic found in the two runs might be different but since the evaluations are the same, this cannot be the cause.
The timing looks very similar for all configurations: most tasks behave as expected but then some tasks vary a lot stronger than what I would expect: their factors are in the range of 0.5 - 2.0 instead of the range of 0.95 - 1.05. Again, there are cases with positive and negative impact and no clear trend. I cannot explain the higher variance. Did someone observe this in other experiments?
Augusto, do you know what's up with the translator segfaults in organic synthesis? Anyway, since the patch does not touch the translator, it seems to be unrelated.
Despite the strange variations, I think we could merge the patch if we don't spot anything in the review as the results do not seem systematically worse. I already reviewed the code. Jendrik, could you do a second iteration?
|
msg9715 (view) |
Author: rik |
Date: 2020-09-21.16:33:09 |
|
The pull request has now moved to github. Here is the new link:
https://github.com/aibasel/downward/pull/12
I have run experiments to compare the performance of the new implementation
of the LP wrapper class to the old one when
the debugging functionality is not used. I tested it with three different
heuristics which utilize lp solvers both with
cplex and soplex as an lp solver. There seems to be no meaningful impact on
performance. Coverage actually went up
slightly (this is definitely a coincidence). The number of evaluations and
expansions is identical for all domains and
configurations, so we can be reasonably confident that this change did not
affect the calculated heuristic values.
There seems to be no significant impact on search time. For any given
configuration, the score_total_time is at most 0.1%
lower and for some configurations the summed score was slightly higher. The
relative scatter plots show that the vast
majority of problems differed by less than 15% in terms of total time. The
plots indicate that this is down to noise.
There is a strange phenomenon which causes a significant minority of
problems to either be about 1.8x faster or slower
across all configurations, domains, and problem sizes.
There is a small but measurable impact on memory usage. Every configurations
sees an increase in memory use on the order
of 0.1%. The scatter plots show that there are very few outliers with. Only
the diverse potentials heuristic with soplex
as an lp solver contains runs where the memory usage differed by more than
5% and all of those problems were rather
small. This configurations also saw some runs where the new version used
more than 10% less memory.
The experiments contain 20 unexplained errors (out of over 21,000 runs), but
none of these relate to this issue in any
way. 12 of these errors were due to the translator segfaulting while
translating p11, p12 or p18 of the organic-
synthesis-opt18-strips domain and the other 8 were down to the planner
producing too much output when solving p12 and p14
of the parcprinter-08-strips domain when using the lmcount heuristic.
The full tables and charts can be found in issue960-experiments.zip.
|
msg9211 (view) |
Author: rik |
Date: 2020-03-12.17:14:51 |
|
I have added a few more commits to the pull request to address the issues
brought up in the way Florian and I agreed upon.
Here is the link once again: https://bitbucket.org/rikdegraaff/fast-downward-
issues/pull-requests/1/issue960/diff
|
msg9204 (view) |
Author: florian |
Date: 2020-02-27.16:25:56 |
|
Since we offering more information on the status of the LP, we should also add methods to test for infeasible and unbounded LPs. There is also a try-catch missing around create_lp_solver that we should add now. Additionally, adding methods to dump a constraint, a variable, a set of constraints, a set of variables, or a full LP, would be helpful. And finally, there is a CPLEX error that can sneak in as an unexplained error when CPLEX runs out of memory in the wrong place, we can fix that, too.
I'll attach patches for these things but please don't take the code for dumping the constraints verbatim. It should be a function that also takes a double ("infinity") and only print bounds if they are not infinite.
|
msg9200 (view) |
Author: florian |
Date: 2020-02-14.17:09:00 |
|
I think adapting the operator-counting framework and the other places where we currently use LPs should be straight-forward.
Actually, we might want to take this idea a step further: instead of two classes LPVariables and LPConstraints, we could have a single class LinearProgram that contains all required data. We could generate proxies for variables and constraints from it that in turn have the necessary getters and setters. We discussed something in this direction a while ago where we actually wanted to make these objects smarter, so you could for example change a bound on a constraint object and then re-solve the LP and the classes would keep track of the required changes internally. I think all of this would be too much to do in this issue (if we want to do it, we should create a new issue for it) but adding a LinearProgram with proxies for variables and constraints would be a step in this direction.
Rik, to see an example of what I mean, you can have a look at the classes TaskProxy, VariablesProxy, and VariableProxy. We can also discuss this live if you prefer.
|
msg9199 (view) |
Author: rik |
Date: 2020-02-14.14:09:57 |
|
Adding the LPVariables and LPConstraints classes seems like the best option to me. It does lose some flexibility over simply having a vector of something. Considering that I don't think what I'm working on in the context of time-unrolling (which generates and modifies an LP in multiple phases) would require too much of a rewrite to be compatible with this system, I suppose the benefits outweigh the costs. Do you think adapting the operator counting framework would present a significant challenge, or would that be very straightforward?
For posterity's sake, the solution I suggested was adding a separate, overloaded load_problem() function which accepts vector<NamedLPVariable>. The upside is that this has no impact on existing code and is incredibly easy to implement. The downside is that switching from debugging to not using debugging would be cumbersome because you need to adapt any function which touches the variabkes and constraints.
|
msg9191 (view) |
Author: florian |
Date: 2020-02-08.13:21:18 |
|
Ahh, of course another option would be to hide all this code with a preprocessor switch that is off by default and you'd just enable it for debugging. But so far, we mostly avoided this. Something like this:
class LPVariable {
// ...
#ifdef LP_DEBUG
string name;
#endif
}
|
msg9190 (view) |
Author: florian |
Date: 2020-02-08.13:17:19 |
|
I thought the size of empty string would be smaller but it looks like it would roughly double the size of LPVariable and LPConstraint in gcc > 5. (Interesting article, by the way: https://shaharmike.com/cpp/std-string/). We could reduce the overhead to a single pointer by using a unique_ptr<string> but it would still be an overhead.
We could also go for functions in the LPSolver class that forward the setRowName and setColName functions. The disadvantage I see there is that these functions only make sense after loading the LP into the solver and our code often splits the generation of constraints from the loading. So the code that generates the constraints could not easily set the names for the variables and names it creates.
Maybe another option would be to generate parallel structures LPVariableNames and LPConstraintNames similar to the way we generate names for enum options in the option parser. As for the option before, this would mean that existing code is not easy to adapt because we would have to modify the types of data passed between the class that generates LP variables and constraints and the class that uses them.
The main use case I have in mind is when we already have code like this:
vector<LPVariable> variables = generate_variables(task);
vector<LPConstraint> constraints = constraint_generator.generate(task)
lp_solver.load(variables, constraints);
and now want to add names inside the methods generate_variables and constraint_generator.generate for debugging. Without adding a field to LPVariable and LPConstraint, we'd have to modify the interface if we want to debug this code. How about adding a new class like this:
class LPVariables {
vector<LPVariable> variables;
vector<string> names;
public:
add_variable(double lower_bound, double upper_bound, double objective, string name = "");
}
and replacing the existing cases of vector<LPVariable> with it (and analogously for constraints of course)?
|
msg9189 (view) |
Author: malte |
Date: 2020-02-07.17:49:46 |
|
Makes sense. Feel free to involve me in the discussion if you cannot find something that makes you sufficiently happy, but only then. (Much on the plate at the moment.)
|
msg9188 (view) |
Author: rik |
Date: 2020-02-07.17:47:57 |
|
I initially implemented a version which had no impact when it is not used, but Florian found it too cumbersome; perhaps I could try and find a different approach which offers the best of both worlds with him?
|
msg9187 (view) |
Author: malte |
Date: 2020-02-07.17:43:46 |
|
I am not sure we want to pay for storing a string with every constraint whether or not we use the feature. Usually we try to make debug features more lightweight.
|
msg9186 (view) |
Author: rik |
Date: 2020-02-07.17:36:23 |
|
I have created a pull request for this issue: https://bitbucket.org/rikdegraaff/fast-downward-issues/pull-requests/1/issue960/diff
|
msg9178 (view) |
Author: rik |
Date: 2020-01-24.16:59:07 |
|
We want to add the ability to assign custom variable and constraint names and write out the LP to analyse errors.
|
|
Date |
User |
Action |
Args |
2020-10-23 14:07:30 | jendrik | set | messages:
+ msg9744 |
2020-10-22 15:29:35 | florian | set | status: in-progress -> resolved messages:
+ msg9743 |
2020-10-22 14:32:36 | rik | set | messages:
+ msg9742 |
2020-10-22 14:25:04 | florian | set | messages:
+ msg9741 |
2020-10-20 21:44:44 | florian | set | messages:
+ msg9740 |
2020-10-20 21:43:40 | malte | set | messages:
+ msg9739 |
2020-10-20 21:42:56 | florian | set | messages:
+ msg9738 |
2020-10-20 17:18:51 | malte | set | messages:
+ msg9737 |
2020-10-20 17:17:35 | rik | set | messages:
+ msg9736 |
2020-10-20 11:31:53 | florian | set | messages:
+ msg9735 |
2020-10-20 10:46:43 | jendrik | set | messages:
+ msg9734 |
2020-10-20 10:37:30 | rik | set | messages:
+ msg9733 |
2020-09-29 11:43:14 | rik | set | files:
+ issue960-v3-experiments.zip messages:
+ msg9725 |
2020-09-22 18:40:02 | florian | set | messages:
+ msg9724 |
2020-09-22 17:05:28 | rik | set | messages:
+ msg9723 |
2020-09-22 10:19:30 | jendrik | set | messages:
+ msg9719 |
2020-09-22 00:20:42 | florian | set | messages:
+ msg9718 |
2020-09-21 23:54:42 | jendrik | set | messages:
+ msg9717 |
2020-09-21 17:32:17 | florian | set | messages:
+ msg9716 |
2020-09-21 16:33:09 | rik | set | messages:
+ msg9715 summary: The pull request has now moved to github. Here is the new link:
https://github.com/aibasel/downward/pull/12
I have run experiments to compare the performance of the new implementation of the LP wrapper class to the old one when
the debugging functionality is not used. I tested it with three different heuristics which utilize lp solvers both with
cplex and soplex as an lp solver. There seems to be no meaningful impact on performance. Coverage actually went up
slightly (this is definitely a coincidence). The number of evaluations and expansions is identical for all domains and
configurations, so we can be reasonably confident that this change did not affect the calculated heuristic values.
There seems to be no significant impact on search time. For any given configuration, the score_total_time is at most 0.1%
lower and for some configurations the summed score was slightly higher. The relative scatter plots show that the vast
majority of problems differed by less than 15% in terms of total time. The plots indicate that this is down to noise.
There is a strange phenomenon which causes a significant minority of problems to either be about 1.8x faster or slower
across all configurations, domains, and problem sizes.
There is a small but measurable impact on memory usage. Every configurations sees an increase in memory use on the order
of 0.1%. The scatter plots show that there are very few outliers with. Only the diverse potentials heuristic with soplex
as an lp solver contains runs where the memory usage differed by more than 5% and all of those problems were rather
small. This configurations also saw some runs where the new version used more than 10% less memory.
The experiments contain 20 unexplained errors (out of over 21,000 runs), but none of these relate to this issue in any
way. 12 of these errors were due to the translator segfaulting while translating p11, p12 or p18 of the organic-
synthesis-opt18-strips domain and the other 8 were down to the planner producing too much output when solving p12 and p14
of the parcprinter-08-strips domain when using the lmcount heuristic.
The full tables and charts can be found in issue960-experiments.zip. -> |
2020-09-21 16:32:42 | rik | set | files:
+ issue960-experiments.zip summary: The pull request has now moved to github. Here is the new link:
https://github.com/aibasel/downward/pull/12
I have run experiments to compare the performance of the new implementation of the LP wrapper class to the old one when
the debugging functionality is not used. I tested it with three different heuristics which utilize lp solvers both with
cplex and soplex as an lp solver. There seems to be no meaningful impact on performance. Coverage actually went up
slightly (this is definitely a coincidence). The number of evaluations and expansions is identical for all domains and
configurations, so we can be reasonably confident that this change did not affect the calculated heuristic values.
There seems to be no significant impact on search time. For any given configuration, the score_total_time is at most 0.1%
lower and for some configurations the summed score was slightly higher. The relative scatter plots show that the vast
majority of problems differed by less than 15% in terms of total time. The plots indicate that this is down to noise.
There is a strange phenomenon which causes a significant minority of problems to either be about 1.8x faster or slower
across all configurations, domains, and problem sizes.
There is a small but measurable impact on memory usage. Every configurations sees an increase in memory use on the order
of 0.1%. The scatter plots show that there are very few outliers with. Only the diverse potentials heuristic with soplex
as an lp solver contains runs where the memory usage differed by more than 5% and all of those problems were rather
small. This configurations also saw some runs where the new version used more than 10% less memory.
The experiments contain 20 unexplained errors (out of over 21,000 runs), but none of these relate to this issue in any
way. 12 of these errors were due to the translator segfaulting while translating p11, p12 or p18 of the organic-
synthesis-opt18-strips domain and the other 8 were down to the planner producing too much output when solving p12 and p14
of the parcprinter-08-strips domain when using the lmcount heuristic.
The full tables and charts can be found in issue960-experiments.zip. |
2020-04-15 17:17:23 | augusto | set | nosy:
+ augusto |
2020-03-12 17:14:51 | rik | set | messages:
+ msg9211 |
2020-02-27 16:26:07 | florian | set | files:
+ lp_internals.cc.diff |
2020-02-27 16:25:57 | florian | set | files:
+ lp_solver.cc.diff messages:
+ msg9204 |
2020-02-14 17:09:00 | florian | set | messages:
+ msg9200 |
2020-02-14 14:09:57 | rik | set | messages:
+ msg9199 |
2020-02-08 13:21:18 | florian | set | messages:
+ msg9191 |
2020-02-08 13:17:20 | florian | set | messages:
+ msg9190 |
2020-02-07 17:49:46 | malte | set | messages:
+ msg9189 |
2020-02-07 17:47:57 | rik | set | messages:
+ msg9188 |
2020-02-07 17:43:47 | malte | set | messages:
+ msg9187 |
2020-02-07 17:36:23 | rik | set | messages:
+ msg9186 |
2020-01-27 13:38:07 | rik | set | status: unread -> in-progress |
2020-01-24 16:59:07 | rik | create | |