msg4057 |
Author: malte |
Date: 2015-03-15.14:03:36 |
|
Hi Silvan, see issue508, especially msg4054.
|
msg4056 |
Author: silvan |
Date: 2015-03-15.14:02:22 |
|
Hey guys, I am unable to compile the master version after the integration of
this issue.
In file included from /usr/include/c++/4.8/memory:81:0,
from lp_solver.h:7,
from lp_solver.cc:1:
/usr/include/c++/4.8/bits/unique_ptr.h: In instantiation of ‘void
std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = OsiSolverInterface]’:
/usr/include/c++/4.8/bits/unique_ptr.h:184:16: required from
‘std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = OsiSolverInterface; _Dp =
std::default_delete<OsiSolverInterface>]’
lp_solver.h:97:5: required from here
/usr/include/c++/4.8/bits/unique_ptr.h:65:22: error: invalid application of
‘sizeof’ to incomplete type ‘OsiSolverInterface’
static_assert(sizeof(_Tp)>0,
^
make: *** [.obj/lp_solver.release.o] Error 1
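For reference, this is the usual "incomplete type" pitfall with std::unique_ptr: the
default deleter needs sizeof(OsiSolverInterface) at the point where the owning class's
destructor gets instantiated. A minimal sketch of the general pattern (illustrative
only, not necessarily the fix applied here; see issue508 for that):

    // lp_solver.h (sketch)
    #include <memory>

    class OsiSolverInterface;  // forward declaration is enough in the header

    class LPSolverSketch {
        std::unique_ptr<OsiSolverInterface> lp_solver;
    public:
        LPSolverSketch();
        ~LPSolverSketch();  // declared here, defined in the .cc file
    };

    // lp_solver.cc (sketch)
    // #include "OsiSolverInterface.hpp"            // makes the type complete
    // LPSolverSketch::~LPSolverSketch() = default;  // deleter instantiated here, so it compiles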
|
msg4044 |
Author: florian |
Date: 2015-03-09.17:52:12 |
|
Merged and looking forward to the integration of more LP heuristics :-)
|
msg4041 |
Author: malte |
Date: 2015-03-08.13:08:01 |
|
If you like the results, please merge!
|
msg4040 |
Author: florian |
Date: 2015-03-07.11:30:18 |
|
I compared the current tip of the default branch with the issue branch. Results
still look good:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-v4-base-issue443-v4-compare.html
Should I merge?
|
msg4036 |
Author: malte |
Date: 2015-03-06.16:33:17 |
|
Yes, experiments before merging would be good.
|
msg4033 |
Author: florian |
Date: 2015-03-06.15:32:05 |
|
Thank you for the review. I updated the code in the bitbucket pull request with
your suggestions. Should I run another experiment now?
|
msg3989 |
Author: florian |
Date: 2014-12-29.20:29:50 |
|
I updated the pull request with the new code version. Could you do a review as a
next step? Of course this is not urgent, I probably will not work on this until
February.
https://bitbucket.org/flogo/downward-issues-issue443/pull-request/1/issue443/diff
The results for this revision look similar to the last experimental version (+45
coverage and almost always faster):
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-v3-compare.html
I also opened issue506 for the changes we did not want to do right now.
|
msg3987 |
Author: malte |
Date: 2014-12-25.12:00:01 |
|
Looks very good indeed! :-) Do the crashed nodes work again, and if not, have
they been reported? (If you report them, please say that they're not urgent.)
> I agree, this sounds good but can wait until later. Should I open a separate
> issue for this change?
Yes, sounds good.
> For now, I suggest that I reimplement experimental-v4 in the main issue branch
> (without all the hacks and debug code from the experimental branch) and run
> experiments for that version. I prefer experimental-v4 over experimental-v5
> because it uses a clearer interface. The advantage that experimental-v5 has
> (10-20%) from not copying the data is probably something that we can achieve in
> the new issue when we store all data in the LP class as you suggested.
Sounds good.
|
msg3986 |
Author: florian |
Date: 2014-12-24.13:25:34 |
|
I added a second vector<LPConstraint> to LM-count and in each round copy all
non-empty LPConstraints to this vector and use the result to generate the LP
(experimental-v4). I also tried changing the interface to accept a
vector<LPConstraint *> to avoid the copy (experimental-v5). The results of both
versions look very good:
== Comparing base to experimental-v4 ==
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-experimental-v4-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v4-v5-relative-scatter-time-v4.png
== Comparing base to experimental-v5 ==
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-experimental-v5-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v4-v5-relative-scatter-time-v5.png
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v4-v5-relative-scatter-time-v5-no-psr.png
== Comparing experimental-v4 to experimental-v5 ==
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-experimental-v4-issue443-experimental-v5-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v4-v5-relative-scatter-time-v4-v5.png
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v4-v5-relative-scatter-time-v4-v5-no-psr.png
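As a rough illustration of the experimental-v4 approach described above (a sketch with
hypothetical names; the actual LM-count code is more involved):

    // Sketch of the experimental-v4 idea: copy only non-empty constraints into a
    // reused buffer and hand that buffer to the LP wrapper. Hypothetical names.
    #include <vector>

    struct LPConstraint {
        std::vector<int> variables;        // column indices of nonzero coefficients
        std::vector<double> coefficients;  // the matching coefficients
        double lower_bound = 0;
        double upper_bound = 0;
        bool empty() const { return variables.empty(); }
    };

    class LandmarkCountSketch {
        std::vector<LPConstraint> constraints;          // one per landmark, some empty
        std::vector<LPConstraint> nonempty_constraints; // reused buffer, refilled each round
    public:
        const std::vector<LPConstraint> &collect_nonempty_constraints() {
            nonempty_constraints.clear();
            for (const LPConstraint &constraint : constraints) {
                if (!constraint.empty())
                    nonempty_constraints.push_back(constraint);  // copy (v4); v5 passed pointers instead
            }
            return nonempty_constraints;  // used to generate the LP
        }
    };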
Notes:
Some runs crashed on the grid because 2 nodes stopped working during the
experiment. I'm ignoring this here because we will have to run another
experiment for the final version anyway and I think the trend is clear enough
without the crashed tasks.
The scatter plots for experimental-v5 have two strongly negative points from the
domain psr-small at the very left of the plot that throw off the scale a bit. I
added alternative plots that leave out this domain.
> I still like the idea where the actual data for the constraints lives right in
> the LP class, because I think it allows some more global optimizations and
> should also make it comparatively easy to add/remove/modify individual
> constraints in any order. I think this should be a separate evolution step,
> though.
I agree, this sounds good but can wait until later. Should I open a separate
issue for this change?
For now, I suggest that I reimplement experimental-v4 in the main issue branch
(without all the hacks and debug code from the experimental branch) and run
experiments for that version. I prefer experimental-v4 over experimental-v5
because it uses a clearer interface. The advantage that experimental-v5 has
(10-20%) from not copying the data is probably something that we can achieve in
the new issue when we store all data in the LP class as you suggested.
|
msg3985 |
Author: malte |
Date: 2014-12-23.14:48:27 |
|
I still like the idea where the actual data for the constraints lives right in
the LP class, because I think it allows some more global optimizations and
should also make it comparatively easy to add/remove/modify individual
constraints in any order. I think this should be a separate evolution step, though.
For now, I suggest you go with whatever appears easiest and matches the previous
performance (or at least gets close to it). Let's try to improve it in a
separate step.
|
msg3984 |
Author: florian |
Date: 2014-12-23.14:38:47 |
|
This was subtle. The matrix can be different in one case: if constraints *at the
end* of the matrix are empty or the *last few* LP variables are not used, then
the old code does not create rows/columns from them. I don't think we can do
this in general, because we allow modifying an existing problem. If a variable
is not used, but a constraint that is added later uses it, we should make it
part of the LP from the beginning.
I think a solution for LM-count is to remove empty constraints in the LM-count
code and then just pass non-empty constraints to the LP. I'm not quite sure how
to make this efficient. We could change the interface to allow passing a
vector<LPConstraint *> instead of a vector<LPConstraint> but this seems a bit
awkward if we have some constraints and just want to pass them to the LP as they
are.
|
msg3983 |
Author: malte |
Date: 2014-12-23.12:02:09 |
|
Odd. You could dump the matrices and bounds (or a hash value of the contents?)
to see if they are really identical?
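One way to do that, as a sketch against the standard CoinPackedMatrix accessors (not
code from the repository):

    // Sketch: compute a hash over the matrix contents and row bounds so that two
    // runs can be compared without dumping the full LP. The hash depends on the
    // element order, so the matrices should be normalized first.
    #include <cstddef>
    #include <functional>
    #include <vector>
    #include "CoinPackedMatrix.hpp"

    static void hash_combine(std::size_t &seed, std::size_t value) {
        seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }

    std::size_t hash_lp(const CoinPackedMatrix &matrix,
                        const std::vector<double> &row_lb,
                        const std::vector<double> &row_ub) {
        std::size_t seed = matrix.isColOrdered() ? 1 : 0;
        const double *elements = matrix.getElements();
        const int *indices = matrix.getIndices();
        const int num_elements = matrix.getNumElements();
        for (int i = 0; i < num_elements; ++i) {
            hash_combine(seed, std::hash<double>()(elements[i]));
            hash_combine(seed, std::hash<int>()(indices[i]));
        }
        for (double bound : row_lb)
            hash_combine(seed, std::hash<double>()(bound));
        for (double bound : row_ub)
            hash_combine(seed, std::hash<double>()(bound));
        return seed;
    }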
|
msg3982 |
Author: florian |
Date: 2014-12-23.11:56:52 |
|
I ran another profile for a version that normalizes the matrix in both the
default and experimental branches before loading it into Cplex. After
normalization, the matrix is column-ordered, there are no gaps, and each column
is sorted. The only way I can see for the call to Cplex to behave differently is
if the matrix has different coefficients, but that should lead to different
heuristic values at least in some cases. Still, the profile of the experimental
version has more samples in almost all methods. This is most significant in
"CPXlpopt" (91050 vs. 52218) which is called by initial_solve(), but copying the
LP ("CPXcopylp") also takes longer (3793 vs. 2728).
|
msg3981 |
Author: florian |
Date: 2014-12-22.22:19:03 |
|
Profiles for the opt11 task are somewhat confusing. The time spent in Cplex
methods is significantly different between the two versions. However, the number
of evaluations, expansions, etc. is the same, so I think it still calculates the
same heuristic. The new version should call the same Cplex methods with the same
parameters. The difference should only affect our wrapper and the OSI-layer.
|
msg3980 |
Author: florian |
Date: 2014-12-22.15:50:27 |
|
The results of experimental-v3 (initial_solve, vectors, reuse of temporary
objects) look better than before but are still worse than the baseline on average.
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-experimental-v3-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v3-relative-scatter-time.png
We previously profiled sokoban task opt08/p04, which now runs faster than the
baseline.
I'll redo the profile on one of the sokoban tasks that still have worse
performance (e.g. opt11/p09) and try to find what's going on.
|
msg3975 |
Author: florian |
Date: 2014-12-19.21:34:14 |
|
Re-using LPConstraints and the vectors for bounds and objective shows a clear
improvement in the profile.
                      base    experimental-v3   diff
cost_sharing_h_value  73711   74196              -485  (- 0.7%)
assign wrapper            -   14512
assign                 7606    7485              +121  (+ 1.6%)
matrix                 5382    4193             +1189  (+22.1%)
self(wrapper)             0    2834             -2834
initialSolve          58139   58352              -213  (- 0.4%)
self(cost_sharing)     2584    1332             +1252  (+48.4%)
The new version (experimental-v3) only loses time compared to the base version
while copying the data into the temporary objects. The time mentioned under
"self(wrapper)" is the time spent in the method LpSolver::assign_problem() which
prepares the data for OSI. I'll run a larger set of experiments on this version.
|
msg3972 |
Author: malte |
Date: 2014-12-19.16:45:23 |
|
> I think using initialSolve() after assigning a new problem is a natural thing
> to do
That's true, and it's good to stick to that while we're trying to find out why
performance dropped. In the big picture, though, it may be cleaner and faster to
reuse the LP in the lmcount heuristic, so that may be something worth looking at
later.
|
msg3971 |
Author: florian |
Date: 2014-12-19.15:28:25 |
|
I had a look at the absolute numbers in the profiles. The following table gives
an overview. The numbers are in "million profile samples" and each percentage is
the difference relative to the measurement in "base".
                      base    experimental   diff
cost_sharing_h_value  73711   84454          -10743  (-14.5%)
assign wrapper            -   15113
assign                 7606    7685             -79  (- 1.0%)
matrix                 5382    4179           +1203  (+22.4%)
self(wrapper)             0    3249           -3249
initialSolve          58139   58365            -226  (- 0.4%)
create temporaries        -    5102           -5102  LPConstraint::insert()
remove temporaries        -    2019           -2019  ~LPConstraint()
self(cost_sharing)     2584    3855           -1271  (-49.2%)
That there is almost no change in assign() and initialSolve() makes sense,
because they are called with the same parameters.
Creating the matrix is faster in the experimental version because the parameter
"colordered" has a bad setting in the default branch. The remaining difference
seems to come from temporary objects in the assign wrapper and in the cost
partitioning (LPConstraints).
I'll try to reuse LPConstraint objects to avoid recreating them in every step.
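The kind of reuse meant here, sketched with hypothetical names (the real LPConstraint
interface may differ):

    // Sketch: keep the LPConstraint objects (and their heap buffers) alive across
    // heuristic evaluations and only clear/refill them, instead of constructing
    // and destroying them in every step.
    #include <vector>

    struct LPConstraint {
        std::vector<int> variables;
        std::vector<double> coefficients;
        double lower_bound = 0;
        double upper_bound = 0;
        void clear() {                       // keeps the allocated capacity
            variables.clear();
            coefficients.clear();
        }
        void insert(int var, double coefficient) {
            variables.push_back(var);
            coefficients.push_back(coefficient);
        }
    };

    class CostSharingSketch {
        std::vector<LPConstraint> constraints;  // created once, reused every evaluation
    public:
        explicit CostSharingSketch(int num_constraints)
            : constraints(num_constraints) {}

        void prepare_constraints() {
            for (LPConstraint &constraint : constraints)
                constraint.clear();
            // ... refill the constraints for the current state ...
        }
    };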
|
msg3970 |
Author: florian |
Date: 2014-12-19.13:52:47 |
|
> All our new versions still lose some performance in the generation of the matrix
> for the constraint solver, but we recover that (at least for score_total_time,
> not for coverage) by using "resolve" instead of "initialSolve". Does all this
> sound correct?
Using resolve() is very beneficial for some tasks and much slower for others,
but on average it seems to help, yes.
> This means that perhaps for a fair comparison to the baseline it might be worth
> using a configuration with "initialSolve" and trying to get that close in
> performance to the baseline, so that we really see our overhead in matrix
> generation. Or maybe we could look at a variant of the baseline that uses
> resolve to compare to that. What do you think?
I think using initialSolve() after assigning a new problem is a natural thing to
do, so I prefer to look for an efficient version of that. One way to get closer
to the baseline would be to generate matrix entries column-by-column instead of
row-by-row. I suggest that I try this with an interface where the user can just
add one column at a time to see if this is faster. If it is, we can think of a
way to get this performance with a nicer interface.
|
msg3969 |
Author: malte |
Date: 2014-12-19.00:07:29 |
|
OK, so I think we can roughly summarize this as follows. We are mostly varying
two dimensions: how to solve ("resolve" vs. "initial") and how to generate the
matrix ("base matrix" vs. "COIN vectors" vs. "vectors").
Comparing our five versions and focusing on score_total_time, I think we have,
from best to worst:
experimental-v2 ("vectors/resolve"): best
baseline ("base/initial"): second-best
v1 ("COIN vectors/resolve"), experimental-v1 ("vectors/initial"): middle
category; close to each other
v2 ("COIN vectors/initial"): clearly worst
This suggests a rather clear picture with the following total orderings:
"base" better than "vectors" better than "COIN vectors"
"resolve" better than "initial"
All our new versions still lose some performance in the generation of the matrix
for the constraint solver, but we recover that (at least for score_total_time,
not for coverage) by using "resolve" instead of "initialSolve". Does all this
sound correct?
This means that perhaps for a fair comparison to the baseline it might be worth
using a configuration with "initialSolve" and trying to get that close in
performance to the baseline, so that we really see our overhead in matrix
generation. Or maybe we could look at a variant of the baseline that uses
resolve to compare to that. What do you think?
|
msg3968 |
Author: florian |
Date: 2014-12-18.20:27:37 |
|
Results for v2 are generally worse than before:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-v2-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-v2-relative-scatter-time.png
|
msg3967 |
Author: florian |
Date: 2014-12-18.11:31:29 |
|
Experimental version v2 is like the experimental version v1 but always uses
resolve() instead of initial_solve(). The results compared to the base version
are better on average but the variance is much higher. This explains why the
regular version v1 had a larger variance (it also used resolve() instead of
initial_solve()).
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-experimental-v2-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v2-relative-scatter-time.png
Since the number of configurations is getting confusingly large, here is a summary:
v1:
Uses CoinShallowPackedVectors to create the matrix and resolve() to solve the LP.
experimental-v1:
(Re-)uses vectors (elements, indices, starts, lengths) to create the matrix
and initial_solve() to solve the LP.
experimental-v2:
(Re-)uses vectors (elements, indices, starts, lengths) to create the matrix
and resolve() to solve the LP.
v2:
Uses CoinShallowPackedVectors to create the matrix and initial_solve() to solve
the LP.
Experiments for v2 are running.
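For context, here is what the two solver calls being compared look like at the OSI
level, in a self-contained example using Clp as backend (illustrative only; the
planner's wrapper code is structured differently):

    // initialSolve() solves a freshly loaded problem from scratch; resolve()
    // warm-starts from the previous basis after a modification.
    #include <iostream>
    #include "CoinPackedMatrix.hpp"
    #include "CoinPackedVector.hpp"
    #include "OsiClpSolverInterface.hpp"

    int main() {
        OsiClpSolverInterface solver;

        // maximize x + y  subject to  x + y <= 4,  0 <= x, y <= 3
        CoinPackedMatrix matrix(false, 0, 0);  // row-ordered
        matrix.setDimensions(0, 2);
        CoinPackedVector row;
        row.insert(0, 1.0);
        row.insert(1, 1.0);
        matrix.appendRow(row);

        double infinity = solver.getInfinity();
        double col_lb[] = {0.0, 0.0}, col_ub[] = {3.0, 3.0}, objective[] = {1.0, 1.0};
        double row_lb[] = {-infinity}, row_ub[] = {4.0};
        solver.loadProblem(matrix, col_lb, col_ub, objective, row_lb, row_ub);
        solver.setObjSense(-1);  // maximize

        solver.initialSolve();   // new problem: solve from scratch
        std::cout << solver.getObjValue() << std::endl;  // 4

        solver.setRowUpper(0, 5.0);  // small change to the same problem
        solver.resolve();            // reuse the previous basis
        std::cout << solver.getObjValue() << std::endl;  // 5
        return 0;
    }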
|
msg3960 |
Author: florian |
Date: 2014-12-17.18:31:59 |
|
I added the profiles of the base and experimental version. I'll now set up
experiments for the experimental version without initial_solve() and a version
based on issue443-v1 that uses initial_solve().
|
msg3956 |
Author: florian |
Date: 2014-12-17.12:15:04 |
|
We evaluated a first implementation, which unfortunately performed worse than
the original:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-v1-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-v1-relative-scatter-time.png
We then experimented with different calls to the LP solver and different ways to
prepare the data before passing it to COIN. Always calling initial_solve() for
new problems and only creating one solver interface recovered some but not all
of the lost performance.
The last version we looked at reuses vectors to copy the data of constraints
into the CoinPackedMatrix used by the solver. Compared to the last report, there
are fewer tasks that are much slower (> 150% total time) but also most of the
tasks are now somewhat slower (~110%-120%).
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-base-issue443-experimental-v1-compare.html
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue443-issue443-experimental-v1-relative-scatter-time.png
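The "reused vectors" variant corresponds roughly to building the matrix from raw
arrays instead of one CoinShallowPackedVector per constraint, e.g. (a sketch assuming
the standard CoinPackedMatrix constructor; not the actual wrapper code):

    // Sketch: build the CoinPackedMatrix directly from row-wise data held in
    // reusable vectors (elements, column indices, row starts, row lengths).
    // In the wrapper these vectors would be members that are cleared and
    // refilled for each new LP instead of being reallocated.
    #include <vector>
    #include "CoinPackedMatrix.hpp"

    CoinPackedMatrix build_matrix(
        const std::vector<double> &elements,       // all nonzero coefficients, row by row
        const std::vector<int> &indices,           // column index of each coefficient
        const std::vector<CoinBigIndex> &starts,   // offset in 'elements' where each row begins
        const std::vector<int> &lengths,           // number of nonzeros in each row
        int num_cols) {
        int num_rows = static_cast<int>(lengths.size());
        // colordered == false means the data above is row-ordered.
        return CoinPackedMatrix(false, num_cols, num_rows,
                                static_cast<CoinBigIndex>(elements.size()),
                                elements.data(), indices.data(),
                                starts.data(), lengths.data());
    }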
The next step is to create full profiles of the different versions and compare them.
|
msg3220 |
Author: florian |
Date: 2014-07-23.18:19:17 |
|
We also want to extend the wrapper classes for the COIN interface to exit the
process with EXIT_CRITICAL_ERROR whenever a CoinError is thrown. This way the
classes using the interface avoid an identical try/catch block around every call.
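The intended pattern, sketched below (the name of the exit constant is a placeholder,
not the planner's actual identifier):

    // Sketch: funnel every COIN call through one helper so a CoinError always
    // terminates with a critical error instead of propagating to the callers.
    #include <cstdlib>
    #include <iostream>
    #include "CoinError.hpp"
    #include "OsiSolverInterface.hpp"

    static const int EXIT_CRITICAL_ERROR_CODE = 1;  // placeholder value

    template<typename Operation>
    static void run_coin_checked(const Operation &operation) {
        try {
            operation();
        } catch (const CoinError &error) {
            std::cerr << "COIN error in " << error.className() << "::"
                      << error.methodName() << ": " << error.message() << std::endl;
            std::exit(EXIT_CRITICAL_ERROR_CODE);
        }
    }

    // Usage inside the wrapper class, so code using the wrapper never needs
    // its own try/catch block:
    void solve_lp(OsiSolverInterface &solver) {
        run_coin_checked([&]() {
                solver.initialSolve();
            });
    }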
|
msg3219 |
Author: florian |
Date: 2014-07-23.16:18:47 |
|
In the code for LP heuristics we encapsulate some classes of COIN so the rest of
the code does not have to use COIN classes and compiles even if COIN is not
present. When we merge the LP code, we should use the wrappers in the landmark
code too. The goal of this is to have all code that uses "#ifdef USE_LP" in one
place.
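A sketch of the shape such a wrapper header can take (hypothetical names; the real
lp_solver.h differs):

    // Sketch: keep every "#ifdef USE_LP" inside the wrapper, so client code such
    // as the landmark heuristics compiles unchanged with or without COIN.
    #include <vector>

    #ifdef USE_LP
    class OsiSolverInterface;  // only the wrapper's .cc includes the COIN headers
    #endif

    class LPSolverWrapperSketch {
    #ifdef USE_LP
        OsiSolverInterface *solver;  // owned; raw pointer only to keep the sketch short
    #endif
    public:
        LPSolverWrapperSketch();
        ~LPSolverWrapperSketch();
        void solve();
        std::vector<double> extract_solution() const;
        // When compiled without USE_LP, the methods are stubs that print an error
        // and abort, so no caller ever needs its own #ifdef USE_LP.
    };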
|
|
Date                | User    | Action | Args
2015-03-15 14:03:36 | malte   | set    | messages: + msg4057
2015-03-15 14:02:22 | silvan  | set    | nosy: + silvan; messages: + msg4056
2015-03-09 17:52:12 | florian | set    | status: chatting -> resolved; messages: + msg4044
2015-03-08 13:08:01 | malte   | set    | messages: + msg4041
2015-03-07 11:30:18 | florian | set    | messages: + msg4040
2015-03-06 16:33:17 | malte   | set    | messages: + msg4036
2015-03-06 15:32:05 | florian | set    | messages: + msg4033
2014-12-29 20:29:50 | florian | set    | messages: + msg3989
2014-12-25 12:00:01 | malte   | set    | messages: + msg3987
2014-12-24 13:25:34 | florian | set    | messages: + msg3986
2014-12-23 14:48:27 | malte   | set    | messages: + msg3985
2014-12-23 14:38:48 | florian | set    | messages: + msg3984
2014-12-23 12:02:09 | malte   | set    | messages: + msg3983
2014-12-23 11:56:53 | florian | set    | messages: + msg3982
2014-12-22 22:19:03 | florian | set    | messages: + msg3981
2014-12-22 22:09:46 | florian | set    | files: + callgrind.out.p09.experimental-v3
2014-12-22 22:09:18 | florian | set    | files: + callgrind.out.p09.default
2014-12-22 15:50:27 | florian | set    | messages: + msg3980
2014-12-19 21:34:14 | florian | set    | files: + callgrind.out.experimental-v3; messages: + msg3975
2014-12-19 16:45:23 | malte   | set    | messages: + msg3972
2014-12-19 15:28:25 | florian | set    | messages: + msg3971
2014-12-19 13:52:47 | florian | set    | messages: + msg3970
2014-12-19 00:07:29 | malte   | set    | messages: + msg3969
2014-12-18 20:27:37 | florian | set    | messages: + msg3968
2014-12-18 11:31:29 | florian | set    | messages: + msg3967
2014-12-17 18:31:59 | florian | set    | messages: + msg3960
2014-12-17 18:29:03 | florian | set    | files: + callgrind.out.issue433-experimental
2014-12-17 18:28:51 | florian | set    | files: + callgrind.out.default
2014-12-17 12:15:04 | florian | set    | messages: + msg3956
2014-12-12 09:55:51 | jendrik | set    | nosy: + jendrik
2014-12-11 18:18:06 | florian | set    | title: Reduce number of #ifdefs in LP code -> Introduce interface class between LP user code and LP solvers
2014-07-23 18:19:17 | florian | set    | messages: + msg3220
2014-07-23 16:18:47 | florian | create |