msg2917 (view) |
Author: malte |
Date: 2014-01-28.16:14:32 |
|
It seems that everybody is at least OK with closing this, and Silvan and
I are strongly in favour of closing it. :-) So I'm closing it; if I'm
interpreting the old messages incorrectly, just reopen.
|
msg2916 (view) |
Author: silvan |
Date: 2014-01-28.14:03:24 |
|
At least we did not find any obvious problem in the code, and the recent
performance fix for iPDB could be related to the pegsol performance,
although it is unclear whether the two cases are really connected.
|
msg2915 (view) |
Author: gabi |
Date: 2014-01-28.14:02:20 |
|
It seems that there is no problem in the code and we can simply close this
issue. Right?
|
msg2914 (view) |
Author: silvan |
Date: 2014-01-28.13:59:22 |
|
I agree, the most likely explanation seems to be that the reported values in the
SoCS paper are incorrect.
Gabi, do you have any opinion on this? I would like to get rid of this open
issue at some time... ;)
|
msg2893 (view) |
Author: malte |
Date: 2014-01-06.21:19:30 |
|
I see -- but this doesn't really solve the mystery of why the numbers in the
SoCS paper are quite good (18 solved tasks), but when we try to reproduce them
with what we think should be the correct version, we only solve 3. So I would
treat these numbers as suspect w.r.t. the SoCS paper version.
|
msg2892 (view) |
Author: florian |
Date: 2014-01-06.21:17:16 |
|
Yes, I was looking at the 2008 benchmark. As for the detailed info, I believe
this is from a per-problem report that I looked at with Silvan some time ago. It
contains some information about the number of iterations, the initial h value
and the (max?) improvement, but not the full logs.
|
msg2891 (view) |
Author: malte |
Date: 2014-01-06.21:08:49 |
|
I think you're looking at different benchmark sets. The SoCS paper uses the 2011
benchmark suite, which has 20 peg solitaire instances. The paper shows a
coverage of 18, and issue402 shows a coverage of 19. That's close enough
considering that we made other improvements since then and are likely running on
a faster machine.
Regarding msg2656, I'm not really sure where the detailed information there
comes from, since the main problem here is that we don't have the logs for the
actual experiments run back then and can't reproduce them.
|
msg2890 (view) |
Author: florian |
Date: 2014-01-06.21:00:24 |
|
I re-read the messages here and to me it looks like issue402 was not responsible
for the original change in coverage. Gabi mentioned (msg2656) that the reported
coverage was 18 and with issue402 it is 29. This could be due to the other
changes we made, but still seems like a large difference. Also, we can guess
from the reports that the SoCS version stopped hill-climbing earlier (also
msg2656) and issue402 should not influence this.
I am still with Silvan on this one: without knowing the exact code and without a
way to reproduce the results, it will be hard to find out anything new, so I
would be OK with closing this issue.
|
msg2889 (view) |
Author: malte |
Date: 2014-01-06.20:42:43 |
|
Florian and Gabi, what do you think?
BTW, regarding the earlier comment on hg meld: I always run this as "hg meld -r
rev1:rev2", i.e., with a colon instead of two -r arguments, when I want to
compare two specific revisions. In my past experience, this always seemed to
work. The four main forms of hg diff/meld I use regularly are:
$ hg diff
Compare working directory to parent.
$ hg diff -r 10
Compare working directory to revision 10.
$ hg diff -c 10
Compare revision 10 to its parent.
$ hg diff -r 10:20
Compare revision 10 to revision 20.
|
msg2888 (view) |
Author: silvan |
Date: 2014-01-06.20:36:25 |
|
I'd be happy to be lazy and close this issue :)
But maybe others have different opinions?
|
msg2885 (view) |
Author: malte |
Date: 2014-01-06.18:39:53 |
|
What is your preference?
|
msg2880 (view) |
Author: silvan |
Date: 2014-01-06.11:51:12 |
|
To answer your older questions: We do not have any logs from the runs, unless
they exist in some archived form on a DVD in Freiburg (I remember you asked us
to pack up some of the experiment data for the paper and give it to Uli for
archiving).
I still have two old repositories:
- one from the teamprojekt, where the last revision is 232de6d0ff7c (which is
after you merged in our pdb code).
- one for the socs paper where we back-integrated our two variants of pdb (base
and efficient) and where we have many experiment scripts we used for the socs
experiments. I therefore believe we also used the downward version from this
repository for the experiments. The last "merge from master" revision is
1246ebf3408f.
Anyway, when I compare those two revisions, all changes I find that are related
to pdbs concern additional statistics and option parsing. I think it is safe to
assume that these revisions behave the same (unless some changes in the
landmarks code, the mas code or the lmcut heuristic could have some impact on
the pdbs).
I then compared the newer of those versions (1246ebf3408f) against the version
used in the experiments below (64c3312cf51f), which dates from September 17,
2013. The changes there include the addition of the dominance pruner and some
other smallish changes which I cannot believe to be a reason for the observed
behavior (but of course there are many changes outside the pdbs, e.g. in the
landmarks and the state representation).
(Btw., if you try to have a look at the diffs yourself: for me, hg meld -r <rev>
-r <rev> did *not* work, it always compared against the revision the repository
was currently updated to. So you would possibly need two clones, update to the
respective revisions and diff manually.)
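A minimal sketch of this two-clone workaround, meant as a hedged illustration rather than a record of what was actually run. `hg clone -u REV` is the standard Mercurial option that updates the new clone to the given revision. The repository path `./downward`, the clone directories under `/tmp` and the helper names `run` and `compare_revisions` are assumptions made for the example. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: diff two Mercurial revisions via two separate clones.
# All paths and helper names below are hypothetical placeholders.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: compare_revisions REPO OLD_REV NEW_REV [OLD_DIR] [NEW_DIR]
compare_revisions() {
    repo=$1
    old_rev=$2
    new_rev=$3
    old_dir=${4:-/tmp/old-clone}
    new_dir=${5:-/tmp/new-clone}
    # Each clone is updated to one of the two revisions.
    run hg clone -u "$old_rev" "$repo" "$old_dir"
    run hg clone -u "$new_rev" "$repo" "$new_dir"
    # Recursive unified diff, skipping the Mercurial metadata directories.
    run diff -ru -x .hg "$old_dir" "$new_dir"
}

# The two revisions compared in this message.
compare_revisions ./downward 1246ebf3408f 64c3312cf51f
```

Working on throwaway clones keeps the currently checked-out working directory untouched, which sidesteps the behaviour described above.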
So, to conclude and to answer your latest question: I am not at all sure what
caused the observed behavior, and I am not sure whether the issue is resolved by
the increased coverage from issue402. But I would be happy to "accept" it as
resolved because I cannot think of any other, better explanation for where the
coverage in pegsol could have been lost.
|
msg2867 (view) |
Author: malte |
Date: 2013-12-30.19:45:15 |
|
It looks like issue402 might be related to this.
Do you think we can close this as resolved by issue402?
|
msg2676 (view) |
Author: malte |
Date: 2013-09-26.19:11:33 |
|
OK, I guess both need a bit of investigation. Maybe the 7 is less strange than
the 3 because we made a number of iPDB-related changes in the last months, but
it would still be good to find out at which point we jumped from 0 to 7.
Regarding the 3, I guess this means we need to find out what exactly we ran for
the SoCS experiment, and maybe also where. Is there a good way to find this out?
For example, do we still have the log files from these runs?
|
msg2675 (view) |
Author: silvan |
Date: 2013-09-26.19:09:02 |
|
A bit late, but here are the results:
http://ai.cs.unibas.ch/_tmp_files/sieverss/ipdb-old-new-revisions-d.html
http://ai.cs.unibas.ch/_tmp_files/sieverss/ipdb-old-new-revisions-p.html
Interestingly, the results do not reflect the ones from the papers (at least for
the default ipdb config used here). The left column shows the old SoCS-version
code, which achieves a coverage of 3 (!= 18), and the right column shows the
newest downward version, which achieves a coverage of 7 (!= 0). This is still
strange...
|
msg2669 (view) |
Author: silvan |
Date: 2013-09-20.15:25:29 |
|
The experiment is running. For now I only took the default ipdb config; if you
want more diverse configurations (e.g. like in the SoCS paper), let me know.
|
msg2666 (view) |
Author: malte |
Date: 2013-09-19.17:50:37 |
|
Great! Can we get started by making an experiment that compares the old and new
code on peg solitaire?
|
msg2663 (view) |
Author: silvan |
Date: 2013-09-19.10:35:01 |
|
We do have the old code and it is in the repository. The last revision I ever
pushed to our Teamprojekt-repository is this one: 232de6d0ff7c
I am very sure that the experiments for the SoCS-paper have been run on this
revision. Furthermore, I could reproduce the behavior on the first
pegsol-instance which is solved in about 60s with the old code.
As there were some issues with running the old revision, I pushed a fixed
version to an ai-repos-repository, to which I've granted Malte access. If anyone
else is interested, let me know.
|
msg2657 (view) |
Author: malte |
Date: 2013-09-16.18:15:10 |
|
Is there a way for us to reproduce these results, i.e., is the old SoCS code in
the repository? If yes, what revision?
|
msg2656 (view) |
Author: gabi |
Date: 2013-09-16.18:06:29 |
|
In their PDB paper, Sievers, Ortlieb and Helmert (SoCS 2012) report 18 solved
pegsol instances with iPDB. In their IJCAI 13 paper, Pommerening, Röger and
Helmert report a coverage of 0, although there should be no difference.
Florian and Silvan already had a deeper look at the first instance:
It seems that in the first four hill-climbing iterations, similar patterns are
found: we do not have detailed logs for the SoCS results. According to the iPDB
output we find patterns of the same size but we cannot know whether they are
actually the same. However, with the old results these 4 iterations took 86
seconds in contrast to more than 330 seconds with the new results.
Afterwards, the hill-climbing search in the SoCS results stops with an h-value
of 1, but in the newer results the hill-climbing search continues because it
finds new patterns with an improvement of 32. It runs the hill-climbing until it
times out and finds larger and larger patterns, increasing the h-value to 2 (h*
is 3).
It is unclear why we observe this different behaviour.
|
|
Date | User | Action | Args
2014-01-28 16:14:32 | malte | set | status: chatting -> resolved; messages: + msg2917
2014-01-28 14:03:24 | silvan | set | messages: + msg2916
2014-01-28 14:02:20 | gabi | set | messages: + msg2915
2014-01-28 13:59:22 | silvan | set | messages: + msg2914
2014-01-06 21:19:30 | malte | set | messages: + msg2893
2014-01-06 21:17:16 | florian | set | messages: + msg2892
2014-01-06 21:08:49 | malte | set | messages: + msg2891
2014-01-06 21:00:24 | florian | set | messages: + msg2890
2014-01-06 20:42:43 | malte | set | messages: + msg2889
2014-01-06 20:36:25 | silvan | set | messages: + msg2888
2014-01-06 18:39:53 | malte | set | messages: + msg2885
2014-01-06 11:51:12 | silvan | set | messages: + msg2880
2013-12-30 19:45:15 | malte | set | messages: + msg2867
2013-09-26 19:11:33 | malte | set | messages: + msg2676
2013-09-26 19:09:02 | silvan | set | messages: + msg2675
2013-09-20 15:25:29 | silvan | set | messages: + msg2669
2013-09-19 17:50:37 | malte | set | messages: + msg2666
2013-09-19 10:35:01 | silvan | set | messages: + msg2663
2013-09-16 18:15:10 | malte | set | status: unread -> chatting; messages: + msg2657
2013-09-16 18:06:29 | gabi | create |