Issue662

Title Add option to the build system for dynamic linking
Priority wish Status resolved
Superseder Nosy List florian, gnad, haz, jendrik, malte, silvan
Assigned To florian Keywords
Optional summary

Created on 2016-07-26.14:54:27 by silvan, last changed by florian.

Messages
msg6152 (view) Author: florian Date: 2017-02-24.14:59:03
Looks like I forgot to close the issue when I merged.
msg6120 (view) Author: florian Date: 2017-01-13.17:34:00
Thanks. I merged and the buildbot ran fine (except of course for the mac which
was broken before that).
msg6119 (view) Author: malte Date: 2017-01-13.16:33:36
All 13 configurations in build.py build correctly on kibo (gcc Ubuntu
6.2.0-5ubuntu12), kruemel (same gcc version) and dakar (gcc Ubuntu
5.4.0-6ubuntu1~16.04.4).
msg6118 (view) Author: florian Date: 2017-01-13.16:14:20
Ahh, yes, I mixed up "remove" and "add". Your interpretation is correct and the
pull request is up to date.
msg6117 (view) Author: malte Date: 2017-01-13.16:10:16
> Using clang on Linux caused a linker error in static builds because of the flag
> -rdynamic which we removed in the patch (now added again, but only for this
> case).

I don't understand. I don't see anything on bitbucket adding -rdynamic in any
configuration (and I also don't see why it should be needed -- it seems to be a
somewhat esoteric in-process introspection option).

From the comment on bitbucket, it seems that the issue was that we didn't
*remove* an implicitly added -rdynamic for a static build. Am I misunderstanding
things, or is the pull request on bitbucket out of date?

> Should I merge, or are there additional tests I should run first?

In half an hour or so, the compiles I mentioned in my previous message should
have finished, so perhaps it makes sense to wait for that.
msg6116 (view) Author: florian Date: 2017-01-13.15:39:30
I pushed two more changes after testing on the three buildbots.

Windows builds were still working (the ones that worked before at least).
MacOS also sets the CMake variable UNIX to true, so our case distinction was
wrong (fixed now).
Using clang on Linux caused a linker error in static builds because of the flag
-rdynamic which we removed in the patch (now added again, but only for this case).
g++-4.8 and 5.3 on the buildbot worked fine.

Should I merge, or are there additional tests I should run first?
msg6115 (view) Author: malte Date: 2017-01-13.14:43:09
Sounds good to me. The LP issues for the static build are sufficient reason keep
the static build as the default.

I'll try compiling on my notebook, home desktop and work desktop and will let
you know if there are any issues.
msg6114 (view) Author: florian Date: 2017-01-13.14:24:04
The experiment finished and looks good to merge. There was no significant change
in the static build and the dynamic build is only ~2% worse than the static one.

I'm still in favor of keeping the static build as the default for two reasons:
most importantly, the LP stuff doesn't work in dynamic builds and would need
additional work. Also, the last time we measured, we got a difference of 10%
on a larger set of configurations and with a higher time limit. While I think
this experiment is sufficient to add an additional build option, I think we
should do more experiments if we want to make this our only build option.


Static builds in base and v1:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue662-v1-total_time-astar-lmcut-dynamic-issue662-base-astar-lmcut-static-issue662-v1-astar-lmcut-static.png

Static vs. dynamic in v1:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue662-v1-total_time-astar-lmcut-dynamic-issue662-v1-astar-lmcut-static-issue662-v1-astar-lmcut-dynamic.png

Full report:
http://ai.cs.unibas.ch/_tmp_files/pommeren/issue662-v1.html
msg6105 (view) Author: florian Date: 2017-01-12.23:43:52
> do we have a cmake TODO list?
Now we do (issue706).
msg6103 (view) Author: malte Date: 2017-01-12.22:58:55
New code looks nice!

I'd move the error checking (don't set FORCE_DYNAMIC_BUILD on unsupported
platforms) to the top of the function. I find this makes it clearer when reading
the code that all supported cases are covered. I also personally wouldn't make
FORCE_DYNAMIC_BUILD illegal on the Mac just because Mac builds are dynamic
anyway. But I also don't mind leaving things as they are.

In the dynamic build, I'm not sure if the line

            set(LINK_SEARCH_END_STATIC FALSE)

is necessary because I would expect this to be the default behaviour, but I also
don't think it hurts.


>> The usual way that makefiles support
>> static and dynamic linking is to provide binaries for both settings (say
>> "downward-static" and "downward-dynamic"). Is this infeasible with cmake?
>
> I wouldn't know how to do this. Given that we plan to use dynamic builds for
> profiling, we probably won't have too much cases where we switch between static
> and dynamic builds a lot.

Halving the time taken by the buildbot would be nice. :-) No need to look into
this now, but I'd appreciate if we could put in on our TODO list (do we have a
cmake TODO list?) at least as a wish item. For building libraries rather than
executables, this has been called "moderately easy":

http://stackoverflow.com/questions/2152077/is-it-possible-to-get-cmake-to-build-both-a-static-and-shared-version-of-the-sam
msg6102 (view) Author: florian Date: 2017-01-12.22:38:42
> That error message is not great, but I assume our systems are already infected
> by the "force static" virus to some extent due to the way we build the COIN
> libraries. Does the same issue happen if we don't follow the instructions on
> http://www.fast-downward.org/LPBuildInstructions but only build a dynamic
> library for COIN? I assume this implies removing "--enable-static" and
> changing the "LDFLAGS" line in our build instructions.
I managed to get a shared library of OSI installed and the build script
discovers it. That doesn't help too much, since we don't have a shared library
for CPLEX. I would like to table this until it actually comes up (i.e., until we
need a dynamic build with CPLEX).
msg6100 (view) Author: jendrik Date: 2017-01-12.22:22:12
If the experiment shows that the runtime penalty of dynamic builds is "not that 
bad", I'd be in favor of using a simpler build script.
msg6098 (view) Author: florian Date: 2017-01-12.21:51:42
I'm home by now, and I continued on the train for a bit. I restructured the
CMake macro in a way that hopefully makes it more explicit which flag is added
for which environment. Forcing the build to static or dynamic is only supported
on Unix, on the other systems we rely on a sensible build environment.

Let me also respond to some older comments here:

> What are everyone's preferences regarding supporting both kinds of builds
> going forward?
I'm in favor of keeping the static build as long as there is a performance
difference. In our last test (issue67) the static build was ~10% faster. Back
then, we decided that this was sufficient reason to keep forcing a static build
and think this still applies.

> Does [the maia experiment] involve more than a standard
> experiments running configs with appropriate driver options? Is there a
> convenient way to tell lab that we need non-standard builds? (Haven't we done
> things like this before, like comparing debug and release builds?)
The IssueConfig class in our common_setup.py has build_options and
driver_options so we can specify a build config for building and running the
planner. This should be relatively simple.

> The usual way that makefiles support
> static and dynamic linking is to provide binaries for both settings (say
> "downward-static" and "downward-dynamic"). Is this infeasible with cmake?
I wouldn't know how to do this. Given that we plan to use dynamic builds for
profiling, we probably won't have too much cases where we switch between static
and dynamic builds a lot.

> Could perhaps just make the debug build be dynamic by default
Unrelated to this issue, but there is an undocumented feature that lets you
define your own build configs and defaults without getting merge conflicts. Just
add a file "christians_build_configs.py" next to "build_configs.py" and use the
same syntax as that file to specify configs and defaults.
msg6097 (view) Author: malte Date: 2017-01-12.20:25:13
One of the reasons why we default to static builds was that we often had to run
the planner on systems where we couldn't compile it, and only static builds
would work when using machine A's build on machine B. This applies to debug as
well as release builds -- sometimes you want to debug on machine B. :-) (For
example if things only crash there.)

But this use case has become quite obscure for us, and hence we were discussing
to discontinue forced static builds to simplify the Makefiles, just like we're
planning to discontinue 32-bit builds. (Having options is of course always nice,
but clean and simple Makefiles are also nice. :-))

downward-dev used to be an "internal" list, but these days it's open to anyone
who is interested. It's linked from the main page on http://www.fast-downward.org.
msg6096 (view) Author: haz Date: 2017-01-12.20:15:26
Thanks Florian / Malte!

Seems fine to keep both options imo, and testing things briefly works locally 
(64bit dynamic debug -- generating some nice memory plots with valgrind's massif 
finally). Is there a reason to have a static debug build? Could perhaps just make 
the debug build be dynamic by default -- I can only imagine needing it for debug 
purposes.

PS. This fabled downward-dev list a Basel-only internal thing?
msg6095 (view) Author: malte Date: 2017-01-12.19:37:48
As long as it is an unsupported/undocumented feature, I don't think we need to
test the Windows and Mac buildbots with the new build parameters. This can wait
until the follow-up issue (if any) or even until the next time we need to make
more extensive changes when we get rid of 32-bit compiles.

But I think it would be good to do at least one small maia experiment (60 second
timeout would be enough, using any one of the common configs such as A* +
LM-Cut) to compare release32 to release32dynamic to make sure there's no
unforeseen performance impact. Does this involve more than a standard
experiments running configs with appropriate driver options? Is there a
convenient way to tell lab that we need non-standard builds? (Haven't we done
things like this before, like comparing debug and release builds?)
msg6094 (view) Author: florian Date: 2017-01-12.19:31:03
I handled your comments on bitbucket. I tried release32 and release32dynamic on
my laptop with g++6.2 so far (both work). I'll see if I get them to run on the
mac and windows buildbot before I have to leave. I'll also try with dynamically
linked OSI. Someone should also try clang and g++4.8 on Linux. And maybe a few
builds on maia ... This is why I don't like working on CMake issue.
msg6093 (view) Author: malte Date: 2017-01-12.18:35:28
I'm done making comments on bitbucket. Once you've had a chance to react to
them, let me know and I'll try it out on my notebook and home desktop. No matter
how we decide to go forward with this (see my previous message), I can think we
can merge this soon and leave the next steps (if any) for a future issue. But
I'd still like to at least make the plan for the future now.
msg6092 (view) Author: malte Date: 2017-01-12.15:34:21
That error message is not great, but I assume our systems are already infected
by the "force static" virus to some extent due to the way we build the COIN
libraries. Does the same issue happen if we don't follow the instructions on
http://www.fast-downward.org/LPBuildInstructions but only build a dynamic
library for COIN? I assume this implies removing "--enable-static" and changing
the "LDFLAGS" line in our build instructions.

To be clear, I'm not against having something in the cmakefile to make the error
messages clearer, as long as the trade-off between complexity and utility is
good. One way to proceed is to incorporate your changes now and decide later
whether we want to simplify the cmakefile by removing the static/dynamic linking
options (which probably means no static linking in the most common setting).

What are everyone's preferences regarding supporting both kinds of builds going
forward? Perhaps this is something we should ask on downward-dev. If we want to
explicitly support both, then I think we must update the buildbot to provide both.

One more question: given that the dynamic/static choice only affects linking,
not compilation, it seems very wasteful to have two separate build targets,
which means compiling everything twice. The usual way that makefiles support
static and dynamic linking is to provide binaries for both settings (say
"downward-static" and "downward-dynamic"). Is this infeasible with cmake?
msg6090 (view) Author: florian Date: 2017-01-12.09:38:24
That would work most of the time as well. The problem is with systems that do
not have the correct libraries. Without additional options CMake sometimes mixes
static and dynamic libraries which can lead to cryptic linker errors (I think it
was trying to link dynamically against a static library or vice versa). With the
additional options you would get a library-not-found error.

As an example, I get the following error when I try a dynamic build without
setting up any linker options:
  /usr/bin/ld: /opt/cplex-1263-32/cplex/lib/x86_linux/static_pic
  /libcplex.a(ucnv.ao): direct GOT relocation R_386_GOT32 against 
  `UCNV_FROM_U_CALLBACK_STOP_44_cplex' without base register can not be used when
  making a shared object

With the option that only looks for *.so files, we get a note that CPLEX was not
found and a dynamic build of the planner without CPLEX.
msg6089 (view) Author: malte Date: 2017-01-12.03:23:07
Maybe a naive comment, but given that dynamic linking tends to be the default on
Linux systems and you often have to work hard to get a static build, what
happens if we don't try passing any particular options or enforcing anything in
particular at all under Linux?
msg6088 (view) Author: florian Date: 2017-01-12.01:25:54
Its not impossible to get a dynamic build out of CMake, its just hard to force
it to reliably produce one in all circumstances. Since there is so much demand
for this issue, I gave it a try. The solution in the linked pull request works
on my laptop but only without an LP solver. I'm also pretty sure it won't work
on Windows. On MacOS every build should be dynamic anyway because Apple does not
support static builds. I'm not sure about clang on Unix.

As an additional limitation of the solution in the pull request, the option has
to be passed to the first cmake call. Switching from a dynamic to a static build
(e.g., with ccmake or qtcreator) is not possible. I added build configs
"release32dynamic" and "debug32dynamic" that set the necessary options.

So all in all it is a hacky solution but it might be sufficient for our purpose
(debugging with valgrind).

https://bitbucket.org/FlorianPommerening/downward-issues/pull-requests/32/issue662/diff
msg5994 (view) Author: haz Date: 2016-12-27.19:45:58
Perhaps a more direct approach would work: if the main objective is to profile the 
memory, then there might be a linkable library that could be optionally included 
with the (debug or profile) build and disabled by default. The dynamic linking just 
helps if you don't want to include anything in the project.
msg5993 (view) Author: malte Date: 2016-12-27.19:37:31
Unfortunately Florian, our cmake expert, says it not so easy to add a cmake
level switch for static vs. dynamic linking. Perhaps we can start with something
simpler, like a patch to change to dynamic linking, and take it from there.
msg5992 (view) Author: haz Date: 2016-12-27.19:34:14
+1 to exposing it down at the build.py level. It would be quite helpful for those 
building off of FD as well.
msg5990 (view) Author: malte Date: 2016-12-24.15:06:41
See also issue703.
msg5989 (view) Author: malte Date: 2016-12-24.15:03:09
Jendrik, can you add this to the agenda for the next Fast Downward meeting that
I'll be able to attend?
msg5478 (view) Author: silvan Date: 2016-07-26.14:54:26
Adding a build option to link dynamically would help in running profilers like
valgrind.

Such an option would be somewhat different from the existing build-type options,
and hence it is not immediately clear to me where such an option would come into
play (although the manual modification to remove the -static options from
linking in CMAke works easily). Another question is whether we would want to
forward this new option also to the build.py driver script.
History
Date User Action Args
2017-02-24 14:59:03floriansetstatus: chatting -> resolved
messages: + msg6152
2017-01-13 17:34:00floriansetmessages: + msg6120
2017-01-13 16:33:36maltesetmessages: + msg6119
2017-01-13 16:14:20floriansetmessages: + msg6118
2017-01-13 16:10:16maltesetmessages: + msg6117
2017-01-13 15:39:30floriansetmessages: + msg6116
2017-01-13 14:43:09maltesetmessages: + msg6115
2017-01-13 14:24:04floriansetmessages: + msg6114
2017-01-12 23:43:52floriansetmessages: + msg6105
2017-01-12 22:58:55maltesetmessages: + msg6103
2017-01-12 22:38:42floriansetmessages: + msg6102
2017-01-12 22:22:12jendriksetmessages: + msg6100
2017-01-12 21:51:42floriansetassignedto: florian
messages: + msg6098
2017-01-12 20:25:13maltesetmessages: + msg6097
2017-01-12 20:15:27hazsetmessages: + msg6096
2017-01-12 19:37:48maltesetmessages: + msg6095
2017-01-12 19:31:03floriansetmessages: + msg6094
2017-01-12 18:35:28maltesetmessages: + msg6093
2017-01-12 15:34:21maltesetmessages: + msg6092
2017-01-12 09:38:24floriansetmessages: + msg6090
2017-01-12 03:23:07maltesetmessages: + msg6089
2017-01-12 01:25:54floriansetmessages: + msg6088
2017-01-09 11:05:32floriansetnosy: + florian
2016-12-27 19:45:58hazsetmessages: + msg5994
2016-12-27 19:37:31maltesetmessages: + msg5993
2016-12-27 19:34:14hazsetnosy: + haz
messages: + msg5992
2016-12-24 15:06:41maltesetmessages: + msg5990
2016-12-24 15:03:09maltesetstatus: unread -> chatting
messages: + msg5989
2016-08-04 09:57:48jendriksetnosy: + jendrik
2016-07-29 10:28:09gnadsetnosy: + gnad
2016-07-26 14:54:27silvancreate