Issue870

Title don't force static builds
Priority wish Status resolved
Superseder Nosy List augusto, cedric, florian, guillem, jendrik, malte, silvan
Assigned To jendrik Keywords
Optional summary

Created on 2018-11-26.14:20:17 by jendrik, last changed by jendrik.

Messages
msg8155 (view) Author: jendrik Date: 2018-12-05.16:59:52
I went ahead and merged the changes.
msg8144 (view) Author: malte Date: 2018-12-03.12:26:47
Looks good to me. Does anyone else have objections to merging this?
msg8142 (view) Author: jendrik Date: 2018-12-03.09:18:56
I have reverted the changes to our Windows build code. The build is now forced to 
be static again. Since we have never linked statically on MacOS, this means that 
the only OS affected by this issue is Linux.
msg8138 (view) Author: malte Date: 2018-12-01.10:48:34
As long as we have a build on Windows that works well, I don't think there is a
reason to do anything there.

In the long run, we might consider getting rid of OSI for many reasons, given
that we already have our own abstraction layer for the LP code anyway.

Regarding using the Windows Subsystem for Linux, our aim is for each platform
build to be as close to the common practices on that platform as possible, so I
don't think this would be a step in the right direction. Out current build is
fully native, and I would expect that this is what most Windows users prefer.
msg8137 (view) Author: augusto Date: 2018-11-29.16:55:58
Jendrik and I tried to perform the 64-bits dynamic build on Windows and we
failed. The issue is (once again) OSI. In Windows, we must use a DLL file to
load OSI dynamically (just using the "-ldl" flag during OSI compilation is
useless), but I couldn't find an OSI.dll or something similar in their website
until now. We found how to produce the DLL for the whole CoinMP project, which
also contains OSI (according to their website) but this is not really what we
want. I will look at it again later. 

It should be also possible to compile OSI with Visual Studio and generate our
own DLL, but then the question is: would other users also do that? I don't have
any experience with the generation of DLL files using other people's project, so
I don't know how hard it actually is to do it.

I mentioned with Jendrik that in the newest Windows versions (Windows 10 and
Windows Server 2019; but with the exception of Windows 10 S) it *might* be
possible to compile Fast Downward using this new "Windows Subsystem for Linux"
(https://www.microsoft.com/en-us/p/ubuntu/9nblggh4msv6?activetab=pivot%3Aoverviewtab).
However, this is still just a hypothesis. I will test it tonight using my
personal computer, because the buildbot machine has Windows 7, and then I share
the results here.
msg8129 (view) Author: malte Date: 2018-11-28.16:01:08
I see. If it doesn't cause trouble and it's otherwise easy to support both
(user-configured, not forced) static and dynamic builds, I would suggest leaving
the directories in the search path for the time being. We can always reconsider
this later.
msg8128 (view) Author: jendrik Date: 2018-11-28.15:57:23
Sorry Cedric, that was by accident :-)

Maybe this is moot after msg8126 from Florian, but what I meant to say was: as 
long as we include the static library directories in FindCplex, the warning 
from msg8125 will show up. If we remove them, the warning goes away. I'm 
indifferent about keeping or removing them.
msg8127 (view) Author: malte Date: 2018-11-28.15:37:40
Adding back Cedric to the nosy list after Jendrik removed him. (Accidentally, I
hope. :-))

Jendrik, I don't fully understand what you're saying. You say that your changes
caused the warning you quote, but at the same time you say the warning was
already present before. Am I misunderstanding?
msg8126 (view) Author: florian Date: 2018-11-28.15:29:26
I think keeping those is the right thing to do since the find script should find
all installations of CPLEX no matter how you installed it. I don't think the
linker is confused, it just uses the libraries that it gets from the user
settings which end up to be the static ones. Linking with the static CPLEX
libraries using the gold linker causes the warning independent of what CMake does.
msg8125 (view) Author: jendrik Date: 2018-11-28.15:14:58
Pull request is updated. 

In FindCplex.cmake I kept the directories containing the static *.a libraries 
for now. This seems to confuse the linker as it prints the following warning 
(twice):

/scicore/soft/apps/binutils/2.31.1-GCCcore-8.2.0/bin/ld.gold: warning: 
/infai/seipp/lib/cplex64-12.8/cplex/lib/x86-
64_linux/static_pic/libcplex.a(ucnv_bld.ao): 
section .rodata.str1.32 contains incorrectly aligned strings; the alignment of 
those strings won't be preserved

This warning was already present when we forced dynamic and static builds. The 
warnings mentioned in issue866 are gone.
msg8124 (view) Author: jendrik Date: 2018-11-28.14:23:10
I'll update the pull request accordingly.
msg8116 (view) Author: malte Date: 2018-11-28.12:29:30
I was imprecise in mentioning the different options.

What I wanted to suggest with option 3) is not to *force* a dynamic build but
rather to force no linker options at all, so your option c) indeed. Similar to
how we want to do things in the future w.r.t. -m32/-m64: don't force anything,
but use what we get out of the box.

On Linux, my understanding is that this generally gives you dynamic builds
unless you went to some lengths to set things up differently. Hence why I
referred to this as "dropping support for static builds", but what I meant was
dropping support for any forced non-default build.

I doubt this will cause more support pain than the current setup. Forcing *any*
esoteric options generally leads to pain, and static builds are currently quite
esoteric. If we switch to the default, it should just work for the majority of
users. For users with a broken setup things might lead to breakage, but I think
it's a mistake to override the user's setting unless there's a really good
reason. If they manually meddle with build settings via environment variables
and end up with broken settings, I think it's correct to let the breakage
happen, as in "errors should never pass silently" etc.

I also think it should reduce the amount of documentation we provide, not
increase it. If that's not true, changing things might be a bad idea. But we'll
have to consider that once we have a concrete pull request.

In summary, so far it seems there are no objections to trying out this change?
("No longer force a static build.") Then as a next step we should see how this
can simplify our CMake files, whether it gets rid of the buildbot linker
warnings we currently have, more generally if that builds cleanly on all our
platforms, and what it means for the installation instructions for OSI etc.
msg8112 (view) Author: florian Date: 2018-11-28.12:09:58
Also, thanks for the links to the talks. I remember that I wanted to use some
new CMake features when we started with it but couldn't because the default
version on our supported OS versions was very much out of date. Since we
recently bumped the supported version, maybe we can use more features now.
Currently, we require  version 2.8.3 (8 years old), the most recent one is 3.13.0.
msg8111 (view) Author: florian Date: 2018-11-28.12:02:26
Interesting results, I wonder at what point the performance difference
disappeared. Out of curiosity, Jendrik, could you also prepare relative scatter
plots for time and memory of these configurations?

I agree that we should continue to support dynamic builds. We already have them
for profiling purposes.

As for making dynamic builds the default and dropping support for static builds:
the way I see it, there are three options (a) static build is the default; (b)
dynamic build it the default; (c) there is no default. From the perspective of
CMake, there is not much difference between (a) and (b). I don't have a
preference for static or dynamic builds. Either way, I don't think dropping some
lines of the non-default build from build_configs.py is a large benefit. 

We would only get a simpler build script if we remove the default (c). This
would compile with the flags and libraries set up by the user. The potential
problem with that is that if the user has a broken setup (e.g., dynamic library
installed but flags set to static), the error messages are not that helpful.
Without a default, CMake looks for both *.a" and *.so files and picks the first
one it finds. The differences between a working and a broken build come down to
subtle choices in environment variables.

I'm still OK with removing the default but I can imagine that what we save in
CMake complexity, we'll pay for in email support and documentation size.
msg8109 (view) Author: jendrik Date: 2018-11-28.08:19:27
I agree with you on all three questions. It seems dropping support for static 
builds would indeed simplify the CMake scripts considerably, it would allow us to 
drop some build configs from build_configs.py and it would simplify memory 
profiling since we would already have a binary that supports it.
msg8107 (view) Author: malte Date: 2018-11-28.08:02:00
The results look good to me. I left a comment on bitbucket.

How do we proceed?

There are two main reasons why we used static builds in the past: because they
had slightly better performance, and because they allowed us to build on our own
OS (with newer compilers etc.) and then copy the compiled executable to the
computers on which we do the planner experiments. It looks like the first reason
no longer applies, and for the second one I think singularity is now a much more
robust option.

So, questions:

1) Do we want to support dynamic builds?
2) Do we want to make them the default?
3) Do we want to drop support for static builds?

My answers to these questions would be "yes", "yes" and "yes if it helps with
maintenance of the CMake files or gets rid of a source of complexity in the
build process". But I don't know our CMake files well enough to evaluate this.

BTW, I've watched a lot of CppCon videos in the last months, and several have
mentioned that CMake has become a lot better in the last years and that one
should invest time in learning modern CMake practices. Unfortunately I don't
remember what exactly it was I watched, but some of these look relevant:

    https://www.youtube.com/watch?v=bsXLMQ6WgIk
    https://www.youtube.com/watch?v=eC9-iRN2b04
msg8104 (view) Author: jendrik Date: 2018-11-27.15:43:41
Here is a non-broken link to the patch: https://bitbucket.org/jendrikseipp/downward/pull-requests/109
msg8103 (view) Author: jendrik Date: 2018-11-27.15:42:15
Augusto sent me a patch that allows compiling dynamically with LP support (https://bitbucket.org/jendrikseipp/downward/pull-
requests/109) and disables static compilation.

Here are the results for the state equation heuristic:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue870-seq-static-vs-dynamic.html

I think the results look pretty good. The dynamic version needs ~28 MB more memory. I think this could be due to the fact that we 
strip the "downward" binary and the binaries we link in dynamically are not stripped. The two binaries themselves have the 
following sizes:

35582752 27. Nov 12:13 data/revision-cache/f7b2611b6522_a2e76f4d/builds/release64/bin/downward
 3110336 27. Nov 12:10 data/revision-cache/49a0be90af54_f14d1491/builds/release64dynamic/bin/downward
msg8101 (view) Author: malte Date: 2018-11-27.01:11:52
Thanks, Jendrik! I see no serious performance problems there, but we should wait
until we've seen an LP configuration, too.
msg8098 (view) Author: jendrik Date: 2018-11-26.22:17:37
I went ahead and ran the experiment for blind() and lmcut(). Here are the 
results:

https://ai.dmi.unibas.ch/_tmp_files/seipp/issue839-opt-static-vs-dynamic.html
msg8094 (view) Author: jendrik Date: 2018-11-26.14:53:00
When trying to compile dynamically, I get "Could NOT find Cplex (missing:  CPLEX_LIBRARIES)". Florian, do you know 
what I have to change to compile dynamically with LP support?
msg8092 (view) Author: malte Date: 2018-11-26.14:26:33
Thanks! Since the static/dynamic issues mainly arise in the context of the LP
solver, it would be good to also test an LP configuration.
msg8090 (view) Author: jendrik Date: 2018-11-26.14:20:16
Quoting Malte in issue866:

"I wonder if we should give up on the idea of forcing a
static build and should instead go with the system default everywhere. Many of
the build issues and complications we have when using CPLEX, OSI etc. seem to be
related to the fact that we want to do a custom thing. If it helps avoid
warnings, simplify the setup instructions and CMake file and would allow us to
use more vanilla configurations of the libraries we use, I wouldn't mind some
mild performance loss. (We should measure first.)"

I'll run an experiment for A* with lmcut() and blind() using 64-bit builds.
History
Date User Action Args
2018-12-05 16:59:52jendriksetstatus: reviewing -> resolved
messages: + msg8155
2018-12-03 12:26:47maltesetmessages: + msg8144
2018-12-03 09:18:56jendriksetstatus: in-progress -> reviewing
messages: + msg8142
2018-12-01 10:48:34maltesetmessages: + msg8138
2018-11-29 16:55:58augustosetmessages: + msg8137
2018-11-28 16:01:08maltesetmessages: + msg8129
2018-11-28 15:57:23jendriksetmessages: + msg8128
2018-11-28 15:37:40maltesetnosy: + cedric
messages: + msg8127
2018-11-28 15:29:26floriansetmessages: + msg8126
2018-11-28 15:14:58jendriksetnosy: - cedric
messages: + msg8125
2018-11-28 14:46:19cedricsetnosy: + cedric
2018-11-28 14:23:10jendriksetstatus: reviewing -> in-progress
messages: + msg8124
2018-11-28 12:29:30maltesetmessages: + msg8116
2018-11-28 12:09:58floriansetmessages: + msg8112
2018-11-28 12:02:26floriansetmessages: + msg8111
2018-11-28 08:19:27jendriksetmessages: + msg8109
2018-11-28 08:02:00maltesetmessages: + msg8107
2018-11-27 15:59:12augustosetnosy: + augusto
2018-11-27 15:43:41jendriksetmessages: + msg8104
2018-11-27 15:42:15jendriksetstatus: in-progress -> reviewing
messages: + msg8103
2018-11-27 01:11:52maltesetmessages: + msg8101
2018-11-26 22:17:37jendriksetmessages: + msg8098
2018-11-26 20:35:09silvansetnosy: + silvan
2018-11-26 14:53:00jendriksetnosy: + florian
messages: + msg8094
2018-11-26 14:26:33maltesetmessages: + msg8092
2018-11-26 14:26:00guillemsetnosy: + guillem
2018-11-26 14:20:17jendrikcreate