Issue1163

Title Show code revision that was used to compile the code
Priority feature Status chatting
Superseder Nosy List florian, jendrik, malte, masataro
Assigned To Keywords
Optional summary
PR: https://github.com/aibasel/downward/pull/236

Created on 2025-01-14.10:32:05 by florian, last changed by malte.

Summary
PR: https://github.com/aibasel/downward/pull/236
Messages
msg11736 (view) Author: malte Date: 2025-01-15.10:38:21
If that's an easier way to implement things, using it makes sense to me.


I'm slightly worried that loading a 140M search executable from the driver on every run just to print a release number is heavyweight in terms of time. On my notebook with SSD, invoking the search code from the command line and doing nothing takes 32 ms, which is not trivial. On an environment where the search executable comes from a network drive, it will be more serious, although perhaps these file systems are good at caching.

Still, with this setup I think it does make more sense for the search component itself to print the code revision as part of its regular execution, and then we'd only run it once. Then I wouldn't do anything special in the driver.


For what it's worth, I don't think this solves the "I forgot to rebuild" use case. To consider it addressed, we have to assume that someone is not diligent enough to build, but at the same time diligent enough to check the very verbose logs and be aware that the cryptic hash ID they find there is not the one they expected. (In the sequence you describe, the user wouldn't know which hash is the expected one to start with. This can be addressed by trying to resolve tags or branches like "HEAD". But they would still need to look, and then what if instead of forgetting to build they forgot to pull?)

But I think trying to ensure that someone ran "build" is not really worth it in the first place. If we wanted to make sure the user always build, I'd rather do things like have the driver do a build dry-run to check that things are up to date. But like I said, I think that's something I don't think it's worth doing. I don't see other programs doing it.

For me, the main value of the feature is having the revision in the output and logs for later inspection. For example, we often have people writing to our Discord who work with old versions of the planner, and sometimes the issues they report are already solved in the current version. Not having to do a round of back-and-forth to try to figure out what they actually ran is quite useful. If we can see what they ran from a copy-pasted log, that's very useful.
msg11735 (view) Author: florian Date: 2025-01-15.09:43:33
Agreed on the "-dirty" suffix - let's stick with this. It also has the advantage that it sticks out more than a "+" from a revision hash.

> Whatever mechanism determines the revision number would store the revision info in a place where the driver can access it. Doesn't the current patch already have to do something like this and save info about the revision somewhere?

Currently, the patch uses a CMake feature to create a temporary header file that is included in the build and defines a macro that is then used in the code. This header is only stored in the CMake cache. I view this as storing the revision in the binary and I suggested the "--version" option for the binary as the mechanism to access this stored data.

> It's somewhat insane to use the driver of revision X to run the code of revision Y
Yes it is, but it can happen accidentally. For example, if you build, pull, and forget to rebuild.
I think one goal of this patch is to make such cases more easily detectable.
msg11732 (view) Author: masataro Date: 2025-01-14.21:49:34
And "X-dirty" is meant specifically for uncommited changes. If you commit the changes, then it will use the hash + release style
msg11731 (view) Author: masataro Date: 2025-01-14.21:47:47
(I don't know how to use this issue tracker to send a reply)

"X-dirty" follows the same convension that git and magit uses when it refers to a submodule with uncommited changes.
     
   dirty

       All changes to the submodule’s work tree will be ignored, only committed differences between the HEAD of the submodule and its recorded state in the superproject are taken into account.

https://git-scm.com/docs/gitmodules
msg11730 (view) Author: malte Date: 2025-01-14.12:51:17
Whatever mechanism determines the revision number would store the revision info in a place where the driver can access it. Doesn't the current patch already have to do something like this and save info about the revision somewhere?

If the following points don't help, perhaps you can elaborate which specific things you see as contradictory.

- We have one driver, but potentially multiple builds. In general, the revision depends on the specific build chosen. It's somewhat insane to use the driver of revision X to run the code of revision Y, and if you do you should rather know what you're doing. For the same reason, you can have multiple coexisting builds based on different revisions.

- We can decide to consider people crazy that mix revisions in this way and assume that there is always "the revision". Then build.py/the CMake process could store this revision somewhere directly in builds/ when building stuff. (If you mix revisions, you would get the one from the latest build.)

- But it is probably much cleaner and in line with how CMake should work to store the revision in a directory with each individual build. Then the driver would look up the revision for the build that was selected on the command line, the same way it decides which search code to run. (Usually this would be the default build.)

- There is also the option of the build process directly writing to a file in driver/, like driver/build_revision.py, which could then be imported. (The file would need to be in .gitignore.) We probably don't want to do this because we don't think of this as a directory that we should automatically write to, but it's similar to what our release scripts do with the version number. A mechanism for updating such a file wouldn't then need to live in the CMake files; it could also live in ./build.py. This seems messier to me than the suggestion in the previous bullet item, but is something I've seen people do and would perhaps be the least amount of work.
msg11729 (view) Author: florian Date: 2025-01-14.11:53:18
I would have gone with "X+" instead of "X-dirty" because this is more in line with how we report versions ("24.06" is the release exactly and "24.06+" is the release with additional changes). But I'm fine with "X-dirty" as well. It certainly is more explicit.

Regarding your answers to 3 and 4, if we want the driver to report the compiled revision of the binary, don't we need a way for it to query for it? To me, your answers to 3, 3a, and 4 seem incompatible.

If the driver should report the compiled revision of the binary, we probably also should print which binary we are talking about. For example, something like this:

$ ./fast-downward.py --version
24.06+ (builds/release/bin/downward build revision deadbeef)
$ ./fast-downward.py --debug --version
24.06+ (builds/debug/bin/downward build revision baadf00d-dirty)
$ ./fast-downward.py --build=my-custom-build --version
24.06 (builds/my-custom-build/bin/downward build revision cafefeed)
msg11728 (view) Author: malte Date: 2025-01-14.11:24:06
> For me the main concern is that we should only print a git revision if the code was
> actually built from this revision. For example, if I git checkout X, then manually
> edit stuff, then build, I don't want the build to claim that this is X.

I see that Masataro suggested "X-dirty" for this case. That works for me too.
msg11727 (view) Author: malte Date: 2025-01-14.11:21:35
For me the main concern is that we should only print a git revision if the code was actually built from this revision. For example, if I git checkout X, then manually edit stuff, then build, I don't want the build to claim that this is X.

Regarding your questions:

1) something like "unknown revision" will do
2) yes, and it should say 24.06+ and not 24.06 unless it is precisely the revision of the release
3) I think it's more import for --version to report this than normal operation. If we want to enable it for normal operation, I would not treat this as particularly high verbosity. But given how chatty we currently are, there is no good reason not to print it on default settings.
3a) the compiled revision; if you modify the Python code in-place and then run it directly without building, you're on your own.
4) no, no reason to duplicate this; you're supposed to go via ./fast-downward.py
msg11726 (view) Author: florian Date: 2025-01-14.10:32:05
Masataro suggested a change in the pull request linked above to compile the git revision into the search binary and print it in every run. In the PR, we had some discussions about what to print where, so I thought I create this issue to have a more permanent place for the discussion.

The situation before the patch is that the driver supports an option --version that prints the number "24.06" if we are on a release build or "24.06+" if we are on some commit after the release build. The patch has a different purpose of identifying retroactively what revision of the code was used to build it. This can help prevent several mistakes like building, then pulling and forgetting to rebuild.

The main points of discussions are:
1) Sometimes we are building from a tarball, where no revision information is available. What should we report in such cases?

2) Should we include the release number ("24.06" or "24.06+") in the reported output of the binary?

3) Should the driver also report the revision in addition to the release, especially on non-release revisions where we currently only say "24.06+"?

3a) If so, should it print the current revision, i.e., the revision of the driver code used, or the revision that was used to build the binary, or both?

4) Should we give a --version option to the binary?


After some back and forth in the pull request, I'm currently leaning towards the following choices:
1) "unknown revision"
2) yes
3) no
4) no
History
Date User Action Args
2025-01-15 10:38:21maltesetmessages: + msg11736
2025-01-15 09:43:33floriansetmessages: + msg11735
2025-01-14 21:49:34masatarosetmessages: + msg11732
2025-01-14 21:47:47masatarosetmessages: + msg11731
2025-01-14 12:51:17maltesetmessages: + msg11730
2025-01-14 11:53:18floriansetmessages: + msg11729
2025-01-14 11:24:06maltesetmessages: + msg11728
2025-01-14 11:21:35maltesetmessages: + msg11727
2025-01-14 10:32:05floriancreate