Issue1012

Title: copy Buildbot nightly translator tests to GitHub Actions
Priority: wish
Status: resolved
Superseder:
Nosy List: clemens, florian, jendrik, malte, silvan
Assigned To: jendrik
Keywords:
Optional summary: Part of issue1001.

Created on 2021-02-19.16:58:17 by jendrik, last changed by jendrik.

Messages
msg10148 (view) Author: jendrik Date: 2021-02-26.15:52:58
Done. For reference: I went to the "Daily Tests" workflow page and clicked on the "..." symbol for each workflow run and then selected "delete workflow run". After all runs were gone, the whole workflow disappeared as well.
msg10145 (view) Author: silvan Date: 2021-02-26.15:28:28
> I'll close the pull request.
The workflow "daily" still shows on the aibasel repository. I think it would be good to get rid of it. Could you please look into this? Maybe you have to also remove the commits of your fork or remove/disable the workflow manually from the aibasel repo.
msg10131 (view) Author: malte Date: 2021-02-23.17:17:47
> Commit f2dc7fda5 fixed the ".hg" directory name problem. The Buildbot
> tests are now failing for other reasons.

It looks like this is because the tests no longer include a build step. This was changed during the last sprint, and I remember that someone commented that this would break certain parts of the buildbots. If it only affects tests we want to disable anyway, there is not necessarily something we need to do about it.
msg10130 (view) Author: jendrik Date: 2021-02-23.16:53:18
OK, let's drop the tests. I'll close the pull request.

Commit f2dc7fda5 fixed the ".hg" directory name problem. The Buildbot tests are now failing for other reasons.
msg10129 (view) Author: malte Date: 2021-02-23.12:49:14
> In my opinion this only makes sense if we have something that we would like to
> test that takes so long that we don't want to do it on every push (translating
> one task per domain might qualify, I don't know).

I don't think it should take long. Right now the tests fail, I think because the test code mistakes the ".hg" directory within the benchmark repository for a directory containing a planning domain.
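
One way to avoid this kind of mix-up, as a minimal sketch only: when collecting domain directories, skip hidden directories such as ".hg". This is not necessarily how the test code enumerates domains or how the problem was eventually fixed; the function name `domain_dirs` and the parameter `benchmarks_root` are hypothetical.

```python
from pathlib import Path


def domain_dirs(benchmarks_root: Path) -> list[Path]:
    """Return the subdirectories of `benchmarks_root` that may hold domains.

    Hidden directories (names starting with "."), such as ".hg" or ".git",
    are version-control metadata and are skipped. Plain files are ignored.
    """
    return sorted(
        path
        for path in benchmarks_root.iterdir()
        if path.is_dir() and not path.name.startswith(".")
    )
```

Filtering on the leading dot keeps the check independent of which version control system the benchmark repository uses.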
msg10128 (view) Author: malte Date: 2021-02-23.12:41:24
I don't have a strong opinion, but like Silvan my recommendation would be to drop these tests.

For what it's worth, if we check if the translator was modified, we could run these as part of the regular tests. Translator changes are rare, I don't think it would be a problem. (If the tests run too long because some domain for some reason has a huge first task, I'd just adapt the test.) But I think the better solution would be to keep things simple and not have these tests at all.
msg10125 (view) Author: silvan Date: 2021-02-23.12:07:04
> I'd like to limit this issue to whether we want to keep the nightly translator tests (in their current status) or whether we should drop them.

I'm in favor of dropping them. If we keep them, I agree with Florian's suggestion of only running the tests if necessary. We could even go further and check if the translator was modified.
msg10122 (view) Author: florian Date: 2021-02-22.18:26:25
I think it could make sense to just abort the tests if the revision didn't change, i.e., to run it once per "day that had pushes". In my opinion this only makes sense if we have something that we would like to test that takes so long that we don't want to do it on every push (translating one task per domain might qualify, I don't know).
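
The "once per day that had pushes" idea could look roughly like the sketch below, under the assumption that the daily job can read the current revision identifier and keep a small state file between runs. The names `should_run` and `state_file` are hypothetical; how the revision is obtained and where the state is stored are left open.

```python
from pathlib import Path


def should_run(current_revision: str, state_file: Path) -> bool:
    """Decide whether the daily tests need to run for `current_revision`.

    Returns False if `state_file` already records this revision. Otherwise
    the new revision is written to `state_file` and True is returned.
    """
    previous = state_file.read_text().strip() if state_file.exists() else None
    if previous == current_revision:
        return False
    state_file.write_text(current_revision + "\n")
    return True
```

Note that this sketch records the revision before the tests run, so a failed run is not retried on the next day unless there is a new push.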
msg10121 (view) Author: clemens Date: 2021-02-22.18:21:53
Sorry, didn't mean to change the title.
msg10120 (view) Author: clemens Date: 2021-02-22.18:20:30
How often is the translator modified? I assume far less than every day. Since the tasks are static, nothing about the tested objects changes, or am I missing something? I don't understand why this should be tested daily.
msg10119 (view) Author: jendrik Date: 2021-02-22.18:19:57
I'd like to limit this issue to whether we want to keep the nightly translator tests (in their current status) or whether we should drop them.
msg10118 (view) Author: florian Date: 2021-02-22.18:07:23
I think a better way to check for determinism would be to compare the result to the result obtained on the previous day or a reference result.

Is this issue just for the translator tests or should we also discuss what other tests make sense to run once a day?
msg10115 (view) Author: jendrik Date: 2021-02-19.17:41:46
Yes, the translator tests check that the output is deterministic. Of course the proposed daily action would also check that all first tasks from the downward-benchmarks repo can actually be translated. Three times :-)

I think checking for determinism has value, but I'm fine with dropping the daily test.

You can find the code for the new workflow here: https://github.com/aibasel/downward/pull/31
msg10113 (view) Author: silvan Date: 2021-02-19.17:26:46
To me, this test doesn't sound useful. What do we test with this? That the translator is deterministic?
msg10112 (view) Author: malte Date: 2021-02-19.17:04:52
Is this a useful test? We shouldn't run it just because we always used to run it.
msg10110 (view) Author: jendrik Date: 2021-02-19.16:58:17
The Buildbot currently runs two nightly/weekly tests: regression tests and "medium" translator tests. Both tests are the same for the nightly and weekly variant. We agreed that the regression tests in their current form don't have much value and that we want a better solution for tracking performance regressions in the long term. The "medium" translator tests run the first task of each domain three times and check that the output is the same.

I suggest running the "medium" translator tests daily on GitHub.
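
For reference, the determinism check described above can be sketched as follows. This is a minimal illustration, not the actual test code: `translate` is a hypothetical callable that runs the translator on one task and returns the bytes of the produced output file, and `is_deterministic` is a name chosen for this example.

```python
import hashlib
from typing import Callable


def is_deterministic(translate: Callable[[], bytes], runs: int = 3) -> bool:
    """Call `translate` `runs` times and report whether all outputs match.

    Outputs are compared via their SHA-256 digests, so only one digest per
    run has to be kept instead of the full output.
    """
    digests = {hashlib.sha256(translate()).hexdigest() for _ in range(runs)}
    return len(digests) == 1
```

A test would call this once for the first task of each domain and fail if any call returns False.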
History
Date User Action Args
2021-02-26 15:52:59 jendrik set messages: + msg10148
2021-02-26 15:28:28 silvan set messages: + msg10145
2021-02-23 17:17:47 malte set messages: + msg10131
2021-02-23 16:53:19 jendrik set status: chatting -> resolved; messages: + msg10130
2021-02-23 12:49:14 malte set messages: + msg10129
2021-02-23 12:41:25 malte set messages: + msg10128
2021-02-23 12:07:04 silvan set messages: + msg10125
2021-02-22 18:26:25 florian set messages: + msg10122
2021-02-22 18:22:28 clemens set title: run nightly translator tests on GitHub -> copy Buildbot nightly translator tests to GitHub Actions -> copy Buildbot nightly translator tests to GitHub Actions
2021-02-22 18:21:53 clemens set messages: + msg10121; title: run nightly translator tests on GitHub -> run nightly translator tests on GitHub -> copy Buildbot nightly translator tests to GitHub Actions
2021-02-22 18:20:30 clemens set nosy: + clemens; messages: + msg10120; title: copy Buildbot nightly translator tests to GitHub Actions -> run nightly translator tests on GitHub
2021-02-22 18:19:57 jendrik set messages: + msg10119; title: run nightly translator tests on GitHub -> copy Buildbot nightly translator tests to GitHub Actions
2021-02-22 18:07:24 florian set nosy: + florian; messages: + msg10118
2021-02-19 17:42:01 jendrik set assignedto: jendrik
2021-02-19 17:41:46 jendrik set messages: + msg10115
2021-02-19 17:26:46 silvan set messages: + msg10113
2021-02-19 17:26:04 silvan set nosy: + silvan
2021-02-19 17:04:52 malte set status: unread -> chatting; messages: + msg10112
2021-02-19 16:58:17 jendrik create