Issue1116

Title Remove Experiments from the Repository
Priority wish Status resolved
Superseder Nosy List florian, jendrik, malte, silvan, simon
Assigned To Keywords
Optional summary
At the beginning of 2023 we decided to stop committing the experiments and
instead store them in 'ai-files' and make them accessible via the
https://ai.dmi.unibas.ch/ website.

Archiving new experiments this way while keeping the old ones in the repo is
inconsistent. I see no benefit in keeping them in the repo.

The question is: How should we archive the old experiments?

I think storing them anywhere other than 'ai-files' would be weird and
inconsistent.

At the moment the 'ai-files' is structured like this (sketch):

ai-files
│
... # not accessible from the website
│
└── experiments 
    ├── README
    ├── name_1
    ...
    ├── name_n
    └── ai
        └── downward
            ├── issue000
            │   ├── data
            │   └── scripts
            ... 
            ├── issue123 
            │   ├── data
            │   └── scripts
            │       ├── archive.py
            │       ├── common_setup.py
            │       ├── requirements.in
            │       ├── requirements.txt
            │       ├── custom_parser.py
            │       └── v1.py
            ...
            └── issue1234
                ├── data
                └── scripts

and the downward repo is structured like this (sketch):

downward
│
...
│
└── experiments 
    ├── README
    ├── issue456 
    │   ├── archive.py
    │   ├── common_setup.py
    │   ├── requirements.in
    │   ├── requirements.txt
    │   ├── custom_parser.py
    │   └── v1.py
    ...
    └── issue789

The ones in the repo do not contain the experiment results. 


One option would be to simply copy the issueXYZ folders from
downward/experiments into the ai-files/experiments/ai/downward folder. I think
this would be the simplest solution.
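
A minimal sketch of this first option, assuming both trees are reachable as
local paths. The function name and both path arguments are hypothetical, and
placing the copied files in a "scripts" subfolder is an assumption based on the
ai-files sketch above:

```python
import shutil
from pathlib import Path


def copy_issue_scripts(repo_experiments: Path, archive_root: Path) -> list[Path]:
    """Copy every issue* folder to <archive_root>/<issue>/scripts.

    repo_experiments: e.g. downward/experiments (hypothetical local path)
    archive_root: e.g. ai-files/experiments/ai/downward (hypothetical local path)
    Returns the list of newly created "scripts" folders.
    """
    copied = []
    for issue_dir in sorted(repo_experiments.glob("issue*")):
        if not issue_dir.is_dir():
            continue
        target = archive_root / issue_dir.name / "scripts"
        if target.exists():
            # Do not overwrite scripts that were already archived.
            continue
        # copytree creates missing parent folders of the target.
        shutil.copytree(issue_dir, target)
        copied.append(target)
    return copied
```

Issues that already have a "scripts" folder in the archive are skipped, so
experiments archived in 2023 would stay untouched.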

A slightly different solution would be to add a folder to
ai-files/experiments/ai/downward that stores the resolved issues, to declutter
the ai-files/experiments/ai/downward folder.

Are there any hurdles with these two solutions that I don't see?
Are there other solutions to consider?

Created on 2023-09-21.14:07:34 by simon, last changed by simon.

Messages
msg11401 (view) Author: simon Date: 2023-09-28.15:07:39
In the fast-downward meeting we agreed to basically do nothing with the old 
content of the repo folder downward/experiments.

New experiments are stored on ai-files so that they are accessible from the
website via a stable link. The old ones remain in the history of the repo in
case someone wants to look at them again.
msg11395 (view) Author: malte Date: 2023-09-22.12:01:00
Agreed. The pull request looks good. I also looked for further references to the old experiments directory in the code or wiki that would need updating and found none. Before we merge this, we might want to give people some advance warning on the internal Discord channel. Or just let them know -- they can always find the experiments in older revisions.

Regarding commit message style, the general pattern we try to follow is to say what is done in the change, possibly followed by further explanations/motivations if necessary. So rather than saying "Experiments do not belong in the main branch." we'd write "Remove experiments directory." Some people use past tense ("Removed") some or all of the time, but I think Jendrik has been lobbying for present tense. I'd also remove the reference to the main branch because there is nothing specific regarding the main branch here. We only have the main branch and the release branches, and we won't put the experiments in future release branches either.
msg11392 (view) Author: simon Date: 2023-09-22.09:56:39
I agree that this is a good topic for a fast-downward meeting.

Anyway, unless we decide that the repo is actually the best place for archiving
the experiments, all solutions would produce the same change in the repo:
removing the 'experiments' folder and removing mentions of it in the
.gitignore and .gitattributes files.
https://github.com/aibasel/downward/pull/182
msg11390 (view) Author: malte Date: 2023-09-21.14:48:55
Thanks for triggering this!

This may be a topic that we should best discuss synchronously (with everyone interested) to speed things up. I think discussion via the tracker will be inefficient because we first need to agree on what we want to achieve.

I don't think I understand all important details of the proposal and how it relates to what Florian implemented for archiving experiments. For example, I don't understand where the code lives in these examples, and which data is stored under data. Do we include logs? Raw data? HTML reports? We currently store ~8.5 MB under "experiments" in the repository. With full experiment data, this would be in the TB range, which raises different questions regarding our storage and backup solutions.

More generally, I don't see much value in storing old experiments scripts (with no code and no data) on the web server. (This is also not consistent, which I want to mention because you emphasize consistency.) The main reason why we have experiments on the web server is to link to results, in particular interpreted results (HTML, PNG). I don't think we ever wanted to link to an experiment script, so I'm not sure which need we want to address by uploading old experiments without data to the web server. A repository is a much better solution for pure code (offline use, browsability, searchability, revision control). To be clear, this doesn't mean that I advocate having experiments stored in a repository.

In a discussion, I think it would be good to put the use cases in the center (who wants to access which data and how do we serve these needs etc.). Ideally we should also keep in mind use cases outside of Fast Downward (e.g. for Prost and Powerlifted).
msg11389 (view) Author: simon Date: 2023-09-21.14:30:26
To keep https://ai.dmi.unibas.ch/_experiments/ai/downward/ from getting large,
we could group the issues into chunks of hundreds.
However, this would still break the links created so far in 2023.

Therefore I think this would hurt more than help.
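
To illustrate why grouping breaks links, here is a minimal sketch of such a
chunking scheme. The function and the "issueSTART-END" folder naming are
hypothetical and not something that exists on the server:

```python
def chunked_path(issue_number: int, chunk_size: int = 100) -> str:
    """Return a relative path that groups issues into chunks of chunk_size."""
    start = (issue_number // chunk_size) * chunk_size
    end = start + chunk_size - 1
    # Hypothetical naming scheme: issue<start>-<end>/issue<number>
    return f"issue{start}-{end}/issue{issue_number}"
```

Under this scheme an experiment currently found under "issue1116" would move to
"issue1100-1199/issue1116", so every link that was already posted would need a
redirect.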
msg11388 (view) Author: florian Date: 2023-09-21.14:22:54
Having different directories for active and resolved issues defeats one purpose of the online archive: that links stay stable.

For example, before we had the archive, we often uploaded plots and tables to https://ai.dmi.unibas.ch/_tmpfiles and posted a link to the tracker, which was awkward, because now all of these "temporary" files cannot be deleted without breaking links in the tracker. I see one advantage in the archive (https://ai.dmi.unibas.ch/_experiments/) in the fact that files are meant to permanently stay there.

So I would opt for the first option, where we have the same structure for new and old experiments (I would also add the "scripts" subdirectory to really make the structure equivalent). The downside is that the overview on https://ai.dmi.unibas.ch/_experiments/ai/downward/ will get large, but this is going to happen eventually anyway, and it should be easy to search for an issue number on that page.
History
Date User Action Args
2023-09-28 15:07:39 simon set status: chatting -> resolved
messages: + msg11401
2023-09-27 11:35:02 silvan set nosy: + silvan
2023-09-22 12:01:01 malte set messages: + msg11395
2023-09-22 09:56:39 simon set messages: + msg11392
2023-09-21 14:48:55 malte set messages: + msg11390
2023-09-21 14:30:26 simon set messages: + msg11389
2023-09-21 14:22:54 florian set messages: + msg11388
2023-09-21 14:07:34 simon create