Issue950

Title: Migration Script hg to git
Priority: feature
Status: reviewing
Superseder:
Nosy List: cedric, florian, jendrik, malte, patfer, silvan
Assigned To: patfer
Keywords:
Optional summary
TODO:
* ~~Don't use the name "master" for the main branch~~
~~(https://www.bbc.com/news/technology-53050955).~~
* ~~Delete closed and merged branches (https://github.com/frej/fast-export/issues/84).~~
* Add documentation to fast-downward.org
* Set repository to public
* Write an announcement email with instructions

Created on 2019-12-19.16:40:31 by patfer, last changed by malte.

Files
File name Uploaded Type
empty-commits.txt.xz jendrik, 2020-07-03.12:34:37 application/x-xz
Messages
msg9529 (view) Author: malte Date: 2020-07-07.16:50:32
If it helps, I'm available for discussion after the PhD defense (which is currently in the discussion stage).
msg9526 (view) Author: jendrik Date: 2020-07-07.16:30:19
Maybe it's best to hold this discussion live when Malte is available so we can directly make a decision.
msg9525 (view) Author: florian Date: 2020-07-07.16:27:32
I'm not sure if I understand the situation where -d would produce an error. From the stackoverflow page Jendrik linked, it sounds like this happens if you are on a branch that has no path in the tree to some commit on the branch you want to delete. Does this mean this happens for us only if the branch we delete is closed and merged into a branch other than main that itself is open and unmerged?
msg9524 (view) Author: patfer Date: 2020-07-07.16:22:01
From our discussions, I thought we wanted to delete only the closed and merged
branches. Those should be safe with '-d'.

Did it happen on your test repository that '-d' tried to delete something and
failed, or that after conversion there was a branch left which you wanted
deleted? (In the latter case, I would be fine with telling users that they can
delete such a branch manually.)

I am a bit hesitant to switch to -D, because I am afraid that some users have
unmerged closed branches that they do not want to lose without being asked.
msg9523 (view) Author: jendrik Date: 2020-07-07.16:07:29
I tested the scripts again and made several small changes. The most critical one is 

-            call(["git", "branch", "-d", branch])
+            # Use -D to avoid "branch not fully merged" warning, which
+            # occurs when trying to delete a branch that is closed and merged
+            # into a non-"default" branch, even if that branch will not be closed
+            # (https://stackoverflow.com/questions/7548926).
+            call(["git", "branch", "-D", branch])

I'm unsure whether -D is the correct solution. Using -d has the advantage that we abort with an error if we're potentially making commits unreachable. But -d doesn't allow us to remove all closed-and-merged branches. Does someone have a better solution for this?
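
One possible middle ground between -d and -D, sketched here only as an illustration: force-delete a branch only after checking that its tip is still reachable from some branch that will remain. The function names below (is_ancestor, safe_to_force_delete, delete_closed_branch) are hypothetical and not part of the conversion script; the check relies on `git merge-base --is-ancestor`, which exits with status 0 when the first commit is an ancestor of the second.

```python
import subprocess


def is_ancestor(commit, descendant):
    """Return True if `commit` is reachable from `descendant` (runs git)."""
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", commit, descendant])
    return result.returncode == 0


def safe_to_force_delete(branch, remaining_branches, ancestor_check=is_ancestor):
    """Return True if some branch that stays around contains `branch`'s tip."""
    return any(
        ancestor_check(branch, other)
        for other in remaining_branches
        if other != branch)


def delete_closed_branch(branch, remaining_branches):
    """Force-delete `branch` only if no commit can become unreachable."""
    if not safe_to_force_delete(branch, remaining_branches):
        raise SystemExit(f"Refusing to delete unmerged branch {branch!r}")
    subprocess.check_call(["git", "branch", "-D", branch])
```

With this approach the script would still abort on closed branches that were never merged anywhere, which is the case patfer is worried about, while branches merged into a non-default branch could be removed.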
msg9508 (view) Author: jendrik Date: 2020-07-06.22:29:18
You're right. I didn't think it through.
msg9506 (view) Author: malte Date: 2020-07-06.21:57:49
I agree with Florian. The .hgignore file doesn't really hurt.
msg9505 (view) Author: florian Date: 2020-07-06.20:00:36
I would not treat removing .hgignore as part of this issue. The problem with this is that if the script does a commit to remove the file, this could lead to inconsistent repositories (if the commit is not at the same place in the history, for example if an older repository is converted). If we create a normal issue and commit for it outside of the conversion script, then getting from a .hgignore to a .gitignore is just a matter of merging with the main branch.

If you thought about removing it during the rewrite-history step, that would be an option but I don't see it as necessary to erase the file from history. I think a normal commit on the main branch that removes .hgignore and adds .gitignore would be fine. If we want to avoid commits on main, we could also create an issue for this.
msg9499 (view) Author: jendrik Date: 2020-07-06.19:25:40
Add TODO: remove .hgignore file.
msg9440 (view) Author: malte Date: 2020-07-03.14:07:44
> I fiddled around with filtering empty commits and noticed that the repository
> sizes differ greatly depending on whether the repo is freshly cloned or not.
> Long story short: I think we can only compare repository sizes of freshly
> cloned repos.

I think it would make sense to end the script with a cloning step then, so this doesn't trip other people up.
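
A final cloning step could look roughly like the sketch below. The function names and the two path arguments are assumptions made for illustration, not taken from the script. The sketch uses `git clone --no-local` so that Git goes through its regular transport instead of copying or hard-linking the object directory, which means only reachable objects end up in the result.

```python
import subprocess


def fresh_clone_command(converted_repo, final_repo):
    """Build the git command that clones the converted repo afresh."""
    # --no-local avoids the local-path shortcut that copies the whole
    # object store, including objects that are no longer reachable.
    return ["git", "clone", "--no-local", str(converted_repo), str(final_repo)]


def finish_with_fresh_clone(converted_repo, final_repo):
    """Run the fresh clone as the last step of the conversion."""
    subprocess.check_call(fresh_clone_command(converted_repo, final_repo))
```

One caveat: a plain clone creates a local branch only for the default branch, while the other branches become remote-tracking refs, so the script would need to handle that if all branches should stay local.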
msg9439 (view) Author: jendrik Date: 2020-07-03.13:54:19
The conversion script now removes empty commits.
msg9437 (view) Author: jendrik Date: 2020-07-03.13:35:21
I fiddled around with filtering empty commits and noticed that the repository sizes differ greatly depending on whether the repo is freshly cloned or not. Long story short: I think we can only compare repository sizes of freshly cloned repos.
msg9436 (view) Author: jendrik Date: 2020-07-03.12:53:04
> Question: does the migration script prepend [issueXYZ] to all commit messages?

Yes, it does.
msg9435 (view) Author: malte Date: 2020-07-03.12:49:01
> Do we get Malte's final approval?

We have to finalize issue652 first, or else we cannot really look at the resulting repository properly.

I'll try to review the issue652 result repository ASAP, and it would be great if others could, too.
msg9434 (view) Author: malte Date: 2020-07-03.12:43:39
> I'm appending the result. If nobody objects, I'll let the conversion script
> remove all empty commits.

They are all tags and branch status changes, which makes sense because Mercurial doesn't allow commits that do nothing; they have to change a file or some metadata like the branch info. (A commit might of course be empty because our conversion ignores all files that it changes, but it seems that such commits were already previously dropped by the conversion script.)

Question: does the migration script prepend [issueXYZ] to all commit messages? If not, after dropping these commits, we lose any trace of which branch/issue a commit belonged to. At the moment, we can use the "start branch XYZ/stop branch XYZ" commits for this.

(I can also check this myself, but I want to look at the hg cleanup issue first.)
msg9433 (view) Author: jendrik Date: 2020-07-03.12:34:37
I used the following command to obtain a list of empty commits:

for sha in $(git rev-list --min-parents=1 --max-parents=1 --all); do if [ $(git rev-parse ${sha}^{tree}) == $(git rev-parse ${sha}^1^{tree} ) ]; then git log --format="%B" -n 1 ${sha}; fi; done | sed '/^[[:space:]]*$/d'| tee empty-commits.txt

I'm appending the result. If nobody objects, I'll let the conversion script remove all empty commits.
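
For reference, the same check could be expressed in Python roughly as follows. This is only a sketch: git_output and find_empty_commits are hypothetical helper names, and the code mirrors the shell loop above (non-merge, non-root commits whose tree equals the tree of their first parent) without being part of the conversion script.

```python
import subprocess


def git_output(*args):
    """Run a git command and return its stripped standard output."""
    return subprocess.check_output(["git", *args], text=True).strip()


def find_empty_commits(run=git_output):
    """Return the hashes of single-parent commits that change no files."""
    shas = run(
        "rev-list", "--min-parents=1", "--max-parents=1", "--all").split()
    return [
        sha for sha in shas
        # A commit is empty if its tree is identical to its parent's tree.
        if run("rev-parse", f"{sha}^{{tree}}")
        == run("rev-parse", f"{sha}^1^{{tree}}")
    ]
```

Calling find_empty_commits() inside the converted repository would return the commit hashes; the commit messages can then be printed with `git log --format=%B -n 1 <sha>` as in the command above.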
msg9431 (view) Author: patfer Date: 2020-07-03.11:44:10
Thank you Florian for the review. All comments have been incorporated. The 
current state can be seen at:
https://github.com/aibasel/convert-downward/pull/3

Do we get Malte's final approval?
msg9390 (view) Author: florian Date: 2020-07-01.20:00:12
I'm done with my review. There was nothing major, so Patrick, once you have incorporated the changes, I'd say this is ready for Malte's final approval.
msg9374 (view) Author: florian Date: 2020-07-01.12:20:35
Copied over some notes from the task board:

We decided to keep the repo at aibasel/convert-downward instead of including it in the main repository.
msg9362 (view) Author: patfer Date: 2020-06-30.17:13:46
Yes, we renamed master to main and we delete those branches that are closed AND 
have been merged (without history loss)
msg9360 (view) Author: florian Date: 2020-06-30.16:31:36
The TODOs in the summary are done, right?
msg9359 (view) Author: patfer Date: 2020-06-30.16:19:39
The migration script is ready for the next round of reviewing.
It does not seem to be possible to open a code review for a whole repository,
so I cannot point you to a nice PR. Here is the code:
https://github.com/aibasel/convert-downward/
msg9355 (view) Author: jendrik Date: 2020-06-29.12:46:41
TODO:
* Don't use the name "master" for the main branch (https://www.bbc.com/news/technology-53050955).
msg9122 (view) Author: patfer Date: 2019-12-19.16:40:31
After sprint summary:

We want a tool that automatically converts our Fast Downward repositories from
Mercurial to Git.

I am working on this. A first (slightly outdated) version exists and can be seen at
https://github.com/aibasel/convert-downward/pull/2

TODO:
- finish the conversion of .hgignore to .gitignore (using simple rules plus a
user-given list of mappings; by "user" I mean myself, since I will create a
default list for those of our rules that cannot be converted automatically)
- after the conversion, all branches are open again (Git has no notion of closed branches). Decide what we want to do about them. One suggestion:
#Archive (maybe our new close) branch
git tag archive/<branchname> <branchname>
git branch -d <branchname>
git checkout master

# Reopen branch
git checkout -b new_branch_name archive/<branchname>
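
If the script were to automate this suggestion, the command sequences could be generated along these lines. This is a sketch under assumptions: archive_commands, reopen_commands and the default_branch parameter are hypothetical names. Unlike the listing above, it switches to the default branch before deleting, because Git refuses to delete the branch that is currently checked out.

```python
def archive_commands(branch, default_branch="master"):
    """Return the git commands that archive `branch` under a tag."""
    return [
        ["git", "tag", f"archive/{branch}", branch],
        # Leave the branch first: git cannot delete the checked-out branch.
        ["git", "checkout", default_branch],
        ["git", "branch", "-d", branch],
    ]


def reopen_commands(branch, new_branch_name):
    """Return the git commands that recreate a branch from its archive tag."""
    return [["git", "checkout", "-b", new_branch_name, f"archive/{branch}"]]
```

Each command list can be passed to subprocess.check_call. Note that `git branch -d` only considers whether the branch is merged into its upstream or into HEAD, so it can still refuse an unmerged branch even though the archive tag keeps its commits reachable.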
History
Date User Action Args
2020-07-07 16:50:32 malte set messages: + msg9529
2020-07-07 16:30:19 jendrik set messages: + msg9526
2020-07-07 16:27:32 florian set messages: + msg9525
2020-07-07 16:22:01 patfer set messages: + msg9524
2020-07-07 16:07:29 jendrik set messages: + msg9523
2020-07-06 22:29:18 jendrik set messages: + msg9508
summary: TODO: * ~~Don't use the name "master" for the main branch~~ ~~(https://www.bbc.com/news/technology-53050955).~~ * ~~Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).~~ * Add documentation to fast-downward.org * Set repository to public * Write an announcement email with instructions * Remove .hgignore file -> TODO: * ~~Don't use the name "master" for the main branch~~ ~~(https://www.bbc.com/news/technology-53050955).~~ * ~~Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).~~ * Add documentation to fast-downward.org * Set repository to public * Write an announcement email with instructions
2020-07-06 21:57:49 malte set messages: + msg9506
2020-07-06 20:00:36 florian set messages: + msg9505
2020-07-06 19:25:40 jendrik set messages: + msg9499
summary: TODO: * ~~Don't use the name "master" for the main branch~~ ~~(https://www.bbc.com/news/technology-53050955).~~ * ~~Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).~~ * Add documentation to fast-downward.org * Set repository to public * Write an announcement email with instructions -> TODO: * ~~Don't use the name "master" for the main branch~~ ~~(https://www.bbc.com/news/technology-53050955).~~ * ~~Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).~~ * Add documentation to fast-downward.org * Set repository to public * Write an announcement email with instructions * Remove .hgignore file
2020-07-03 14:07:44 malte set messages: + msg9440
2020-07-03 13:54:19 jendrik set messages: + msg9439
2020-07-03 13:35:22 jendrik set messages: + msg9437
2020-07-03 12:53:04 jendrik set messages: + msg9436
2020-07-03 12:49:01 malte set messages: + msg9435
2020-07-03 12:43:39 malte set messages: + msg9434
2020-07-03 12:34:37 jendrik set files: + empty-commits.txt.xz
messages: + msg9433
2020-07-03 11:44:10 patfer set messages: + msg9431
2020-07-01 20:00:12 florian set status: in-progress -> reviewing
messages: + msg9390
2020-07-01 12:20:35 florian set messages: + msg9374
summary: TODO: * ~~Don't use the name "master" for the main branch~~ ~~(https://www.bbc.com/news/technology-53050955).~~ * ~~Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).~~ -> TODO: * ~~Don't use the name "master" for the main branch~~ ~~(https://www.bbc.com/news/technology-53050955).~~ * ~~Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).~~ * Add documentation to fast-downward.org * Set repository to public * Write an announcement email with instructions
2020-06-30 17:13:46 patfer set messages: + msg9362
2020-06-30 16:31:36 florian set messages: + msg9360
summary: TODO: * Don't use the name "master" for the main branch (https://www.bbc.com/news/technology-53050955). * Delete closed and merged branches (https://github.com/frej/fast- export/issues/84). -> TODO: * ~~Don't use the name "master" for the main branch~~ ~~(https://www.bbc.com/news/technology-53050955).~~ * ~~Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).~~
2020-06-30 16:19:39 patfer set nosy: + florian
messages: + msg9359
summary: TODO: * Don't use the name "master" for the main branch (https://www.bbc.com/news/technology-53050955). * Delete closed and merged branches (https://github.com/frej/fast-export/issues/84). -> TODO: * Don't use the name "master" for the main branch (https://www.bbc.com/news/technology-53050955). * Delete closed and merged branches (https://github.com/frej/fast- export/issues/84).
2020-06-30 00:33:49 jendrik set priority: urgent -> feature
2020-06-29 14:48:41 jendrik set priority: feature -> urgent
summary: TODO: * Don't use the name "master" for the main branch (https://www.bbc.com/news/technology-53050955). * Delete closed and merged branches (https://github.com/frej/fast-export/issues/84).
2020-06-29 13:03:33 malte set priority: urgent -> feature
2020-06-29 12:46:41 jendrik set messages: + msg9355
2019-12-19 16:40:31 patfer create