GSoC Week 5-6

Summary of the two weeks

At this stage, the main focus is on making git-sparse-checkout work better with git-mv. After getting the “from out-of-cone to in-cone” series merged into master, I’m working on the complementary “from in-cone to out-of-cone” direction. It’s now a [PATCH v1]. Read the previous GSoC blog posts for more context.

In the meantime, I’m also experimenting with integrating git-rm with sparse-index. git-rm turns out to be a slightly less complex command in terms of its interaction with sparse-checkout, so working on it in parallel gives me some insight into sparse-index without worrying too much about its compatibility with sparse-checkout.

What I was doing in week 5

Week 5 was mainly about designing the “from in-cone to out-of-cone” series. As stated in previous blog posts, this series is a sister of the earlier “from out-of-cone to in-cone” series. It tries to make the opposite operation possible for mv, namely moving <source>, which is in-cone, to <destination>, which is out-of-cone.

The main steps of this operation are:

  • Determine whether the path being moved is clean or dirty (whether it has unstaged changes).

  • If the path is clean (no unstaged changes), then git-mv should move the <source> to the <destination>, both in the working tree and the index, then remove the resulting path from the working tree and turn on its CE_SKIP_WORKTREE bit.
    • Reasoning: the resulting <destination> path is now out-of-cone, so it should be automatically sparsified, that is, removed from the working tree and marked as CE_SKIP_WORKTREE in the index. Because the path is clean, doing so will not lose any information.
  • If the path is dirty (has unstaged changes), then git-mv should move the <source> to the <destination>, both in the working tree and the index, but should not remove the resulting path from the working tree and should not turn on its CE_SKIP_WORKTREE bit. Instead, it should advise the user to git add --sparse this path and run git sparse-checkout reapply to re-sparsify it.
    • Reasoning: though the resulting <destination> path is now out-of-cone, we cannot just sparsify it as we do for a clean path, because removing a path with unstaged changes loses information. Instead, we tell the user to stage the changes first (to prevent that loss), then use the git sparse-checkout reapply command to sparsify the path according to the sparse-checkout definition (cone).
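The steps above can be sketched as a small decision function. This is only a toy model in Python, not Git's actual C implementation: the Entry class, its fields, and move_out_of_cone are hypothetical names invented for illustration, and the skip_worktree field merely stands in for the CE_SKIP_WORKTREE bit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for an index entry plus its working-tree state."""
    path: str
    in_worktree: bool = True      # does the file exist in the working tree?
    skip_worktree: bool = False   # models the CE_SKIP_WORKTREE bit
    dirty: bool = False           # does the path have unstaged changes?


def move_out_of_cone(entry: Entry, destination: str) -> Optional[str]:
    """Model moving an in-cone entry to an out-of-cone destination.

    Returns advice for the user when the path cannot be sparsified
    automatically, otherwise None.
    """
    # The move itself happens in both the clean and the dirty case.
    entry.path = destination

    if not entry.dirty:
        # Clean path: safe to sparsify, nothing can be lost.
        entry.in_worktree = False
        entry.skip_worktree = True
        return None

    # Dirty path: keep it on disk and leave the bit off, then advise.
    return (
        f"'{destination}' has unstaged changes; run "
        f"'git add --sparse {destination}' and then "
        "'git sparse-checkout reapply' to re-sparsify it"
    )
```

A clean entry ends up out of the working tree with the bit set, while a dirty entry stays on disk and the caller gets the advice string back.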

What I was doing in week 6

After I sent my week 5 patch to the mailing list, my mentors suggested that I take a look at integrating git-rm with sparse-index. git-rm already works relatively well with sparse-checkout, so it may not need as much preliminary work as I’m doing for git-mv.

I took the suggestion. Since I’ve been working with git-mv for a long time (around 3 months including research), it’s a good time to get some fresh air and work on something new to avoid burnout.

Integrating with sparse-index first requires adding tests to t1092, a shell script that tests Git commands’ compatibility with sparse-checkout and sparse-index. The tests primarily compare the results of running the same Git command in environments with different sparsity settings. For example, git-rm is run in a full-checkout tree (without any sparse configuration), a sparse-checkout tree, and a sparse-index tree (e.g. git sparse-checkout set <dir> --sparse-index). The results from the three setups are then compared against each other to see whether the command works correctly when sparse-checkout or sparse-index is on (what counts as correct depends on the command’s expected behavior, so it varies).
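The comparison idea can be illustrated with a short Python sketch. It is not the t1092 script itself, which is written in shell: diverging_setups, SETUPS, and the fake commands below are hypothetical names made up here, and strict equality against the full checkout is a simplifying assumption, since the real notion of correctness varies by command.

```python
from typing import Callable, List, Sequence

# Labels for the three environments described above (illustrative only).
SETUPS = ("full-checkout", "sparse-checkout", "sparse-index")


def diverging_setups(
    run_command: Callable[[str], str],
    setups: Sequence[str] = SETUPS,
) -> List[str]:
    """Return the setups whose output differs from the first (baseline) one.

    run_command takes a setup name and returns the command's output there.
    """
    baseline = run_command(setups[0])
    return [name for name in setups[1:] if run_command(name) != baseline]


def consistent_rm(setup: str) -> str:
    """Fake command that behaves identically in every setup."""
    return "rm 'folder/a'"


def buggy_rm(setup: str) -> str:
    """Fake command that misbehaves only under sparse-index."""
    return "error" if setup == "sparse-index" else "rm 'folder/a'"


print(diverging_setups(consistent_rm))  # no setup diverges
print(diverging_setups(buggy_rm))       # only sparse-index diverges
```

An empty result means the command behaved the same in all three trees; any name in the list points at the setup that needs investigating.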

This week I investigated the t1092 script and the git-rm source code. This is a prerequisite for understanding exactly what to test and how to test it. Only after establishing proper tests for a command do we know what works and what is buggy when the command runs under sparse setups.

What’s next

Next week I’ll write the compatibility tests for git-rm and see whether anything needs fixing based on the test results. If compatibility works out, I’ll go on to check that the sparse-index stays sparse when git-rm operates inside the sparse-checkout definition. A guide to integrating with sparse-index is written by my mentors here.