GSoC Week 3-4
For some reason, I totally forgot to write the blog for week 3, so here is a bundle of the progress from week 3 and week 4.
Summary of the two weeks
At this stage, the main focus is on making git-sparse-checkout work better with git-mv. With the ongoing “from out-of-cone to in-cone” clearing up, I’m also working to make the complementary “from in-cone to out-of-cone” possible. Read the previous blogs on GSoC to get a better context.
In the meantime, some experiments towards integration with sparse-index have also started, which are based on the latest work “from in-cone to out-of-cone” boiling in my branch.
This week I’m working to ship a PATCH v5 (please reference all the code here) to address the issues raised in PATCH v4.
The good news is that PATCH v5 is being queued into the ‘next’ branch, which means it could potentially be merged into ‘master’. This means that this stage of the work is almost done.
What’s going on in PATCH v5
- Fix style nits.
- Add t1092 tests (2/8) for “mv: add check_dir_in_index() and solve general dir check issue” (8/8).
There is really not much to say about v5: it addressed some questions/ideas raised in v4, and that’s it.
What I was doing in week 3
I spent the whole week experimenting with moving from in-cone to out-of-cone. It is a complementary part to the ongoing out-of-cone to in-cone series.
In this form of move, the <destination> is a SKIP_WORKTREE_DIR (an enum flag in builtin/mv.c), which means it is a directory that exists only in the index but is missing from the working tree. Such a directory is the result of all its files being sparsified, so it is removed from the working tree (it is also known as a “sparse-directory entry” when sparse-index is on).
It is worth noting that both “cone mode” and out-of-cone imply that the <destination> can only be a directory of the kind described above. For the reason behind this conclusion and more information about “cone mode” (which is an essential concept), please see git-sparse-checkout (1), section “INTERNALS — CONE PATTERN SET”.
To make this form of move possible, a few steps are needed:
- When <destination> is not present in the working tree, use the check_dir_in_index function to see if <destination> is in the index as a SKIP_WORKTREE_DIR, or if it is a “sparse-directory entry”. If yes, proceed to the next step; otherwise, stop.
- Check if the cache_entry being moved is dirty (changes not staged for commit). (In mv, the move is usually done in two steps: first rename(2) the file on disk, then rename the corresponding cache_entry.)
  - If not dirty: we turn on the CE_SKIP_WORKTREE bit for the moved cache_entry, then we simply delete the corresponding file from the disk. The reason is that when moving <source> from in-cone to out-of-cone, the expected behavior is to “sparsify” the file: its CE_SKIP_WORKTREE bit is turned on and the corresponding file should be gone from the disk.
  - If dirty: we create the leading directories so that the result can be moved (e.g. in git mv folder2/file folder1/deeper/, folder1/deeper/ is a SKIP_WORKTREE_DIR, so we do something like mkdir -p folder1/deeper to make sure that rename("folder2/file", "folder1/deeper/file") can work). In this case, we don’t want to remove the resulting file (here “folder1/deeper/file”), and we don’t turn on its corresponding cache_entry’s CE_SKIP_WORKTREE bit. The reason is that the change in this file has not been staged yet, so we should leave it on disk for safety and let the user decide what to do.
- In the dirty case, we also warn the user about which paths are dirty and thus not moved.
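The decision made in these steps can be sketched as a small toy model. This is not the code in builtin/mv.c: the enum, the struct, and the function name decide_move are all hypothetical, and the three booleans stand in for what mv would actually learn from the working tree and the index.

```c
#include <stdbool.h>

/* Hypothetical outcomes; these names are not part of Git. */
enum move_action {
	MOVE_OUT_OF_SCOPE,     /* <destination> exists on disk: not this form of move */
	MOVE_STOP,             /* <destination> is not a sparse directory in the index */
	MOVE_AND_SPARSIFY,     /* clean source: set skip-worktree, delete file on disk */
	MOVE_AND_KEEP_ON_DISK  /* dirty source: create leading dirs, keep file, warn */
};

/* Toy stand-in for the state mv would read from the worktree and the index. */
struct toy_move {
	bool dst_on_disk;         /* does <destination> exist in the working tree? */
	bool dst_sparse_in_index; /* what a check like check_dir_in_index would report */
	bool src_dirty;           /* does <source> have changes not staged for commit? */
};

static enum move_action decide_move(const struct toy_move *m)
{
	/* Only destinations missing from the working tree are considered here. */
	if (m->dst_on_disk)
		return MOVE_OUT_OF_SCOPE;
	/* Step 1: the destination must exist in the index as a sparse directory. */
	if (!m->dst_sparse_in_index)
		return MOVE_STOP;
	/* Step 2: a dirty source stays on disk; a clean one is sparsified. */
	return m->src_dirty ? MOVE_AND_KEEP_ON_DISK : MOVE_AND_SPARSIFY;
}
```

For example, a clean source moved into a sparse destination yields MOVE_AND_SPARSIFY, while the same move with unstaged changes yields MOVE_AND_KEEP_ON_DISK.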
What I was doing in week 4
So week 3 was for experimenting with moving from in-cone to out-of-cone.
In week 4, I was building the integration with sparse-index on top of week 3’s result. It’s time to get my feet wet trying to work with sparse-index, after realizing a relatively ideal interaction between mv and sparse-checkout.
One of the head-on obstacles I met is that mv does not know about sparse-index at all. For example, take git mv folder1 deep, where folder1 is a sparse-directory entry, deep is a normal directory, and the expected result is deep/folder1. We know that mv needs to search the index, find every cache_entry that starts with folder1/, and move these entries one by one. However, with sparse-index on, folder1 is stored in the index as a single entry folder1/, and all its files are pruned away from the index, which means we can’t really locate any files under it, so the whole “moving a directory” logic is broken.
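A toy illustration of why the prefix search comes up empty: the index is modeled here as a plain array of path strings, which is an assumption made for brevity (Git’s real index holds cache_entry structures), and count_entries_under is a hypothetical helper, not a Git function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count index paths that lie strictly under dir_prefix (e.g. "folder1/"):
 * the path must start with the prefix and be longer than it.
 */
static size_t count_entries_under(const char *const *index, size_t n,
				  const char *dir_prefix)
{
	size_t len = strlen(dir_prefix);
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (strlen(index[i]) > len && !strncmp(index[i], dir_prefix, len))
			count++;
	return count;
}

/* Full index: every file under folder1/ has its own entry. */
static const char *const full_index[] = { "deep/a", "folder1/a", "folder1/b" };

/* Sparse index: folder1/ is collapsed into one sparse-directory entry. */
static const char *const sparse_index[] = { "deep/a", "folder1/" };
```

Searching full_index for "folder1/" finds two entries to move, while the same search on sparse_index finds none, because the only match is the directory entry itself.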
There are several possible solutions here:
- Expand the folder1/ sparse-directory entry first, so all the files under it are back, then we should be able to utilize the original mv logic.
- Treat sparse-index as a special case and make some new logic for it, which could require more effort than the previous solution.
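The first option can be sketched on a toy index as well. Everything here is hypothetical (struct toy_index, MAX_ENTRIES, expand_sparse_dir), and the list of files is passed in by the caller, whereas Git itself would have to read it from the tree object behind the sparse-directory entry.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Toy index: a sorted list of paths, not Git's real index_state. */
struct toy_index {
	const char *paths[MAX_ENTRIES];
	size_t nr;
};

/*
 * Replace the sparse-directory entry `dir` with the file paths in `files`.
 * Returns the number of entries written, or 0 if `dir` is not in the index,
 * `files` is empty, or the result would not fit.
 */
static size_t expand_sparse_dir(struct toy_index *idx, const char *dir,
				const char *const *files, size_t nfiles)
{
	size_t pos;

	for (pos = 0; pos < idx->nr; pos++)
		if (!strcmp(idx->paths[pos], dir))
			break;
	if (pos == idx->nr || nfiles == 0 || idx->nr - 1 + nfiles > MAX_ENTRIES)
		return 0;

	/* Shift the entries after `dir` to make room for the expanded files. */
	memmove(&idx->paths[pos + nfiles], &idx->paths[pos + 1],
		(idx->nr - pos - 1) * sizeof(idx->paths[0]));
	for (size_t i = 0; i < nfiles; i++)
		idx->paths[pos + i] = files[i];
	idx->nr += nfiles - 1;
	return nfiles;
}
```

After the expansion, every file under folder1/ has its own entry again, so a prefix search for "folder1/" would find them and the original per-entry move logic could run unchanged.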
What’s next
There are potentially many similar logic conflicts, and I’m still working to address them to make sure things work with sparse-index.