Question

Updating submodules with git submodule update --remote pulls in the latest commit of each submodule's remote branch rather than the hash recorded in the wrapping git repo. But the wrapping repo still keeps tracking the submodule hashes itself, needlessly introducing noise into its own history.

E.g. after a git submodule update --remote, git status reports a change in the wrapping project, something like:

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   <module-name> (new commits)

Is it possible not to include any hashes or information about the hash of submodules in a git repo containing submodules, so that a submodule update does not create the need for new commits and is not reflected in the project's history?

Motivating Scenario:

This would support a workflow that can be described as "always use the latest of all submodules". Currently that workflow requires extra administration after each submodule update: either committing the change shown above or somehow discarding it, which is very confusing when you simply always want to use the latest.

Answer 1

Long answer

Is it possible not to include any hashes or information about the hash of submodules

No, that is the whole point of a submodule: to record in the parent repo a fixed SHA1 (that is the gitlink, a special entry in the index).
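To see the gitlink for yourself, here is a minimal sketch that builds two throwaway repos in a temp directory. The names lib and parent, the branch main, and the demo identity are all made up for illustration, and the protocol.file.allow override is only there because the demo uses a local path as the submodule URL, which recent Git versions restrict by default.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"
# Throwaway identity so commits work in a clean environment (illustrative values)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library with a single commit on main
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo v1 > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib v1"

# Parent repo that embeds lib as a submodule
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/main
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C parent commit -q -m "Add lib submodule"

# The tree entry for lib has mode 160000 and type "commit": that is the gitlink
entry=$(git -C parent ls-tree HEAD lib)
echo "$entry"

# The SHA1 stored in the gitlink is the submodule commit the parent pins
recorded=$(echo "$entry" | awk '{print $3}')
upstream=$(git -C lib rev-parse HEAD)
echo "recorded: $recorded"
echo "upstream: $upstream"
```

The parent stores no submodule content at all, only that one commit id plus the URL in .gitmodules, which is why there is nothing to "leave out" without dropping the submodule itself.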

As a convenience, you can update the content of a submodule so that it matches a remote branch of the submodule's upstream repo (git submodule update --remote).

But once that update is done, the result is no different from any other manual modification inside a submodule: its SHA1 changes, and that change needs to be recorded in the parent repo.
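A rough sketch of that cycle, again with throwaway repos (lib, parent, the branch main and the commit messages are illustrative assumptions, and protocol.file.allow is only overridden because the submodule URL is a local path): the update moves the submodule to the upstream tip, the parent then reports it as modified, and a commit in the parent records the new SHA1.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo v1 > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib v1"

# Parent repo; -b main stores the branch to follow in .gitmodules
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/main
git -C parent -c protocol.file.allow=always submodule --quiet add -b main "$work/lib" lib
git -C parent commit -q -m "Add lib submodule"

# Upstream moves on
echo v2 > lib/file.txt
git -C lib commit -q -am "lib v2"

# Bring the submodule to the tip of its configured remote branch
git -C parent -c protocol.file.allow=always submodule --quiet update --remote

# The parent now sees a changed gitlink
before=$(git -C parent status --porcelain)
echo "status after update: $before"

# Recording the new SHA1 is an ordinary commit in the parent
git -C parent add lib
git -C parent commit -q -m "Bump lib to latest main"
after=$(git -C parent status --porcelain)
```

If you always want the latest, the add and commit steps can be chained onto the update in a single alias or script, but the commit itself still ends up in the parent's history.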

The goal is for the parent repo to be cloned later with the exact same content (including its submodules' content), hence the submodules are checked out at the respective SHA1 recorded by the parent repo.
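The reproducibility point can be checked with a small experiment. This is only a sketch with made-up repo names (lib, parent, fresh) and a local-path submodule URL, hence the protocol.file.allow override: upstream gains a second commit after the parent recorded the first one, and a fresh recursive clone still checks out the recorded commit.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library at v1
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo v1 > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib v1"
v1=$(git -C lib rev-parse HEAD)

# Parent records v1 in its gitlink
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/main
git -C parent -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C parent commit -q -m "Add lib submodule"

# Upstream moves to v2, but the parent is not touched
echo v2 > lib/file.txt
git -C lib commit -q -am "lib v2"
v2=$(git -C lib rev-parse HEAD)

# A later clone gets the submodule at the recorded commit, not at the latest one
git -c protocol.file.allow=always clone -q --recurse-submodules parent fresh
cloned=$(git -C fresh/lib rev-parse HEAD)
echo "recorded by parent:   $v1"
echo "latest upstream:      $v2"
echo "checked out in clone: $cloned"
```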

Commit 9937e65 (Git 2.0, Jan. 2014) mentions:

Make it clear that there is no implicit floating going on; --remote lets you explicitly integrate the upstream branch in your current HEAD (just like running 'git pull' in the submodule).
The only distinction with the current 'git pull' is the config location and setting used for the upstream branch, which is hopefully clear now.

Commit 23d25e4 details:

This commit does not change the logic for updates after the initial clone, which will continue to create detached HEADs for checkout-mode updates, and integrate remote work with the local HEAD (detached or not) in other modes.

The motivation for the change is that developers doing local work inside the submodule are likely to select a non-checkout-mode for updates so their local work is integrated with upstream work.
Developers who are not doing local submodule work stick with checkout-mode updates so any apparently local work is blown away during updates.

For example, if upstream rolls back the remote branch or gitlinked commit to an earlier version, the checkout-mode developer wants their old submodule checkout to be rolled back as well, instead of getting a no-op merge/rebase with the rolled-back reference.
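To illustrate the non-checkout modes described in that commit message, here is a hedged sketch using --rebase (one of the non-checkout modes, --merge being the other). The repos, the branch main and the file names are invented for the demo, and protocol.file.allow is overridden only because the submodule URL is a local path. A local commit made inside the submodule is replayed on top of the new upstream commit instead of being left behind by a detached checkout.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo v1 > lib/file.txt
git -C lib add file.txt
git -C lib commit -q -m "lib v1"

# Parent repo following lib's main branch
git init -q parent
git -C parent symbolic-ref HEAD refs/heads/main
git -C parent -c protocol.file.allow=always submodule --quiet add -b main "$work/lib" lib
git -C parent commit -q -m "Add lib submodule"

# Local work inside the submodule checkout
echo local > parent/lib/local.txt
git -C parent/lib add local.txt
git -C parent/lib commit -q -m "local work"

# Meanwhile upstream publishes v2
echo v2 > lib/file.txt
git -C lib commit -q -am "lib v2"
v2=$(git -C lib rev-parse HEAD)

# Integrate upstream by rebasing the local commit onto the fetched branch tip
git -C parent -c protocol.file.allow=always submodule --quiet update --remote --rebase

# The local commit now sits directly on top of upstream v2
git -C parent/lib log --oneline
```

Either way the submodule ends up at a new commit, so the parent still reports a changed gitlink afterwards.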

TL;DR

Probably nothing in the git universe can make a repo automatically point at the latest commit of a submodule rather than at a specific commit hash. git submodule update --remote can serve as a convention for moving each submodule to the latest commit of its remote branch, but every such update still changes the hash recorded in the parent repo, so the history of those updates will always end up in the parent repo.
