* subdirectory-filter does not delete files before the directory came into existence?
@ 2010-12-14 22:21 Jan Wielemaker
2010-12-14 23:03 ` Thomas Rast
0 siblings, 1 reply; 8+ messages in thread
From: Jan Wielemaker @ 2010-12-14 22:21 UTC
To: git
Hi,
There is a lot of information about extracting a directory from a git
project. One thing I failed to find though is the following:
I try to extract a directory. The result is fine, but there is a lot
of history in the result from *before* the directory was added to the
project. Why? How can I get rid of this?
If you want to see yourself, I did:
git clone git://www.swi-prolog.org/home/pl/git/pl-devel.git
git clone pl-devel odbc
cd odbc
git filter-branch --subdirectory-filter packages/odbc --prune-empty \
    --tag-name-filter cat -- --all
Now use e.g. qgit to look at the history. As from 03/07/2002, when
the packages/odbc directory was created, all looks just fine. Before
though ...
Thanks for any hints
--- Jan
* Re: subdirectory-filter does not delete files before the directory came into existence?
2010-12-14 22:21 subdirectory-filter does not delete files before the directory came into existence? Jan Wielemaker
@ 2010-12-14 23:03 ` Thomas Rast
2010-12-15 9:50 ` Jan Wielemaker
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Thomas Rast @ 2010-12-14 23:03 UTC
To: Jan Wielemaker; +Cc: git
Jan Wielemaker wrote:
> I try to extract a directory. The result is fine, but there is a lot
> of history in the result from *before* the directory was added to the
> project. Why? How can I get rid of this?
[...]
> Now use e.g. qgit to look at the history. As from 03/07/2002, when
> the packages/odbc directory was created, all looks just fine. Before
> though ...
That history is not connected to the filtered one. git-filter-branch
alerts you to it with messages like
WARNING: Ref 'refs/tags/V5.0.4' is unchanged
WARNING: Ref 'refs/tags/V5.0.5' is unchanged
WARNING: Ref 'refs/tags/V5.0.6' is unchanged
WARNING: Ref 'refs/tags/V5.0.7' is unchanged
I haven't made up my mind if this is a bug report or a feature
request, but in any case you can delete all of them and the problem
goes away.
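One way to collect the refs named in those warnings, as a rough sketch only: the helper name `unchanged_refs` and the log file name `filter.log` are made up for illustration, and the sketch assumes the filter-branch output (including its warnings) was saved to that file.

```shell
#!/bin/sh
# Sketch: print the ref names from filter-branch "is unchanged" warnings.
# unchanged_refs is a hypothetical helper, not part of git.
unchanged_refs() {
    # Keep only the text between the quotes on matching warning lines.
    sed -n "s/^WARNING: Ref '\(.*\)' is unchanged\$/\1/p"
}

# Possible usage, assuming the output was saved to filter.log:
#   unchanged_refs < filter.log | while read -r ref; do
#       git update-ref -d "$ref"
#   done
```

Printing the list first and reviewing it before deleting anything is probably the safer order.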
--
Thomas Rast
trast@{inf,student}.ethz.ch
* Re: subdirectory-filter does not delete files before the directory came into existence?
2010-12-14 23:03 ` Thomas Rast
@ 2010-12-15 9:50 ` Jan Wielemaker
2010-12-15 10:40 ` Jan Wielemaker
2010-12-15 12:22 ` Jan Wielemaker
2 siblings, 0 replies; 8+ messages in thread
From: Jan Wielemaker @ 2010-12-15 9:50 UTC
To: Thomas Rast; +Cc: git
Dear Thomas,
On Wed, 2010-12-15 at 00:03 +0100, Thomas Rast wrote:
> Jan Wielemaker wrote:
> > I try to extract a directory. The result is fine, but there is a lot
> > of history in the result from *before* the directory was added to the
> > project. Why? How can I get rid of this?
> [...]
> > Now use e.g. qgit to look at the history. As from 03/07/2002, when
> > the packages/odbc directory was created, all looks just fine. Before
> > though ...
>
> That history is not connected to the filtered one. git-filter-branch
> alerts you to it with messages like
>
> WARNING: Ref 'refs/tags/V5.0.4' is unchanged
> WARNING: Ref 'refs/tags/V5.0.5' is unchanged
> WARNING: Ref 'refs/tags/V5.0.6' is unchanged
> WARNING: Ref 'refs/tags/V5.0.7' is unchanged
Thanks for the insight. Catching these warnings and running git tag -d on
the tags they name gets me a nice and clean history. Only ... it starts on
12/08/2008 instead of 03/07/2002. This is (almost) consistent with the
filtering feedback, which says it rewrote 174 commits; the filtered and
cleaned history contains 171.
This is a bit odd. If I open qgit on the original (before filtering)
and show the history of odbc.c, it looks like a nice and continuous
one going back to 2002. Also
git log --oneline packages/odbc/odbc.c
shows a history that starts with "First public version of ODBC
interface"
Of course, this is a project with a long history that was converted
from CVS, but the history looks unbroken, so why does filtering a
directory break it?
> I haven't made up my mind if this is a bug report or a feature
> request, but in any case you can delete all of them and the problem
> goes away.
Isn't it true that you will have info from before introducing a
directory whenever there are tags that are older than the directory?
If that is the case, it looks wrong to me. I want to filter the
directory, so the repository from before the existence of the
directory is not interesting. Of course, things change if the
directory was created by renaming files that were already in
the repository. I don't know what one should `expect' in that
case. Here, the directory was added from new files, so it is
quite clear what one should expect.
Regards --- Jan
* Re: subdirectory-filter does not delete files before the directory came into existence?
2010-12-14 23:03 ` Thomas Rast
2010-12-15 9:50 ` Jan Wielemaker
@ 2010-12-15 10:40 ` Jan Wielemaker
2010-12-15 12:22 ` Jan Wielemaker
2 siblings, 0 replies; 8+ messages in thread
From: Jan Wielemaker @ 2010-12-15 10:40 UTC
To: Thomas Rast; +Cc: git
In addition to my previous reply: looking at the result of the
initial filter, if I remove all unchanged refs I lose the history
before 2008. Qgit, however, shows a broken history at the start
of the directory in 2002. If I keep deleting the tag that is
the head of older stuff, I end up with what I hoped for in the first
place. This is of course a bit tedious :-(
You can view the result at
git://www.swi-prolog.org/home/pl/git/packages/odbc.git
I'll split some more packages. Curious what is going to happen ...
Regards --- Jan
On Wed, 2010-12-15 at 00:03 +0100, Thomas Rast wrote:
> Jan Wielemaker wrote:
> > I try to extract a directory. The result is fine, but there is a lot
> > of history in the result from *before* the directory was added to the
> > project. Why? How can I get rid of this?
> [...]
> > Now use e.g. qgit to look at the history. As from 03/07/2002, when
> > the packages/odbc directory was created, all looks just fine. Before
> > though ...
>
> That history is not connected to the filtered one. git-filter-branch
> alerts you to it with messages like
>
> WARNING: Ref 'refs/tags/V5.0.4' is unchanged
> WARNING: Ref 'refs/tags/V5.0.5' is unchanged
> WARNING: Ref 'refs/tags/V5.0.6' is unchanged
> WARNING: Ref 'refs/tags/V5.0.7' is unchanged
>
> I haven't made up my mind if this is a bug report or a feature
> request, but in any case you can delete all of them and the problem
> goes away.
>
* Re: subdirectory-filter does not delete files before the directory came into existence?
2010-12-14 23:03 ` Thomas Rast
2010-12-15 9:50 ` Jan Wielemaker
2010-12-15 10:40 ` Jan Wielemaker
@ 2010-12-15 12:22 ` Jan Wielemaker
2010-12-19 2:23 ` Thomas Rast
2 siblings, 1 reply; 8+ messages in thread
From: Jan Wielemaker @ 2010-12-15 12:22 UTC
To: Thomas Rast; +Cc: git
The reported problems also apply to the next module. What appears to
work is this:
* Walk through the history, finding the commit where the directory
is created.
* use git tag -l --contains <commit that created dir> to get the
tags we want to keep.
* get all tags, use comm and delete the tags not in the `contained'
set above.
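The first step could perhaps be automated along these lines. This is only a sketch: `first_commit_touching` is a hypothetical helper name, and it relies on `git log -- <path>` listing the commits that touch the path newest first, so the last line is normally the oldest such commit. It is meant to be run in the unfiltered clone, where the directory still exists under its original path.

```shell
#!/bin/sh
# Sketch: find the oldest commit in HEAD's history that touches a path.
# first_commit_touching is a hypothetical helper, not part of git.
first_commit_touching() {
    # git log lists matching commits newest first; keep the last line.
    git log --format=%H -- "$1" | tail -n 1
}

# Possible usage in the unfiltered clone:
#   start=$(first_commit_touching packages/odbc)
#   git tag -l --contains "$start"
```

Commit ids change when the history is rewritten, so the id found this way is only valid in the repository where it was computed.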
Not very friendly, and I'm (with Thomas) undecided about the status of
these findings. I'd like to thank Thomas for giving me the right clue.
Regards --- Jan
On Wed, 2010-12-15 at 00:03 +0100, Thomas Rast wrote:
> Jan Wielemaker wrote:
> > I try to extract a directory. The result is fine, but there is a lot
> > of history in the result from *before* the directory was added to the
> > project. Why? How can I get rid of this?
> [...]
> > Now use e.g. qgit to look at the history. As from 03/07/2002, when
> > the packages/odbc directory was created, all looks just fine. Before
> > though ...
>
> That history is not connected to the filtered one. git-filter-branch
> alerts you to it with messages like
>
> WARNING: Ref 'refs/tags/V5.0.4' is unchanged
> WARNING: Ref 'refs/tags/V5.0.5' is unchanged
> WARNING: Ref 'refs/tags/V5.0.6' is unchanged
> WARNING: Ref 'refs/tags/V5.0.7' is unchanged
>
> I haven't made up my mind if this is a bug report or a feature
> request, but in any case you can delete all of them and the problem
> goes away.
>
* Re: subdirectory-filter does not delete files before the directory came into existence?
2010-12-15 12:22 ` Jan Wielemaker
@ 2010-12-19 2:23 ` Thomas Rast
2010-12-19 9:34 ` Jan Wielemaker
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Rast @ 2010-12-19 2:23 UTC
To: Jan Wielemaker; +Cc: git
Jan Wielemaker wrote:
> The reported problems also apply to the next module. What appears to
> work is this:
>
> * Walk through the history, finding the commit where the directory
> is created.
> * use git tag -l --contains <commit that created dir> to get the
> tags we want to keep.
> * get all tags, use comm and delete the tags not in the `contained'
> set above.
>
> Not very friendly, and I'm (with Thomas) undecided about the status of
> these findings. I'd like to thank Thomas for giving me the right clue.
Now I finally remember where I knew this problem from:
http://article.gmane.org/gmane.comp.version-control.git/91708
(My memory really sucks.)
--
Thomas Rast
trast@{inf,student}.ethz.ch
* Re: subdirectory-filter does not delete files before the directory came into existence?
2010-12-19 2:23 ` Thomas Rast
@ 2010-12-19 9:34 ` Jan Wielemaker
2010-12-19 22:51 ` Thomas Rast
0 siblings, 1 reply; 8+ messages in thread
From: Jan Wielemaker @ 2010-12-19 9:34 UTC
To: Thomas Rast; +Cc: git
On Sun, 2010-12-19 at 03:23 +0100, Thomas Rast wrote:
> Jan Wielemaker wrote:
> > The reported problems also apply to the next module. What appears to
> > work is this:
> >
> > * Walk through the history, finding the commit where the directory
> > is created.
> > * use git tag -l --contains <commit that created dir> to get the
> > tags we want to keep.
> > * get all tags, use comm and delete the tags not in the `contained'
> > set above.
> >
> > Not very friendly, and I'm (with Thomas) undecided about the status of
> > these findings. I'd like to thank Thomas for giving me the right clue.
>
> Now I finally remember where I knew this problem from:
>
> http://article.gmane.org/gmane.comp.version-control.git/91708
>
> (My memory really sucks.)
Funny. That was me having problems with filtering out directories
as well :-) I thought your patch was added using the --prune-empty
flag. I guess you can comment on that. I can confirm that I've got
nice and clean filtering using
* git filter-branch --subdirectory-filter <dir> --prune-empty
--tag-name-filter cat -- --all
followed by the steps above. I use qgit with the tree-view enabled
to find the place where the hierarchy changes from the complete one
to the only-this-dir one. You can do a binary search for that and
you spot the exact commit easily by the gap in the history-line. Then
I run this little bit of code:
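#!/bin/bash
# Delete every tag that does not contain the given commit
# (the commit that created the extracted directory).
contains="$1"
git tag | sort > tags.all
git tag -l --contains "$contains" | sort > tags.keep
# comm -23 prints the lines found only in tags.all: the tags to drop.
comm -23 tags.all tags.keep | while read -r t; do
    git tag -d "$t"
done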
Not ideal, but doable.
Cheers --- Jan
* Re: subdirectory-filter does not delete files before the directory came into existence?
2010-12-19 9:34 ` Jan Wielemaker
@ 2010-12-19 22:51 ` Thomas Rast
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Rast @ 2010-12-19 22:51 UTC
To: Jan Wielemaker; +Cc: git
Jan Wielemaker wrote:
> On Sun, 2010-12-19 at 03:23 +0100, Thomas Rast wrote:
> > Jan Wielemaker wrote:
> > > * get all tags, use comm and delete the tags not in the `contained'
> > > set above.
[...]
> > http://article.gmane.org/gmane.comp.version-control.git/91708
[...]
> Funny. That was me having problems with filtering out directories
> as well :-) I thought your patch was added using the --prune-empty
> flag. I guess you can comment on that. I can confirm that I've got
> nice and clean filtering using
No, those two are rather different. --prune-empty drops commits that
became "no-ops" in the sense that their tree is the same as their
(only) parent's. In the case of --subdirectory-filter, --prune-empty
is most likely[*] redundant since the former already enables history
simplification limited to that directory.
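To make the "no-op" notion concrete, a check along these lines could be used. It is a sketch, not how filter-branch implements --prune-empty internally, and `is_noop_commit` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: succeed if a commit has exactly one parent and the same tree
# as that parent. is_noop_commit is a hypothetical helper, not part of git.
is_noop_commit() {
    # A root commit has no parent to compare against.
    git rev-parse -q --verify "$1^1" >/dev/null || return 1
    # A merge has more than one parent; do not treat it as a no-op here.
    git rev-parse -q --verify "$1^2" >/dev/null && return 1
    # Compare the commit's tree id with its parent's tree id.
    [ "$(git rev-parse "$1^{tree}")" = "$(git rev-parse "$1^1^{tree}")" ]
}

# Possible usage:
#   is_noop_commit HEAD && echo "HEAD changes nothing relative to its parent"
```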
As you can see from "TOY PATCH", my patch wasn't really meant for
application anyway. I'm now wondering what the ramifications would
be. filter-branch only attempts to change refs that you told it to
(listed positively on the command line), so maybe deleting anything
that was not rewritten is a sensible option (not default, mind you).
[*] Read: I think it is redundant, I'm just too lazy to double-check.
--
Thomas Rast
trast@{inf,student}.ethz.ch