[PATCH 3/3] git-svn: Reduce temp file usage when dealing with non-links

From: Marcus Griep
Date: Monday, August 11, 2008 - 8:53 am

Currently, in sub 'close_file', git-svn creates a temporary file and
copies the contents of the blob to be written into it. This is useful
for symlinks because svn stores symlinks in the form:

link $FILE_PATH

Git creates a blob only out of '$FILE_PATH' and uses file mode to
indicate that the blob should be interpreted as a symlink.

As git-hash-object is invoked with --stdin-paths, a duplicate of the
link from svn must be created that leaves off the first five bytes,
i.e. 'link '. For normal blobs, however, this copy is wholly
unnecessary, as we already have a temp file with their contents.
Copying the entire file gains nothing, and effectively requires a file
to be written twice before making it into the object db.

This patch corrects that issue, holding onto the substr-like
duplication for symlinks, but skipping it altogether for normal blobs
by reusing the existing temp file.

Signed-off-by: Marcus Griep <marcus@griep.us>
---
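Not for the commit message: a minimal standalone sketch of the idea
described above, for readers who want to try the logic outside of
git-svn. It assumes File::Temp in place of Git::temp_acquire, and the
helper name path_to_hash is hypothetical (it does not exist in
git-svn.perl). It only picks which file should be hashed; it does not
call git-hash-object.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: given the handle and path of the temp file that
# holds what svn delivered, plus the git file mode, return the path of
# the file whose contents should become the blob.
sub path_to_hash {
	my ($fh, $path, $mode) = @_;

	# Normal blobs: the existing temp file already is the blob, so
	# no second copy is made.
	return $path unless $mode == 120000;

	sysseek($fh, 0, 0) or die "seek $path: $!\n";
	my $got = sysread($fh, my $buf, 5);
	defined $got or die "read $path: $!\n";
	if ($got != 5 || $buf ne 'link ') {
		warn "$path has mode 120000 but is not a link\n";
		return $path;
	}

	# Symlinks: copy everything after the 'link ' prefix into a
	# second temp file and hash that one instead.
	my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
	my $n;
	while ($n = sysread($fh, my $chunk, 1024)) {
		my $wrote = syswrite($tmp_fh, $chunk, $n);
		defined($wrote) && $wrote == $n
			or die "write $tmp_path: $!\n";
	}
	defined $n or die "read $path: $!\n";
	return $tmp_path;
}

# Usage: an svn-style symlink entry yields a new file without the
# prefix; with a regular mode the original path is returned untouched.
my ($svn_fh, $svn_path) = tempfile(UNLINK => 1);
defined syswrite($svn_fh, "link target/file") or die "write: $!\n";
my $link_path = path_to_hash($svn_fh, $svn_path, 120000);
my $blob_path = path_to_hash($svn_fh, $svn_path, 100644);
```

The sketch returns a path rather than swapping filehandles as the
patch does, purely to keep the example short.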
 git-svn.perl |   43 ++++++++++++++++++++-----------------------
 1 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 0937918..f53afaa 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -3281,36 +3281,33 @@ sub close_file {
 				    "expected: $exp\n    got: $got\n";
 			}
 		}
-		sysseek($fh, 0, 0) or croak $!;
 		if ($fb->{mode_b} == 120000) {
-			eval {
-				sysread($fh, my $buf, 5) == 5 or croak $!;
-				$buf eq 'link ' or die "$path has mode 120000",
-						       " but is not a link";
-			};
-			if ($@) {
-				warn "$@\n";
-				sysseek($fh, 0, 0) or croak $!;
-			}
-		}
-
-		my $tmp_fh = Git::temp_acquire('svn_hash');
-		my $result;
-		while ($result = sysread($fh, my $string, 1024)) {
-			my $wrote = syswrite($tmp_fh, $string, $result);
-			defined($wrote) && $wrote == $result
-				or croak("write $tmp_fh->filename: $!\n");
-		}
-		defined $result or croak $!;
+			sysseek($fh, 0, 0) or croak $!;
+			sysread($fh, my $buf, 5) == 5 or croak $!;
 
+			unless ($buf eq 'link ') {
+				warn "$path has mode 120000",
+						" but is not a link\n";
+			} else {
+				my $tmp_fh = Git::temp_acquire('svn_hash');
+				my $result;
+				while ($result = sysread($fh, my $string, 1024)) {
+					my $wrote = syswrite($tmp_fh, $string, $result);
+					defined($wrote) && $wrote == $result
+						or croak("write $tmp_fh->filename: $!\n");
+				}
+				defined $result or croak $!;
 
-		Git::temp_release($fh, 1);
+				($fh, $tmp_fh) = ($tmp_fh, $fh);
+				Git::temp_release($tmp_fh, 1);
+			}
+		}
 
-		$hash = $::_repository->hash_and_insert_object($tmp_fh->filename);
+		$hash = $::_repository->hash_and_insert_object($fh->filename);
 		$hash =~ /^[a-f\d]{40}$/ or die "not a sha1: $hash\n";
 
 		Git::temp_release($fb->{base}, 1);
-		Git::temp_release($tmp_fh, 1);
+		Git::temp_release($fh, 1);
 	} else {
 		$hash = $fb->{blob} or die "no blob information\n";
 	}
-- 
1.6.0.rc2.6.g8eda3


Messages in current thread:
[PATCH 0/3] git-svn and temporary file improvements, Marcus Griep, (Mon Aug 11, 8:53 am)
[PATCH 3/3] git-svn: Reduce temp file usage when dealing w ..., Marcus Griep, (Mon Aug 11, 8:53 am)
[PATCH] Git.pm: require Perl 5.6.1, Lea Wiemann, (Wed Aug 13, 3:30 pm)