Parsing a smart HTTP git-upload-request response with an early(?) end sequence

Date: 2016-11-08 20:07:17

Tags: git http

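I am trying to download/read a simple git repository via the smart HTTP transfer protocol, but the resulting file contains an end sequence before any object is referenced. I started by downloading a copy of a random repository on GitHub (Bitbucket produces a similar file) using the following URL: https://github.com/user/repo.git/info/refs?service=git-upload-pack. This results in a file such as the following (this one was fetched from the Git repository on GitHub and shortened):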
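For reference, the discovery URL above can be assembled from a repository base URL. This is only a sketch: `InfoRefsUrl` and `forUploadPack` are hypothetical names introduced here, and `https://github.com/user/repo.git` is the placeholder from the question, not a real repository.

```java
import java.net.URI;

final class InfoRefsUrl {
    // Hypothetical helper: appends the smart HTTP discovery path and the
    // service query parameter to a repository base URL.
    static URI forUploadPack(String repoBaseUrl) {
        // Drop a trailing slash so the result never contains "//info".
        String base = repoBaseUrl.endsWith("/")
            ? repoBaseUrl.substring(0, repoBaseUrl.length() - 1)
            : repoBaseUrl;
        return URI.create(base + "/info/refs?service=git-upload-pack");
    }

    public static void main(String[] args) {
        System.out.println(forUploadPack("https://github.com/user/repo.git"));
    }
}
```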

001e# service=git-upload-pack
000000fabe5a750939c212bc0781ffa04fabcfd2b2bd744e HEAD multi_ack thin-pack side-band side-band-64k ofs-delta shallow no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/master agent=git/2:2.6.5~peff-attributes-nofollow-1622-gbbc42c6
003eac84098b7e32406a982ac01cc76a663d5605224b refs/heads/maint
003fbe5a750939c212bc0781ffa04fabcfd2b2bd744e refs/heads/master
003db27dc33dac678b815097aa6e3a4b5db354285f57 refs/heads/next
003b0962616cb70317a1ca3e4b03a22b51a0095e2326 refs/heads/pu
003d0b3e657f530ecba0206e6c07437d492592e43210 refs/heads/todo
003ff0d0fd3a5985d5e588da1e1d11c85fba0ae132f8 refs/pull/10/head
00403fed6331a38d9bb19f3ab72c91d651388026e98c refs/pull/10/merge
...
004549fa3dc76179e04b0833542fa52d0f287a4955ac refs/tags/v2.9.0-rc2^{}
003e47e8b7c56a5504d463ee624f6ffeeef1b6d007c1 refs/tags/v2.9.1
00415c9159de87e41cf14ec5f2132afb5a06f35c26b3 refs/tags/v2.9.1^{}
003ee6eeb1a62fdd0ac7b66951c45803e35f17b2b980 refs/tags/v2.9.2
0041e634160bf457f8b3a91125307681c9493f11afb2 refs/tags/v2.9.2^{}
003ef883596e997fe5bcbc5e89bee01b869721326109 refs/tags/v2.9.3
0041e0c1ceafc5bece92d35773a75fff59497e1d9bd5 refs/tags/v2.9.3^{}
0000
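To make the framing of the file above easier to see, here is a minimal sketch that splits a buffer into pkt-lines using the 4-digit hexadecimal length prefix. `PktLineSplit` and the `<flush>` marker are names made up for this illustration, and the input is a shortened copy of the response above that uses one ref line without capabilities.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class PktLineSplit {
    /** Splits a buffer of pkt-lines into payloads; a flush-pkt is reported as "<flush>". */
    static List<String> split(byte[] data) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos + 4 <= data.length) {
            int len = Integer.parseInt(
                new String(data, pos, 4, StandardCharsets.US_ASCII), 16);
            if (len == 0) {
                // "0000": flush-pkt, no payload follows the header.
                out.add("<flush>");
                pos += 4;
                continue;
            }
            if (len < 4 || pos + len > data.length) {
                throw new IllegalArgumentException("bad pkt-line length at offset " + pos);
            }
            // The length includes the 4 header bytes, so the payload is len - 4 bytes.
            out.add(new String(data, pos + 4, len - 4, StandardCharsets.UTF_8));
            pos += len;
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "001e# service=git-upload-pack\n"
            + "0000"
            + "003b0962616cb70317a1ca3e4b03a22b51a0095e2326 refs/heads/pu\n"
            + "0000";
        for (String pkt : split(sample.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(pkt.trim());
        }
    }
}
```

Read this way, the second line of the file is two packets: the flush "0000" followed by a packet of length 00fa.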

I used the following sources for information on parsing:

Protocol Doc

Git Book

JGit Reference Implementation

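I found nothing in the sources above that would lead me to expect the sequence "0000" at this point, yet git clients can clone the repository.

A brief inspection of git's original source code, "pkt-line.(c|h)", yielded no new findings. The following (Java) program illustrates the problem, since it prints "true" in the second println statement: the parser reads 0000 and treats it as the end, and the next statement then evaluates 00fa and prints the line that follows. Based on these observations I can only assume that I am missing some detail, that popular git clients/servers share an implementation flaw, or that the protocol documentation is unclear. Any help is appreciated!

PS: I know that "0000" can denote a flush, but that is not specified for this service request.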

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Read Git style pkt-line formatting from an input stream.
 * <p>
 * This class is not thread safe and may issue multiple reads to the underlying
 * stream for each method call made.
 * <p>
 * This class performs no buffering on its own. This makes it suitable to
 * interleave reads performed by this class with reads performed directly
 * against the underlying InputStream.
 */
public class PacketLineIn {
    /**
     * Magic return from {@link #readString()} when a flush packet is found.
     */
    public static final String END = new StringBuilder(0).toString();   /* must not string pool */

    private final InputStream in;

    private final byte[] lineBuffer;

    /**
     * Create a new packet line reader.
     *
     * @param i the input stream to consume.
     */
    public PacketLineIn(final InputStream i) {
        in = i;
        lineBuffer = new byte[1000];
    }

    /**
     * Read a single UTF-8 encoded string packet from the input stream.
     * <p>
     * If the string ends with an LF, it will be removed before returning the
     * value to the caller. If this automatic trimming behavior is not desired,
     * use {@link #readStringRaw()} instead.
     *
     * @return the string. {@link #END} if the string was the magic flush
     * packet.
     * @throws IOException the stream cannot be read.
     */
    public String readString() throws IOException {
        int len = readLength();
        if (len == 0) {
            return END;
        }

        len -= 4; // length header (4 bytes)
        if (len == 0) {
            return ""; //$NON-NLS-1$
        }
        byte[] raw;
        if (len <= lineBuffer.length) {
            raw = lineBuffer;
        } else {
            raw = new byte[len];
        }

        readFully(in, raw, 0, len);
        if (raw[len - 1] == '\n') {
            len--;
        }
        return decodeNoFallback(StandardCharsets.UTF_8, raw, 0, len);
    }

    /**
     * Read a single UTF-8 encoded string packet from the input stream.
     * <p>
     * Unlike {@link #readString()} a trailing LF will be retained.
     *
     * @return the string. {@link #END} if the string was the magic flush
     * packet.
     * @throws IOException the stream cannot be read.
     */
    public String readStringRaw() throws IOException {
        int len = readLength();
        if (len == 0) {
            return END;
        }

        len -= 4; // length header (4 bytes)

        byte[] raw;
        if (len <= lineBuffer.length) {
            raw = lineBuffer;
        } else {
            raw = new byte[len];
        }

        readFully(in, raw, 0, len);
        return decodeNoFallback(StandardCharsets.UTF_8, raw, 0, len);
    }

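    int readLength() throws IOException {
        readFully(in, lineBuffer, 0, 4);
        try {
            final int len = parseInt16(lineBuffer, 0);
            // "0000" is the flush packet; any other length below 4 is invalid
            // because the 4-byte header itself counts towards the length.
            if (len != 0 && len < 4) {
                throw new IOException("Invalid pkt-line length: " + len);
            }
            return len;
        } catch (NumberFormatException err) {
            throw new IOException("Invalid pkt-line header: "
                + new String(lineBuffer, 0, 4, StandardCharsets.US_ASCII), err);
        }
    }

    private static void readFully(InputStream in, byte[] buffer, int off, int length) throws IOException {
        // A single read() may return fewer bytes than requested, so loop
        // until the requested range is filled or the stream ends.
        int done = 0;
        while (done < length) {
            final int n = in.read(buffer, off + done, length - done);
            if (n < 0) {
                throw new java.io.EOFException("Unexpected end of stream");
            }
            done += n;
        }
    }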

    private static int parseInt16(final byte[] args, int start) {
        final byte[] data = {
            args[start], args[start + 1], args[start + 2], args[start + 3]
        };
        return Integer.parseInt(new String(data), 16);
    }

    public static String decodeNoFallback(final Charset cs,
        final byte[] buffer, final int start, final int end)
        throws CharacterCodingException {
        ByteBuffer b = ByteBuffer.wrap(buffer, start, end - start);
        b.mark();

        // Try our built-in favorite. The assumption here is that
        // decoding will fail if the data is not actually encoded
        // using that encoder.
        try {
            return decode(b, StandardCharsets.UTF_8);
        } catch (CharacterCodingException e) {
            b.reset();
        }

        if (!cs.equals(StandardCharsets.UTF_8)) {
            // Try the suggested encoding, it might be right since it was
            // provided by the caller.
            try {
                return decode(b, cs);
            } catch (CharacterCodingException e) {
                b.reset();
            }
        }

        // Try the default character set. A small group of people
        // might actually use the same (or very similar) locale.
        Charset defcs = Charset.defaultCharset();
        if (!defcs.equals(cs) && !defcs.equals(StandardCharsets.UTF_8)) {
            try {
                return decode(b, defcs);
            } catch (CharacterCodingException e) {
                b.reset();
            }
        }

        throw new CharacterCodingException();
    }

    private static String decode(final ByteBuffer b, final Charset charset)
        throws CharacterCodingException {
        final CharsetDecoder d = charset.newDecoder();
        d.onMalformedInput(CodingErrorAction.REPORT);
        d.onUnmappableCharacter(CodingErrorAction.REPORT);
        return d.decode(b).toString();
    }

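    public static void main(String[] args) throws IOException {
        final Path input = Paths.get("refs");
        // try-with-resources closes the file even if parsing fails.
        try (InputStream stream = new FileInputStream(input.toFile())) {
            final PacketLineIn line = new PacketLineIn(stream);
            System.out.println(line.readString());
            // Identity comparison is intended: END is a unique String instance.
            System.out.println(line.readString() == PacketLineIn.END);
            System.out.println(line.readString());
        }
    }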
}

1 Answer:

Answer 0 (score: 1)

This is the flush-pkt, and it is defined in Documentation Common to Pack and Http Protocols.

A pkt-line with a length field of 0 ("0000"), called a flush-pkt, is a special case and MUST be handled differently than an empty pkt-line ("0004").
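The reader in the question already makes this distinction by returning the identity-compared END sentinel for "0000" and an ordinary empty string for "0004". A small sketch that makes the three cases explicit follows; `PktKind` and `classify` are hypothetical names, not part of git or JGit.

```java
final class PktKind {
    enum Kind { FLUSH, EMPTY, DATA }

    /** Classifies a pkt-line by its 4-character hexadecimal length header. */
    static Kind classify(String header) {
        if (header.length() != 4) {
            throw new IllegalArgumentException("header must be 4 hex digits");
        }
        int len = Integer.parseInt(header, 16);
        if (len == 0) {
            return Kind.FLUSH;   // "0000": flush-pkt, carries no payload
        }
        if (len == 4) {
            return Kind.EMPTY;   // "0004": header only, zero-length payload
        }
        if (len < 4) {
            throw new IllegalArgumentException("invalid pkt-line length " + len);
        }
        return Kind.DATA;        // len - 4 payload bytes follow the header
    }

    public static void main(String[] args) {
        for (String h : new String[] {"0000", "0004", "00fa"}) {
            System.out.println(h + " -> " + classify(h));
        }
    }
}
```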

It is used in the Packfile transfer protocols. It is also mentioned in the HTTP transfer protocols, though not by name.

smart_reply     =  PKT-LINE("# service=$servicename" LF)
                   ref_list
                   "0000"

compute_request   =  want_list
                     have_list
                     request_end
request_end       =  "0000" / "done"
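The smart_reply rule quoted above lists only the trailing "0000", while the sample in the question also has one directly after the service line. The following is a minimal sketch of a reader that accepts both; `SmartReplyReader`, `readPkt`, and `readRefs` are names invented for this illustration, and skipping a flush after the service line is an assumption based on that sample rather than on the quoted grammar.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class SmartReplyReader {
    /** Reads one pkt-line; returns null for a flush-pkt ("0000"). */
    static String readPkt(DataInputStream in) throws IOException {
        byte[] header = new byte[4];
        in.readFully(header);
        int len = Integer.parseInt(new String(header, StandardCharsets.US_ASCII), 16);
        if (len == 0) {
            return null;
        }
        if (len < 4) {
            throw new IOException("invalid pkt-line length " + len);
        }
        byte[] payload = new byte[len - 4];
        in.readFully(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    /** Returns ref name -> object id, in advertisement order. */
    static Map<String, String> readRefs(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        String service = readPkt(in);
        if (service == null || !service.startsWith("# service=")) {
            throw new IOException("missing service announcement");
        }
        String line = readPkt(in);
        if (line == null) {
            // Assumption: a flush-pkt may follow the service line (as in the sample).
            line = readPkt(in);
        }
        Map<String, String> refs = new LinkedHashMap<>();
        while (line != null) {               // the final flush-pkt ends the ref list
            int nul = line.indexOf('\0');    // capabilities follow a NUL on the first ref
            String ref = (nul >= 0 ? line.substring(0, nul) : line).trim();
            int space = ref.indexOf(' ');
            if (space < 0) {
                throw new IOException("malformed ref line: " + ref);
            }
            refs.put(ref.substring(space + 1), ref.substring(0, space));
            line = readPkt(in);
        }
        return refs;
    }

    public static void main(String[] args) throws IOException {
        String sample = "001e# service=git-upload-pack\n"
            + "0000"
            + "003fbe5a750939c212bc0781ffa04fabcfd2b2bd744e refs/heads/master\n"
            + "0000";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(readRefs(in));
    }
}
```

A null return for the flush-pkt keeps it distinct from an empty payload, which comes back as an empty string.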