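I am trying to find the simplest way to parse RFC 822 documents in Java. Suppose I have a message queue that stores HTTP messages, both requests and responses. They are therefore not retrieved in the "normal" way, that is, by opening a socket connection to, say, port 80 and sending or receiving messages over it.
In the code below I deliberately mix "mail" headers into an HTTP message, which shows that the two formats are not very different. But that is not the point. Here is the code: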
package httpexample;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import org.apache.http.Header;
import org.apache.http.HttpException;
import org.apache.http.HttpRequest;
import org.apache.http.impl.io.DefaultHttpRequestParser;
import org.apache.http.impl.io.HttpTransportMetricsImpl;
import org.apache.http.impl.io.SessionInputBufferImpl;
import org.apache.http.io.HttpMessageParser;
import org.apache.http.message.BasicHttpEntityEnclosingRequest;

public class HttpExample {
    // RFC 822
    public static void main(String[] args) throws IOException, HttpException {
        String str = "POST http://localhost:8080/foobar/1234567 HTTP/1.1\n" +
            "Message-ID: <19815303.1075861029555.JavaMail.ss@kk>\n" +
            "Date: Wed, 6 Mar 2010 12:32:20 -0800 (PST)\n" +
            "From: someone@someotherplace.com\n" +
            "To: someone@someplace.com\n" +
            "Subject: some subject\n" +
            "Mime-Version: 1.0\n" +
            "Content-Type: text/plain; charset=us-ascii\n" +
            "Content-Transfer-Encoding: 7bit\n" +
            "X-From: one, some <some.one@someotherplace.com>\n" +
            "X-To: one\n" +
            "X-cc: \n" +
            "X-bcc: \n" +
            "X-Origin: Bob-R\n" +
            "X-FileName: rbob (Non-Privileged).pst\n" +
            "\n" +
            "some message\n";
        ByteArrayInputStream fakeStream = new ByteArrayInputStream(
            str.getBytes());
        HttpTransportMetricsImpl metrics = new HttpTransportMetricsImpl();
        SessionInputBufferImpl inbuffer = new SessionInputBufferImpl(metrics, 1024);
        inbuffer.bind(fakeStream);
        HttpMessageParser<HttpRequest> requestParser =
            new DefaultHttpRequestParser(inbuffer);
        BasicHttpEntityEnclosingRequest request = (BasicHttpEntityEnclosingRequest) requestParser.parse();
        for (Header hdr : request.getAllHeaders()) {
            System.out.println(String.format("%-30s = %s", hdr.getName(), hdr.getValue()));
        }
        System.out.println(String.format("Request Line: %s", request.getRequestLine()));
        System.out.println(String.format("Body\n------------------\n%s",
            request.getEntity()));
    }
}
The output is as follows:
Message-ID = <19815303.1075861029555.JavaMail.ss@kk>
Date = Wed, 6 Mar 2010 12:32:20 -0800 (PST)
From = someone@someotherplace.com
To = someone@someplace.com
Subject = some subject
Mime-Version = 1.0
Content-Type = text/plain; charset=us-ascii
Content-Transfer-Encoding = 7bit
X-From = one, some <some.one@someotherplace.com>
X-To = one
X-cc =
X-bcc =
X-Origin = Bob-R
X-FileName = rbob (Non-Privileged).pst
Request Line: POST http://localhost:8080/foobar/1234567 HTTP/1.1
Body
------------------
null
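What I cannot figure out is how to access the body of the message.
I expect it to contain some message\n
I cannot find any method on BasicHttpEntityEnclosingRequest that gives me this value. In an earlier version I used
HttpRequest request = requestParser.parse();
instead of
BasicHttpEntityEnclosingRequest request =
(BasicHttpEntityEnclosingRequest) requestParser.parse();
I changed it to BasicHttpEntityEnclosingRequest because it has a getEntity method, but that returns null.
So I am a bit lost.
Where do I find the body?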
Answer 0 (score: 1)
I added a Content-Length header; without it, the POST body is ignored. I modified your code, and it now parses the body correctly:
package org.apache.http.examples;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import org.apache.http.Header;
import org.apache.http.HttpException;
import org.apache.http.message.BasicHttpEntityEnclosingRequest;
import org.apache.http.util.EntityUtils;

public class HttpExample {
    // RFC 822
    public static void main(String[] args) throws IOException, HttpException {
        String str = "POST http://localhost:8080/foobar/1234567 HTTP/1.1\n" +
            "Message-ID: <19815303.1075861029555.JavaMail.ss@kk>\n" +
            "Date: Wed, 6 Mar 2010 12:32:20 -0800 (PST)\n" +
            "From: someone@someotherplace.com\n" +
            "To: someone@someplace.com\n" +
            "Subject: some subject\n" +
            "Mime-Version: 1.0\n" +
            "Content-Type: text/plain; charset=us-ascii\n" +
            "Content-Transfer-Encoding: 7bit\n" +
            "X-From: one, some <some.one@someotherplace.com>\n" +
            "X-To: one\n" +
            "X-cc: \n" +
            "X-bcc: \n" +
            "X-Origin: Bob-R\n" +
            "X-FileName: rbob (Non-Privileged).pst\n" +
            "Content-Length: 13\n" +
            "\n" +
            "some message\n";
        ByteArrayInputStream fakeStream = new ByteArrayInputStream(
            str.getBytes());
        BHttpConnectionBaseImpl b = new BHttpConnectionBaseImpl(fakeStream);
        BasicHttpEntityEnclosingRequest request1 = (BasicHttpEntityEnclosingRequest) b.receiveRequestHeader();
        b.receiveRequestEntity(request1);
        for (Header hdr : request1.getAllHeaders()) {
            System.out.println(String.format("%-30s = %s", hdr.getName(), hdr.getValue()));
        }
        System.out.println(String.format("Request Line: %s", request1.getRequestLine()));
        System.out.println(String.format("Body\n------------------\n%s",
            EntityUtils.toString(request1.getEntity())));
    }
}

class BHttpConnectionBaseImpl extends org.apache.http.impl.DefaultBHttpServerConnection {
    private InputStream inputStream;

    public BHttpConnectionBaseImpl(final InputStream inputStream) {
        super(4048);
        this.inputStream = inputStream;
        try {
            super.bind(new Socket());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @Override
    protected InputStream getSocketInputStream(final Socket socket) throws IOException {
        return inputStream;
    }

    @Override
    protected OutputStream getSocketOutputStream(final Socket socket) throws IOException {
        return new ByteArrayOutputStream();
    }
}
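The parsing of the POST body happens in org.apache.http.impl.BHttpConnectionBase.prepareInput(HttpMessage); however, the only constructor of that class is protected and takes a lot of arguments. Its subclass org.apache.http.impl.DefaultBHttpServerConnection has a convenient public constructor and does the header parsing in receiveRequestHeader(). The methods I override are needed to bypass some error checks, for example the check for Socket == null, and to be able to read the request from fakeStream.
Another approach that might work is to subclass Socket, overriding in particular getInputStream() and getOutputStream(), then create an instance of DefaultBHttpServerConnection and call its bind method. The rest should be the same.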
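To illustrate why the declared length matters, here is a minimal sketch in plain JDK code (deliberately not HttpCore) of how a reader can use Content-Length to delimit the body. The class ContentLengthSketch and its methods readLine and readBody are hypothetical names chosen for this illustration; the sketch assumes ASCII headers and ignores Transfer-Encoding entirely.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ContentLengthSketch {

    /** Reads one LF-terminated line, dropping CR characters; returns null at end of stream. */
    public static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawAny = false;
        int b;
        while ((b = in.read()) != -1) {
            sawAny = true;
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        return sawAny ? new String(line.toByteArray(), StandardCharsets.US_ASCII) : null;
    }

    /**
     * Skips the start line, scans the headers for Content-Length, then reads
     * exactly that many bytes. Returns null when no length was declared,
     * which mirrors the null entity seen in the question.
     */
    public static String readBody(InputStream in) throws IOException {
        readLine(in); // start line, e.g. "POST ... HTTP/1.1"
        int length = -1;
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                length = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        if (length < 0) {
            return null; // nothing tells us where the body ends
        }
        byte[] body = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(body, off, length - off);
            if (n == -1) {
                throw new EOFException("body shorter than Content-Length");
            }
            off += n;
        }
        return new String(body, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        byte[] bodyBytes = "some message\n".getBytes(StandardCharsets.US_ASCII);
        String head = "POST /foobar/1234567 HTTP/1.1\r\n"
            + "Content-Type: text/plain; charset=us-ascii\r\n"
            + "Content-Length: " + bodyBytes.length + "\r\n"
            + "\r\n";
        ByteArrayOutputStream message = new ByteArrayOutputStream();
        message.write(head.getBytes(StandardCharsets.US_ASCII));
        message.write(bodyBytes);
        System.out.print(readBody(new ByteArrayInputStream(message.toByteArray())));
    }
}
```

Computing the header value from the encoded bytes, as main does, keeps the declared length in step with whatever charset is actually used for the body.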
Answer 1 (score: 0)
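I think the problem may be that your message headers do not make clear how long the body is, so the receiver simply ignores it. The HTTP specification defines several ways to communicate this information, and none of them seems to apply here:
- Content-Transfer-Encoding would have to be Transfer-Encoding.
- 7bit is not among the standard options.
- str.getBytes() encodes with the platform default charset, which is not necessarily the us-ascii declared in Content-Type.
So I would change your request slightly so that the declared charset and length match the bytes actually sent. For a body encoded as UTF-16, for example:
- Use Content-Type: text/plain; charset=UTF-16
- Remove Content-Transfer-Encoding
- Add Content-Length: 28 (28 being "some message\n".getBytes("UTF-16").length, the body length in bytes when it is encoded as UTF-16 with a byte-order mark).
Answer 2 (score: 0)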
Looking at the source code of DefaultHttpRequestParser, it appears to parse only the request line and the headers; it does not attempt to parse the body.
This thread discusses the same topic. It also contains a few proposed solutions.
Answer 3 (score: 0)
Customize the parsing by overriding LineParser:
inbuffer = new SessionInputBufferImpl(new HttpTransportMetricsImpl(), reqDataLength);
inbuffer.bind(input);
HttpMessageParser<org.apache.http.HttpRequest> requestParser = new DefaultHttpRequestParser(
    inbuffer,
    new LineParser(),
    new DefaultHttpRequestFactory(),
    MessageConstraints.DEFAULT
);
Then obtain the entity body as follows:
HttpEntityEnclosingRequest ereq = (HttpEntityEnclosingRequest) req;
ContentLengthStrategy contentLengthStrategy =
    StrictContentLengthStrategy.INSTANCE;
long len = contentLengthStrategy.determineLength(req);
InputStream contentStream = null;
if (len == ContentLengthStrategy.CHUNKED) {
    contentStream = new ChunkedInputStream(buf);
} else if (len == ContentLengthStrategy.IDENTITY) {
    contentStream = new IdentityInputStream(buf);
} else {
    contentStream = new ContentLengthInputStream(buf, len);
}
BasicHttpEntity ent = new BasicHttpEntity();
ent.setContent(contentStream);
ereq.setEntity(ent);
return ereq;
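The snippet above picks a different stream wrapper depending on how the body is framed. As a rough illustration of what the chunked case involves, here is a minimal plain-JDK sketch of decoding a chunked body. It is not HttpCore's ChunkedInputStream; the class ChunkedBodySketch and its methods are hypothetical names, and the sketch discards chunk extensions and trailer headers rather than exposing them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedBodySketch {

    /** Reads one LF-terminated line, dropping CR characters; returns null at end of stream. */
    public static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawAny = false;
        int b;
        while ((b = in.read()) != -1) {
            sawAny = true;
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        return sawAny ? new String(line.toByteArray(), StandardCharsets.US_ASCII) : null;
    }

    /** Decodes a chunked body: repeated "<hex size>CRLF<data>CRLF" ending with a zero-size chunk. */
    public static byte[] decodeChunked(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine == null) {
                throw new EOFException("missing chunk size line");
            }
            int semi = sizeLine.indexOf(';'); // anything after ';' is a chunk extension
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                break; // last chunk
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) {
                    throw new EOFException("truncated chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the line break that follows the chunk data
        }
        // Skip optional trailer headers up to the terminating blank line.
        String trailer;
        while ((trailer = readLine(in)) != null && !trailer.isEmpty()) {
            // ignored in this sketch
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String wire = "5\r\nsome \r\n8\r\nmessage\n\r\n0\r\n\r\n";
        byte[] decoded = decodeChunked(
            new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII)));
        System.out.print(new String(decoded, StandardCharsets.US_ASCII));
    }
}
```

With a Content-Length header the reader knows the size up front; with chunked framing each chunk carries its own size, which is why the two cases need different stream wrappers.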