Reading file input from a multipart/form-data POST

Time: 2011-09-18 07:21:19

Tags: c# wcf http-post html-form

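I'm POSTing a file to a WCF REST service through an HTML form, with enctype set to multipart/form-data and a single component: <input type="file" name="data">. The resulting stream read by the server contains the following: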

------WebKitFormBoundary
Content-Disposition: form-data; name="data"; filename="DSCF0001.JPG"
Content-Type: image/jpeg

<file bytes>
------WebKitFormBoundary--

The problem is that I'm not sure how to extract the file bytes from the stream. I need to do this in order to write the file to disk.

9 answers:

Answer 0: (score: 41)

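Sorry for joining the party late, but there is a way to do this with Microsoft public API.

Here's what you need:

  1. System.Net.Http.dll
    • Included in .NET 4.5
    • For .NET 4, get it via NuGet
  2. System.Net.Http.Formatting.dll

    Note: the NuGet packages come with more assemblies, but at the time of writing you only need the above.

    Once you have the assemblies referenced, the code can look like this (using .NET 4.5 for convenience):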

    public static async Task ParseFiles(
        Stream data, string contentType, Action<string, Stream> fileProcessor)
    {
        var streamContent = new StreamContent(data);
        streamContent.Headers.ContentType = MediaTypeHeaderValue.Parse(contentType);
    
        var provider = await streamContent.ReadAsMultipartAsync();
    
        foreach (var httpContent in provider.Contents)
        {
            var fileName = httpContent.Headers.ContentDisposition.FileName;
            if (string.IsNullOrWhiteSpace(fileName))
            {
                continue;
            }
    
            using (Stream fileContents = await httpContent.ReadAsStreamAsync())
            {
                fileProcessor(fileName, fileContents);
            }
        }
    }
    

    As for usage, say you have the following WCF REST method:

    [OperationContract]
    [WebInvoke(Method = WebRequestMethods.Http.Post, UriTemplate = "/Upload")]
    void Upload(Stream data);
    

    You could implement it like this:

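    public void Upload(Stream data)
    {
        // Block until parsing has finished; otherwise the method returns
        // while the asynchronous read of the request stream is still running.
        MultipartParser.ParseFiles(
               data, 
               WebOperationContext.Current.IncomingRequest.ContentType, 
               MyProcessMethod).Wait();
    }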
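
    The answer does not show MyProcessMethod, so here is a minimal sketch of what such a callback might look like if the goal is simply to write each file to disk. The class name UploadStorage and the UploadDirectory setting are hypothetical. The sketch also assumes that the file name can arrive wrapped in double quotes and may carry a client-side path, so it strips both before saving.

```csharp
using System;
using System.IO;

public static class UploadStorage
{
    // Hypothetical target directory; adjust to your environment.
    public static string UploadDirectory =
        Path.Combine(Path.GetTempPath(), "uploads");

    // Matches Action<string, Stream>, so it can be passed as fileProcessor.
    public static void MyProcessMethod(string fileName, Stream fileContents)
    {
        // The header value may be quoted, e.g. "\"DSCF0001.JPG\"".
        string unquoted = fileName.Trim('"');

        // Keep only the part after the last slash or backslash so that a
        // client-supplied path cannot point outside the upload directory.
        int cut = unquoted.LastIndexOfAny(new[] { '/', '\\' });
        string safeName = unquoted.Substring(cut + 1);
        if (safeName.Length == 0)
        {
            throw new ArgumentException("No usable file name.", "fileName");
        }

        Directory.CreateDirectory(UploadDirectory);
        string target = Path.Combine(UploadDirectory, safeName);

        // Stream the content to disk without buffering it all in memory.
        using (FileStream output = File.Create(target))
        {
            fileContents.CopyTo(output);
        }
    }
}
```

    With this in place, UploadStorage.MyProcessMethod can be passed wherever the answer uses MyProcessMethod. Files with the same name overwrite each other in this sketch.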
    

Answer 1: (score: 31)

You may take a look at the following blog post, which illustrates a technique for parsing multipart/form-data on the server using the Multipart Parser:

public void Upload(Stream stream)
{
    MultipartParser parser = new MultipartParser(stream);
    if (parser.Success)
    {
        // Save the file
        SaveFile(parser.Filename, parser.ContentType, parser.FileContents);
    }
}

Another possibility is to enable aspnet compatibility and use HttpContext.Current.Request, but that's not a very WCF-like way.

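Answer 2: (score: 25)

I've had some issues with parsers that are based on string parsing, particularly with large files: I found they would run out of memory and fail to parse binary data.

To cope with these issues I've open-sourced my own attempt at a C# multipart/form-data parser here

Features:

  • Handles very large files well. (Data is streamed in and streamed out while reading.)
  • Can handle multiple file uploads and automatically detects whether a section is a file or not.
  • Returns files as a stream rather than as a byte[] (good for large files).
  • Full documentation for the library, including an MSDN-style generated website.
  • Full unit tests.

Restrictions:

  • Doesn't handle non-multipart data.
  • The code is more complicated than Lorenzo's.

Just use the MultipartFormDataParser class: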

Stream data = GetTheStream();

// Boundary is auto-detected but can also be specified.
var parser = new MultipartFormDataParser(data, Encoding.UTF8);

// The stream is parsed; if parsing failed, an exception is thrown. Now we can use
// your data!

// The keys of these maps correspond to the name fields in your
// form
string username = parser.Parameters["username"].Data;
string password = parser.Parameters["password"].Data;

// Single file access:
var file = parser.Files.First();
string filename = file.FileName;
Stream fileData = file.Data;

// Multi-file access
foreach(var f in parser.Files)
{
    // Do stuff with each file.
}

In the context of a WCF service you could use it like this:

public ResponseClass MyMethod(Stream multipartData)
{
    // First we need to get the boundary from the header, this is sent
    // with the HTTP request. We can do that in WCF using the WebOperationContext:
    var type = WebOperationContext.Current.IncomingRequest.Headers["Content-Type"];

    // Now we want to strip the boundary out of the Content-Type, currently the string
    // looks like: "multipart/form-data; boundary=---------------------124123qase124"
    var boundary = type.Substring(type.IndexOf('=')+1);

    // Now that we've got the boundary we can parse our multipart and use it as normal
    var parser = new MultipartFormDataParser(multipartData, boundary, Encoding.UTF8);

    ...
}

Or like this (slightly slower but more code-friendly):

public ResponseClass MyMethod(Stream multipartData)
{
    // No boundary is passed here, so the parser detects it from the stream.
    var parser = new MultipartFormDataParser(multipartData, Encoding.UTF8);
}

Documentation is also available: when you clone the repository, simply navigate to HttpMultipartParserDocumentation/Help/index.html
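
The boundary extraction in the first WCF variant takes everything after the first '=' in the Content-Type header. That works for the header shown in its comment, but it would pick up the wrong text if another parameter (for example charset) came before boundary, or if the boundary value were quoted. A slightly more defensive sketch is below; ContentTypeHelper and GetBoundary are hypothetical names and not part of the library described in this answer.

```csharp
using System;

public static class ContentTypeHelper
{
    // Returns the boundary parameter of a multipart Content-Type header,
    // or null when the header carries no boundary parameter.
    public static string GetBoundary(string contentType)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            return null;
        }

        // Parameters follow the media type and are separated by ';'.
        string[] parts = contentType.Split(';');
        for (int i = 1; i < parts.Length; i++)
        {
            string part = parts[i].Trim();
            if (part.StartsWith("boundary=", StringComparison.OrdinalIgnoreCase))
            {
                // Drop the parameter name, then any surrounding quotes.
                // The boundary itself may legally contain '='.
                return part.Substring("boundary=".Length).Trim('"');
            }
        }

        return null;
    }
}
```

The returned string could then be passed as the boundary argument in place of the Substring call. Splitting on ';' is a simplification that assumes no other parameter has a quoted value containing a semicolon.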

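Answer 3: (score: 16)

I open-sourced a C# HTTP form parser here

This is slightly more flexible than the other one mentioned, which is on CodePlex, since you can use it for both multipart and non-multipart form data, and it also gives you the other form parameters formatted as a Dictionary object.

It can be used as follows:

Non-multipart: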

public void Login(Stream stream)
{
    string username = null;
    string password = null;

    HttpContentParser parser = new HttpContentParser(stream);
    if (parser.Success)
    {
        username = HttpUtility.UrlDecode(parser.Parameters["username"]);
        password = HttpUtility.UrlDecode(parser.Parameters["password"]);
    }
}

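Answer 4: (score: 2)

Another way would be to use the .NET parser for HttpRequest. To do that you need a bit of reflection and a simple class for the WorkerRequest.

First create a class that derives from HttpWorkerRequest (for simplicity you can use SimpleWorkerRequest):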

public class MyWorkerRequest : SimpleWorkerRequest
{
    private readonly string _size;
    private readonly Stream _data;
    private string _contentType;

    public MyWorkerRequest(Stream data, string size, string contentType)
        : base("/app", @"c:\", "aa", "", null)
    {
        _size = size ?? data.Length.ToString(CultureInfo.InvariantCulture);
        _data = data;
        _contentType = contentType;
    }

    public override string GetKnownRequestHeader(int index)
    {
        switch (index)
        {
            case (int)HttpRequestHeader.ContentLength:
                return _size;
            case (int)HttpRequestHeader.ContentType:
                return _contentType;
        }
        return base.GetKnownRequestHeader(index);
    }

    public override int ReadEntityBody(byte[] buffer, int offset, int size)
    {
        return _data.Read(buffer, offset, size);
    }

    public override int ReadEntityBody(byte[] buffer, int size)
    {
        return ReadEntityBody(buffer, 0, size);
    }
}

Then, wherever you have your message stream, create an instance of this class. I'm doing it like this in a WCF service:

[WebInvoke(Method = "POST",
               ResponseFormat = WebMessageFormat.Json,
               BodyStyle = WebMessageBodyStyle.Bare)]
    public string Upload(Stream data)
    {
        HttpWorkerRequest workerRequest =
            new MyWorkerRequest(data,
                                WebOperationContext.Current.IncomingRequest.ContentLength.
                                    ToString(CultureInfo.InvariantCulture),
                                WebOperationContext.Current.IncomingRequest.ContentType
                );

Then create an HttpRequest using the activator and the non-public constructor:

var r = (HttpRequest)Activator.CreateInstance(
            typeof(HttpRequest),
            BindingFlags.Instance | BindingFlags.NonPublic,
            null,
            new object[]
                {
                    workerRequest,
                    new HttpContext(workerRequest)
                },
            null);

// Upload returns a string, so bail out with null when reflection fails.
var runtimeField = typeof (HttpRuntime).GetField("_theRuntime", BindingFlags.Static | BindingFlags.NonPublic);
if (runtimeField == null)
{
    return null;
}

var runtime = (HttpRuntime) runtimeField.GetValue(null);
if (runtime == null)
{
    return null;
}

var codeGenDirField = typeof(HttpRuntime).GetField("_codegenDir", BindingFlags.Instance | BindingFlags.NonPublic);
if (codeGenDirField == null)
{
    return null;
}

codeGenDirField.SetValue(runtime, @"C:\MultipartTemp");

After that, r.Files will contain the files from your stream.

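Answer 5: (score: 1)

The guy who solved this posted it under the LGPL and you're not allowed to modify it. I didn't even click on it when I saw that. Here's my version. It needs to be tested. There are probably bugs. Please post any updates. No warranty. You can modify it all you want, call it your own, print it out on a piece of paper and use it as kennel scrap, ... I don't care.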

using System.Collections.Generic;
using System.Collections.Specialized;
using System.IO;
using System.Net;
using System.Text;
using System.Web;

namespace DigitalBoundaryGroup
{
    class HttpNameValueCollection
    {
        public class File
        {
            private string _fileName;
            public string FileName { get { return _fileName ?? (_fileName = ""); } set { _fileName = value; } }

            private string _fileData;
            public string FileData { get { return _fileData ?? (_fileData = ""); } set { _fileData = value; } }

            private string _contentType;
            public string ContentType { get { return _contentType ?? (_contentType = ""); } set { _contentType = value; } }
        }

        private NameValueCollection _post;
        private Dictionary<string, File> _files;
        private readonly HttpListenerContext _ctx;

        public NameValueCollection Post { get { return _post ?? (_post = new NameValueCollection()); } set { _post = value; } }
        public NameValueCollection Get { get { return _ctx.Request.QueryString; } }
        public Dictionary<string, File> Files { get { return _files ?? (_files = new Dictionary<string, File>()); } set { _files = value; } }

        private void PopulatePostMultiPart(string post_string)
        {
            var boundary_index = _ctx.Request.ContentType.IndexOf("boundary=") + 9;
            var boundary = _ctx.Request.ContentType.Substring(boundary_index, _ctx.Request.ContentType.Length - boundary_index);

            var upper_bound = post_string.Length - 4;

            if (post_string.Substring(2, boundary.Length) != boundary)
                throw (new InvalidDataException());

            var current_string = new StringBuilder();

            for (var x = 4 + boundary.Length; x < upper_bound; ++x)
            {
                if (post_string.Substring(x, boundary.Length) == boundary)
                {
                    x += boundary.Length + 1;

                    var post_variable_string = current_string.Remove(current_string.Length - 4, 4).ToString();

                    var end_of_header = post_variable_string.IndexOf("\r\n\r\n");

                    if (end_of_header == -1) throw (new InvalidDataException());

                    var filename_index = post_variable_string.IndexOf("filename=\"", 0, end_of_header);
                    var filename_starts = filename_index + 10;
                    var content_type_starts = post_variable_string.IndexOf("Content-Type: ", 0, end_of_header) + 14;
                    var name_starts = post_variable_string.IndexOf("name=\"") + 6;
                    var data_starts = end_of_header + 4;

                    if (filename_index != -1)
                    {
                        var filename = post_variable_string.Substring(filename_starts, post_variable_string.IndexOf("\"", filename_starts) - filename_starts);
                        var content_type = post_variable_string.Substring(content_type_starts, post_variable_string.IndexOf("\r\n", content_type_starts) - content_type_starts);
                        var file_data = post_variable_string.Substring(data_starts, post_variable_string.Length - data_starts);
                        var name = post_variable_string.Substring(name_starts, post_variable_string.IndexOf("\"", name_starts) - name_starts);
                        Files.Add(name, new File() { FileName = filename, ContentType = content_type, FileData = file_data });
                    }
                    else
                    {
                        var name = post_variable_string.Substring(name_starts, post_variable_string.IndexOf("\"", name_starts) - name_starts);
                        var value = post_variable_string.Substring(data_starts, post_variable_string.Length - data_starts);
                        Post.Add(name, value);
                    }

                    current_string.Clear();
                    continue;
                }

                current_string.Append(post_string[x]);
            }
        }

        private void PopulatePost()
        {
            if (_ctx.Request.HttpMethod != "POST" || _ctx.Request.ContentType == null) return;

            var post_string = new StreamReader(_ctx.Request.InputStream, _ctx.Request.ContentEncoding).ReadToEnd();

            if (_ctx.Request.ContentType.StartsWith("multipart/form-data"))
                PopulatePostMultiPart(post_string);
            else
                Post = HttpUtility.ParseQueryString(post_string);

        }

        public HttpNameValueCollection(ref HttpListenerContext ctx)
        {
            _ctx = ctx;
            PopulatePost();
        }


    }
}
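
One caveat about the class above: it keeps uploaded file content in a string (FileData) after decoding the whole request with Request.ContentEncoding. For binary uploads such as a JPEG, whether the original bytes can be recovered from that string depends on the encoding. The sketch below is an illustration added here, not part of the answer, and BinaryStringDemo is a hypothetical name. It shows that ISO-8859-1 maps every byte to exactly one character and back, while UTF-8 replaces invalid byte sequences and so loses data.

```csharp
using System;
using System.Text;

public static class BinaryStringDemo
{
    // Decodes the bytes to a string and encodes that string back to bytes,
    // which is what happens when binary data is held in a string field.
    public static byte[] RoundTrip(byte[] payload, Encoding encoding)
    {
        string asText = encoding.GetString(payload);
        return encoding.GetBytes(asText);
    }

    // Simple element-by-element comparison of two byte arrays.
    public static bool SameBytes(byte[] a, byte[] b)
    {
        if (a.Length != b.Length)
        {
            return false;
        }
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

For the first four bytes of a JPEG (FF D8 FF E0), RoundTrip with Encoding.GetEncoding("ISO-8859-1") gives back the same bytes, while RoundTrip with Encoding.UTF8 does not. If you adopt the class above for binary files, reading the request as ISO-8859-1 and calling GetBytes with the same encoding on FileData is one possible workaround, at the cost of holding the whole upload in memory.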

Answer 6: (score: 1)

I have implemented the MultipartReader NuGet package for ASP.NET 4 for reading multipart form data. It is based on Multipart Form Data Parser, but it supports more than one file.

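Answer 7: (score: 1)

How about some regex?

I wrote this for a text file, but I believe it could work for you

(In case your text file contains lines that match the patterns below exactly, simply adapt your regex)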

    private static List<string> fileUploadRequestParser(Stream stream)
    {
        //-----------------------------111111111111111
        //Content-Disposition: form-data; name="file"; filename="data.txt"
        //Content-Type: text/plain
        //...
        //...
        //-----------------------------111111111111111
        //Content-Disposition: form-data; name="submit"
        //Submit
        //-----------------------------111111111111111--

        List<String> lstLines = new List<string>();
        TextReader textReader = new StreamReader(stream);
        string sLine = textReader.ReadLine();
        Regex regex = new Regex("(^-+)|(^content-)|(^$)|(^submit)", RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Singleline);

        while (sLine != null)
        {
            if (!regex.Match(sLine).Success)
            {
                lstLines.Add(sLine);
            }
            sLine = textReader.ReadLine();
        }

        return lstLines;
    }

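Answer 8: (score: 0)

I have dealt with WCF uploads of large files (several GB), where storing the data in memory is not an option. My solution was to store the message stream in a temp file and use seek to find the beginning and end of the binary data.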
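
The answer gives no code, so the following is only a sketch of how the temp-file approach might look. All names (TempFileMultipart, ExtractFirstPart, IndexOf) are hypothetical. It assumes the body contains a single file part, as in the question, and that the boundary has already been taken from the Content-Type header. The byte search is a plain Knuth-Morris-Pratt scan, so the file is read in one forward pass per pattern.

```csharp
using System;
using System.IO;
using System.Text;

public static class TempFileMultipart
{
    // Offset of the first occurrence of pattern at or after start, or -1.
    public static long IndexOf(Stream stream, byte[] pattern, long start)
    {
        // KMP failure table: length of the longest proper prefix of the
        // pattern that is also a suffix of pattern[0..i].
        int[] fail = new int[pattern.Length];
        for (int i = 1, k = 0; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }

        stream.Seek(start, SeekOrigin.Begin);
        long position = start;
        int matched = 0;
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            while (matched > 0 && b != pattern[matched]) matched = fail[matched - 1];
            if (b == pattern[matched]) matched++;
            position++;
            if (matched == pattern.Length) return position - pattern.Length;
        }
        return -1;
    }

    // Spools the request to a temp file, then copies the bytes between the
    // first part's headers and the next boundary line to targetPath.
    public static void ExtractFirstPart(Stream request, string boundary, string targetPath)
    {
        string tempPath = Path.GetTempFileName();
        try
        {
            using (FileStream temp = File.Open(tempPath, FileMode.Open, FileAccess.ReadWrite))
            {
                request.CopyTo(temp);

                // Part headers end with an empty line; the data ends right
                // before CRLF + "--" + boundary.
                byte[] headerEnd = Encoding.ASCII.GetBytes("\r\n\r\n");
                byte[] delimiter = Encoding.ASCII.GetBytes("\r\n--" + boundary);

                long headerPos = IndexOf(temp, headerEnd, 0);
                if (headerPos < 0) throw new InvalidDataException("No part headers found.");
                long dataStart = headerPos + headerEnd.Length;

                long dataEnd = IndexOf(temp, delimiter, dataStart);
                if (dataEnd < 0) throw new InvalidDataException("No closing boundary found.");

                temp.Seek(dataStart, SeekOrigin.Begin);
                using (FileStream output = File.Create(targetPath))
                {
                    byte[] buffer = new byte[81920];
                    long remaining = dataEnd - dataStart;
                    while (remaining > 0)
                    {
                        int read = temp.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining));
                        if (read == 0) throw new EndOfStreamException();
                        output.Write(buffer, 0, read);
                        remaining -= read;
                    }
                }
            }
        }
        finally
        {
            File.Delete(tempPath);
        }
    }
}
```

Only one 80 KB buffer is held in memory at a time, which is the point of the approach. Handling several parts would mean repeating the header and delimiter search from the end of each part.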