How do I split a CSV file downloaded from a URL? I am trying to keep the header in every split part.
For example:
A,B,C,D,E
1,2,3,4,5
12,11,8,7,6
23,23,34,1,0
23,23,32,1,0
becomes
A,B,C,D,E
1,2,3,4,5
12,11,8,7,6
A,B,C,D,E
23,23,34,1,0
23,23,32,1,0
My code below retrieves the file from the URL:
MemoryStream file = GetStreamFromUrl(invoiceAPI);

private static MemoryStream GetStreamFromUrl(string url)
{
    // Dispose the WebClient, but not the returned stream: the caller owns it.
    // (Wrapping the stream itself in a using block would return a disposed stream.)
    using (WebClient wc = new WebClient())
    {
        return new MemoryStream(wc.DownloadData(url));
    }
}
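How can I split the CSV file while keeping the header? The file length varies, so I would like to split it into chunks of, say, only 10 rows each, because I will upload each chunk as a separate batch. Could you explain how to do this?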
Answer 0 (score: 1)
Use string.Split: take the first line as the header, then split the remaining lines into groups. https://docs.microsoft.com/en-us/dotnet/api/system.string.split?view=netframework-4.8
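The suggestion above can be sketched roughly as follows. This is only an illustration: the class name CsvChunker and the method SplitWithHeader are hypothetical, and the code assumes a simple CSV with no quoted fields that contain line breaks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvChunker
{
    // Splits CSV text into chunks of at most rowsPerChunk data rows,
    // repeating the header line at the top of every chunk.
    public static List<string> SplitWithHeader(string content, int rowsPerChunk)
    {
        if (rowsPerChunk <= 0) throw new ArgumentOutOfRangeException(nameof(rowsPerChunk));

        // Split on both CR and LF and drop empty entries, so "\r\n" and "\n" input both work.
        var lines = content.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        var chunks = new List<string>();
        if (lines.Length == 0) return chunks;

        var header = lines[0];
        // Start at index 1 to skip the header; the last chunk may be shorter.
        for (var start = 1; start < lines.Length; start += rowsPerChunk)
        {
            var rows = lines.Skip(start).Take(rowsPerChunk);
            chunks.Add(header + Environment.NewLine + string.Join(Environment.NewLine, rows));
        }
        return chunks;
    }
}
```

Calling CsvChunker.SplitWithHeader(content, 10) on the downloaded text would give one string per upload batch, each beginning with the header line.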
Answer 1 (score: 0)
I came up with two versions.
// Version A: buffer the lines of one chunk, then write the chunk in one go
var dataLinesPerFile = 2;
var contentAsLines = content.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
var header = contentAsLines[0];
var dataLines = contentAsLines.Skip(1);
// I've used foreach so that the algorithm could be used if reading line by line rather than the whole file
List<string> lines = new List<string>();
var fileId = 0;
foreach (var line in dataLines)
{
    lines.Add(line);
    if (lines.Count % dataLinesPerFile == 0)
    {
        WriteChunk(fileId++, header, lines);
        lines = new List<string>(); // or lines.Clear();
    }
}
// Write the remaining lines that did not fill a whole chunk
if (lines.Any()) WriteChunk(fileId++, header, lines);
(...)
private static void WriteChunk(int id, string header, IEnumerable<string> lines)
{
    Console.WriteLine("");
    Console.WriteLine($"File_A{id}:");
    Console.WriteLine(header);
    Console.WriteLine(string.Join(Environment.NewLine, lines)); // File.WriteAllLines
}
// Version B: write each line as it is read, without buffering a chunk
var fileId = 0;
var lineCount = 0;
foreach (var line in dataLines)
{
    if (lineCount % dataLinesPerFile == 0)
    {
        // Close the file, create the new file and write the header
        Console.WriteLine("");
        Console.WriteLine($"File_B{fileId++}");
        Console.WriteLine(header);
    }
    Console.WriteLine(line);
    lineCount++;
}
// Close the current file
I added a fifth row to show that the code does not lose a "stray" trailing row.
var content = @"A,B,C,D,E
1,2,3,4,5
12,11,8,7,6
23,23,34,1,0
23,23,32,1,0
5,5,5,5,5";
// .NETCoreApp,Version=v3.0
File_A0:
A,B,C,D,E
1,2,3,4,5
12,11,8,7,6
File_A1:
A,B,C,D,E
23,23,34,1,0
23,23,32,1,0
File_A2:
A,B,C,D,E
5,5,5,5,5
------------------
File_B0
A,B,C,D,E
1,2,3,4,5
12,11,8,7,6
File_B1
A,B,C,D,E
23,23,34,1,0
23,23,32,1,0
File_B2
A,B,C,D,E
5,5,5,5,5
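For real files instead of console output, the second version could look roughly like the sketch below. The names CsvFileSplitter, SplitToFiles and the chunk{n}.csv file naming are hypothetical choices for illustration, and the code again assumes no quoted fields spanning several lines.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvFileSplitter
{
    // Reads CSV text line by line and writes files of at most rowsPerFile data rows,
    // each starting with the header. Returns the paths of the files written.
    public static List<string> SplitToFiles(TextReader reader, string destDir, int rowsPerFile)
    {
        if (rowsPerFile <= 0) throw new ArgumentOutOfRangeException(nameof(rowsPerFile));

        var paths = new List<string>();
        var header = reader.ReadLine();
        if (header == null) return paths; // empty input: nothing to write

        StreamWriter writer = null;
        var rowsInFile = 0;
        try
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length == 0) continue; // skip blank lines
                if (writer == null || rowsInFile == rowsPerFile)
                {
                    writer?.Dispose(); // close the previous chunk, if any
                    var path = Path.Combine(destDir, $"chunk{paths.Count}.csv");
                    writer = new StreamWriter(path);
                    writer.WriteLine(header);
                    paths.Add(path);
                    rowsInFile = 0;
                }
                writer.WriteLine(line);
                rowsInFile++;
            }
        }
        finally
        {
            writer?.Dispose(); // close the last chunk
        }
        return paths;
    }
}
```

With the stream from the question, this could be called as SplitToFiles(new StreamReader(GetStreamFromUrl(invoiceAPI)), outputDir, 10), where outputDir is an existing directory of your choosing.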
Answer 2 (score: 0)
Here is an implementation using the CsvHelper NuGet package.
First, create a Row class to map your CSV columns:
public class Row
{
    public int A { get; set; }
    public int B { get; set; }
    public int C { get; set; }
    public int D { get; set; }
    public int E { get; set; }

    public override string ToString()
    {
        return $"A={A},B={B},C={C},D={D},E={E}";
    }
}
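Then you can create a method that takes the source path of the CSV file to read and the destination directory where the new CSV files should be saved. You also specify the number of rows to put in each file, two in this case. The method can certainly be improved, for example with error checking, but it shows the general idea.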
private static void SplitCsv(string source, string dest, int numRows)
{
    // Open CSV file for reading
    // (recent CsvHelper versions also require a CultureInfo argument,
    // e.g. CultureInfo.InvariantCulture, for CsvReader and CsvWriter)
    using (var fileReader = File.OpenText(source))
    {
        using (var csv = new CsvReader(fileReader))
        {
            // Collect all rows
            var rows = csv
                .GetRecords<Row>()
                .ToList();
            // Iterate rows in chunks; the condition also covers a final partial chunk
            for (var row = 0; row * numRows < rows.Count; row++)
            {
                // Extract chunks using LINQ
                var fileRows = rows
                    .Skip(row * numRows)
                    .Take(numRows);
                // Create output path
                var outputPath = Path.Combine(dest, $"file{row}.txt");
                // Write chunk to file
                using (var writer = new StreamWriter(outputPath,
                                                     false,
                                                     System.Text.Encoding.UTF8))
                {
                    using (var csvFile = new CsvWriter(writer))
                    {
                        csvFile.WriteRecords(fileRows);
                    }
                }
            }
        }
    }
}
The above generates the following files:
file0.txt
A,B,C,D,E
1,2,3,4,5
12,11,8,7,6
file1.txt
A,B,C,D,E
23,23,34,1,0
23,23,32,1,0
Answer 3 (score: 0)
May I suggest an edit?
static void SplitCsv(string source, string dest, int numRows, string currency, ref List<string> outputPaths)
{
    // Open the CSV file for reading
    using (TextReader fileReader = System.IO.File.OpenText(source))
    {
        using (CsvReader csv = new CsvReader(fileReader, CultureInfo.InvariantCulture))
        {
            // HasHeaderRecord = false: the input is read without a header row;
            // set it to true if the first line holds the column names
            csv.Configuration.HasHeaderRecord = false;
            // Collect all rows
            List<Row> rows = csv.GetRecords<Row>().ToList();
            // Iterate over the rows in chunks; the condition also covers a final partial chunk
            for (int row = 0; row * numRows < rows.Count; row++)
            {
                // Extract the chunk using LINQ
                var fileRows = rows
                    .Skip(row * numRows)
                    .Take(numRows);
                // Build the output path
                string outputPath = Path.Combine(dest, currency + "_" + DateTime.UtcNow.Year + "_" + DateTime.UtcNow.Month + "_" + DateTime.UtcNow.Day + $"_CashBacks{row}.csv");
                // Write the chunk to a file
                using (TextWriter writer = new StreamWriter(outputPath, false, Encoding.UTF8))
                {
                    using (CsvWriter csvFile = new CsvWriter(writer, CultureInfo.InvariantCulture))
                    {
                        // No header is written to the output; set to true to keep a header in every file
                        csvFile.Configuration.HasHeaderRecord = false;
                        csvFile.WriteRecords(fileRows);
                    }
                }
                outputPaths.Add(outputPath);
            }
        }
    }
}