Question

I have a CSV file with several hundred thousand rows and a single column: no spaces, no quotes, no commas.

line1
line2
line3
line4

I need it split so that it is still one column, but with up to 50 of the original lines on each row, separated by commas.

So:

line1,line2,line3,line4 all the way to line50
line51,line52,line53, all the way to line100
line101,line102,line103 all the way to line150

And so on until the end of the CSV.

I have FFE and CSVTOOLS, and I am running Linux, so I would prefer a Linux method. This is definitely over my head, so please help. Thanks.

Answer 1

I am assuming you can run Perl scripts. I can't vouch for the speed, but given the details you provided, it will get the job done.

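#!/usr/bin/perl

use strict;
use warnings;

my $file = $ARGV[0];

open( my $fh, "<", $file ) or die "Cannot open $file: $!";

my @chunk;    # lines collected for the current output row
while ( my $line = <$fh> ) {
    chomp $line;
    push @chunk, $line;
    if ( @chunk == 50 ) {
        print join( ",", @chunk ), "\n";
        @chunk = ();
    }
}

# Print the final, shorter row when the line count is not a multiple of 50
print join( ",", @chunk ), "\n" if @chunk;

close($fh);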

You can run it as perl convert.pl file if you want the result printed to standard output, or simply redirect the output to a file in the shell.

Answer 2

So you want to read 50 lines from the file and then join them with commas, right? Here is what I came up with (using Python):

import sys

fd = open("foo.txt")
for i in range(3):  # number of 50-line blocks to write
    for j in range(50):
        line = fd.readline().rstrip()
        if j != 0:
            sys.stdout.write(",")
        sys.stdout.write(line)
    sys.stdout.write("\n")
fd.close()

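Change 3 to the number of 50-line blocks in your file, and "foo.txt" to the actual file name. This writes to stdout; if that is a problem, you can open another file for writing.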
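If you do not know the number of blocks in advance, the same idea can be written to read until the end of the file. The following is only a minimal sketch, not part of the original answer: the function name join_in_chunks and the sample data are made up for illustration, and it assumes the input has one value per line.

```python
from itertools import islice


def join_in_chunks(lines, size=50):
    """Yield comma-joined rows, each built from up to `size` input lines."""
    lines = iter(lines)
    while True:
        # Take the next `size` lines (fewer at the end) and drop line endings
        chunk = [line.rstrip("\r\n") for line in islice(lines, size)]
        if not chunk:
            break  # input exhausted
        yield ",".join(chunk)


# Illustrative data only: 7 lines grouped 3 per row
sample = ["line%d\n" % n for n in range(1, 8)]
for row in join_in_chunks(sample, size=3):
    print(row)
```

To use it on a real file, pass the open file object instead of the sample list, for example join_in_chunks(open("foo.txt")), and redirect the output in the shell. Because the last chunk is simply shorter, the final row does not end up with trailing commas.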

Answer 3

In bash:

#!/bin/bash

out_file=output.csv
line_width=50

count=0

# start from an empty output file so reruns do not append to old results
: > "$out_file"

while IFS= read -r line
do
  printf '%s' "$line" >> "$out_file"
  count=$((count + 1))

  if [ "$count" -lt "$line_width" ]
  then
    printf ',' >> "$out_file"
  else
    printf '\n' >> "$out_file"
    count=0
  fi
done

# strip the trailing comma left on a final, shorter row
sed 's/,$//' < "$out_file" > "$out_file.tmp" && mv "$out_file.tmp" "$out_file"

Assuming you have saved this script as wrap.sh, run it from the command line:

$ ./wrap.sh < file.txt

The output will be in output.csv.

Easiest way to convert a line-by-line CSV to a comma-separated CSV
