I have a CSV file with three comma-separated fields, like this:
THIS_IS_A_RECORD,email1domain.com;,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
ANOTHER_RECORD,email1domain.com;,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
,,email@domain.com;
I want to merge the rows so that my output looks like this:
THIS_IS_A_FIELD,email1domain.com;,email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;
ANOTHER_FIELD,email1domain.com;,email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;email@domain.com;
,,email@domain.com;
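The third field of each continuation row should be appended to the end of the last complete record. My goal is to import the output into a MySQL database.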
Answer 0 (score: 0)
Given your data set, the following should do what you need:
perl -pe 'chomp; print "\n" if /^[^,]/ && $. > 1; s/,//g if /^,/' inFile > outFile
Hope this helps!
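For readers who want the rule spelled out step by step, here is a minimal Python sketch of the same idea: a row with a non-empty first field starts a new record, and any other row contributes only its third field to the most recent record. The function name `merge_rows` and the sample addresses are made up for illustration and do not come from the question.

```python
import csv
import io


def merge_rows(lines):
    """Fold continuation rows (empty first field) into the preceding record."""
    merged = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if row[0]:
            # New record: pad to three fields so index 2 always exists.
            merged.append(list(row) + [""] * (3 - len(row)))
        elif merged and len(row) > 2:
            # Continuation row: append its third field to the last record.
            merged[-1][2] += row[2]
    return merged


# Hypothetical sample data shaped like the file in the question.
sample = io.StringIO(
    "REC_A,a1@x.com;,a2@x.com;\n"
    ",,a3@x.com;\n"
    "REC_B,b1@x.com;,b2@x.com;\n"
    ",,b3@x.com;\n"
    ",,b4@x.com;\n"
)
for record in merge_rows(sample):
    print(",".join(record))
```

Each record keeps exactly three fields, so the result should line up with a three-column table on import.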
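Answer 1 (score: 0)
awk -F, '
length($1) {if (line) print line; line = $0; next}  # a non-empty first field starts a new record
{line = line $3}                                    # continuation row: append only the third field
END {if (line) print line}                          # flush the last record
' file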
Answer 2 (score: 0)
You may prefer this solution. It makes no assumption about which columns (after the first) contain email addresses.
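use strict;
use warnings;

my %data;    # label => list of email fields collected for that label
my @labels;  # labels in order of first appearance

while (<>) {
    chomp;
    my ($label, @emails) = split /,/;
    @emails = grep $_, @emails;    # drop empty fields
    push @labels, $label if $label;
    # attach the emails to the most recently seen label
    push @{ $data{ $labels[-1] } }, @emails if @labels;
}

print join(',', $_, @{ $data{$_} }), "\n" for @labels;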
Output
THIS_IS_A_RECORD,email1domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;
ANOTHER_RECORD,email1domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;,email@domain.com;
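Note that the output above puts every address in its own comma-separated field, so the number of columns varies from record to record. If a fixed column count is preferable for the import, one possible variation is to merge column by column instead. The Python sketch below is an illustration under that assumption, not a translation of the Perl script; `merge_columns` and the sample values are hypothetical.

```python
import csv


def merge_columns(lines):
    """Append each non-empty field of a continuation row to the same column
    of the most recent record, whichever column it appears in."""
    records = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if row[0]:
            records.append(list(row))  # non-empty label: start a new record
            continue
        if not records:
            continue  # continuation row before any record: nothing to attach to
        current = records[-1]
        for i, value in enumerate(row):
            if not value:
                continue
            if i < len(current):
                current[i] += value
            else:
                # The record is shorter than this row: pad, then add the value.
                current.extend([""] * (i - len(current)))
                current.append(value)
    return records


# Hypothetical sample: addresses appear in both the second and third columns.
rows = [
    "REC_A,a1@x.com;,a2@x.com;",
    ",a3@x.com;,a4@x.com;",
    ",,a5@x.com;",
]
for record in merge_columns(rows):
    print(",".join(record))
```

Unlike a script that hard-codes the third column, this keeps whatever column layout the input has while still producing one line per label.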