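I have some plain-text tables that I need to output in CSV format. If I use tr to replace the characters, I run into problems with fields that span two lines.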
cat file.txt | tr -s '|' ' ' | tr -s '_' ' '
Original table:
____________________________________________________________________________
| Name | AB | DATA | SOME | IF | DATE |
|___________________________|_________|__________|_______|________|__(UTC)__|
| Marra Carolina Odoriz | | | | |2019-07- |
| Dolman |36737202 |098787267 | 45 | - |09T10:35:|
|____________________________|_________|__________|_______|________|_50.289Z_|
| | | | | |2019-07- |
| - |53959997 |098543650 | 30 | - |09T12:02:|
|____________________________|_________|__________|_______|________|_36.746Z_|
| | | | | |2019-07- |
| Vic Velazquez |33577915 |096638025 | - | 6000 |09T12:40:|
|____________________________|_________|__________|_______|________|_17.754Z_|
| Gabriela Letacia Cararallo | | | | |2019-07- |
| Vacchetzi |43132876 |091322398 | 30 | - |09T12:40:|
|____________________________|_________|__________|_______|________|_40.887Z_|
This is the CSV output I need for the sample table above:
NAME;AB;DATA;SOME;IF;DATE (UTC)
Marra Carolina Odoriz Dolman;36737202;098787267;45;-;2019-07-09T10:35:50.289Z
-;53959997;098543650;30;-;2019-07-09T12:02:36.746Z
Vic Velazquez;33577915;096638025;-;6000;2019-07-09T12:40:17.754Z
Gabriela Letacia Cararallo Vacchetzi;43132876;091322398;30;-;2019-07-09T12:40:40.887Z
If my raw multi-line input file does not have the "ASCII table" layout, can this partial solution be adapted to that case? I tried:
while(<>)
{
    @vals = split /\ /; # split fields into the val array (now I take the blank space)
    $size = @vals;
    for( $i = 0 ; $i < $size ; $i++ )
    {
        #clean up the values: remove underscores and extra spaces
        #remove semicolons
        $vals[$i] =~ s/_/ /g;
        $vals[$i] =~ s/;/ /g;
        $vals[$i] =~ s/^ *//;
        $vals[$i] =~ s/ *$//;
        # append the value to the data record for this field
        $data[$i] .= $vals[$i];
        # special handling for first field: use spaces when joining
        $data[$i] .= " " if ($i==0);
    }
    if(/\R/) # Taking four underscores to indicate the end of the record
             # now taking the return of carriage of new line how end of the record
    {
        # clean up the first record; trim spaces
        $data[0] =~ s/^ *//;
        $data[0] =~ s/ *$//;
        $data[3] =~ s/\..*//;
        # join the records with semicolons
        $line = join (";", @data);
        # collapse multiple spaces
        $line =~ s/ +/ /g;
        # print this line and start over
        print "$line\n" unless ($line eq '');
        @data = ();
    }
}
The result I get with this solution is:
NAME; FULL ;;;;;;;; AB ;;;;;;; DATA ;; SOME ;; DATE;(UTC) Marra; Carolina; Odoriz ;;;;; 36737202; 098787267; 45;-; 2019-07-09T10:35:50.289Z
Dolman ;;;
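Answer 0 (score: 0)
Try sed. There is a very similar example titled "replace pipes with commas". Your command would look more like this, since you only have a single pipe character as the separator: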
sed 's/|/,/g' input.csv >output.csv
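I also recommend checking whether the file already contains commas, since those will cause trouble. If the strings in the file are not enclosed in quote characters, you may want to delimit the file with ~ or a tab instead.
Answer 1 (score: 0)
Multi-line processing is hard in the shell, but easy in Perl.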
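The check suggested here can be automated before any substitution is made. Below is a minimal sketch in Python (not the sed approach itself), assuming pipe-separated lines like those in the question. The function name `cells_containing` and the sample rows are illustrative only and do not come from the original post.

```python
def cells_containing(lines, delimiter=",", separator="|"):
    """Return (line_number, cell) pairs for cells that already contain `delimiter`."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Drop the outer pipes, then look at each cell on the line.
        for cell in line.strip().strip(separator).split(separator):
            cell = cell.strip()
            if delimiter in cell:
                hits.append((number, cell))
    return hits


if __name__ == "__main__":
    # Hypothetical sample rows; the second one has a comma inside a cell.
    sample = [
        "| Vic Velazquez |33577915 | 30 |",
        "| Dolman, Marra |36737202 | 45 |",
    ]
    for number, cell in cells_containing(sample):
        print(f"line {number}: {cell!r} already contains the delimiter")
```

If the function reports any hits, a plain character substitution would shift the columns on those lines, so a different delimiter or quoted fields would be the safer choice.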
blocktab2csv.pl:
while(<>)
{
    chomp; # remove newline
    s/^\|//; # remove pipe at the start of the line
    @vals = split /\|/; # split fields into the val array
    $size = @vals;
    for( $i = 0 ; $i < $size ; $i++ )
    {
        #clean up the values: remove underscores and extra spaces
        $vals[$i] =~ s/_//g;
        $vals[$i] =~ s/^ *//;
        $vals[$i] =~ s/ *$//;
        # append the value to the data record for this field
        $data[$i] .= $vals[$i];
        # special handling for first field: use spaces when joining
        $data[$i] .= " " if ($i==0);
    }
    if(/____/) # Taking four underscores to indicate the end of the record
    {
        # clean up the first record; trim spaces
        $data[0] =~ s/^ *//;
        $data[0] =~ s/ *$//;
        # join the records with semicolons
        $line = join (";", @data);
        # collapse multiple spaces
        $line =~ s/ +/ /g;
        # print this line and start over
        print "$line\n" unless ($line eq '');
        @data = ();
    }
}
Then:
$ perl blocktab2csv.pl intable.txt > output.csv
output.csv:
Name;AB;DATA;SOME;IF;DATE(UTC)
Marra Carolina Odoriz Dolman;36737202;098787267;45;-;2019-07-09T10:35:50.289Z
-;53959997;098543650;30;-;2019-07-09T12:02:36.746Z
Vic Velazquez;33577915;096638025;-;6000;2019-07-09T12:40:17.754Z
Gabriela Letacia Cararallo Vacchetzi;43132876;091322398;30;-;2019-07-09T12:40:40.887Z
This assumes that your fields contain no semicolons. It would be easy to modify the script to handle them, though.
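One way to handle semicolons inside fields is to let a CSV library quote them. The sketch below is a Python rendering of the same idea as the Perl script (accumulate cells until a line with four underscores ends the record), not the answer's own code. The names `block_table_to_rows` and `rows_to_csv` and the small sample table are assumptions made for illustration.

```python
import csv
import io


def block_table_to_rows(lines):
    """Merge the physical lines of each table record into one list of fields."""
    rows, record = [], []
    for line in lines:
        line = line.strip()
        # As in the Perl script, four underscores mark the last line of a record.
        is_last = "____" in line
        # Remove exactly one leading and one trailing pipe, if present.
        if line.startswith("|"):
            line = line[1:]
        if line.endswith("|"):
            line = line[:-1]
        for i, cell in enumerate(line.split("|")):
            text = cell.replace("_", "").strip()
            if i >= len(record):
                record.append("")
            if i == 0 and record[0] and text:
                record[0] += " "  # join multi-line names with a space
            record[i] += text
        if is_last:
            if any(record):  # skip pure border lines
                rows.append(record)
            record = []
    return rows


def rows_to_csv(rows, delimiter=";"):
    """Serialize rows; csv.writer quotes any field that contains the delimiter."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical table; the name contains a semicolon on purpose.
    table = [
        "_____________________",
        "| Name | AB | DATE |",
        "|______|____|_(UTC)_|",
        "| Ann; | | 2019-07- |",
        "| Lee | 12 | 09T10:35: |",
        "|______|____|_50.289Z_|",
    ]
    print(rows_to_csv(block_table_to_rows(table)), end="")
```

Because the quoting is delegated to `csv.writer`, a field that contains the delimiter is wrapped in double quotes rather than silently splitting into two columns.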