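I wrote a shell script that is supposed to extract the values of certain named fields and put them into a CSV file.
A sample input file might contain the following lines: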
user_name: null@gmail.com
EMAIL: null@gmail.com
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
This block repeats 'n' times.
Here is an excerpt of the script I have written so far:
#!/bin/sh
echo "Please enter input file name."
read input_variable
echo "You entered: $input_variable"
echo "Please enter a name of the new output file."
read output_file
touch $output_file
echo "The output file name is going to be $output_file"
echo "Extracting files..." ;
awk '$1 ~ /^(user_name:|EMAIL:|FIRST_NAME:|LAST_NAME:|CREATION_DATE:|REGISTRATION_STATUS:)$/{printf "%s,",$2} $1 ~ /REGISTRATION_STATUS:/{print $2}' $input_variable >> $output_file.ib ;
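The data does get written to my output file (which must have a .csv extension to be opened in a GUI). However, when I open the file in a GUI such as OpenOffice Calc, many records are concatenated onto a single line, while other lines start on a new line as they should.
For example, one line might look like this: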
noway@gmail.com,noreally51,noway,username,username...x40 or so
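username,username,username .... In other words, it lists about 40-50 user names on a single line, and only then moves to the next line and prints the remaining information.
I would also like to add column names to the output file: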
VENDOR,user_name,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS
I cannot figure out how to do this.
Thank you for your time and all your support!
I edited my script as follows:
#!/bin/sh
echo "Please enter input file name."
read input_variable
echo "You entered: $input_variable"
echo "Please enter a name of the new output file."
touch output_file
read $output_file
echo "The output file name is going to be $output_file"
echo "Processing data extraction..." ;
awk -F": " n=25 -v 'NR<=n {h[NR-1]=$1} {a[NR%n-1]=$2} $1~/VENDOR/ && !hp{for(k=0;k<n;k++) printf "%s ", h[k] $input_variable && print "";hp=1} $1~/VENDOR/{for(k=0;k<n;k++) printf "%s ", a[k] && print ""}' data | column -t $input_variable ;
echo "Done."
This at least prints data to $output_file. However, the data in $output_file looks like this:
??ࡱ?;?? ???????? [... long run of "?" characters ...] Root Entry????????
@karakfa
This is the content of my script. I noticed that the first line of the script in your answer changed, so I modified my script to the following:
#!/bin/sh
echo "Please enter input file name."
read input_variable
echo "You entered: $input_variable"
echo "Please enter a name of the new output file."
touch output_file
read $output_file
echo "The output file name is going to be ${output_file}"
echo "Processing data extraction..." ;
cat $input_variable | awk -F": " -v OFS="," -v n=25
'NR<=n{sub(/^ */,"",$1);h[NR-1]=$1}
{a[(NR-1)%n]=$2}
$1~/VENDOR/ && !hp{line=h[0];
for(k=1;k<n;k++) line=line OFS h[k];
print line;hp=1
}
$1~/VENDOR/{line=a[0];
for(k=1;k<n;k++) line=line OFS a[k];
print line}' $input_variable ;
echo "Done."
The output is:
Please enter input file name.
inputfile.txt
You entered: allgmail.com_accounts.txt
Please enter a name of the new output file.
outputfile.csv
The output file name is going to be
Processing data extraction...
awk: no program given
./scriptname: line 23: NR<=n{sub(/^ */,"",$1);h[NR-1]=$1}
{a[(NR-1)%n]=$2}
$1~/VENDOR/ && !hp{line=h[0];
for(k=1;k<n;k++) line=line OFS h[k];
print line;hp=1
}
$1~/VENDOR/{line=a[0];
for(k=1;k<n;k++) line=line OFS a[k];
print line}: No such file or directory
Done.
I could not find any articles about the 'awk: no program given' error. Do you know what I am doing wrong?
I noticed that it mentions 'line 23', and line 23 is the following:
print line}' $input_variable ;
Then I noticed that the last line of the error also says the following:
print line}: No such file or directory
This happens with or without 'cat $input_variable |' before awk. In general, awk runs fine on my operating system, which is Mac 10.11.1 (15B42). Is #!/bin/sh incorrect?
I look forward to your thoughts. Thanks!
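Answer 0 (score: 2)
If all of your fields are always present, you can try the following awk script. The number of fields is set as a variable (7 in this case), and "VENDOR", the last field, is used as the end-of-record indicator.
Update: I had not noticed the CSV output requirement.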
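$ awk -F": " -v OFS="," -v n=7 '
    # first n lines: strip leading spaces and remember the field names
    NR<=n {sub(/^ */,"",$1); h[NR-1]=$1}
    # store each value at its position within the current record
    {a[(NR-1)%n]=$2}
    # on the first VENDOR line, print the header once
    $1~/VENDOR/ && !hp {line=h[0];
        for(k=1;k<n;k++) line=line OFS h[k];
        print line; hp=1
    }
    # VENDOR closes a record: print its values
    $1~/VENDOR/ {line=a[0];
        for(k=1;k<n;k++) line=line OFS a[k];
        print line}' inputfilename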
user_name,EMAIL,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS,VENDOR
null@gmail.com,null@gmail.com,jonathan,doestein,2013-08-01 01:08:52,Y,vendorname
The header is built from the first n lines and printed once; after that, each record is printed when its last field is seen.
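For the script in the question, two details matter. The awk program has to start on the same line as the awk command (or the preceding line has to end with a backslash); otherwise the shell runs awk with no program and then tries to execute the quoted text as a command. Also, `read` takes the variable name without a leading `$`. A minimal sketch of a wrapper follows, assuming the last line of every record is VENDOR. The function name `records_to_csv`, the file names, and the second sample record are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: records_to_csv INPUT OUTPUT N
# N is the number of "NAME: value" lines per record; VENDOR must be the last one.
records_to_csv() {
    # The opening quote sits on the awk line, so awk actually receives the program.
    awk -F": " -v OFS="," -v n="$3" '
        NR<=n {sub(/^ */,"",$1); h[NR-1]=$1}
        {a[(NR-1)%n]=$2}
        $1~/VENDOR/ && !hp {line=h[0]; for(k=1;k<n;k++) line=line OFS h[k]; print line; hp=1}
        $1~/VENDOR/ {line=a[0]; for(k=1;k<n;k++) line=line OFS a[k]; print line}
    ' "$1" > "$2"
}

# Demo input: the record from the question plus one invented record.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
user_name: null@gmail.com
EMAIL: null@gmail.com
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
user_name: jane@example.com
EMAIL: jane@example.com
FIRST_NAME: jane
LAST_NAME: roe
CREATION_DATE: 2014-02-03 04:05:06
REGISTRATION_STATUS: N
VENDOR: othervendor
EOF

records_to_csv "$workdir/sample.txt" "$workdir/out.csv" 7
cat "$workdir/out.csv"
```

In an interactive script, the same function could be called after `read input_variable` and `read output_file` (both without `$`), as `records_to_csv "$input_variable" "$output_file" 7`. Quoting the variables keeps file names with spaces intact.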
To move the last field to the front, you can change the code to
line=h[n-1];
for(k=0;k<n-1;k++) line=line OFS h[k];
in both places (changing the array name from "h" to "a" in the second instance).
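Put together, a sketch of the reordered version might look like the following. Here the loop starts at k=0 so that the first field (user_name) is kept after VENDOR is moved to the front. The temporary directory and file names are hypothetical.

```shell
#!/bin/sh
# Sample record taken from the question.
workdir=$(mktemp -d)
cat > "$workdir/records.txt" <<'EOF'
user_name: null@gmail.com
EMAIL: null@gmail.com
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
EOF

# Same logic as the answer, but each output line starts with element n-1 (VENDOR)
# and then appends elements 0 .. n-2 in their original order.
awk -F": " -v OFS="," -v n=7 '
    NR<=n {sub(/^ */,"",$1); h[NR-1]=$1}
    {a[(NR-1)%n]=$2}
    $1~/VENDOR/ && !hp {line=h[n-1]; for(k=0;k<n-1;k++) line=line OFS h[k]; print line; hp=1}
    $1~/VENDOR/ {line=a[n-1]; for(k=0;k<n-1;k++) line=line OFS a[k]; print line}
' "$workdir/records.txt" > "$workdir/vendor_first.csv"

cat "$workdir/vendor_first.csv"
```

The header and the data rows are reordered in the same way, so the columns stay aligned.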
Answer 1 (score: 1)
Why not use echo before the awk command?
echo VENDOR,user_name,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS > file
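One way this idea could be combined with awk is sketched below: the header is written with `>` (which creates or truncates the file) and the data rows are appended with `>>`. The awk part is an assumption rather than something from this answer. It looks values up by field name, so it assumes the names have no leading spaces and that VENDOR is the last line of each record. The file names are hypothetical.

```shell
#!/bin/sh
# Sample record taken from the question.
workdir=$(mktemp -d)
cat > "$workdir/in.txt" <<'EOF'
user_name: null@gmail.com
EMAIL: null@gmail.com
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
EOF

# 1) Write the header line; ">" starts the file fresh.
echo "VENDOR,user_name,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS" > "$workdir/out.csv"

# 2) Append one CSV row per record; ">>" keeps the header in place.
#    v[] maps each field name to its most recent value.
awk -F": " -v OFS="," '
    {v[$1]=$2}
    $1=="VENDOR" {print v["VENDOR"], v["user_name"], v["FIRST_NAME"], v["LAST_NAME"], v["CREATION_DATE"], v["REGISTRATION_STATUS"]}
' "$workdir/in.txt" >> "$workdir/out.csv"

cat "$workdir/out.csv"
```

Because the columns are selected by name, EMAIL is simply left out, which matches the header requested in the question.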