I have some files containing end-of-day stock data in the following format:
File name: NYSE_20120116.txt
<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120116,36.15,36.36,35.59,36.19,3327400
AA,20120116,10.73,10.78,10.53,10.64,20457600
How can I create one file per symbol? For example, for company A:
File name: A.txt
<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120116,36.15,36.36,35.59,36.19,3327400
A,20120117,39.76,40.39,39.7,39.99,4157900
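Answer 0 (score: 2)
So you want to split the first file at the record level, then route each line to a different file based on the value of its first field?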
# To skip the first (header) line, see further below
while IFS= read -r line; do
    # Careful: these are backticks, not quote signs
    # If your shell supports it, prefer:
    # symbol=$( echo "$line" | cut -f1 -d, )
    symbol=`echo "$line" | cut -f1 -d,`
    # If the file is not there yet, create it with a header
    # if [ ! -r "$symbol.txt" ]; then
    #     head -n 1 endday.txt > "$symbol.txt"
    # fi
    echo "$line" >> "$symbol.txt"
done < endday.txt
This is not very efficient: Perl or Python would do better.
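Much of the cost of the loop above comes from starting an `echo` and a `cut` process for every line. As a minimal sketch that stays in plain shell, the ticker can be taken from the line with parameter expansion instead. The function name `route_lines` is hypothetical, and the sketch assumes the first line of the input is the header and that tickers are valid file names:

```shell
#!/bin/sh
# Sketch: same routing as the loop above, without spawning echo/cut per line.
# route_lines is a hypothetical helper name; it expects the input file as $1.
route_lines() {
    # tail -n +2 starts at line 2, which skips the header
    tail -n +2 "$1" | while IFS= read -r line || [ -n "$line" ]; do
        symbol=${line%%,*}              # everything before the first comma
        [ -n "$symbol" ] || continue    # skip blank lines
        printf '%s\n' "$line" >> "$symbol.txt"
    done
}
```

Calling `route_lines endday.txt` appends each record to `A.txt`, `AA.txt` and so on. The `|| [ -n "$line" ]` part only makes sure a last line without a trailing newline is not dropped.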
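If there are several files in the directory (note that you have to move processed files out of the way yourself, otherwise they get processed again and again...), you can do this: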
for file in *.txt; do
    [ -e "$file" ] || continue    # nothing matched *.txt
    echo "Now processing $file..."
    # A quick and dirty way of ignoring line number 1: start at line 2.
    tail -n +2 "$file" | while IFS= read -r line; do
        # Careful: these are backticks, not quote signs
        # If your shell supports it, prefer:
        # symbol=$( echo "$line" | cut -f1 -d, )
        symbol=`echo "$line" | cut -f1 -d,`
        # If the file is not there yet, create it with a header
        # if [ ! -r "$symbol.csv" ]; then
        #     head -n 1 "$file" > "$symbol.csv"
        # fi
        # Output file is named .csv so as not to create new .txt files
        # which this script might find
        echo "$line" >> "$symbol.csv"
    done
    # Rename from .txt to .txt.ok, so it won't be found again
    mv "$file" "$file.ok"
    # or better, move it elsewhere to avoid clogging this directory
    # mv "$file" /var/data/files/already-processed
done
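For larger inputs, the per-line shell loop can be replaced by a single pass in one process. The sketch below uses awk for that, rather than the Perl or Python mentioned earlier. The function name `split_by_symbol` is hypothetical, and the sketch assumes every input file starts with the header line and that tickers are valid file names:

```shell
#!/bin/sh
# Sketch: split one or more end-of-day files into <ticker>.csv files
# with a single awk process instead of a per-line shell loop.
# split_by_symbol is a hypothetical helper name, not from the original answer.
split_by_symbol() {
    awk -F, '
        FNR == 1  { header = $0; next }   # line 1 of every input file is the header
        $1 == ""  { next }                # skip blank lines
        {
            out = $1 ".csv"
            if (!(out in seen)) {         # first time this ticker shows up in this run
                seen[out] = 1
                if ((getline probe < out) < 0)   # unreadable: file does not exist yet
                    print header > out
                close(out)
            }
            print >> out
            close(out)                    # keep the number of open files small
        }
    ' "$@"
}
```

A call such as `split_by_symbol NYSE_20120116.txt NYSE_20120117.txt` writes `A.csv`, `AA.csv` and so on, each starting with the header. It leaves the input files in place, so renaming or moving the processed files is still needed to avoid handling them twice.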