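Edit: Hi everyone, and thanks for your replies. My question is not how to fix the sample.csv I posted here. I have more than 100 similar files and I want to process all of them quickly and efficiently. I solved this with Python, but I would prefer sed, because I know sed can modify files directly. I don't want to run similar commands hundreds of times...
I have files that were generated daily over about 4 months. Each file contains 9 columns, and I now want to remove the last two columns from all of these files.
I intended to use sed with -i to delete the last two columns, so that I could modify all the files directly without writing new ones. Unfortunately I couldn't find a way to do that, so I wrote a Python script to do the job. Here is my code: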
def remove_last_two_columns(input_dir, output_dir, file_name):
    writer = open(output_dir + file_name, "w")
    with open(input_dir + file_name, "r") as inputs:
        for line in inputs:
            parts = line.strip().split(",")
            outline = ""
            # Keep only the first seven of the nine columns.
            for index, part in enumerate(parts):
                if index < 7:
                    outline += "," + part
            writer.write(outline[1:] + "\n")
    writer.close()

remove_last_two_columns("/home/haifzhan/input/", "/home/haifzhan/output/", "sample.csv")
Input:
C1,C2,2014-06-30 13:11:46,2014-07-01 00:19:12,43,N,N,N,N
C1,C2,2014-06-30 13:37:40,N,N,N,N,2014-07-01 00:37:22,N
C1,C2,2014-06-30 15:35:40,2014-07-01 00:23:14,36,N,N,N,N
C1,C2,2014-06-30 16:54:07,2014-07-01 00:08:38,35,N,N,N,N
C1,C2,2014-06-30 17:13:33,N,N,N,N,2014-07-01 00:25:55,N
C1,C2,2014-06-30 17:23:05,N,N,2014-07-01 00:26:03,13,N,N
C1,C2,2014-06-30 17:49:59,2014-07-01 02:46:20,11,N,N,N,N
C1,C2,2014-06-30 18:16:51,2014-07-01 06:15:25,20,N,N,N,N
C1,C2,2014-06-30 18:18:07,N,N,2014-07-01 00:02:22,24,N,N
C1,C2,2014-06-30 18:41:27,N,N,N,N,2014-07-01 00:52:22,N
My output:
C1,C2,2014-06-30 13:11:46,2014-07-01 00:19:12,43,N,N
C1,C2,2014-06-30 13:37:40,N,N,N,N
C1,C2,2014-06-30 15:35:40,2014-07-01 00:23:14,36,N,N
C1,C2,2014-06-30 16:54:07,2014-07-01 00:08:38,35,N,N
C1,C2,2014-06-30 17:13:33,N,N,N,N
C1,C2,2014-06-30 17:23:05,N,N,2014-07-01 00:26:03,13
C1,C2,2014-06-30 17:49:59,2014-07-01 02:46:20,11,N,N
C1,C2,2014-06-30 18:16:51,2014-07-01 06:15:25,20,N,N
C1,C2,2014-06-30 18:18:07,N,N,2014-07-01 00:02:22,24
C1,C2,2014-06-30 18:41:27,N,N,N,N
Can anyone provide a sed/awk way to achieve this? I would like to use sed/awk in my future work. Thanks in advance.
Answer 0 (score: 3)
Awk solution:
awk 'BEGIN{FS=OFS=","}NF=(NF-2)' file
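Note that awk writes its result to standard output rather than editing the file, so applying the one-liner above to many files takes a loop. The following is a minimal sketch, assuming a POSIX shell, `mktemp`, and an awk that rebuilds the record when NF is assigned (gawk and mawk do). The function name `awk_drop_last_two` is made up for illustration.

```shell
# Hypothetical helper: drop the last two comma-separated fields of every
# file passed as an argument, writing the result back to the same file.
awk_drop_last_two() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Rows with two or fewer fields are left unchanged, so NF never
        # becomes zero or negative.
        if awk 'BEGIN { FS = OFS = "," } { if (NF > 2) NF -= 2; print }' "$f" > "$tmp"; then
            # Copy the contents back instead of moving the temp file, so the
            # original file keeps its permissions.
            cat "$tmp" > "$f"
            rm -f "$tmp"
        else
            rm -f "$tmp"
            return 1
        fi
    done
}

# Example call (the directory is the one from the question):
# awk_drop_last_two /home/haifzhan/input/*.csv
```

If gawk 4.1 or later is available, its `-i inplace` option is another way to avoid the temporary file.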
Answer 1 (score: 2)
cut is definitely the simplest tool for this:
cat input | cut -d, -f8,9 --complement
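Note that the cut shipped with OS X is the BSD version, which does not support --complement, so it is best to install GNU coreutils:
brew install coreutils
Homebrew installs the GNU tools with a g prefix by default, so the command to run afterwards is gcut.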
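Because every file in the question has exactly nine columns, selecting fields 1 to 7 gives the same result as complementing fields 8 and 9, and it works without the GNU-only `--complement` flag. A small sketch under that nine-column assumption; the function name `keep_first_seven` is invented for this example.

```shell
# Hypothetical helper: print the first seven comma-separated fields of a file.
# Equivalent to dropping fields 8 and 9 only when each row has nine columns.
keep_first_seven() {
    cut -d, -f1-7 -- "$1"
}

# Example call:
# keep_first_seven sample.csv > trimmed.csv
```

Like the original command, this writes to standard output, so the result still has to be redirected to a new file.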
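Answer 2 (score: 2)
This command removes the last two columns, where sample.csv is the name of the input file. The expression is quoted so that the shell does not try to expand the brackets and asterisks itself.
sed 's/,[^,]*,[^,]*$//g' sample.csv
My result: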
C1,C2,2014-06-30 13:11:46,2014-07-01 00:19:12,43,N,N
C1,C2,2014-06-30 13:37:40,N,N,N,N
C1,C2,2014-06-30 15:35:40,2014-07-01 00:23:14,36,N,N
C1,C2,2014-06-30 16:54:07,2014-07-01 00:08:38,35,N,N
C1,C2,2014-06-30 17:13:33,N,N,N,N
C1,C2,2014-06-30 17:23:05,N,N,2014-07-01 00:26:03,13
C1,C2,2014-06-30 17:49:59,2014-07-01 02:46:20,11,N,N
C1,C2,2014-06-30 18:16:51,2014-07-01 06:15:25,20,N,N
C1,C2,2014-06-30 18:18:07,N,N,2014-07-01 00:02:22,24
C1,C2,2014-06-30 18:41:27,N,N,N,N
To remove the last three columns instead, extend the expression with one more field:
sed 's/,[^,]*,[^,]*,[^,]*$//g' sample.csv
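Since the question asks specifically for in-place editing of many files, here is a hedged sketch of how the two-column expression could be combined with `-i`. The `-i` option is not part of POSIX; the attached-suffix form `-i.bak` is used here because both GNU sed and BSD sed accept it, and it leaves a backup copy of every file. The function name `sed_drop_last_two` is hypothetical.

```shell
# Hypothetical helper: strip the last two comma-separated fields from each
# file passed as an argument, editing in place and keeping a .bak backup.
sed_drop_last_two() {
    # The g flag is omitted: the pattern is anchored with $, so it can
    # match at most once per line.
    sed -i.bak 's/,[^,]*,[^,]*$//' "$@"
}

# Example call (the directory is the one from the question):
# sed_drop_last_two /home/haifzhan/input/*.csv
```

Running it a second time on the same files would strip two more columns, so the backups are worth keeping until the result has been checked.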