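How to parse CSV data so that certain rows are displayed as columns

Time: 2019-07-31 19:59:30

Tags: python csv

I want to turn row values into columns, based on the values in a specific column.

Raw data

For example: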

_time                         action    file
2019-07-24T02:01:02.930-0400    get     abc
2019-07-24T00:30:10.927-0400    put     abc
2019-07-24T05:01:02.930-0400    get     def
2019-07-24T04:30:10.927-0400    put     def

and so on.

I want the output to be:

File  put                            get
abc  2019-07-24T00:30:10.927-0400    2019-07-24T02:01:02.930-0400
def  2019-07-24T04:30:10.927-0400    2019-07-24T05:01:02.930-0400

I thought this could be done in a for loop, so I tried:

with open('raw.csv','r') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        for line in csv_reader:
                if line[0] != "":
                        file = line[0]
                           if line[1] == "get" and file in {file}
                                   gettime = line[1]
                           if line[1] == "put" and file in {file}
                                   puttime = line[1]
                print file,puttime,gettime

This does not work.

2 answers:

Answer 0: (score: 1)

Use the pandas pivot_table routine:

In [9]: df.pivot_table(index='file', columns='action', values='_time', aggfunc='first')                           
Out[9]: 
action                           get                           put
file                                                              
abc     2019-07-24T02:01:02.930-0400  2019-07-24T00:30:10.927-0400
def     2019-07-24T05:01:02.930-0400  2019-07-24T04:30:10.927-0400
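
The snippet above assumes a DataFrame named df already exists. Here is a minimal sketch of the whole round trip, assuming the CSV is comma-separated and has the header row shown in the question (_time, action, file). The inline sample data and the name table are for illustration only; in practice you would pass your own file path to pd.read_csv.

```python
import io

import pandas as pd

# Sample rows copied from the question. Replace the StringIO object with
# the path to your own file (e.g. 'raw.csv', as in the question).
raw = io.StringIO(
    "_time,action,file\n"
    "2019-07-24T02:01:02.930-0400,get,abc\n"
    "2019-07-24T00:30:10.927-0400,put,abc\n"
    "2019-07-24T05:01:02.930-0400,get,def\n"
    "2019-07-24T04:30:10.927-0400,put,def\n"
)

df = pd.read_csv(raw)

# One row per file, one column per action. 'first' keeps the first
# timestamp seen if a file/action pair occurs more than once.
table = df.pivot_table(index="file", columns="action",
                       values="_time", aggfunc="first")

# Put the columns in the order the question asks for.
table = table[["put", "get"]]

print(table)
```

If you need the result on disk, table.to_csv(...) with a path of your choice should write it out with file as the first column.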

Answer 1: (score: 0)

If you don't have pandas, you can do it in pure Python with a simple loop:

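import csv

files = {}

# newline='' is how the csv module recommends opening files in Python 3
with open('/tmp/input.csv', newline='') as infile:
  reader = csv.reader(infile, delimiter=',')
  next(reader, None)  # skip the header row; drop this line if the file has none
  for (time, action, file) in reader:
    # one inner dict per file, mapping action -> time
    files.setdefault(file, {})[action] = time

with open('/tmp/output.csv', 'w', newline='') as outfile:
  writer = csv.writer(outfile, delimiter=',')
  writer.writerow(('File', 'put', 'get'))
  for (key, val) in files.items():
    # .get() leaves the cell empty when a file has no 'put' or 'get' row
    writer.writerow((key, val.get('put'), val.get('get')))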
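
If the column order in the file might change, reading the columns by name rather than by position may be more robust. This is a rough sketch of that variant, assuming the header names from the question (_time, action, file). pivot_actions is a made-up helper name, and the sample rows are the ones from the question.

```python
import csv
import io
from collections import defaultdict


def pivot_actions(lines):
    """Return {file: {action: time}} from CSV lines that start with a header row."""
    pivoted = defaultdict(dict)
    for row in csv.DictReader(lines):
        # Later rows overwrite earlier ones for the same file/action pair.
        pivoted[row["file"]][row["action"]] = row["_time"]
    return dict(pivoted)


# Sample rows from the question; an open file object works the same way.
sample = io.StringIO(
    "_time,action,file\n"
    "2019-07-24T02:01:02.930-0400,get,abc\n"
    "2019-07-24T00:30:10.927-0400,put,abc\n"
    "2019-07-24T05:01:02.930-0400,get,def\n"
    "2019-07-24T04:30:10.927-0400,put,def\n"
)

result = pivot_actions(sample)

for name, actions in result.items():
    # .get() with a default avoids a KeyError when an action is missing.
    print(name, actions.get("put", ""), actions.get("get", ""))
```

Because the helper takes any iterable of lines, it can be fed an open file, a StringIO, or a list of strings, which makes it easy to try out on a few rows before pointing it at the real file.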