I have a .csv file containing 40 rows of weather station data, like this:
Date Station PET Max Temp Min Temp
2/11/2016 Conroe 0.09 70 33
2/11/2016 Huntsville 0.11 69 33
2/11/2016 Overton 0.14 67 34
2/11/2016 Allen 0.11 71 32
2/11/2016 Dallas AgriLife Center 0.17 71 37
2/11/2016 Forney 0.13 70 35
I am trying to use pandas to extract each station's data from this file and write it to a separate .csv file per station.
I tried this code:
import pandas as pd
df = pd.read_csv('C:\\Desktop\\report.csv')
for Station in df:
    df[Station].to_csv('C:\\data\\'+ Station +'.csv')
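But this code extracts the data column by column, so one file is created for each column rather than for each station.
Please help... Is there a way to iterate row by row and extract the data, for example by looping over each row and creating a CSV file for each station?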
Answer 0 (score: 1)
df =pd.DataFrame({'Date': {0: '2/11/2016', 1: '2/11/2016', 2: '2/11/2016', 3: '2/11/2016', 4: '2/11/2016', 5: '2/11/2016'}, 'PET': {0: 0.089999999999999997, 1: 0.11, 2: 0.14000000000000001, 3: 0.11, 4: 0.17000000000000001, 5: 0.13}, 'Max Temp': {0: 70, 1: 69, 2: 67, 3: 71, 4: 71, 5: 70}, 'Station': {0: 'Conroe', 1: 'Huntsville', 2: 'Overton', 3: 'Allen', 4: 'Dallas Agri Life Center', 5: 'Forney'}, 'Min Temp': {0: 33, 1: 33, 2: 34, 3: 32, 4: 37, 5: 35}})
df.groupby('Station').apply(lambda x : pd.DataFrame.to_csv(x, x['Station'].values[0] + '.csv'))
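The same grouping can also be written as an explicit loop, which may be easier to read and does not depend on how `apply` hands the grouping column to the function. This is a minimal sketch, not the answerer's code: the helper name `write_station_files`, the `out_dir` argument, and the temporary directory are assumptions made for illustration, and the second Conroe row in the sample is invented only to show several rows landing in one file.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_station_files(df, out_dir):
    """Write one CSV per distinct value in the 'Station' column.

    Hypothetical helper for illustration; returns the paths it wrote.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Iterating a GroupBy yields (group key, sub-DataFrame) pairs.
    for station, rows in df.groupby('Station'):
        path = out_dir / f"{station}.csv"
        rows.to_csv(path, index=False)
        written.append(path)
    return written


# Small sample in the shape of the question's data.
# The 2/12/2016 row is made up for this example.
sample = pd.DataFrame({
    'Date': ['2/11/2016', '2/11/2016', '2/12/2016'],
    'Station': ['Conroe', 'Allen', 'Conroe'],
    'PET': [0.09, 0.11, 0.10],
    'Max Temp': [70, 71, 68],
    'Min Temp': [33, 32, 35],
})

demo_dir = Path(tempfile.mkdtemp())
paths = write_station_files(sample, demo_dir)
print(sorted(p.name for p in paths))
```

For the real data, `out_dir` would be something like the `C:\\data\\` folder from the question, and `sample` would be the DataFrame returned by `pd.read_csv`.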
Answer 1 (score: 1)
df[Station] simply selects a column. What you want to do is the following:
In pseudocode:
for each station in stations:
    select the rows for that station and put them in a separate data_frame
when done, write each data frame to a file.
This is not hard to implement in pandas either. Here is how:
for name in df.Station.unique():
   ....:     print(df[df.Station == name])
   ....:
Date Station PET Max Temp Min Temp
0 2/11/2016 Conroe 0.09 70 33
Date Station PET Max Temp Min Temp
1 2/11/2016 Huntsville 0.11 69 33
Date Station PET Max Temp Min Temp
2 2/11/2016 Overton 0.14 67 34
Date Station PET Max Temp Min Temp
3 2/11/2016 Allen 0.11 71 32
Date Station PET Max Temp Min Temp
4 2/11/2016 Dallas AgriLife Center 0.17 71 37
Date Station PET Max Temp Min Temp
5 2/11/2016 Forney 0.13 70 35
This only prints, but you can replace the print with a write to a new csv:
In [54]: for name in df.Station.unique():
   ....:     df[df.Station == name].to_csv(name+'.csv')
   ....:
In [55]: ls
Allen.csv Conroe.csv Dallas AgriLife Center.csv foo.csv Forney.csv Huntsville.csv Overton.csv stations.csv
Each file now contains the data you want.
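One detail worth knowing: by default `to_csv` also writes the DataFrame index as an unnamed first column, so files written this way start with the original row number. If that is not wanted, passing `index=False` leaves it out. A minimal sketch, assuming a temporary output directory and two rows taken from the question's data; the variable names `out_dir` and `huntsville` are illustrative:

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({
    'Date': ['2/11/2016', '2/11/2016'],
    'Station': ['Conroe', 'Huntsville'],
    'PET': [0.09, 0.11],
    'Max Temp': [70, 69],
    'Min Temp': [33, 33],
})

# Temporary directory so the example does not touch real files.
out_dir = Path(tempfile.mkdtemp())

for name in df['Station'].unique():
    # The boolean mask keeps only the rows for this station.
    df[df['Station'] == name].to_csv(out_dir / f"{name}.csv", index=False)

# Read one file back to check which columns were written.
huntsville = pd.read_csv(out_dir / 'Huntsville.csv')
print(list(huntsville.columns))
```

Using `unique()` here means each station's file is written once, even when the station appears on many rows.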