I want to create a dictionary from the table below.
ID    ArCityArCountry        DptCityDptCountry      DateDpt     DateAr
1922  ParisFrance            NewYorkUnitedState     2008-03-10  2001-02-02
1002  LosAngelesUnitedState  CaliforniaUnitedState  2008-03-10  2008-12-01
1901  ParisFrance            LagosNigeria           2001-03-05  2001-02-02
1922  ParisFrance            NewYorkUnitedState     2011-02-03  2008-12-01
1002  ParisFrance            CaliforniaUnitedState  2003-03-04  2002-03-04
1099  ParisFrance            BeijingChina           2011-02-03  2009-02-04
1901  LosAngelesUnitedState  ParisFrance            2001-03-05  2001-02-02
Expected output, and the code I have so far:
import csv
import pandas as pd

# Read the raw file and merge each city column with its country column.
with open("testfile.csv", newline="") as src:
    data = [[row[0], row[1] + row[2], row[3] + row[4], row[5], row[6]]
            for row in csv.reader(src)]
print(data)

# Write the merged rows to a new CSV file.
with open("data.csv", "w", newline="") as dst:
    csv.writer(dst).writerows(data)

df = pd.read_csv("data.csv")
# Parse both date columns once; looping over the DataFrame is not needed.
df["DateDpt"] = pd.to_datetime(df["DateDpt"], format="%Y-%m-%d")
df["DateAr"] = pd.to_datetime(df["DateAr"], format="%Y-%m-%d")
print(df)

# Group by arrival city and print the records of each group.
dept_cities = df.groupby("ArCityArCountry")
for city, departures in dept_cities:
    print(city)
    # The column is called 'ID' in the table, not 'AuthorID'.
    print([list(r) for r in departures.loc[:, ["ID", "DptCityDptCountry", "DateDpt", "DateAr"]].to_records()])
Note: I want to group by ArCityArCountry and DptCityDptCountry.
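Grouping on both columns could look roughly like the sketch below. It rebuilds the table inline rather than reading the CSV, and it assumes the stray space in "California UnitedState" is a typo. The name `groups` is only illustrative. Passing a list of two columns to `groupby` yields one group per (arrival, departure) pair.

```python
import pandas as pd

# Rows copied from the table in the question (row 2 normalised to
# "CaliforniaUnitedState", which is an assumption).
rows = [
    (1922, "ParisFrance", "NewYorkUnitedState", "2008-03-10", "2001-02-02"),
    (1002, "LosAngelesUnitedState", "CaliforniaUnitedState", "2008-03-10", "2008-12-01"),
    (1901, "ParisFrance", "LagosNigeria", "2001-03-05", "2001-02-02"),
    (1922, "ParisFrance", "NewYorkUnitedState", "2011-02-03", "2008-12-01"),
    (1002, "ParisFrance", "CaliforniaUnitedState", "2003-03-04", "2002-03-04"),
    (1099, "ParisFrance", "BeijingChina", "2011-02-03", "2009-02-04"),
    (1901, "LosAngelesUnitedState", "ParisFrance", "2001-03-05", "2001-02-02"),
]
columns = ["ID", "ArCityArCountry", "DptCityDptCountry", "DateDpt", "DateAr"]
df = pd.DataFrame(rows, columns=columns)

# Group on both columns; each key is an (arrival, departure) tuple and
# each value is the list of records that share that pair.
groups = {
    key: group[["ID", "DateDpt", "DateAr"]].to_dict("records")
    for key, group in df.groupby(["ArCityArCountry", "DptCityDptCountry"])
}

for key, records in groups.items():
    print(key, records)
```

The dictionary keys are tuples such as `("ParisFrance", "NewYorkUnitedState")`, so a single pair can be looked up directly.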
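You will notice that I did not include DateDpt. For a given time period, I want to select every ID whose stay, from DateAr to DateDpt, overlaps that period and who was actually in ParisFrance or CaliforniaUnitedStates.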
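For example, Mr A was in Paris from 1999-10-02 until 2013-12-12, and Mr B arrived in Paris on 2010-11-04 and left on 2012-09-09. This means that Mr A and Mr B were in Paris at the same time, because Mr B's visit to Paris took place while Mr A was there.
ParisFrance = { DateAr, ID, ArCityArCountry, DptCityDptCountry}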
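One possible way to express that selection is sketched below. It assumes a stay in the arrival city runs from DateAr to DateDpt, and that a stay counts when it overlaps the requested period at all. The helper `stays_in_city` and the 2002 period are hypothetical, and only rows with DateAr before DateDpt are copied from the table.

```python
import pandas as pd

# A few rows copied from the table in the question.
rows = [
    (1922, "ParisFrance", "NewYorkUnitedState", "2008-03-10", "2001-02-02"),
    (1901, "ParisFrance", "LagosNigeria", "2001-03-05", "2001-02-02"),
    (1002, "ParisFrance", "CaliforniaUnitedState", "2003-03-04", "2002-03-04"),
    (1099, "ParisFrance", "BeijingChina", "2011-02-03", "2009-02-04"),
    (1901, "LosAngelesUnitedState", "ParisFrance", "2001-03-05", "2001-02-02"),
]
columns = ["ID", "ArCityArCountry", "DptCityDptCountry", "DateDpt", "DateAr"]
df = pd.DataFrame(rows, columns=columns)
for col in ("DateDpt", "DateAr"):
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d")


def stays_in_city(frame, city, start, end):
    """Hypothetical helper: records of stays in `city` overlapping [start, end]."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    # Two intervals overlap when each one starts before the other ends.
    mask = (
        (frame["ArCityArCountry"] == city)
        & (frame["DateAr"] <= end)
        & (frame["DateDpt"] >= start)
    )
    wanted = ["DateAr", "ID", "ArCityArCountry", "DptCityDptCountry"]
    return frame.loc[mask, wanted].to_dict("records")


# Dictionary keyed by city, in the shape given in the question.
result = {"ParisFrance": stays_in_city(df, "ParisFrance", "2002-01-01", "2002-12-31")}
print(result)
```

With these rows, IDs 1922 and 1002 are selected for Paris in 2002, because their stays include part of that year, while 1901 and 1099 are not.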