我正在尝试使用(pandas)在Python中逐行使用脚本。我想要它,如果列A ='Something'和B列> 25然后写入C列=“类别”。
我有月份列和日期列。所以,例如:
当月= 8月和日> = 25然后周= 8月25日
我尝试了几件事,但都没有效果......
首先我尝试了:
import os ### OS library is imported.
import pandas as pd ### Pandas library is imporated as 'pd'.
counter = 1 ### Counter starts at the first iteration.
while os.path.exists("CSV-Iteration-"'{0}'"/".format(counter)): ### Runs the loop until all iteration's folders have been processed.
a = pd.read_csv("output-"'{0}'".csv".format(counter)) ### Sets 'a' dataframe as holding data from a CSV file.
a['Week'] = ""
a[(a['Month'] is 'June') & (a['Day'] < 25)]['Week'] = 'June 18'
a[(a['Month'] is 'June') & (a['Day'] >= 25)]['Week'] = 'June 25'
a[(a['Month'] is 'July') & (a['Day'] < 2)]['Week'] = 'June 25'
a[(a['Month'] is 'July') & (a['Day'] >= 2) & (a['Day'] < 9)]['Week'] = 'July 2'
a[(a['Month'] is 'July') & (a['Day'] >= 9) & (a['Day'] < 16)]['Week'] = 'July 9'
a[(a['Month'] is 'July') & (a['Day'] >= 16) & (a['Day'] < 23)]['Week'] = 'July 16'
a[(a['Month'] is 'July') & (a['Day'] >= 23) & (a['Day'] < 30)]['Week'] = 'July 23'
a[(a['Month'] is 'July') & (a['Day'] >= 31) & (a['Day'] < 16)]['Week'] = 'July 30'
a[(a['Month'] is 'August') & (a['Day'] < 6)]['Week'] = 'July 30'
a[(a['Month'] is 'August') & (a['Day'] >= 6) & (a['Day'] < 13)]['Week'] = 'August 6'
a[(a['Month'] is 'August') & (a['Day'] >= 13) & (a['Day'] < 20)]['Week'] = 'August 13'
a[(a['Month'] is 'August') & (a['Day'] >= 20) & (a['Day'] < 27)]['Week'] = 'August 20'
a[(a['Month'] is 'August') & (a['Day'] >= 27)]['Week'] = 'August 27'
a[(a['Month'] is 'September') & (a['Day'] < 3)]['Week'] = 'August 27'
a[(a['Month'] is 'September') & (a['Day'] >= 3) & (a['Day'] < 10)]['Week'] = 'September 3'
a[(a['Month'] is 'September') & (a['Day'] >= 10) & (a['Day'] < 17)]['Week'] = 'September 10'
a[(a['Month'] is 'September') & (a['Day'] >= 17) & (a['Day'] < 24)]['Week'] = 'September 17'
a[(a['Month'] is 'September') & (a['Day'] >= 24)] = 'September 24'
a[(a['Month'] is 'October') & (a['Day'] >= 1) & (a['Day'] < 8)]['Week'] = 'October 1'
a[(a['Month'] is 'October') & (a['Day'] >= 8) & (a['Day'] < 15)]['Week'] = 'October 8'
a[(a['Month'] is 'October') & (a['Day'] >= 15) & (a['Day'] < 22)]['Week'] = 'October 15'
a[(a['Month'] is 'October') & (a['Day'] >= 22) & (a['Day'] < 29)]['Week'] = 'October 22'
a[(a['Month'] is 'October') & (a['Day'] >= 29)]['Week'] = 'October 29'
a[(a['Month'] is 'November') & (a['Day'] < 5)]['Week'] = 'October 29'
a[(a['Month'] is 'November') & (a['Day'] >= 5) & (a['Day'] < 12)]['Week'] = 'November 5'
a[(a['Month'] is 'November') & (a['Day'] >= 12) & (a['Day'] < 19)]['Week'] = 'November 12'
a[(a['Month'] is 'November') & (a['Day'] >= 19) & (a['Day'] < 26)]['Week'] = 'November 19'
a[(a['Month'] is 'November') & (a['Day'] >= 26)]['Week'] = 'November 26'
a[(a['Month'] is 'December') & (a['Day'] < 3)]['Week'] = 'November 26'
a[(a['Month'] is 'December') & (a['Day'] >= 3) & (a['Day'] < 10)]['Week'] = 'December 3'
a[(a['Month'] is 'December') & (a['Day'] >= 10) & (a['Day'] < 17)]['Week'] = 'December 10'
a[(a['Month'] is 'December') & (a['Day'] >= 17) & (a['Day'] < 24)]['Week'] = 'December 17'
a[(a['Month'] is 'December') & (a['Day'] >= 24) & (a['Day'] < 31)]['Week'] = 'December 24'
a[(a['Month'] is 'December') & (a['Day'] >= 31)]['Week'] = 'December 31'
a[(a['Month'] is 'January') & (a['Day'] < 7)]['Week'] = 'December 31'
a[(a['Month'] is 'January') & (a['Day'] >= 7) & (a['Day'] < 14)]['Week'] = 'January 7'
a[(a['Month'] is 'January') & (a['Day'] >= 14) & (a['Day'] < 21)]['Week'] = 'January 14'
a[(a['Month'] is 'January') & (a['Day'] >= 21) & (a['Day'] < 28)]['Week'] = 'January 21'
a[(a['Month'] is 'January') & (a['Day'] >= 28)]['Week'] = 'January 28'
a.to_csv("TESToutput-"'{0}'".csv".format(counter), index=False) ### 'a' dataframe becomes 'TESToutput-#.csv' and does not print fields for indexing (index=False).
counter += 1 ### Adds 1 to the counter.
print 'Date Corrections - All Done!'
然后我尝试了:
import os ### OS library is imported.
import pandas as pd ### Pandas library is imporated as 'pd'.
counter = 1 ### Counter starts at the first iteration.
while os.path.exists("CSV-Iteration-"'{0}'"/".format(counter)): ### Runs the loop until all iteration's folders have been processed.
a = pd.read_csv("output-"'{0}'".csv".format(counter)) ### Sets 'a' dataframe as holding data from a CSV file.
a['Week'] = ""
def this_week (row):
if row[(a['Month'] is 'June') + (a['Day'] < 25)]:
return 'June 18'
if row[(a['Month'] is 'June') + (a['Day'] >= 25)]:
return 'June 25'
if row[(a['Month'] is 'July') + (a['Day'] < 2)]:
return 'June 25'
if row[(a['Month'] is 'July') + (a['Day'] >= 2) + (a['Day'] < 9)]:
return 'July 2'
if row[(a['Month'] is 'July') + (a['Day'] >= 9) + (a['Day'] < 16)]:
return 'July 9'
if row[(a['Month'] is 'July') + (a['Day'] >= 16) + (a['Day'] < 23)]:
return 'July 16'
if row[(a['Month'] is 'July') + (a['Day'] >= 23) + (a['Day'] < 30)]:
return 'July 23'
if row[(a['Month'] is 'July') + (a['Day'] >= 31) + (a['Day'] < 16)]:
return 'July 30'
if row[(a['Month'] is 'August') + (a['Day'] < 6)]:
return 'July 30'
if row[(a['Month'] is 'August') + (a['Day'] >= 6) + (a['Day'] < 13)]:
return 'August 6'
if row[(a['Month'] is 'August') + (a['Day'] >= 13) + (a['Day'] < 20)]:
return 'August 13'
if row[(a['Month'] is 'August') + (a['Day'] >= 20) + (a['Day'] < 27)]:
return 'August 20'
if row[(a['Month'] is 'August') + (a['Day'] >= 27)]:
return 'August 27'
if row[(a['Month'] is 'September') + (a['Day'] < 3)]:
return 'August 27'
if row[(a['Month'] is 'September') + (a['Day'] >= 3) + (a['Day'] < 10)]:
return 'September 3'
if row[(a['Month'] is 'September') + (a['Day'] >= 10) + (a['Day'] < 17)]:
return 'September 10'
if row[(a['Month'] is 'September') + (a['Day'] >= 17) + (a['Day'] < 24)]:
return 'September 17'
if row[(a['Month'] is 'September') + (a['Day'] >= 24)]:
return 'September 24'
if row[(a['Month'] is 'October') + (a['Day'] >= 1) + (a['Day'] < 8)]:
return 'October 1'
if row[(a['Month'] is 'October') + (a['Day'] >= 8) + (a['Day'] < 15)]:
return 'October 8'
if row[(a['Month'] is 'October') + (a['Day'] >= 15) + (a['Day'] < 22)]:
return 'October 15'
if row[(a['Month'] is 'October') + (a['Day'] >= 22) + (a['Day'] < 29)]:
return 'October 22'
if row[(a['Month'] is 'October') + (a['Day'] >= 29)]:
return 'October 29'
if row[(a['Month'] is 'November') + (a['Day'] < 5)]:
return 'October 29'
if row[(a['Month'] is 'November') + (a['Day'] >= 5) + (a['Day'] < 12)]:
return 'November 5'
if row[(a['Month'] is 'November') + (a['Day'] >= 12) + (a['Day'] < 19)]:
return 'November 12'
if row[(a['Month'] is 'November') + (a['Day'] >= 19) + (a['Day'] < 26)]:
return 'November 19'
if row[(a['Month'] is 'November') + (a['Day'] >= 26)]:
return 'November 26'
if row[(a['Month'] is 'December') + (a['Day'] < 3)]:
return 'November 26'
if row[(a['Month'] is 'December') + (a['Day'] >= 3) + (a['Day'] < 10)]:
return 'December 3'
if row[(a['Month'] is 'December') + (a['Day'] >= 10) + (a['Day'] < 17)]:
return 'December 10'
if row[(a['Month'] is 'December') + (a['Day'] >= 17) + (a['Day'] < 24)]:
return 'December 17'
if row[(a['Month'] is 'December') + (a['Day'] >= 24) + (a['Day'] < 31)]:
return 'December 24'
if row[(a['Month'] is 'December') + (a['Day'] >= 31)]:
return 'December 31'
if row[(a['Month'] is 'January') + (a['Day'] < 7)]:
return 'December 31'
if row[(a['Month'] is 'January') + (a['Day'] >= 7) + (a['Day'] < 14)]:
return 'January 7'
if row[(a['Month'] is 'January') + (a['Day'] >= 14) + (a['Day'] < 21)]:
return 'January 14'
if row[(a['Month'] is 'January') + (a['Day'] >= 21) + (a['Day'] < 28)]:
return 'January 21'
if row[(a['Month'] is 'January') + (a['Day'] >= 28)]:
return 'January 28'
a['Week'] = a.apply (lambda row: this_week (row), axis=1)
a.to_csv("TESToutput-"'{0}'".csv".format(counter), index=False) ### 'a' dataframe becomes 'TESToutput-#.csv' and does not print fields for indexing (index=False).
counter += 1 ### Adds 1 to the counter.
print 'Date Corrections - All Done!'
第二个给了我这个错误:“IndexingError :('Unalignable boolean Series key provided',u'occurred at index 0')”
我是Python的新手,所以我根据我在论坛中阅读的内容将这些内容放在一起。如果有更简单的方法,或者如果有更正或补充使这两个脚本中的一个有效,请告诉我。
谢谢!
=============================================== =====
这是数据文件的样子(减去多余的列)。
Month Day C.Sym F.Sym D.Sym
September 3 1 1
September 27 1
October 14 1
October 15 1
October 17 1
October 21 1
October 29 1
November 30 1
December 16 1 1
December 17 1
December 27 1
January 6 1
January 8 1
January 20 1
我想添加一个列,用于检查Month和Day列以分配“Week”IE。下面:
Month Day C.Sym F.Sym D.Sym Week
September 3 1 1 Sept 3
September 27 1 Sept 24
October 14 1 Oct 8
October 15 1 Oct 15
October 17 1 Oct 15
October 21 1 Oct 15
October 29 1 Oct 29
November 30 1 Oct 29
December 16 1 1 Dec 10
December 17 1 Dec 10
December 27 1 Dec 24
January 6 1 Dec 31
January 8 1 Jan 7
January 20 1 Jan 14
我现在想要合并的一个elif的例子:
elif a[(a['Month'] is 'January') & (a['Day'] >= 14) & (a['Day'] < 21)]:
['Week'] = 'January 14'
我希望这更具体,更有帮助...
答案 0 :(得分:0)
>>> df = pd.DataFrame({'column_A': ['something', 'day', 'something'], 'column_B' : [30, 40, 10]})
>>> df
column_A column_B
0 something 30
1 day 40
2 something 10
>>> df = df.assign(column_C=((df.column_A == 'something') & (df.column_B > 25)))
>>> df.column_C.replace(True, 'something', inplace=True)
>>> df
column_A column_B column_C
0 something 30 something
1 day 40 False
2 something 10 False