I have a data frame like the one below. Based on a few conditions, I need to retrieve a value from its columns.
Wifi_User1 Wifi_User2 Wifi_User3 Thermostat Act_User1 Act_User2 Act_User3
-58 -48 -60 18 0 1 0
-60 -56 -75 18 0 1 1
-45 -60 -45 18 0 1 1
-67 -45 -60 18 1 0 1
-40 -65 -65 18 1 0 1
-55 -78 -74 18 1 0 0
-55 -45 -65 18 1 0 0
-67 -45 -44 18 0 0 0
-65 -68 -70 18 0 0 0
-70 -70 -65 24 0 0 0
-72 -56 -45 24 0 1 0
-75 -45 -60 24 0 1 0
-77 -48 -65 24 0 0 0
The conditions are as follows:
if ((Wifi_User1 == Wifi_User2) or (Wifi_User2 == Wifi_User3)
    or (Wifi_User3 == Wifi_User1) or (Wifi_User1 == Wifi_User2 == Wifi_User3))
   and the thermostat value is changing
then
    scan the Act_User1, Act_User2, Act_User3 columns for the first instance of 1
    before the thermostat value changes.
    If it's Act_User1, return 1
    else if it's Act_User2, return 2
    else return 3
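For example, in the dataset above, in row 10 Wifi_User1 == Wifi_User2 and the thermostat value changes from 18 to 24.
In this case I scan Act_User1, Act_User2 and Act_User3 and see that the first instance of 1 occurs in Act_User1, so I need to return the value 1 in a new column on that particular row.
Please help me solve this; I am new to Python and still exploring it.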
Answer 0 (score: 1)
To answer the first part of your question, transcribe the if statement as follows:
wifi_user_equality = (
    (df.Wifi_User1 == df.Wifi_User2)
    | (df.Wifi_User2 == df.Wifi_User3)
    | (df.Wifi_User3 == df.Wifi_User1)
)
thermostat_change = df.Thermostat != df.Thermostat.shift(1)
Then select all rows where both conditions are True:
df[wifi_user_equality & thermostat_change]
Wifi_User1 Wifi_User2 Wifi_User3 Thermostat Act_User1 Act_User2 Act_User3
9 -70 -70 -65 24 0 0.0 0.0
Or, if you only want the indices of those rows:
df.index[(wifi_user_equality & thermostat_change)]
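If you want to run the first part end to end, here is a minimal sketch that rebuilds the data from the question and computes both masks. The `io.StringIO` / `read_csv` loading and the name `thermostat_change_strict` are my own additions for illustration. One thing worth noticing: `shift(1)` puts NaN in the first row, so the first row always compares as "changed"; the strict variant masks that out, assuming you do not want the very first row to count as a change.

```python
import io
import pandas as pd

# Data copied from the question (whitespace-separated).
raw = """Wifi_User1 Wifi_User2 Wifi_User3 Thermostat Act_User1 Act_User2 Act_User3
-58 -48 -60 18 0 1 0
-60 -56 -75 18 0 1 1
-45 -60 -45 18 0 1 1
-67 -45 -60 18 1 0 1
-40 -65 -65 18 1 0 1
-55 -78 -74 18 1 0 0
-55 -45 -65 18 1 0 0
-67 -45 -44 18 0 0 0
-65 -68 -70 18 0 0 0
-70 -70 -65 24 0 0 0
-72 -56 -45 24 0 1 0
-75 -45 -60 24 0 1 0
-77 -48 -65 24 0 0 0
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# True where at least two of the three Wifi readings are equal.
wifi_user_equality = (
    (df.Wifi_User1 == df.Wifi_User2)
    | (df.Wifi_User2 == df.Wifi_User3)
    | (df.Wifi_User3 == df.Wifi_User1)
)

# True where the thermostat differs from the previous row.
# The first row is compared with NaN, so it is always True here.
previous = df.Thermostat.shift(1)
thermostat_change = df.Thermostat != previous

# Hypothetical stricter variant: ignore the first row, which has no predecessor.
thermostat_change_strict = thermostat_change & previous.notna()

matches = df.index[wifi_user_equality & thermostat_change].tolist()
print(matches)  # [9]
```

With this data both variants select only index 9, because the first row has no equal Wifi readings. On other data the first row could slip through the non-strict mask.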
The second part of your question is trickier, but here is one solution:
import numpy as np

# We add the first index element too
zero = df.index == df.index[0]
# Get the list of indices where the condition is satisfied, in reverse order
idx = list(df.index[(wifi_user_equality & thermostat_change) | zero][::-1])
for i, index in enumerate(idx):
    if index > 0:
        # I use a try/except block in case it cannot find an occurrence of 1
        # (all previous act users are 0).
        # Might not be needed in your specific application
        try:
            # Rows from the previous matching index up to the row before this one
            x = df.loc[idx[i+1]:(index-1), ['Act_User1', 'Act_User2', 'Act_User3']]
            # Column position (1-based) of the last 1 found in that window
            col_of_first_1 = np.where(x == 1)[1][-1] + 1
        except IndexError:
            col_of_first_1 = 'Not Found'
        # Assign to a new column
        df.loc[index, 'Last_Act_User'] = col_of_first_1
I modified your data to get a more complex case:
Wifi_User1 Wifi_User2 Wifi_User3 Thermostat Act_User1 Act_User2 Act_User3
-70 -70 -65 24 0 0 0
-77 -48 -65 24 0 0 0
-58 -48 -48 18 0 1 0
-60 -56 -75 18 0 1 1
-45 -60 -45 18 0 1 1
-67 -45 -60 18 1 0 1
-40 -65 -65 18 1 0 1
-55 -78 -74 18 1 0 0
-55 -45 -65 18 1 0 0
-67 -45 -44 18 0 0 0
-65 -68 -70 18 0 0 0
-70 -70 -65 24 0 0 0
-72 -56 -45 24 0 1 0
-75 -45 -60 24 0 1 0
-77 -48 -65 24 0 0 0
This gives the following df:
Wifi_User1 Wifi_User2 Wifi_User3 Thermostat Act_User1 Act_User2 \
0 -70 -70 -65 24 0 0
1 -77 -48 -65 24 0 0
2 -58 -48 -48 18 0 1
3 -60 -56 -75 18 0 1
4 -45 -60 -45 18 0 1
5 -67 -45 -60 18 1 0
6 -40 -65 -65 18 1 0
7 -55 -78 -74 18 1 0
8 -55 -45 -65 18 1 0
9 -67 -45 -44 18 0 0
10 -65 -68 -70 18 0 0
11 -70 -70 -65 24 0 0
12 -72 -56 -45 24 0 1
13 -75 -45 -60 24 0 1
14 -77 -48 -65 24 0 0
Act_User3 Last_Act_User
0 0 NaN
1 0 NaN
2 0 Not Found
3 1 NaN
4 1 NaN
5 1 NaN
6 1 NaN
7 0 NaN
8 0 NaN
9 0 NaN
10 0 NaN
11 0 1
12 0 NaN
13 0 NaN
14 0 NaN
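As an alternative, the same idea can be sketched as a small function that returns a dict instead of writing a mixed string/number column. This is only a sketch under a few assumptions of mine: the function name `last_active_user` and the constant `ACT_COLS` are hypothetical, the DataFrame is assumed to have the default integer index, the first row is not counted as a thermostat change, and when several users are active in the same row it picks the lowest-numbered one (following the "Act_User1, else Act_User2, else 3" order in the question), which differs from the loop above in that tie case.

```python
import io
import pandas as pd

ACT_COLS = ["Act_User1", "Act_User2", "Act_User3"]

def last_active_user(df):
    """Map each matching row index to the most recently active user (1-3), or None."""
    equal = (
        (df.Wifi_User1 == df.Wifi_User2)
        | (df.Wifi_User2 == df.Wifi_User3)
        | (df.Wifi_User3 == df.Wifi_User1)
    )
    previous = df.Thermostat.shift(1)
    # Assumption: the first row (no predecessor) is not treated as a change.
    changed = previous.notna() & (df.Thermostat != previous)
    triggers = df.index[equal & changed].tolist()

    result = {}
    start = df.index[0]
    for t in triggers:
        # Rows from the previous trigger (or the first row) up to the row before t.
        # Label slicing with .loc is inclusive; assumes a default integer index.
        window = df.loc[start:t - 1, ACT_COLS]
        active_rows = window[window.eq(1).any(axis=1)]
        if active_rows.empty:
            result[t] = None  # nobody was active in this window
        else:
            last = active_rows.iloc[-1]
            # idxmax on a boolean Series returns the first label that is True.
            result[t] = ACT_COLS.index(last.eq(1).idxmax()) + 1
        start = t
    return result

# Modified data from the answer above.
raw = """Wifi_User1 Wifi_User2 Wifi_User3 Thermostat Act_User1 Act_User2 Act_User3
-70 -70 -65 24 0 0 0
-77 -48 -65 24 0 0 0
-58 -48 -48 18 0 1 0
-60 -56 -75 18 0 1 1
-45 -60 -45 18 0 1 1
-67 -45 -60 18 1 0 1
-40 -65 -65 18 1 0 1
-55 -78 -74 18 1 0 0
-55 -45 -65 18 1 0 0
-67 -45 -44 18 0 0 0
-65 -68 -70 18 0 0 0
-70 -70 -65 24 0 0 0
-72 -56 -45 24 0 1 0
-75 -45 -60 24 0 1 0
-77 -48 -65 24 0 0 0
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
result = last_active_user(df)
print(result)  # {2: None, 11: 1}
```

The dict keys are the row indices where both conditions hold, so the values can be written back with something like `df["Last_Act_User"] = pd.Series(result)` if a column is preferred.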