I am trying to write nested if/else statements with pandas, but I am not very comfortable using if statements in pandas. Below are the sample CSV data being processed and the code snippet I have written so far.
df:
t1
8
1134
0
119
122
446
21
0
138
0
Current if/else logic:
import pandas as pd

df = pd.read_csv('file.csv', sep=';')

def get_cost(df):
    t_zone = 720
    max_rate = 5.5
    rate = 0.0208
    duration = df['t1']
    if duration < t_zone:
        if (duration * rate) >= max_rate:
            return max_rate
        else:
            return duration * rate
    else:
        if duration >= 720:
            x = int(duration / 720)
            y = (duration % 720) * rate
            if y >= max_rate:
                return (x * max_rate) + max_rate
            else:
                return (x * max_rate) + y

cost = get_cost(df)
This snippet raises an error. If anyone has a better solution, or can help translate this if/else statement into a more pandas-like form, that would be great!
Answer 0 (score: 3)
Unless absolutely necessary, loops and if statements are inefficient in pandas. Here is a fully vectorized solution using pandas and NumPy:
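import numpy as np  # Needs numpy, too

# Constants taken from the question's get_cost function
t_zone = 720
max_rate = 5.5
rate = 0.0208

x = df['t1'] // 720 * max_rate  # Note the use of // (floor division)!
y = df['t1'] % 720 * rate
df['cost'] = np.where(df['t1'] < t_zone,
                      np.minimum(df['t1'] * rate, max_rate),
                      np.minimum(y, max_rate) + x)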
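As a quick sanity check, the vectorized expression can be compared against a plain scalar version of the same rule on the sample values from the question. This is only a sketch: `scalar_cost` is a hypothetical helper written for the comparison, the constants are renamed to upper case, and the DataFrame is built inline rather than read from `file.csv`.

```python
import numpy as np
import pandas as pd

T_ZONE = 720      # zone length used in the question
MAX_RATE = 5.5    # cap applied per zone
RATE = 0.0208     # cost per unit of t1

def scalar_cost(duration):
    """Hypothetical reference version of the question's rule for one value."""
    full_zones = duration // T_ZONE
    remainder_cost = min((duration % T_ZONE) * RATE, MAX_RATE)
    return full_zones * MAX_RATE + remainder_cost

# Sample values copied from the question
df = pd.DataFrame({'t1': [8, 1134, 0, 119, 122, 446, 21, 0, 138, 0]})

x = df['t1'] // T_ZONE * MAX_RATE
y = df['t1'] % T_ZONE * RATE
df['cost'] = np.where(df['t1'] < T_ZONE,
                      np.minimum(df['t1'] * RATE, MAX_RATE),
                      np.minimum(y, MAX_RATE) + x)

# Compare the vectorized column with the value-by-value computation
expected = [scalar_cost(t) for t in df['t1']]
matches = np.allclose(df['cost'], expected)
print(matches)
```

Because `t_zone` equals 720 here, both `np.where` branches give the same value for `t1 < 720` (there `x` is 0 and `y` is `t1 * rate`), so the condition would only matter if the two numbers were set differently.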
Answer 1 (score: 2)
Try this solution. It applies the function row by row instead of passing the whole DataFrame.
import pandas as pd

df = pd.read_csv('file.csv')

def get_cost(row):
    # row is a single DataFrame row, so row['t1'] is one scalar value
    t_zone = 720
    max_rate = 5.5
    rate = 0.0208
    duration = row['t1']
    if duration < t_zone:
        if (duration * rate) >= max_rate:
            return max_rate
        else:
            return duration * rate
    else:
        if duration >= 720:
            x = int(duration / 720)
            y = (duration % 720) * rate
            if y >= max_rate:
                return (x * max_rate) + max_rate
            else:
                return (x * max_rate) + y

df['cost'] = df.apply(get_cost, axis=1)
You could also assign the result back to the same column. Here I assigned it to a new column named "cost".
Output:
     t1     cost
0     8   0.1664
1  1134  11.0000
2     0   0.0000
3   119   2.4752
4   122   2.5376
5   446   5.5000
6    21   0.4368
7     0   0.0000
8   138   2.8704
9     0   0.0000
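Since the rule depends only on `t1`, one possible variant is to apply a scalar function to that single column with `Series.apply` rather than passing whole rows with `axis=1`. The sketch below assumes the same constants as above; `cost_for_duration` is a hypothetical name, and the data is built inline instead of being read from `file.csv`.

```python
import pandas as pd

T_ZONE = 720
MAX_RATE = 5.5
RATE = 0.0208

def cost_for_duration(duration):
    """Hypothetical scalar version of get_cost: takes one t1 value, not a row."""
    if duration < T_ZONE:
        return min(duration * RATE, MAX_RATE)
    full_zones = int(duration / T_ZONE)
    remainder_cost = (duration % T_ZONE) * RATE
    return full_zones * MAX_RATE + min(remainder_cost, MAX_RATE)

# Sample values copied from the question
df = pd.DataFrame({'t1': [8, 1134, 0, 119, 122, 446, 21, 0, 138, 0]})

# Series.apply calls the function once per element of the column
df['cost'] = df['t1'].apply(cost_for_duration)
print(df)
```

This still calls a Python function once per value, so it is a readability variant of the row-wise approach rather than a vectorized one.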
Answer 2 (score: 1)
You should iterate over the durations rather than comparing the whole column directly with a number. You can do it like this.
import pandas as pd

df = pd.read_csv('file.csv', sep=';')

def get_cost(df):
    t_zone = 720
    max_rate = 5.5
    rate = 0.0208
    duration = df['t1']
    ratecol = []
    # Compare one value at a time instead of the whole column
    for i in duration:
        if i < t_zone:
            if (i * rate) >= max_rate:
                ratecol.append(max_rate)
            else:
                ratecol.append(i * rate)
        else:
            if i >= 720:
                x = int(i / 720)
                y = (i % 720) * rate
                if y >= max_rate:
                    ratecol.append((x * max_rate) + max_rate)
                else:
                    ratecol.append((x * max_rate) + y)
    return ratecol

df['cost'] = get_cost(df)
This code produces exactly the same result as the one posted in the previous answer.
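The reason iteration (or vectorization) is needed is that `df['t1'] < 720` yields a boolean Series rather than a single True or False, and pandas refuses to reduce it implicitly inside an `if`. A minimal sketch of that behaviour, using a few inline values instead of `file.csv`:

```python
import pandas as pd

df = pd.DataFrame({'t1': [8, 1134, 0]})

# Element-wise comparison: the result is a boolean Series, one entry per row
mask = df['t1'] < 720
print(mask.tolist())

try:
    if mask:  # a Series has no single truth value, so this raises
        pass
    raised = False
except ValueError as exc:
    raised = True
    print(type(exc).__name__)

# Comparing one element at a time works, which is what the loop above does
per_item = [bool(value < 720) for value in df['t1']]
print(per_item)
```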