Suppose I have the following table:
random_string|end_location|substring
-------------|------------|---------
HappyBirthday|           4|Happ
GoodBye      |           5|GoodB
NaN          |         NaN|NaN
Haensel      |           2|Ha
...          |         ...|...
This table represents the desired output. The initial input is only the first two columns. The question is: how do I get there elegantly? Rows with no content should not be processed.
I tried the following:
df['random_string'].str[0:4] or [0:5]
The problem with this approach is that every string gets cut to the same length, which is not the intended result.
I also tried another approach. It works, but it feels rather inelegant and inefficient. Can this be done more elegantly, perhaps in a vectorized way? Maybe something with apply could work?
Answer 0 (score: 2)
Try this:
In [24]: df
Out[24]:
   random_string  end_location
0  HappyBirthday           4.0
1        GoodBye           5.0
2            NaN           NaN
3        Haensel           2.0
In [25]: mask = (df['random_string'].str.len() >= 0) & (df['end_location'] >= 0)
In [26]: df[mask]
Out[26]:
   random_string  end_location
0  HappyBirthday           4.0
1        GoodBye           5.0
3        Haensel           2.0
In [27]: df.loc[mask, 'substring'] = [t[0][:int(t[1])] for t in df[mask].values.tolist()]
In [28]: df
Out[28]:
   random_string  end_location substring
0  HappyBirthday           4.0      Happ
1        GoodBye           5.0     GoodB
2            NaN           NaN       NaN
3        Haensel           2.0        Ha
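For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The sample frame is reconstructed from the output above, and `notna()` is swapped in for the `>= 0` comparisons as an assumption that it selects the same rows here; it is not the exact code from the session.

```python
import numpy as np
import pandas as pd

# Sample input reconstructed from the frame shown above (an assumption).
df = pd.DataFrame({
    "random_string": ["HappyBirthday", "GoodBye", np.nan, "Haensel"],
    "end_location": [4, 5, np.nan, 2],
})

# Keep only rows where both the string and the cut position are present.
# notna() stands in for the `>= 0` comparisons used in the session.
mask = df["random_string"].notna() & df["end_location"].notna()

# Slice each remaining string up to its own end position.
pairs = df.loc[mask, ["random_string", "end_location"]].values.tolist()
df.loc[mask, "substring"] = [s[:int(n)] for s, n in pairs]

print(df)
```

Rows excluded by the mask are never touched, so their `substring` stays NaN.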
Timing for a larger (40K rows) DF:
In [179]: df = pd.concat([df] * 10**4, ignore_index=True)
In [40]: %%timeit
...: mask = (df['random_string'].str.len() >= 0) & (df['end_location'] >= 0)
...: [t[0][:int(t[1])] for t in df[mask].values.tolist()]
...:
10 loops, best of 3: 77.3 ms per loop
In [41]: df.shape
Out[41]: (40000, 2)
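The `%%timeit` magic only exists inside IPython. A rough equivalent with the standard-library `timeit` module might look like the sketch below. The names `small`, `big` and `cut` are hypothetical, the sample data is reconstructed from the frame above, `notna()` replaces the `>= 0` mask, and the measured time will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Four-row sample reconstructed from the answer (an assumption).
small = pd.DataFrame({
    "random_string": ["HappyBirthday", "GoodBye", np.nan, "Haensel"],
    "end_location": [4, 5, np.nan, 2],
})

# Repeat the sample 10**4 times to get the 40K-row frame.
big = pd.concat([small] * 10**4, ignore_index=True)


def cut(df):
    """Return the per-row substrings for rows that have both values."""
    mask = df["random_string"].notna() & df["end_location"].notna()
    pairs = df.loc[mask, ["random_string", "end_location"]].values.tolist()
    return [s[:int(n)] for s, n in pairs]


# Best of 3 repeats, 5 calls each, reported per call.
best = min(timeit.repeat(lambda: cut(big), number=5, repeat=3)) / 5
print(f"{big.shape[0]} rows: {best * 1e3:.1f} ms per call")
```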
Answer 1 (score: 2)
Use a subset of the rows. Given df:
df
   random_string  end_location
0  HappyBirthday           4.0
1        GoodBye           5.0
2            NaN           NaN
3        Haensel           2.0
Drop the rows that contain NaN, build the substrings from the remaining rows, and assign them back by index:
d1 = df.dropna()
rs = d1.random_string.values.tolist()
el = d1.end_location.values.astype(int).tolist()  # Thx @MaxU for `astype(int)`
df.loc[d1.index, 'substring'] = [s[:n] for s, n in zip(rs, el)]
   random_string  end_location substring
0  HappyBirthday           4.0      Happ
1        GoodBye           5.0     GoodB
2            NaN           NaN       NaN
3        Haensel           2.0        Ha
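Since the question asks whether something with apply could work, here is a hedged sketch comparing a row-wise `DataFrame.apply` with the zip-based comprehension. The sample frame is reconstructed from the output above, and the names `by_zip` and `by_apply` are made up for illustration. The apply version calls a Python function once per row, so it is unlikely to be faster than the comprehension, but it yields the same values.

```python
import numpy as np
import pandas as pd

# Sample input reconstructed from the frame shown above (an assumption).
df = pd.DataFrame({
    "random_string": ["HappyBirthday", "GoodBye", np.nan, "Haensel"],
    "end_location": [4, 5, np.nan, 2],
})

# Work only on rows where both columns are present.
d1 = df.dropna()

# zip-based list comprehension over plain Python lists.
by_zip = [
    s[:n]
    for s, n in zip(d1["random_string"].tolist(),
                    d1["end_location"].astype(int).tolist())
]

# Row-wise apply: one Python call per row, result indexed like d1.
by_apply = d1.apply(
    lambda row: row["random_string"][: int(row["end_location"])],
    axis=1,
)

# Assigning a Series aligns on the index, so the dropped row stays NaN.
df["substring"] = by_apply
print(df)
```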