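I have a pandas DataFrame in which one column holds text descriptions. I need to create a new column that indicates whether any string from a list appears in each description.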
import pandas as pd

df = pd.DataFrame({'Description': [
    '2 Bedroom/1.5 Bathroom end unit Townhouse. Available now!',
    'Very spacious studio apartment available',
    ' Two bedroom, 1 bathroom condominium, superbly located in downtown']})
list_ = ['unit', 'apartment']
The result should then be:
   Description                                         in list
0  2 Bedroom/1.5 Bathroom end unit Townhouse. Av...    True
1  Very spacious studio apartment available            True
2  Two bedroom, 1 bathroom condominium, superbly...    False
I can do it like this:
for i in df.index.values:
    df.loc[i, 'in list'] = any(w in df.loc[i, 'Description'] for w in list_)
But on a large dataset this takes longer than I would like.
Answer 0 (score: 2)
Use str.contains:
list_ = ['unit', 'apartment']
df.Description.str.contains('|'.join(list_))
Out[724]:
0 True
1 True
2 False
Name: Description, dtype: bool
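Note that '|'.join(list_) builds a regular expression, so a keyword containing regex metacharacters (such as '.' or '+') is treated as a pattern rather than as literal text. Below is a minimal sketch of one way to guard against that, assuming the keywords should be matched literally. The sample data and the keyword '1.5' are made up for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({'Description': [
    '2 Bedroom/1.5 Bathroom end unit Townhouse. Available now!',
    'Very spacious studio apartment available',
    'Two bedroom, 1 bathroom condominium, 105 sq m',
]})
keywords = ['1.5', 'apartment']  # hypothetical keywords; '.' is a regex metacharacter

# Escape each keyword so it is matched literally, then join with '|' (regex OR).
pattern = '|'.join(re.escape(w) for w in keywords)
df['in list'] = df['Description'].str.contains(pattern, regex=True)

# Without escaping, the pattern '1.5' also matches '105' in the third row.
unescaped = df['Description'].str.contains('|'.join(keywords), regex=True)
```

If the keywords are plain words, as in the question, escaping changes nothing, so it is a cheap safeguard.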
Answer 1 (score: 1)
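Use np.char.find with broadcasting:
import numpy as np
v = df.Description.values.astype('U')[:, None]
# np.char.find returns -1 when the substring is absent, so a match is any index >= 0
df['in list'] = (np.char.find(v, list_) >= 0).any(axis=1)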
df
   Description                                         in list
0  2 Bedroom/1.5 Bathroom end unit Townhouse. Av...    True
1  Very spacious studio apartment available            True
2  Two bedroom, 1 bathroom condominium, superbly...    False
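One detail worth checking in the NumPy approach: np.char.find returns the index of the first match, or -1 when there is none, so a keyword at the very start of a description yields 0. The following small sketch, using made-up descriptions rather than the data from the question, shows why the comparison should be against >= 0.

```python
import numpy as np

descriptions = np.array(['unit with garden', 'studio apartment', 'condominium'])
keywords = ['unit', 'apartment']

# Shape (3, 1) broadcast against (2,) gives one column per keyword.
positions = np.char.find(descriptions[:, None], keywords)

# -1 means "not found"; 0 is a valid match at the start of the string.
in_list = (positions >= 0).any(axis=1)
```

Here the first description starts with 'unit', so its position is 0. A test of `> 0` would wrongly report it as not matching.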