My data
data = [{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
{"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
{"content": "3", "title": "ches", "info": "", "time": 1582877014},
{"content": "aa", "title": "ap", "info": "", "time": 1582876014},
{"content": "15", "title": "apple", "info": "", "time": 1581877014},
{"content": "16", "title": "banana", "info": "", "time": 1561877014},
]
My code
import pandas as pd

index = [i['content'] for i in data]
s = pd.Series(data, index)
# The next line raises the AttributeError shown below
print((s[s.str.get('title').contains('ches', regex=True)]))
Error
AttributeError: 'Series' object has no attribute 'contains'
I want to filter the items by title. How should I use contains? Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html#pandas.Series.str.contains
I want the resulting data to be
[
{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
{"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
{"content": "3", "title": "ches", "info": "", "time": 1582877014},
]
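Answer 0 (score: 2)
It is best to use a structure that fits your data. Use a DataFrame.
A DataFrame provides better column and row operations. Your data is two-dimensional: it consists of items, and each item has attributes with values. It therefore fits a 2D structure such as a DataFrame, not a 1D structure such as a Series.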
>>> df = pd.DataFrame(data)
>>> df
   content     title  info        time
0        1  chestnut        1578877014
1        2  chestnut        1579877014
2        3      ches        1582877014
3       aa        ap        1582876014
4       15     apple        1581877014
5       16    banana        1561877014
>>> df[df.title.str.contains('ches')]
   content     title  info        time
0        1  chestnut        1578877014
1        2  chestnut        1579877014
2        3      ches        1582877014
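The question asks for a list of dicts rather than a DataFrame. A minimal sketch of one way to get there, assuming the filtered rows should go back to the original shape: `DataFrame.to_dict("records")` returns one dict per row. The names `mask` and `result` are only illustrative, and `regex=False` is an optional choice here that treats 'ches' as a literal substring.

```python
import pandas as pd

data = [
    {"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
    {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
    {"content": "3", "title": "ches", "info": "", "time": 1582877014},
    {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
    {"content": "15", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]

df = pd.DataFrame(data)

# Boolean mask: True where the title contains the literal substring 'ches'
mask = df["title"].str.contains("ches", regex=False)

# Convert the matching rows back to a list of dicts, one dict per row
result = df[mask].to_dict("records")
print(result)
```

Note that `contains` matches anywhere in the string, so a title such as "peaches" would also match; `str.startswith` may be a better fit if only prefixes should count.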
For a Series (not recommended)
s[s.apply(lambda x: x.get('title')).str.contains('ches')]
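As for why the original code fails: `s.str.get('title')` returns an ordinary Series of titles, and string methods such as `contains` live on the `.str` accessor, not on the Series itself. The sketch below, which assumes the same `data` as in the question, therefore goes through `.str` a second time; the names `titles` and `filtered` are only illustrative.

```python
import pandas as pd

data = [
    {"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
    {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
    {"content": "3", "title": "ches", "info": "", "time": 1582877014},
    {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
    {"content": "15", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]

index = [d["content"] for d in data]
s = pd.Series(data, index=index)  # a Series whose values are dicts

# .str.get pulls the 'title' value out of each dict and returns a new Series
titles = s.str.get("title")

# String methods need the .str accessor again on that new Series
filtered = s[titles.str.contains("ches")]
print(filtered.tolist())
```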