Question

My data

data = [{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
     {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
     {"content": "3", "title": "ches", "info": "", "time": 1582877014},
     {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
     {"content": "15", "title": "apple", "info": "", "time": 1581877014},
     {"content": "16", "title": "banana", "info": "", "time": 1561877014},
     ]

My code

import pandas as pd

index=[i['content'] for i in data]

s=pd.Series(data,index)
print((s[s.str.get('title').contains('ches',regex=True)]))

The error

AttributeError: 'Series' object has no attribute 'contains'

I want to filter the data like this. How should I use contains? Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html#pandas.Series.str.contains

I expect the result to be

[
{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
{"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
{"content": "3", "title": "ches", "info": "", "time": 1582877014},
]

Answer 1

It is best to use a structure that matches your data. Use a DataFrame.

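A DataFrame offers better row and column operations. Your data is two-dimensional: it has items, and each item has attributes with values. It therefore fits a 2D structure like DataFrame rather than a 1D structure like Series.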

>>> df = pd.DataFrame(data)
>>> df
  content     title info        time
0       1  chestnut       1578877014
1       2  chestnut       1579877014
2       3      ches       1582877014
3      aa        ap       1582876014
4      15     apple       1581877014
5      16    banana       1561877014

>>> df[df.title.str.contains('ches')]
  content     title info        time
0       1  chestnut       1578877014
1       2  chestnut       1579877014
2       3      ches       1582877014
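
Since the question asks for a list of dicts rather than a DataFrame, the filtered rows can be converted back. The following is a minimal sketch, assuming the same `data` list as in the question; the names `mask` and `result` are illustrative, and `regex=False` is an optional choice here that treats 'ches' as a literal substring.

```python
import pandas as pd

data = [{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
        {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
        {"content": "3", "title": "ches", "info": "", "time": 1582877014},
        {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
        {"content": "15", "title": "apple", "info": "", "time": 1581877014},
        {"content": "16", "title": "banana", "info": "", "time": 1561877014},
        ]

df = pd.DataFrame(data)

# Boolean mask: True for rows whose title contains the substring 'ches'.
mask = df["title"].str.contains("ches", regex=False)

# Convert the matching rows back into a list of dicts, one dict per row.
result = df[mask].to_dict("records")
print(result)
```

This keeps the filtering in pandas and only converts to plain Python objects at the end.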

With a Series (not recommended)

s[s.apply(lambda x: x.get('title')).str.contains('ches')]
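
As for why the original code failed: `s.str.get('title')` already returns an ordinary Series of strings, and `contains` lives on the `.str` accessor, not on the Series itself. A minimal sketch of the original approach with the accessor added back, assuming the same `data` as in the question (the names `titles`, `mask`, and `filtered` are illustrative):

```python
import pandas as pd

data = [{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
        {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
        {"content": "3", "title": "ches", "info": "", "time": 1582877014},
        {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
        {"content": "15", "title": "apple", "info": "", "time": 1581877014},
        {"content": "16", "title": "banana", "info": "", "time": 1561877014},
        ]

# Series of dicts, indexed by each item's 'content' value.
s = pd.Series(data, index=[d["content"] for d in data])

# .str.get('title') pulls the 'title' value out of each dict,
# producing a plain Series of strings.
titles = s.str.get("title")

# String methods such as contains must go through .str again.
mask = titles.str.contains("ches")

filtered = s[mask]
print(filtered.tolist())
```

The result is still a Series of dicts, so `filtered.tolist()` gives the list the question asks for.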

