I have a list of objects that looks like this:
Bidder - Timestamp: 11, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.20 -- missed
Bidder - Timestamp: 13, User ID: 13, Action: BID, Loan ID: 430, Rate: 0.15
Bidder - Timestamp: 17, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.10 -- missed
Bidder - Timestamp: 18, User ID: 1, Action: BID, Loan ID: 431, Rate: 0.15
Bidder - Timestamp: 19, User ID: 3, Action: BID, Loan ID: 431, Rate: 0.14
Bidder - Timestamp: 21, User ID: 3, Action: BID, Loan ID: 431, Rate: 0.14
I am trying to find all matching User IDs, but it is proving cumbersome: my code ends up missing one of the matching IDs shown above.
Here is the code I currently have, which takes the first element and compares it with the next one:
self._bidders = []  # List of bidders
for idx, firstElement in enumerate(self._bidders):
    FirstElement = firstElement
    NextElement = self._bidders[(idx + 1) % len(self._bidders)]
    if FirstElement.user_id == NextElement.user_id:
        # Do something
How can I make sure I get all matching User IDs without missing any, and without using any imports? Any suggestions or help would be greatly appreciated.
Answer 0: (score: 1)
Here is one way to convert the list of objects into a DataFrame, from which you can easily find matches:
import pandas as pd

# Create a list of strings, one per bidder
data = ['Bidder - Timestamp: 11, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.20',
        'Bidder - Timestamp: 13, User ID: 13, Action: BID, Loan ID: 430, Rate: 0.15',
        'Bidder - Timestamp: 17, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.10',
        'Bidder - Timestamp: 18, User ID: 1, Action: BID, Loan ID: 431, Rate: 0.15',
        'Bidder - Timestamp: 19, User ID: 3, Action: BID, Loan ID: 431, Rate: 0.14',
        'Bidder - Timestamp: 21, User ID: 3, Action: BID, Loan ID: 431, Rate: 0.14']

# One column per comma-separated "name: value" field
df = pd.DataFrame([d.split(',') for d in data])
# df = pd.DataFrame([str(d).split(',') for d in data]) # Use this for your list of objects (assumes str(d) yields the format above)

df2 = pd.DataFrame()
for i in range(len(df.columns)):
    # The text before ':' in the first row is the column name; the text after ':' is the value
    name = df.iloc[:,i].str.split(':', expand=True)[0][0].strip()
    values = df.iloc[:,i].str.split(':', expand=True)[1].str.strip()
    df2[name] = values

print(df2)
  Bidder - Timestamp User ID Action Loan ID  Rate
0                 11       8    BID     430  0.20
1                 13      13    BID     430  0.15
2                 17       8    BID     430  0.10
3                 18       1    BID     431  0.15
4                 19       3    BID     431  0.14
5                 21       3    BID     431  0.14
# Find matches
df2[df2['User ID'] == '8']
  Bidder - Timestamp User ID Action Loan ID  Rate
0                 11       8    BID     430  0.20
2                 17       8    BID     430  0.10
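The filter above looks up one ID that is already known. To list every row whose User ID occurs more than once without naming the ID in advance, `DataFrame.duplicated` with `keep=False` is one option. This is a minimal sketch: the frame is rebuilt by hand with a subset of the columns so the snippet runs on its own, and `repeated` is only an illustrative name.

```python
import pandas as pd

# Hand-built stand-in for df2 above; values are strings, as in the parsed frame.
df2 = pd.DataFrame({
    'Bidder - Timestamp': ['11', '13', '17', '18', '19', '21'],
    'User ID': ['8', '13', '8', '1', '3', '3'],
    'Rate': ['0.20', '0.15', '0.10', '0.15', '0.14', '0.14'],
})

# keep=False flags every occurrence of a repeated value, not only the later ones.
repeated = df2[df2.duplicated(subset='User ID', keep=False)]
print(repeated)
```

With this data the selected rows are those at index 0, 2, 4 and 5 (User IDs 8 and 3).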
Answer 1: (score: 1)
Try this with a for loop (the same can also be written as a one-line list comprehension):
data = ['Bidder - Timestamp: 11, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.20',
        'Bidder - Timestamp: 13, User ID: 13, Action: BID, Loan ID: 430, Rate: 0.15',
        'Bidder - Timestamp: 17, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.10',
        'Bidder - Timestamp: 18, User ID: 1, Action: BID, Loan ID: 431, Rate: 0.15',
        'Bidder - Timestamp: 19, User ID: 3, Action: BID, Loan ID: 431, Rate: 0.14',
        'Bidder - Timestamp: 21, User ID: 3, Action: BID, Loan ID: 431, Rate: 0.14']

# First row holds the field names, taken from the first record
l = [[i.split(':')[0].strip() for i in data[0].split(',')]]
# Each following row holds the values of one record
for i in data:
    l.append([x.split(':')[1].strip() for x in i.split(',')])
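For comparison, the loop above could be folded into a single list-comprehension expression. This is only a sketch of one possible equivalent, not the answerer's original one-liner; the names `header` and `table` are illustrative, and the data is shortened to three records to keep the example small.

```python
data = ['Bidder - Timestamp: 11, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.20',
        'Bidder - Timestamp: 13, User ID: 13, Action: BID, Loan ID: 430, Rate: 0.15',
        'Bidder - Timestamp: 17, User ID: 8, Action: BID, Loan ID: 430, Rate: 0.10']

# Field names come from the text before ':' in the first record.
header = [field.split(':')[0].strip() for field in data[0].split(',')]

# Header row followed by one row of values (text after ':') per record.
table = [header] + [[field.split(':')[1].strip() for field in row.split(',')]
                    for row in data]

for row in table:
    print(row)
```

The result has the same shape as `l` in the loop version: a header row followed by one list of string values per record.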
Answer 2: (score: 1)
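You can use a dictionary to first store how many times each ID appears in the list. Then you can filter the original list based on whether an ID is repeated: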
bidders = [
    Bidder(11, 8, 'BID', 430, 0.20),
    Bidder(13, 13, 'BID', 430, 0.15),
    Bidder(17, 8, 'BID', 430, 0.10),
    Bidder(18, 1, 'BID', 431, 0.15),
    Bidder(19, 3, 'BID', 431, 0.14),
    Bidder(21, 3, 'BID', 431, 0.14)
]

id_counts = {}
for b in bidders:
    if b.user_id in id_counts:
        id_counts[b.user_id] += 1
    else:
        id_counts[b.user_id] = 1

result = [b for b in bidders if id_counts[b.user_id] > 1]
print(result)
Output:
[Timestamp: 11, User Id: 8, Action: BID, Loan ID: 430, Rate: 0.2,
Timestamp: 17, User Id: 8, Action: BID, Loan ID: 430, Rate: 0.1,
Timestamp: 19, User Id: 3, Action: BID, Loan ID: 431, Rate: 0.14,
Timestamp: 21, User Id: 3, Action: BID, Loan ID: 431, Rate: 0.14]
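The loop in the question only ever compares an element with its immediate neighbour, so the two bids by user 8 (positions 0 and 2) are never compared with each other. Grouping by ID avoids that regardless of position. The sketch below is a variant of the counting approach that keeps the matching objects together per user and needs no imports; the `Bidder` class is a hypothetical stand-in for the asker's class (only `user_id` is known from the question, the other attribute names are assumptions), and `groups` and `matches` are illustrative names.

```python
class Bidder:
    """Hypothetical stand-in for the asker's class; attribute names other than user_id are assumed."""
    def __init__(self, timestamp, user_id, action, loan_id, rate):
        self.timestamp = timestamp
        self.user_id = user_id
        self.action = action
        self.loan_id = loan_id
        self.rate = rate

bidders = [
    Bidder(11, 8, 'BID', 430, 0.20),
    Bidder(13, 13, 'BID', 430, 0.15),
    Bidder(17, 8, 'BID', 430, 0.10),
    Bidder(18, 1, 'BID', 431, 0.15),
    Bidder(19, 3, 'BID', 431, 0.14),
    Bidder(21, 3, 'BID', 431, 0.14),
]

# Collect all bidders under their user_id, wherever they sit in the list.
groups = {}
for b in bidders:
    groups.setdefault(b.user_id, []).append(b)

# Keep only the IDs that occur more than once.
matches = {uid: bs for uid, bs in groups.items() if len(bs) > 1}

for uid, bs in matches.items():
    print(uid, [b.timestamp for b in bs])
# prints:
# 8 [11, 17]
# 3 [19, 21]
```

Both approaches make a single pass over the list plus a filter, so they scale linearly with the number of bidders instead of comparing every pair.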