Suppose I have a Pandas DataFrame like this:
   C1 C2   C3
0   1  B    v
1   5  D    i
2   1  B  iii
3   3  C   iv
All possible values of C1, C2 and C3 are:
C1 = [1,2,3,4,5]
C2 = ['A','B','C','D','E']
C3 = ['i','ii','iii','iv','v']
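The task is to produce new rows for every possible combination of C1, C2 and C3 that is not already present in the existing DataFrame.
Is there a better way than three nested loops over all values of C1, C2 and C3, comparing each combination against the existing rows?
Answer 0 (score: 1)
You could try something like this: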
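import itertools

C1 = [1, 2, 3, 4, 5]
C2 = ['A', 'B', 'C', 'D', 'E']
C3 = ['i', 'ii', 'iii', 'iv', 'v']
# Rows already present in the DataFrame; a set makes each membership test O(1)
existing = {(1, 'B', 'v'), (5, 'D', 'i'), (1, 'B', 'iii'), (3, 'C', 'iv')}
# Keep every combination that is not already present
result = [i for i in itertools.product(C1, C2, C3) if i not in existing]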
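If the existing rows live in a DataFrame rather than a hand-written collection of tuples, the same idea can be fed directly from it. The following is a minimal sketch, not part of the original answer; the names `df`, `missing` and `missing_df` are assumptions introduced for illustration.

```python
import itertools
import pandas as pd

C1 = [1, 2, 3, 4, 5]
C2 = ['A', 'B', 'C', 'D', 'E']
C3 = ['i', 'ii', 'iii', 'iv', 'v']

# `df` is an assumed name for the DataFrame shown in the question
df = pd.DataFrame({'C1': [1, 5, 1, 3],
                   'C2': ['B', 'D', 'B', 'C'],
                   'C3': ['v', 'i', 'iii', 'iv']})

# Existing rows as a set of plain tuples, so lookups are fast
existing = set(df.itertuples(index=False, name=None))

# All combinations that do not yet appear in df
missing = [combo for combo in itertools.product(C1, C2, C3)
           if combo not in existing]

# Turn the missing combinations back into rows with the same columns
missing_df = pd.DataFrame(missing, columns=df.columns)
print(len(missing_df))
```

`missing_df` can then be printed or concatenated onto `df` with `pd.concat`, depending on what is needed.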
Answer 1 (score: 0)
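import itertools
import pandas as pd

# Assumes C1, C2, C3 and existing_dataframe are defined as in the question
# Build a DataFrame holding every combination of C1, C2, C3
new_data_frame = pd.DataFrame(list(itertools.product(C1, C2, C3)), columns=['C1', 'C2', 'C3'])
# Collect each combination that has no fully matching row in existing_dataframe
missing = [row for index, row in new_data_frame.iterrows()
           if not existing_dataframe.eq(row, axis=1).all(axis=1).any()]
# DataFrame.append was removed in pandas 2.0, so add the rows with pd.concat
existing_dataframe = pd.concat([existing_dataframe, pd.DataFrame(missing)], ignore_index=True)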
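The row-by-row loop can also be avoided by letting pandas do the comparison. The sketch below is one possible way, assuming the question's data and using a left merge with `indicator=True` as an anti-join; `all_combos`, `merged`, `missing_rows` and `completed` are illustrative names that do not appear in the original answers.

```python
import pandas as pd

C1 = [1, 2, 3, 4, 5]
C2 = ['A', 'B', 'C', 'D', 'E']
C3 = ['i', 'ii', 'iii', 'iv', 'v']

# The DataFrame from the question
existing_dataframe = pd.DataFrame({'C1': [1, 5, 1, 3],
                                   'C2': ['B', 'D', 'B', 'C'],
                                   'C3': ['v', 'i', 'iii', 'iv']})

# Every combination of the three value lists, as a DataFrame
all_combos = pd.MultiIndex.from_product(
    [C1, C2, C3], names=['C1', 'C2', 'C3']
).to_frame(index=False)

# A left merge with indicator=True adds a '_merge' column that is
# 'both' for combinations already present and 'left_only' otherwise
merged = all_combos.merge(existing_dataframe,
                          on=['C1', 'C2', 'C3'],
                          how='left',
                          indicator=True)
missing_rows = merged.loc[merged['_merge'] == 'left_only', ['C1', 'C2', 'C3']]

# Existing rows followed by the combinations that were missing
completed = pd.concat([existing_dataframe, missing_rows], ignore_index=True)
print(len(missing_rows), len(completed))
```

This keeps the work inside pandas, which tends to matter more as the number of combinations grows.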