Question

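I have two files to compare, and I have a for loop that compares them, but I'm not sure how to proceed so that each row of the first file that meets the condition gets the result of the if/else statement recorded against it.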

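date_location = 3
numeric_location = 4

with open('file1.csv', 'r') as f1, open('file2.csv', 'r') as f2:
    next(f1)  # skip the header row of file1
    next(f2)  # skip the header row of file2
    for i in f1:
        f1_fields = i.rstrip('\n').split(',')
        f1_date = f1_fields[date_location]
        # convert to float so the comparison is numeric, not string-based
        f1_number = float(f1_fields[numeric_location])
        for j in f2:
            f2_fields = j.rstrip('\n').split(',')
            f2_date = f2_fields[date_location]
            f2_number = float(f2_fields[numeric_location])
            if f1_date == f2_date:
                if f2_number > f1_number:
                    # print('WIN')
                    continue
                else:
                    # print('lose')
                    pass
        f2.seek(0)  # rewind file2 for the next row of file1
        next(f2)    # and skip its header row again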

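This is the code I currently have. What I want is to write the result of the if statement into file1.csv, but I can't get it to record what I print there. Is there a way to do this, preferably in pandas? I tried earlier to write a for loop in pandas, but it wouldn't let me do that over the DataFrames of both files.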

Answer 1

You can create two pandas DataFrames and compare them with np.where().

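Suppose your two files are loaded as df1 and df2, and each DataFrame has a score column. You can then compare them with

result = np.where(df1.score > df2.score, "WIN", "lose")

Evaluating result then shows the outcome of the comparison for each row.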

You can experiment with the following code:

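import pandas as pd
import numpy as np

# pd.util.testing.makeMixedDataFrame() was removed from recent pandas releases,
# so an equivalent small frame is built by hand here.
data = {"A": [0.0, 1.0, 2.0, 3.0, 4.0], "B": [0.0, 1.0, 0.0, 1.0, 0.0],
        "C": ["foo1", "foo2", "foo3", "foo4", "foo5"],
        "D": pd.bdate_range("2009-01-01", periods=5)}
df1 = pd.DataFrame(data)
df2 = pd.DataFrame(data)
df3 = np.where(df1.A > df2.B, 'WIN', 'lose')  # element-wise comparison, row by row
df3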
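To get the comparison back into the first file, which is what the question asks for, one option is to store the np.where() result as a new column and write the frame out with to_csv(). The following is only a minimal sketch under assumptions: the column names date and score, the sample values, and the helper name compare_and_save are made up for illustration, and both frames are assumed to have the same number of rows in the same order. It marks WIN when the second file's value is larger, as in the question's loop.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def compare_and_save(df1, df2, path):
    """Add a 'result' column to a copy of df1 and write it to path (hypothetical helper)."""
    out = df1.copy()
    # Row-by-row comparison; .to_numpy() sidesteps index alignment,
    # so the two frames only need the same length and order.
    out["result"] = np.where(
        df2["score"].to_numpy() > out["score"].to_numpy(), "WIN", "lose"
    )
    out.to_csv(path, index=False)  # index=False keeps the row index out of the file
    return out


# Made-up sample data standing in for file1.csv and file2.csv.
df1 = pd.DataFrame({"date": ["2020-01-01", "2020-01-02", "2020-01-03"], "score": [10, 25, 7]})
df2 = pd.DataFrame({"date": ["2020-01-01", "2020-01-02", "2020-01-03"], "score": [12, 20, 7]})

# A temporary directory is used here so the sketch does not overwrite a real file.
path = os.path.join(tempfile.mkdtemp(), "file1.csv")
saved = compare_and_save(df1, df2, path)
reloaded = pd.read_csv(path)
print(reloaded)
```

In real use, path would be the location of file1.csv itself, after reading both files with pd.read_csv().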

Update:

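import pandas as pd
import numpy as np

# pd.util.testing.makeMixedDataFrame() was removed from recent pandas releases,
# so an equivalent small frame is built by hand here.
data = {"A": [0.0, 1.0, 2.0, 3.0, 4.0], "B": [0.0, 1.0, 0.0, 1.0, 0.0],
        "C": ["foo1", "foo2", "foo3", "foo4", "foo5"],
        "D": pd.bdate_range("2009-01-01", periods=5)}
df1 = pd.DataFrame(data)
df2 = pd.DataFrame(data)
df3 = pd.DataFrame({})
for col in df2.A:  # one output column per value in df2.A
    df3[col] = np.where(df1.A < col, 1, 0)  # 1 where df1.A is below that value
df3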
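The loop above builds one column per value in df2.A, each holding the comparison of every df1.A value against that value. As a possible alternative, the same pairwise table can be produced in one step with NumPy broadcasting. This is a sketch with made-up values rather than part of the original answer; the names a1, a2, grid and pairwise are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up values standing in for df1.A and df2.A.
a1 = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0], name="A")
a2 = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0], name="A")

# a1 as a column (5x1) against a2 as a row (1x5) broadcasts to a 5x5 grid:
# cell [i, j] is 1 when a1[i] < a2[j], otherwise 0.
grid = (a1.to_numpy()[:, None] < a2.to_numpy()[None, :]).astype(int)

# Label rows by a1's index and columns by a2's values, as the loop does.
pairwise = pd.DataFrame(grid, index=a1.index, columns=a2.to_numpy())
print(pairwise)
```

One difference from the loop: if a2 contains repeated values, this version keeps a column for each repeat, whereas assigning df3[col] in the loop overwrites the earlier column of the same name.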

Or this:

for i in df1.index:  # go through file 1, one row at a time
    r1 = df1.loc[i]  # current row of file 1, looked up by index label
    df = df2[df2.D == r1.D]  # rows of file 2 whose D matches this row
    c = np.where(df.A <= r1.A, "Yes", "No")  # compare A for each matching row

    for a in c:
        print(a)
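When the goal is to match rows on a key and then compare a value, as the loop above does with D and A, a merge can avoid the explicit iteration. The sketch below is one possible approach, not the answer's method; the column names date and score, the suffixes, the "no match" label and the sample data are all assumptions.

```python
import numpy as np
import pandas as pd

# Made-up sample data; 'date' is the matching key and 'score' the value compared.
df1 = pd.DataFrame({"date": ["2020-01-01", "2020-01-02", "2020-01-03"], "score": [10, 25, 7]})
df2 = pd.DataFrame({"date": ["2020-01-02", "2020-01-01", "2020-01-04"], "score": [20, 12, 9]})

# A left merge keeps every row of df1; dates missing from df2 get NaN in score_2.
merged = df1.merge(df2, on="date", how="left", suffixes=("_1", "_2"))

# np.select takes the first condition that holds; comparisons against NaN are
# False, so unmatched dates fall through to the default label.
merged["result"] = np.select(
    [merged["score_2"] > merged["score_1"], merged["score_2"] <= merged["score_1"]],
    ["WIN", "lose"],
    default="no match",
)
print(merged)
```

The merged frame can then be written out with merged.to_csv(...) if the result needs to end up in a file.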

Reading and writing the same CSV file after comparing two CSV files
