Question

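I have a series of csv files. I want to read certain unique values and then print each of those values for each csv. To explain it better: I have several csv files with Type and Publisher columns. Within each csv, the Type and Publisher columns can contain the same value repeated many times. If the Type column contains "File", "File", "Record", "File", "Record", I only want to print "File" and "Record".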
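
To illustrate the result I am after for a single column, here is a minimal sketch. The list `types` is made-up sample data standing in for the Type column of one csv:

```python
# Hypothetical values of the Type column in one csv.
types = ["File", "File", "Record", "File", "Record"]

# dict.fromkeys drops duplicates while keeping first-seen order.
unique_types = list(dict.fromkeys(types))
print(";".join(unique_types))  # File;Record
```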

I am trying:

print (~df['A'].duplicated(keep=False) | df[['B','C']].notnull().any(axis=1))
0     True
1    False
2     True
3     True
4    False
dtype: bool

This prints the values from all the different csv files together every time it moves on to the next csv.

The desired output is:

import csv
import requests

publisher = []  # create a list for each value we want
type = []
for rec in attachment:  # attachment is a list with the URLs of the csv files
    newFile = rec.replace("\\", "/")
    print("I'm searching in " + newFile)
    download = requests.get(newFile)  # get the file from the URL
    decoded_content = download.content.decode('utf-8')  # decode as UTF-8

    csvFile = csv.DictReader(decoded_content.splitlines(), delimiter='\t')
    csvFile.fieldnames = [field.strip().lower() for field in csvFile.fieldnames]
    for row in csvFile:
        publisher.append(row["publisher"])
        type.append(row["type"])
    print(";".join(set(type)))
    print(";".join(set(publisher)))

What is going wrong?

Answer 1

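Try initializing the lists inside the loop, so that each csv starts with empty lists instead of inheriting the values collected from the previous files: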

for rec in attachment: #attachment is a list with the url of csv
    publisher = []                      # <-- HERE
    type = []                           # <-- HERE
    newFile = rec.replace("\\","/")

Instead of lists, you can also use sets from the start:

for rec in attachment: #attachment is a list with the url of csv
    publisher = set()
    type = set()
    newFile = rec.replace("\\","/")

If you use sets, you should use add instead of append.
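
Putting this together, here is a minimal sketch of the set-based version. It assumes tab-delimited csv text with Type and Publisher columns, as in the question. The helper name `unique_values` and the sample data in `files` are hypothetical, and the `requests.get` download step is left out so the example runs offline:

```python
import csv

def unique_values(text, columns=("type", "publisher"), delimiter="\t"):
    """Return {column: set of distinct values} for one csv given as text."""
    reader = csv.DictReader(text.splitlines(), delimiter=delimiter)
    # Normalize header names the same way the question does.
    reader.fieldnames = [name.strip().lower() for name in reader.fieldnames]
    found = {column: set() for column in columns}  # fresh sets for every file
    for row in reader:
        for column in columns:
            found[column].add(row[column])  # add, not append
    return found

# Hypothetical sample data standing in for two downloaded files.
files = {
    "first.csv": "Type\tPublisher\nFile\tAcme\nFile\tAcme\nRecord\tAcme\n",
    "second.csv": "Type\tPublisher\nMap\tGlobex\nMap\tInitech\n",
}

for name, text in files.items():
    values = unique_values(text)
    print("I'm searching in " + name)
    # sorted() makes the output order stable; sets have no fixed order.
    print(";".join(sorted(values["type"])))
    print(";".join(sorted(values["publisher"])))
```

Because the sets are created inside `unique_values`, nothing carries over from one file to the next, which is the same effect as moving the initialization inside the loop.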

I hope this helps.

Python for loop over csv files with multiple values
