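I have a series of CSV files. I want to read the unique values from each one and print them per file. To explain it better: I have several CSVs with Type and Publisher columns. Within each CSV, the Type and Publisher columns can contain the same value repeated many times. If the Type column contains "File", "File", "Record", "File", "Record", I only want to print "File" and "Record".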
This is what I am trying (code below). The problem is that, each time it moves on to the next CSV, it prints the values from all the CSVs read so far together, whereas I want only that CSV's own unique values:
import csv
import requests

publisher = []  # create lists for each value we want
type = []
for rec in attachment:  # attachment is a list with the URLs of the CSVs
    newFile = rec.replace("\\", "/")
    print("I'm searching in " + newFile)
    download = requests.get(newFile)  # get the file from the URL
    decoded_content = download.content.decode('utf-8')  # decode as UTF-8
    csvFile = csv.DictReader(decoded_content.splitlines(), delimiter='\t')
    csvFile.fieldnames = [field.strip().lower() for field in csvFile.fieldnames]
    for row in csvFile:
        publisher.append(row["publisher"])
        type.append(row["type"])
    print(";".join(set(type)))
    print(";".join(set(publisher)))
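What am I doing wrong?
Answer 0 (score: 1)
The lists are created once, before the loop, so they keep accumulating values from every file. Try initializing them inside the loop instead: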
for rec in attachment:  # attachment is a list with the URLs of the CSVs
    publisher = []  # <-- HERE
    type = []  # <-- HERE
    newFile = rec.replace("\\", "/")
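To see the effect in isolation, here is a minimal sketch that parses in-memory tab-separated text instead of downloading files. The function name `unique_per_file` and the sample data are made up for illustration; the parsing mirrors the `csv.DictReader` usage from the question. It also swaps `set()` for `dict.fromkeys()` so the printed order is stable, which is my own choice rather than something the question requires.

```python
import csv

def unique_per_file(csv_texts):
    """Return one (types, publishers) pair of de-duplicated lists per CSV text."""
    results = []
    for text in csv_texts:
        # Fresh lists for every file, so values from earlier files do not leak in.
        publisher = []
        type_ = []  # trailing underscore avoids shadowing the built-in type()
        reader = csv.DictReader(text.splitlines(), delimiter='\t')
        reader.fieldnames = [field.strip().lower() for field in reader.fieldnames]
        for row in reader:
            publisher.append(row["publisher"])
            type_.append(row["type"])
        # dict.fromkeys drops duplicates while keeping first-seen order.
        results.append((list(dict.fromkeys(type_)), list(dict.fromkeys(publisher))))
    return results

# Hypothetical sample data standing in for two downloaded files.
first = "Type\tPublisher\nFile\tAcme\nFile\tAcme\nRecord\tAcme\n"
second = "Type\tPublisher\nMap\tGlobex\nMap\tGlobex\n"
for types, publishers in unique_per_file([first, second]):
    print(";".join(types))
    print(";".join(publishers))
```

With the lists created inside the loop, the second file's result contains only its own values and nothing carried over from the first.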
Instead of lists, you could also use sets from the start:
for rec in attachment:  # attachment is a list with the URLs of the CSVs
    publisher = set()
    type = set()
    newFile = rec.replace("\\", "/")
If you use sets, you should use add instead of append.
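As a rough sketch of the set-based variant, the helper below collects the distinct values of one column. The name `unique_values` and the sample rows are hypothetical. The rows are plain dicts of the kind `csv.DictReader` would yield.

```python
def unique_values(rows, column):
    """Collect the distinct values of one column from an iterable of dict rows."""
    seen = set()
    for row in rows:
        seen.add(row[column])  # sets have add(), not append(); duplicates are ignored
    return seen

# Hypothetical rows matching the example from the question.
rows = [
    {"type": "File"},
    {"type": "File"},
    {"type": "Record"},
    {"type": "File"},
    {"type": "Record"},
]
# Sets are unordered, so sort before joining to get a predictable string.
print(";".join(sorted(unique_values(rows, "type"))))
```

Because a set discards duplicates as they are added, there is no need to wrap the result in `set()` again before joining.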
I hope this helps.