Question

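This is the first time I have run into a problem loading a csv into Python.

I am trying this. My csv file is the same as his, but longer and with different values.

When I run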

import collections
path='../data/struc.csv'
answer = collections.defaultdict(list)
with open(path, 'r+') as istream:
    for line in istream:
        line = line.strip()
        try:
            k, v = line.split(',', 1)
            answer[k.strip()].append(v.strip())
        except ValueError:
            print('Ignoring: malformed line: "{}"'.format(line))

print(answer)

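everything works fine. I get what you would expect.

However, when I copy and paste the code from the link, I get an error in both cases.

With the accepted answer, the terminal spits out ValueError: need more than 1 value to unpack

With the second answer, I get AttributeError: 'file' object has no attribute 'split'. Adjusting it to a list does not work either.

I suspect the problem is the csv file itself. Its head is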

_id,parent,name,\n Section,none,America's,\n Section,none,Europe,\n Section,none,Asia,\n Section,none,Africa,\n Country,America's,United States,\n Country,America's,Argentina,\n Country,America's,Bahamas,\n Country,America's,Bolivia,\n Country,America's,Brazil,\n Country,America's,Colombia,\n Country,America's,Canada,\n Country,America's,Cayman Islands,\n Country,America's,Chile,\n Country,America's,Costa Rica,\n Country,America's,Dominican Republic,\n

I have read a lot about csv and tried the import csv approach, but still no luck. Could someone please help? Getting stuck on this kind of problem is the worst.

import re
from collections import defaultdict

parents=defaultdict(list)
path='../data/struc.csv'

with open(path, 'r+') as istream:
    for i, line in enumerate(istream.split(',')):
        if i != 0 and line.strip():
            id_, parent, name = re.findall(r"[\d\w-]+", line)
            parents[parent].append((id_, name))



Traceback (most recent call last):

  File "<ipython-input-29-2b2fd98946b3>", line 1, in <module>
runfile('/home/bob/Documents/mega/tree/python/structure.py',       wdir='/home/bob/Documents/mega/tree/python')

  File "/home/bob/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)

   File "/home/bob/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 78, in execfile
    builtins.execfile(filename, *where)

  File "/home/bob/Documents/mega/tree/python/structure.py", line 15, in <module>
    for i, line in enumerate(istream.split(',')):

AttributeError: 'file' object has no attribute 'split'

Answer 1

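First, Python's standard library has a dedicated module for handling the various flavors of CSV. See the documentation.

When the CSV file has a header, csv.DictReader is probably the more intuitive way to parse it: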

import collections
import csv

filepath = '../data/struc.csv'
answer = collections.defaultdict(list)

with open(filepath) as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        answer[row["_id"].strip()].append(row["parent"].strip())

print(answer)

You can refer to the fields in a row by their names in the header. Here I assume you want to use _id and parent, but you get the idea.
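As a minimal sketch of the same idea, the snippet below groups each row's (_id, name) pair under its parent, which is what the second snippet in the question appears to be after. The sample rows are copied from the question and fed through io.StringIO so the example runs without the file on disk. It assumes Python 3 and that the "\n" shown in the question's head stands for a real line break. The helper name children_by_parent is hypothetical. The trailing comma on every line produces an extra empty column, which DictReader stores under the empty-string key; it is ignored here.

```python
import collections
import csv
import io

# Sample rows copied from the question; assumes real line breaks in the file.
SAMPLE = (
    "_id,parent,name,\n"
    "Section,none,America's,\n"
    "Section,none,Europe,\n"
    "Country,America's,United States,\n"
    "Country,America's,Argentina,\n"
)


def children_by_parent(csvfile):
    """Map each parent to a list of (_id, name) tuples (hypothetical helper)."""
    parents = collections.defaultdict(list)
    for row in csv.DictReader(csvfile):
        # Fields are looked up by header name; the empty trailing column is skipped.
        parents[row["parent"].strip()].append(
            (row["_id"].strip(), row["name"].strip())
        )
    return parents


parents = children_by_parent(io.StringIO(SAMPLE))
print(dict(parents))
```

To read the real file instead, pass the object returned by open(filepath) in place of the io.StringIO wrapper.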

In addition, dialect=csv.excel_tab can be passed as an argument to DictReader to parse tab-delimited files.
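A short sketch of the tab-delimited case, assuming Python 3. The tab-separated sample is hypothetical (the question's file is comma-separated); only the dialect argument differs from the comma case.

```python
import csv
import io

# Hypothetical tab-separated variant of the question's data.
TSV_SAMPLE = (
    "_id\tparent\tname\n"
    "Section\tnone\tEurope\n"
    "Country\tAmerica's\tChile\n"
)

# csv.excel_tab is the built-in dialect that uses a tab as the delimiter.
reader = csv.DictReader(io.StringIO(TSV_SAMPLE), dialect=csv.excel_tab)
rows = [dict(row) for row in reader]
print(rows)
```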

Answer 2

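If you plan to do any analysis on this data, I suggest looking into the pandas library. Pandas takes care of all the details that seem to be tripping you up and makes opening a csv file a one-liner.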

import pandas as pd
file_path = '../data/struc.csv'  # same path as in the question
csv_file = pd.read_csv(file_path)
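As a rough illustration of what that buys you, the sketch below loads a few rows copied from the question and groups the names by parent. It assumes pandas is installed and that the file uses real line breaks. usecols drops the empty column created by the trailing commas, and keep_default_na=False keeps every value as a plain string instead of letting pandas convert anything to NaN.

```python
import io

import pandas as pd

# Sample rows copied from the question; assumes real line breaks in the file.
SAMPLE = (
    "_id,parent,name,\n"
    "Section,none,America's,\n"
    "Section,none,Europe,\n"
    "Country,America's,United States,\n"
    "Country,America's,Argentina,\n"
)

df = pd.read_csv(
    io.StringIO(SAMPLE),
    usecols=["_id", "parent", "name"],  # skip the empty trailing column
    keep_default_na=False,              # keep all values as plain strings
)

# Collect the names listed under each parent into a regular dict.
names_by_parent = df.groupby("parent")["name"].apply(list).to_dict()
print(names_by_parent)
```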

Python CSV issues
