I have a CSV file:
Firstname Lastname City Province
'Guy', 'Ouell', 'Brossard','QC'
'Michelle', 'Balonne','Stittsville','ON'
'Ben', 'Sluzing','Toronto','ON'
'Theodora', 'Panapoulos','Saint-Constant','QC'
'Kathleen', 'Mercier','St Johns','NL'
...
I open it and check that everything looks fine:
import pandas as pd
df = pd.read_csv('a.csv')
df.head(n=5)
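When I try to use the columns, I run into two different problems.
Problem 1: I can only access the first column. When I try to use any other column, I get an error:
for mis_column, mis_row in missing_df.iterrows():
    print(mis_row['Firstname'])
This prints all the first names, but when I try to get all the cities, for example, I see: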
TypeError Traceback (most recent call last)
E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
2482 try:
-> 2483 return libts.get_value_box(s, key)
2484 except IndexError:
pandas/_libs/tslib.pyx in pandas._libs.tslib.get_value_box (pandas\_libs\tslib.c:18843)()
pandas/_libs/tslib.pyx in pandas._libs.tslib.get_value_box (pandas\_libs\tslib.c:18477)()
TypeError: 'str' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-36-55ba81245685> in <module>()
1
2 for mis_column, mis_row in missing_df.iterrows():
----> 3 print(mis_row['City'])
4
5
E:\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
599 key = com._apply_if_callable(key, self)
600 try:
--> 601 result = self.index.get_value(self, key)
602
603 if not is_scalar(result):
E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
2489 raise InvalidIndexError(key)
2490 else:
-> 2491 raise e1
2492 except Exception: # pragma: no cover
2493 raise e1
E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
2475 try:
2476 return self._engine.get_value(s, k,
-> 2477 tz=getattr(series.dtype, 'tz', None))
2478 except KeyError as e1:
2479 if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'City'
Problem 2:
for mis_column, mis_row in df.iterrows():
    if mis_row['Firstname'] == 'Guy':
        print('A')
This does not print A.
Thanks in advance.
Answer 0 (score: 0)
Separate the CSV header fields with commas, like this:
Firstname, Lastname, City, Province
'Guy', 'Ouell', 'Brossard','QC'
'Michelle', 'Balonne','Stittsville','ON'
'Ben', 'Sluzing','Toronto','ON'
'Theodora', 'Panapoulos','Saint-Constant','QC'
'Kathleen', 'Mercier','St John's','NL'
Because your CSV has spaces after the commas, you can skip them when reading the DataFrame:
df = pd.read_csv('<your_input>.csv', skipinitialspace=True)
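To see why skipping the leading spaces matters, here is a minimal sketch that reads the same text with and without skipinitialspace and compares the resulting column labels. The in-memory csv_text is a hypothetical stand-in for the real file, assuming its header is comma-separated with a space after each comma.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file: comma-separated header,
# with a space after some of the commas.
csv_text = (
    "Firstname, Lastname, City, Province\n"
    "'Guy', 'Ouell', 'Brossard','QC'\n"
    "'Michelle', 'Balonne','Stittsville','ON'\n"
)

# Default read: the space after each comma becomes part of the label.
plain = pd.read_csv(io.StringIO(csv_text))

# skipinitialspace=True drops whitespace that follows a delimiter.
trimmed = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(plain.columns.tolist())
print(trimmed.columns.tolist())
```

If the first list shows labels such as ' City' with a leading space, that would explain why 'Firstname' can be looked up while 'City' raises a KeyError.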
If you also want to strip the single quotes, pass quotechar as well:
df = pd.read_csv('<your_input>.csv', skipinitialspace=True, quotechar="'")
>>> df
  Firstname    Lastname            City Province
0       Guy       Ouell        Brossard       QC
1  Michelle     Balonne     Stittsville       ON
2       Ben     Sluzing         Toronto       ON
3  Theodora  Panapoulos  Saint-Constant       QC
4  Kathleen     Mercier       St Johns'       NL
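The quotechar argument is also what makes the comparison in Problem 2 succeed. The sketch below, again using a hypothetical in-memory csv_text rather than the real file, shows that without quotechar="'" the cell value still contains the single quotes, so comparing it with 'Guy' is False.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file.
csv_text = (
    "Firstname, Lastname, City, Province\n"
    "'Guy', 'Ouell', 'Brossard','QC'\n"
    "'Ben', 'Sluzing','Toronto','ON'\n"
)

# Single quotes are kept as ordinary characters (the default quotechar is '"').
with_quotes = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Single quotes are treated as the quoting character and removed.
without_quotes = pd.read_csv(
    io.StringIO(csv_text), skipinitialspace=True, quotechar="'"
)

first_raw = with_quotes.loc[0, "Firstname"]
first_clean = without_quotes.loc[0, "Firstname"]

print(repr(first_raw), first_raw == "Guy")
print(repr(first_clean), first_clean == "Guy")
```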
>>> import pandas as pd
>>> df = pd.read_csv('test2.csv', skipinitialspace=True, quotechar="'")
>>> df
  Firstname    Lastname            City Province
0       Guy       Ouell        Brossard       QC
1  Michelle     Balonne     Stittsville       ON
2       Ben     Sluzing         Toronto       ON
3  Theodora  Panapoulos  Saint-Constant       QC
4  Kathleen     Mercier       St Johns'       NL
>>> for mis_column, mis_row in df.iterrows():
...     if mis_row['Firstname'] == 'Guy':
...         print('A')
...
A
>>>
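If re-reading the file is not convenient, a DataFrame that was already loaded with the defaults can be cleaned up afterwards. This is an alternative to the read_csv options above, not part of the original answer, and it assumes every column holds strings. The csv_text below is a hypothetical stand-in for the file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file, read with default settings.
csv_text = (
    "Firstname, Lastname, City, Province\n"
    "'Guy', 'Ouell', 'Brossard','QC'\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Remove stray whitespace from the column labels.
df.columns = df.columns.str.strip()

# Remove surrounding whitespace, then the single quotes, from every cell.
# Assumes all columns contain strings.
df = df.apply(lambda col: col.str.strip().str.strip("'"))

print(df.columns.tolist())
print(df.loc[0, "City"])
```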