Python: search for an entry in the first column with pandas and return the whole row

Time: 2018-04-15 14:41:00

Tags: python pandas

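I'm new to Python and having a hard time figuring out pandas. I tried all evening but couldn't get it to work. This may be a duplicate question, but I searched and still couldn't solve it.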

import pandas as pd

df = pd.read_csv(r'E:\Programming\Pipeline\Tests\vfxdatasheet.csv')
df2 = df.columns.values  # Index.get_values() was removed in pandas 1.0
print(df2)

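That gives me my columns. So far so good. I want to efficiently search for an entry in the first column, named "Shot#". If the entry is found, return the information from its entire row (as a list or otherwise).

Bonus points: how do I return the value found at a specific row/column?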

This is my data table which I export as a utf-8 encoded csv

Thanks for helping out a complete noob. :)

Edit:

shotid = '001_0010'
ix = df['Shot#'].loc[df['Shot#'].str.contains(shotid)].index
print (ix)

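This raises a KeyError, the same one I kept running into yesterday. I'm using WinPython; is something wrong with the pandas package?

Edit 2: OK, I know why it wasn't working. I didn't set the separator when creating the dataframe. Silly mistake!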

df = pd.read_csv(r"E:\Programming\Pipeline\Tests\vfxdatasheet.csv", sep=';', encoding='utf-8')

Traceback from the first edit (before the separator was set):
Traceback (most recent call last):
  File "E:/Programming/Pipeline/Python/test.py", line 8, in <module>
    ix = df['Shot#'].loc[df['Shot#'].str.contains(shotid)].index
  File "C:\WinPython\python-3.5.4.amd64\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
    return self._getitem_column(key)
  File "C:\WinPython\python-3.5.4.amd64\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
    return self._get_item_cache(key)
  File "C:\WinPython\python-3.5.4.amd64\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
    values = self._data.get(item)
  File "C:\WinPython\python-3.5.4.amd64\lib\site-packages\pandas\core\internals.py", line 3843, in get
    loc = self.items.get_loc(item)
  File "C:\WinPython\python-3.5.4.amd64\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Shot#'
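
The separator problem described in Edit 2 can be reproduced without the original file. The following is a minimal sketch that uses a made-up, in-memory CSV (`csv_text` is a hypothetical stand-in for `vfxdatasheet.csv`): read with the default comma separator, the whole header line becomes a single column name, so `df['Shot#']` cannot be found.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file: a small semicolon-separated CSV.
csv_text = "Shot#;play\n001_0010;a\n002_0020;b\n"

# The default separator is a comma, so every line is read as one field.
wrong = pd.read_csv(io.StringIO(csv_text))
print(list(wrong.columns))

# With the matching separator the header splits into separate columns.
right = pd.read_csv(io.StringIO(csv_text), sep=";")
print(list(right.columns))
```

Printing `df.columns` right after loading is a quick way to spot this kind of problem before indexing by column name.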

1 Answer:

Answer 0: (score: 0)

You can try it this way:

import pandas as pd

# sample data
df = pd.DataFrame({'Shot#': ['001_0010','002_0020','003_0010','003_0020','003_0030','004_0010','003_0010'],
                   'play': ['a','b','c','d','a','b','d']})

# let's say
val_to_search = '003_0010'

# get row index value where match is found
ix = df['Shot#'].loc[df['Shot#'].str.contains(val_to_search)].index

# get rows of match value as output
df.values[ix]

# output
array([['003_0010', 'c'],
       ['003_0010', 'd']], dtype=object)
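
Note that `str.contains` does a substring match and treats the search value as a regular expression by default. If an exact match on the shot id is what is wanted, a plain equality mask may be simpler. The sketch below assumes that, and wraps it in a hypothetical helper `find_rows` (not part of pandas) that returns the matching rows as a list of lists:

```python
import pandas as pd

df = pd.DataFrame({
    'Shot#': ['001_0010', '002_0020', '003_0010', '003_0020'],
    'play': ['a', 'b', 'c', 'd'],
})

def find_rows(frame, shot_id):
    """Return every row whose 'Shot#' equals shot_id exactly, as a list of lists."""
    mask = frame['Shot#'] == shot_id      # boolean Series, one entry per row
    return frame.loc[mask].values.tolist()

rows = find_rows(df, '003_0010')      # one matching row
missing = find_rows(df, '999_9999')   # no match gives an empty list
print(rows)
print(missing)
```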

If you want to return values from a specific column, there are several ways to do it:

Method 1:

import numpy as np

df.apply(lambda row: row['Shot#'] if row['Shot#'] == val_to_search else np.nan, axis=1)
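
Method 1 calls a Python function once per row. Assuming exact matching is the goal, `Series.where` should produce the same kind of result (the value where the condition holds, NaN elsewhere) without `apply`. A small sketch on a shortened version of the sample data:

```python
import pandas as pd

df = pd.DataFrame({'Shot#': ['001_0010', '003_0010', '003_0020'],
                   'play': ['a', 'c', 'd']})
val_to_search = '003_0010'

# Keep the value where the condition is True, NaN everywhere else.
matched = df['Shot#'].where(df['Shot#'] == val_to_search)
print(matched)
```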

Method 2:

mask = df['Shot#'].str.contains(val_to_search)
df['new_col'] = df.loc[mask,'Shot#']

print(df)

      Shot#  play   new_col
0  001_0010     a       NaN
1  002_0020     b       NaN
2  003_0010     c  003_0010
3  003_0020     d       NaN
4  003_0030     a       NaN
5  004_0010     b       NaN
6  003_0010     d  003_0010
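
For the bonus question (getting the value at a specific row/column), one possible approach is to combine a boolean mask with a column name in `df.loc`, or to use `df.at` / `df.iat` for a single cell. The sketch below uses a shortened version of the sample data; setting "Shot#" as the index is an assumption that only makes sense if the shot ids are unique.

```python
import pandas as pd

df = pd.DataFrame({'Shot#': ['001_0010', '002_0020', '003_0010'],
                   'play': ['a', 'b', 'c']})

# All 'play' values for rows whose Shot# matches exactly.
plays = df.loc[df['Shot#'] == '003_0010', 'play'].tolist()

# A single cell addressed by row label and column name.
cell_by_label = df.at[2, 'play']

# A single cell addressed by integer position (row 2, column 1).
cell_by_position = df.iat[2, 1]

# With Shot# as the index, a shot id addresses its row directly.
# Assumption: shot ids are unique in the real data.
by_shot = df.set_index('Shot#')
cell_by_shot = by_shot.at['002_0020', 'play']

print(plays, cell_by_label, cell_by_position, cell_by_shot)
```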