Question

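I have a blob (a CSV) in a database. I read it into a string buffer and created a pandas DataFrame from it. The CSV file has no column names for some columns, and some column names are repeated.

For example: suppose I need the value at the intersection of B5 = search_row and E2 = search_column, i.e. E5 = value_to_be_fetched.

I only have the literal values search_row and search_column. How can I find the row index from B5 and the column index from E2, and then fetch the value E5 = value_to_be_fetched?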

Answer 1

If the values search_row and search_column are each unique across all the data, use np.where to get their positions and select with DataFrame.iloc:

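import numpy as np
import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4,5,4,5,500,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,300,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'F':list('aaabbb')}, index = [1] * 6)
# Duplicate the column names (and index labels) to mimic the CSV in the question
df.columns = ['A'] * 6
print (df)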
   A    A  A    A  A  A
1  a    4  7    1  5  a
1  b    5  8  300  3  a
1  c    4  9    5  6  a
1  d    5  4    7  9  b
1  e  500  2    1  2  b
1  f    4  3    0  4  b

a = np.where(df == 500)[0]
b = np.where(df == 300)[1]
print (a)
[4]
print (b)
[3]

c = df.iloc[a[0],b[0]]
print (c)
1
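The lookup above could be wrapped in a small helper that refuses to guess when a search value is missing or repeated. This is only a sketch: the name lookup_intersection and the ValueError behaviour are assumptions for illustration, not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd


def lookup_intersection(df, search_row, search_column):
    """Hypothetical helper: return the cell in the row that contains
    search_row and the column that contains search_column.

    Raises ValueError unless each search value occurs exactly once.
    """
    # Compare on a plain object array so duplicate labels do not matter
    values = df.to_numpy(dtype=object)
    row_hits = np.where(values == search_row)[0]
    col_hits = np.where(values == search_column)[1]
    if len(row_hits) != 1 or len(col_hits) != 1:
        raise ValueError(
            f"expected exactly one match each, got "
            f"{len(row_hits)} row match(es) and {len(col_hits)} column match(es)"
        )
    return df.iloc[row_hits[0], col_hits[0]]


df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4, 5, 4, 5, 500, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1, 300, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')}, index=[1] * 6)
df.columns = ['A'] * 6

result = lookup_intersection(df, 500, 300)
print(result)
```

Because the selection is purely positional (iloc), repeated column names and index labels do not affect the result.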

But if the values can be repeated, np.where returns arrays with more than one element, so indexing with [0] selects only the first occurrence:

a = np.where(df == 5)[0]
b = np.where(df == 2)[1]
print (a)
[0 1 2 3]
print (b)
[2 4]

c = df.iloc[a[0],b[0]]
print (c)
7

a = np.where(df == 2)[0]
b = np.where(df == 5)[1]
print (a)
[4 4]
print (b)
[4 1 3 1]

c = df.iloc[a[0],b[0]]
print (c)
2
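When values repeat, it may be safer to inspect every match before deciding which one to use. The following is a rough sketch, assuming the same sample data as above; np.argwhere lists each (row, column) position, and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4, 5, 4, 5, 500, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1, 300, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')}, index=[1] * 6)
df.columns = ['A'] * 6

values = df.to_numpy(dtype=object)

# Every (row, column) position that holds the value 5
positions_of_5 = np.argwhere(values == 5)
print(positions_of_5)

# Distinct rows containing 5 and distinct columns containing 2
rows_with_5 = np.unique(np.where(values == 5)[0])
cols_with_2 = np.unique(np.where(values == 2)[1])

# All candidate intersections, selected by position
candidates = df.iloc[rows_with_5, cols_with_2]
print(candidates)
```

From here the caller can apply whatever rule fits the data (first match, last match, or an error on ambiguity) instead of silently taking element [0].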
