Easiest way to create a DataFrame-like structure in Python without the pandas package?

Time: 2016-09-14 18:42:39

Tags: python pandas

So I'm deploying a web app that can't use pandas. I'm using Python 3 with Elastic Beanstalk on AWS, and various dependencies aren't available there yet.

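I only need pandas in one function, and the usage is simple: build a few DataFrames, then look values up with df.loc. Does anyone know a good way to get df.loc[index, col] functionality without pandas?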

2 Answers:

Answer 0: (score: 6)

Your best bet is to use lists inside a dict:

df_eq = {'col1' : [list, of, column, data],
         'col2' : [list, of, column, data],
         ...,
         'coln-1' : [list, of, column, data],
         'coln' : [list, of, column, data]}

Then you can do something like loc with:

df_eq['coln'][idx]
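
As a rough illustration of the idea above, here is a minimal sketch that also accepts row labels, so a lookup reads more like df.loc[index, col]. The helper `loc`, the `index` list, and the sample data are all made up for this example and are not part of the answer.

```python
# Hypothetical sample data: each key is a column name,
# each list holds that column's values in row order.
df_eq = {
    "name": ["alice", "bob", "carol"],
    "age": [31, 25, 47],
}

# Hypothetical row labels, in the same order as the values in each column.
index = ["r1", "r2", "r3"]

# Map each row label to its position once, so lookups are dict hits
# rather than repeated linear searches through the label list.
row_pos = {label: i for i, label in enumerate(index)}


def loc(row_label, col):
    """Return the value at (row_label, col), similar in spirit to df.loc[row_label, col]."""
    return df_eq[col][row_pos[row_label]]


print(loc("r2", "age"))   # 25
print(loc("r3", "name"))  # carol
```

An unknown row label or column name raises KeyError here, which is roughly what pandas does for a missing label as well.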

Answer 1: (score: 0)

I would use numpy. Also, indexing a numpy array is faster than indexing with pandas:

import numpy as np

# Three columns (name, type, meta) built as rows, then transposed with .T
Ar_data = np.array([["gyrados","raichu","mu","dragonair","vaporeon"],["water","electric","normal","dragon","water"], [0,0,0,1,2]]).T
Ar_data
# array([['gyrados', 'water', '0'],
#        ['raichu', 'electric', '0'],
#        ['mu', 'normal', '0'],
#        ['dragonair', 'dragon', '1'],
#        ['vaporeon', 'water', '2']], 
#       dtype='<U9')

# Index w/ ints `.iloc`
Ar_data[3,1]
# 'dragon'

fields = ["pokemon","status","meta"]
observations = ["p1","p2","p3","p4","p5"]

# Index w/ labels `.loc`
Ar_data[3,fields.index("pokemon")]
# 'dragonair'

Ar_data[observations.index("p4"),fields.index("pokemon")]
# 'dragonair'

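# Time it (pandas is used here only as the baseline for comparison;
# %timeit is an IPython/Jupyter magic, not plain Python)
import pandas as pd
DF_data = pd.DataFrame(Ar_data, columns=fields, index=observations)
%timeit DF_data.iloc[3,1]
%timeit Ar_data[3,1]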
# 10000 loops, best of 3: 129 µs per loop
# The slowest run took 21.69 times longer than the fastest. This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 384 ns per loop
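
If numpy turns out to be unavailable in the deployment as well, the same label-based lookup can be sketched with plain Python lists. This is only an illustration under that assumption: the class name `LabeledTable` is hypothetical, and it precomputes label-to-position dicts where the answer above calls list.index() on each lookup.

```python
class LabeledTable:
    """Tiny row-major table with label-based lookup; not a pandas replacement."""

    def __init__(self, rows, columns, index):
        # Copy the rows so later changes to the caller's lists don't leak in.
        self._rows = [list(r) for r in rows]
        # Precompute label -> position maps so each lookup is a dict hit
        # rather than a linear list.index() scan.
        self._col_pos = {c: i for i, c in enumerate(columns)}
        self._row_pos = {r: i for i, r in enumerate(index)}

    def __getitem__(self, key):
        # Supports table[row_label, col_label], loosely like df.loc[row, col].
        row_label, col_label = key
        return self._rows[self._row_pos[row_label]][self._col_pos[col_label]]


table = LabeledTable(
    rows=[
        ["gyrados", "water", 0],
        ["raichu", "electric", 0],
        ["mu", "normal", 0],
        ["dragonair", "dragon", 1],
        ["vaporeon", "water", 2],
    ],
    columns=["pokemon", "status", "meta"],
    index=["p1", "p2", "p3", "p4", "p5"],
)

print(table["p4", "pokemon"])  # dragonair
print(table["p5", "meta"])     # 2
```

Unlike the numpy array above, which stores every value as a string, the plain lists keep the integers in the meta column as integers.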