Replacing pandas DataFrame values with a numpy array

Time: 2018-11-19 12:57:53

Tags: python pandas numpy dataframe

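I am trying to replace the data in a pandas DataFrame with a numpy array (more precisely, I want to normalize the data and then set the new columns in the existing DataFrame). It looks like this: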

# df is an existing pandas DataFrame with 10 rows and 3 columns
new_values = np.random.rand(10,3)
df = new_values # this is the step I want to solve

Of course, I want to keep the DataFrame's column and index information. Does anyone know how I can get this to work?

1 answer:

Answer 0: (score: 3)

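Assign to a full slice of the DataFrame, which overwrites the values in place and keeps the index and columns: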

import numpy as np
import pandas as pd

np.random.seed(12)

df = pd.DataFrame(np.random.rand(10,3), columns=list('ABC'))
print(df)
          A         B         C
0  0.154163  0.740050  0.263315
1  0.533739  0.014575  0.918747
2  0.900715  0.033421  0.956949
3  0.137209  0.283828  0.606083
4  0.944225  0.852736  0.002259
5  0.521226  0.552038  0.485377
6  0.768134  0.160717  0.764560
7  0.020810  0.135210  0.116273
8  0.309898  0.671453  0.471230
9  0.816168  0.289587  0.733126

new_values = np.random.rand(10,3)
print(new_values)
[[0.70262236 0.32756948 0.33464753]
 [0.97805808 0.62458211 0.95031352]
 [0.76747565 0.82500925 0.4066403 ]
 [0.45130841 0.40063163 0.99513816]
 [0.17756418 0.9625969  0.41925027]
 [0.42405245 0.46314887 0.37372315]
 [0.4655081  0.03516826 0.08427267]
 [0.7325207  0.63619999 0.02790779]
 [0.30017006 0.22085252 0.05501999]
 [0.52324607 0.41636966 0.04821875]]

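# overwrite the values in place; the index and columns are kept
df[:] = new_values
# alternative solution: build a new DataFrame that reuses the existing labels
#df = pd.DataFrame(new_values, index=df.index, columns=df.columns)
print(df)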
          A         B         C
0  0.702622  0.327569  0.334648
1  0.978058  0.624582  0.950314
2  0.767476  0.825009  0.406640
3  0.451308  0.400632  0.995138
4  0.177564  0.962597  0.419250
5  0.424052  0.463149  0.373723
6  0.465508  0.035168  0.084273
7  0.732521  0.636200  0.027908
8  0.300170  0.220853  0.055020
9  0.523246  0.416370  0.048219
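
Since the question mentions normalization, here is a minimal sketch of how the same two approaches might be applied to that case. The min-max formula, the string index, and the "_norm" suffix are assumptions made for illustration; they are not taken from the question.

```python
import numpy as np
import pandas as pd

np.random.seed(12)

# Hypothetical frame with a non-default index, to show that labels survive.
df = pd.DataFrame(np.random.rand(10, 3),
                  columns=list('ABC'),
                  index=list('abcdefghij'))

# Column-wise min-max normalization on the raw numpy array (assumed formula).
values = df.to_numpy()
col_min = values.min(axis=0)
col_max = values.max(axis=0)
normalized = (values - col_min) / (col_max - col_min)

# Variant 1: keep the original columns and add the normalized ones next to them.
df_norm = pd.DataFrame(normalized, index=df.index, columns=df.columns)
combined = pd.concat([df, df_norm.add_suffix('_norm')], axis=1)

# Variant 2: overwrite the existing values in place; index and columns are untouched.
df[:] = normalized

print(combined.columns.tolist())
print(df.index.tolist())
```

Note that a plain df = normalized only rebinds the name df to the numpy array, so the labels are lost. Both variants above attach the existing index and columns to the new values instead.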