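I'm trying to replace the data in a pandas DataFrame with a numpy array (more precisely, I want to normalize the data and then write the new values back into the existing DataFrame). It looks like this: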
# df is an existing pandas DataFrame with 10 rows and 3 columns
new_values = np.random.rand(10,3)
df = new_values # this is the step I want to solve
Of course, I want to keep the DataFrame's column and index information. Does anyone know how I can do this?
Answer 0 (score: 3)
Use:
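import numpy as np
import pandas as pd

# Build a sample DataFrame with 10 rows and 3 labeled columns
np.random.seed(12)
df = pd.DataFrame(np.random.rand(10,3), columns=list('ABC'))
print(df)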
          A         B         C
0  0.154163  0.740050  0.263315
1  0.533739  0.014575  0.918747
2  0.900715  0.033421  0.956949
3  0.137209  0.283828  0.606083
4  0.944225  0.852736  0.002259
5  0.521226  0.552038  0.485377
6  0.768134  0.160717  0.764560
7  0.020810  0.135210  0.116273
8  0.309898  0.671453  0.471230
9  0.816168  0.289587  0.733126
new_values = np.random.rand(10,3)
print (new_values)
[[0.70262236 0.32756948 0.33464753]
 [0.97805808 0.62458211 0.95031352]
 [0.76747565 0.82500925 0.4066403 ]
 [0.45130841 0.40063163 0.99513816]
 [0.17756418 0.9625969  0.41925027]
 [0.42405245 0.46314887 0.37372315]
 [0.4655081  0.03516826 0.08427267]
 [0.7325207  0.63619999 0.02790779]
 [0.30017006 0.22085252 0.05501999]
 [0.52324607 0.41636966 0.04821875]]
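# Overwrite every value in place; the index and columns stay as they are
df[:] = new_values
# Alternative: build a new DataFrame that reuses the old index and columns
#df = pd.DataFrame(new_values, index=df.index, columns=df.columns)
print(df)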
          A         B         C
0  0.702622  0.327569  0.334648
1  0.978058  0.624582  0.950314
2  0.767476  0.825009  0.406640
3  0.451308  0.400632  0.995138
4  0.177564  0.962597  0.419250
5  0.424052  0.463149  0.373723
6  0.465508  0.035168  0.084273
7  0.732521  0.636200  0.027908
8  0.300170  0.220853  0.055020
9  0.523246  0.416370  0.048219
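Since the question mentions normalization, here is a minimal sketch of both approaches applied to that case. The min-max scaling and the small example frame with a string index are my own assumptions for illustration; the question does not say which normalization is meant. The string index is only there to make it visible that the labels survive.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame with a non-default index, so that label
# preservation is easy to see.
df = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    index=list("wxyz"),
    columns=list("ABC"),
)

# Min-max normalize each column on the raw NumPy array (an assumed
# choice of normalization, not taken from the question).
values = df.to_numpy()
col_min = values.min(axis=0)
col_max = values.max(axis=0)
normalized = (values - col_min) / (col_max - col_min)

# Option 1: overwrite the values in place; index and columns are untouched.
in_place = df.copy()
in_place[:] = normalized

# Option 2: build a new DataFrame that reuses the old labels.
rebuilt = pd.DataFrame(normalized, index=df.index, columns=df.columns)

print(in_place)
```

Both options should give the same labeled result. The in-place assignment is convenient when other code already holds a reference to the DataFrame, while the constructor form leaves the original untouched.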