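Assignment to DataFrame not working, but dtypes changed
For a data science task, I want to assign target_frame to empty_frame, but the assignment has no effect until I run it a second time. During the assignment, the dtypes of empty_frame change from int32 to float64, and finally end up as int64.
I simplified my code to the snippet below, which shows the same problem: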
import pandas as pd
import numpy as np
dataset = [[[i for i in range(5)], ] for i in range(5)]
dataset = pd.DataFrame(dataset, columns=['test'])
empty_numpy = np.arange(25).reshape(5, 5)
empty_numpy.fill(np.nan)
# Solution 1: change the below code into 'empty_frame = pd.DataFrame(empty_numpy)' then everything will be fine
empty_frame = pd.DataFrame(empty_numpy, columns=[str(i) for i in range(5)])
series = dataset['test']
target_frame = pd.DataFrame(list(series))
# Solution 2: run `empty_frame[:] = target_frame` twice, work fine to me.
# ==================================================================
# First try.
empty_frame[:] = target_frame
print("="*40)
print(f"Data types of empty_frame: {empty_frame.dtypes}")
print("="*40)
print("Result of first try: ")
print(empty_frame)
print("="*40)
# Second try.
empty_frame[:] = target_frame
print(f"Data types of empty_frame: {empty_frame.dtypes}")
print("="*40)
print("Result of second try: ")
print(empty_frame)
print("="*40)
# ====================================================================
I expected the first assignment to already produce the output below, but it does not work on the first try:
========================================
Data types of empty_frame: 0    int64
1    int64
2    int64
3    int64
4    int64
dtype: object
========================================
Result of first try: 
   0  1  2  3  4
0  0  1  2  3  4
1  0  1  2  3  4
2  0  1  2  3  4
3  0  1  2  3  4
4  0  1  2  3  4
========================================
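There are two workarounds, but I don't know why they work:
1. Drop the column names when creating empty_frame (Solution 1 in the code comments).
2. Run the assignment twice (Solution 2 in the code comments).
I want to figure out two things:
1. Why the first assignment does not work.
2. Why the dtypes of empty_frame changed.
Thanks.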
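
Since the first workaround involves the column names, a quick diagnostic is to compare the column labels of the two frames. The sketch below rebuilds frames with the same labels as in the question (the fill value 0.0 and the names empty_labels, target_labels and labels_match are assumptions made for illustration). Whether a label mismatch actually causes the failed first assignment depends on the pandas version, so this is a check to run, not a confirmed explanation.

```python
import pandas as pd

# Frames with the same column labels as in the question.
# The 0.0 fill value is a placeholder chosen for this sketch.
empty_frame = pd.DataFrame(0.0, index=range(5), columns=[str(i) for i in range(5)])
target_frame = pd.DataFrame([[i for i in range(5)] for _ in range(5)])

# empty_frame uses string labels, target_frame uses default integer labels.
empty_labels = list(empty_frame.columns)
target_labels = list(target_frame.columns)
labels_match = empty_labels == target_labels

print("empty_frame columns: ", empty_labels)
print("target_frame columns:", target_labels)
print("labels match:", labels_match)
```

The string labels '0' to '4' and the integer labels 0 to 4 look identical when printed as a frame, which makes this kind of difference easy to miss.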
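Answer 0 (score: 0)
If I understood your question correctly, your problem starts when you create the empty_numpy matrix. My preferred solution is to use empty_numpy = np.empty([5,5]) instead (the default dtype here is float64). Then the "Result of first try:" output is correct. That means: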
import pandas as pd
import numpy as np
dataset = [[[i for i in range(5)],] for i in range(5)]
dataset = pd.DataFrame(dataset, columns=['test'])
empty_numpy = np.empty([5,5])
# here you may add empty_numpy.fill(np.nan) but it's not necessary,result is the same
empty_frame = pd.DataFrame(empty_numpy, columns=[str(i) for i in range(5)])
series = dataset['test']
target_frame = pd.DataFrame(list(series))
# following assignment is correct then
empty_frame[:] = target_frame
print('='*40)
print(f'Data types of empty_frame: {empty_frame.dtypes}')
print('='*40)
print("Result of first try: ")
print(empty_frame)
print("="*40)
Or just add the dtype argument to your np.arange call, like this:
empty_numpy = np.arange(25, dtype=float).reshape(5, 5)
Then it works too (but that is a bit boring ;o).
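
As a further variation on the same idea, here is a minimal sketch that starts from a float array of NaN and copies the values by position. It assumes that np.full is an acceptable replacement for the arange-plus-fill step, and that copying target_frame's raw values with iloc is what is wanted. Neither choice comes from the thread above, and the names int_array, is_integer and nan_array are illustrative.

```python
import numpy as np
import pandas as pd

# np.arange(25) has an integer dtype, and integer arrays cannot represent NaN.
int_array = np.arange(25).reshape(5, 5)
is_integer = np.issubdtype(int_array.dtype, np.integer)

# np.full with a float fill value gives a float64 array filled with NaN.
nan_array = np.full((5, 5), np.nan)
empty_frame = pd.DataFrame(nan_array, columns=[str(i) for i in range(5)])

target_frame = pd.DataFrame([list(range(5)) for _ in range(5)])

# Assign the raw values by position, so column labels play no role.
empty_frame.iloc[:, :] = target_frame.to_numpy()

print(empty_frame)
```

The resulting dtype may differ between pandas versions, so it is worth checking empty_frame.dtypes afterwards if the dtype matters.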