Question

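Assignment to a DataFrame has no effect, but the dtypes change.

For a data science task, I want to assign target_frame to empty_frame, but the assignment does not take effect until I run it a second time. During the assignment, the dtypes of empty_frame change from int32 to int64, and finally end up as float64.

I tried to simplify my code below; it shows the same problem.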

import pandas as pd
import numpy as np

dataset = [[[i for i in range(5)], ] for i in range(5)]
dataset = pd.DataFrame(dataset, columns=['test'])  

empty_numpy = np.arange(25).reshape(5, 5)
empty_numpy.fill(np.nan)

# Solution 1: change the below code into 'empty_frame = pd.DataFrame(empty_numpy)' then everything will be fine
empty_frame = pd.DataFrame(empty_numpy, columns=[str(i) for i in range(5)])

series = dataset['test']
target_frame = pd.DataFrame(list(series))

# Solution 2: run `empty_frame[:] = target_frame` twice, work fine to me.
# ==================================================================
# First try.
empty_frame[:] = target_frame
print("="*40)
print(f"Data types of empty_frame: {empty_frame.dtypes}")
print("="*40)

print("Result of first try: ")
print(empty_frame)
print("="*40)


# Second try.
empty_frame[:] = target_frame

print(f"Data types of empty_frame: {empty_frame.dtypes}")
print("="*40)

print("Result of second try: ")
print(empty_frame)
print("="*40)
# ====================================================================

I expected the first attempt to print:

========================================
Data types of empty_frame: 0    int64
1    int64
2    int64
3    int64
4    int64
dtype: object
========================================
Result of first try: 
   0  1  2  3  4
0  0  1  2  3  4
1  0  1  2  3  4
2  0  1  2  3  4
3  0  1  2  3  4
4  0  1  2  3  4
========================================

But the first attempt does not work.

There are two workarounds, but I don't know why they work:

Run the assignment twice, as shown in my code.
Remove the column names when creating empty_frame.

I want to figure out two things:

Why the dtypes of empty_frame change.
Why the workarounds shown in my code fix this assignment problem.

Thanks.

Answer 1

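If I understand your question correctly, the problem starts when you create the empty_numpy matrix. My preferred solution is to use empty_numpy = np.empty([5,5]) instead (the default dtype here is float64). Then the "Result of first try:" output is correct. That means: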

import pandas as pd
import numpy as np

dataset = [[[i for i in range(5)],] for i in range(5)]
dataset = pd.DataFrame(dataset, columns=['test'])  

empty_numpy = np.empty([5,5])
# Optionally add empty_numpy.fill(np.nan) here; it is not necessary, the result is the same

empty_frame = pd.DataFrame(empty_numpy, columns=[str(i) for i in range(5)])

series = dataset['test']
target_frame = pd.DataFrame(list(series))

# following assignment is correct then
empty_frame[:] = target_frame
print('='*40)
print(f'Data types of empty_frame: {empty_frame.dtypes}')
print('='*40)

print("Result of first try: ")
print(empty_frame)
print("="*40)
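As a quick check of the dtype point above, the sketch below compares the dtypes of the arrays involved. The names int_grid, empty_grid and nan_grid are illustrative only, and np.full does not appear in the original code; it is shown here as one possible way to get a float64 array that is already filled with NaN.

```python
import numpy as np

# Integer dtype (int32 or int64, depending on platform and NumPy version)
int_grid = np.arange(25).reshape(5, 5)

# float64 by default; the contents are uninitialized
empty_grid = np.empty([5, 5])

# float64, with every cell set to NaN (assumption: not used in the original code)
nan_grid = np.full((5, 5), np.nan)

print(int_grid.dtype, empty_grid.dtype, nan_grid.dtype)
print(np.isnan(nan_grid).all())
```

An integer array has no representation for NaN, which is why starting from a float array matters here.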

Alternatively, just add the dtype argument to your np.arange call, like this:

empty_numpy = np.arange(25, dtype=float).reshape(5, 5)

Then it works too (though it's a bit boring ;o).
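Putting the dtype fix together with the question's setup, a minimal sketch might look like the following. Assigning target_frame.to_numpy() rather than the frame itself is an assumption of this sketch, not part of the answer above: it copies the values by position, so the differing column labels ('0'..'4' in empty_frame versus 0..4 in target_frame) do not come into play.

```python
import numpy as np
import pandas as pd

# Same setup as in the question: one column holding a list per row
dataset = pd.DataFrame([[[i for i in range(5)]] for _ in range(5)], columns=['test'])
target_frame = pd.DataFrame(list(dataset['test']))

# Float array, so filling with NaN is valid
empty_numpy = np.arange(25, dtype=float).reshape(5, 5)
empty_numpy.fill(np.nan)
empty_frame = pd.DataFrame(empty_numpy, columns=[str(i) for i in range(5)])

# Assumption of this sketch: assign the raw values, not the labeled frame
empty_frame[:] = target_frame.to_numpy()
print(empty_frame)
```

Whether the values end up stored as integers or floats may depend on the pandas version, so it is worth printing empty_frame.dtypes after the assignment if the dtype matters downstream.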
