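For a pandas newbie like me, this probably has a simple solution:
I'm trying to replace one record in a pandas DataFrame (df) with the latest version of that label, which is found in a separate DataFrame (latest_version).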
df.ix[label] = latest_version.ix[label]
Error:
AttributeError: 'unicode' object has no attribute 'view'
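df itself is large and complex (and proprietary), so I'd like to avoid posting it if I can; I was hoping this would be something easy, but I can't figure it out.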
Edit: output of df.info() and latest_version.info():
ipdb> df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 7 entries, A to G
Data columns (total 73 columns):
Column 0 7 non-null object
Column 1 7 non-null object
Column 2 7 non-null object
Column 3 7 non-null object
Column 4 7 non-null object
Column 5 7 non-null float64
Column 6 1 non-null object
Column 7 7 non-null object
Column 8 7 non-null object
Column 9 6 non-null datetime64[ns]
Column 10 0 non-null object
Column 11 0 non-null object
Column 12 5 non-null object
Column 13 0 non-null object
Column 14 0 non-null object
Column 15 6 non-null datetime64[ns]
Column 16 0 non-null object
Column 17 0 non-null object
Column 18 0 non-null object
Column 19 0 non-null object
Column 20 0 non-null object
Column 21 0 non-null object
Column 22 0 non-null object
Column 23 0 non-null object
Column 24 0 non-null object
Column 25 0 non-null object
Column 26 0 non-null object
Column 27 0 non-null object
Column 28 0 non-null object
Column 29 0 non-null object
Column 30 0 non-null object
Column 31 0 non-null object
Column 32 0 non-null object
Column 33 0 non-null object
Column 34 0 non-null object
Column 35 0 non-null object
Column 36 0 non-null object
Column 37 4 non-null object
Column 38 6 non-null object
Column 39 4 non-null object
Column 40 0 non-null object
Column 41 0 non-null object
Column 42 0 non-null object
Column 43 6 non-null object
Column 44 0 non-null object
Column 45 6 non-null object
Column 46 0 non-null object
Column 47 4 non-null object
Column 48 0 non-null object
Column 49 4 non-null object
Column 50 0 non-null object
Column 51 0 non-null object
Column 52 0 non-null object
Column 53 0 non-null object
Column 54 0 non-null object
Column 55 0 non-null object
Column 56 0 non-null object
Column 57 0 non-null object
Column 58 0 non-null object
Column 59 0 non-null object
Column 60 0 non-null object
Column 61 0 non-null object
Column 62 0 non-null object
Column 63 0 non-null object
Column 64 0 non-null object
Column 65 0 non-null object
Column 66 0 non-null object
Column 67 0 non-null object
Column 68 0 non-null object
Column 69 0 non-null object
Column 70 0 non-null object
Column 71 0 non-null object
Column 72 0 non-null object
dtypes: datetime64[ns](2), float64(1), object(70)
ipdb> latest_version.info()
<class 'pandas.core.frame.DataFrame'>
Index: 4 entries, A to D
Data columns (total 73 columns):
Column 0 4 non-null object
Column 1 4 non-null object
Column 2 4 non-null object
Column 3 4 non-null object
Column 4 4 non-null object
Column 5 4 non-null int64
Column 6 4 non-null object
Column 7 4 non-null object
Column 8 4 non-null object
Column 9 4 non-null object
Column 10 4 non-null object
Column 11 4 non-null object
Column 12 4 non-null object
Column 13 4 non-null object
Column 14 4 non-null object
Column 15 4 non-null object
Column 16 3 non-null object
Column 17 4 non-null object
Column 18 4 non-null object
Column 19 4 non-null object
Column 20 3 non-null object
Column 21 3 non-null object
Column 22 4 non-null object
Column 23 4 non-null object
Column 24 4 non-null object
Column 25 4 non-null object
Column 26 4 non-null object
Column 27 4 non-null object
Column 28 4 non-null object
Column 29 4 non-null object
Column 30 4 non-null object
Column 31 4 non-null object
Column 32 4 non-null object
Column 33 4 non-null object
Column 34 4 non-null object
Column 35 4 non-null object
Column 36 4 non-null object
Column 37 4 non-null object
Column 38 4 non-null object
Column 39 4 non-null object
Column 40 4 non-null object
Column 41 4 non-null object
Column 42 4 non-null object
Column 43 4 non-null object
Column 44 4 non-null object
Column 45 4 non-null float64
Column 46 4 non-null object
Column 47 4 non-null object
Column 48 4 non-null object
Column 49 4 non-null object
Column 50 4 non-null object
Column 51 4 non-null object
Column 52 4 non-null object
Column 53 4 non-null object
Column 54 4 non-null object
Column 55 4 non-null object
Column 56 1 non-null object
Column 57 1 non-null object
Column 58 4 non-null object
Column 59 4 non-null object
Column 60 4 non-null object
Column 61 4 non-null object
Column 62 4 non-null object
Column 63 4 non-null object
Column 64 4 non-null object
Column 65 4 non-null object
Column 66 4 non-null object
Column 67 4 non-null object
Column 68 4 non-null object
Column 69 4 non-null object
Column 70 4 non-null object
Column 71 4 non-null object
Column 72 4 non-null object
dtypes: float64(1), int64(1), object(71)
Further edit (in response to Ed): here is a table containing only the columns whose types differ:
ipdb> latest_version.ix[:,[5,9,15]]
                         line_number  entry_date entry_ref_a
unique_index
NEW/AAAAAAAAAAAAAAAAAAA            0  2014-12-30  2015-01-14
NEW/AAAAAAAAAAAAAAAAAAB            1  2014-12-30
NEW/AAAAAAAAAAAAAAAAAAC            2  2014-12-30
ipdb> df.ix[:,[5,9,15]]
                         line_number           entry_date  \
unique_index
OLD/204442                         0  1419897600000000000
OLD/343278                         1  1419897600000000000
OLD/359628                         2  1419897600000000000
NEW/AAAAAAAAAAAAAAAAAAA            0           2014-12-30

                                 entry_ref_a
unique_index
OLD/204442               1421193600000000000
OLD/343278               1421193600000000000
OLD/359628               1422230400000000000
NEW/AAAAAAAAAAAAAAAAAAA           2015-01-14
There is definitely a type mismatch going on here...
Answer 0 (score: 1)
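So your problem appears to be a dtype mismatch between the two DataFrames you are assigning between:
df dtypes: datetime64[ns](2), float64(1), object(70)
while
latest_version dtypes: float64(1), int64(1), object(71)
From the output we can see that some of the clashing columns are datetime64[ns] in df but not in latest_version. According to the info() output, columns 9 and 15 are object dtype there, and column 5 is float64 in df but int64 in latest_version.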
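To see exactly which columns disagree without reading two long info() listings, the dtypes can be compared column by column. This is only a minimal sketch: the two small frames are hypothetical stand-ins for df and latest_version (the column names line_number and entry_date come from the question, the name column and all values are made up), and it assumes both frames share the same column labels.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
df = pd.DataFrame(
    {
        "line_number": [0.0, 1.0],                                   # float64
        "entry_date": pd.to_datetime(["2014-12-30", "2014-12-30"]),  # datetime64
        "name": ["a", "b"],                                          # strings
    },
    index=["OLD/1", "OLD/2"],
)
latest_version = pd.DataFrame(
    {
        "line_number": [0, 1],                       # int64
        "entry_date": ["2014-12-30", "2014-12-30"],  # strings, not datetimes
        "name": ["c", "d"],                          # strings
    },
    index=["NEW/1", "NEW/2"],
)

# Collect the columns whose dtype differs between the two frames.
mismatched = [
    col for col in df.columns
    if df[col].dtype != latest_version[col].dtype
]

for col in mismatched:
    print(col, df[col].dtype, latest_version[col].dtype)
```

Only the columns listed in mismatched need converting before the assignment is attempted.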
You can convert the malformed columns to datetime by doing the following:
df['entry_date'] = pd.to_datetime(df['entry_date'])
The same applies to entry_ref_a.
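Putting the pieces together, here is a rough sketch of the whole replacement, assuming the mismatch is the only problem. The frames and values are hypothetical; only the column names line_number and entry_date are taken from the question. It uses .loc rather than .ix, since .ix was deprecated and later removed from pandas, and it converts the source frame's columns to the dtypes of the target frame before assigning the row.

```python
import pandas as pd

# Hypothetical target frame: float and datetime columns, as in df.
df = pd.DataFrame(
    {
        "line_number": [0.0, 1.0],
        "entry_date": pd.to_datetime(["2014-12-29", "2014-12-29"]),
    },
    index=["A", "B"],
)

# Hypothetical source frame: same columns, but int and string dtypes.
latest_version = pd.DataFrame(
    {
        "line_number": [5],
        "entry_date": ["2014-12-30"],
    },
    index=["A"],
)

# Align the source dtypes with the target before assigning.
latest_version["entry_date"] = pd.to_datetime(latest_version["entry_date"])
latest_version["line_number"] = latest_version["line_number"].astype(
    df["line_number"].dtype
)

# Replace the record for one label with its latest version.
label = "A"
df.loc[label] = latest_version.loc[label]

print(df)
```

Whether to convert latest_version or df depends on which frame holds the malformed values; the sketch converts the source so that the target's dtypes are the ones that are kept.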