Let's first build a ctable:
import pandas as pd
import blaze as bl
df = pd.DataFrame({'x': range(4), 'y': [2., 4., 2., 4.]})
bl.odo(df, 'test.bcolz')
Now suppose I want to add a column named 'x_mod' to this table. I tried:
test_table = bl.Data('test.bcolz')
def f(h):
    return h*3
test_table['x_mod'] = test_table['x'].apply(f, dshape='int64')
#Or, I think equivalently:
#test_table['x_mod'] = test_table['x']*3
but it gives:
TypeError: 'InteractiveSymbol' object does not support item assignment
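1) How can I assign the 'x_mod' column and then save it to disk?
I am working with large databases: computing the column in memory should be fine, but I cannot load the entire ctable into memory.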
2) On a related note, apply does not work for me either. Am I doing something wrong?
#This doesn't work:
bl.compute(test_table['x'].apply(f, dshape='int64'))
#This I think should be equivalent, but does work:
bl.compute(test_table['x']*3)
Thanks for your time!
Answer 0: (score: 1)
You can use Blaze's transform function like this (here bz is the blaze module, which the question imports as bl):
bz.transform(df, sepal_ratio = df.sepal_length / df.sepal_width )
For other functions, you need to build a Blaze expression:
bz.transform(df, sepal_ratio = BLAZE_symbolic_Expression(df.Col1, df.col2) )
This returns the table with the computed column added. The docs are here: https://blaze.readthedocs.io/en/latest/expressions.html
For example, you can use map:
from datetime import datetime
yourexpr = df.col1.map(datetime.utcfromtimestamp)
bz.transform(df, sepal_ratio=yourexpr)
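If Blaze is not an option, the underlying idea from the question (compute the new column without holding the whole table in memory, then save the result to disk) can be sketched with plain pandas. This is only an illustration under stated assumptions: it uses a CSV file as a stand-in for the bcolz ctable, it is not the Blaze or bcolz API, and the file names, the helper add_column_in_chunks, and the chunk size are all made up for the example.

```python
import os
import tempfile

import pandas as pd


def f(h):
    # Same function as in the question.
    return h * 3


def add_column_in_chunks(src_csv, dst_csv, chunksize=2):
    """Stream src_csv in chunks, add 'x_mod', and append each chunk to dst_csv.

    Hypothetical helper: only one chunk is held in memory at a time.
    """
    first = True
    for chunk in pd.read_csv(src_csv, chunksize=chunksize):
        # assign returns a new DataFrame with the extra column.
        chunk = chunk.assign(x_mod=f(chunk['x']))
        # Write the header once, then append the remaining chunks.
        chunk.to_csv(dst_csv, mode='w' if first else 'a',
                     header=first, index=False)
        first = False


# Build the same small table as in the question, stored as CSV (assumption).
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'test.csv')
dst = os.path.join(tmpdir, 'test_with_x_mod.csv')
pd.DataFrame({'x': range(4), 'y': [2., 4., 2., 4.]}).to_csv(src, index=False)

add_column_in_chunks(src, dst)
result = pd.read_csv(dst)
```

Reading result back is only done here to check the output; for a table that really does not fit in memory you would keep working from the file on disk.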