How to add a new item to a pandas Series without dropping the other items

Time: 2016-01-16 04:41:49

Tags: python pandas

I have the following pandas Series.

new_orders_list
Out[853]: 
Cluster 1    [525, 526, 533]
Cluster 2    [527, 528, 532]
Cluster 3    [519, 534, 535]
Cluster 4              [530]
Cluster 5         [529, 531]
Cluster 6    [520, 521, 524]

After some slicing of a DataFrame, I also have two more Series.

condition
Out[854]: 
5    525
Name: order_id, dtype: object

condition2
Out[855]: 
Clusters
Cluster 6    1
Name: quant_bought, dtype: int64

Now I want to add the value 525 from the condition Series to new_orders_list at Cluster 6 (the index from the condition2 Series), and remove 525 from Cluster 1. The result should look like this:

Cluster 1    [526, 533]
Cluster 2    [527, 528, 532]
Cluster 3    [519, 534, 535]
Cluster 4              [530]
Cluster 5         [529, 531]
Cluster 6    [520, 521, 524, 525]

I am doing the following in Python, but it appends a new row after the previously stored values instead.

new_orders_list.append(pd.Series(condition.values, index=condition2.index))

Cluster 1    [525, 526, 533]
Cluster 2    [527, 528, 532]
Cluster 3    [519, 534, 535]
Cluster 4              [530]
Cluster 5         [529, 531]
Cluster 6    [520, 521, 524]
Cluster 6                525

1 answer:

Answer 0 (score: 1)

You can try this solution.

First, create a new Series holding the data to remove, called remseries.

The values in the lists of the new_orders_list Series are integers, while the values in the other Series are strings, so all values are converted to strings.
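The type mismatch can be checked with a small sketch. The data here is rebuilt by hand from the printed output above, and it assumes that condition really holds the string '525' (its dtype is object, which suggests this but does not prove it).

```python
import pandas as pd

# Rebuilt by hand from the printed output; assumes condition holds a string.
condition = pd.Series(['525'], index=[5], name='order_id')
cluster_1 = [525, 526, 533]  # one of the lists stored in new_orders_list

value = condition.iloc[0]
print(type(value).__name__)     # str
print(value in cluster_1)       # False: '525' never equals the integer 525
print(int(value) in cluster_1)  # True once both sides are integers
```

Either side can be converted; the answer converts the lists to strings and back, while converting the single value to int would also make the comparison work.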

Then select the matching rows with isin, and add and remove the values.

print(new_orders_list)

Clusters
Cluster 1    [525, 526, 533]
Cluster 2    [527, 528, 532]
Cluster 3    [519, 534, 535]
Cluster 4              [530]
Cluster 5         [529, 531]
Cluster 6    [520, 521, 524]
Name: no, dtype: object

print(condition)

5    525
Name: order_id, dtype: object

print(condition2)

Clusters
Cluster 6    1
Name: quant_bought, dtype: int64

# create a new Series with the value to remove, indexed by the cluster to remove it from
remseries = pd.Series(condition.values, index=['Cluster 1'], name='rem')
print(remseries)

Cluster 1    525
Name: rem, dtype: object
    
# create a DataFrame from the Series
df = new_orders_list.reset_index()
print(df)

    Clusters               no
0  Cluster 1  [525, 526, 533]
1  Cluster 2  [527, 528, 532]
2  Cluster 3  [519, 534, 535]
3  Cluster 4            [530]
4  Cluster 5       [529, 531]
5  Cluster 6  [520, 521, 524]

#convert values in list from int to string
df['no'] = df['no'].apply(lambda x: [str(i) for i in x])

# add the value to the rows named in condition2, then remove it from the rows named in remseries
df.loc[df['Clusters'].isin(condition2.index.tolist()), 'no'] = (
    df['no'].apply(lambda x: x + condition.values.tolist()))

df.loc[df['Clusters'].isin(remseries.index.tolist()), 'no'] = (
    df['no'].apply(lambda x: [k for k in x if k != ''.join(remseries.values)]))

# check the types of the values in the first list
print([type(x) for x in df['no'][0]])

[<type 'str'>, <type 'str'>]

# convert the values in each list from string back to int
df['no'] = df['no'].apply(lambda x: [int(i) for i in x])
print(df)

    Clusters                    no
0  Cluster 1            [526, 533]
1  Cluster 2       [527, 528, 532]
2  Cluster 3       [519, 534, 535]
3  Cluster 4                 [530]
4  Cluster 5            [529, 531]
5  Cluster 6  [520, 521, 524, 525]

# check the types of the values in the first list
print([type(x) for x in df['no'][0]])

[<type 'int'>, <type 'int'>]
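As an alternative to the string round trip, the same result can be sketched directly on the Series by converting the single value to an integer. This is not the method used above, only a minimal sketch under stated assumptions: the data is rebuilt by hand from the printed output, condition is assumed to hold the string '525', and move_item and source are hypothetical names introduced for illustration.

```python
import pandas as pd

# Data rebuilt by hand from the printed output above.
new_orders_list = pd.Series(
    [[525, 526, 533], [527, 528, 532], [519, 534, 535],
     [530], [529, 531], [520, 521, 524]],
    index=pd.Index(['Cluster %d' % i for i in range(1, 7)], name='Clusters'),
    name='no',
)
condition = pd.Series(['525'], index=[5], name='order_id')  # assumed to be a string
condition2 = pd.Series([1], index=pd.Index(['Cluster 6'], name='Clusters'),
                       name='quant_bought')


def move_item(lists, item, source, target):
    """Hypothetical helper: return a new Series of lists with `item`
    removed from the `source` row and appended to the `target` row."""
    moved = []
    for label, values in lists.items():
        # Copy each list, dropping the item only in the source row.
        values = [v for v in values if not (label == source and v == item)]
        if label == target:
            values = values + [item]
        moved.append(values)
    return pd.Series(moved, index=lists.index, name=lists.name)


order_id = int(condition.iloc[0])  # convert once instead of converting every list
source = 'Cluster 1'               # hypothetical: the cluster to remove from
target = condition2.index[0]       # 'Cluster 6'

result = move_item(new_orders_list, order_id, source, target)
print(result)
```

Building new lists instead of mutating the stored ones leaves new_orders_list unchanged, which makes it easier to compare the result with the original.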