Appending multiple columns of a df to a list

Time: 2018-12-03 04:26:50

Tags: python pandas dataframe matplotlib append

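I am trying to import xy coordinates from multiple columns into a list. I can get it working when the coordinates are read from a single column, but I am struggling to do it efficiently across multiple columns.

I need this for plotting.

Attempt: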

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

d = ({
    'Time' : [1,2,3,4,5,6,7,8],       
    'GrA1_X' : [10,12,17,16,16,14,12,8],                 
    'GrA1_Y' : [10,12,13,7,6,7,8,8], 
    'GrA2_X' : [5,8,13,16,19,15,13,5],                 
    'GrA2_Y' : [6,15,12,7,8,9,10,8],
    'GrB1_X' : [15,18,25,16,6,15,17,10],                 
    'GrB1_Y' : [7,12,5,9,10,12,18,9], 
    'GrB2_X' : [10,4,18,14,16,12,17,4],                 
    'GrB2_Y' : [8,12,16,8,10,14,12,15],         
     })

df = pd.DataFrame(data=d)

GrA_X = df[df.columns[1::2][:2]]
GrA_Y = df[df.columns[2::2][:2]]

GrB_X = df[df.columns[5::2][:2]]
GrB_Y = df[df.columns[6::2][:2]]

fig = plt.figure(figsize=(10,6))
ax = plt.gca()

Zs = []
for l,color in zip('AB', ('red', 'blue')):
    # plot all of the points from a single group
    ax.plot(GrA_X.iloc[0], GrA_Y.iloc[0], '.', c='red', ms=15, label=l, alpha = 0.5)
    ax.plot(GrB_X.iloc[0], GrB_Y.iloc[0], '.', c='blue', ms=15, label=l, alpha = 0.5)    

    Zrows = []
    for _,row in df.iterrows():
        x,y = row['Gr%s_X'%l], row['Gr%s_Y'%l]

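I'm confused about the Zrows = [] part. Specifically, how do I append multiple columns to this list?

2 Answers:

Answer 0: (score: 1)

Assuming I have understood your question correctly, this might be an alternative solution.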

df = pd.DataFrame(data=d)
X = [df[c].tolist() for c in df.columns if c.find("_X") >= 0]
Y = [df[c].tolist() for c in df.columns if c.find("_Y") >= 0]

allX = [x for sublist in X for x in sublist]
allY = [y for sublist in Y for y in sublist]

dfXY = pd.DataFrame({"X": allX, "Y":allY})

Now you have all the x and y values in one simple DataFrame. Cheers
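One thing to be aware of: flattening everything into two columns drops the information about which group (A or B) each point came from, which you may want for colouring the plot. Below is a minimal sketch of a variant that keeps a group label and the time stamp. It uses only the first three rows of the sample data, the names x_cols, y_cols, frames and long_df are made up for illustration, and it assumes every column name follows the Gr<letter><number>_X / _Y pattern with X and Y columns in matching order.

```python
import pandas as pd

# First three rows of the sample data from the question.
d = {
    'Time':   [1, 2, 3],
    'GrA1_X': [10, 12, 17], 'GrA1_Y': [10, 12, 13],
    'GrA2_X': [5, 8, 13],   'GrA2_Y': [6, 15, 12],
    'GrB1_X': [15, 18, 25], 'GrB1_Y': [7, 12, 5],
    'GrB2_X': [10, 4, 18],  'GrB2_Y': [8, 12, 16],
}
df = pd.DataFrame(d)

# X and Y column names, in their original (matching) order.
x_cols = [c for c in df.columns if c.endswith('_X')]
y_cols = [c for c in df.columns if c.endswith('_Y')]

frames = []
for xc, yc in zip(x_cols, y_cols):
    frames.append(pd.DataFrame({
        'Time': df['Time'],
        'Group': xc[2],   # third character: 'A' or 'B' (assumes names like GrA1_X)
        'X': df[xc],
        'Y': df[yc],
    }))

# One long frame with a row per (time, column pair).
long_df = pd.concat(frames, ignore_index=True)
print(long_df)
```

With the group kept as a column, something like long_df[long_df['Group'] == 'A'] gives all the points of one group, which can then be handed to a single plot call.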

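Answer 1: (score: 0)

I'm still not sure I have correctly understood the purpose of the last loop, but I'll take a chance and propose a solution. Your problem seems to come down to how the columns are named, and I don't see any way around iterating over the different variants of Gr{A,B}{1,2}_{X,Y} (the {} here denote the variants). I would use a nested for loop, i.e.: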

Zs = []
for l,color in zip('AB', ('red', 'blue')):
    # plot all of the points from a single group
    ax.plot(GrA_X.iloc[0], GrA_Y.iloc[0], '.', c='red', ms=15, label=l, alpha = 0.5)
    ax.plot(GrB_X.iloc[0], GrB_Y.iloc[0], '.', c='blue', ms=15, label=l, alpha = 0.5)    

    Zrows = []
    for _,row in df.iterrows():
        for i in [1,2]:
            x,y = row['Gr{}{}_X'.format(l,i)], row['Gr{}{}_Y'.format(l,i)]
            Zrows.append((x,y))


print(Zrows)

This gives a list of coordinate tuples (here for group B, the last pass of the outer loop):

[(15, 7), (10, 8), (18, 12), (4, 12), (25, 5), (18, 16), (16, 9), (14, 8), (6, 10), (16, 10), (15, 12), (12, 14), (17, 18), (17, 12), (10, 9), (4, 15)]

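Note that I changed the string formatting to the format() syntax, which I think greatly improves readability. Please let me know if this is close to what you need.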
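Because Zrows is reset on every pass of the outer loop, only the last group's points are left once the loop finishes. If you need the points for both groups, one option is to store each group's list under its letter. This is only a rough sketch under that assumption: it uses the first two rows of the sample data, leaves the plotting out, and the dict name points is made up for illustration.

```python
import pandas as pd

# First two rows of the sample data from the question.
df = pd.DataFrame({
    'Time':   [1, 2],
    'GrA1_X': [10, 12], 'GrA1_Y': [10, 12],
    'GrA2_X': [5, 8],   'GrA2_Y': [6, 15],
    'GrB1_X': [15, 18], 'GrB1_Y': [7, 12],
    'GrB2_X': [10, 4],  'GrB2_Y': [8, 12],
})

points = {}  # group letter -> list of (x, y) tuples
for group in 'AB':
    rows = []
    for _, row in df.iterrows():
        for i in (1, 2):
            x = row['Gr{}{}_X'.format(group, i)]
            y = row['Gr{}{}_Y'.format(group, i)]
            rows.append((int(x), int(y)))  # plain ints for readable output
    points[group] = rows

print(points['A'])
print(points['B'])
```

Each group can then be drawn in one call, for example by unpacking with zip(*points['A']) into separate x and y sequences before passing them to ax.plot.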