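I am trying to select a subset of rows and columns from a pandas DataFrame, which I eventually want to plot. My data is currently structured like this: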
0 2 3 ... 177 178 Timestamp
1 ...
6:54:36 7/26/2019 -35.0 -34.75 ... 8.75 9.0 06:54:36
500 a 7/26/2019 3880.0 4068.00 ... 4562.00 4398.0 06:54:36
500 b 7/26/2019 3462.0 3458.00 ... 3604.00 3718.0 06:54:36
600 a 7/26/2019 NaN NaN ... NaN NaN 06:54:36
600 b 7/26/2019 NaN NaN ... NaN NaN 06:54:36
700 a 7/26/2019 3462.0 3684.00 ... 3821.00 3800.0 06:54:36
700 b 7/26/2019 4290.0 4414.00 ... 4303.00 4336.0 06:54:36
900 a 7/26/2019 2863.0 3059.00 ... 3075.00 3313.0 06:54:36
900 b 7/26/2019 4480.0 4632.00 ... 4873.00 4843.0 06:54:36
1000 a 7/26/2019 NaN NaN ... 4426.00 4751.0 06:54:36
1000 b 7/26/2019 NaN NaN ... 4388.00 4239.0 06:54:36
6:54:40 7/26/2019 -35.0 -34.75 ... 8.75 9.0 06:54:40
500 a 7/26/2019 3995.0 4056.00 ... 4571.00 4480.0 06:54:40
500 b 7/26/2019 3837.0 3974.00 ... 3720.00 3619.0 06:54:40
600 a 7/26/2019 NaN NaN ... NaN NaN 06:54:40
600 b 7/26/2019 NaN NaN ... NaN NaN 06:54:40
700 a 7/26/2019 3501.0 3468.00 ... 3897.00 3911.0 06:54:40
700 b 7/26/2019 4422.0 4331.00 ... 4737.00 4505.0 06:54:40
900 a 7/26/2019 2681.0 2749.00 ... 3375.00 3269.0 06:54:40
900 b 7/26/2019 4542.0 4602.00 ... 4505.00 4442.0 06:54:40
1000 a 7/26/2019 NaN NaN ... NaN NaN 06:54:40
1000 b 7/26/2019 NaN NaN ... NaN NaN 06:54:40
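I want to plot the a values and the b values from columns 2-178 on two separate plots (an a plot and a b plot), and I want to do this for each time period. Eventually I would like to click through the plots one timestamp at a time to see the change over time (like a plotting GUI). I need to extract the selected columns for each group of timestamps based on the time and the index name. For example, I want: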
a500 = [3880.0 4068.00 ... 4562.00 4398.0]
a600 = [NaN NaN ... NaN NaN]
a700 = [3462.0 3684.00 ... 3821.00 3800.0]
a900 = [2863.0 3059.00 ... 3075.00 3313.0]
a1000 = [ NaN NaN ... 4426.00 4751.0]
On a button click, I would like these to update to:
a500 = [3995.0 4056.00 ... 4571.00 4480.0]
a600 = [NaN NaN ... NaN NaN]
a700 = [ 3501.0 3468.00 ... 3897.00 3911.0]
a900 = [2681.0 2749.00 ... 3375.00 3269.0]
a1000 = [ NaN NaN ... NaN NaN]
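I will not know the timestamps in advance. The row structure should be consistent throughout the DataFrame (a row with the time and its associated values, then alternating a and b rows, then the same pattern repeated for the next time value). I want to keep the NaNs, because they are not zeros and I do not want them plotted as zero.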
I tried using .loc to look up rows that start with the desired value (e.g. a500=data.loc['500 a']), but it raises an error (e.g. KeyError: '500 a').
TL;DR: I need help selecting a subset of rows from a pandas DataFrame based on a column.
Answer 0 (score: 0)
It took a lot of time, but I did manage to get .iloc working:
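# n is the position of the first a/b row in a block (row 0 is the time row)
n = 1
m = n + 10  # each timestamp has 10 a/b rows
subdf = df.iloc[n:m]
# keep only the measurement columns, selected by position
newdf = subdf[subdf.columns[1:178].tolist()]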
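This solution works for me because I know this DataFrame has repeating row labels and a fixed number of columns. The n and m values are placeholders for when I eventually iterate over sections of the DataFrame to plot them. So the solution works as long as the number of rows associated with each value is constant (e.g. I have 10 rows for every new timestamp).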
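The iteration described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the helper name `iter_blocks`, the `rows_per_block` parameter, and the miniature `demo` frame (one time row plus four a/b rows per block, three value columns) are hypothetical and stand in for the real data, where each block would be one time row plus 10 a/b rows. It relies purely on row positions via `.iloc`, so it only holds while every timestamp has the same number of rows.

```python
import numpy as np
import pandas as pd


def iter_blocks(df, rows_per_block):
    """Yield (time_row, a_rows, b_rows) for each fixed-size block of rows."""
    for start in range(0, len(df), rows_per_block):
        block = df.iloc[start:start + rows_per_block]
        time_row = block.iloc[0]      # first row of the block is the time row
        data_rows = block.iloc[1:]    # remaining rows alternate a, b, a, b, ...
        yield time_row, data_rows.iloc[0::2], data_rows.iloc[1::2]


# Hypothetical miniature of the layout: 1 time row + 4 a/b rows per block.
labels = ["6:54:36", "500 a", "500 b", "600 a", "600 b",
          "6:54:40", "500 a", "500 b", "600 a", "600 b"]
values = [
    [-35.0, -34.75, 9.0],
    [3880.0, 4068.0, 4398.0],
    [3462.0, 3458.0, 3718.0],
    [np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan],
    [-35.0, -34.75, 9.0],
    [3995.0, 4056.0, 4480.0],
    [3837.0, 3974.0, 3619.0],
    [np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan],
]
demo = pd.DataFrame(values, index=labels, columns=[2, 3, 178])

blocks = list(iter_blocks(demo, rows_per_block=5))
time_row, a_rows, b_rows = blocks[1]   # second timestamp block
a500 = a_rows.iloc[0].to_numpy()       # NaNs stay NaN; nothing is filled with 0
```

Stepping through `blocks` by index (for example from a button callback) would then give the a rows and b rows for one timestamp at a time, without needing to know the timestamps in advance.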