Question

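I'm trying to write a simple program that reads several datasets (all the same length) from a CSV file and automatically plots all of them as a Pandas DataFrame scatter plot. My current code does this, but all the markers are the same color (blue). I'd like to figure out how to set up a colormap so that, if I later have a much larger dataset (say, 100+ different X-Y pairs), each series is automatically colored as it is plotted. Ultimately, I'd like this to be a quick and easy tool to run from the command line. I haven't had any luck with the documentation or Stack Exchange, so hopefully this isn't a duplicate!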

I have tried the suggestions from these posts:

1) Setting different color for each series in scatter plot on matplotlib

2) https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.scatter.html

3) https://matplotlib.org/users/colormaps.html

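However, the first one essentially groups the data points by their position on the x-axis and gives each group the same color (which is not what I want, since each data series is roughly a linearly increasing function). The second and third links seem to work, but I don't like the available colormap choices (e.g. "viridis": many of the colors are too similar, which makes the data points hard to tell apart).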

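Here is a simplified version of my code so far (I removed the lines that automatically name the axes, etc., to make it easier to read). I also removed all of my attempts at specifying a colormap, to give more of a blank canvas: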

''' Importing multiple scatter data and plotting '''

import pandas as pd
import matplotlib.pyplot as plt

### Data file path (please enter Dataframe however you like)
path = r'/Users/.../test_data.csv'

### Read in data CSV
data = pd.read_csv(path)

### List of headers
header_list = list(data)

### Set data type to float so modified data frame can be plotted
data = data.astype(float)

### X-axis limits
xmin = 1e-4
xmax = 3e-3

## Create subplots to be plotted together after loop
fig, ax = plt.subplots()

### Since there are multiple X-axes (every other column), this loop steps through the columns two at a time and plots each x-y column pair

for i in range(0, len(header_list), 2):

    dfplot = data.plot.scatter(x=header_list[i], y=header_list[i + 1], ax=ax)

    dfplot.set_xlim(xmin, xmax) # Setting limits on X axis

plt.show()
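For reference, one direction that might give each series its own color is to sample a colormap at evenly spaced points, one sample per x-y pair, and pass each sample to `plot.scatter` through its `color` argument. The following is only a sketch under assumptions: the helper names `series_colors` and `plot_xy_pairs` are hypothetical, and `gist_rainbow` is just one colormap with a wide hue range, not a recommendation from the linked posts. With 100+ series, neighboring colors will still look similar in any colormap.

```python
import numpy as np
import pandas as pd
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt


def series_colors(n_series, cmap_name="gist_rainbow"):
    """Return n_series hex color strings sampled evenly across a colormap."""
    cmap = plt.get_cmap(cmap_name)
    # Hex strings are passed on so a color is never mistaken for numeric data.
    return [mcolors.to_hex(cmap(v)) for v in np.linspace(0.0, 1.0, n_series)]


def plot_xy_pairs(data, ax=None, cmap_name="gist_rainbow"):
    """Scatter-plot column pairs (0, 1), (2, 3), ... with one color per pair."""
    if ax is None:
        _, ax = plt.subplots()
    headers = list(data)
    colors = series_colors(len(headers) // 2, cmap_name)
    # zip stops at the shorter input, so a trailing unpaired column is skipped.
    for color, i in zip(colors, range(0, len(headers), 2)):
        data.plot.scatter(x=headers[i], y=headers[i + 1], ax=ax,
                          color=color, label=headers[i + 1])
    return ax
```

With the `data` frame read in above, this would be called as `ax = plot_xy_pairs(data)`, followed by `ax.set_xlim(xmin, xmax)` and `plt.show()`. Since `label` adds every series to the legend, it would probably need to be dropped for very large numbers of series.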

The datasets can be found at the Google Drive link below. Thanks for your help!

https://drive.google.com/drive/folders/1DSEs8D7lIDUW4NIPBl2qW2EZiZxslGyM?usp=sharing

How do I change the color of scatter plot markers in a plotting loop using pandas?

0 answers: