Question

I can't get readable tick marks on my axes. The problem is that my data are on very different scales, so I'm not sure how to handle it.

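My data consist of about 400 different products from two machines, with 3 or 4 variables per product. I have already preprocessed it into a data.table and used gather to convert it to long format; that part works fine.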

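Overview: the data are discrete. Each X_________ on the x axis is a single reading, plotted with its value for Machine 1/2; the aim is to compare the two machines. The plot format suits my needs well. I just want a tick for every 10th product on the x axis and sensible values on the y axis.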

Y_1: from 150 to 250
Y_2: from 1.5* to 2.5
Y_3: from 0.8* to 2.3
Y_4: from 0.4* to 1.5

*bottom value, rounded

This is the code I am using so far:

var.Parameter <- c("Var1", "Var2", "Var3", "Var4")

MProduct$Parameter <- factor(MProduct$Parameter,
                          labels = var.Parameter)
labels_x <- MProduct$Lot[seq(0, 1626, by= 20)]
labels_y <- MProduct$Value[seq(0, 1626, by= 15)]


plot.MProduct <- ggplot(MProduct, aes(x = Lot,
                                y = Value,
                                colour = V4)) +
  facet_grid(Parameter ~.,
            scales = "free_y") + 
  scale_x_discrete(breaks=labels_x) +
  scale_y_discrete(breaks=labels_y) +
  geom_point() +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (angle = 90,
                                    hjust = 1,
                                    vjust = 0.5)) 
 # ggsave("MProduct.png")
plot.MProduct

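Does anyone know how to make this plot more readable? Setting the labels/breaks manually greatly limits flexibility and readability. There should be an option to put a tick at every Xth item, right? Same for y.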

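I need to apply this to multiple datasets, so I'm not happy about having to specify the column length of the "gathered" dataset (1626 in this case) every time.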

While I'm here, I'd also like to take the opportunity to ask about this line of code:

var.Parameter <- c("Var1", "Var2", "Var3", "Var4")

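I often need to label data in a specific order that is not necessarily alphabetical. However, R defaults to some odd behaviour, so I have to plot and then verify that the labels really are where they should be. Any clue how to force them to come out in the order I give? As it stands, my solution is to keep moving their positions in that line of code until the plot comes out right.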

Thanks a lot.

Answer 1

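OK. I'm going to ignore the y-axis labels, because the defaults seem to work fine as long as you don't try to override them with your custom labels_y. Just let the defaults do the work. For the x axis, here are a few options: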

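(A) Label every Nth product on the x axis. Looking at ?scale_x_discrete, we can set labels to a function that takes the breaks (by default, all the factor levels) and returns the labels you want. So we'll write a function, every_n_labeler, that returns a function that keeps every Nth label and blanks the rest: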

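# Returns a labelling function that keeps every nth label and blanks the rest
every_n_labeler = function(n = 3) {
  function (x) {
    # TRUE at positions 1, n + 1, 2n + 1, ...
    ind = (seq_along(x) - 1) %% n == 0
    x[!ind] = ""
    return(x)
  }
}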

Now let's use it as the labeller:

ggplot(df, aes(x = Lot,
               y = Value,
               colour = Machine)) +
  facet_grid(Parameter ~ .,
             scales = "free_y") +
  geom_point() +
  scale_x_discrete(labels = every_n_labeler(3)) +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (
    angle = 90,
    hjust = 1,
    vjust = 0.5
  ))

You can change every_n_labeler(3) to every_n_labeler(10) to label every 10th product.
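A variation on the same idea, in case you would rather drop the tick marks as well instead of only blanking their labels: as far as I know, scale_x_discrete also accepts a function for breaks, which receives the axis limits and returns the breaks to draw. The helper below, every_n_breaks, is a name I made up for this sketch (it is not part of ggplot2), and I have only tried it on a toy vector, not on your data.

```r
# Hypothetical helper (not part of ggplot2): returns a function that keeps
# every nth element of the vector it is given, starting with the first.
every_n_breaks <- function(n = 10) {
  function(x) {
    x[(seq_along(x) - 1) %% n == 0]
  }
}

# Toy lot names, made up for illustration
lots <- sprintf("X%03d", 1:25)
every_n_breaks(10)(lots)
```

Assuming your ggplot2 version accepts a function here, it would be used as scale_x_discrete(breaks = every_n_breaks(10)). Either way, nothing depends on the number of rows, so you don't have to hard-code 1626 for each dataset.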

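(B) Perhaps more appropriately: it looks like your x axis is really numeric and just happens to have an "X" in front of each value. Let's convert it to numeric and let the defaults do the labelling work: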

df$time = as.numeric(gsub(pattern = "X", replacement = "", x = df$Lot))

ggplot(df, aes(x = time,
               y = Value,
               colour = Machine)) +
  facet_grid(Parameter ~ .,
             scales = "free_y") +
  geom_point() +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (
    angle = 90,
    hjust = 1,
    vjust = 0.5
  ))

Over your full x range, I'd imagine that would look nice.
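Since you want to run this over several datasets, it may be worth guarding the conversion so that a lot that doesn't look like "X" followed by digits fails loudly instead of silently turning into NA. This is only a sketch: lot_to_time is a hypothetical helper name, and the "X then digits" pattern is my assumption based on the sample values.

```r
# Hypothetical helper: strip one leading "X" and convert to numeric,
# stopping if any lot does not match the assumed "X<digits>" pattern.
lot_to_time <- function(lot) {
  lot <- as.character(lot)  # work with the labels if a factor is passed in
  bad <- !grepl("^X[0-9]+$", lot)
  if (any(bad)) {
    stop("Unexpected Lot format: ", paste(unique(lot[bad]), collapse = ", "))
  }
  as.numeric(sub("^X", "", lot))
}

lot_to_time(factor(c("X180106482", "X180126485")))
```

With that in place, df$time = lot_to_time(df$Lot) would replace the gsub line above.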

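(C) But who wants to read those 9-digit numbers? You labelled the x axis "Time (s)", which makes me think it is actually a time, measured in seconds from some start time. I'll make up a start time of 2010-01-01 and convert those seconds to actual times, and then we get a nice date-time scale: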

ggplot(df, aes(x = as.POSIXct(time, origin = "2010-01-01"),
               y = Value,
               colour = Machine)) +
  facet_grid(Parameter ~ .,
             scales = "free_y") +
  geom_point() +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (
    angle = 90,
    hjust = 1,
    vjust = 0.5
  ))

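If that is the real meaning behind your data, then using a date-time axis will greatly improve readability. (Again, note that we didn't specify any breaks; the defaults work well.)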
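To see what that conversion produces outside of the plot, here is a small sketch using the four distinct time values from the sample data. The origin is still my made-up start time, and tz = "UTC" is an extra assumption I added so the result does not depend on your machine's time zone.

```r
# The four distinct values of the time column in the sample data
secs <- c(180106482, 180126485, 180306523, 180526326)

# origin is a made-up start time; tz = "UTC" is an assumption for reproducibility
stamp <- as.POSIXct(secs, origin = "2010-01-01", tz = "UTC")

format(stamp, "%Y-%m-%d %H:%M")
```

If the default labels are not to your taste, scale_x_datetime has date_breaks and date_labels arguments for adjusting them, but I would try the defaults first.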

Using this data (I subset your sample data down to 2 facets and used dput to make it copy/pasteable):

df = structure(list(Lot = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 
4L, 1L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 1L, 2L, 3L, 4L, 1L, 
2L, 3L, 4L, 1L), .Label = c("X180106482", "X180126485", "X180306523", 
"X180526326"), class = "factor"), Value = c(201, 156, 253, 211, 
178, 202.5, 203.4, 204.3, 205.2, 2.02, 2.17, 1.23, 1.28, 1.54, 
1.28, 1.45, 1.61, 2.35, 1.34, 1.36, 1.67, 2.01, 2.06, 2.07, 2.19, 
1.44, 2.19), Parameter = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), .Label = c("Var 1", "Var 2", "Var 3", "Var 4"
), class = "factor"), Machine = structure(c(2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Machine 1", "Machine 2"), class = "factor"), 
    time = c(180106482, 180126485, 180306523, 180526326, 180106482, 
    180126485, 180306523, 180526326, 180106482, 180106482, 180126485, 
    180306523, 180526326, 180106482, 180126485, 180306523, 180526326, 
    180106482, 180106482, 180126485, 180306523, 180526326, 180106482, 
    180126485, 180306523, 180526326, 180106482)), row.names = c(NA, 
-27L), class = "data.frame")
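On your side question about label order: I haven't seen your raw Parameter values, so the following is a sketch on made-up data. In factor(), the labels argument on its own only renames the levels, which R has already sorted alphabetically; the levels argument is what sets the order. That would explain why you had to keep shuffling var.Parameter until the plot looked right.

```r
# Made-up parameter values, for illustration only
param <- c("Var3", "Var1", "Var4", "Var2", "Var1")

# The order we want in the facets (an arbitrary example order)
wanted <- c("Var4", "Var3", "Var2", "Var1")

# levels = fixes the order and leaves every value as it was
f <- factor(param, levels = wanted)
levels(f)

# labels = alone renames the alphabetically sorted levels instead,
# so here "Var3" ends up relabelled as "Var2"
g <- factor(param, labels = wanted)
as.character(g)
```

If you need both a custom order and nicer display names, pass levels and labels together, with both vectors in the same order.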
