Question

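tikzDevice does not output code with umlauts in UTF-8 on Windows

I wrote a report in RMarkdown and use tikzDevice for the plots. When I use German umlauts (äöüÖÄÜ), RStudio raises the following error:

pandoc.exe: Cannot decode byte '\xd6': Data.Text.Internal.Encoding.streamDecodeUtf8With: Invalid UTF-8 stream

Here is a minimal example: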

---
title: "test"
author: "test"
date: "Today"
output: 
  pdf_document: 
    keep_tex: true
header-includes:
   - \usepackage{tikz}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(tikzDevice)
options(tikzDefaultEngine = "xetex")
```
```{r plot, dev="tikz", external=FALSE}
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, xlab = "ÖÄÜ", ylab = "öäü")
```

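With this code, tikzDevice writes the TeX file (the figure) in CP1252 encoding. That file does not work when it is included in the main LaTeX document, so Pandoc raises the error. I tried the same code under Ubuntu and it works there. I suspect the Windows encoding is causing the problem, but I could not find a way to fix it.

The source file (Rmd) is UTF-8 encoded. The TeX file generated by tikzDevice is not UTF-8 encoded.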
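To make the byte in the error message concrete, here is a minimal sketch in base R. It shows that "Ö" is the single byte `d6` in CP1252 but the two bytes `c3 96` in UTF-8, and that a lone `d6` is not valid UTF-8. The helper `has_invalid_utf8()` is a hypothetical name introduced here for illustration; it is not part of tikzDevice or knitr.

```r
# "\u00d6" is "Ö"; the escape keeps this snippet ASCII-only.
o_umlaut <- enc2utf8("\u00d6")

utf8_bytes <- charToRaw(o_umlaut)
cp1252_bytes <- charToRaw(iconv(o_umlaut, from = "UTF-8", to = "CP1252"))

print(utf8_bytes)    # c3 96
print(cp1252_bytes)  # d6

# A lone 0xd6 byte is not a complete UTF-8 sequence.
print(validUTF8(rawToChar(as.raw(0xd6))))  # FALSE

# Hypothetical helper: TRUE if any line of a file is not valid UTF-8.
has_invalid_utf8 <- function(path) {
  any(!validUTF8(readLines(path, warn = FALSE)))
}
```

Calling `has_invalid_utf8()` on one of the generated `.tex` files in the figure folder should tell you whether that file is what Pandoc is rejecting.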

SessionInfo (Windows):

version  R version 3.6.1 (2019-07-05)
os       Windows 10 x64
system   x86_64, mingw32
ui       RStudio
language (EN)
collate  German_Germany.1252
ctype    German_Germany.1252
tz       Europe/Berlin
date     2019-09-04

SessionInfo (Ubuntu):

version  R version 3.4.4 (2018-03-15)
os       Ubuntu 18.04.3 LTS
system   x86_64, linux-gnu
ui       X11
language (EN)
collate  C.UTF-8
ctype    C.UTF-8
tz       Europe/Berlin
date     2019-09-04

Answer 1

I can reproduce the behaviour. Please open an issue at https://github.com/daqana/tikzDevice/issues. As a workaround, you can write the umlauts as LaTeX accent macros:

---
title: "test"
author: "test"
date: "Today"
output: 
  pdf_document: 
    keep_tex: true
header-includes:
   - \usepackage{tikz}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(tikzDevice)
options(tikzDefaultEngine = "xetex")
```

```{r plot, dev="tikz", external=FALSE}
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, xlab = '\\"O\\"A\\"U', ylab = '\\"o\\"a\\"u')
```
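If typing the macros by hand becomes tedious, the substitution can be wrapped in a small function. The following is only a sketch: `escape_umlauts()` is a hypothetical helper, not a tikzDevice function, and it covers only the six umlauts from the question.

```r
# Hypothetical helper: replace German umlauts with LaTeX accent macros
# so that the labels passed to the tikz device are pure ASCII.
escape_umlauts <- function(x) {
  from <- c("\u00e4", "\u00f6", "\u00fc", "\u00c4", "\u00d6", "\u00dc")
  to   <- c('\\"a',   '\\"o',   '\\"u',   '\\"A',   '\\"O',   '\\"U')
  x <- enc2utf8(x)
  for (i in seq_along(from)) {
    # fixed = TRUE: pattern and replacement are both taken literally
    x <- gsub(from[i], to[i], x, fixed = TRUE)
  }
  x
}
```

It would be used as `plot(x, y, xlab = escape_umlauts("ÖÄÜ"), ylab = escape_umlauts("öäü"))`, which should produce the same labels as the hand-written macros.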

Answer 2

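Another workaround is to convert all tikz/tex files in the figure folder. Using iconv, the file contents are converted from CP1252 to UTF-8. If this is the last chunk in the document, there is no need to "hard-code" the umlauts: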

# path of the Rmd file
path <- getwd()
# subfolder of the cache and figures
subfolder <- paste0(sub("\\.Rmd$", "", knitr::current_input()), "_files")
# beamer or latex figures
figures <- ifelse(dir.exists(file.path(path, subfolder, "figure-latex")), "figure-latex",
                  ifelse(dir.exists(file.path(path, subfolder, "figure-beamer")), "figure-beamer", ""))
# full path of the figure folder
folder <- file.path(path, subfolder, figures)
# find all tex/tikz files in the figures folder (pattern is a regex, not a glob)
for (x in list.files(folder, pattern = "\\.tex$")) {
  # full path to file
  target <- file.path(folder, x)
  # full path to temp file
  temp <- file.path(folder, "temp.tex")
  # rename source file to temp
  file.rename(target, temp)
  # read the lines as written by tikzDevice (CP1252 bytes)
  input <- readLines(temp)
  # convert input to UTF-8
  output <- iconv(input, from = "CP1252", to = "UTF-8")
  # write the converted output under the original filename, bytes unchanged
  writeLines(output, con = target, useBytes = TRUE)
  # remove temp file
  file.remove(temp)
  rm(input, output)
}

Edit: This now also works with beamer.
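One caveat with this approach is that converting a file that is already UTF-8 (for example on Ubuntu, or when the chunk runs twice) would garble it. A minimal sketch of a guarded per-file conversion follows; `reencode_to_utf8()` is a hypothetical helper name, and it assumes the input is CP1252 whenever it is not valid UTF-8.

```r
# Hypothetical helper: re-encode one file to UTF-8 in place.
# Files that are already valid UTF-8 (including pure ASCII) are left alone.
# Returns TRUE (invisibly) if the file was rewritten, FALSE otherwise.
reencode_to_utf8 <- function(path, from = "CP1252") {
  lines <- readLines(path, warn = FALSE)
  if (all(validUTF8(lines))) {
    return(invisible(FALSE))
  }
  converted <- iconv(lines, from = from, to = "UTF-8")
  # useBytes = TRUE writes the UTF-8 bytes as they are,
  # without translating back to the native encoding
  writeLines(converted, path, useBytes = TRUE)
  invisible(TRUE)
}
```

Inside the loop, the rename, read, convert, and write steps could then be replaced by a single call such as `reencode_to_utf8(file.path(folder, x))`.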

Answer 3

In R or Python, when reading a CSV or text file, use a raw string (r''), for example r'c:\hem\dow\train.csv'. We have to declare the path with r'' in order to read the file.
