tikzDevice does not write code containing umlauts in UTF-8 under Windows

Time: 2019-09-04 09:33:50

Tags: r encoding r-markdown tikzdevice


I wrote a report in RMarkdown and use tikzDevice for the plots. When I use German umlauts (äöüÖÄÜ), RStudio raises the following error:

  

pandoc.exe: Cannot decode byte '\xd6': Data.Text.Internal.Encoding.streamDecodeUtf8With: Invalid UTF-8 stream

Here is a minimal example:

---
title: "test"
author: "test"
date: "Today"
output: 
  pdf_document: 
    keep_tex: true
header-includes:
   - \usepackage{tikz}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(tikzDevice)
options(tikzDefaultEngine = "xetex")
```
```{r plot, dev="tikz", external=FALSE}
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, xlab = "ÖÄÜ", ylab = "öäü")
```

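With this code, tikzDevice writes the TeX file (the figure) in CP1252 encoding. That file does not work when it is included in the main LaTeX document, so Pandoc raises the error. I tried the same code under Ubuntu and it works there. I suspect the Windows encoding is the cause of the problem, but I could not find a workaround.

The source file (Rmd) is UTF-8 encoded. The TeX file (generated by tikzDevice) is not UTF-8 encoded.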
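One way to confirm which of the two files is the problem is to test the bytes directly. The following is only a minimal sketch: `is_utf8_file` is a hypothetical helper name (not part of tikzDevice or knitr), and the two demo files are written byte by byte to stand in for a CP1252 figure file and a UTF-8 one.

```r
# Hypothetical helper (not part of tikzDevice or knitr):
# returns TRUE only if every line of the file is valid UTF-8.
is_utf8_file <- function(path) {
  all(validUTF8(readLines(path, warn = FALSE)))
}

# Two small demo files containing "ÖÄÜ" plus a newline, written as raw bytes
# so the result does not depend on the locale this script runs in.
cp1252_file <- tempfile(fileext = ".tex")
writeBin(as.raw(c(0xd6, 0xc4, 0xdc, 0x0a)), cp1252_file)                  # CP1252 bytes
utf8_file <- tempfile(fileext = ".tex")
writeBin(as.raw(c(0xc3, 0x96, 0xc3, 0x84, 0xc3, 0x9c, 0x0a)), utf8_file)  # UTF-8 bytes

is_utf8_file(cp1252_file)  # FALSE: 0xd6 is the byte named in the pandoc error
is_utf8_file(utf8_file)    # TRUE
```

Pointing the helper at the `.tex` file in the figure folder (kept by `keep_tex: true`) should show the same difference as the two demo files.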

SessionInfo (Windows):

version  R version 3.6.1 (2019-07-05)
os       Windows 10 x64
system   x86_64, mingw32
ui       RStudio
language (EN)
collate  German_Germany.1252
ctype    German_Germany.1252
tz       Europe/Berlin
date     2019-09-04 

SessionInfo (Ubuntu):

version  R version 3.4.4 (2018-03-15)
os       Ubuntu 18.04.3 LTS
system   x86_64, linux-gnu
ui       X11
language (EN)
collate  C.UTF-8
ctype    C.UTF-8
tz       Europe/Berlin
date     2019-09-04
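The `ctype` rows are the relevant difference between the two sessions. As a rough sketch, the same information can be queried from inside R with base functions; the variable names `ctype` and `info` are just illustrative.

```r
# Locale category that appears as `ctype` in the session info above.
ctype <- Sys.getlocale("LC_CTYPE")

# Summary of the current locale's character encoding.
info <- l10n_info()

cat("LC_CTYPE:    ", ctype, "\n")
cat("UTF-8 locale:", info[["UTF-8"]], "\n")

# The `codepage` element is only present on Windows.
if (!is.null(info$codepage)) cat("Code page:   ", info$codepage, "\n")
```

In a session like the Windows one above (German_Germany.1252), the UTF-8 flag would be expected to be FALSE, while in the Ubuntu session (C.UTF-8) it would be TRUE.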

3 Answers:

Answer 0: (score: 1)

I can reproduce the behavior. Please open an issue at https://github.com/daqana/tikzDevice/issues. As a workaround, you can write the umlauts as LaTeX accent macros:

---
title: "test"
author: "test"
date: "Today"
output: 
  pdf_document: 
    keep_tex: true
header-includes:
   - \usepackage{tikz}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(tikzDevice)
options(tikzDefaultEngine = "xetex")
```

```{r plot, dev="tikz", external=FALSE}
x <- rnorm(50)
y <- rnorm(50)

plot(x, y, xlab = '\\"O\\"A\\"U', ylab = '\\"o\\"a\\"u')
```
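Typing the macros by hand gets tedious with many labels. The substitution used in the workaround above could be wrapped in a small function; this is only a sketch, and `tex_umlauts` is a hypothetical name, not a tikzDevice function. It covers only the six umlauts from the question.

```r
# Hypothetical helper: replace German umlauts with LaTeX accent macros,
# e.g. "Ö" becomes \"O (written '\\"O' as an R string literal).
tex_umlauts <- function(x) {
  # Normalise to UTF-8 so the comparison below is locale independent.
  x <- enc2utf8(x)
  # Umlauts given as Unicode escapes: ä ö ü Ä Ö Ü
  umlauts <- c("\u00e4", "\u00f6", "\u00fc", "\u00c4", "\u00d6", "\u00dc")
  macros  <- c('\\"a', '\\"o', '\\"u', '\\"A', '\\"O', '\\"U')
  for (i in seq_along(umlauts)) {
    # fixed = TRUE treats both pattern and replacement as literal text
    x <- gsub(umlauts[i], macros[i], x, fixed = TRUE)
  }
  x
}

cat(tex_umlauts("\u00d6\u00c4\u00dc"), "\n")  # prints \"O\"A\"U
```

In the chunk above, the labels could then be passed as `xlab = tex_umlauts("ÖÄÜ")`, so the generated TeX file contains only ASCII characters.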

Answer 1: (score: 1)

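Another workaround is to convert all tikz/tex files in the figures folder. With iconv, the file contents are converted from CP1252 to UTF-8. If this is the last chunk in the document, there is no need to "hard-code" the umlauts: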

# directory of the Rmd file
path <- getwd()
# subfolder of the cache and figures, e.g. "report_files" for "report.Rmd"
subfolder <- paste(gsub(knitr::current_input(), pattern = ".Rmd", replacement = "", fixed = TRUE), "_files", sep = "")
# beamer or latex figures
figures <- ifelse(dir.exists(paste(path, subfolder, "figure-latex", sep = "/")), "figure-latex", ifelse(dir.exists(paste(path, subfolder, "figure-beamer", sep = "/")), "figure-beamer", ""))
# full path of the figure folder
folder <- paste(path, subfolder, figures, sep = "/")
# find all .tex files (the tikz output) in the figures folder
for (x in list.files(folder, pattern = "\\.tex$")) {
  # full path to file
  tex_file <- paste(folder, "/", x, sep = "")
  # full path to temp file
  temp <- paste(folder, "/", "temp.tex", sep = "")
  # rename source file to temp
  file.rename(tex_file, temp)
  # read the lines as they are; the bytes are still CP1252 at this point
  input <- readLines(temp, warn = FALSE)
  # convert input to UTF-8
  output <- iconv(input, from = "CP1252", to = "UTF-8")
  # write the converted lines under the original filename, bytes unchanged
  writeLines(output, con = tex_file, useBytes = TRUE)
  # remove temp file
  file.remove(temp)
  rm(input, output)
}

Edit: This now also works with beamer.
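The conversion step can also be tried in isolation on a throwaway file before running it on real figures. A minimal sketch, assuming the input really is CP1252; `cp1252_to_utf8` is a hypothetical helper name, and the demo file is created from raw bytes rather than by tikzDevice.

```r
# Hypothetical helper: re-encode one file in place from CP1252 to UTF-8.
cp1252_to_utf8 <- function(path) {
  lines <- readLines(path, warn = FALSE)                   # bytes are read unchanged
  converted <- iconv(lines, from = "CP1252", to = "UTF-8")
  # iconv() returns NA for lines it cannot convert; stop before overwriting.
  if (anyNA(converted)) stop("could not convert ", path)
  writeLines(converted, con = path, useBytes = TRUE)       # write the UTF-8 bytes as-is
  invisible(path)
}

# Demo file containing "ÖÄÜ" plus a newline, encoded as CP1252.
demo <- tempfile(fileext = ".tex")
writeBin(as.raw(c(0xd6, 0xc4, 0xdc, 0x0a)), demo)

cp1252_to_utf8(demo)

# The first six bytes are now the UTF-8 encoding of "ÖÄÜ": c3 96 c3 84 c3 9c
readBin(demo, what = "raw", n = 6)
```

Note that running the helper twice on the same file would convert already-converted text again and garble it, so it should be applied once per generated file.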

Answer 2: (score: 0)

In R or Python, when reading a CSV or text file, use a raw string (r''), for example r'c:\hem\dow\train.csv'. We have to declare r'' in order to read the file.