Question

Here is an example of a string I am reading into R:

General\\Contingency\\Import\\Import_Manual\\New\\ADC170001A13_Loc.txt

I am trying to isolate 'ADC170001A13'. I have tried substring, and also a gsub to remove everything except that part of the string, but I get the following error:

Error in gsub(clean, "", TextLOCfiles) : 
invalid regular expression '\\Fs01 \DepartmentFolders$\General\Contingency\Import\Import_Manual\New\', reason 'Trailing   backslash'
In addition: Warning message:
In gsub(clean, "", TextLOCfiles) :
argument 'pattern' has length > 1 and only the first element will be used

Answer 1

The simplest solution is to use regmatches:

> rxmatch = regexpr('(?<=\\\\)\\w+(?=_Loc\\.)', TextLOCfiles, perl = TRUE)
> regmatches(TextLOCfiles , rxmatch)
ADC170001A13

perl = TRUE is required for the zero-width assertions (the lookbehind and the lookahead) to work, as Simon mentioned in the comments.
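One thing to keep in mind when applying this to a whole vector of paths: regmatches() returns only the substrings that matched, so the result can be shorter than the input. The following is a minimal sketch of one way to keep the result aligned with the input; the vector `paths` and the variable names are hypothetical and shortened for illustration.

```r
# Hypothetical input: in R source, "\\" is one literal backslash.
paths <- c(
  "General\\New\\ADC170001A13_Loc.txt",
  "General\\New\\notes.txt"            # has no "_Loc." part
)

# regexpr() returns -1 for elements where the pattern does not match.
m <- regexpr("(?<=\\\\)\\w+(?=_Loc\\.)", paths, perl = TRUE)

# regmatches() silently drops the non-matching element.
ids <- regmatches(paths, m)

# Keep one result per input element, with NA where nothing matched.
aligned <- rep(NA_character_, length(paths))
aligned[m > 0] <- regmatches(paths, m)

print(ids)
print(aligned)
```

Here `ids` has length one, while `aligned` has the same length as `paths`.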

Answer 2

You can use gsub with parentheses to capture the part you want:

> gsub(".*\\\\(\\w+)_.*", "\\1", TextLOCfiles)
[1] "ADC170001A13"
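A caveat with the capture-group approach is that sub() and gsub() return the input unchanged when the pattern does not match, which can go unnoticed. The sketch below, using a hypothetical vector `paths`, shows one way to guard against that with grepl(); it is an illustration, not the only option.

```r
# Hypothetical inputs; "\\" in R source is one literal backslash.
paths <- c(
  "General\\Import_Manual\\New\\ADC170001A13_Loc.txt",
  "General\\New\\notes.txt"            # no underscore after the last backslash
)

pattern <- ".*\\\\(\\w+)_.*"

# The greedy ".*" runs up to the last backslash, so the capture group
# picks up the file-name prefix rather than "Import".
raw <- sub(pattern, "\\1", paths)

# Non-matching elements come back unchanged, so mark them as NA instead.
ids <- ifelse(grepl(pattern, paths), raw, NA_character_)

print(raw)
print(ids)
```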

Answer 3

This looks like a file path. If so, you can simply use basename(), like this:

sub("\\.txt$", "", basename(TextLOCfiles))
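Note that basename() treats a backslash as a path separator only on Windows; elsewhere only the forward slash counts. Stripping just the extension also leaves the "_Loc" suffix in place. The sketch below is one portable variant, assuming every file name ends in "_Loc.txt" (an assumption made here for illustration) and using hypothetical intermediate variable names.

```r
# In R source, "\\" is one literal backslash.
TextLOCfiles <- "General\\Contingency\\Import\\Import_Manual\\New\\ADC170001A13_Loc.txt"

# Convert backslashes to forward slashes so basename() behaves the same
# on every platform. fixed = TRUE treats "\\" as a plain string, not a regex.
normalized <- gsub("\\", "/", TextLOCfiles, fixed = TRUE)

file_name <- basename(normalized)

# Remove the assumed "_Loc.txt" suffix, anchored at the end of the name.
id <- sub("_Loc\\.txt$", "", file_name)

print(file_name)
print(id)
```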

Answer 4

Try this:

library( tools )
basename( file_path_sans_ext( TextLOCfiles ) )

Or without loading a package:

sub( "\\.[^.]*$", "", basename( TextLOCfiles ) )

These solutions do not require you to know the file name or the extension, and they also work when there is no extension.
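To illustrate that claim, here is a small sketch comparing both variants on a few hypothetical file names (written with forward slashes so the result does not depend on the platform). Both remove only the final extension, so any "_Loc" suffix remains and would need to be stripped separately.

```r
library(tools)

# Hypothetical paths: one with an extension, one without, one with two.
files <- c("New/ADC170001A13_Loc.txt", "New/README", "New/archive.tar.gz")

# Variant using tools::file_path_sans_ext().
with_tools <- basename(file_path_sans_ext(files))

# Variant using only base functions: drop the last "." and what follows it.
base_only <- sub("\\.[^.]*$", "", basename(files))

print(with_tools)
print(base_only)
```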

Isolating part of a string in R
