Question

I am trying to open a file with Python, but I get an error, and a \u202a character appears at the beginning of the string... Does anyone know how to remove it?

def carregar_uml(arquivo, variaveis):
    cadastro_uml = {}
    id_uml = 0

    for i in open(arquivo):
        linha = i.split(",")


carregar_uml("‪H:\\7 - Script\\teste.csv", variaveis)

OSError: [Errno 22] Invalid argument: '\u202aH:\7 - Script\teste.csv'

Answer 1

The text editor introduced a non-printing character when the .py file was originally created.

Consider this line:

carregar_uml("‪H:\\7 - Script\\teste.csv", variaveis)

Let's carefully select the string, including the quotes, and copy-paste it into an interactive Python session:

$ python
Python 3.6.1 (default, Jul 25 2017, 12:45:09) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> "‪H:\\7 - Script\\teste.csv"
'\u202aH:\\7 - Script\\teste.csv'
>>>

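As you can see, there is a character with code point U+202A immediately before the H.

As others have pointed out, the character at code point U+202A is LEFT-TO-RIGHT EMBEDDING. Back in our Python session: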

>>> s = "‪H:\\7 - Script\\teste.csv"
>>> import unicodedata
>>> unicodedata.name(s[0])
'LEFT-TO-RIGHT EMBEDDING'
>>> unicodedata.name(s[1])
'LATIN CAPITAL LETTER H'
>>>

This further confirms that the first character of the string is not H but the non-printing LEFT-TO-RIGHT EMBEDDING character.
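The same check can be applied to the whole string rather than to individual indices. The following is a minimal sketch, not part of the original answer: the helper name find_format_chars is hypothetical, and it assumes that the characters of interest are those in Unicode category Cf (format), which includes U+202A.

```python
import unicodedata


def find_format_chars(text):
    """Return (index, code point, name) for every Unicode 'Cf' (format) character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# The invisible character is written as an escape here so that it is visible.
path = "\u202aH:\\7 - Script\\teste.csv"
print(find_format_chars(path))
```

An empty list means the string contains no format characters.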

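I don't know which text editor you used to create your program. Even if I knew, I am probably not an expert in that editor. Regardless, some text editor that you used inserted U+202A without your knowledge.

One solution is to use a text editor that does not insert that character and/or that highlights non-printing characters. For example, in vim the line appears like this: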

carregar_uml("<202a>H:\\7 - Script\\teste.csv", variaveis)

In such an editor, simply delete the character between " and H:

carregar_uml("H:\\7 - Script\\teste.csv", variaveis)

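Even though this line is visually identical to the original, I have deleted the offending character. Using this line avoids the OSError that you reported.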
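If no editor that highlights non-printing characters is at hand, a short script can locate them instead. This is only a sketch under assumptions: report_format_chars is a hypothetical helper, it flags characters in Unicode category Cf, and the sample lines stand in for the contents of a real source file.

```python
import unicodedata


def report_format_chars(lines):
    """Yield (line number, column, code point) for each 'Cf' character found."""
    for lineno, line in enumerate(lines, start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                yield lineno, col, f"U+{ord(ch):04X}"


# Sample lines standing in for a source file; U+202A is written as an escape.
source = [
    'x = 1\n',
    'carregar_uml("\u202aH:\\\\7 - Script\\\\teste.csv", variaveis)\n',
]
for hit in report_format_chars(source):
    print(hit)
```

For a real file, the same function could be given an open file object, for example one obtained with open(path, encoding="utf-8"), since iterating over a text file yields its lines.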

Answer 2

The problem is that the file's directory path is not being read correctly. Pass it as a raw string argument and it should work.

carregar_uml(r'H:\7 - Script\teste.csv', variaveis)
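One caveat is worth illustrating: a raw string only changes how backslashes are interpreted, so this suggestion helps only if the path is retyped without the invisible character. The sketch below uses variable names that are illustrative, not taken from the original answer.

```python
# A raw string and a string with doubled backslashes describe the same path.
raw = r"H:\7 - Script\teste.csv"
doubled = "H:\\7 - Script\\teste.csv"
print(raw == doubled)  # True

# A path pasted together with U+202A is still a different string,
# whether or not the literal is raw.
pasted = "\u202a" + raw
print(pasted == raw)  # False
print(pasted.lstrip("\u202a") == raw)  # True
```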

Answer 3

You can use this sample code to remove u202a from a file path.

import pandas as pd

st = "\u202aF:\\somepath\\filename.xlsx"  # the invisible U+202A, written as an escape
data = pd.read_excel(st)

If I try to run this, I get an OSError with the following details:

Traceback (most recent call last):
  File "F:\CodeRepo\PythonWorkSpace\demo\removepartofstring.py", line 14, in <module>
    data = pd.read_excel(st)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\util\_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\util\_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel.py", line 350, in read_excel
    io = ExcelFile(io, engine=engine)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel.py", line 653, in __init__
    self._reader = self._engines[engine](self._io)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel.py", line 424, in __init__
    self.book = xlrd.open_workbook(filepath_or_buffer)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37\lib\site-packages\xlrd\__init__.py", line 111, in open_workbook
    with open(filename, "rb") as f:
OSError: [Errno 22] Invalid argument: '\u202aF:\\somepath\\filename.xlsx'

But if I do it like this:

st = "\u202aF:\\somepath\\filename.xlsx"  # the invisible U+202A, written as an escape
data = pd.read_excel(st.strip("\u202a"))  # replace st with your own path string

it works for me.
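Note that str.strip treats its argument as a set of characters, not as a prefix, so passing extra letters alongside U+202A can also remove legitimate characters from the end of a path. The sketch below uses a made-up path ending in "data" to show the difference; the variable names are illustrative.

```python
path = "\u202aF:\\somepath\\data"

# Argument containing U+202A plus the letters u, 2, 0, a:
# every one of these is stripped from BOTH ends of the string.
loose = path.strip("\u202a" + "u202a")

# Argument containing only the unwanted character.
exact = path.strip("\u202a")

print(repr(loose))  # 'F:\\somepath\\dat'
print(repr(exact))  # 'F:\\somepath\\data'
```

Passing only "\u202a" keeps the rest of the path intact.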

Answer 4

Try strip():

def carregar_uml(arquivo, variaveis):
    cadastro_uml = {}
    id_uml = 0

    for i in open(arquivo):
        linha = i.split(",")


# Strip the invisible U+202A (written here as an escape) from the path before passing it in
carregar_uml("\u202aH:\\7 - Script\\teste.csv".strip("\u202a"), variaveis)

Answer 5

Or you can slice that character off:

file_path = r"‪C:\Test3\Accessing_mdb.txt"
file_path = file_path[1:]
with open(file_path, 'a') as f_obj:
    f_obj.write('some words')
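Slicing with [1:] assumes that exactly one unwanted character sits at position 0. A more general alternative, sketched here as an assumption rather than as part of the original answer, deletes the Unicode bidirectional control characters wherever they occur. The names BIDI_CONTROLS and strip_bidi_controls are hypothetical.

```python
# Code points of the Unicode bidirectional control characters:
# U+200E, U+200F, U+202A..U+202E and U+2066..U+2069.
BIDI_CONTROLS = dict.fromkeys(
    [0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A)]
)


def strip_bidi_controls(text):
    """Return text with every bidirectional control character removed."""
    # str.translate deletes each character whose code point maps to None.
    return text.translate(BIDI_CONTROLS)


file_path = strip_bidi_controls("\u202aC:\\Test3\\Accessing_mdb.txt")
print(repr(file_path))  # 'C:\\Test3\\Accessing_mdb.txt'
```

Unlike slicing, this leaves a clean path unchanged and also handles several such characters in a row.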

Answer 6

Use a lowercase letter when writing the drive name! Not a capital letter!

e.g. H: -> error
e.g. h: -> no error

Answer 7

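I tried all of the solutions above. The problem is that when we copy a path, or any string, by selecting it from right to left, an extra character is added. It does not show up in our IDE. This extra character denotes a right-to-left mark (RLM) https://en.wikipedia.org/wiki/Right-to-left_mark , i.e. you selected the text from right to left when copying it.

See the image linked to my answer. I also tried copying from left to right, and then this extra character was not added. So either type your path by hand or copy it from left to right to avoid this kind of problem.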
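Because the extra character is invisible in the IDE, a pasted path can also be checked before it is used. The following is a sketch only: check_pasted_path is a hypothetical helper, and it relies on str.isprintable, which returns False for format characters such as U+202A.

```python
def check_pasted_path(path):
    """Return path unchanged, or raise ValueError if it has non-printing characters."""
    if not path.isprintable():
        bad = sorted({f"U+{ord(ch):04X}" for ch in path if not ch.isprintable()})
        raise ValueError(f"path contains non-printing characters: {', '.join(bad)}")
    return path


try:
    # U+202A is written as an escape so that it is visible here.
    check_pasted_path("\u202aH:\\7 - Script\\teste.csv")
except ValueError as exc:
    print(exc)
```

A check like this reports the problem with the offending code point named, instead of the less obvious OSError raised by open.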

Remove u202a from a Python string
