Question

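I am trying to read a file that contains tabs, newlines and so on; the data is in JSON format.

When I read it with file.read() / readlines() etc., all the newlines and tabs are read in as well.

I have tried rstrip(), split and so on, but in vain. Maybe I am missing something:

This is basically what I am doing: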

 f = open('/path/to/file.txt')
 line = f.readlines()
 line.split('\n')

Here is the data (including the original tabs, hence the poor formatting):

        {
      "foo": [ {
       "id1" : "1",
   "blah": "blah blah",
       "id2" : "5885221122",
      "bar" : [
              {  
         "name" : "Joe JJ", 
          "info": [                 {
         "custid": "SSN",    
         "type" : "String",             }        ]
        }     ]     }     ]  }

I would like to know whether we can ignore them elegantly.

I would also like to use json.dumps() afterwards.

Answer 1

If the data is JSON, why not use json.load()?

import json
with open('myfile.txt', 'r') as f:
    d = json.load(f)
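
As a rough illustration of why no manual stripping is needed for well-formed data: the JSON parser already skips tabs, newlines and spaces between tokens. The sample string below is made up for the demonstration.

```python
import json

# Hypothetical sample: valid JSON padded with tabs, newlines and spaces.
raw = '\t{\n\t  "foo" : [ {\n\t\t"id1" : "1",\n   "blah": "blah blah"\t}\n ]  }\n'

# Whitespace between tokens is skipped; whitespace inside strings is kept.
data = json.loads(raw)
print(data)
```

The space inside "blah blah" survives, because only whitespace outside string values is insignificant.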

Answer 2

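Where does this structure come from? My condolences. Anyway, as a start you can try this:

import re
cleanedData = re.sub('[\n\t]', '', f.read())

This is a brute-force removal of newlines and tabs. What it returns may be suitable for feeding into json.loads. Once the extra whitespace and newlines are gone, everything depends on whether the file's contents are actually valid JSON.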
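
A small sketch of that last point, using a shortened, invented stand-in for the file contents: stripping tabs and newlines does not repair a trailing comma, so json.loads still fails. ValueError is caught here because json.JSONDecodeError is a subclass of it in Python 3.

```python
import json
import re

# Shortened stand-in for the file contents (note the trailing comma).
raw = '{\n\t"custid": "SSN",\n\t"type" : "String",\t}\n'

cleaned = re.sub('[\n\t]', '', raw)

try:
    json.loads(cleaned)
    parsed_ok = True
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    parsed_ok = False
    print("Still not valid JSON:", exc)
```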

Answer 3

If you want to iterate over each line, you can do:

for line in open('path/to/file.txt'):
  # Remove whitespace from both ends of line
  line = line.strip()

  # Do whatever you want with line
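
If the stripped lines are meant to end up as a single JSON document, one possible follow-up is to join them and parse the result. This is only a sketch: io.StringIO stands in for the opened file, and the sample content is invented.

```python
import io
import json

# io.StringIO stands in for open('path/to/file.txt') in this sketch.
fake_file = io.StringIO('\t{\n   "name" : "Joe JJ",\t\n\t\t"id1" : "1"\n  }\n')

# Strip whitespace from both ends of every line, then glue the lines together.
joined = "".join(line.strip() for line in fake_file)
record = json.loads(joined)
print(joined)
```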

Answer 4

A bit of a hack, and I think it is inefficient:

f = open("/path/to/file.txt")
lines = f.read().replace("\n", "").replace("\t", "").replace(" ", "")

print(lines)
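
Note that replace(" ", "") also removes spaces inside values, so "blah blah" becomes "blahblah". If the goal is a compact one-line string, one alternative (assuming the text is valid JSON) is to parse it and re-serialize it with tight separators. The sample text below is invented.

```python
import json

text = '{\n\t"blah" :  "blah blah",\n\t"name" : "Joe JJ"\n}'

# The brute-force approach damages the values themselves.
brute = text.replace("\n", "").replace("\t", "").replace(" ", "")

# Parsing and re-serializing drops only the insignificant whitespace.
compact = json.dumps(json.loads(text), separators=(",", ":"))

print(brute)
print(compact)
```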

Answer 5

How about using the json module?

import json

with open("/path/to/file.txt", "r") as f:
    tmp = json.load(f)  # json.load reads a file object; json.loads expects a string

with open("/path/to/file2.txt", "w") as output:
    output.write(json.dumps(tmp, sort_keys=True, indent=4))
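
The two parsing functions are easy to mix up: json.load takes a file object, while json.loads takes a string. A minimal sketch, with io.StringIO standing in for a real file and an invented sample document:

```python
import io
import json

text = '{"b": 2, "a": 1}'

from_string = json.loads(text)             # parses a str
from_file = json.load(io.StringIO(text))   # parses a file-like object

# Passing a file object to json.loads raises TypeError.
try:
    json.loads(io.StringIO(text))
    loads_accepts_file = True
except TypeError:
    loads_accepts_file = False

pretty = json.dumps(from_file, sort_keys=True, indent=4)
print(pretty)
```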

Answer 6

$ cat foo.json | python -mjson.tool
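Expecting property name: line 11 column 41

The comma in "type" : "String", is what makes the JSON decoder choke. If it weren't for that problem, you could load the file directly with json.load().

In other words, your JSON is malformed, which means you need to perform a replacement before feeding it to json.loads(). Since you have to read the whole file into a string to do the replacement, use json.loads(jsonstr) instead of json.load(jsonfilep):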

    >>> import json, re
    >>> jsonfilep = open('foo.json')
    >>> jsonstr = re.sub(r'''(["'0-9.]\s*),\s*}''', r'\1}', jsonfilep.read())
    >>> jsonobj = json.loads(jsonstr)
    >>> jsonstr = json.dumps(jsonobj)
    >>> print(jsonstr)
    {"foo": [{"blah": "blah blah", "id2": "5885221122", "bar": [{"info":
    [{"type": "String", "custid": "SSN"}], "name": "Joe JJ"}], "id1": "1"}]}

I only used the re module because the trailing comma could follow any kind of value, whether a number or a string.
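
The same idea can be wrapped in a small helper. The version below is a sketch rather than the exact method above: the function name strip_trailing_commas is hypothetical, and the pattern is broadened (an assumption) to drop a comma before either } or ]. Like any regex-based repair, it would also alter a string value that happens to contain such a sequence, so it suits quick cleanups rather than untrusted input.

```python
import json
import re

def strip_trailing_commas(text):
    """Remove a comma that directly precedes a closing } or ] (hypothetical helper)."""
    return re.sub(r',(\s*[}\]])', r'\1', text)

# Shortened stand-in for the data in the question.
raw = '{ "foo": [ {\n\t"custid": "SSN",\n\t"type" : "String",\t}\t] }'

obj = json.loads(strip_trailing_commas(raw))
print(json.dumps(obj))
```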

How to read a file in Python that contains newlines and tabs?
