I have a CSV file (1000+ rows, with \t as the delimiter) that I want to load into Python as a list. Here are the first few lines of the file:
"col1" "col2" "col3" "col4" "col5" "col6"
1 "01-01-2017 00:00:00" "02-02-2017 00:00:00" "str1" "str3" "str4 åå here comes a few newline characters
"
2 "01-01-2017 00:00:00" "02-02-2017 00:00:00" "str2" "str3" "str5 åasg here comes more newlines
"
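As you can see, the strings tend to contain many newline characters. Is there a way to strip all the newline characters from the strings and then build a list containing all the rows?
My attempt: based on this thread, here is what I tried: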
import csv
with open('test.dat') as csvDataFile:
    csvReader = csv.reader(csvDataFile, delimiter="\t")
    for i in csvReader:
        print(list(map(str.strip, i)))
However, this did not strip anything.
Answer 0 (score: 0)
Example snippet that removes newline characters ("\n") from a list:
a = ['\n', "a", "b", "c", "\n"]
def remNL(l):
    # Keep every element that is not a bare newline
    return [i for i in l if i != "\n"]
print(remNL(a))
In your case:
print(remNL(i))
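Note that the snippet above only drops list elements that are exactly "\n"; it does not touch newlines embedded inside a longer field. A minimal sketch of cleaning each field instead is shown here. The sample data and the helper names `strip_newlines` and `load_rows` are assumptions made for illustration and are not part of the original question or answer.

```python
import csv
import io

# Hypothetical sample mirroring the question's file: tab-delimited, quoted
# fields, with newlines embedded inside the last field.
sample = (
    '"col1"\t"col2"\t"col3"\n'
    '1\t"str1"\t"str4 with newlines\n\n"\n'
    '2\t"str2"\t"str5 more newlines\n"\n'
)

def strip_newlines(field):
    """Remove every newline and carriage return in a field, then trim the ends."""
    return field.replace("\r", "").replace("\n", "").strip()

def load_rows(fileobj):
    """Read tab-delimited rows and clean each field."""
    reader = csv.reader(fileobj, delimiter="\t")
    return [[strip_newlines(field) for field in row] for row in reader]

# newline="" keeps the embedded newlines untouched so csv can handle them
rows = load_rows(io.StringIO(sample, newline=""))
print(rows)
```

With a real file, the same function could be called as `load_rows(open('test.dat', newline=''))`; the csv module documentation recommends opening files with `newline=''` so that newlines inside quoted fields are interpreted correctly.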
Answer 1 (score: 0)
You can use a regular expression to find all repeated \n characters and then remove them from the input text.
import re  # The module for regular expressions
text = """ The text from the csv file """  # avoid shadowing the built-in input()
# Replace every run of two or more \n characters in text with ""
stripedInput = re.sub(r"\n{2,}", "", text)
We now have the text of the CSV file without any repeated \n characters. The rows can then be obtained with
rows = stripedInput.split("\n")
If you want to split each row into columns, you can do
for i in range(len(rows)):
    rows[i] = rows[i].split("\t")
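Putting the steps of this answer together, a rough end-to-end sketch could look like the following. The sample text is hypothetical, and stripping the surrounding double quotes is an extra step assumed here because a plain split does not remove them. One caveat: the pattern `\n{2,}` only matches runs of two or more newlines, so a single newline inside a quoted field would still split that row in two.

```python
import re

# Hypothetical sample text standing in for the contents of the CSV file
text = (
    '"col1"\t"col2"\n'
    '1\t"str4 newlines\n\n\n"\n'
    '2\t"str5 more\n\n"\n'
)

# Remove every run of two or more newline characters
stripped = re.sub(r"\n{2,}", "", text)

# Split into rows, skip empty lines, split each row into columns,
# and strip the surrounding double quotes from each column
rows = [
    [col.strip('"') for col in line.split("\t")]
    for line in stripped.split("\n")
    if line
]
print(rows)
```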