How do I remove \n and \t from a list in Python?

Time: 2016-12-20 12:12:39

Tags: python lambda web-scraping

I have been trying to remove \n and \t from two lists in Python, without success. Here is my code:

A=[]
B=[]
C=[]
D=[]
.................
.................

df=pd.DataFrame(A,columns=['Rank'])
df['Company Name']=B
C=list(filter(lambda x: x != '\n', C))
C=list(filter(lambda x: x != '\t', C))
df['Type of organization']=C
D=list(filter(lambda x: x != '\n', D))
D=list(filter(lambda x: x != '\t', D))
df['Industry']=D


writer = pd.ExcelWriter('compdata.xlsx', engine='xlsxwriter')
df.to_excel(writer, index=False, sheet_name='report')
writer.save()

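Please help; I have also tried lambda, but to no avail. Every time I export the DataFrame to Excel, these two columns come out full of whitespace.

Here is what C and D look like: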

  Rank                 Company Name  \
0   1.            Google (Alphabet)   
1   2.                       ACUITY   
2   3.  The Boston Consulting Group   
3   4.         Wegmans Food Markets   

                                Type of organization  \
0  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...   
1  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...   
2  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...   
3  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...   

                                            Industry  
0  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...  
1  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...  
2  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...  
3  \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t...  

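The "Type of organization" column should actually contain "Public" or "Private". The list items just have a lot of \n and \t characters in front of either value.

2 Answers:

Answer 0: (score: 3)

Try the following:

Using a list comprehension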

my_list = ['a', 'b', '\n', 'c', '\t', 'd']

my_list = [item for item in my_list if item not in ['\n', '\t']]

Using filter()

# In Python 3, filter() returns an iterator, so wrap it in list()
my_list = list(filter(lambda item: item not in ['\n', '\t'], my_list))
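Both approaches above drop list items that are exactly '\n' or '\t'. They do not touch items that merely contain those characters, which is why the filter calls in the question leave the data as it was. A minimal sketch of that behaviour, using shortened sample values that are assumed for illustration:

```python
# Shortened stand-ins for the question's data (assumed for illustration).
C = ['\n\t\t\tPublic', '\n\t\t\tPrivate', '\n']

# filter() compares each element as a whole, so only the element that is
# exactly '\n' is dropped; the other strings are kept unchanged.
filtered = list(filter(lambda x: x != '\n', C))

print(filtered)
print(len(C), len(filtered))
```

Removing characters from inside each string needs a per-element string operation instead of an element filter.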

Edit

For the sample input added to the question, you can remove the \t and \n characters as follows:

c = [''.join(item.split()) for item in c]

Output:

>>> c = ['\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tPublic', '\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tPrivate', '\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tPrivate', '\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tPrivate']
>>>
>>> [''.join(item.split()) for item in c]
['Public', 'Private', 'Private', 'Private']
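One caveat with the split-and-join approach: it removes every whitespace character, including spaces inside a value. If a value could contain more than one word, str.strip() may be a better fit, since it only trims leading and trailing whitespace. A small comparison follows; the 'Non Profit' value is hypothetical and does not appear in the question's data:

```python
# 'Non Profit' is a hypothetical multi-word value, used only to show the difference.
values = ['\n\t\t\tPublic', '\n\t\t\tNon Profit']

# split() with no arguments splits on any run of whitespace, so joining
# the pieces with '' also removes the space inside 'Non Profit'.
joined = [''.join(v.split()) for v in values]

# strip() with no arguments trims whitespace only at both ends.
stripped = [v.strip() for v in values]

print(joined)    # ['Public', 'NonProfit']
print(stripped)  # ['Public', 'Non Profit']
```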

Answer 1: (score: 1)

Try this:

C = [ x.replace('\t', '').replace('\n', '') for x in C ]

Do the same for D.
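To avoid repeating the expression for each list, the cleanup could be wrapped in a small helper. The sketch below is one possible way to do that: the function name remove_tabs_newlines is made up for this example, the sample values are assumed, and it uses str.translate in place of the chained replace calls to delete both characters in one pass.

```python
def remove_tabs_newlines(values):
    """Return a new list with every tab and newline removed from each string."""
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans('', '', '\t\n')
    return [v.translate(table) for v in values]


# Shortened sample values (assumed for illustration).
C = ['\n\t\t\tPublic', '\n\t\t\tPrivate']
D = ['\n\t\t\tInformation Technology', '\n\t\t\tInsurance']

C = remove_tabs_newlines(C)
D = remove_tabs_newlines(D)

print(C)  # ['Public', 'Private']
print(D)  # ['Information Technology', 'Insurance']
```

Because every element is transformed and none is dropped, the lists keep their original length, so they still line up with the other DataFrame columns.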