Question

I am learning how to use pandas in Python to manipulate data. I have the following script:

import pandas as pd

df = pd.read_table( "t.txt" )    #read in the file
df.columns = [x.strip() for x in df.columns]   #strip spaces in headers
df = df.query('TLD == ".biz"')     #select the rows where TLD == ".biz"
df.to_csv('t.txt', sep='\t')  #write the output to a tab-separated file

But the output file has no records, only the header. When I check with

print(df)

before the selection, the output is:

             TLD  Length                                              Words  \
0       .biz           5                                                ...   
1       .biz           4                                                ...   
2       .biz           5                                                ...   
3       .biz           5                                                ...   
4       .biz           3                                                ...   
5       .biz           3                                                ...   
6       .biz           6                                                ...

So I know the rows of the TLD column contain the value .biz. I also tried:

>>> print(df.loc[df['TLD'] == '.biz'])

But the result is

Empty DataFrame

with my list of columns.

What am I doing wrong?

Answer 1

There seems to be some whitespace around the values, so you need to remove it with strip:

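# Option 1: strip only for the comparison, leaving the column unchanged
print(df.loc[df['TLD'].str.strip() == '.biz'])

# Option 2: strip the column itself, then the original query works
df['TLD'] = df['TLD'].str.strip()
df = df.query('TLD == ".biz"')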
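
To see why the original query came back empty, here is a minimal self-contained sketch. The DataFrame is made up for illustration (it is not the asker's t.txt), and it assumes the values are padded with trailing spaces, which is what the printed output suggests. Printing the values with repr() makes the hidden whitespace visible.

```python
import pandas as pd

# Hypothetical data standing in for t.txt: the TLD values carry trailing spaces.
df = pd.DataFrame({
    "TLD": [".biz   ", ".biz   ", ".com   "],
    "Length": [5, 4, 3],
})

# The padded strings are not equal to ".biz", so the query matches nothing.
before = df.query('TLD == ".biz"')

# repr() shows the quotes around each value, exposing the padding.
padded_values = [repr(v) for v in df["TLD"].unique()]
print(padded_values)

# After stripping the column, the same query finds the rows.
df["TLD"] = df["TLD"].str.strip()
after = df.query('TLD == ".biz"')

print(len(before), len(after))
```

If the padding comes from the file layout, stripping right after reading (as above) keeps every later comparison simple.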
