My DataFrame looks like this:
indeed.fr
11.41%
career2.successfactors.eu
8.53%
37.16%
pracuj.pl
7.40%
80.42%
corporate.danone.com.br
6.64%
indeed.com.br
4.68%
61.73%
I want to keep only the first percentage after each source, like this:
indeed.fr
11.41%
career2.successfactors.eu
8.53%
pracuj.pl
7.40%
corporate.danone.com.br
6.64%
indeed.com.br
4.68%
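All rows are strings, so I don't know whether I can drop rows based on a condition, for example when the previous row contains a %.
Any ideas?
Thanks for your time!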
import pandas as pd
mydata = ['indeed.fr','11.41%','career2.successfactors.eu','8.53%','37.16%','pracuj.pl','7.40%','80.42%','corporate.danone.com.br','6.64%','indeed.com.br','4.68%','61.73%']
df = pd.DataFrame(mydata)
In the end, I want the trimmed list shown above.
Answer 0 (score: 1)
import pandas as pd
mydata = ['indeed.fr','11.41%','career2.successfactors.eu','8.53%','37.16%','pracuj.pl','7.40%','80.42%','corporate.danone.com.br','6.64%','indeed.com.br','4.68%','61.73%']
df = pd.DataFrame(mydata)
That is the sample you created.
The solution is as follows:
rowList = []
row = []
# Variable to keep track of the number of times I see a percentage value
percentVal = 0
for i in df.index:
    # A value that does not start with a digit is treated as a source name
    if df.at[i, 0][0] not in set('0123456789'):
        row.append(df.at[i, 0])
        percentVal = 0
    else:
        percentVal += 1
        if percentVal != 2:
            row.append(df.at[i, 0])
            rowList.append(row)
            row = []
        else:
            # If percentVal == 2, I have seen my second percentage value and I'm going to skip it.
            print("Skipping {}".format(df.at[i, 0]))
            row = []
yourSol = pd.DataFrame(rowList)
yourSol.columns = ['Incoming Referal Sources', 'Value (%)']
print(yourSol)
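As a possible alternative that follows the idea in the question (drop a row when the previous row contains a %), here is a minimal vectorized sketch. It assumes every percentage string ends with '%' and that each source name is followed by at least one percentage. The names is_pct, prev_is_pct, filtered and pairs, as well as the column labels 'source' and 'share', are made up for illustration and do not come from the original post.

```python
import pandas as pd

mydata = ['indeed.fr', '11.41%', 'career2.successfactors.eu', '8.53%', '37.16%',
          'pracuj.pl', '7.40%', '80.42%', 'corporate.danone.com.br', '6.64%',
          'indeed.com.br', '4.68%', '61.73%']
df = pd.DataFrame(mydata)

# True where the row itself is a percentage string (assumption: it ends with '%').
is_pct = df[0].str.endswith('%')
# True where the row directly above is a percentage; the first row has no predecessor.
prev_is_pct = is_pct.shift(1, fill_value=False)

# Drop a percentage only when it directly follows another percentage.
filtered = df[~(is_pct & prev_is_pct)].reset_index(drop=True)
print(filtered)

# Optional: reshape into two columns. This assumes the filtered rows now
# strictly alternate between source name and percentage.
pairs = pd.DataFrame(filtered[0].to_numpy().reshape(-1, 2),
                     columns=['source', 'share'])
print(pairs)
```

Because the mask compares each row with the one above it, a run of three or more percentages would also be reduced to its first value, which the loop above does not guarantee.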