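I have a large dataset with three columns. I need to group the data by the first column and use the frequency of each value in that column to draw line plots and density plots. There are 1600 distinct values to count for these plots.
Some of the data: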
Search keyword Campaign ID total_ctr
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 15.38
"2 +bhk +flat +in +bangalore 653435194 0.00
"2 +bhk +flat +in +bangalore 653435194 0.0
+bedroom +apartment +in +bangalore 1155466985 0.00
+1 +bedroom +apartment +in +bangalore 1155466985 0.00
+1 +bedroom +apartment +in +bangalore 1155466985 0.00
+1 +bedroom +apartment +in +bangalore 1155466985 100.00
+1 +bedroom +apartment +in +bangalore 1155466985 0.00
+1 +bedroom +apartment +in +bangalore 1155466985 0.00
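The dataset continues like this for 22,200 rows, and there are 1600 search keywords, each with different combinations of total_ctr and Campaign ID.
Some of their frequencies are: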
Campaign ID total_ctr
Search keyword
"2 +bhk +flat +in +bangalore 24 24
+1 +bhk +flat +in +bangalore 89 89
+1 +bhk +flat +near +manyata tech park 23 23
+1 +bhk +flat +price +in +bangalore 15 15
+1 +bhk +flat +sale +bangalore 9 9
+1 +bhk +flats +bangalore 52 52
+1 +bhk +for +sale +in +bangalore 76 76
+1 +bhk +house +for +sale +in +bangalore 20 20
+1 +bhk +in +bangalore +sale 61 61
+1 +bhk +in +north +bangalore 36 36
+1 +bhk +near +airport 1 1
+1 +bhk +north +bangalore 8 8
+1bhk +apartment +in +bangalore 53 53
+1bhk +apartments +bangalore 9 9
+1bhk +bangalore 118 118
+1bhk +flat +bangalore 26 26
+1bhk +flats +bangalore 107 107
+1bhk +near +airport 4 4
+2 +3 +bhk +flats in +bangalore 50 50
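The frequency table above has the shape of a pandas groupby count. As a minimal sketch of how such a table can be produced, assuming the data sits in a DataFrame named `df` with the column names shown in the question (the rows below are made up for illustration, not taken from the real dataset):

```python
import pandas as pd

# Small made-up sample; only the column names come from the question.
df = pd.DataFrame({
    "Search keyword": ["+1bhk +bangalore"] * 3 + ["+1 +bhk +near +airport"],
    "Campaign ID": [653435194, 653435194, 1155466985, 1155466985],
    "total_ctr": [0.0, 15.38, 0.0, 100.0],
})

# Number of non-null rows per keyword, one count column per remaining column.
freq = df.groupby("Search keyword").count()
print(freq)

# The same information as a single Series of counts per keyword.
counts = df["Search keyword"].value_counts()
print(counts)
```

`groupby(...).count()` repeats the count in every column, which is why Campaign ID and total_ctr show identical numbers; `value_counts()` is enough if only the frequency is needed.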
Starting from these frequencies, I want to draw a line plot and a density plot for each of the 1600 keywords.
Answer 0 (score: 1)
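import matplotlib.pyplot as plt
import seaborn as sns

# df is the DataFrame with the columns 'Search keyword', 'Campaign ID', 'total_ctr'
for i in df['Search keyword'].unique():
    # total_ctr values for the current keyword
    xxx = df[df['Search keyword'] == i]['total_ctr']
    jj = len(xxx)
    if jj > 29:  # the original had '>>' (a bit shift); '>' is presumably what was meant
        print(jj)
        # line plot of total_ctr for this keyword
        plt.plot(xxx)
        plt.title(i)
        plt.show()
        # density plot of total_ctr for this keyword
        sns.kdeplot(xxx)
        plt.title(i + ' density')
        # save before show(), otherwise the saved figure is empty;
        # note that every iteration overwrites the same file
        plt.savefig('books_read.pdf')
        plt.show()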
I think this will work with Python pandas.