Question

link 1
link 2 *I copied the table and created a CSV file

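I need to plot the total population from file 1 and the total number of adherents in New Jersey as a line or bar chart.

I tried merging the two CSVs together, but the result looks strange: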

import pandas as pd
import matplotlib.pyplot as plt

clifton_data = pd.read_csv('cliftondata2010census.csv')

religion = pd.read_csv('2010_ Top Five States by Adherence Rate - Sheet1.csv')

all_data = clifton_data.append(religion)
all_data.plot()
all_data.plot(kind='line',x='1',y='2') # line plot
all_data.plot(kind='density')

I need to plot the total population from file 1 and compare it, in a line or bar chart, with the total number of adherents in New Jersey.

Answer 1

Here is a quick guide to get you started. I hope it helps.

From link 2, you can see:

Massachusetts   641     2,940,199   449.05
Rhode Island    159     466,598     443.30
New Jersey      729     3,235,290   367.99
Connecticut     399     1,252,936   350.56
New York        1,630   6,286,916   324.43

Copy the text above, paste it, and save the data to congregation.txt.

Link 1 is broken. However, suppose the population data looks like this:

Massachusetts   3,141,270
Rhode Island    530,698
New Jersey      4,335,399
Connecticut     2,134,935
New York        10,366,556

Similarly, copy the text above, paste it, and save the data to population.txt.

Then you can run something like this:

import pandas as pd
import matplotlib.pyplot as plt

con = pd.read_csv('congregation.txt', sep=r'[ \t]{2,}',header=None, index_col=False,engine='python')
pop = pd.read_csv('population.txt', sep=r'[ \t]{2,}',header=None, index_col=False,engine='python')

#note concat and not append
#con[0] is state, con[2] is congregation, pop[1] is population
#print(con.head()) and print(pop.head()) to visualize if you are still confused
df = pd.concat([con[[0,2]],pop[1]],axis=1)

df.columns = ['State', 'Congregation', 'Population']

#need to do some cleaning here to convert numbers with comma to an integer
df['Congregation'] = df['Congregation'].apply(lambda t: t.replace(',','')).astype(int)
df['Population'] = df['Population'].apply(lambda t: t.replace(',','')).astype(int)

df.set_index('State',inplace=True)

print(df.head())
#at this stage your df looks like this
#               Congregation  Population
#State                                  
#Massachusetts       2940199     3141270
#Rhode Island         466598      530698
#New Jersey          3235290     4335399
#Connecticut         1252936     2134935
#New York            6286916    10366556
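
The "concat and not append" comment above is the key step, so here is a minimal sketch of the difference. The two small frames and their column names are hypothetical stand-ins for the real files. Note also that DataFrame.append was removed in pandas 2.0, so row-wise stacking is written with pd.concat(axis=0) here.

```python
import pandas as pd

# Hypothetical stand-ins for the two files; the column names are assumptions.
adherents = pd.DataFrame({"State": ["New Jersey", "New York"],
                          "Adherents": [3235290, 6286916]})
population = pd.DataFrame({"State": ["New Jersey", "New York"],
                           "Population": [4335399, 10366556]})

# Row-wise stacking (what append did): rows from both frames are piled up,
# and each row gets NaN in the column it does not have.
stacked = pd.concat([adherents, population], axis=0, ignore_index=True)

# Column-wise concatenation: rows are aligned on the shared State index,
# so each state ends up with both numbers on one row.
side_by_side = pd.concat([adherents.set_index("State"),
                          population.set_index("State")], axis=1)

print(stacked)
print(side_by_side)
```

The stacked frame is probably what made the merged plot in the question look strange: half of every numeric column is missing.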

Output

Note: for demonstration purposes I kept the other states here; with New Jersey alone, the bar chart would look empty.

ax = df.plot.bar()
plt.show()
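
If you really only want New Jersey, as the question asks, one possible approach is to select that single row and plot its two values as bars. This is only a sketch: the frame is rebuilt inline so it runs on its own, the population figure is the assumed one from above (link 1 is broken), and the column name Adherents and the output file name are my own choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Rebuilt inline so the sketch is self-contained.
# The population value is the assumed figure used earlier, not real census data.
df = pd.DataFrame(
    {"Adherents": [3235290], "Population": [4335399]},
    index=pd.Index(["New Jersey"], name="State"),
)

# Selecting one row by label gives a Series indexed by column name.
nj = df.loc["New Jersey"]

# One bar per value: adherents next to total population.
ax = nj.plot.bar(rot=0)
ax.set_ylabel("People")
ax.set_title("New Jersey: adherents vs. population")

# Fraction of the (assumed) population counted as adherents.
share = nj["Adherents"] / nj["Population"]
print(f"Adherents as a share of population: {share:.1%}")

plt.savefig("new_jersey.png")  # hypothetical output file name
```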

Edit: I meant "adherents" rather than "congregation". I got that wrong above.

Pandas: import two CSV files and plot specific data
