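This is my first time using a Jupyter notebook. I am trying to group one column of a CSV file and get a count of its values. With this code I got the following result.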
import pandas
df = pandas.read_csv('a.csv', sep=',')
df.groupby('name').name.count()
name
>Aa</TOPONYM> 4
>Aachen</TOPONYM> 5
>Aartselaar</TOPONYM> 1
>Abadan</TOPONYM> 1
>Abaya</TOPONYM> 1
>Abba</TOPONYM> 12
>Abbey 2
>Abbeydale</TOPONYM> 1
>Abbot</TOPONYM> 2
>Abbots 3
>Abbotsford</TOPONYM> 22
>Abbotsinch</TOPONYM> 5
>Abbotts 1
>Abel</TOPONYM> 1
>Aberchirder</TOPONYM> 2
>Aberdare</TOPONYM> 3
>Aberdeen 1
>Aberdeen</TOPONYM> 163
>Aberdeenshire</TOPONYM> 286
>Aberdour</TOPONYM> 9
>Aberfan</TOPONYM> 1
>Aberfeldy</TOPONYM> 16
>Abergavenny</TOPONYM> 4
>Aberlady 1
>Aberlady</TOPONYM> 3
>Abernethy</TOPONYM> 1
>Abertay 1
>Abertillery</TOPONYM> 6
>Abha</TOPONYM> 2
>Abidjan</TOPONYM> 10
...
>Zakho</TOPONYM> 20
>Zakopane</TOPONYM> 1
>Zambezi 2
>Zambezi</TOPONYM> 8
>Zambia</TOPONYM> 19
>Zamboanga</TOPONYM> 4
>Zandak</TOPONYM> 3
>Zanzibar</TOPONYM> 11
>Zaragosa</TOPONYM> 1
>Zaragoza</TOPONYM> 4
>Zeebrugge</TOPONYM> 28
>Zeeland</TOPONYM> 2
>Zemun</TOPONYM> 1
>Zenica</TOPONYM> 12
>Zermatt</TOPONYM> 5
>Zetland</TOPONYM> 1
>Zhizhong</TOPONYM> 1
>Zhongshan</TOPONYM> 2
>Zhuhai</TOPONYM> 1
>Zimbabwe</TOPONYM> 377
>Znamenskoye</TOPONYM> 1
>Zoetermeer</TOPONYM> 1
>Zola</TOPONYM> 1
>Zomba</TOPONYM> 3
>Zulu</TOPONYM> 1
>Zululand</TOPONYM> 2
>Zuni</TOPONYM> 2
>Zurich</TOPONYM> 86
>Zvornik</TOPONYM> 3
>Zwolle</TOPONYM> 1
Name: name, Length: 8585, dtype: int64
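Is it possible to get the counts letter by letter? First I would run the command for the letter a and it should give all values starting with a, then for b, and so on. Alternatively, is it possible to skip the first 100 values?
My real data looks like this: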
<TOPONYM geonameid="2657540" lat="51.24827" lon="-0.76389" >Aldershot</TOPONYM>
<TOPONYM geonameid="3037854" lat="49.9" lon="2.3" >Amiens</TOPONYM>
<TOPONYM geonameid="6216857" lat="-43.59832" lon="171.55011" >Alaska</TOPONYM>
<TOPONYM geonameid="3037854" lat="49.9" lon="2.3" >Amiens</TOPONYM>
<TOPONYM geonameid="2759794" lat="52.37403" lon="4.88969" >Amsterdam</TOPONYM>
<TOPONYM geonameid="7216668" lat="28.0106" lon="-82.1184" >Alabama</TOPONYM>
<TOPONYM geonameid="5884078" lat="48.98339" lon="-73.34907" >Ally</TOPONYM>
<TOPONYM geonameid="2507480" lat="36.7525" lon="3.04197" >Algiers</TOPONYM>
<TOPONYM geonameid="2759794" lat="52.37403" lon="4.88969" >Amsterdam</TOPONYM>
<TOPONYM geonameid="2759794" lat="52.37403" lon="4.88969" >Amsterdam</TOPONYM>
Answer 0 (score: 1)
You can use str[0] to select the first letter and then value_counts:
import pandas
df = pandas.read_csv('a.csv')
# count how many names start with each character
a = df['name'].str[0].value_counts().rename_axis('alph').reset_index(name='count')
Use str[1] for the second letter.
Another solution with groupby:
# same counts, grouping the column by its own first character
a = df['name'].groupby(df['name'].str[0]).count().reset_index(name='count')
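Because the name values printed in the question start with > and often end with </TOPONYM>, it may help to pull out the bare place name before counting. The following is only a sketch under assumptions: the input is taken to be one <TOPONYM ...> element per line, as in the sample rows, the last two rows use placeholder attributes and are not real data, and the variable names (raw, names, by_letter, per_name, only_a, skipped) are made up for illustration. The slice uses 2 instead of 100 only because the sample is tiny.

```python
import pandas as pd

# Sample rows shaped like the question's data. The last two rows use
# placeholder attributes (hypothetical, not real coordinates).
raw = """<TOPONYM geonameid="2657540" lat="51.24827" lon="-0.76389" >Aldershot</TOPONYM>
<TOPONYM geonameid="3037854" lat="49.9" lon="2.3" >Amiens</TOPONYM>
<TOPONYM geonameid="6216857" lat="-43.59832" lon="171.55011" >Alaska</TOPONYM>
<TOPONYM geonameid="3037854" lat="49.9" lon="2.3" >Amiens</TOPONYM>
<TOPONYM geonameid="2759794" lat="52.37403" lon="4.88969" >Amsterdam</TOPONYM>
<TOPONYM geonameid="0" lat="0" lon="0" >Zurich</TOPONYM>
<TOPONYM geonameid="0" lat="0" lon="0" >Zambia</TOPONYM>"""

lines = pd.Series(raw.splitlines())

# Keep only the text between '>' and the closing tag.
names = lines.str.extract(r">([^<>]+)</TOPONYM>", expand=False)

# Number of rows per first letter.
by_letter = names.str[0].value_counts().sort_index()

# Count per place name, sorted alphabetically.
per_name = names.value_counts().sort_index()

# Counts for a single letter, here 'A'.
only_a = per_name[per_name.index.str.startswith("A")]

# Skip the first entries of the sorted counts (use 100 on the real data).
skipped = per_name.iloc[2:]

print(by_letter)
print(only_a)
print(skipped)
```

Looping over string.ascii_uppercase and applying the same startswith filter would give one block of counts per letter, which seems to be what the question asks for.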