Pandas groupby with delimiter join

Time: 2017-06-05 12:03:24

Tags: python-3.x pandas pandas-groupby

I am trying to use groupby to combine the values of rows that share the same key.

col val
A  Cat
A  Tiger
B  Ball
B  Bat

import pandas as pd
df = pd.read_csv("Inputfile.txt", sep='\t')
group = df.groupby(['col'])['val'].sum()

This gives me

A CatTiger
B BallBat

I would like to introduce a delimiter, so that my output looks like

A Cat-Tiger
B Ball-Bat

I tried

group = df.groupby(['col'])['val'].sum().apply(lambda x: '-'.join(x))

which yielded

A C-a-t-T-i-g-e-r
B B-a-l-l-B-a-t

What is the issue here?

Thanks,

AP

2 answers:

Answer 0: (score: 4)

Alternatively, you can also do it this way:

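One alternative that fits here is to hand the join directly to agg. The following is a minimal sketch rather than a definitive version; it assumes the sample data from the question, built inline instead of read from Inputfile.txt.

```python
import pandas as pd

# Sample data from the question, built inline (assumption: the real
# input comes from the tab-separated Inputfile.txt).
df = pd.DataFrame({
    "col": ["A", "A", "B", "B"],
    "val": ["Cat", "Tiger", "Ball", "Bat"],
})

# '-'.join is used as the aggregation function: it receives the values
# of each group and joins them with the delimiter.
group = df.groupby("col")["val"].agg("-".join)
print(group)
```

Here group is a Series indexed by col, with one joined string per key.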

Answer 1: (score: 1)

Try

group = df.groupby(['col'])['val'].apply(lambda x: '-'.join(x))
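As for why the attempt in the question splits every character: .sum() has already concatenated each group into one string, so the .apply that follows runs '-'.join on that string and iterates over its characters. A small sketch of the difference, again assuming the question's sample data built inline:

```python
import pandas as pd

# Sample data from the question, built inline for illustration.
df = pd.DataFrame({
    "col": ["A", "A", "B", "B"],
    "val": ["Cat", "Tiger", "Ball", "Bat"],
})

# After .sum(), each group is already a single string such as "CatTiger".
summed = df.groupby("col")["val"].sum()

# Series.apply calls the lambda once per string, and joining a string
# iterates over its characters.
per_char = summed.apply(lambda x: "-".join(x))

# Applying the join on the groupby object passes each group's values
# to the lambda, so whole strings are joined.
per_group = df.groupby("col")["val"].apply(lambda x: "-".join(x))

print(per_char)
print(per_group)
```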