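Converting XML to CSV when the child tags under a parent tag share the same name

Time: 2018-12-30 12:36:50

Tags: python excel xml

I have an XML file like the one below and am trying to convert it to CSV with the xml2csv Python library. However, the image tags break everything. I would like to get all of the tags into separate columns. How can I achieve this?

Thanks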

const SideMenuDrawer = createDrawerNavigator({
  Main: MainScreen,
  'Invite A Friend Screen': InviteAFriendScreen,
  About: AboutScreen,
  Schedule: ScheduleScreen,
  Groups: GroupsScreen
},
  {
    navigationOptions: ({ navigation }) => {
      const { routeName } = navigation.state.routes[navigation.state.index];
      return {
        headerTitle: routeName,
        headerTintColor: '#ffffff',
        headerStyle: {
          backgroundColor: '#2F95D6',
          borderBottomColor: '#ffffff',
          borderBottomWidth: 3,
        },
        headerTitleStyle: {
          fontSize: 18
        }
      }
    }
  }
);

const AppStack = createStackNavigator({
  MainScreen: SideMenuDrawer,
  PhotoScreen: PhotoScreen,
  DocumentScreen: DocumentScreen,
  AudioScreen: AudioScreen,
  GalleryScreen: GalleryScreen

},
  {
    defaultNavigationOptions: ({ navigation }) => {
      return {
        headerLeft: (
          <Icon
            name="md-menu"
            size={35}
            style={{ paddingLeft: 10 }}
            color="white"
            onPress={() => navigation.openDrawer()}
          />
        )
      }
    }
  }
);



const AppContainer = createAppContainer(createSwitchNavigator(
  {
    LoginSplashScreen: LoginSplashScreen,
    MainScreen: AppStack
  },
  {
    initialRouteName: 'MainScreen',
  }
));

export default class App extends React.Component {
  render() {
    return (
      <AppContainer />
    )
  }
}

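2 answers:

Answer 0: (score: 0)

As you have probably guessed, the problem is that each product node contains several img_item tags, which xml2csv does not know how to handle (and, going through its documentation, there does not seem to be a way to tell it how to handle these nodes).

However, you can do this easily with the built-in csv module. You only need to decide how to delimit the URLs of the different images. In the example below I chose ; (obviously you cannot use , unless you use a different delimiter for the columns).

Also note that I hard-coded the headers. This can easily be changed so that the headers are detected dynamically from the child elements of the product node.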

import csv
import xml.etree.ElementTree as ET

string = '''<products>
    <product>
        <code>722</code>
        <ws_code>B515C16CRU</ws_code>
        <supplier_code>B515C16CRU</supplier_code>
        <images>
            <img_item type_name="">https://www.apparel.com.tr/stance-corap-cruker-grey-orap-stance-ankle-bters-3378-72-B.jpg</img_item>
            <img_item type_name="">https://www.apparel.com.tr/stance-corap-cruker-grey-orap-stance-ankle-bters-3379-72-B.jpg</img_item>
            <img_item type_name="">https://www.apparel.com.tr/stance-corap-cruker-grey-orap-stance-ankle-bters-3380-72-B.jpg</img_item>
        </images>
    </product>
</products>'''

root = ET.fromstring(string)

headers = ('code', 'ws_code', 'supplier_code', 'images')

with open('test.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=headers)
    writer.writeheader()
    for product in root.iter('product'):
        writer.writerow({'code': product.find('code').text,
                         'ws_code': product.find('ws_code').text,
                         'supplier_code': product.find('supplier_code').text,
                         'images': ';'.join(img.text for img in product.iter('img_item'))})

This produces the following CSV:

code,ws_code,supplier_code,images
722,B515C16CRU,B515C16CRU,https://www.apparel.com.tr/stance-corap-cruker-grey-orap-stance-ankle-bters-3378-72-B.jpg;https://www.apparel.com.tr/stance-corap-cruker-grey-orap-stance-ankle-bters-3379-72-B.jpg;https://www.apparel.com.tr/stance-corap-cruker-grey-orap-stance-ankle-bters-3380-72-B.jpg
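
Since the question asks for the images in separate columns, and the answer mentions that the headers could be detected dynamically, here is a minimal sketch of both ideas. It is not part of the original answer: the function names `product_rows` and `write_csv`, the numbered column names such as `img_item_1`, and the shortened example.com URLs are all assumptions made for illustration. It also assumes at most one level of nesting under each product, as in the sample XML.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; the URLs are placeholders, not the real ones.
SAMPLE_XML = """<products>
    <product>
        <code>1</code>
        <ws_code>A</ws_code>
        <images>
            <img_item type_name="">https://example.com/a.jpg</img_item>
            <img_item type_name="">https://example.com/b.jpg</img_item>
        </images>
    </product>
    <product>
        <code>2</code>
        <ws_code>B</ws_code>
        <images>
            <img_item type_name="">https://example.com/c.jpg</img_item>
        </images>
    </product>
</products>"""

sample_root = ET.fromstring(SAMPLE_XML)


def product_rows(root):
    """Return (headers, rows); headers are collected in first-seen order."""
    headers, rows = [], []
    for product in root.iter("product"):
        row = {}
        for child in product:
            if len(child) == 0:
                # Leaf child such as <code>: one column named after the tag.
                row[child.tag] = (child.text or "").strip()
            else:
                # Container such as <images>: one numbered column per item.
                for n, item in enumerate(child, start=1):
                    row[f"{item.tag}_{n}"] = (item.text or "").strip()
        for key in row:
            if key not in headers:
                headers.append(key)
        rows.append(row)
    return headers, rows


def write_csv(root, stream):
    """Write all products to an open text stream as CSV."""
    headers, rows = product_rows(root)
    # restval fills the image columns that a product with fewer images lacks.
    writer = csv.DictWriter(stream, fieldnames=headers, restval="")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(sample_root, buffer)
    print(buffer.getvalue())
```

All rows are collected before anything is written because the number of image columns is only known once every product has been seen. To write to a file instead, pass a handle opened with `open('test.csv', 'w', newline='')` as the stream.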

Answer 1: (score: -1)

import csv
import xml.etree.ElementTree as ET


class xml_to_csv:
    def do(self):
        # self.xml_file_location = input(r"Enter full path of XML file (e.g. D:\programs\ResidentData.xml): ")
        self.tree = ET.parse("urunler-fotolu.xml")
        self.root = self.tree.getroot()
        # Raw strings keep the backslashes in the example paths literal
        self.csv_file_location = input(r"Enter full path to store CSV file (e.g. D:\programs\csv_file.csv): ")
        # newline='' prevents blank lines between rows on Windows;
        # the with block closes the file once all records are written
        with open(self.csv_file_location, 'w', newline='') as csv_data:
            self.csv_writer = csv.writer(csv_data)
            self.find_records(self.root)

    def find_attributes(self, record):
        # A leaf element yields its own text; any other element yields
        # the texts of all of its descendants, in document order
        children = list(record)
        if not children:
            return [record.text]
        temp = []
        for child in children:
            temp = temp + self.find_attributes(child)
        return temp

    def find_records(self, root1):
        # Each direct child of the root becomes one CSV row
        for record in root1:
            # Trim surrounding whitespace and turn missing text into ''
            csv_record = [(text or '').strip() for text in self.find_attributes(record)]
            print(csv_record)
            self.csv_writer.writerow(csv_record)


if __name__ == "__main__":
    obj = xml_to_csv()
    obj.do()

Input:

<State>
  <Resident Id="100">
    <Name>Sample Name</Name>
    <PhoneNumber>1234567891</PhoneNumber>
    <EmailAddress>sample_name@example.com</EmailAddress>
    <Address>
      <StreetLine1>Street Line1</StreetLine1>
      <City>City Name</City>
      <StateCode>AE</StateCode>
      <PostalCode>12345</PostalCode>
    </Address>
  </Resident>
</State>

Output:

  ['Sample Name', '1234567891', 'sample_name@example.com', 'Street Line1', 'City Name', 'AE', '12345']
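
The flattening above keeps only the text values, so the resulting CSV has no header row. As a rough sketch of one way to keep the tag names as well (not part of the original answer), the hypothetical helper `flatten` below maps each leaf to its text, keyed by a slash-separated tag path. The helper name and the path format are assumptions. Like the original code it ignores attributes such as `Id`, and repeated sibling tags with the same name would overwrite each other.

```python
import xml.etree.ElementTree as ET

# Sample record taken from the input shown in the answer.
SAMPLE = """<State>
  <Resident Id="100">
    <Name>Sample Name</Name>
    <PhoneNumber>1234567891</PhoneNumber>
    <EmailAddress>sample_name@example.com</EmailAddress>
    <Address>
      <StreetLine1>Street Line1</StreetLine1>
      <City>City Name</City>
      <StateCode>AE</StateCode>
      <PostalCode>12345</PostalCode>
    </Address>
  </Resident>
</State>"""


def flatten(element, prefix=""):
    """Map every leaf below `element` to its text, keyed by its tag path."""
    flat = {}
    for child in element:
        path = f"{prefix}{child.tag}"
        if len(child):
            # Nested element: recurse and extend the path with a slash.
            flat.update(flatten(child, prefix=path + "/"))
        else:
            flat[path] = (child.text or "").strip()
    return flat


resident = ET.fromstring(SAMPLE).find("Resident")
flat_record = flatten(resident)

if __name__ == "__main__":
    for column, value in flat_record.items():
        print(column, "=", value)
```

The keys of the returned dictionary can serve as `fieldnames` for `csv.DictWriter`, with one dictionary per record passed to `writerow`.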