Question

I have a log file of about 1.5 GB. The file contains log data in the following format:

A|B|C|D, delimited by the '|' character. It has no column names and only
four columns.

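How can I parse it in Python 3.6, export it to a .csv file, and add user-defined column names? Also, how can I split up the number of rows when exporting to the .csv file?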

I have started writing the code below, but I don't know how to proceed:

import re
import pandas as pd
from pandas import ExcelWriter

infile = r"D:\Sys\file.log"

df = pd.DataFrame()
with open(infile,encoding="ISO-8859-1") as f:
   f = f.readlines()

for line in f:
   print(line)

I can inspect the lines with the print statement.

Answer 1

Here is a solution that does not involve pandas:

import csv

# newline='' keeps the csv module from writing blank rows between records on Windows
with open(r"D:\Sys\file.log", encoding="ISO-8859-1") as f, open('logfile.csv', 'w', newline='') as f2:
    writer = csv.writer(f2)
    writer.writerow(['Index', 'A', 'B', 'C', 'D'])  # replace with your custom column header

    for i, line in enumerate(f):
        if i == 10000:  # stop after the first 10000 rows
            break
        # prepend the row index, then split the log line on '|'
        writer.writerow([i] + line.rstrip().split('|'))

This uses csv.writer to write the data to the file in CSV format.
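The snippet above stops after the first 10000 rows. If "dividing the number of rows" instead means spreading the whole log across several CSV files, one possible sketch is shown here. The function name split_log_to_csv, the out_pattern argument, and the part-numbering scheme are illustrative assumptions, not part of the original answer.

```python
import csv
from itertools import islice


def split_log_to_csv(log_path, out_pattern, header, rows_per_file=10000,
                     encoding="ISO-8859-1"):
    """Split a '|'-delimited log into numbered CSV files.

    out_pattern is a hypothetical naming scheme such as 'part_{:03d}.csv'.
    Returns the list of files written.
    """
    written = []
    with open(log_path, encoding=encoding) as src:
        part = 0
        while True:
            # Pull at most rows_per_file lines; the file is never fully in memory.
            lines = list(islice(src, rows_per_file))
            if not lines:
                break
            out_path = out_pattern.format(part)
            with open(out_path, 'w', encoding=encoding, newline='') as dst:
                writer = csv.writer(dst)
                writer.writerow(header)  # every part gets its own header row
                for line in lines:
                    writer.writerow(line.rstrip('\r\n').split('|'))
            written.append(out_path)
            part += 1
    return written
```

A call such as split_log_to_csv(r"D:\Sys\file.log", "logfile_{:03d}.csv", ['A', 'B', 'C', 'D']) would then produce logfile_000.csv, logfile_001.csv, and so on, each holding at most 10000 data rows.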

Answer 2

Since your log is well formatted, you can use csv.reader to read it and then rewrite it. You can use csv.writer to write it.

Reference:
Reader Objects
Writer Objects
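As a minimal sketch of the csv.reader / csv.writer approach, assuming the log really has one '|'-delimited record per line: the function name log_to_csv and its parameters are made up for illustration. quoting=csv.QUOTE_NONE on the reader is an assumption that double quotes in the log are ordinary characters rather than field quoting.

```python
import csv


def log_to_csv(log_path, csv_path, header, encoding="ISO-8859-1"):
    """Copy a '|'-delimited log into a comma-separated file with a header row.

    Returns the number of data rows written.
    """
    count = 0
    # newline='' is what the csv module expects for both reading and writing
    with open(log_path, encoding=encoding, newline='') as src, \
         open(csv_path, 'w', encoding=encoding, newline='') as dst:
        # Assumption: '"' has no special meaning in the log, so disable quote handling
        reader = csv.reader(src, delimiter='|', quoting=csv.QUOTE_NONE)
        writer = csv.writer(dst)  # default dialect: comma-separated, quotes added as needed
        writer.writerow(header)
        for row in reader:  # rows are streamed one at a time
            writer.writerow(row)
            count += 1
    return count
```

Because csv.writer quotes any field that contains a comma, values from the log that happen to include commas still land in a single column of the output.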

Answer 3

With pandas, you can achieve what you want using the following code:

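import pandas

chunk_size = 10000  # number of rows read and written per chunk
custom_headers = ["custheader-1", "custheader-2", "custheader-3", "custheader-4"]

# chunksize makes read_table return an iterator of DataFrames instead of loading the whole file
reader = pandas.read_table("log.txt", sep="|", header=None, chunksize=chunk_size)
for index, chunk in enumerate(reader):
    if index == 0:
        # first chunk: create the file and write the custom header row
        # (sep="|" keeps the pipe delimiter in the output; omit it to get commas)
        chunk.to_csv("out.csv", index=False, sep="|", header=custom_headers)
    else:
        # later chunks: append without repeating the header
        chunk.to_csv("out.csv", index=False, mode='a', sep="|", header=False)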

Parse a log file in Python and save it to csv
