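I have a log file of about 1.5 GB. It contains log data in the following format:
A|B|C|D
The fields are delimited by the '|' character and there are no column names. There are only four columns.
How can I parse it in Python 3.6, add user-defined column names, and export it to a .csv file? And how can I split the output by number of rows when exporting to the .csv file?
I have started writing the code below, but I don't know how to proceed: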
import re
import pandas as pd
from pandas import ExcelWriter

infile = r"D:\Sys\file.log"

df = pd.DataFrame()
with open(infile, encoding="ISO-8859-1") as f:
    f = f.readlines()
    for line in f:
        print(line)
I can inspect the lines with the print statement.
Answer 0 (score: 2)
Here is a solution that does not involve pandas:
import csv

# newline='' stops csv.writer from emitting blank rows on Windows (Python 3).
# On Python 2, open the output with 'wb' instead.
with open(r"D:\Sys\file.log", encoding="ISO-8859-1") as f, open('logfile.csv', 'w', newline='') as f2:
    writer = csv.writer(f2)
    writer.writerow(['Index', 'A', 'B', 'C', 'D'])  # replace with your custom column header
    i = 0
    for line in f:
        writer.writerow([i] + line.rstrip().split('|'))
        i += 1
        if i == 10000:  # stop after the first 10,000 rows
            break
It uses csv.writer to write the data to a file in CSV format.
Answer 1 (score: 1)
Since your log is well formatted, you can read it with csv.reader and then write it back out with csv.writer.
Reference:
Reader Objects
Writer Objects
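The csv.reader / csv.writer approach can also cover the "split by number of rows" part of the question. The following is only a sketch under assumptions: the function name `split_log_to_csv`, the `part_{:03d}.csv` naming pattern, and the choice to treat quote characters in the log as literal text (`csv.QUOTE_NONE`) are illustrative and do not come from the answers above.

```python
import csv
from itertools import islice


def split_log_to_csv(src_path, dst_template, header, rows_per_file,
                     encoding="ISO-8859-1"):
    """Convert a '|'-delimited log into numbered CSV files.

    dst_template is a format string such as 'part_{:03d}.csv' (hypothetical
    naming scheme). Returns the list of paths that were written.
    """
    written = []
    with open(src_path, encoding=encoding, newline="") as src:
        # Assumption: quotes in the log are ordinary characters, not quoting.
        reader = csv.reader(src, delimiter="|", quoting=csv.QUOTE_NONE)
        while True:
            # Pull at most rows_per_file rows; the log is never loaded whole.
            rows = list(islice(reader, rows_per_file))
            if not rows:
                break
            dst_path = dst_template.format(len(written) + 1)
            with open(dst_path, "w", encoding=encoding, newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)   # every part gets the column names
                writer.writerows(rows)
            written.append(dst_path)
    return written
```

For the file in the question, a call might look like `split_log_to_csv(r"D:\Sys\file.log", "part_{:03d}.csv", ["A", "B", "C", "D"], 10000)`. Only one batch of rows is held in memory at a time, which matters for a 1.5 GB input.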
Answer 2 (score: 0)
With pandas, you can achieve what you want with the following code:
import pandas

chunk_size = 10000
custom_headers = ["custheader-1", "custheader-2", "custheader-3", "custheader-4"]
reader = pandas.read_table("log.txt", sep="|", header=None, chunksize=chunk_size)
for index, chunk in enumerate(reader):
    if index == 0:
        # First chunk: create the file and write the custom header row.
        chunk.to_csv("out.csv", index=False, sep="|", header=custom_headers)
    else:
        # Later chunks: append without repeating the header.
        # sep="|" keeps the output pipe-delimited; omit it for commas.
        chunk.to_csv("out.csv", index=False, mode='a', sep="|", header=False)
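If the goal is one comma-separated file per block of rows rather than a single appended file, the same chunked reader can write each chunk to its own path. This is a sketch, not part of the answer above: the function name `log_to_csv_parts` and the file naming pattern are hypothetical, and `dtype=str` with `keep_default_na=False` is an assumption made so that values such as "NA" or empty fields are passed through as text instead of being converted to NaN.

```python
import pandas as pd


def log_to_csv_parts(src_path, dst_template, column_names, chunk_size,
                     encoding="ISO-8859-1"):
    """Write each chunk of a '|'-delimited log to its own CSV file.

    dst_template is a format string such as 'out_{:03d}.csv' (hypothetical
    naming scheme). Returns the list of paths that were written.
    """
    reader = pd.read_csv(
        src_path,
        sep="|",
        header=None,            # the log has no header row
        names=column_names,     # user-defined column names
        dtype=str,              # keep every field as text
        keep_default_na=False,  # do not turn "NA" or "" into NaN
        chunksize=chunk_size,   # iterate in blocks instead of loading 1.5 GB
        encoding=encoding,
    )
    paths = []
    for part_number, chunk in enumerate(reader, start=1):
        dst_path = dst_template.format(part_number)
        # Default separator is a comma; the header row comes from `names`.
        chunk.to_csv(dst_path, index=False, encoding=encoding)
        paths.append(dst_path)
    return paths
```

A call such as `log_to_csv_parts("log.txt", "out_{:03d}.csv", ["A", "B", "C", "D"], 10000)` would produce out_001.csv, out_002.csv, and so on, each with at most 10,000 data rows.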