Splitting String with Multiple Delimiters in a Particular Order

Time: 2017-11-08 22:09:27

Tags: python string split

I am dealing with a type of ASCII file that effectively holds 4 columns of data, with each row on its own line. Below is an example of a row of data from this file:

'STOP.F 11966.0000:STOP DEPTH'

The data is always structured so that the delimiter between the first and second column is a period, the delimiter between the second and third column is a space and the delimiter between the third and fourth column is a colon.

Ideally, I would like to find a way to return the following result from the string above

['STOP', 'F', '11966.0000', 'STOP DEPTH']

I tried using a regular expression with the period, space and colon as delimiters, but it breaks down (see example below). I don't know how to specify the order in which the delimiters should be applied, or whether the regular expression itself can limit the number of splits per delimiter. I want to split on the delimiters in that specific order, using each delimiter at most once.

import re
line = 'STOP.F 11966.0000:STOP DEPTH'
re.split(r"[. :]", line)
# returns ['STOP', 'F', '11966', '0000', 'STOP', 'DEPTH']
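
As a side note on why a split limit does not rescue this approach: the maxsplit argument of re.split() caps the total number of splits, not the number per delimiter. A minimal sketch, reusing the same line (the variable name limited is only for illustration):

```python
import re

line = 'STOP.F 11966.0000:STOP DEPTH'

# maxsplit=3 stops after the first three delimiters found, whichever they are.
# Here that is '.', ' ' and the '.' inside the number, so the colon is never used.
limited = re.split(r"[. :]", line, maxsplit=3)
print(limited)  # ['STOP', 'F', '11966', '0000:STOP DEPTH']
```

The decimal point in the third column is consumed as the third split, so the character class alone cannot express "period first, then space, then colon".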

Any suggestions on a tidy way to do this?

2 answers:

Answer 0 (score: 1)

This may work. Credit to Juan.

import re
pattern = re.compile(r'^(.+)\.(.+) (.+):(.+)$')
line = 'STOP.F 11966.0000:STOP DEPTH'
pattern.search(line).groups()
# returns ('STOP', 'F', '11966.0000', 'STOP DEPTH')
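
The greedy (.+) groups above rely on backtracking to land on the right delimiters. A possible tightening, offered only as a sketch, is to use negated character classes so each column stops at its own delimiter. The names ROW and split_row are made up for this example, and the pattern assumes the first column never contains a period and the second never contains a space:

```python
import re

# Each group excludes the delimiter that ends it, so no backtracking is needed:
# column 1 ends at the first '.', column 2 at the first ' ', column 3 at the first ':'.
ROW = re.compile(r'^([^.]+)\.([^ ]+) ([^:]+):(.+)$')


def split_row(line):
    """Return the four columns of a row as a list, or None if the line does not fit."""
    match = ROW.match(line)
    return list(match.groups()) if match else None


print(split_row('STOP.F 11966.0000:STOP DEPTH'))
# ['STOP', 'F', '11966.0000', 'STOP DEPTH']
```

Returning None for lines that do not match avoids the AttributeError that calling .groups() on a failed search would raise.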

Answer 1 (score: 0)

A re.split() solution with a specific regex pattern:

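# The standard-library re module rejects variable-width look-behind
# ("look-behind requires fixed-width pattern"), so this relies on the
# third-party regex module (pip install regex), which supports it.
import regex

s = 'STOP.F 11966.0000:STOP DEPTH'
# Split on the first period, the first space, and any colon.
result = regex.split(r'(?<=^[^.]+)\.|(?<=^[^ ]+) |:', s)

print(result)
# expected: ['STOP', 'F', '11966.0000', 'STOP DEPTH']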
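
If a regular expression is not required, the "each delimiter once, in order" rule can also be written with str.partition() from the standard library. This is a sketch rather than part of the original answer; the function name split_in_order and its delimiters parameter are hypothetical:

```python
def split_in_order(line, delimiters=('.', ' ', ':')):
    """Split on each delimiter exactly once, left to right, in the given order."""
    parts = []
    rest = line
    for delimiter in delimiters:
        # partition() splits at the first occurrence only and returns
        # (text before, delimiter, text after); the delimiter is '' if absent.
        head, sep, rest = rest.partition(delimiter)
        if not sep:
            raise ValueError(f'missing delimiter {delimiter!r} in {line!r}')
        parts.append(head)
    parts.append(rest)  # whatever follows the last delimiter is the final column
    return parts


print(split_in_order('STOP.F 11966.0000:STOP DEPTH'))
# ['STOP', 'F', '11966.0000', 'STOP DEPTH']
```

Because each partition() call only looks at the text left over from the previous one, the period in 11966.0000 and the space in STOP DEPTH are never treated as delimiters.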