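Updated for clarity: I am trying to append the first matching filename value to a csv file. I want to use the first file_label2 match in fname to apply the found value to the Suggested Label line. The information is retrieved from GitHub using github3.py.
The code below does not raise an error, but I don't think it is the correct way to find the first filename match.
Sample output returned from GitHub: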
PR Number: 123
Login: dbs
Files:
files/file-folder/media/figure01
file_label2 = figure01
files/file-folder/jsfile-to-checkin
file_label2 = jsfile
Suggested Label: Value1
PR Number: 567
Login: dba
Files:
files/file-folder/media/figure01
file_label2 = figure01
files/file-folder/csfile-to-checkin
file_label2 = csfile
Suggested Label: Value2
Desired csv output:
PR Number, Login, First File Found, Suggested Label
123,dbs,files/file-folder/jsfile-to-checkin, Value1
567,dba,files/file-folder/csfile-to-checkin, Value2
Lists used to match the fname prefix after the filename has been split:
list1=["jsfile","csfile"]
list2=["css","html"]
Code:
with open(inputFile,'w') as f:
    for prs in repo.pull_requests():
        getlabels = repo.issue(prs.number).as_dict()
        labels = [labels['name'] for labels in getlabels['labels']]
        tags = ["Bug", "Blocked", "Investigate"]
        enterprisetag = [tagsvalue for tagsvalue in labels if tagsvalue in tags]
        found = "No file match"
        if enterprisetag:
            pass
        else:
            f.write("PR Number: %s" %getlabels['number'] + '\n' + "Login: %s" %getlabels['user']['login'] + '\n' + "Files: \n")
            for data in repo.pull_request(prs.number).files():
                fname, extname = os.path.splitext(data.filename)
                f.write(fname+'\n')
                file_label = fname.rsplit('/',1)[-1]
                if file_label.count("-") == 1:
                    file_label2 = file_label.split("-")[0]
                    f.write("file_label2: %s" %file_label2 + '\n')
                else:
                    file_label2 = "-".join(file_label.split("-",2)[:2])
                    f.write("file_label2: %s" %file_label2 + '\n')
                if [emlabel for emlabel in list1 if emlabel in file_label2]:
                    found = "Value1"
                    break
                elif [mk_label for mk_label in list2 if mk_label in file_label2]:
                    found = "Value2"
                    break
                else:
                    found = (str(None))
            f.write("Suggested Label: %s" %found + '\n')
prNum, login, firstFileFound, label = None,None,None,None
multiLineFlag = False
with open(outputFile, 'w') as w:
    w.write("PR Number, Login, First File Found, Suggested Label\n")
    for line in open(inputFile):
        line = line.strip()
        if multiLineFlag and not(firstFileFound):
            if line.startswith('file_label') and any(fileType in line for fileType in enterprise_mobility + marketplace + modern_apps + pnp + tdc + tdc_abr + unlock_insights):
                firstFileFound = prevLine
                multiLineFlag = False
            else:
                prevLine = line
        if not multiLineFlag:
            if line.startswith('PR Number: '):
                prNum = line[len('PR Number: '):]
            elif line.startswith('Login: '):
                login = line[len('Login: '):]
            elif line.startswith('Suggested Label: '):
                label = line[len('Suggested Label: '):]
            elif line.startswith('Files:'):
                multiLineFlag = True
        if all([prNum, login, firstFileFound, label]):
            w.write("%s,%s,%s,%s\n" %(prNum, login, firstFileFound, label))
            prNum, login, firstFileFound, label = None,None,None,None
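Answer 0 (score: 3)
The general idea is to separate the data that spans multiple lines from the data that sits on a single line, and to scan for each attribute individually. Once all of them have been found, you start over with the next record.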
prNum, login, firstFileFound, label = None,None,None,None
multiLineFlag = False  # True while scanning the lines under "Files:"
list1 = ["jsfile","csfile"]
inputFile = '' # Provide your input filename here
outputFile = '' # Provide your output filename here
labelFound = False
with open(outputFile, 'w') as w:
    w.write("PR Number, Login, First File Found, Suggested Label\n")
    for line in open(inputFile):
        line = line.strip()
        if multiLineFlag and not(firstFileFound):
            if line.startswith('file_label') and any(fileType in line for fileType in list1):
                # The file path is the line just before its file_label line
                firstFileFound = prevLine
                multiLineFlag = False
            else:
                prevLine = line
        if not multiLineFlag:
            if line.startswith('PR Number:'):
                prNum = line[len('PR Number: '):]
            elif line.startswith('Login:'):
                login = line[len('Login: '):]
            elif line.startswith('Suggested Label:'):
                # Track the label with a flag so that an empty label still counts
                labelFound = True
                label = line[len('Suggested Label: '):]
                print("label is %s " % label)
            elif line.startswith('Files:'):
                multiLineFlag = True
        if all([prNum, login, firstFileFound, labelFound]):
            w.write("%s,%s,%s,%s\n" %(prNum, login, firstFileFound, label))
            # Reset the state for the next record
            prNum, login, firstFileFound, label = None,None,None,None
            labelFound=False
The following will work as long as certain assumptions about your data hold.
So, given an input file that looks like:
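PR Number: 123
Login: dbs
Files:
files/file-folder/media/figure01
file_label2 = figure01
files/file-folder/jsfile-to-checkin
file_label2 = jsfile
Suggested Label: Value1
PR Number: 423
Login: ddo
Files:
files/file-folder/media/figure01
file_label2 = figure01
files/file-folder/csfile2-to-checkin
file_label2 = csfile
Suggested Label:
PR Number: 567
Login: dba
Files:
files/file-folder/media/figure01
file_label2 = figure01
files/file-folder/csfile-to-checkin
file_label2 = csfile
Suggested Label: Value2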
This will return:
PR Number, Login, First File Found, Suggested Label
123,dbs,files/file-folder/jsfile-to-checkin,Value1
423,ddo,files/file-folder/csfile2-to-checkin,
567,dba,files/file-folder/csfile-to-checkin,Value2
Adjustments may be needed to cover edge conditions.
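One edge condition that appears worth covering is a PR in which none of the files match a prefix: in the loop above, the multi-line scan would then never end for that record. The following is a minimal sketch of one way to handle it, not the answer's original method. It flushes a record whenever the next "PR Number:" line is seen, so a record without a match still produces a row. The function name parse_records, the dictionary keys, and the sample login "nobody" are hypothetical.

```python
MATCH_PREFIXES = ["jsfile", "csfile"]  # same values as list1 above


def parse_records(lines, prefixes=MATCH_PREFIXES):
    """Yield one dict per PR record found in the report lines."""
    record = None
    prev_line = ""
    for raw in lines:
        line = raw.strip()
        if line.startswith("PR Number:"):
            if record:
                yield record  # flush the previous record, matched or not
            record = {"pr": line[len("PR Number:"):].strip(),
                      "login": "", "first_file": "", "label": ""}
        elif record is None:
            continue  # ignore anything before the first record
        elif line.startswith("Login:"):
            record["login"] = line[len("Login:"):].strip()
        elif line.startswith("Suggested Label:"):
            record["label"] = line[len("Suggested Label:"):].strip()
        elif line.startswith("file_label"):
            # Only the first matching file is kept; its path is the previous line
            if not record["first_file"] and any(p in line for p in prefixes):
                record["first_file"] = prev_line
        prev_line = line
    if record:
        yield record  # flush the last record


# Hypothetical sample: the second record has no matching file
SAMPLE = """PR Number: 123
Login: dbs
Files:
files/file-folder/media/figure01
file_label2 = figure01
files/file-folder/jsfile-to-checkin
file_label2 = jsfile
Suggested Label: Value1
PR Number: 999
Login: nobody
Files:
files/file-folder/media/figure01
file_label2 = figure01
Suggested Label: None
"""

rows = list(parse_records(SAMPLE.splitlines()))
```

Each yielded dict can then be passed to csv.writer or csv.DictWriter; whether a record without a matching file should be written at all is a design choice left to the caller.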
Answer 1 (score: 1)
You didn't mention what is wrong with your script. I noticed two possible mistakes in the code you posted:
Inside the loop for data in repo.pull_request(prs.number).files():
    if [emlabel for emlabel in list1 if emlabel in file_label2]:
        found = "Value1"
Here file_label2 should be a string and emlabel is also a string, so I think what you need here is '==':
    if [emlabel for emlabel in list1 if emlabel == file_label2]:
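To make the difference concrete, here is a small sketch comparing the two tests. With two strings, in is a substring test while == requires the whole tag to be equal. The helper names substring_matches and exact_matches are hypothetical; the tag "jsfile-to" is what the question's splitting logic would produce for jsfile-to-checkin, since that name contains two hyphens.

```python
list1 = ["jsfile", "csfile"]


def substring_matches(tag, prefixes):
    """Prefixes that occur anywhere inside tag (the question's 'in' test)."""
    return [p for p in prefixes if p in tag]


def exact_matches(tag, prefixes):
    """Prefixes that are exactly equal to tag (the suggested '==' test)."""
    return [p for p in prefixes if p == tag]


tag = "jsfile-to"  # from "-".join("jsfile-to-checkin".split("-", 2)[:2])
loose = substring_matches(tag, list1)
strict = exact_matches(tag, list1)
```

Under these assumptions the substring test finds jsfile while the equality test finds nothing, so switching to == only helps if file_label2 is first reduced to the bare prefix.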
When you try to append the filename:
    str_to_list = [x.split(" ") for x in fname.split(" ")]
    row.append(str_to_list[0])
Here you will probably end up with a nested list, str_to_list=[['your/file/name']]. Is that what you expect?
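A short check of that expression, assuming fname contains no spaces (the placeholder path and the row values are illustrative only):

```python
fname = "your/file/name"  # placeholder path without spaces

# Splitting twice on " " wraps the single path in two levels of list
str_to_list = [x.split(" ") for x in fname.split(" ")]

row = ["123", "dbs"]
row.append(str_to_list[0])  # appends a list, not a string

# Appending the string itself keeps the row flat
flat_row = ["123", "dbs"]
flat_row.append(fname)
```

If the goal is a single csv field holding the path, appending fname directly is probably what was intended.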
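Another thing you haven't explained in your code is the repo object. Where does it come from? Is it obtained from another script, or does a text file have to be parsed to get it?
Please explain your problem more concisely and clearly so that people can actually help.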
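Answer 2 (score: -1)
I think this does most of what you want. I made some assumptions, such as pull_request(number).files() being the same as pr.files() in the outer loop. I have also removed some computations that did not seem to do anything (for example, the splitting of ...).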
#!python3
import csv
import os.path

# Minimal stand-in for the github3.py objects so the script runs on its own
class C:
    @property
    def number(self):
        return '12345'
    def as_dict(self):
        return {'labels':[{'name':'Foo'}],
                'login':'xyzzy',
               }
    @property
    def filename(self):
        return 'path/to/jsfile-to-checkin.js'
    def files(self):
        return [C()]
    def issue(self, num):
        return C()
    def pull_requests(self):
        return [C()]

repo = C()
INFO = 'info.csv'
INFO_LABELS = 'info-with-labels.csv'
SKIP_TAGS = set(["Bug", "Blocked", "Investigate"])
FILENAME_LABELS = {
    'csfile':'Value1',
    'jsfile':'Value1',
    'css':'Value2',
    'html':'Value2',
}

with open(INFO, 'w+', newline='') as info_file, \
        open(INFO_LABELS, 'w') as info_labels_file:
    info = csv.writer(info_file)
    info_labels = csv.writer(info_labels_file, lineterminator='\n')
    headers = 'PR Number|Login|First file found'
    info.writerow(headers.split('|'))
    label_headers = headers + '|Suggested Labels'
    info_labels.writerow(label_headers.split('|'))
    for pr in repo.pull_requests():
        pr_issue = repo.issue(pr.number).as_dict()
        labels = [labels['name'] for labels in pr_issue['labels']]
        # Skip PRs that carry any of the excluded tags
        if any(tag in SKIP_TAGS for tag in labels):
            continue
        first_file = "No file match"
        use_label = ''
        for pr_file in pr.files():
            filename = pr_file.filename.rsplit('/', 1)[-1]
            basename, ext = os.path.splitext(filename)
            name_parts = basename.split('-')
            if len(name_parts) < 3:
                file_tag = name_parts[0]
            else:
                file_tag = '-'.join(name_parts[0:2])
            for text,label in FILENAME_LABELS.items():
                if text in file_tag:
                    first_file = pr_file.filename
                    use_label = label
                    break
            # Stop at the first file that produced a label
            if use_label:
                break
        row = [pr.number, pr_issue['login'], first_file, use_label]
        info_labels.writerow(row)
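The matching step in the loop above can also be pulled out into small functions, which makes it testable without a GitHub connection. This is a sketch under the same assumptions as the script; the function names file_tag and suggest_label and the sample paths are hypothetical, while the mapping is copied from FILENAME_LABELS.

```python
import os.path

FILENAME_LABELS = {
    'csfile': 'Value1',
    'jsfile': 'Value1',
    'css': 'Value2',
    'html': 'Value2',
}


def file_tag(path):
    """Reduce a path to the tag that is compared against the label keys."""
    filename = path.rsplit('/', 1)[-1]
    basename, _ext = os.path.splitext(filename)
    parts = basename.split('-')
    # Fewer than three parts: keep the first part; otherwise keep the first two
    return parts[0] if len(parts) < 3 else '-'.join(parts[:2])


def suggest_label(paths, labels=FILENAME_LABELS):
    """Return (first matching path, label), or the no-match defaults."""
    for path in paths:
        tag = file_tag(path)
        for text, label in labels.items():
            if text in tag:
                return path, label
    return "No file match", ""


result = suggest_label([
    "files/file-folder/media/figure01.png",
    "files/file-folder/jsfile-to-checkin.js",
])
```

Returning from the inner loop replaces the pair of break statements, so the first matching file ends the search.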