My code has a few problems. I understand one of them, but I don't know how to fix it. The code reads a text file.
Format of the TXT file:
204 jack sparrow
http://testlink.com/test123
123 Doughboy®
http://testlink.com/test346
348 ༺༃ོེċℏυƿᾰċᾰ♭Իᾰ༂ི༻
http://testlink.com/testr55
and so on...
Next, it should write another file with the output, like this:
Output file format:
204 http://testlink.com/test123&u_link=jack_sparrow
123 http://testlink.com/test346&u_link=Doughboy®
348 http://testlink.com/testr55&u_link=༺༃ོེċℏυƿᾰċᾰ♭Իᾰ༂ི༻
and so on...
My output looks like this:
204 jack_sparrow
http://testlink.com/test123&u_link=123_Doughboy®
http://testlink.com/test346&u_link=348_༺༃ོེċℏυƿᾰċᾰ♭Իᾰ༂ི༻
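and so on.
For some reason, when the input file starts on its first line, that line is not processed and does not appear in the result file. When the first line of the input file is left empty, the output file looks as shown above. Moving the data down to the next line in the input file makes no difference to the output file. That is my first problem. The second is that I can't figure out how to split the input line into the number and the name, and then move the number to the front of the line and the name to the end of the link in the output file.
My code looks like this: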
for line in open('test2.txt'):  # reading file
    rec = line.strip()
    rec = rec.replace(" ", "_")  # Need whitespace and brackets removed from the link, so I replaced them with underscores
    rec = rec.replace("(", "_")
    rec = rec.replace(")", "_")
    level = ('1', '2', '3', '4', '5', '6', '7', '8', '9')  # line with number and name always starts with a number
    link = ('h')  # line with link always starts with the letter h, as in http://
    name = (rec[3:])
    if rec.startswith(link):
        f = open("test5.txt","a")
        f.write(rec + "&u_link=")  # writes the link and appends &u_link= to the end of the line; this is where I want to append the name
    if rec.startswith(level) :
        f = open("test5.txt","a")
        f.write(rec + "\n\n")  # this is where I write name and number
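I know the code is far from perfect, but I have only just started my programming adventure, and this is my second attempt at the same task. After my raw_input attempt failed, I decided to use the read/write file approach, because the symbols and fancy fonts in the names can't be handled by the command line on Windows, although they work fine in the Linux console (cmd on Windows uses a different encoding than UTF-8).
Here is the code from my first attempt, which works fine but takes manual input instead of reading from a file: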
print "level?",
level = raw_input() # file should be sorted by this variable
print "link?",
link = raw_input()
print "name?", # Problem with fonts and symbols
name = raw_input()
name = name.replace(" ", "") #This removes spaces from the name, as a URL can't have spaces
ul = "&u_link=" #This has to be appended to the link, followed by the name
el = "\n" #Empty line to separate links in the test.txt file
f = open("test.txt","a")
f.write(el+level+" -- "+link+ul+name+el) #file writing
print level+" -- "+link+ul+name #printing in the console just to see if it works
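I hope this explains what I am trying to do. All help and suggestions are much appreciated. Please forgive any and all mistakes. English is not my first language.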
Answer 0 (score: 0)
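So I noticed that reversing the file with reversed() fixes my problem. For some reason, Python would always read the "link" line first, no matter how the txt file was formatted. After a little research I found another way to accomplish the task. It uses a list of strings and works regardless of the txt file format, meaning it handles both the case where the link is on the line below the data and the case where it is on the line above.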
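One possible explanation for the skipped first line, offered as an assumption because nothing in this post confirms it: if the input file was saved on Windows as UTF-8 with a byte order mark (BOM), the first line begins with an invisible marker rather than a digit, so `startswith(level)` fails for that line only. The sketch below reads the file with the `utf-8-sig` codec, which drops a leading BOM when one is present. `read_lines` is a hypothetical helper name, not part of the original code.

```python
import io


def read_lines(path):
    """Return the stripped, non-empty lines of a UTF-8 text file."""
    # "utf-8-sig" decodes UTF-8 and drops a BOM at the start of the file, if any
    with io.open(path, encoding="utf-8-sig") as f:
        return [line.strip() for line in f if line.strip()]
```

With the BOM removed, the first data line starts with its digit like every other data line, so it is no longer treated differently.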
Here is the code I used to accomplish the task with reversed():
import os
import glob
level = ('1', '2', '3', '4', '5', '6', '7', '8', '9')  # data lines start with a digit
link = ('h')  # link lines start with "h", as in http://
# Read the input bottom-up so that each link is seen before its data line
with open("test2.txt") as src, open("temp.txt", "a") as tmp:
    for line in reversed(src.readlines()):
        rec = line.strip()
        rec = rec.replace("<", "_")
        rec = rec.replace(">", "_")
        rec = rec.replace("&", "n")
        rec = rec.replace(" ", "_")
        rec = rec.replace("(", "_")
        rec = rec.replace(")", "_")
        rec = rec.replace('"', "_")
        rec = rec.replace("'", "_")
        if rec.startswith(link):
            tmp.write(rec + "&u_link=")
        elif rec.startswith(level):
            tmp.write(rec + "\n\n")
# Reverse the temporary file to restore the original record order
with open("temp.txt") as tmp, open("hitlistlinks.txt", "a") as out:
    for line in reversed(tmp.readlines()):
        out.write(line.strip() + "\n")
# Delete the temporary file
files = glob.glob('temp.txt')
for f in files:
    os.remove(f)
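Note that I create a temporary file in the process, which is why I have:
files = glob.glob('temp.txt')
for f in files:
    os.remove(f)
at the end of my code. To make this approach work, I had to import the os and glob modules.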
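Because the temporary file has a fixed name, glob is not strictly needed for the cleanup. Here is a minimal sketch of the same step using only os. `remove_if_exists` is a hypothetical helper name, not something from the code above.

```python
import os


def remove_if_exists(path):
    """Delete the file at path if it exists; return True if it was removed."""
    if os.path.exists(path):
        os.remove(path)
        return True
    return False
```

Calling `remove_if_exists("temp.txt")` at the end of the script would play the role of the glob loop, and it does nothing if the file was never created.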
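I wasn't entirely happy with that solution, so I did some more research. In the end I wrote another piece of code with help from http://www.reddit.com/r/learnprogramming/. I highly recommend the people at learnprogramming on reddit. I got near-instant help and a lot of good advice, so if you are new to programming, it is a great place to look when you are stuck on something. They also have a very active IRC channel, #Learnprogramming, on freenode.
Here is the final code, which is cleaner and more efficient: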
# Open the file
with open("test3.txt", "r") as f:
    # Here we're going to clean up the input file to wipe out
    # any whitespace at the beginning or end of each line
    cleaned_lines = []
    for line in f:
        cleaned_lines.append(line.strip())
# Now we'll recombine it back into a single string of text
# with the lines separated by the \n character
all_text = "\n".join(cleaned_lines)
# Split the text on blank lines. groups should now be a list
# of strings, where each group contains two adjacent lines:
# a link and a line of data
groups = all_text.split("\n\n")
# Now we'll go through each group and break it apart into the
# two separate lines. One of them will start with "http"
# and that one will be our link.
for group in groups:
    line1, line2 = [x for x in group.split("\n") if x]
    if line1.startswith("http"):
        link = line1
        rec = line2
    elif line2.startswith("http"):
        link = line2
        rec = line1
    else:
        # If neither of the two lines starts with "http" we
        # have a group that doesn't have a link.
        # I'll just throw
        # an error and bring the program to a halt.
        raise Exception("This group is missing a link! {0}".format(group))
    # At this point the link variable contains the link, and
    # the rec variable contains the other line. Now we can process the input file as intended
    # and it will work on either file.
    rec = rec.replace("<", "_")
    rec = rec.replace(">", "_")
    rec = rec.replace("&", "n")
    rec = rec.replace(" ", "_")
    rec = rec.replace("(", "_")
    rec = rec.replace(")", "_")
    rec = rec.replace('"', "_")
    rec = rec.replace("'", "_")
    # Append the finished record to the output file
    with open("hitlist.txt", "a") as out:
        out.write(link + "&u_link=" + rec + "\n\n")
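The final code writes the number and the name together after &u_link=. To get the layout asked for in the question (number first, then the link, then the name), the data line can be split at its first space. This is a sketch under the assumption that every data line is a number, a single space, and then the name. `format_record` is a hypothetical helper name.

```python
def format_record(rec, link):
    """Build 'NUMBER LINK&u_link=NAME' from a data line and its link."""
    # Split only at the first space so that names containing spaces stay whole
    number, _, name = rec.strip().partition(" ")
    # A URL can't contain spaces, so replace them in the name part only
    name = name.replace(" ", "_")
    return number + " " + link + "&u_link=" + name
```

For example, `format_record("204 jack sparrow", "http://testlink.com/test123")` returns `204 http://testlink.com/test123&u_link=jack_sparrow`. The other character replacements from the final code could be applied to `name` in the same place.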
I hope this helps others with a similar problem and shows them two different approaches to the same task. For the record, there are more than two.
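As one further sketch of a different approach: assuming the input strictly alternates a data line and then its link line, as in the file format shown in the question, the lines can be consumed two at a time. `pair_lines` is a hypothetical name, and the alternating layout is an assumption that the code does not verify.

```python
def pair_lines(lines):
    """Yield (data, link) tuples from lines that alternate data line, link line."""
    # Drop blank lines first so that an empty first line does not shift the pairs
    cleaned = [line.strip() for line in lines if line.strip()]
    it = iter(cleaned)
    # zip pulls two items per step from the same iterator: data, then link
    for data, link in zip(it, it):
        yield data, link
```

Each pair can then be cleaned up and written out in whatever order the output file needs, without reversing the file or using a temporary file.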