Question

My code has a few problems. I understand one of them but don't know how to fix it. The code is meant to read a text file.

Format of the TXT file:

204 jack sparrow

http://testlink.com/test123

123 Doughboy®

http://testlink.com/test346

348 ༺༃ོེċℏυƿᾰċᾰ♭Իᾰ༂ི༻

http://testlink.com/testr55

And so on...

Next, it should write another file with the output shown below:

Output file format:

204 http://testlink.com/test123&u_link=jack_sparrow

123 http://testlink.com/test346&u_link=Doughboy®

348 http://testlink.com/testr55&u_link=༺༃ོེċℏυƿᾰċᾰ♭Իᾰ༂ི༻

And so on...

My output looks like this:

204 jack_sparrow

http://testlink.com/test123&u_link=123_Doughboy®

http://testlink.com/test346&u_link=348_༺༃ོེċℏυƿᾰċᾰ♭Իᾰ༂ི༻

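And so on.

For some reason, when the input file's content starts on the first line, that line is not processed and does not appear in the result file. When the first line of the input file is left empty, the output file looks as shown above. Moving the content down another line in the input file makes no difference to the output file. That is my first problem. The second is that I cannot figure out how to split the input line into number and name, and then move the number to the front of the line and the name to the end of the link in the output file.

My code looks like this: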

for line in open('test2.txt'): #reading file
    rec = line.strip()
    rec = rec.replace(" ", "_")    #Need whitespaces and brackets removed from link so i replaced them with low line
    rec = rec.replace("(", "_")
    rec = rec.replace(")", "_")
    level = ('1', '2', '3', '4', '5', '6', '7', '8', '9')  #line with number and name always starts with number
    link = ('h')            #line with link always starts with letter h as in http://
    name = (rec[3:])
    if rec.startswith(link):
        f = open("test5.txt","a")
        f.write(rec + "&u_link=")      #writes link and appends &u_link= to the end of the line; this is where i want to append the name
    if rec.startswith(level) :
        f = open("test5.txt","a")
        f.write(rec + "\n\n")      # this is where i write name and number

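I know the code is far from perfect, but I have only just started my programming adventure, and this is my second attempt at the same task. After my raw_input attempt failed, I decided to use the read/write file approach, because the symbols and fancy fonts in the names cannot be handled by the command line in Windows, although they work fine in a Linux console (cmd in Windows uses a different encoding than UTF-8).

This is my first attempt, which works fine but takes its input manually instead of from a file: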

print "level?",      
level = raw_input()     # file should be sorted by this variable
print "link?",
link = raw_input()     
print "name?",         # Problem with fonts and symbols
name = raw_input()
name = name.replace(" ", "")  # This removes spaces from the name, as a URL can't have spaces
ul = "&u_link="        # This has to be appended to the link, followed by the name
el = "\n"              #Empty line to separate links in test.txt file
f = open("test.txt","a") 
f.write(el+level+" -- "+link+ul+name+el)   #file writing 
print level+" -- "+link+ul+name            #printing in the console just to see if works

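I hope this explains what I am trying to do. All help and suggestions are greatly appreciated. Please forgive any and all mistakes; English is not my first language.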

Answer 1

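So I noticed that reversing the file with reversed() fixes my problem. For some reason, Python always reads the "link" line first, regardless of the txt file format. After a bit of research I found another way to accomplish the task, which uses a list of strings and works regardless of the txt file format, meaning it works whether the link is on the line below the data line or on the line above it.

Here is the code I used to complete the task with reversed():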
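The thread never pins down why the first line was skipped. One plausible cause, which is an assumption and not something confirmed here, is a UTF-8 byte order mark (BOM) that some Windows editors write at the start of a file: the first line then begins with an invisible character, so a `startswith` check against digits fails. A minimal sketch of reading with the `utf-8-sig` codec, which drops a leading BOM if one is present (the helper names `read_lines` and `demo` and the sample file are hypothetical):

```python
import io
import os
import tempfile


def read_lines(path):
    """Return stripped lines, decoding as UTF-8 and dropping a leading BOM if any."""
    # "utf-8-sig" behaves like "utf-8" but skips a BOM at the start of the file.
    with io.open(path, encoding="utf-8-sig") as handle:
        return [line.strip() for line in handle]


def demo():
    # Build a small sample file that starts with a BOM (hypothetical data).
    path = os.path.join(tempfile.mkdtemp(), "sample.txt")
    with io.open(path, "wb") as handle:
        handle.write(b"\xef\xbb\xbf204 jack sparrow\nhttp://testlink.com/test123\n")

    # Plain "utf-8" keeps the BOM, so the first character is not a digit.
    with io.open(path, encoding="utf-8") as handle:
        raw_first = handle.readline()

    # Compare: does the first line start with a digit under each codec?
    return raw_first[0].isdigit(), read_lines(path)[0][0].isdigit()
```

If the input file has no BOM, `utf-8-sig` reads it exactly like `utf-8`, so the change is harmless to try.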

import os
import glob

for line in reversed(open("test2.txt").readlines()):
    rec = line.strip()
    rec = rec.replace("<", "_")
    rec = rec.replace(">", "_")
    rec = rec.replace("&", "n")
    rec = rec.replace(" ", "_")
    rec = rec.replace("(", "_")
    rec = rec.replace(")", "_") 
    rec = rec.replace('"', "_")
    rec = rec.replace("'", "_")
    level = ('1', '2', '3', '4', '5', '6', '7', '8', '9')
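    link = 'h'
    if rec.startswith(link):
        # Link line: write it without a newline so the name lands on the same line
        f = open("temp.txt", "a")
        f.write(rec + "&u_link=")
        f.close()
    elif rec.startswith(level):
        # "number name" line: completes the link line written just before it
        f = open("temp.txt", "a")
        f.write(rec + "\n\n")
        f.close()
# Reverse the temporary file so the entries come out in the original order
for line in reversed(open("temp.txt").readlines()):
    lines = line.strip()
    f = open("hitlistlinks.txt", "a")
    f.write(lines + "\n")
    f.close()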

files = glob.glob('temp.txt')
for f in files:
    os.remove(f)

Note that I create a temporary file in the process, which I delete with:

files = glob.glob('temp.txt')
for f in files:
    os.remove(f)

at the end of my code. For this to work, I had to import the os and glob modules.
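As a side note, the same reversal idea can be sketched without a temporary file by collecting the pieces in a list. This is an illustrative variant, not the code used in this answer; the function name `links_via_reversal` is hypothetical, only spaces are replaced, and any leading digit (including 0) is treated as the start of a data line.

```python
def links_via_reversal(lines):
    """Join each link with the data line above it, using the reversal trick in memory."""
    pieces = []
    for line in reversed(lines):
        rec = line.strip().replace(" ", "_")
        if rec.startswith("h"):
            # Link line comes first after reversal: no newline yet.
            pieces.append(rec + "&u_link=")
        elif rec[:1].isdigit():
            # Data line completes the entry started by the link line.
            pieces.append(rec + "\n")
    # Entries were built back to front, so reverse them into the original order.
    combined = "".join(pieces)
    return list(reversed(combined.splitlines()))
```

Because nothing is written to disk until the final result is ready, there is no temporary file to clean up and no need for `os` or `glob`.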

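I was not completely satisfied with that solution, so I did some more research. In the end I wrote another piece of code with help from http://www.reddit.com/r/learnprogramming/. I strongly recommend the people at Learnprogramming @reddit: I got almost instant help and a lot of good advice, so if you are new to programming, it is a great place to look when you are stuck on something. They also have a very active IRC channel, #Learnprogramming, on freenode.

Here is the final code, which is cleaner and more efficient: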

# Open the file
with open("test3.txt", "r") as f:

# Here we're going to clean up the input file to wipe out
# any whitespace at the beginning or end of each line
    cleaned_lines = []
    for line in f:
        cleaned_lines.append(line.strip())

# Now we'll recombine it back into a single string of text 
# with the lines separated by the \n character
    all_text = "\n".join(cleaned_lines)

# Split the text on blank lines.  Groups should now be a list
# of strings, where each group contains two adjacent lines 
# that contain a link and a strip of data
    groups = all_text.split("\n\n")

# Now we'll go through each group and break it apart into the
# two separate lines.  One of them will start with an "http" 
# and that one will be our link.

    for group in groups:

        line1, line2 = [x for x in group.split("\n") if x]
        if line1.startswith("http"):
            link = line1
            rec = line2 
        elif line2.startswith("http"):
            link = line2
            rec = line1
        else:
            # If neither of the two lines starts with "http", we
            # have a group that doesn't have a link.
            # I'll just raise an error and bring the program to a halt.
            raise Exception("This group is missing a link! {}".format(group))

        # At this point the link variable contains the link, and
        # the rec variable contains the other line. Now we can process
        # the input as intended, and it will work on either file layout.
        rec = rec.replace("<", "_")
        rec = rec.replace(">", "_")
        rec = rec.replace("&", "n")
        rec = rec.replace(" ", "_")
        rec = rec.replace("(", "_")
        rec = rec.replace(")", "_") 
        rec = rec.replace('"', "_")
        rec = rec.replace("'", "_")
        # Use a separate name so the input handle f from the with block is not shadowed
        out = open("hitlist.txt", "a")
        out.write(link + "&u_link=" + rec + "\n\n")
        out.close()

I hope this helps others with a similar problem and shows them two different approaches to the same problem. FYI, there are more than two.
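The question also asked how to split the number from the name so the number ends up in front of the link. A minimal sketch of one way to do that, assuming every entry is a "number name" line paired with one link line and that blank lines can be ignored (the function name `format_records` is hypothetical, and only spaces are replaced here):

```python
def format_records(lines):
    """Pair each 'number name' line with its link and build '<number> <link>&u_link=<name>'."""
    # Drop blank lines so entries may or may not be separated by them.
    cleaned = [line.strip() for line in lines if line.strip()]
    results = []
    # Walk the cleaned lines two at a time.
    for data, link in zip(cleaned[0::2], cleaned[1::2]):
        if data.startswith("http"):
            # The link came first in this pair, so swap the two.
            data, link = link, data
        # Split only on the first run of whitespace: number, then the rest as the name.
        number, name = data.split(None, 1)
        results.append("{} {}&u_link={}".format(number, link, name.replace(" ", "_")))
    return results
```

A data line that has a number but no name would make the two-way unpacking fail with a ValueError, which at least surfaces malformed input early.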

Read/write file: writing lines in a specific order - Python
