Question

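I have a text file. The left side shows the current land use and the right side shows the historical land use. The two columns are separated by a pipe character (|). It looks like this: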

\n

I have written this script, but for landuse 2, 3 and 6 I get an empty value. How can I fill those gaps with landuse x / landuse z?

Answer 1

You can do the following:

for line in Textfile:
    try:
        (key, value) = line.split("|")
    except ValueError:  # split() did not result in two items.
        continue        # Among other things, this skips the ---- delimiter lines
    key = key.strip()
    value = value.strip()
    if value:  # string is not empty after stripping
        d[key] = value
        prev_value = value  # save for next line if needed
    else:
        d[key] = prev_value  # assign last seen value as there isn't any new one

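Note that this example is very basic and leaves some cases unhandled. For example, if the first entry has no value in the second column, it fails with a NameError (you could set prev_value before entering the loop, but what would the correct value be? In that case failing may well be the right behaviour). You may actually want to reset prev_value when a delimiter line is hit (the sample input suggests this). Apart from splitting on | into two parts, we perform no checks on the input.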
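One way to address those points, as a rough sketch rather than a definitive fix: initialise prev_value before the loop and reset it whenever a ---- delimiter line is seen. The function name fill_from_previous, the sample lines, and the choice of None for "no previous value in this block" are assumptions made for illustration; they are not part of the original script.

```python
def fill_from_previous(lines):
    """Map left-hand entries to right-hand entries, reusing the last value within a block."""
    d = {}
    prev_value = None  # assumption: None marks "no previous value in this block"
    for line in lines:
        if line.strip().startswith("-"):
            prev_value = None  # a delimiter line starts a new block
            continue
        try:
            key, value = line.split("|")
        except ValueError:  # line does not split into exactly two items
            continue
        key, value = key.strip(), value.strip()
        if value:
            prev_value = value  # remember the latest non-empty value
        d[key] = value or prev_value
    return d


# Hypothetical sample input, shortened from the layout described in the question
sample = [
    "landuse 1    |landuse x\n",
    "landuse 2    |\n",
    "-----------------------\n",
    "landuse 4    |landuse y\n",
    "-----------------------\n",
    "landuse 6    |\n",
]
print(fill_from_previous(sample))
```

With this sample, landuse 2 inherits landuse x, while landuse 6 ends up as None because nothing precedes it in its own block. Whether that or an exception is the right outcome depends on what the data is supposed to mean.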

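As written, your script should actually raise an IndexError at b=x[1] when it reaches a ---- delimiter line, because splitting that line on | produces a single-item list.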
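A quick check of that behaviour, using a made-up delimiter line: splitting a string that contains no | yields a single-item list, so index 1 does not exist.

```python
delimiter_line = "-----------------------\n"  # hypothetical delimiter line
x = delimiter_line.split("|")
print(len(x))  # only one item, because there is no "|" to split on

try:
    b = x[1]
except IndexError as exc:
    print("IndexError:", exc)
```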

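Also, when using \ in file names, make sure to use a raw string literal such as r"g:\somefile.txt" to avoid surprises (or just use forward slashes: Windows itself knows how to handle them, although the occasional misbehaving application may not).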
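To see why this matters, here is a small illustration with a made-up path. In an ordinary string literal, \n and \t are interpreted as newline and tab, which silently changes the file name; the raw literal and the forward-slash form keep every character.

```python
plain = "g:\new_folder\table.txt"    # hypothetical path: \n and \t become escapes
raw = r"g:\new_folder\table.txt"     # raw literal keeps both backslashes
forward = "g:/new_folder/table.txt"  # forward slashes need no escaping

print("\n" in plain, "\t" in plain)  # the plain literal contains a newline and a tab
print("\\" in plain, "\\" in raw)    # only the raw literal still has backslashes
print(raw)
```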

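Replace if value: with if value != r'\n': in case your input actually contains the literal two-character string \n (a backslash followed by n), rather than just whitespace followed by a newline, on the lines that should take the previous value.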
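Note that such a comparison has to be made against the two-character string (a backslash followed by n), written r"\n" or "\\n" in Python; a stripped value can never equal the one-character newline "\n". A small check with a made-up input line:

```python
# Hypothetical line: a literal backslash-n in the second column, then a real newline
line = "landuse 2    |\\n\n"
key, value = line.split("|")
value = value.strip()  # removes the real newline, keeps the literal backslash-n

print(value == r"\n")  # compares against backslash + n
print(value == "\n")   # compares against a single newline character
print(len(r"\n"), len("\n"))
```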

Answer 2

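To do this you need a variable* outside the scope of the for loop, so that it can retain information from previous iterations. Here we add a variable previous_landuse, which is updated with the most recent right-hand land use. When a line has no right-hand side, that variable is used to fill the gap, because it holds the last value seen in that column.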

Textfile=open(r"G:\....txt","r")
d={}
previous_landuse = ''
for line in Textfile:
    x=line.split("|")

    #ignore the -------- line
    if len(x) < 2:
        continue

    key = x[0].strip()
    value = x[1].strip()

    if value == '':
        value = previous_landuse
    else:
        previous_landuse = value

    d[key] = value

print(d)

Output:
{'landuse 1': 'landuse x', 'landuse 2': 'landuse x', 'landuse 3': 'landuse x', 'landuse 4': 'landuse y', 'landuse 5': 'landuse z', 'landuse 6': 'landuse z'}

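*Technically, you do not need to define it outside the loop, but doing so is good practice, because some languages are much stricter about for-loop scoping.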
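To illustrate the footnote: in Python, a name first assigned inside a for loop remains visible in later iterations and after the loop ends. The code would therefore also run without the initial assignment, provided the first row has a value. The names below are illustrative only.

```python
for item in ["a", "", "b"]:
    if item:
        last_seen = item  # first assigned inside the loop body
    print(repr(item), "->", last_seen)

print(last_seen)  # still accessible after the loop has finished
```

If the first item were empty, last_seen would not exist yet and the print inside the loop would raise a NameError, which is one more reason to initialise it up front.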

Answer 3

This seems simple with an if condition. Something like this:

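tmp = ""  # last right-hand value seen so far
for line in Textfile:
    x = line.split("|")
    if len(x) < 2:  # skip the ---- delimiter lines, which contain no "|"
        continue
    a = x[0]
    b = x[1]
    if r"\n" not in b:  # a literal backslash-n marks an empty value
        tmp = b
    c = tmp.strip("\n")
    e = a.strip()
    f = e.strip("-")
    g = c.strip("-")
    d[f] = g
print(d)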

Answer 4

An option using pandas. I will assume your text file contains exactly this

landuse 1    |landuse x
landuse 2    |\n
landuse 3    |\n
-----------------------
landuse 4    |landuse y
-----------------------
landuse 5    |landuse z
landuse 6    |\n

including the \n and the -----

import pandas as pd

df = pd.read_csv('my_data.csv',
                 header=None,
                 sep='|')
df.columns = ['id','value']

# Get rid of the `-------`
df = df.dropna()

# Replace the literal '\n' with missing values
df.loc[:,'value'] = df.loc[:,'value'].replace({r'\n':None})

# Now just forward fill
df = df.ffill()

The final content of df is:

              id      value
0  landuse 1      landuse x
1  landuse 2      landuse x
2  landuse 3      landuse x
4  landuse 4      landuse y
6  landuse 5      landuse z
7  landuse 6      landuse z

How to replace empty dictionary entries with the last entry
