Question

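I'm relatively new to Python. I'm trying to take a standard file format and ultimately break it into smaller files based on a specific identifier that appears on a line.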

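So far I've been able to take the file, open it for reading and writing, and split each line into a list item. Now I'm trying to find the position of every list item that begins with '03'. Everything from one '03' list position to the next will ultimately become a separate file. I'm stuck trying to extract the positions of the list items that contain '03'. I tried using: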

for value in acct_locate:
    if value == '03':
        locations.append(acct_locate.index(value))

This doesn't seem to return anything, and I've also tried other variations with enumerate() and index().

This is the code I'm currently working with:

import re
#need to look for file name
filename = 'examplebai2.txt'

#this list will store all locations where three record shows up
acct_locate = []
locations = []
acct_listing = []

with open(filename, 'r+') as file:
    line = [line.rstrip('\n') for line in file]
    for x in line:
        #locate all instances of locations starting with '03'
        look = re.findall('^03', x)
        acct_locate.append(look)
        #add those instances to a new list
    a = [i for i,x in enumerate(acct_locate) if x == '03']
    for value in a:
        print(value)
        locations.append(acct_locate.index(value))
    for y in line:
        namelist = re.findall('^03, (.*),', y)
        if len(namelist) > 0:
            acct_listing.append(namelist)

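Running the code above returns nothing in the locations list, which I'm using to collect all the positions.

Here is a skeleton of the file I'm trying to process.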

01, Testfile
02, Grouptest
03, 11111111
16
88
49
03, 22222222,
16
88
49
03, 33333333,
16
88
49
03, 44444444,
16
88
49
98, Grouptestclose
99, Testfileclose

From this file, I'd like to end up with four separate files, each containing everything from one 03 record up to the next 03 record.

Answer 1

If you don't need to know the positions of the special characters, you can do this:

    # Sketch of the three statements described in the explanation below.
    # Assumes filename is the variable defined in the question.
    with open(filename, 'r') as file:
        data = file.read().replace('\n', '')
    parts = data.split('03')

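Explanation: the first two statements read the file, remove all newline characters, and put the result into a single string, data. The last statement splits that string wherever the "special character" '03' occurs, returning a list of strings in which each element is the part between two '03's.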
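One caveat with a plain split on '03' is that it also cuts wherever those two characters happen to appear inside a value, and removing the newlines first loses the line boundaries. A hedged variant, assuming Python 3.7 or later (where re.split accepts a pattern that can match an empty string), keeps the newlines and splits only where a line begins with 03. The function name split_at_03_lines is made up for this illustration.

```python
import re

def split_at_03_lines(text):
    """Split text into chunks, starting a new chunk at each line that begins with '03'."""
    # (?m) lets ^ match at every line start; the lookahead (?=03) keeps
    # the '03' itself inside the chunk that follows the split point.
    chunks = re.split(r'(?m)^(?=03)', text)
    # Drop the empty first chunk that appears when the text itself starts with '03'.
    return [c for c in chunks if c]

sample = "01, Testfile\n02, Grouptest\n03, 11111111\n16\n03, 22222222,\n16\n"
for chunk in split_at_03_lines(sample):
    print(repr(chunk))
```

The first chunk holds the header lines before the first 03 record, and each later chunk could be written to its own file.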

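Edit:

Given the sample data above, you could instead loop over the file and collect the lines you read in a buffer. Each time you find a '03', flush the buffer to a new file. Example: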

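    # Sketch only; part1.txt, part2.txt, ... are placeholder names.
    # The first chunk written holds the 01/02 header lines.
    buffer, count = [], 0
    with open(filename) as infile:
        for line in infile:
            if line.startswith('03') and buffer:
                count += 1
                with open('part%d.txt' % count, 'w') as out:
                    out.writelines(buffer)
                buffer = []
            buffer.append(line)
    # After the loop, flush the remaining buffer the same way.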

Answer 2

If you want to "locate all instances of locations starting with '03'", you should check x.startswith("03") rather than x == "03".
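In the question's code, re.findall returns a list, so each entry of acct_locate is either ['03'] or [], never the string '03', which is why the comparison never matches. A minimal sketch of the startswith approach follows: it collects the positions with enumerate() and then slices the list between consecutive positions. The names find_record_starts and split_records are invented for this example, and it assumes lines already holds the file's lines with the newlines stripped, as in the question. The last chunk runs to the end of the list, so the 98/99 trailer lines end up in it and may need trimming.

```python
def find_record_starts(lines, prefix="03"):
    """Return the index of every line that begins with prefix."""
    return [i for i, line in enumerate(lines) if line.startswith(prefix)]

def split_records(lines, prefix="03"):
    """Return one list of lines per record, each starting at a prefix line."""
    starts = find_record_starts(lines, prefix)
    # Pair each start with the next start; the final record ends at len(lines).
    ends = starts[1:] + [len(lines)]
    return [lines[s:e] for s, e in zip(starts, ends)]

lines = ["01, Testfile", "02, Grouptest", "03, 11111111", "16", "88", "49",
         "03, 22222222,", "16", "88", "49", "98, Grouptestclose", "99, Testfileclose"]
print(find_record_starts(lines))   # [2, 6]
for record in split_records(lines):
    print(record)
```

Each list returned by split_records could then be joined with newlines and written to its own output file.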

Extracting positions from a list based on a value in the item
