Question

I am still quite new to this and I am stuck on the following situation.

I have a CSV file that I read with code like the following:

import csv

csvpath = "C:/Test/test.csv"

with open(csvpath, newline="") as f:
    # Use a separate name so the csv module is not shadowed.
    reader = csv.DictReader(f)
    for row in reader:
        print(row)

The output is:

{'NAME': 'John', 'NICKNAME': 'Big John', 'COUNTRY': 'Canada', 'CITY': 'Toronto'}
{'NAME': 'David', 'NICKNAME': 'Small Jogn', 'COUNTRY': 'Canada', 'CITY': 'Toronto'}
{'NAME': 'Alan', 'NICKNAME': 'The Bull', 'COUNTRY': 'England', 'CITY': 'London'}
{'NAME': 'Ethan', 'NICKNAME': 'The Hawk', 'COUNTRY': 'England', 'CITY': 'London'}
{'NAME': 'Ivan', 'NICKNAME': 'The Russian', 'COUNTRY': 'Russia', 'CITY': 'Moscow'}
{'NAME': 'Boris', 'NICKNAME': 'The Bear', 'COUNTRY': 'Russia', 'CITY': 'Moscow'}

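Is it possible to first print only the rows where the country is Canada? Then I would like another loop to print the rows for England, and another to print the rows for Russia. However, the countries are edited all the time, so they will not stay the same, and on any given day the list may contain different countries and a different number of countries. So basically I need to print the rows that share the same country separately, in different for loops.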

Answer 1

The code below groups the data by country (based on the data structure in the question):

from collections import defaultdict

data = [{'NAME': 'John', 'NICKNAME': 'Big John', 'COUNTRY': 'Canada', 'CITY': 'Toronto'},
        {'NAME': 'David', 'NICKNAME': 'Small Jogn', 'COUNTRY': 'Canada', 'CITY': 'Toronto'},
        {'NAME': 'Alan', 'NICKNAME': 'The Bull', 'COUNTRY': 'England', 'CITY': 'London'},
        {'NAME': 'Ethan', 'NICKNAME': 'The Hawk', 'COUNTRY': 'England', 'CITY': 'London'},
        {'NAME': 'Ivan', 'NICKNAME': 'The Russian', 'COUNTRY': 'Russia', 'CITY': 'Moscow'},
        {'NAME': 'Boris', 'NICKNAME': 'The Bear', 'COUNTRY': 'Russia', 'CITY': 'Moscow'}]

data_by_country = defaultdict(list)
for entry in data:
    data_by_country[entry['COUNTRY']].append(entry)

for country, info_lst in data_by_country.items():
    print(country)
    for info in info_lst:
        print(f'\t {info}')
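
To tie this back to the CSV file in the question, the same grouping can be applied directly to the rows coming from csv.DictReader. The following is only a minimal sketch, assuming the file has the NAME, NICKNAME, COUNTRY and CITY columns shown in the question; the helper name group_rows_by_country and the in-memory sample text are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for C:/Test/test.csv from the question.
SAMPLE_CSV = """NAME,NICKNAME,COUNTRY,CITY
John,Big John,Canada,Toronto
Alan,The Bull,England,London
David,Small John,Canada,Toronto
Ivan,The Russian,Russia,Moscow
"""


def group_rows_by_country(file_obj):
    """Return a dict mapping each COUNTRY value to the list of its rows."""
    rows_by_country = defaultdict(list)
    for row in csv.DictReader(file_obj):
        rows_by_country[row["COUNTRY"]].append(row)
    return rows_by_country


# With a real file, pass open(csvpath, newline="") instead of io.StringIO.
grouped = group_rows_by_country(io.StringIO(SAMPLE_CSV))
for country, rows in grouped.items():
    print(country)
    for row in rows:
        print(f"\t{row['NAME']} ({row['NICKNAME']})")
```

Because the groups are built from whatever countries appear in the file, no country names need to be hard-coded, and rows for the same country do not have to be adjacent.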

Answer 2

When the rows are all held together in one array:

def print_countries(array, countryname):
    print("results for: ", countryname)
    for item in array:
        if item['COUNTRY'] == countryname:
            print(item)

Make it a function. If you want to print every country in the CSV file, I would build an array containing each country name.

def get_countrynames(array):
    countryname_list = []
    for row in array:
        if row['COUNTRY'] not in countryname_list:
            countryname_list.append(row['COUNTRY'])
    return countryname_list

with open(csvpath, newline="") as f:
    # DictReader can be iterated only once, so store the rows in a list first.
    rows = list(csv.DictReader(f))

names = get_countrynames(rows)
for country in names:
    print_countries(rows, country)

Hope this works; I have not tested it myself.
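
One thing to keep in mind with this approach: a csv.DictReader is an iterator and can be read through only once, so the rows need to be stored in a list before they are looped over once per country. A minimal sketch of that behaviour, using an in-memory io.StringIO in place of the real file (an assumption made purely for illustration):

```python
import csv
import io

# Hypothetical in-memory stand-in for the CSV file.
sample = io.StringIO("NAME,COUNTRY\nJohn,Canada\nAlan,England\n")

reader = csv.DictReader(sample)
first_pass = [row["COUNTRY"] for row in reader]
second_pass = [row["COUNTRY"] for row in reader]  # reader is already exhausted

print(first_pass)   # the country of every row
print(second_pass)  # an empty list

# Storing the rows in a list allows any number of passes.
sample.seek(0)
rows = list(csv.DictReader(sample))
for country in dict.fromkeys(row["COUNTRY"] for row in rows):
    matches = [row["NAME"] for row in rows if row["COUNTRY"] == country]
    print(country, matches)
```

dict.fromkeys is used here only to collect the distinct country names in their order of first appearance.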

Answer 3

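Selecting items from a collection according to some criterion is called filtering.

Python provides the built-in filter function for this, or you can use a list comprehension or a generator expression to get the same result.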

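List comprehensions are often faster than filter and are considered more "pythonic", but I find that filter sometimes makes the code easier to understand. Generator expressions are syntactically similar to list comprehensions, but use less memory (and can only be looped over once).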

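The following function shows how to use all three options. It accepts an open file and a list of country names (as strings) as arguments, and prints the rows found for each country.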
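
A minimal sketch of a function along these lines, not necessarily the answerer's exact code; the name print_rows_by_country and the in-memory sample data are assumptions made for illustration:

```python
import csv
import io


def print_rows_by_country(file_obj, countries):
    """Print the rows for each country, using three equivalent filters."""
    rows = list(csv.DictReader(file_obj))  # keep rows so they can be re-scanned

    for country in countries:
        print(f"--- {country} ---")

        # Option 1: the built-in filter() with a predicate function.
        for row in filter(lambda r: r["COUNTRY"] == country, rows):
            print("filter:", row["NAME"])

        # Option 2: a list comprehension builds the full list of matches.
        matches = [r for r in rows if r["COUNTRY"] == country]
        for row in matches:
            print("list comprehension:", row["NAME"])

        # Option 3: a generator expression yields matches lazily, one pass only.
        lazy_matches = (r for r in rows if r["COUNTRY"] == country)
        for row in lazy_matches:
            print("generator expression:", row["NAME"])


# Hypothetical in-memory sample; with a real file, pass the object from open().
sample = io.StringIO(
    "NAME,NICKNAME,COUNTRY,CITY\n"
    "John,Big John,Canada,Toronto\n"
    "Alan,The Bull,England,London\n"
    "Boris,The Bear,Russia,Moscow\n"
)
print_rows_by_country(sample, ["Canada", "Russia"])
```

In practice only one of the three options would be used; they are shown side by side here to make the comparison concrete.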

