python - Comparing a list with a csv

Time: 2018-06-26 12:38:55

Tags: python

I have this .csv:

col1,col2,col3,col4,col5
247,19,1.0,2016-01-01 14:11:21,MP
247,3,1.0,2016-01-01 14:23:43,MP
247,12,1.0,2016-01-01 15:32:16,MP
402,3,1.0,2016-01-01 12:11:15,?
583,12,1.0,2016-01-01 02:33:57,?
769,16,1.0,2016-01-01 03:12:24,?
769,4,1.0,2016-01-01 03:22:29,?
.....

For each unique value in col1, I need to collect the corresponding col2 values and write them to a new .csv, like this:

expected output:
19,3,12
3
12
16,4
...

That is, I want to keep writing numbers on the same line until the col1 value changes, at which point I start a new line and continue writing numbers.

I read the .csv (as df) and removed the duplicates from the list (giving list2).

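Now things are getting harder for me. I am new to Python. My idea is to compare each element of list2 with each row of df and write the col2 elements to a new .csv. Could you help me?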

3 answers:

Answer 0: (score: 3)

  

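Perhaps you could try the following in Python 3: do not store the entire output in a list or any other data structure, since that can cause memory problems. Write to the output file as you read and aggregate. The reading should also be optimized to use an iterator, rather than loading the whole input file at once.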
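The streaming idea described above can be sketched roughly as follows. This is not the answerer's original code; it is a minimal illustration that assumes rows with the same col1 value are adjacent in the input, as they are in the question's sample. The function name write_grouped_col2 is made up for this sketch, and in-memory io.StringIO objects stand in for real files.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def write_grouped_col2(src, dst):
    """Write one comma-joined line of col2 values per run of equal col1 values."""
    reader = csv.DictReader(src)  # lazy: yields one row at a time
    # groupby only merges *adjacent* rows with the same key (assumption: input is grouped by col1)
    for _, rows in groupby(reader, key=itemgetter("col1")):
        dst.write(",".join(row["col2"] for row in rows) + "\n")


# Sample rows from the question; real code would pass open file objects instead.
sample = io.StringIO(
    "col1,col2,col3,col4,col5\n"
    "247,19,1.0,2016-01-01 14:11:21,MP\n"
    "247,3,1.0,2016-01-01 14:23:43,MP\n"
    "247,12,1.0,2016-01-01 15:32:16,MP\n"
    "402,3,1.0,2016-01-01 12:11:15,?\n"
    "583,12,1.0,2016-01-01 02:33:57,?\n"
    "769,16,1.0,2016-01-01 03:12:24,?\n"
    "769,4,1.0,2016-01-01 03:22:29,?\n"
)
out = io.StringIO()
write_grouped_col2(sample, out)
print(out.getvalue(), end="")
```

Only one group of rows is held in memory at a time, so the approach should scale to large inputs as long as the adjacency assumption holds.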

Answer 1: (score: 1)

You can do this by grouping the data and then applying the set function as the aggregation.

df.groupby('col1')['col2'].apply(set).apply(list)

The apply(set) call builds, for each col1 value, a set of all the distinct col2 values in that group, and apply(list) then converts each set into a list.
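One caveat: a set does not guarantee element order and drops repeated values, while the expected output in the question lists col2 values in file order. A possible variant, sketched here under the assumption that pandas is available and that order and duplicates should be kept, aggregates with list directly and passes sort=False so groups appear in order of first occurrence. The names csv_text and lines are only illustrative.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch is self-contained.
csv_text = (
    "col1,col2,col3,col4,col5\n"
    "247,19,1.0,2016-01-01 14:11:21,MP\n"
    "247,3,1.0,2016-01-01 14:23:43,MP\n"
    "247,12,1.0,2016-01-01 15:32:16,MP\n"
    "402,3,1.0,2016-01-01 12:11:15,?\n"
    "583,12,1.0,2016-01-01 02:33:57,?\n"
    "769,16,1.0,2016-01-01 03:12:24,?\n"
    "769,4,1.0,2016-01-01 03:22:29,?\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# list keeps row order and duplicates; sort=False keeps groups in order of first appearance
grouped = df.groupby("col1", sort=False)["col2"].apply(list)

# One output line per col1 group
lines = [",".join(str(v) for v in values) for values in grouped]
print("\n".join(lines))
```

If duplicates within a group should be dropped while keeping order, aggregating with lambda s: list(dict.fromkeys(s)) instead of list is one option.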

Answer 2: (score: 0)

You need to keep track of duplicates. The simplest approach is easy to understand, though less efficient.

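As a rough illustration of tracking col1 values that have already been seen (a sketch only, not the answerer's original code), a dictionary can map each col1 value to the list of col2 values collected so far. The helper name collect_col2_by_col1 is hypothetical, and io.StringIO stands in for a real file.

```python
import csv
import io


def collect_col2_by_col1(src):
    """Return a dict mapping each col1 value to its col2 values, in file order."""
    groups = {}  # dicts preserve insertion order, so groups stay in first-seen order
    for row in csv.DictReader(src):
        # A col1 value seen before appends to its existing list; a new one starts a list
        groups.setdefault(row["col1"], []).append(row["col2"])
    return groups


# Sample rows from the question; real code would pass an open file object instead.
sample = io.StringIO(
    "col1,col2,col3,col4,col5\n"
    "247,19,1.0,2016-01-01 14:11:21,MP\n"
    "247,3,1.0,2016-01-01 14:23:43,MP\n"
    "247,12,1.0,2016-01-01 15:32:16,MP\n"
    "402,3,1.0,2016-01-01 12:11:15,?\n"
    "583,12,1.0,2016-01-01 02:33:57,?\n"
    "769,16,1.0,2016-01-01 03:12:24,?\n"
    "769,4,1.0,2016-01-01 03:22:29,?\n"
)
groups = collect_col2_by_col1(sample)
for values in groups.values():
    print(",".join(values))
```

Unlike a streaming approach, this keeps every group in memory, but it also works when rows with the same col1 value are not adjacent in the file.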