Question

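I have a list of product sales (and their costs) that has, frustratingly, been merged into a single string separated by commas. I ultimately need to split each product onto its own row, which would be easy with stringr::str_split.

However, the cost attached to each product uses commas as thousands separators, e.g. 1,000.00 or 38,647.89. As a result, str_split also breaks on the commas inside the costs, so the products are split incorrectly.

I would like to know the best tidyverse solution for removing every comma that is surrounded by digits, so that 1,000.00 becomes 1000.00 and 38,647.89 becomes 38647.89. Once those commas are gone, I can use str_split on the remaining commas to put each unique product on its own row.

Here is a dummy dataset: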

df<-data.frame(id = c(1, 2), product = c("1 Car at $38,678.49, 1 Truck at $78,468.00, 1 Motorbike at $5,634.78", "1 Car at $38,678.49, 1 Truck at $78,468.00, 1 Motorbike at $5,634.78"))

df

  id                                                              product
1  1 1 Car at $38,678.49, 1 Truck at $78,468.00, 1 Motorbike at $5,634.78
2  2 1 Car at $38,678.49, 1 Truck at $78,468.00, 1 Motorbike at $5,634.78

Expected result:

  id                                                              product
1  1 1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78
2  2 1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78

Answer 1

Capture the digit on either side of each comma and write both digits back without the comma:

library(dplyr)
library(stringr)

df %>%
  mutate(product = str_replace_all(product, "([0-9]),([0-9])", "\\1\\2"))

Answer 2

> apply(df,1,function(x){gsub(",([0-9])","\\1",x[2])})

[1] "1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78"
[2] "1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78"

Answer 3

One way to do this in base R:

sapply(strsplit(as.character(df$product), ' '), function(i)paste(sub(',', '', i), collapse = ' '))
#[1] "1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78" "1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78"
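
Because sub replaces only the first comma in each space-separated token, the approach above depends on every price containing exactly one thousands separator. As a hedged alternative sketch (not part of the original answer), base R's gsub with perl = TRUE can remove only the commas that sit between two digits. The second input string is a hypothetical extra case added purely for illustration:

```r
# First string is the dummy data from the question; the second is a
# hypothetical case with two thousands separators and a price with none.
x <- c(
  "1 Car at $38,678.49, 1 Truck at $78,468.00, 1 Motorbike at $5,634.78",
  "1 Yacht at $1,234,567.00, 1 Bike at $500.00"
)

# perl = TRUE enables lookbehind/lookahead, so only the comma itself is
# matched, and only when a digit stands on both sides of it.
cleaned <- gsub("(?<=[0-9]),(?=[0-9])", "", x, perl = TRUE)
cleaned
```

The commas that separate products are followed by a space rather than a digit, so they are left in place.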

Answer 4

library(tidyverse)
df$product <- str_replace_all(df$product, "(?<=\\d),(?=\\d)", "")
df
  id                                                           product
1  1 1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78
2  2 1 Car at $38678.49, 1 Truck at $78468.00, 1 Motorbike at $5634.78
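
To connect this back to the goal stated in the question (one product per row), the comma removal can be followed by a split on the remaining commas. The following is only a sketch under assumptions: it uses tidyr::separate_rows rather than the str_split the asker mentioned, and the name long is made up for illustration.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Dummy data from the question
df <- data.frame(
  id = c(1, 2),
  product = c(
    "1 Car at $38,678.49, 1 Truck at $78,468.00, 1 Motorbike at $5,634.78",
    "1 Car at $38,678.49, 1 Truck at $78,468.00, 1 Motorbike at $5,634.78"
  )
)

long <- df %>%
  # Step 1: drop commas that have a digit on both sides
  mutate(product = str_replace_all(product, "(?<=\\d),(?=\\d)", "")) %>%
  # Step 2: split on the remaining commas (plus any following whitespace),
  # giving one row per product while repeating the id
  separate_rows(product, sep = ",\\s*")

long
```

With the dummy data this should give six rows, three per id.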
