Splitting multiple value types in the same column

Time: 2017-07-31 13:59:33

Tags: r

I have a data frame that looks like this:


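ID <- c(1, 1, 1, 2, 2, 2, 2, 3, 3)
Detail <- c('Name', 'Value', 'Value', 'Name', 'Value', 'Value', 'Value', 'Name', 'Value')
Value <- c('Jim', 100, 200, 'Sally', 300, 200, 300, 'Jim', 500)
df <- data.frame(ID, Detail, Value)

The problem here is that, for each record ID, the Value column holds both names and values. I need to reshape the Value column so that the names go into their own column, alongside the values.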

The desired output looks like this.

I'm not even sure where to start... or what to search for.


3 Answers:

Answer 0: (Score: 4)

I think the simplest way is this:

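# Split the rows by type, dropping the Detail column (column 2)
dfNames <- df[df$Detail == "Name", -2]
dfValue <- df[df$Detail == "Value", -2]

# Join on ID so every value row picks up its name
dfWide <- merge(dfNames, dfValue, by = "ID")
colnames(dfWide) <- c("ID", "Name", "Value")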


  ID  Name Value
1  1   Jim   100
2  1   Jim   200
3  2 Sally   300
4  2 Sally   200
5  2 Sally   300
6  3   Jim   500
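
For anyone who wants to run this end to end, here is a minimal base R sketch of the same merge idea. Two things in it are my own assumptions rather than part of the answer above: the data frame is built with stringsAsFactors = FALSE so that Value stays character, and Value is converted to numeric at the end, since mixing 'Jim' with 100 in one vector stores the numbers as text.

```r
# Rebuild the example data from the question.
# stringsAsFactors = FALSE is an assumption, so Value stays character.
df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  Detail = c("Name", "Value", "Value", "Name", "Value",
             "Value", "Value", "Name", "Value"),
  Value = c("Jim", 100, 200, "Sally", 300, 200, 300, "Jim", 500),
  stringsAsFactors = FALSE
)

# Split by row type, keeping only the ID and Value columns
dfNames <- df[df$Detail == "Name", c("ID", "Value")]
dfValue <- df[df$Detail == "Value", c("ID", "Value")]

# Rename before merging so merge() does not add .x/.y suffixes
names(dfNames)[2] <- "Name"
dfWide <- merge(dfNames, dfValue, by = "ID")

# The numbers were stored as text next to the names; convert them back
dfWide$Value <- as.numeric(dfWide$Value)
```

Renaming the column before the merge is only a convenience; setting colnames afterwards, as in the answer, gives the same columns.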

Answer 1: (Score: 2)

There are several ways to approach your problem; here is one of them, using dplyr:

library(dplyr)

# Join the Name rows to the Value rows on ID, then drop both Detail columns
inner_join(filter(df, Detail == 'Name'),
           filter(df, Detail == 'Value'),
           by = 'ID') %>%
  select(-contains('Detail')) %>%
  setNames(c('ID', 'Name', 'Value'))

  ID  Name Value
1  1   Jim   100
2  1   Jim   200
3  2 Sally   300
4  2 Sally   200
5  2 Sally   300
6  3   Jim   500

Answer 2: (Score: 1)

Using dplyr, assuming there is only one name per ID:

library(dplyr)

# Within each ID, pair the single Name with every Value
df %>% group_by(ID) %>% do({
    data.frame(Name = .$Value[.$Detail == "Name"], Value = .$Value[.$Detail == "Value"])
})

# A tibble: 6 x 3
# Groups:   ID [3]
#     ID   Name  Value
#  <dbl> <fctr> <fctr>
#1     1    Jim    100
#2     1    Jim    200
#3     2  Sally    300
#4     2  Sally    200
#5     2  Sally    300
#6     3    Jim    500

Using data.table:

library(data.table)

# Per ID, the single Name is recycled across all of that ID's Values
setDT(df)[, .(Name = Value[Detail == "Name"], Value = Value[Detail == "Value"]), by = ID]

#   ID  Name Value
#1:  1   Jim   100
#2:  1   Jim   200
#3:  2 Sally   300
#4:  2 Sally   200
#5:  2 Sally   300
#6:  3   Jim   500
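
Both snippets in this answer rely on each ID having exactly one Name row. A quick way to check that before reshaping might look like the base R sketch below; the variable names nameCounts and badIDs are just illustrative, and stringsAsFactors = FALSE is an assumption about how the data frame was built.

```r
# Example data from the question
df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  Detail = c("Name", "Value", "Value", "Name", "Value",
             "Value", "Value", "Name", "Value"),
  Value = c("Jim", 100, 200, "Sally", 300, 200, 300, "Jim", 500),
  stringsAsFactors = FALSE
)

# Count the Name rows within each ID
nameCounts <- tapply(df$Detail == "Name", df$ID, sum)

# IDs that break the one-name-per-ID assumption
badIDs <- names(nameCounts)[nameCounts != 1]

# Stop early if any ID has zero or several names
stopifnot(length(badIDs) == 0)
```

If badIDs is not empty, those IDs would need to be handled separately before using either approach.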