Question

I have a CSV file that looks like this:

Id,Title,FullDescription,LocationRaw,LocationNormalized
1,hi,abc,def,Bristol
1,yo,abc,def,Bristol
1,was,abc,def,England
1,up,abc,def,India
1,yoh,abc,def,Nepal
1,home,abc,def,Bristol

I want a unique ID for each distinct value of LocationNormalized, so that my output looks like this:
    Id,Title,FullDescription,LocationRaw,LocationNormalized,ID
    1,hi,abc,def,Bristol,1
    1,yo,abc,def,Bristol,1
    1,was,abc,def,England,2
    1,up,abc,def,India,3
    1,yoh,abc,def,Nepal,4
    1,home,abc,def,Bristol,1

I am new to R. I tried as.factor and a few scripts, but none of them worked.

Answer 1

Data

df <- data.table::fread("Id,Title,FullDescription,LocationRaw,LocationNormalized
1,hi,abc,def,Bristol
1,yo,abc,def,Bristol
1,was,abc,def,England
1,up,abc,def,India
1,yoh,abc,def,Nepal
1,home,abc,def,Bristol")

Solution

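library(dplyr)

# Number each distinct LocationNormalized value and store it in new_ID.
# Note: passing grouping columns to group_indices() is deprecated since dplyr 1.0.0;
# there, group_by(LocationNormalized) followed by mutate(new_ID = cur_group_id()) does the same.
df %>%
  mutate(new_ID = group_indices(., LocationNormalized))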

  Id Title FullDescription LocationRaw LocationNormalized new_ID
1  1    hi             abc         def            Bristol      1
2  1    yo             abc         def            Bristol      1
3  1   was             abc         def            England      2
4  1    up             abc         def              India      3
5  1   yoh             abc         def              Nepal      4
6  1  home             abc         def            Bristol      1

Answer 2

Using data.table:

library(data.table)
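# df1 is assumed to hold the data from the question.
# setDT() converts it to a data.table by reference; .GRP is the group counter,
# so each distinct LocationNormalized gets one ID, in order of first appearance.
setDT(df1)[, ID := .GRP, by = LocationNormalized]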
df1
#   Id Title FullDescription LocationRaw LocationNormalized ID
#1:  1    hi             abc         def            Bristol  1
#2:  1    yo             abc         def            Bristol  1
#3:  1   was             abc         def            England  2
#4:  1    up             abc         def              India  3
#5:  1   yoh             abc         def              Nepal  4
#6:  1  home             abc         def            Bristol  1
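
For comparison, the same numbering can be sketched in base R without any packages. This is only an illustration and not part of either answer: the data frame name `jobs` and the helper vector `id_sorted` are made up for the example. `match()` against `unique()` numbers values in order of first appearance, much like `.GRP`, while `as.integer(factor())` numbers them by sorted factor level, which is closer to what the question attempted with as.factor. The two agree on this data only because the locations happen to first appear in alphabetical order.

```r
# Hypothetical data frame mirroring the CSV in the question
jobs <- read.csv(text = "Id,Title,FullDescription,LocationRaw,LocationNormalized
1,hi,abc,def,Bristol
1,yo,abc,def,Bristol
1,was,abc,def,England
1,up,abc,def,India
1,yoh,abc,def,Nepal
1,home,abc,def,Bristol", stringsAsFactors = FALSE)

# Number each location by the position of its first appearance
jobs$ID <- match(jobs$LocationNormalized, unique(jobs$LocationNormalized))

# Alternative: number each location by its sorted factor level
id_sorted <- as.integer(factor(jobs$LocationNormalized))

print(jobs)
```

If the IDs must follow the order in which values first occur, the `match()` form is the safer choice; the factor form depends on how the levels sort.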
