How do I count the number of words in each row of a column and convert the result to a number?

Time: 2019-04-26 16:18:21

Tags: r regex

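I have a column in a data frame that lists the amenities found at each hotel location. I need to count how many amenities each row contains, convert that count to a number, and then create another column from those numbers.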

> airbnb$amenities[1:25]
 [1] "{TV,Internet,Wifi,\"Air conditioning\",\"Paid parking off premises\",Breakfast,Heating,\"Smoke detector\",\"Carbon monoxide detector\",\"First aid kit\",\"Safety card\",\"Fire extinguisher\",Essentials,Shampoo,\"Lock on bedroom door\",\"24-hour check-in\",Hangers,\"Hair dryer\",Iron,\"Laptop friendly workspace\",\"translation missing: en.hosting_amenity_49\",\"translation missing: en.hosting_amenity_50\",\"Private entrance\",\"Hot water\",\"Patio or balcony\",\"Garden or backyard\",\"Luggage dropoff allowed\",\"Well-lit path to entrance\",\"Host greets you\"}"                                                                                                                                                                                                                                                                                                         
 [2] "{TV,Wifi,\"Air conditioning\",Kitchen,\"Pets live on this property\",Cat(s),\"Free street parking\",Heating,Washer,Dryer,\"Smoke detector\",Essentials,Shampoo,Hangers,\"Hair dryer\",Iron,\"Laptop friendly workspace\",\"Hot water\",\"Luggage dropoff allowed\",Other}"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
 [3] "{TV,\"Cable TV\",Wifi,\"Air conditioning\",Pool,Kitchen,\"Free parking on premises\",Breakfast,Elevator,\"Hot tub\",\"Buzzer/wireless intercom\",Heating,\"Family/kid friendly\",Washer,\"Smoke detector\",\"First aid kit\",Essentials,Shampoo,\"24-hour check-in\",Hangers,\"Hair dryer\",Iron,\"translation missing: en.hosting_amenity_50\"}"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
 [4] "{Internet,Wifi,Pool,Kitchen,\"Free street parking\",\"Buzzer/wireless intercom\",Heating,\"Smoke detector\",Essentials,Hangers,Iron,\"Hot water\",Microwave,\"Coffee maker\",Refrigerator,\"Dishes and silverware\",\"Cooking basics\",\"BBQ grill\",\"Garden or backyard\",\"Long term stays allowed\",\"Host greets you\"}"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
 [5] "{TV,Internet,Wifi,\"Air conditioning\",Kitchen,\"Paid parking off premises\",Elevator,\"Buzzer/wireless intercom\",Heating,Washer,Dryer,\"Smoke detector\",\"First aid kit\",\"Safety card\",Essentials,Shampoo,Hangers,\"Hair dryer\",Iron,\"Laptop friendly workspace\",\"Hot water\",Microwave,Refrigerator,Dishwasher,\"Dishes and silverware\",\"Cooking basics\",Oven,Stove,\"Long term stays allowed\",Other}"     

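I'm familiar with grep, gsub, and the like, but I'm confused about how to count within each row. I thought grep('[a-z]', airbnb$amenities) might somehow count the pattern in each row, but I'm still stuck.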

1 answer:

Answer 0: (score: 2)

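One option is str_count from the stringr package: count the delimiter (,) in each row, then add 1 to get the number of entries (n comma-separated entries contain n - 1 commas, and each entry may be a multi-word n-gram). str_count returns an integer vector, so the result is already numeric and can be assigned directly to a new column.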

library(stringr)
airbnb$amenityCount <- str_count(airbnb$amenities, ",") + 1
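
The same idea can be sketched in base R, without stringr. The sample vector below is hypothetical and only mimics the format of airbnb$amenities. The sketch also assumes that an empty listing is stored as "{}", which the comma-plus-one rule would otherwise count as 1, and that no amenity name itself contains a comma.

```r
# Hypothetical sample rows in the same format as airbnb$amenities
amenities <- c(
  "{TV,Wifi,\"Air conditioning\",Kitchen}",
  "{Internet}",
  "{}"
)

# Count the commas in each element; an element with no match yields 0
comma_count <- lengths(regmatches(amenities, gregexpr(",", amenities, fixed = TRUE)))

# Entries = commas + 1, except for an empty "{}" (assumed to mean no amenities)
amenity_count <- ifelse(amenities == "{}", 0, comma_count + 1)

print(amenity_count)
# [1] 4 1 0
```

On the real data, the last step would be assigned to a new column, for example airbnb$amenityCount. By contrast, grep('[a-z]', airbnb$amenities) returns the indices of the rows that match the pattern, not a per-row count, which is why it does not help here.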