R - How to compute a row-wise value across columns and divide by the number of integers present

Time: 2019-06-07 20:38:01

Tags: r dataframe dplyr tidyverse

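I have a data frame in which the groups vary in size, but I want to sum across each row and divide by n (the number of integers present in that row), storing the result in a new column (V1.mean).

Unless there is a way to select by name, we have to select by column number ([10:18] in this case). If there is such a way, please do teach me, because I have to transform 8 questions (x9) like this (see the example below).

So I tried this: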

df$v1.mean <- rowSums(cbind(df[10:18]), na.rm = T ) / # sums it up
              ncol(is.integer(cbind(df[10:18] )))     # sums integers, but no

I saw this dplyr example, but I am not sure how to make it work with tally().

The data frame looks like this, where V1.mean is the solution I am looking for.

  V1.1 V1.2 V1.3 V1.4 V1.5 V1.6 V1.7 V1.8  V1.9 V2.1 | V1.mean V2.mean
1     5    4    5   NA   NA   NA   NA   NA   NA   5  | 4.67 [== (5+4+5)/3]
2     5    5    5   NA   NA   NA   NA   NA   NA   3
3     5    5    5    5   NA   NA   NA   NA   NA  ...
4     5    4    5   NA   NA   NA   NA   NA   NA  ...
5     5    5   NA   NA   NA   NA   NA   NA   NA  ...
6     5    5    5    5   NA   NA   NA   NA   NA  ...
7     5    5    5    4    4   NA   NA   NA   NA  ...
8     5    5    5    4    5    5   NA   NA   NA  ... | 4.83 [== (5+5+5+4+5+5)/6]
9     4    5    5    5    4   NA   NA   NA   NA  ...
10    5    5    5   NA   NA   NA   NA   NA   NA  ...

Thanks in advance :)

3 Answers:

Answer 0: (score: 1)

One option is to split the data into a list of data.frames and take the rowMeans of each.
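A minimal base R sketch of that idea, not necessarily the answerer's exact code. The small `df1`, the `groups` vector, and the `.mean` suffix are assumptions made for illustration; the grouping key is taken to be the part of each column name before the final `.<digits>`.

```r
# Illustrative data: the first three rows of the question's table, four V1 columns only.
df1 <- data.frame(
  V1.1 = c(5, 5, 5),
  V1.2 = c(4, 5, 5),
  V1.3 = c(5, 5, 5),
  V1.4 = c(NA, NA, 5),
  V2.1 = c(5, 3, 4)
)

# Strip the trailing ".<digits>" so that V1.1, V1.2, ... all map to "V1".
groups <- sub("\\.[0-9]+$", "", names(df1))

# split.default() splits a data frame column-wise into a list of data frames.
pieces <- split.default(df1, groups)

# rowMeans(na.rm = TRUE) divides each row sum by the number of non-NA values.
means <- sapply(pieces, rowMeans, na.rm = TRUE)
colnames(means) <- paste0(colnames(means), ".mean")

# Bind the new columns back onto the original data.
result <- cbind(df1, means)
result
```

Because the grouping is derived from the column names, no column numbers such as `[10:18]` are needed, and every question prefix is handled in one pass.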

Or use the same logic in a tidyverse chain.
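A hedged tidyverse sketch, assuming `dplyr` is installed. Rather than splitting, it selects each block of columns by name prefix with `starts_with()`, which is a swapped-in but closely related technique; the small `df1` is illustrative.

```r
library(dplyr)

# Illustrative data: the first three rows of the question's table, four V1 columns only.
df1 <- data.frame(
  V1.1 = c(5, 5, 5),
  V1.2 = c(4, 5, 5),
  V1.3 = c(5, 5, 5),
  V1.4 = c(NA, NA, 5),
  V2.1 = c(5, 3, 4)
)

out <- df1 %>%
  mutate(
    # `.` is the piped-in data frame; select() picks columns by name prefix.
    V1.mean = rowMeans(select(., starts_with("V1.")), na.rm = TRUE),
    V2.mean = rowMeans(select(., starts_with("V2.")), na.rm = TRUE)
  )
out
```

Selecting by prefix answers the "select by name" part of the question directly, at the cost of writing one line per question block.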


Data
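The data can be rebuilt from the ten-row table printed in the last answer of this thread. The sketch below assumes those printed values are complete and stores the all-NA columns as numeric; `v1_cols` is a hypothetical helper name.

```r
# Values transcribed from the printed 10-row table (V1.1 to V1.9, then V2.1).
df1 <- data.frame(
  V1.1 = c(5, 5, 5, 5, 5, 5, 5, 5, 4, 5),
  V1.2 = c(4, 5, 5, 4, 5, 5, 5, 5, 5, 5),
  V1.3 = c(5, 5, 5, 5, NA, 5, 5, 5, 5, 5),
  V1.4 = c(NA, NA, 5, NA, NA, 5, 4, 4, 5, NA),
  V1.5 = c(NA, NA, NA, NA, NA, NA, 4, 5, 4, NA),
  V1.6 = c(NA, NA, NA, NA, NA, NA, NA, 5, NA, NA),
  V1.7 = rep(NA_real_, 10),
  V1.8 = rep(NA_real_, 10),
  V1.9 = rep(NA_real_, 10),
  V2.1 = c(5, 3, 4, 3, 2, 1, 5, 4, 1, 5)
)

# Pick the V1 block by name instead of by position.
v1_cols <- grep("^V1\\.", names(df1), value = TRUE)

# Row-wise mean over the non-NA entries of the V1 block.
v1_mean <- rowMeans(df1[v1_cols], na.rm = TRUE)
v1_mean
```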


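Answer 1: (score: 1)

Akrun gave the correct answer, but your data is not in the easiest format for most analyses.

You may want to consider melting the data into long format.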

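library(reshape2)  # melt()
library(plyr)      # ddply()

# One row per (row, column) cell, with the cell's value in `value`
x = melt(as.matrix(df), varnames = c('row', 'col'))
x$id = substr(x$col, 1, 2)  # question prefix, e.g. "V1"
ddply(x, c('row', 'id'), summarise, mean = mean(value, na.rm = TRUE)) # or aggregate, etc.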
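The `aggregate` alternative mentioned in the comment can be sketched in base R alone. This is an illustration under assumptions: the tiny `df` and the names `long` and `res` are made up here, and the long table is built by hand rather than with `melt`.

```r
# Illustrative wide data with one missing value.
df <- data.frame(
  V1.1 = c(5, 5),
  V1.2 = c(4, 5),
  V1.3 = c(5, NA),
  V2.1 = c(5, 3)
)

# Build the long format by hand: one row per (row, column) cell.
long <- data.frame(
  row   = rep(seq_len(nrow(df)), times = ncol(df)),
  col   = rep(names(df), each = nrow(df)),
  value = unlist(df, use.names = FALSE)
)
long$id <- substr(long$col, 1, 2)  # question prefix, e.g. "V1"

# The formula interface drops rows with NA values (na.action = na.omit),
# so each mean is taken over the non-missing cells only.
res <- aggregate(value ~ row + id, data = long, FUN = mean)
res
```

One caveat of this variant: a row/prefix combination whose cells are all NA is dropped from the result instead of appearing as NaN.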

Answer 2: (score: 1)

I think row_mean_ from hablar is a simpler solution. I reused @akrun's df.

library(dplyr)   # %>% and mutate()
library(hablar)

df1 %>% 
  mutate(v1.mean = row_mean_(contains("v1")))

Which gives you:

   V1.1 V1.2 V1.3 V1.4 V1.5 V1.6 V1.7 V1.8 V1.9 V2.1  v1.mean
1     5    4    5   NA   NA   NA   NA   NA   NA    5 4.666667
2     5    5    5   NA   NA   NA   NA   NA   NA    3 5.000000
3     5    5    5    5   NA   NA   NA   NA   NA    4 5.000000
4     5    4    5   NA   NA   NA   NA   NA   NA    3 4.666667
5     5    5   NA   NA   NA   NA   NA   NA   NA    2 5.000000
6     5    5    5    5   NA   NA   NA   NA   NA    1 5.000000
7     5    5    5    4    4   NA   NA   NA   NA    5 4.600000
8     5    5    5    4    5    5   NA   NA   NA    4 4.833333
9     4    5    5    5    4   NA   NA   NA   NA    1 4.600000
10    5    5    5   NA   NA   NA   NA   NA   NA    5 5.000000