Problem with R's aggregate function

Date: 2017-08-02 04:56:01

Tags: r dataframe ggplot2 sum aggregate

I am trying to aggregate some data, but unfortunately I seem to be losing some of it...

dataframe <-   Project    Subproject       Value      Date  
                 A           1              9       2017-03-08
                 A           2              5       2017-03-08
                 B           1              1       2017-03-08
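For reference, a minimal sketch of how this example data could be built as an actual data frame (values taken from the table above; the name dataframe matches the question's code):

dataframe <- data.frame(
  Project    = c("A", "A", "B"),
  Subproject = c(1, 2, 1),
  Value      = c(9, 5, 1),
  Date       = as.Date(c("2017-03-08", "2017-03-08", "2017-03-08"))
)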

overall <- aggregate(dataframe$Value, by=list(Date=dataframe$Date, Project=dataframe$Project), FUN=sum)

only returns this to me:

dataframe <-   Project      Value      Date  
                 A           14       2017-03-08

when what I actually want is this:

dataframe <-   Project    Value      Date  
                 A          14       2017-03-08
                 B           1       2017-03-08

Update: I tried the proposed solution again, and although R tells me that my data frame does contain a Project B with the stated value and date, my ggplot says otherwise...

 ggplot(data = dataframe, aes(x = Date, y = Value, fill = Project)) +
  geom_bar(stat = 'identity') + geom_text(data = dataframe, aes(label = Value, fill = Project), size=4)

No matter what I do, it only plots the data for Project A. However, if I do not summarise/aggregate the data, it plots both projects fine, but geom_text then still prints two separate numbers for Project A. My overall goal is to build a data frame aggregated as described, so that I can plot the aggregated data cleanly and label my bar chart correctly with geom_text...

1 Answer:

Answer 0 (score: 1)

You can try:

library(dplyr)

df %>%
  group_by(Project, Date) %>%
  summarise(Value = sum(Value))

which gives:

  Project       Date Value
1       A 2017-03-08    14
2       B 2017-03-08     1

This can then be plotted with ggplot(data = df, aes(x = Date, y = Value, fill = Project)) + geom_bar(stat = 'identity')
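The OP also asks about labelling the bars with geom_text. A minimal sketch (assuming the aggregated data, here stored under the hypothetical name df_agg) that places exactly one label per stacked segment could be:

library(dplyr)
library(ggplot2)

# df_agg: hypothetical name for the aggregated result from above
df_agg <- df %>%
  group_by(Project, Date) %>%
  summarise(Value = sum(Value))

ggplot(df_agg, aes(x = Date, y = Value, fill = Project)) +
  geom_bar(stat = 'identity') +
  # position_stack(vjust = 0.5) centres each label within its own bar segment,
  # so each Project gets one label instead of overlapping text
  geom_text(aes(label = Value), position = position_stack(vjust = 0.5), size = 4)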


Edit 1: Per the OP's comment, to save the output back into a data frame, use df %<>% ... instead of df %>% ...; the %<>% compound-assignment pipe comes from the magrittr package.
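For illustration, a minimal sketch of that compound-assignment pipe (assuming df is the data frame from the question; %<>% must be attached explicitly via magrittr):

library(dplyr)
library(magrittr)

# %<>% pipes df through the chain and assigns the result back to df
df %<>%
  group_by(Project, Date) %>%
  summarise(Value = sum(Value))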