I want to merge inner lists using Java 8 streams, as follows.
When
List<List<Integer>> mainList = new ArrayList<List<Integer>>();
mainList.add(Arrays.asList(0,1));
mainList.add(Arrays.asList(0,1,2));
mainList.add(Arrays.asList(1,2));
mainList.add(Arrays.asList(3));
it should be merged into
[[0,1,2],[3]];
And when
List<List<Integer>> mainList = new ArrayList<List<Integer>>();
mainList.add(Arrays.asList(0,2));
mainList.add(Arrays.asList(1,4));
mainList.add(Arrays.asList(0,2,4));
mainList.add(Arrays.asList(3,4));
mainList.add(Arrays.asList(1,3,4));
it should be merged into
[[0,1,2,3,4]];
Here is what I have done so far:
static void mergeCollections(List<List<Integer>> collectionTomerge) {
    boolean isMerge = false;
    List<List<Integer>> mergeCollection = new ArrayList<List<Integer>>();
    for (List<Integer> listInner : collectionTomerge) {
        List<Integer> mergeAny = mergeCollection.stream().map(
                lc -> lc.stream().filter(listInner::contains)
            ).findFirst()
            .orElse(null)
            .collect(Collectors.toList());
    }
}
But I get this exception:
Exception in thread "main" java.lang.NullPointerException
at linqArraysOperations.LinqOperations.mergeCollections(LinqOperations.java:87)
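Update: my version of the answer
This is what I was trying to achieve, although Tagir's answer does it without recursion.
I got there by modifying Mikhaal's answer slightly, using the logic from Tagir's answer but without flatMap: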
public static <T> List<List<T>> combineList(List<List<T>> argList) {
    boolean isMerge = false;
    List<List<T>> result = new ArrayList<>();
    for (List<T> list : argList) {
        // merge list into every existing group that shares an element with it
        List<List<T>> mergedFound =
            result.stream()
                  .filter(mt -> list.stream().anyMatch(mt::contains))
                  .map(t -> Stream.concat(t.stream(), list.stream()).distinct()
                                  .collect(Collectors.toList()))
                  .collect(Collectors.toList());
        if (!mergedFound.isEmpty()) {
            // keep the untouched groups and swap the touched ones for their merged versions
            result = Stream.concat(
                    result.stream().filter(t -> list.stream().noneMatch(t::contains)),
                    mergedFound.stream())
                .distinct().collect(Collectors.toList());
            isMerge = true;
        } else {
            result.add(list);
        }
    }
    // groups merged in this pass may now overlap each other, so run another pass
    if (isMerge && result.size() > 1)
        return combineList(result);
    return result;
}
Answer 0 (score: 5)
Here is a very simple, though not very efficient, solution:
static List<List<Integer>> mergeCollections(List<List<Integer>> input) {
    List<List<Integer>> result = Collections.emptyList();
    for (List<Integer> listInner : input) {
        List<Integer> merged = Stream.concat(
                // read current results and select only those which contain
                // numbers from current list
                result.stream()
                      .filter(list -> list.stream().anyMatch(listInner::contains))
                      // flatten them into single stream
                      .flatMap(List::stream),
                // concatenate current list, remove repeating numbers and collect
                listInner.stream()).distinct().collect(Collectors.toList());
        // Now we need to remove used lists from the result and add the newly created
        // merged list
        result = Stream.concat(
                result.stream()
                      // filter out used lists
                      .filter(list -> list.stream().noneMatch(merged::contains)),
                Stream.of(merged)).collect(Collectors.toList());
    }
    return result;
}
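The tricky part is that the next listInner may merge several lists that were already added. For example, if the partial result is [[1, 2], [4, 5], [7, 8]] and we process a new listInner with content [2, 3, 5, 7], the partial result should become [[1, 2, 3, 4, 5, 7, 8]] (that is, all the lists are merged together). So on each iteration we look for existing partial results that share numbers with the current listInner, flatten them, concatenate them with the current listInner and dump everything into the new merged list. Next, we filter out of the current result the lists that were used in merged, and add merged there.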
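To make that single iteration concrete, here is a minimal sketch that isolates one pass of the loop and applies it to the example above. The class name MergeStepDemo and the helper step are hypothetical names introduced only for illustration; the stream logic is the same as in the method above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergeStepDemo {
    // One iteration of the loop: fold listInner into the partial result.
    // (Hypothetical helper, extracted for illustration only.)
    static List<List<Integer>> step(List<List<Integer>> partial, List<Integer> listInner) {
        // flatten every partial list that shares a number with listInner,
        // append listInner itself and drop duplicates
        List<Integer> merged = Stream.concat(
                partial.stream()
                       .filter(list -> list.stream().anyMatch(listInner::contains))
                       .flatMap(List::stream),
                listInner.stream())
            .distinct()
            .collect(Collectors.toList());
        // keep the partial lists that were not absorbed, then add merged
        return Stream.concat(
                partial.stream().filter(list -> list.stream().noneMatch(merged::contains)),
                Stream.of(merged))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> partial = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(4, 5), Arrays.asList(7, 8));
        // all three partial lists are absorbed into one
        System.out.println(step(partial, Arrays.asList(2, 3, 5, 7)));
        // prints [[1, 2, 4, 5, 7, 8, 3]]
        // nothing overlaps, so the new list is simply appended
        System.out.println(step(partial, Arrays.asList(9)));
        // prints [[1, 2], [4, 5], [7, 8], [9]]
    }
}
```

Note that the merged list holds the right elements but in encounter order, not sorted order.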
Using the partitioningBy collector, both filtering steps can be done at once, which makes the solution more efficient:
static List<List<Integer>> mergeCollections(List<List<Integer>> input) {
    List<List<Integer>> result = Collections.emptyList();
    for (List<Integer> listInner : input) {
        // partition current results by condition: whether they contain
        // numbers from listInner
        Map<Boolean, List<List<Integer>>> map = result.stream().collect(
            Collectors.partitioningBy(
                list -> list.stream().anyMatch(listInner::contains)));
        // now map.get(true) contains lists which intersect with current
        // and should be merged with current
        // and map.get(false) contains other lists which should be preserved
        // in result as is
        List<Integer> merged = Stream.concat(
                map.get(true).stream().flatMap(List::stream),
                listInner.stream()).distinct().collect(Collectors.toList());
        result = Stream.concat(map.get(false).stream(), Stream.of(merged))
            .collect(Collectors.toList());
    }
    return result;
}
Here map.get(true) contains the lists that have elements from listInner, and map.get(false) contains the other lists, which should be preserved from the previous result.
The order of elements may not be what you expect, but you can easily sort the nested lists, or use List<TreeSet<Integer>> as the resulting data structure if you like.
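As a rough sketch of the List<TreeSet<Integer>> option, the partitioningBy version can collect into sorted sets instead of lists. The names SortedMergeDemo and mergeSorted are assumptions made for this example and do not come from the answer's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedMergeDemo {
    // Same partitioning idea, but each group is a TreeSet, so its elements
    // stay sorted and duplicates are dropped without distinct().
    static List<TreeSet<Integer>> mergeSorted(List<List<Integer>> input) {
        List<TreeSet<Integer>> result = new ArrayList<>();
        for (List<Integer> listInner : input) {
            // true  -> groups sharing an element with listInner
            // false -> groups to keep unchanged
            Map<Boolean, List<TreeSet<Integer>>> map = result.stream().collect(
                Collectors.partitioningBy(
                    set -> listInner.stream().anyMatch(set::contains)));
            TreeSet<Integer> merged = Stream.concat(
                    map.get(true).stream().flatMap(TreeSet::stream),
                    listInner.stream())
                .collect(Collectors.toCollection(TreeSet::new));
            result = Stream.concat(map.get(false).stream(), Stream.of(merged))
                .collect(Collectors.toList());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(mergeSorted(Arrays.asList(
            Arrays.asList(0, 1), Arrays.asList(0, 1, 2),
            Arrays.asList(1, 2), Arrays.asList(3))));
        // prints [[0, 1, 2], [3]]
        System.out.println(mergeSorted(Arrays.asList(
            Arrays.asList(0, 2), Arrays.asList(1, 4), Arrays.asList(0, 2, 4),
            Arrays.asList(3, 4), Arrays.asList(1, 3, 4))));
        // prints [[0, 1, 2, 3, 4]]
    }
}
```

Only the elements inside each group are sorted; the groups themselves still appear in the order in which they were last merged.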
Answer 1 (score: 1)
As for the exception you are getting, my guess is that the List<List<Integer>> passed to mergeCollections contains null values, and that those are what throw the NullPointerException.
Second, if I understand your question correctly, you need an algorithm that merges lists sharing common elements. I came up with a solution to that problem.
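The algorithm is fairly simple: it uses plain recursion to rerun itself until the output is "clean", that is, until the lists are completely merged. I have not done any optimization, but it does what it is supposed to do.
Note that this method will also merge your existing list.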
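One way the recursive idea described above could look is sketched below. This is not the answer's original listing: the class name RecursiveMergeDemo and the method mergeUntilClean are hypothetical, and, unlike the method described above, this sketch copies each list instead of modifying the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RecursiveMergeDemo {
    // Runs one merging pass, then recurses until a pass merges nothing,
    // i.e. until the output is "clean". (Hypothetical sketch.)
    static <T> List<List<T>> mergeUntilClean(List<List<T>> input) {
        List<List<T>> result = new ArrayList<>();
        boolean mergedAny = false;
        for (List<T> list : input) {
            // find the first group that shares an element with this list
            List<T> target = null;
            for (List<T> group : result) {
                if (!Collections.disjoint(group, list)) {
                    target = group;
                    break;
                }
            }
            if (target == null) {
                // no overlap: start a new group (a copy, so input stays untouched)
                result.add(new ArrayList<>(list));
            } else {
                // overlap: add the elements the group does not have yet
                for (T item : list) {
                    if (!target.contains(item)) {
                        target.add(item);
                    }
                }
                mergedAny = true;
            }
        }
        // every merge shrinks the number of lists, so the recursion terminates
        return mergedAny ? mergeUntilClean(result) : result;
    }

    public static void main(String[] args) {
        System.out.println(mergeUntilClean(Arrays.asList(
            Arrays.asList(0, 1), Arrays.asList(0, 1, 2),
            Arrays.asList(1, 2), Arrays.asList(3))));
        // prints [[0, 1, 2], [3]]
    }
}
```

A pass that merges nothing means every list was disjoint from all the groups before it, so the groups are pairwise disjoint and the result is final.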