BFS implementation for finding connected components takes too long

Time: 2013-03-09 19:23:32

Tags: java performance graph breadth-first-search

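This is a continuation of the question I asked here. Given the total number of nodes (employees) and an adjacency list (friendships between employees), I need to find all connected components. Here is my code: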

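    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.LinkedHashSet;
    import java.util.Map;
    import java.util.Set;
    import java.util.Vector;

    public class Main {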
        static HashMap<String, Set<String>> friendShips;

        public static void main(String[] args) throws IOException {
            BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

            String dataLine = in.readLine();
            String[] lineParts = dataLine.split(" ");
            int employeeCount = Integer.parseInt(lineParts[0]);
            int friendShipCount = Integer.parseInt(lineParts[1]);
            friendShips = new HashMap<String, Set<String>>();
            for (int i = 0; i < friendShipCount; i++) {
                String friendShipLine = in.readLine();
                String[] friendParts = friendShipLine.split(" ");
                mapFriends(friendParts[0], friendParts[1], friendShips);
                mapFriends(friendParts[1], friendParts[0], friendShips);
            }
            Set<String> employees = new HashSet<String>();
            for (int i = 1; i <= employeeCount; i++) {
                employees.add(Integer.toString(i));
            }
            Vector<Set<String>> friendBuckets = bucketizeEmployees(employees);
            System.out.println(friendBuckets.size());
        }

        public static void mapFriends(String friendA, String friendB, Map<String, Set<String>> friendsShipMap) {
            if (friendsShipMap.containsKey(friendA)) {
                friendsShipMap.get(friendA).add(friendB);
            } else {
                Set<String> friends = new HashSet<String>();
                friends.add(friendB);
                friendsShipMap.put(friendA, friends);
            }
        }

        public static Vector<Set<String>> bucketizeEmployees(Set<String> employees) {
            Vector<Set<String>> friendBuckets = new Vector<Set<String>>();
            while (!employees.isEmpty()) {
                String employee = getHeadElement(employees);
                Set<String> connectedEmployeesBucket = getConnectedFriends(employee);
                friendBuckets.add(connectedEmployeesBucket);
                employees.removeAll(connectedEmployeesBucket);
            }
            return friendBuckets;
        }

        private static Set<String> getConnectedFriends(String friend) {
            Set<String> connectedFriends = new HashSet<String>();
            connectedFriends.add(friend);
            Set<String> queuedFriends = new LinkedHashSet<String>();
            if (friendShips.get(friend) != null) {
                queuedFriends.addAll(friendShips.get(friend));
            }
            while (!queuedFriends.isEmpty()) {
                String poppedFriend = getHeadElement(queuedFriends);
                connectedFriends.add(poppedFriend);
                if (friendShips.containsKey(poppedFriend))
                    for (String directFriend : friendShips.get(poppedFriend)) {
                        if (!connectedFriends.contains(directFriend) && !queuedFriends.contains(directFriend)) {
                            queuedFriends.add(directFriend);
                        }
                    }
            }
            return connectedFriends;
        }

        private static String getHeadElement(Set<String> setFriends) {
            Iterator<String> iter = setFriends.iterator();
            String head = iter.next();
            iter.remove();
            return head;
        }
    }

I tested my code with the following script, whose output I feed in as stdin:

    #!/bin/bash
    echo "100000 100000"
    for i in {1..100000}
    do
        r1=$(( $RANDOM % 100000 ))
        r2=$(( $RANDOM % 100000 ))
        echo "$r1 $r2"
    done

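While I was able to verify (for trivial inputs) that my answer is correct, when I try a large input generated by the script above, the run takes a long time (~20s).
Is there anything in my implementation that I could do better?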

1 answer:

Answer 0 (score: 0)

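Avoid the synchronized class Vector; use ArrayList instead. It would also be an advantage if you found a way to get rid of the Strings and use Integer instead (for example, a userId rather than a userName).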
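
As a rough illustration of both suggestions, here is a minimal sketch that keys everything by int employee id, stores the friendships in ArrayList-based adjacency lists, and runs the BFS with an ArrayDeque and a boolean visited array. The class name ComponentCounter and the method countComponents are hypothetical names chosen for this example, not part of the question's code. It also assumes every id lies in 1..employeeCount, which the random test script does not guarantee (it can emit 0).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the asker's class: counts connected components
// using int ids instead of Strings and ArrayList instead of Vector.
class ComponentCounter {

    // Assumes every id in friendships lies in 1..employeeCount.
    static int countComponents(int employeeCount, int[][] friendships) {
        // Adjacency lists indexed by employee id; index 0 is unused.
        List<List<Integer>> friends = new ArrayList<>(employeeCount + 1);
        for (int i = 0; i <= employeeCount; i++) {
            friends.add(new ArrayList<>());
        }
        for (int[] pair : friendships) {
            friends.get(pair[0]).add(pair[1]);
            friends.get(pair[1]).add(pair[0]);
        }

        boolean[] visited = new boolean[employeeCount + 1];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        int components = 0;

        for (int start = 1; start <= employeeCount; start++) {
            if (visited[start]) {
                continue; // already reached from an earlier start node
            }
            components++;
            visited[start] = true;
            queue.add(start);
            // Standard BFS: mark a node when it is enqueued, so it is queued once.
            while (!queue.isEmpty()) {
                int current = queue.poll();
                for (int friend : friends.get(current)) {
                    if (!visited[friend]) {
                        visited[friend] = true;
                        queue.add(friend);
                    }
                }
            }
        }
        return components;
    }

    public static void main(String[] args) {
        // Employees 1..5 with friendships 1-2 and 2-3: {1,2,3}, {4}, {5}.
        int[][] friendships = { {1, 2}, {2, 3} };
        System.out.println(countComponents(5, friendships)); // prints 3
    }
}
```

To use this with the input format from the question, main would parse each friendship line into a pair of ints instead of hard-coding them. Each employee and each friendship is then visited a constant number of times, so the running time grows linearly with the size of the input.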