Question

I have a program that reads data from a text file. The file is formatted as

Date/Time Tip From (Name)
Message(If one was left)
(Tipped Amount)    (Total Tips Received)

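I have been able to split each record apart and put it into a Map, so that all the tips for each name are summed and printed in descending order.

E.g.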

INPUT ----------------------------------
Dec. 14, 2013, 2:31 a.m.     Tip from rs
25  24986
Dec. 14, 2013, 2:27 a.m.     Tip from ro
100 24961
Dec. 14, 2013, 2:27 a.m.     Tip from rs
15  24861
Dec. 14, 2013, 2:25 a.m.     Tip from da
3   24846
OUTPUT-----------------------------------
ro=100
rs=40
da=3

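I have now run into a problem: I am losing data and cannot figure out why. The text file contains over 1,000 tips, so roughly 2,000 lines of text. One of the tippers, X, has tipped 1990 when I add it up by hand. When I run the program it only counts 1690, which is 300 less than what was actually tipped. I am at a loss trying to debug this and find where the data could be getting dropped or skipped.

Below is the part of my code that does the calculation: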

        while ((line = bufferedReader.readLine()) != null) {
            if (line.contains("Tip from")) {
                // Finds the line that contains the tipper's name
                final String tipperName = line.substring(line.indexOf("from ") + 5);
                currentTipper = tipperName;

            } else if (line.substring(0, 1).matches("\\d")) {
                // Finds the line that contains the tipped amount
                final Integer tipValue = Integer.parseInt(line.substring(0, line.indexOf("\t")));
                // Here we store the tip in the map. If we already have a record
                // we sum, else we store as is
                tipsByName.put(currentTipper,
                        (tipsByName.get(currentTipper) == null ? 0 : tipsByName.get(currentTipper))
                                + tipValue);

            } else {
                // If the line doesn't contain a name or a tip, skips to the next line
                bufferedReader.readLine();

            }
        }

If the full code would be more helpful, let me know and I will edit the post.

Thanks!

Answer 1

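I think you have duplicate entries, and that is why you are "losing data". Before inserting an entry into the map, try checking whether the key already exists.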
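
One way to make that check concrete is sketched below. It is only an illustration, not the asker's code: the class name TipTotals, the method sumTips, and the sample message line are made up for the example, and it assumes the amount line consists of two whitespace-separated integers. It tests for the key with containsKey before summing, and it lets the loop itself move past message lines. That second point may matter here, because in the question's loop the final else branch calls readLine() a second time, which would discard the line that follows a message, i.e. the amount line.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TipTotals {

    // Sums tips per name. Assumed layout (from the question): a "Tip from <name>"
    // line, an optional message line, then "<amount> <running total>".
    static Map<String, Integer> sumTips(List<String> lines) {
        Map<String, Integer> tipsByName = new HashMap<String, Integer>();
        String currentTipper = null;

        for (String line : lines) {
            if (line.contains("Tip from")) {
                // Remember whose tip the next amount line belongs to.
                currentTipper = line.substring(line.indexOf("from ") + 5).trim();
            } else if (currentTipper != null && line.matches("\\d+\\s+\\d+\\s*")) {
                // Amount line: the first integer is the tip (tabs or spaces accepted).
                int tipValue = Integer.parseInt(line.split("\\s+")[0]);
                if (tipsByName.containsKey(currentTipper)) {
                    // Key already present: add to the existing total.
                    tipsByName.put(currentTipper, tipsByName.get(currentTipper) + tipValue);
                } else {
                    // First tip seen for this name.
                    tipsByName.put(currentTipper, tipValue);
                }
            }
            // Any other line (message, blank) is ignored. The for loop advances
            // by itself, so no extra read is needed and no line is skipped.
        }
        return tipsByName;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Dec. 14, 2013, 2:31 a.m.     Tip from rs",
                "thanks!",                      // hypothetical message line
                "25\t24986",
                "Dec. 14, 2013, 2:27 a.m.     Tip from ro",
                "100 24961",
                "Dec. 14, 2013, 2:27 a.m.     Tip from rs",
                "15  24861",
                "Dec. 14, 2013, 2:25 a.m.     Tip from da",
                "3   24846");
        System.out.println(sumTips(lines));
    }
}
```

With the sample input plus one message line, rs still totals 40. A message that itself looks like two integers would be misread as an amount; the sketch does not handle that case.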

Answer 2

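As a side note: rather than writing your own code to parse the text file, since the fields are separated by a fixed delimiter, try the OsterMillerUtils CSVParser to save yourself some coding hassle: http://ostermiller.org/utils/javadoc/CSVParser.html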

Answer 3

You can use a regular expression to parse the whole file (Java 7):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {

    public static <K, V extends Comparable<? super V>> Map<K, V>
    sortByValueDescending( Map<K, V> map )
    {
        List<Map.Entry<K, V>> list =
                new LinkedList<Map.Entry<K, V>>( map.entrySet() );
        Collections.sort( list, new Comparator<Map.Entry<K, V>>()
        {
            public int compare( Map.Entry<K, V> o1, Map.Entry<K, V> o2 )
            {
                return (o2.getValue()).compareTo( o1.getValue() );
            }
        } );

        Map<K, V> result = new LinkedHashMap<K, V>();
        for (Map.Entry<K, V> entry : list)
        {
            result.put( entry.getKey(), entry.getValue() );
        }
        return result;
    }

    static String readFile(String path, Charset encoding)
            throws IOException
    {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        return encoding.decode(ByteBuffer.wrap(encoded)).toString();
    }

    public static void main(String[] args) throws IOException
    {
        //String sourceString = readFile(args[1], Charset.defaultCharset());
        String sourceString = "Dec. 14, 2013, 2:31 a.m.     Tip from rs\n" +
                "25  24986\n" +
                "Dec. 14, 2013, 2:27 a.m.     Tip from ro\n" +
                "100 24961\n" +
                "Dec. 14, 2013, 2:27 a.m.     Tip from rs\n" +
                "15  24861\n" +
                "Dec. 14, 2013, 2:25 a.m.     Tip from da\n" +
                "3   24846";

        Pattern re = Pattern.compile("^[\\w]{3}\\.\\s\\d{1,2},\\s\\d{4},\\s\\d{1,2}:\\d{2}\\s[ap]\\.m\\.\\s+Tip\\sfrom\\s(\\w+)\\s*^(\\d+)\\s+\\d+\\s*"
                ,Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

        Map<String,Integer> tips = new HashMap<String, Integer>();

        Matcher m = re.matcher(sourceString);
        while (m.find()){
            String server = m.group(1);
            Integer tip = Integer.parseInt(m.group(2));

            Integer serverTips = tips.get(server);
            if(serverTips == null) serverTips = 0;
            serverTips +=  tip;

            tips.put(server, serverTips);
        }

        Map<String,Integer> sortedTips = sortByValueDescending(tips);
        for(Map.Entry<String,Integer> entry : sortedTips.entrySet())
        {
            System.out.println(entry.getKey()+"="+ entry.getValue());
        }
    }
}

Output:

ro=100
rs=40
da=3

You can replace args[1] with the path to your file, and change the encoding if the file does not use the default one.
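
For instance, reading the file with an explicit encoding might look like the sketch below. The class name ReadTipsFile, the fallback file name tips.txt, and the choice of UTF-8 are assumptions for illustration. Note that when the program is started as "java ReadTipsFile somefile.txt", the path arrives as args[0], the first command-line argument.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class ReadTipsFile {

    // Reads the whole file into a String using the given encoding.
    static String readFile(String path, Charset encoding) throws IOException {
        return new String(Files.readAllBytes(Paths.get(path)), encoding);
    }

    public static void main(String[] args) throws IOException {
        // "tips.txt" is a hypothetical default used when no argument is given.
        String path = args.length > 0 ? args[0] : "tips.txt";
        // UTF-8 is an assumption; pass whichever charset the file really uses.
        String sourceString = readFile(path, StandardCharsets.UTF_8);
        System.out.println(sourceString.length() + " characters read");
    }
}
```

The resulting string can then be handed to re.matcher(...) in place of the hard-coded sample text.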

Java - Losing data, Map containing String and Integer
