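I have a program with a hash map whose keys are tuples (representing sentences read from an input file) and whose values are integers (the number of times each sentence was observed in the data). The map can be populated, but my attempts to print its contents do not work. It is populated inside the 'for' loops in the code below, and it is printed at the bottom of that snippet.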
public static void main(String[] args) throws IOException
{
    Ontology ontology = new Ontology();
    BufferedReader br = new BufferedReader(new FileReader("/home/matthias/Workbench/SUTD/2_January/learning_first-order_horn_clauses_from_web_text/reverb/code/input_data/stackoverflow_test.txt"));
    Pattern p = Pattern.compile("'(.*?)'\\('(.*?)',\\s*'(.*?)'\\)\\.");
    String line;
    while ((line = br.readLine()) != null)
    {
        Matcher m = p.matcher(line);
        if( m.matches() )
        {
            String verb = m.group(1);
            String object = m.group(2);
            String subject = m.group(3);
            ontology.addSentence( new Sentence( verb, object, subject ) );
        }
    }
    for( String joint: ontology.getJoints() )
    {
        for( Integer subind: ontology.getSubjectIndices( joint ) )
        {
            Sentence xaS = ontology.getSentence( subind );
            for( Integer obind: ontology.getObjectIndices( joint ) )
            {
                Sentence yOb = ontology.getSentence( obind );
                Sentence s = new Sentence( xaS.getVerb(),
                                           xaS.getObject(),
                                           yOb.getSubject() );
                //System.out.println( s );
                ontology.numberRules( s );
            }
        }
    }
    for (Map.Entry<Sentence, Integer> entry : ontology.numberRules.entrySet())
    {
        System.out.println(entry.getKey()+" : "+entry.getValue());
    }
}
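The hash map is implemented at the bottom of the following file. This file also takes the input sentences and searches for overlapping values among the sentences' subjects and objects. What the system is trying to do is learn "rules" by inference from the input data, i.e. contains(vitamin c, oranges) and prevents(scurvy, vitamin c) would produce the output prevents(scurvy, oranges). The thing is, my test data contains many identical rules, so I want to keep track of how many times each was observed while storing only one copy of each unique "rule". That is why the hash map stores sentences as keys and integers (counts) as values.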
private List<Sentence> sentences = new ArrayList<>();
/*
 * The following maps store the relation of a string occurring
 * as a subject or object, respectively, to the list of Sentence
 * ordinals where they occur.
 */
private Map<String,List<Integer>> subject2index = new HashMap<>();
private Map<String,List<Integer>> object2index = new HashMap<>();
/*
 * This set contains strings that occur as both,
 * subject and object. This is useful for determining strings
 * acting as an in-between connecting two relations.
 */
private Set<String> joints = new HashSet<>();
public void addSentence( Sentence s )
{
    // add Sentence to the list of all Sentences
    sentences.add( s );
    // add the Subject of the Sentence to the map mapping strings
    // occurring as a subject to the ordinal of this Sentence
    List<Integer> subind = subject2index.get( s.getSubject() );
    if( subind == null )
    {
        subind = new ArrayList<>();
        subject2index.put( s.getSubject(), subind );
    }
    subind.add( sentences.size() - 1 );
    // add the Object of the Sentence to the map mapping strings
    // occurring as an object to the ordinal of this Sentence
    List<Integer> objind = object2index.get( s.getObject() );
    if( objind == null )
    {
        objind = new ArrayList<>();
        object2index.put( s.getObject(), objind );
    }
    objind.add( sentences.size() - 1 );
    // determine whether we've found a "joining" string
    if( subject2index.containsKey( s.getObject() ) )
    {
        joints.add( s.getObject() );
    }
    if( object2index.containsKey( s.getSubject() ) )
    {
        joints.add( s.getSubject() );
    }
}
public Collection<String> getJoints()
{
    return joints;
}
public List<Integer> getSubjectIndices( String subject )
{
    return subject2index.get( subject );
}
public List<Integer> getObjectIndices( String object )
{
    return object2index.get( object );
}
public Sentence getSentence( int index )
{
    return sentences.get( index );
}
//map to store learned 'rules'
Map<Sentence, Integer> ruleCount = new HashMap<>();
//store data
public void numberRules(Sentence sentence)
{
    if (!ruleCount.containsKey(sentence))
    {
        ruleCount.put(sentence, 0);
    }
    ruleCount.put(sentence, ruleCount.get(sentence) + 1);
}
This is the object that stores the sentences.
public class Sentence
{
    private String verb;
    private String object;
    private String subject;
    public Sentence(String verb, String object, String subject )
    {
        this.verb = verb;
        this.object = object;
        this.subject = subject;
    }
    public String getVerb()
    {
        return verb;
    }
    public String getObject()
    {
        return object;
    }
    public String getSubject()
    {
        return subject;
    }
    public String toString()
    {
        return verb + "(" + object + ", " + subject + ").";
    }
}
The input data looks like this:
'prevents'('scurvy','vitamin C').
'contains'('vitamin C','orange').
'contains'('vitamin C','sauerkraut').
'isa'('fruit','orange').
'improves'('health','fruit').
I would like the output to tell me, for example:
prevents(scurvy, orange). 2
prevents(scurvy, sauerkraut). 4
improves(health, orange). 1
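where the sentence is the key of the hash map and the integer is the associated value, corresponding to the number of times the sentence was observed in the data.
Answer 0 (score: 2)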
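I don't see a numberRules member in your Ontology class; numberRules is a method there, not a field.
Perhaps you meant to use the ruleCount member, which is the only variable of type Map<Sentence, Integer> that I see in your code.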
for (Map.Entry<Sentence, Integer> entry : ontology.ruleCount.entrySet())
{
    System.out.println(entry.getKey()+" : "+entry.getValue());
}
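If you would rather not reach into the field from main, one option is to keep the map private and hand out a read-only view. The following is only a sketch under stated assumptions: RuleCounter is a hypothetical, trimmed-down stand-in for your Ontology class, getRuleCounts() is a made-up accessor name, and String keys replace Sentence so the example stays self-contained. It also uses Map.merge as a shorter way to do the same counting as your numberRules method.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the Ontology class; only the counting part is shown.
class RuleCounter
{
    // LinkedHashMap is used here only so that printing follows insertion order.
    private final Map<String, Integer> ruleCount = new LinkedHashMap<>();

    // Same job as numberRules() in the question: start at 1, then add 1 per sighting.
    public void numberRules(String rule)
    {
        ruleCount.merge(rule, 1, Integer::sum);
    }

    // Hypothetical accessor: callers can iterate over the counts but not modify them.
    public Map<String, Integer> getRuleCounts()
    {
        return Collections.unmodifiableMap(ruleCount);
    }
}

class RuleCounterDemo
{
    public static void main(String[] args)
    {
        RuleCounter counter = new RuleCounter();
        counter.numberRules("prevents(scurvy, orange).");
        counter.numberRules("prevents(scurvy, orange).");
        counter.numberRules("improves(health, orange).");
        for (Map.Entry<String, Integer> entry : counter.getRuleCounts().entrySet())
        {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

With an accessor like this, the print loop in main would iterate over ontology.getRuleCounts().entrySet() instead of touching the field.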
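Regarding Hector's comment, that is a different issue. When you use one of your custom classes as a key in a HashMap (the Sentence class in your case), you must override equals and hashCode. If you don't, a.equals(b) returns true only when a==b, which is probably not the behavior you want. You probably want a.equals(b) to return true when the verb, object and subject of the two compared sentences are respectively equal. hashCode should be implemented so that whenever a.equals(b) is true, a.hashCode() == b.hashCode().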
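As a rough illustration of that contract, here is one way the Sentence class could look with both methods overridden. It is a sketch rather than the only correct implementation: it assumes all three fields take part in equality, makes the fields final (not in your original), omits the getters for brevity, and uses java.util.Objects helpers. SentenceKeyDemo is a hypothetical driver class added only to show the effect on a HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Sketch of the Sentence class with value-based equality added.
final class Sentence
{
    private final String verb;
    private final String object;
    private final String subject;

    Sentence(String verb, String object, String subject)
    {
        this.verb = verb;
        this.object = object;
        this.subject = subject;
    }

    // Two sentences are equal when verb, object and subject all match.
    @Override
    public boolean equals(Object other)
    {
        if (this == other) return true;
        if (!(other instanceof Sentence)) return false;
        Sentence that = (Sentence) other;
        return Objects.equals(verb, that.verb)
            && Objects.equals(object, that.object)
            && Objects.equals(subject, that.subject);
    }

    // Built from the same fields as equals, so equal sentences share a hash code.
    @Override
    public int hashCode()
    {
        return Objects.hash(verb, object, subject);
    }

    @Override
    public String toString()
    {
        return verb + "(" + object + ", " + subject + ").";
    }
}

// Hypothetical driver: two distinct but equal Sentence objects hit the same map entry.
class SentenceKeyDemo
{
    public static void main(String[] args)
    {
        Map<Sentence, Integer> ruleCount = new HashMap<>();
        ruleCount.merge(new Sentence("prevents", "scurvy", "orange"), 1, Integer::sum);
        ruleCount.merge(new Sentence("prevents", "scurvy", "orange"), 1, Integer::sum);
        System.out.println(ruleCount); // a single entry whose count is 2
    }
}
```

Without the two overrides, the same driver would end up with two separate entries, each with a count of 1, because every new Sentence would be treated as a distinct key.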