RandomAccessFile Question
Hey,
I'm having some trouble with the RandomAccessFile.
I have already created my Terms.Offset and Postings.Data and now I'm trying to get the cosine scoring to work.
I'm trying to fetch the postings list from the Postings.Data file for the term. I've got something like:
String currentTerm = "Jordan";
int postingOffset = TermsOffsetTable.get(currentTerm);
postings.seek(postingOffset);
int numOfDocs = postings.readInt();
Here TermsOffsetTable is a HashMap<String, Integer>, and it returns the correct offset value.
numOfDocs is returning an outrageous number, like 825505077, when it should actually be returning 145583 (I have a very small test file, and can look directly at the data).
I tried numerous RandomAccessFile methods, but none of them returns the correct answer.
Other info:
Terms.Offset and Postings.Data are on one line (a very long line). Is this going to be a problem?
Thanks.
-
Jordan, when you get an unexpected integer, it means that you have written the data incorrectly. Please post your write snippet here as well, so that we can check it.
What do you mean by one line? These are binary files, and there is no notion of a line in them. -
I must be writing the data incorrectly.
What the Terms.Offset file looks like shouldn't matter, since it is just loaded into memory.
Terms.Offset has, for example: "Jordan",0
where "Jordan" is the term and 0 is the offset.
Postings.Data looks like:
145583,3,56,4,5,5,15,6,1,7,2,8,22,9,11,10,1,13,1,14,5,18,2 ......... and goes on for a looong time.
Looking at the posting, we start at the beginning and see 145583. I would assume that RandomAccessFile.readInt() would return that number.
To write my file, I use a FileWriter.
Google says:
FileOutputStream file_output = new FileOutputStream (file);
DataOutputStream data_out = new DataOutputStream (file_output);
data_out.writeInt(value); // value: the int to write
Should I be doing it this way? I am not familiar with "Binary Files".
Thanks. -
No, you should use the same RandomAccessFile class for writing to the file. Check my slides. I have examples for both reading from and writing to RandomAccessFiles.
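For readers without the slides, here is a minimal sketch of what writing and reading the postings with the same class might look like. It is not the example from the slides: the class name PostingsRoundTrip, the helper methods, and the temporary file are all assumptions made for illustration. Only RandomAccessFile, seek, writeInt, and readInt come from the thread and the standard library.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
class PostingsRoundTrip {
    // Appends each value as a 4-byte big-endian int and
    // returns the byte offset at which the first value was written.
    static long writePostings(RandomAccessFile postings, int[] values) throws IOException {
        long offset = postings.length();
        postings.seek(offset);
        for (int v : values) {
            postings.writeInt(v);
        }
        return offset;
    }

    // Seeks to the stored offset and reads the first int of the postings list.
    static int readCount(RandomAccessFile postings, long offset) throws IOException {
        postings.seek(offset);
        return postings.readInt();
    }

    // Writes the values to a temporary file and reads the first one back.
    static int roundTrip(int[] values) {
        try {
            File file = File.createTempFile("Postings", ".Data");
            file.deleteOnExit();
            try (RandomAccessFile postings = new RandomAccessFile(file, "rw")) {
                long offset = writePostings(postings, values);
                return readCount(postings, offset);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The first few values of the postings list quoted in the thread.
        System.out.println(roundTrip(new int[] {145583, 3, 56, 4, 5})); // prints 145583
    }
}
```

The offset returned by writePostings is the value that would go into the in-memory offset table for the term, so that a later seek lands on the first int of that term's postings.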
-
Right. I didn't make the connection that I should be doing that for Postings.Data as well. But it worked! Thanks.
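As a footnote on where 825505077 came from: a likely explanation is that readInt() consumed the first four characters of the text file, "1455", as four raw bytes (0x31, 0x34, 0x35, 0x35) and combined them into one big-endian int. The sketch below checks that arithmetic. It uses DataInputStream over an in-memory byte array instead of a real file, which is an assumption made to keep it self-contained; its readInt() follows the same DataInput contract as RandomAccessFile.readInt(). The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class TextVersusBinary {
    // Interprets the first four bytes of a text string the way readInt() would.
    static int readIntFromText(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The start of Postings.Data as it was written by the FileWriter.
        System.out.println(readIntFromText("145583,3,56")); // prints 825505077
    }
}
```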