In this exercise, you will create a simplified Lucene index. To get partial credit in case of miscalculations, please give detailed solutions.

Given the following documents:

D1: You say “goodbye”, I say “hello, hello, hello”
D2: You say stop, I say go.
D3: “Hello, hello, hello,” you say “goodbye”.
D4: I say yes, you say no

1. (4 points) Build the inverted index for the documents.

a. Dictionary file, e.g.

Term    DocFreq
hello   2
I       3

b. Posting file (terms are implicit), e.g.

Doc #   Frequency
1       3
3       3

c. Position file (terms are implicit from the dictionary file; use the absolute position of terms in the document), e.g.

D1      D2   D3      D4
6,7,8   0    1,2,3   0
4       4    0       1

d. For a given query Q: say goodbye, describe the process to search the inverted index.

2. (2 points)

a. Estimate the total size of the inverted index files in bytes. Numbers and characters are counted as 4 bytes. Strings are counted as the number of characters multiplied by 4 bytes. For example, the size of the string “hello” is 5*4 = 20 bytes.

b. Compare the result from 2a to the total size of the documents in bytes.

. . .
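As an illustration of the three files in part 1 and the lookup in part 1d, here is a minimal Python sketch. It is not Lucene's actual implementation or file format. Several things are assumptions that the exercise does not spell out: the names `DOCS`, `tokenize`, `build_index`, and `search` are hypothetical; terms are lowercased and stripped of punctuation (so "I" is stored as "i"); positions are 1-based, matching the examples; and the query is treated as a conjunction (every term must occur in the document).

```python
import re
from collections import defaultdict

# The four documents from the exercise (straight quotes used for simplicity).
DOCS = {
    1: 'You say "goodbye", I say "hello, hello, hello"',
    2: "You say stop, I say go.",
    3: '"Hello, hello, hello," you say "goodbye".',
    4: "I say yes, you say no",
}


def tokenize(text):
    # Assumption: case folding and punctuation removal, so "Hello," -> "hello".
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    # Position file: term -> {doc_id: [1-based positions]}
    positions = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text), start=1):
            positions[term].setdefault(doc_id, []).append(pos)

    # Dictionary file: term -> number of documents containing the term.
    dictionary = {term: len(by_doc) for term, by_doc in positions.items()}

    # Posting file: term -> {doc_id: term frequency in that document}.
    postings = {
        term: {doc_id: len(pos_list) for doc_id, pos_list in by_doc.items()}
        for term, by_doc in positions.items()
    }
    return dictionary, postings, positions


def search(query, postings):
    # Look up each query term, fetch its posting list, and intersect the
    # document sets (assumed AND semantics; a missing term yields no hits).
    doc_sets = [set(postings.get(term, {})) for term in tokenize(query)]
    return sorted(set.intersection(*doc_sets)) if doc_sets else []


dictionary, postings, positions = build_index(DOCS)
print(dictionary["hello"], postings["hello"], positions["hello"])
print(search("say goodbye", postings))  # [1, 3]
```

Under these assumptions, "say" occurs in all four documents and "goodbye" only in D1 and D3, so the intersection for Q is D1 and D3. A phrase query would additionally check the position lists for adjacent positions; that step is left out here.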