In the Document Distance problem from the first two lectures, we compared two documents by counting the words in each, treating these counts as vectors, and computing the angle between these two vectors. For this problem, we will change the Document Distance code to use a new metric. Now we will only care about words that show up in both documents, and we will ignore the contributions of words that show up in only one document.

Download ps1.py, docdist7.py, and test-ps1.py from the class Sakai site. docdist7.py is mostly the same as the docdist6.py seen in class; however, it does not implement vector_angle or inner_product. Instead, it imports those functions from ps1.py. Currently, ps1.py contains code copied straight from docdist6.py, but you will need to modify this code to implement the new metric.

• Modify inner_product to take a third argument, domain, which will be a set containing the words that appear in both texts. Modify the code so that it only increases sum if the word is in domain. Don't forget to change the documentation string at the top.

• Modify vector_angle so that it creates sets of the words in both L1 and L2, takes their intersection, and uses that intersection when calling inner_product. Again, don't forget to change the docstring at the top.

Run test-ps1.py to make sure your modified code works. The same test suite will be run when you submit ps1.py to the class Sakai site.

Does your code take significantly longer with the new metric? Why or why not?

Submit ps1.py on the class Sakai site. All code submitted for this class will be checked for accuracy, asymptotic efficiency, and clarity.
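The two modifications described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in ps1.py: it assumes that L1 and L2 are lists of (word, count) pairs sorted by word and that the inner product is computed by a merge-style walk over both lists. The accumulator is named total here to avoid shadowing Python's built-in sum, and the ValueError for documents with no shared words is a choice made for this sketch.

```python
import math


def inner_product(L1, L2, domain):
    """Return the inner product of two frequency vectors, restricted to domain.

    L1 and L2 are assumed to be lists of (word, count) pairs sorted by word.
    Only words that are members of the set `domain` contribute to the result.
    """
    total = 0.0  # the handout calls this accumulator "sum"
    i = 0
    j = 0
    while i < len(L1) and j < len(L2):
        word1, count1 = L1[i]
        word2, count2 = L2[j]
        if word1 == word2:
            # The word is in both lists; count it only if it is in domain.
            if word1 in domain:
                total += count1 * count2
            i += 1
            j += 1
        elif word1 < word2:
            i += 1  # word1 is missing from L2, so skip it
        else:
            j += 1  # word2 is missing from L1, so skip it
    return total


def vector_angle(L1, L2):
    """Return the angle (in radians) between L1 and L2 over their shared words.

    The domain is the intersection of the word sets of L1 and L2, so words
    that appear in only one document are ignored in every inner product.
    """
    domain = {word for word, _ in L1} & {word for word, _ in L2}
    numerator = inner_product(L1, L2, domain)
    denominator = math.sqrt(
        inner_product(L1, L1, domain) * inner_product(L2, L2, domain)
    )
    if denominator == 0.0:
        # Assumption for this sketch: treat "no shared words" as an error.
        raise ValueError("the documents share no words")
    # Clamp to 1.0 so floating-point rounding cannot push acos out of range.
    return math.acos(min(1.0, numerator / denominator))
```

Building the two word sets and intersecting them takes expected time linear in the number of distinct words, and each membership test on a set takes expected constant time, so in this sketch the added work is of the same order as the merge walk itself.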