When indexing text based on word frequency and relevance, as may be applicable to web searches, one of the procedures used is to create a term frequency (tf) array followed by an inverse document frequency (idf) array. You can read more about this here.
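To make the tf and idf steps a bit more concrete, here is a minimal sketch in Java. It is not the code from the full post. The class name `TfIdfSketch`, the method names, and the sample documents are all made up for illustration. It assumes one common variant of the weighting: tf is the word count divided by the document length, and idf is the natural log of the number of documents divided by the number of documents that contain the word. Other variants (smoothing, different log bases) are just as valid.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

public class TfIdfSketch {

    // Term frequency (assumed variant): count of each word / total words in the document.
    static Map<String, Double> termFrequency(List<String> words) {
        Map<String, Double> tf = new HashMap<>();
        for (String w : words) {
            tf.merge(w, 1.0, Double::sum);          // raw count per word
        }
        for (Map.Entry<String, Double> e : tf.entrySet()) {
            e.setValue(e.getValue() / words.size()); // normalize by document length
        }
        return tf;
    }

    // Inverse document frequency (assumed variant): ln(N / df), where N is the
    // number of documents and df is how many documents contain the word.
    static Map<String, Double> inverseDocumentFrequency(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String w : new HashSet<>(doc)) {    // count each word once per document
                df.merge(w, 1, Integer::sum);
            }
        }
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            idf.put(e.getKey(), Math.log((double) docs.size() / e.getValue()));
        }
        return idf;
    }

    public static void main(String[] args) {
        // Hypothetical, already preprocessed documents (lowercase, no punctuation).
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("the", "cat", "sat"),
                Arrays.asList("the", "dog"));

        Map<String, Double> idf = inverseDocumentFrequency(docs);
        for (List<String> doc : docs) {
            Map<String, Double> tf = termFrequency(doc);
            for (String w : tf.keySet()) {
                // tf-idf weight of word w in this document
                System.out.println(doc + " " + w + " " + tf.get(w) * idf.get(w));
            }
        }
    }
}
```

With this variant, a word that appears in every document gets an idf of zero, so its tf-idf weight is zero no matter how often it occurs. That is the effect that pushes very common words down in relevance.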
In a previous post I experimented with some text in order to build hashmaps from the words of sentences (kept small to keep things in perspective for a blog post). In that post I used a string that I copied from a course I took some years ago. The string was already preprocessed: the text had been stripped of punctuation marks. Continue reading “More than a List of Words”
This blog entry is based on the example on page 141 of Algorithms, Fourth Edition, by Robert Sedgewick and Kevin Wayne. I am reading the book as a refresher and learning experience on computer algorithms. Initially, what caught my attention was the statement that it contains 50 algorithms that every programmer should know. I want to make sure I have a good handle on those algorithms. After reading the first few chapters, which have not covered the algorithms yet, I feel that the way the groundwork is presented is very educational, to the point, and simple to follow. I always like simplicity in code and documentation. Elegant code is very hard to find and write. This book seems like it should help readers achieve that goal. Continue reading “Generics Implementation”