Which words matter most in presidential speeches?

TFIDF : Data Science Concepts

  1. This is a great explanation. Thanks.
    I have a question about differences between the implementation described in this video and another implementation commonly found on the web.
    Can you explain how these two details would impact the final representation:
    1) Term frequency simply calculated as term count
    2) Applying vector normalisation (L2) to the document vector obtained in this video

    Another, more open-ended question: why is TfIdf still relevant? Or, less provocatively, is there a sweet spot where one would prefer TfIdf over modern dense vector representations (such as word2vec, doc2vec, etc.)?

  2. But if "healthcare" appears 100 times in one document and only once in each of the other 2 documents, then the result will be zero!

  3. Excellent teaching! Perfectly designed, clearly explained and not even one sentence that would be redundant. I’m your fan my friend 👍🏼🙏🏼

  4. How do you model multiple objects associated with a term class, e.g. Dental Care: United Health Care, Blue Shield, …, by state? This becomes contextual and local within the text: how close is the term "dental care" to UHC, for instance? The result would show which states address dental care in their health insurance regulations and which insurance companies make it available, both positively and negatively. I understand that this is a narrow example. Thanks.
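
The points raised in comments 1 and 2 can be checked with a small sketch. It assumes the video defines term frequency as count divided by document length and inverse document frequency as the unsmoothed log(N / df); those definitions are an assumption here, as are the toy corpus and the helper name `tf_idf`, which are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs, raw_counts=False, l2_normalize=False):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Hypothetical helper, not a library API.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        vec = {}
        for term, count in Counter(doc).items():
            # Comment 1, point 1: raw count versus count / document length.
            tf = count if raw_counts else count / len(doc)
            # Unsmoothed IDF (assumed): exactly 0 when a term is in every document.
            idf = math.log(n_docs / df[term])
            vec[term] = tf * idf
        if l2_normalize:
            # Comment 1, point 2: scale the document vector to unit length.
            norm = math.sqrt(sum(w * w for w in vec.values()))
            if norm > 0:
                vec = {t: w / norm for t, w in vec.items()}
        vectors.append(vec)
    return vectors


# Toy corpus mirroring comment 2: "healthcare" occurs in all three documents.
docs = [
    ["healthcare"] * 100 + ["tax", "tax", "jobs"],
    ["healthcare", "jobs", "jobs"],
    ["healthcare", "war"],
]

plain = tf_idf(docs)
rel_l2 = tf_idf(docs, raw_counts=False, l2_normalize=True)
raw_l2 = tf_idf(docs, raw_counts=True, l2_normalize=True)

print(plain[0]["healthcare"])                 # zero, as comment 2 predicts
print(rel_l2[0]["tax"], raw_l2[0]["tax"])     # the two values agree
```

Under these assumptions, comment 2 is right: a term present in every document gets IDF log(3/3) = 0, so its weight is zero no matter how often it occurs in one of them. For comment 1, dividing by document length only rescales each document vector by a constant, so once L2 normalisation is applied the raw-count and relative-frequency variants give the same vector. Some implementations smooth the IDF; scikit-learn's `TfidfVectorizer`, for example, defaults to ln((1 + N) / (1 + df)) + 1 with raw counts and L2 normalisation, so a term found in every document keeps a small nonzero weight there.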

