Solving the `Top k most frequent words` problem using a max-heap

Open aparibocci opened this issue 2 years ago • 0 comments

Describe your change:

This PR adds an algorithm that finds the top k most frequent strings in a given list of strings. It uses the max-heap implementation that already exists in this repository (a generic type was introduced to allow this usage).
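For reviewers who want a quick picture of the approach, here is a minimal sketch. It is not the code in this PR: it swaps in the standard library's `heapq` (a min-heap, used with negated counts) in place of the repository's max-heap class. The function name `top_k_frequent_words` and the alphabetical tie-breaking are assumptions made for illustration.

```python
from collections import Counter
import heapq


def top_k_frequent_words(words: list[str], k: int) -> list[str]:
    """Return up to k words, ordered from most to least frequent.

    Illustrative sketch only: uses heapq with negated counts as a
    stand-in for a max-heap. Ties are broken alphabetically, which is
    an assumption of this sketch.
    """
    counts = Counter(words)  # one pass over the input to count frequencies
    # heapq is a min-heap, so store negated counts to pop the largest first.
    heap = [(-count, word) for word, count in counts.items()]
    heapq.heapify(heap)  # linear-time heap construction
    # Pop at most k entries; each pop costs logarithmic time.
    return [heapq.heappop(heap)[1] for _ in range(min(k, len(heap)))]
```

For example, `top_k_frequent_words(["a", "b", "c", "a", "c", "c"], 2)` yields `["c", "a"]`. If k exceeds the number of distinct words, the sketch simply returns all of them.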

Time complexity is O(n + k log n), where n is the number of words:

  • O(n) for building the max-heap
  • O(k log n) for extracting the k most frequent strings

  • [x] Add an algorithm?
  • [ ] Fix a bug or typo in an existing algorithm?
  • [ ] Documentation change?

Checklist:

  • [x] I have read CONTRIBUTING.md.
  • [x] This pull request is all my own work -- I have not plagiarized.
  • [x] I know that pull requests will not be merged if they fail the automated tests.
  • [x] This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • [x] All new Python files are placed inside an existing directory.
  • [x] All filenames are in all lowercase characters with no spaces or dashes.
  • [x] All functions and variable names follow Python naming conventions.
  • [x] All function parameters and return values are annotated with Python type hints.
  • [x] All functions have doctests that pass the automated testing.
  • [x] All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • [x] If this pull request resolves one or more open issues then the commit message contains Fixes: #{$ISSUE_NO}.

aparibocci, Feb 07 '23 19:02