Discussions about how to organize further maintenance of this library

Open k00ni opened this issue 5 years ago • 42 comments

Based on the latest commit in master (over a year old) and the 16 pending pull requests, I assume @smalot is no longer maintaining this library. That's fine, he will have his reasons.

In this issue I would like to discuss where the community around this library should continue the work. Some people have already developed their own forks, for instance:

  • @amooij with https://github.com/amooij/pdfparser
  • @gyselroth with https://github.com/gyselroth/pdfparser
  • ~~@lausek with https://github.com/lausek/pdfparser~~
  • @limweb with https://github.com/limweb/pdfparser

Found them using https://github.com/smalot/pdfparser/network

Any ideas?


EDIT: Removed fork from @lausek (ref).

k00ni avatar Apr 15 '20 10:04 k00ni

My fork does not contain any changes of value. Sorry.

lausek avatar Apr 15 '20 11:04 lausek

We actively use this library for internal projects and we've fixed several issues, but we don't have the resources to become the lead maintainer. We are more than willing to share all our work.

amooij avatar Apr 15 '20 17:04 amooij

That sounds great! I can support with ~1 hour per week to help with organizing issues, tests and pull requests.

@smalot it would be nice to hear what you plan / think.

k00ni avatar Apr 16 '20 07:04 k00ni

I moved to a paid service, setasign.com. It does everything I need as a reader and so much more. Maybe that is an option?

NoxxieNl avatar Apr 16 '20 18:04 NoxxieNl

Thank you for the tip, but I don't think that this is an option for everyone. For me, the current state of pdfparser works decently, but switching could be an option, sure.

k00ni avatar Apr 17 '20 08:04 k00ni

You could go for the free version, which supports PDFs up to v1.4 (everything above that is in the paid service). FPDI is the package; for reading it works great, and it also integrates with a lot of writer packages.

It is the better alternative for my taste, and those packages are better maintained. I'm not saying you should move away, but a PDF parser is a lot of work... :)

Just my 2 cents...

NoxxieNl avatar Apr 17 '20 13:04 NoxxieNl

Hi all, indeed, I'm not as available as before. I'll try to validate some PRs or give feedback if needed. Thanks

smalot avatar Apr 21 '20 06:04 smalot

@smalot maybe you can add one or two maintainers so they will be able to help you.

j0k3r avatar May 18 '20 05:05 j0k3r

Hi @j0k3r, I noticed you work at 20 minutes. Would you be interested in working on this project?

smalot avatar May 25 '20 07:05 smalot

Don't forget the post from @amooij

Quote:

We actively use this library for internal projects and we've fixed several issues, but we don't have the resources to become the lead maintainer. We are more than willing to share all our work.

k00ni avatar May 25 '20 07:05 k00ni

> That sounds great! I can support with ~1 hour per week to help with organizing issues, tests and pull requests.
>
> @smalot it would be nice to hear what you plan / think.

Hi @k00ni, I just sent you an invite to work on this project, if you are still interested.

smalot avatar May 25 '20 07:05 smalot

Hi @amooij, I just sent you an invite too, in case you have time to spend on this library.

smalot avatar May 25 '20 07:05 smalot

@smalot, thank you for the invite.

I saw that you invited further people besides me. How do you want to organize this repository with you and 3+ further maintainers/helpers? Any preferences?

k00ni avatar May 25 '20 07:05 k00ni

Maybe you should first require at least one approving review before a PR can be merged.

That means at least one person has checked the code, and if a maintainer submits a PR, at least two maintainers have seen the code.

j0k3r avatar May 25 '20 08:05 j0k3r

This library is widely used and followed. I don't know if TDD is a good solution, but in the past I saw PRs with a bad approach, such as using strpos instead of preg_match, which could break some syntax patterns or cause other issues. I tried to cover this with some useful unit tests. So I would appreciate it if we keep quality as the first goal.
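To illustrate the kind of problem a plain substring search can cause, here is a minimal sketch. The two helper functions and the sample dictionaries are hypothetical and are not part of pdfparser; they only show how strpos can report a match inside a longer PDF name such as /Type1, while an anchored preg_match pattern does not.

```php
<?php

// Hypothetical helpers for illustration only; they are not part of pdfparser.

/** Naive check: true whenever the substring "/Type" occurs anywhere. */
function hasTypeKeyNaive(string $dictionary): bool
{
    return strpos($dictionary, '/Type') !== false;
}

/** Stricter check: "/Type" must be a whole name token, followed by a delimiter or the end of input. */
function hasTypeKeyRegex(string $dictionary): bool
{
    return preg_match('~/Type(?=[\s/<>\[\]()]|$)~', $dictionary) === 1;
}

// Made-up sample dictionaries.
$fontDict = '<< /Subtype /Type1 /BaseFont /Helvetica >>';
$pageDict = '<< /Type /Page /Parent 3 0 R >>';

var_dump(hasTypeKeyNaive($fontDict)); // bool(true): false positive, matched inside "/Type1"
var_dump(hasTypeKeyRegex($fontDict)); // bool(false)
var_dump(hasTypeKeyRegex($pageDict)); // bool(true)
```

The regular expression is still only a sketch: it checks token boundaries but does not tell keys from values, so a real parser would need more context than this.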

@j0k3r made a good suggestion.

Currently you seem to be more aware of the real issues and have more time to spend on this library to keep it alive. In my company we usually validate PRs with :+1: or :-1: to indicate whether more work is required or we can merge. Being 3 or 4 people can create a useful debate about any change.

What do you think about it ?

smalot avatar May 25 '20 09:05 smalot

I support @j0k3r's suggestions, but it shouldn't matter who made a PR. For me at least the approval (+ review) of 1 maintainer is sufficient.

We should aim for basic but long-term oriented "rules", because every one of us most likely has a job and a life too. So I would suggest we start with:

  • new pull request: 1 approval from a maintainer is enough for a merge (provided there are no objections such as failing tests)
  • new issue: maintainers may add/remove labels, moderate and take care of things
  • new release: I'd say when needed. Any maintainer can file a draft and open an issue to discuss it.

Besides these, I would suggest that we try to set up tooling to automate as much as possible. Hound checks PRs and helps with code reviews. @smalot: Can you add support for https://coveralls.io so that we have an overview of the test coverage? Here is a good tutorial: https://kizu514.com/blog/setting-up-coveralls-io-with-travis-ci-and-phpunit/ I am sure there are other tools that can make our lives easier.

My 2 cents, what do you think?

k00ni avatar May 25 '20 12:05 k00ni

Currently, unit tests are written using Atoum (http://atoum.org/), which was quite enough until now. You can run the unit tests and code coverage with this command line:

./vendor/bin/atoum -d src/Smalot/PdfParser/Tests/

I tried to increase code coverage, but I agree that it is not a goal in itself. That's why I've included some PDF samples to test under real conditions. The latest reports are available here: https://travis-ci.org/github/smalot/pdfparser

smalot avatar May 25 '20 22:05 smalot

Is a switch to PHPUnit an option for you @smalot?

k00ni avatar May 26 '20 07:05 k00ni

I'm 👍 for moving to PHPUnit. Also, adding https://scrutinizer-ci.com/ could be a good thing.

j0k3r avatar May 26 '20 07:05 j0k3r

I've just reactivated my scrutinizer account: https://scrutinizer-ci.com/g/smalot/pdfparser

Don't hesitate to ask if you need specific settings for this tool.

smalot avatar May 26 '20 20:05 smalot

No problem with using PHPUnit instead of Atoum, but I'm not sure I'll have time to work on it. The main advantage I found in Atoum when I chose it is that it is really strict about typing. I suppose PHPUnit is a good tool too, but I'm not aware of all its capabilities.

smalot avatar May 26 '20 20:05 smalot

It would be interesting to move Tests into a specific namespace and folder declared only in "autoload-dev". I'll let you create a dedicated issue to track the work.
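As a rough sketch of what such a declaration could look like in composer.json: the namespace `Tests\Smalot\PdfParser\` and the `tests/` folder below are assumptions chosen for illustration, not the layout the project necessarily adopted.

```json
{
    "autoload-dev": {
        "psr-4": {
            "Tests\\Smalot\\PdfParser\\": "tests/"
        }
    }
}
```

Because the mapping sits under "autoload-dev" rather than "autoload", Composer only registers it for the root package, so projects that install the library as a dependency would not autoload the test classes.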

smalot avatar May 26 '20 20:05 smalot

I granted you "Developer Access" on Scrutinizer. I hope it helps you.

smalot avatar May 26 '20 21:05 smalot

Any plans on closing some issues? There are issues that were last updated over 6 years ago and questions for which an answer is no longer needed. I'm interested in helping with some issues, but there are a lot of them and it's not clear which issues need attention and which do not.

rubenvanerk avatar Jun 03 '20 11:06 rubenvanerk

Hi @rubenvanerk, I am still in the process of getting a bird's-eye view. Therefore I am focusing on recent issues and PRs and trying to get in touch with people to solve things. As you can see, not all of them responded.

The most important PRs for me are my own (#299, #300) for now, so that we have a stable foundation in the long run. I also prefer pull requests like #297, where the author is cooperative and helpful.

On the issue side: bug reports and issues about missing functionality should be handled with priority. Questions and everything else are secondary to me, due to lack of time.

If you want to help, it would be cool if you could check issues and comment on them with a status update. For instance, an answer is no longer needed, because ... or something like that. @j0k3r or I can later decide how we handle it.

What do you think?

k00ni avatar Jun 03 '20 11:06 k00ni

I think I'm just going to work through the issues from most recent to least recent. I assume you and @j0k3r are monitoring the issues from time to time? Or should I tag you when I think an issue can be closed?

rubenvanerk avatar Jun 03 '20 18:06 rubenvanerk

We monitor them :)

j0k3r avatar Jun 03 '20 20:06 j0k3r

@j0k3r and the others, what features and plans do you have or are you working on? Maybe we can coordinate up front and help each other?

Ref: #306

For me: I am good for now and don't plan anything in particular. But I am still available for issues and PRs.

k00ni avatar Jun 08 '20 10:06 k00ni

I don’t want to work on anything in particular. I just wanted to provide enough tools to ease the maintenance of the lib (phpunit, cs-fixer, phpstan, etc).

j0k3r avatar Jun 08 '20 12:06 j0k3r

I want to make a proposal: if one of the collaborators assigns himself to a PR, he will take care of the merge.

Also, a PR should be kept open for a while, so that our community has a chance to comment. Even after it is ready to be merged, a window of ~~1+ week~~ 2-3 days should be used to allow further comments. Closing it right away prevents that possibility.

Hope that is fine with you guys.

k00ni avatar Sep 19 '20 10:09 k00ni