php-mlx architecture

Open akondas opened this issue 5 years ago • 18 comments

Because it is a new library, we now have a moment to discuss architecture. php-ml represented datasets as ordinary native arrays, which led to very high memory usage. I would like to focus mainly on the architecture of datasets and how they flow between estimators, transformers, etc.

akondas avatar Jun 01 '20 21:06 akondas

@akondas hi! What do you think about using compute shaders to speed up the work?

I can provide OpenCL or Vulkan API (like this: https://github.com/BicEngine/Vulkan) for interacting with GPU

SerafimArts avatar Jun 01 '20 22:06 SerafimArts

Absolutely. I have already experimented with CUDA bindings, but I can't find that library right now. I suggest you open a separate issue :wink:

akondas avatar Jun 02 '20 06:06 akondas

Would it be possible / efficient to use "php-ds" or the spl data structures as an alternative?

I refactored an app I built from arrays to php-ds vectors and decreased memory usage by ~50%. However, going from 2GB to 1GB of memory was still too much, so I resorted to streams (which kept memory usage at ~19MB during runtime).
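A minimal sketch of the streaming idea described above, assuming the dataset is a CSV of numeric columns. The helper name `streamRows` and the in-memory demo data are illustrative assumptions, not part of php-ml, php-mlx, or php-ds. A generator yields one row at a time, so memory use depends on the size of a row rather than the size of the dataset.

```php
<?php
declare(strict_types=1);

/**
 * Lazily yields one CSV row at a time as a list of floats.
 * Hypothetical helper for illustration; not part of any library.
 *
 * @param resource $handle an open, readable stream
 */
function streamRows($handle): Generator
{
    // Length 0 means "no line-length limit"; the escape character is passed explicitly.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() returns [null] for a blank line
        }
        yield array_map('floatval', $row);
    }
}

// Demo data kept in an in-memory stream so the sketch is self-contained.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "1.0,2.0\n3.0,4.0\n5.0,6.0\n");
rewind($handle);

// Compute column means in a single pass without building a full array of rows.
$count = 0;
$sums = [0.0, 0.0];
foreach (streamRows($handle) as $row) {
    $count++;
    foreach ($row as $i => $value) {
        $sums[$i] += $value;
    }
}
fclose($handle);

$means = array_map(fn (float $sum): float => $sum / $count, $sums);
```

The trade-off is that a stream can only be read forward, so this fits single-pass statistics or incremental training better than algorithms that need random access to samples.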

ideaguy3d avatar Jun 03 '20 19:06 ideaguy3d

I would like to see a random forest implementation :heart: I'm also open to contributing to this project.

sencere avatar Jun 09 '20 15:06 sencere

Guys! With FFI on our doorstep, why don't we aim for the highest? I think PHP can be as useful as Python if we take the Python approach: using PHP as an interface for heavy computation.

hamidrezabstn avatar Oct 19 '20 20:10 hamidrezabstn

@hamidrezabstn

I think PHP can be as useful as Python

Hell yeah!

And not only for the interface, I believe.

The upcoming PHP8 major release, with the JIT enabled, is much much faster than I expected.

Some scripts are faster than Golang. And believe it or not, PHP8 with JIT enabled in Docker on a Raspberry Pi3 is faster than running locally on macOS' default PHP 7.3.11! (Thanks to @akondas' blog article)

I believe things will change from PHP 8.0 on and let us escape the recursive acronym "Programmers Hate PHP".

The only problem is that I am no good at math at all ... Σ( ̄□ ̄;)What?!

KEINOS avatar Oct 20 '20 02:10 KEINOS

architecture of datasets

I prefer SQLite3 as storage for the datasets since it's simple, lightweight, stable, and fast. But what about the .h5 format?

I think many people use HDF5 format datasets. So, it would be handy if we could use them by default, wouldn't it?

Something like h5py might help. But so far it seems there's no PHP package for HDF5, nothing like an "h5php".

The below was the only discussion I found about HDF5 in PHP.

  • How to read HDF5 files in a PHP script @ Quora

However, it reached no answer other than switching to Python. The main reason seems to be that there is no extension for it.

Since the HDF5 library is available as C source, an extension could be created in theory. Yeah, in theory...

KEINOS avatar Oct 20 '20 03:10 KEINOS

Storage is only an implementation detail. And as we know, the devil is in the details :wink:

I have two more topics to close, and then I will be able to go back to the php-mlx implementation. I am assuming that particular algorithms (classifiers or regressors) should be interfaces and have different implementations. This will allow the use of FFI, for example. I am eager to see a PoC solution like this, to be able to say whether it makes sense in general.

In addition, I would also like to focus more on the dataset itself. Enable operations like those in the Pandas library and visualization like in Jupyter Notebooks.
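To make the "algorithms as interfaces" idea concrete, here is a minimal sketch. The names `Classifier`, `NativeNearestNeighbor`, and `classify` are assumptions made up for illustration and are not the php-mlx API. Only a pure-PHP backend is shown; an FFI-backed class would implement the same interface and could be swapped in without changing the calling code.

```php
<?php
declare(strict_types=1);

// Hypothetical contract; the names are illustrative, not the php-mlx API.
interface Classifier
{
    /**
     * @param list<list<float>> $samples
     * @param list<string>      $labels
     */
    public function train(array $samples, array $labels): void;

    /** @param list<float> $sample */
    public function predict(array $sample): string;
}

// One possible backend: a pure-PHP 1-nearest-neighbour classifier.
final class NativeNearestNeighbor implements Classifier
{
    private array $samples = [];
    private array $labels = [];

    public function train(array $samples, array $labels): void
    {
        if (count($samples) !== count($labels)) {
            throw new InvalidArgumentException('Each sample needs exactly one label.');
        }
        $this->samples = array_values($samples);
        $this->labels = array_values($labels);
    }

    public function predict(array $sample): string
    {
        if ($this->samples === []) {
            throw new LogicException('train() must be called before predict().');
        }
        $best = '';
        $bestDistance = INF;
        foreach ($this->samples as $i => $known) {
            // Squared Euclidean distance is enough for ranking neighbours.
            $distance = 0.0;
            foreach ($known as $j => $value) {
                $distance += ($value - $sample[$j]) ** 2;
            }
            if ($distance < $bestDistance) {
                $bestDistance = $distance;
                $best = $this->labels[$i];
            }
        }
        return $best;
    }
}

// Calling code depends only on the interface, not on a concrete backend.
function classify(Classifier $model, array $sample): string
{
    return $model->predict($sample);
}

$model = new NativeNearestNeighbor();
$model->train([[1.0, 1.0], [5.0, 5.0]], ['a', 'b']);
$label = classify($model, [4.0, 4.5]);
```

Keeping the contract this small is a design choice for the sketch: the fewer methods the interface requires, the easier it is to provide a second implementation backed by a native library.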

I have also described my main concerns in this article: https://arkadiuszkondas.com/3-reasons-why-php-is-not-yet-perfect-for-machine-learning/

If someone wants to start now, I am happy to help.

akondas avatar Oct 20 '20 06:10 akondas

particular algorithms (classifiers or regressors) should be interfaces

So, you mean defining something like an API first, so that the backend can be switched? That makes sense.

I would also like to focus more on the dataset itself. Enable operations like those in the Pandas library and visualization like in Jupyter Notebooks.

Since I wasn't familiar with Jupyter Notebook, I implemented a PHP8-RC2 kernel for Jupyter to see how it works. And yeah, I feel PHP lacks a visualization method. We need something like matplotlib, as you mentioned in the article.

Is it reasonable to create something like matplotlib in pure PHP rather than make a parasitical wrapper of Python's matplotlib in PHP?

KEINOS avatar Oct 27 '20 04:10 KEINOS

https://arkadiuszkondas.com/3-reasons-why-php-is-not-yet-perfect-for-machine-learning/ I read the above article. I strongly agree.

I have already built these, except for the Jupyter part.

  • https://rindow.github.io/mathematics/matrix/matrix.html
  • https://rindow.github.io/mathematics/plot/overviewplot.html
  • https://rindow.github.io/mathematics/openblas/overviewopenblas.html
  • https://rindow.github.io/mathematics/acceleration/opencl.html

yuichiis avatar Jan 20 '21 08:01 yuichiis

Looks very promising. Thanks to this work there is a slight chance of entering the mainstream. But you will have to put in a lot of work that no one pays for.

Unfortunately, comments like "abandoned and disputed php-ml" (https://github.com/nextcloud/suspicious_login/pull/375) do not encourage me to work further on this library.

I am happy to accept / supervise the work (PRs), but in the near future I am not going to write the code myself. For now I have other interesting things to do.

akondas avatar Jan 25 '21 17:01 akondas

@akondas

I'm sorry to hear that. I understand that anyone can lose their drive after working so hard all alone, only to have it all taken down in no time by the subsequent project's DMCA attack. Their rashness and the DMCA takedown felt unfair to me.

No matter what they say, it's as concrete as anything else that PHP-ML was the pioneer of machine learning in PHP!

Take your time, do what you want to do. I'm always on your side dude.

KEINOS avatar Jan 26 '21 00:01 KEINOS

@akondas

I am sorry to hear about your problem. The honor of the pioneer remains unchanged. Do something that is fun for you. But I hope you will come back to writing the program in the future. I'm writing code because it's fun, too.

My hope is the spread of interoperable interfaces.

I think that a data structure for high-speed numerical calculation that is aware of SIMD instructions and the GPU is very important for PHP. I have the NDArray and LinearBuffer interfaces in the "interop-phpobjects" I'm using. That is the core of high-speed computing. If a lot of compatible libraries and PHP extensions are created, we can expect the game to change a little with respect to Python. More people should write mathematical libraries in PHP.
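As a rough illustration of why a flat buffer is the core idea, here is a minimal sketch of a matrix stored in one row-major linear buffer. The class `FlatMatrix` and its methods are hypothetical and are not the NDArray or LinearBuffer definitions from "interop-phpobjects". `SplFixedArray` only stands in for a real native buffer here, since it still holds ordinary PHP values rather than packed doubles.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration only; not the interop-phpobjects NDArray/LinearBuffer API.
final class FlatMatrix
{
    private SplFixedArray $buffer;
    private int $rows;
    private int $cols;

    public function __construct(int $rows, int $cols)
    {
        $this->rows = $rows;
        $this->cols = $cols;
        // One contiguous, fixed-size buffer instead of nested PHP arrays.
        $this->buffer = new SplFixedArray($rows * $cols);
        for ($i = 0; $i < $rows * $cols; $i++) {
            $this->buffer[$i] = 0.0;
        }
    }

    /** @return array{int, int} */
    public function shape(): array
    {
        return [$this->rows, $this->cols];
    }

    public function get(int $row, int $col): float
    {
        return $this->buffer[$this->index($row, $col)];
    }

    public function set(int $row, int $col, float $value): void
    {
        $this->buffer[$this->index($row, $col)] = $value;
    }

    // The linear buffer is what a native (BLAS, OpenCL, FFI) backend would consume.
    public function buffer(): SplFixedArray
    {
        return $this->buffer;
    }

    // Row-major layout: element (row, col) lives at row * cols + col.
    private function index(int $row, int $col): int
    {
        if ($row < 0 || $row >= $this->rows || $col < 0 || $col >= $this->cols) {
            throw new OutOfRangeException("Index ($row, $col) is outside the matrix.");
        }
        return $row * $this->cols + $col;
    }
}

$m = new FlatMatrix(2, 3);
$m->set(1, 2, 7.5);
$flat = $m->buffer()->toArray();
```

Separating the shape from the buffer means the same block of numbers can be handed to different backends without copying it into a nested structure first.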

The interface definition is independent of my library, so I aim to be able to replace and discard my library when other advanced libraries emerge.

I aim for coexistence with other programs and free competition.

yuichiis avatar Jan 26 '21 06:01 yuichiis

@yuichiis @KEINOS thanks for the words of support. Don't get me wrong, I'm not quitting open source. I just have to complain out loud sometimes.

I admit that https://rindow.github.io/mathematics/index.html encourages me to go back to php-mlx. Plus as a new project, we always have the advantage that we can design everything from scratch.

akondas avatar Jan 26 '21 07:01 akondas

You did advertise Rubix ML by doing so.

Now I have looked at them, but I am not sure if they are really better or faster, etc. It isn't clear.

There is a guy training something, and the training took him 6 hours... and they say they have no GPU support.

IMHO there is really little reason to implement something NEW without GPU support. Or is there?

torian257x avatar May 25 '21 20:05 torian257x

@akondas

I understand how life and priorities take precedence over building and maintaining open-source projects. We do thank you and appreciate your contributions tremendously.

I'm just curious why the php-ml project seems to have disappeared from GitHub entirely? Am I misunderstanding something? Since this project has zero code committed, would it make sense to leave the other project online, in archived status?

Hoping all is well.

ConnectGrid avatar Nov 20 '23 01:11 ConnectGrid

I'm just curious why the php-ml project seems to have disappeared from GitHub entirely? Am I misunderstanding something? Since this project has zero code committed, would it make sense to leave the other project online, in archived status?

https://arkadiuszkondas.com/dmca-php-ml-and-copyright-boundaries/

Every time I read this blog post, my heart aches and I get a glimpse into why PHP can't take the next step in the ML world and why it can't overtake Python.

KEINOS avatar Nov 20 '23 11:11 KEINOS