json icon indicating copy to clipboard operation
json copied to clipboard

delete_at_pointer

Open RoyBellingan opened this issue 11 months ago • 27 comments

Some provisional code to add support for the delete_at_pointer functionality.

To test with

https://github.com/RoyBellingan/boostDeleteAtPointer

It should produce this

Initial Json:
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [
            {
                "text" : "this is cool",
                "timestamp" : 123456,
                "likes" : [
                    {
                        "author" : "Coco",
                        "timestamp" : 123
                    },
                    {
                        "author" : "Izzy",
                        "timestamp" : 456
                    }
                ]
            }
        ]
    }
]
Removing invalid ptr /1/image 
past-the-end token not supported @ /1
------
Removing invalid ptr /0/image/invalid 
no referenced value @ /0/image/invalid
------
Removing /0/comment/0/text 
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [
            {
                "likes" : [
                    {
                        "author" : "Coco",
                        "timestamp" : 123
                    },
                    {
                        "author" : "Izzy",
                        "timestamp" : 456
                    }
                ],
                "timestamp" : 123456
            }
        ]
    }
]

------
Removing /0/comment/0/likes/1 
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [
            {
                "likes" : [
                    {
                        "author" : "Coco",
                        "timestamp" : 123
                    }
                ],
                "timestamp" : 123456
            }
        ]
    }
]

------
Removing /0/comment/0 
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [

        ]
    }
]

------
Removing /0/image 
[
    {
        "comment" : [

        ]
    }
]

------

RoyBellingan avatar Feb 15 '25 01:02 RoyBellingan

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

cppalliance-bot avatar Feb 15 '25 02:02 cppalliance-bot

Why does this PR kill the performance in the benchmarks? image

vinniefalco avatar Feb 15 '25 03:02 vinniefalco

I will try to do another PR with same code, I am confident @vinniefalco that above was just some temporary fluctuation on the vm running the test.

RoyBellingan avatar Feb 15 '25 13:02 RoyBellingan

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

cppalliance-bot avatar Feb 15 '25 13:02 cppalliance-bot

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

cppalliance-bot avatar Feb 15 '25 15:02 cppalliance-bot

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

cppalliance-bot avatar Feb 15 '25 22:02 cppalliance-bot

Codecov Report

Attention: Patch coverage is 0% with 63 lines in your changes missing coverage. Please review.

Project coverage is 93.06%. Comparing base (c02d872) to head (c829369).

Files with missing lines Patch % Lines
include/boost/json/impl/pointer.ipp 0.00% 63 Missing :warning:
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1071      +/-   ##
===========================================
- Coverage    93.70%   93.06%   -0.65%     
===========================================
  Files           91       91              
  Lines         9139     9202      +63     
===========================================
  Hits          8564     8564              
- Misses         575      638      +63     
Files with missing lines Coverage Δ
include/boost/json/value.hpp 98.83% <ø> (ø)
include/boost/json/impl/pointer.ipp 79.41% <0.00%> (-20.59%) :arrow_down:

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update c02d872...c829369. Read the comment docs.

codecov[bot] avatar Feb 15 '25 23:02 codecov[bot]

@sdarwin once again I question these benchmarks....

vinniefalco avatar Feb 17 '25 18:02 vinniefalco

At least at this moment, there is not a basis to think the benchmark is wrong. I will run a test and attempt to replicate that result.

sdarwin avatar Feb 17 '25 18:02 sdarwin

Just as expected, the results can be easily replicated. :-) See https://github.com/sdarwin/json/pull/20 , which shows the image below.
If a benchmark is questionable then re-run the same job by pushing a commit with a small whitespace modification and nothing else.

bench

sdarwin avatar Feb 17 '25 20:02 sdarwin

I'm not entirely sold on this idea. Given that you also want to return the part of the pointer string that was matched (which is not done for other JSON Pointer functions), it appears to me there's a need for lower-level Pointer functionality, with which users could build their own algorithms.

Alternative idea for a simple "delete at pointer" API: add a special "undefined" state to value_ref that erases the corresponding element when used with set_at_pointer.

grisumbras avatar Feb 19 '25 05:02 grisumbras

@sdarwin the question isn't whether benchmark results are random. You showed us several times that they are not. The question is why they are so much affected by random changes. It could be that the benchmark runner program is not very good, but I remember at least in one case a change in a CMake script (that did not relate to the benchmarks) resulted in benchmark performance degradation.

The only way I can explain this phenomenon is that the build process itself exhausts some resource, or results in some file loading/unloading and that changes the overall performance of the system. Maybe we could try building the benchmarks, storing the build artifacts, and then launching the benchmarks with those artifacts in a fresh container?

grisumbras avatar Feb 19 '25 06:02 grisumbras

The current benchmark in this pull request is the following:

bench8

That is mostly around 3% or less. Some 0%. It's not wildly unexpected.

However, @grisumbras you should understand as well as anyone, even a very small code change (a wrongly placed "if", a missing "return", a bit or byte in the wrong place) can generate a detour in the code path, such that the code is very buggy, and the results are anyone's guess. It may be accurate and correct for the benchmarks to deviate by 30% or 100%, or 10000%, when there is a bug, a mistake, in the pull request.

sdarwin avatar Feb 19 '25 11:02 sdarwin

I wouldn't expect bugs in this PR's code (whether they exist or not) to affect benchmarks, given that none of this new code is ever executed by the benchmarks.

Regardless, my comment should have been considered more generally. Consider this PR: #1031. It just causes warnings in certain cases. Shouldn't affect runtime at all. And yet several tests are affected quite dramatically.

grisumbras avatar Feb 19 '25 12:02 grisumbras

https://learn.microsoft.com/en-us/azure/architecture/antipatterns/noisy-neighbor/noisy-neighbor

This is what probably is @grisumbras and @sdarwin .

RoyBellingan avatar Feb 19 '25 12:02 RoyBellingan

Consider this PR: https://github.com/boostorg/json/pull/1031.

Checking notes... that exact week was when the newest server khjson1.cpp.al went into production for the first time.
That's a sea of 0% . :-) It's a great success. My interpretation of that result is:

  • mostly, the code had no affect on benchmarks
  • on specifically "twitter" and "twitterescaped", there was (unknown) benchmark instability and variation. It wasn't caused by the pull request itself. However, at the time, we discussed "twitter" and "twitterescaped" briefly, you had made some remarks, and maybe even changed something in the json code, such that those benches improved later on. In recent times, an example of margin-of-error in the benchmarks, and no affect from the PR, is in the recent PR 1073. This is randomness. but pretty good:

bench15

sdarwin avatar Feb 19 '25 12:02 sdarwin

@RoyBellingan , there is no evidence of "noisy neighbors" usually: this is not a virtual machine on shared hardware, it is a separate server.

sdarwin avatar Feb 19 '25 13:02 sdarwin

@RoyBellingan after some thinking, I'm not sure that producing an error when an element is not found makes much sense. The effect of the operation is that the element is not nested in the value. If it already wasn't there, the effect is the same. So, the function should mostly report an error if the pointer string is invalid. In this case, if error is not set and true is returned, the element was found and removed, and if error is not set and false is returned, the element wasn't found.

An aside, I think the function should be called erase_at_pointer, to better match the conventional C++ nomenclature. It's also a tiny bit shorter.

In addition, there's a question of what to do with a pointer that designates the root object (an empty string). As far as I can see your algorithm doesn't handle it in a sensible manner. Even if it did, it obviously cannot actually remove the root value, so I'm inclined to think that this situation should actually be an error. Probably a new one.

I also want you to remove the tracking of the consumed part of the Pointer string. None of the other Pointer functions do that right now, and it would be strange to introduce it only for this function. We have a potential idea on how to implement such tracking in the future.

Finally, for the PR to be eventually merged, it should also contain all the necessary overloads (similar to set_at_pointer) and tests. All new lines should be covered by tests, unless there's a reason it can't be reasonably done.

grisumbras avatar Feb 24 '25 07:02 grisumbras

I have to say, this

he effect of the operation is that the element is not nested in the value Is so elegant and correct that... yes! You are totally right

erase_at_pointer I agree on that too, is just me that I mix things sorry.

Thank you very much for the feedback! I will try soon to work on those element!

RoyBellingan avatar Feb 24 '25 15:02 RoyBellingan

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

cppalliance-bot avatar Jul 10 '25 00:07 cppalliance-bot

Hi! Can you add a few tests that check that the implemented functionality works?

grisumbras avatar Aug 22 '25 10:08 grisumbras

oh @grisumbras trust me I would love to have a moment to do that!

RoyBellingan avatar Aug 22 '25 14:08 RoyBellingan