
Supply parsed JSON content through the importer interface

Open Alfusainey opened this issue 11 years ago • 6 comments

/cc @dbu /cc @anchela

Alfusainey avatar May 19 '14 17:05 Alfusainey

This implementation parses serialized content in JSOP format into nodes and properties. The content tree is then serialized to XML, and the resulting XML containing all the nodes and properties is written to the repository using Session#importXML. In other words, adding nodes (entire subtrees of nodes) to the repository is not done using the regular write methods but rather via Session#importXML. As far as adding nodes is concerned, the remoting server uses the importer regardless of whether the content is protected or not.

The reason for choosing this approach is that a node that is not itself protected can still have protected properties. So even though the node itself could be written using API write methods, its protected properties could not. For this reason, I do not want to use API write methods to write the node itself and importXML to write its protected properties, which would result in clumsy code. This is why I pack the whole content into a tree, serialize the tree to XML, and write the resulting XML data using Session#importXML.

If in turn the XML importer detects that the content it is trying to import is not protected, it will use the API to write the content directly; otherwise, it will ask its protected-item import handlers to handle any content it detects as protected.
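For context, the XML that Session#importXML consumes is a JCR document-view or system-view serialization. A minimal system-view example might look like the following (the node name "example" and the "title" property are illustrative, not taken from the patch):

```xml
<!-- System-view serialization of a single unstructured node with one
     String property; this is the kind of document the approach above
     would generate from the parsed JSOP content before importing it. -->
<sv:node xmlns:sv="http://www.jackrabbit.org/jcr/sv/1.0" sv:name="example">
  <sv:property sv:name="jcr:primaryType" sv:type="Name">
    <sv:value>nt:unstructured</sv:value>
  </sv:property>
  <sv:property sv:name="title" sv:type="String">
    <sv:value>hello</sv:value>
  </sv:property>
</sv:node>
```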

/cc @anchela /cc @dbu /cc @tripodsan

Alfusainey avatar May 25 '14 00:05 Alfusainey

hi alfusainey

i just had a closer look at your latest patch. while i definitely see that your current approach makes things easy from a developer's point of view, it somewhat defeats the purpose of having json-based remoting if the complete json string is in the end just converted to xml. that's the biggest concern i currently have with your approach.

apart from that, just a kind reminder with respect to the diffs: i would be really glad if you could make sure that your patches don't contain modifications unrelated to your work. that also applies to the import statements.

kind regards angela


anchela avatar May 26 '14 10:05 anchela

@Alfusainey while i see the benefit of having less redundant code, i think the extra step of serializing to xml and then deserializing it again in importXML is a severe performance hit. can't you just feed the same deserialized objects that represent the information into the same code?

another thing: to get the acl import working, i would not go for too heavy refactorings of jackrabbit. it will be much more difficult to convince people to accept a severe refactoring, with all its potential for bugs or performance penalties, than to accept just handling acl nodes with a special handler, with almost no impact on the operations that were already supported.

this decision is up to @anchela and other jackrabbit maintainers, but i guess they will be more at ease with less invasive changes. once you get your primary goal through, you can still look at jackrabbit and also oak and see if you want to propose further refactoring. but let's first focus on the primary goal of remoting acl information.

dbu avatar May 26 '14 20:05 dbu

Hi Angela,

thanks for the feedback! please have a look at the latest patch, which, instead of converting the json string to xml, submits the parsed json content to the repository through the Importer interface.

In Session#getImportContentHandler, a concrete importer instance is created and supplied to the XML import content handler, which uses that importer instance to write the parsed XML content. I am taking the same approach: the parsed JSON content is submitted through the Importer, which knows how to write any type of content. This way I avoid doing any sort of content-protection check on the server and instead defer it to the importer.
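To illustrate the callback style such an importer follows, here is a hypothetical, heavily simplified sketch in Java. The names (ContentImporter, RecordingImporter, startNode, setProperty, endNode) are my own and do not match Jackrabbit's actual Importer interface, which works with NodeInfo/PropInfo objects; this only shows the shape of driving one callback target from either an XML or a JSON parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified importer callback interface: both the XML
// content handler and a JSON parser could drive the same target, so
// protection checks live in one place (the importer), not the parsers.
interface ContentImporter {
    void startNode(String name, String primaryType);
    void setProperty(String name, String type, String... values);
    void endNode();
}

// A trivial recording implementation, handy for tests: it logs the
// call sequence instead of writing to a repository.
class RecordingImporter implements ContentImporter {
    final List<String> log = new ArrayList<>();

    public void startNode(String name, String primaryType) {
        log.add("start " + name + " [" + primaryType + "]");
    }

    public void setProperty(String name, String type, String... values) {
        log.add("prop " + name + "@" + type + "=" + String.join(",", values));
    }

    public void endNode() {
        log.add("end");
    }
}
```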

one change to the json string: i now encode jcr properties as json objects instead of simple name:value(s) pairs. encoding properties as json objects makes it possible to represent all the information about a property, i.e. name, type, and value(s), which is needed by the importer.
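As a sketch of what such an object encoding could look like (the field names "type" and "values" are illustrative, not necessarily the ones used in the patch):

```json
{
  "jcr:lastModified": {
    "type": "Date",
    "values": ["2014-06-10T14:30:00Z"]
  },
  "numbers": {
    "type": "Long",
    "values": []
  }
}
```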

I took note of the diff feedback, and the diff seems to be in order this time around!

cheers, alfusainey

Alfusainey avatar Jun 09 '14 00:06 Alfusainey

Hi Alfusainey,

one change to the json-string: i now encode jcr properties as json objects instead of simple name:value(s) pairs

while I see that using objects for properties is more robust, it defeats the purpose of using a simple, human-readable format like json. for a machine-to-machine format we could just use sysview XML.

I would really try to use name/value pairs for properties, and encode the type either in the value or in the name (the latter has the advantage that you can define types on jcr multi-value properties more easily). eg

```json
{
  "jcr:lastModified@Date": "2014-06-10T14:30Z",
  "numbers@Long": []
}
```
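Decoding such a key on the server side is straightforward. A minimal sketch in Java (class and method names are hypothetical): split at the last '@' and fall back to String when no type suffix is present. This relies on '@' not being a valid character in a JCR name, while ':' (as in jcr:lastModified) is.

```java
// Hypothetical helper for the "name@Type" property-key encoding
// proposed above; not part of Jackrabbit's actual API.
class PropertyKeyParser {
    /**
     * Splits a key like "jcr:lastModified@Date" into {name, type}.
     * Keys without an '@' suffix default to the String type.
     */
    static String[] parse(String key) {
        int at = key.lastIndexOf('@');
        if (at < 0) {
            return new String[] { key, "String" };
        }
        return new String[] { key.substring(0, at), key.substring(at + 1) };
    }
}
```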

(btw: I'd prefer this discussion on the jackrabbit dev mailing list)

tripodsan avatar Jun 10 '14 06:06 tripodsan

Hi Alfusainey,

rethinking this, we cannot change the format of the JSON for davex, as other clients also use it.

tripodsan avatar Jun 10 '14 07:06 tripodsan