signedxml icon indicating copy to clipboard operation
signedxml copied to clipboard

Namespace Propagation

Open frazeradam opened this issue 2 years ago • 19 comments

I've been working on getting some code moved off of PHP and into Go, and in testing found different behavior from a known working PHP system. It appears that namespaces aren't getting propagated when taking the subset that is getting signed, as in the example in section 3.7 here https://www.w3.org/TR/2001/REC-xml-c14n-20010315

In our case the (extremely abbreviated) XML is:

<E:Envelope xmlns:E="http://schemas.xmlsoap.org/soap/envelope/">
    <E:Body id="Body">

where in the dump I've extracted from inside the PHP library I'm using the body gets converted to

<E:Body xmlns:E="http://schemas.xmlsoap.org/soap/envelope/" id="Body">

and then generates the correct DigestValue.

I believe the canonicalization algorithm requires that when taking a subset, it needs to look up the chain for any namespace declarations, and move them to the highest level tag(s) that first reference that namespace (or something like that, the c14n specs are painful).

frazeradam avatar Jun 19 '23 20:06 frazeradam

I took a crack at this and made some progress, but I've got a few question on how the internals are organized that I'm blocked on.

Inside signatureData.getReferencedXML I have been able to propagate the namespace up, but it appears that making the changes there the code to normalize the order of the attributes must have already run so simply appending the propagated namespaces gets me close, but not quite there. Intentionally putting the namespace first does it, but if there are multiple namespaces that need ordering it wouldn't work, so I do need to make sure that things go through the attribute ordering code.

If you do have an idea on that last step, I'm happy to submit my changes as a PR. From what I can tell renderAttributes seems to be involved in all of this but doesn't seem to fit correctly with what I'm doing.

Mostly working code in signedxml.go for feedback as well if I'm causing any unintentional side effects that would break other systems:

func (s *signatureData) getReferencedXML(reference *etree.Element, inputDoc *etree.Document) (outputDoc *etree.Document, err error) {
	uri := reference.SelectAttrValue("URI", "")
	uri = strings.Replace(uri, "#", "", 1)
	// populate doc with the referenced xml from the Reference URI
	if uri == "" {
		outputDoc = inputDoc
	} else {
		refIDAttribute := "ID"
		if s.refIDAttribute != "" {
			refIDAttribute = s.refIDAttribute
		}
		path := fmt.Sprintf(".//[@%s='%s']", refIDAttribute, uri)
		e := inputDoc.FindElement(path)
		if e == nil {
			// SAML v1.1 Assertions use AssertionID
			path := fmt.Sprintf(".//[@AssertionID='%s']", uri)
			e = inputDoc.FindElement(path)
		}
		if e != nil {
			// Attempt namespace propagation
			if err := s.propagateNamespace(e, e, inputDoc); err != nil {
				return nil, err
			}
			// Turn the node into a full document we can work on
			outputDoc = etree.NewDocument()
			outputDoc.SetRoot(e.Copy())
		}
	}

	if outputDoc == nil {
		return nil, errors.New("signedxml: unable to find refereced xml")
	}

	return outputDoc, nil
}

func (s *signatureData) propagateNamespace(e *etree.Element, base *etree.Element, inputDoc *etree.Document) error {
	// We begin by trying to resolve any references for the top level element, then we can recuse trying to resolve things
	attr, err := findNamespace(e, base, inputDoc)
	if err != nil {
		return err
	}
	if attr != nil {
		// Apply it to our current element
		e.Attr = append(e.Attr, *attr)
		// We need to sort such that namespaces come first and I have no idea how to make that happen here
	}
	for _, elem := range e.ChildElements() {
		if err := s.propagateNamespace(elem, base, inputDoc); err != nil {
			return err
		}
	}
	return nil
}

// findNamespace looks to find the namespace for e, returning it if and only if it's inside inputDoc but outside base.
// If there is a reference to the namespace and it can't be found an error will be returned.
func findNamespace(e *etree.Element, base *etree.Element, inputDoc *etree.Document) (*etree.Attr, error) {
	xlmns := "xmlns:" + e.Space
	// If there isn't a namespace, we can skip this whole function
	if e.Space == "" {
		return nil, nil
	}
	// If the namespace's reference is in the element itself, we have nothing to do
	if e.SelectAttr(xlmns) != nil {
		return nil, nil
	}
	// Start searching up the chain for a match
	current := e
	for {
		current = current.Parent()
		if current == nil {
			return nil, fmt.Errorf("couldn't resolve namespace reference of %s:%s", e.Space, e.Tag)
		}
		if attr := current.SelectAttr(xlmns); attr != nil {
			// We found it, last thing to do is check and see if it's inside or outside our input doc
			if containsElement(base, current) {
				return nil, nil
			}
			return attr, nil
		}
	}
}

// containsElement returns true if e is base or is base has e as any of its child nodes recursively
func containsElement(base *etree.Element, e *etree.Element) bool {
	if base == e {
		return true
	}
	for _, child := range base.ChildElements() {
		if containsElement(child, e) {
			return true
		}
	}
	return false
}

frazeradam avatar Jun 23 '23 16:06 frazeradam

@adamdecaf any advice on how to move forward here. I'm happy to do most of the work, I just need a pointer in the right direction.

frazeradam avatar Jul 13 '23 14:07 frazeradam

Could we sort attributes again after adding the namespace attribute? If you open a PR with your changes I'll take a look and help out.

adamdecaf avatar Jul 13 '23 15:07 adamdecaf

Yes, I'm pretty sure that's the way to go, but where I got stuck. I'll get a PR opened tomorrow (the code is on a computer I don't have easy access to today).

frazeradam avatar Jul 13 '23 17:07 frazeradam

@adamdecaf #41 has been entered with my code that mostly solves the issue (minus sorting of attributes). I was unable to find any place in the code that did that at a XML tree level and didn't just output strings that I could reuse. Feel free to make a change or point me in the right direction for that and I can make the change. Thanks for the help with this!

frazeradam avatar Jul 14 '23 14:07 frazeradam

@adamdecaf have you had a chance to look at this?

frazeradam avatar Aug 14 '23 10:08 frazeradam

@adamdecaf have you had a chance to look at this?

A bit. I'm trying to recruit some help inside Moov to look at this more.

adamdecaf avatar Aug 15 '23 22:08 adamdecaf

@adamdecaf assuming you're back at work, any chance I could get an update on this. We're not far from needing to do something, and I'm hoping to not need to keep around a PHP code base just for XML digital signatures.

frazeradam avatar Nov 13 '23 15:11 frazeradam

I'm sure we can figure something out.

adamdecaf avatar Nov 16 '23 19:11 adamdecaf

@frazeradam ~~Do you have a more complete test for this? Or are your original snippets enough to verify this works?~~

Nevermind, We've got a commented out test of the 3.7 example.

adamdecaf avatar Nov 29 '23 15:11 adamdecaf

I've looked at this again this morning and am afraid I don't understand the xml specs well enough to help out. Have you found a solution with PHP in the mix?

I do see a lot of work on https://github.com/moov-io/signedxml/compare/master...daugminas:signedxml:master but that deviates from the project as it is now.

adamdecaf avatar Nov 29 '23 16:11 adamdecaf

I do see some success with daugminas/master, so it might be worthwhile to adopt changes from that.

Running the 3.7 example gets closer...

Failures:

  * /Users/adam/code/src/github.com/moov-io/signedxml/examples/canonicalization_test.go 
  Line 202:
  Expected: '<e1 xmlns="http://www.ietf.org" xmlns:w3c="http://www.w3.org"><e3 xmlns="" id="E3" xml:space="preserve"></e3></e1>'
  Actual:   '<doc xmlns="http://www.ietf.org">
     <e1>
        <e2 xmlns="">
           <e3 id="E3"></e3>
        </e2>
     </e1>
  </doc>'
  (Should be equal)
  Diff:     '<e1doc xmlns="http://www.ietf.org">
   xmlns:w3c="http://www.w3.org"  <e1>
        <e32 xmlns="">
           <e3 id="E3" xml:space="preserv></e"3>
        </e32>
     </e1>
  </doc>'

adamdecaf avatar Nov 29 '23 16:11 adamdecaf

I'm currently at AWS re:Invent, but will take a look at that next week when I get back and have access to all of my test code. Thanks for digging into it more.

frazeradam avatar Nov 29 '23 22:11 frazeradam

@adamdecaf sadly despite playing around with it for a bit, I've failed to even make it run. It seems like it uses a C library (what I was trying to avoid - and according to the readme for this repository, also something this package is trying to avoid). Specifically the CGo "fun" originates in this package https://github.com/lestrrat-go/libxml2/blob/master/clib/clib.go which is used in the c14n.go file that #44 proposes to add.

frazeradam avatar Dec 18 '23 22:12 frazeradam

What errors are you having trying to get CGO working? I've got a fair amount of experience cross compiling it and running inside docker images. We use CGO at Moov where needed, but try to avoid it.

adamdecaf avatar Dec 20 '23 19:12 adamdecaf

CGo is a non-starter for us for this project (at least as part of the main executable - we may end up needing to break out into a second executable to handle the signed XML). We actually picked this package to work with since the README says:

Other packages that provide similar functionality rely on C libraries, which makes them difficult to run across platforms without significant configuration. signedxml is written in pure go, and can be easily used on any platform.

frazeradam avatar Dec 20 '23 19:12 frazeradam

Understood. I agree CGO is best to avoid and would try to keep it as isolated as possible. The project we're using this in already has CGO. Feel free to reach out if you need anymore help either in private or on Issues.

adamdecaf avatar Dec 20 '23 20:12 adamdecaf

I figured I should let you know where we ended up with this. We found that github.com/ebitengine/purego can be used to interface with libxml2 without needing CGO and this will cover our use case.

frazeradam avatar Jun 14 '24 20:06 frazeradam

That sounds like a lot better solution. Do you have an example that you could share? We could adopt that within signedxml.

Edit: I've seen some people have success with https://github.com/nbio/xml as well

adamdecaf avatar Jul 08 '24 19:07 adamdecaf