Adding polarization support
I wanted to simulate a system with birefringent optics when I looked into OpticSim.jl code to see the polarization is not yet supported. I am unsure if I can volunteer on it, maybe I can do it by myself or delegate to some of friends/colleagues. But prior to doing so I wish to cross-check if there is already an effort on this.
W.r.t. @caseykneale's quote below, I believe that Jones is not enough as it handles only polarized light, but not unpolarized. IMHO the correst formalism is Stokes, or in other words Poincare sphere plus power. To me the easiest way how to enhance the current processintersection is to keep power variable untouched and add 3 Poincare coordinates definind the SoP (state of polarization). What is still missing is to define a coordinate system for polarizations. In real world one of popular choices is H/V, but -- correct me if I'm wrond -- this dies of singularity in case of purely vertically propagating ray. Therefore what I have in mind (and not yet on paper/code editor) is, for each particular ray to define a non-ambiguous coord system derived from its world-coordinate direction of propagation.
To my view, the changes needed are:
- altering the number of return values in
processintersection()(adding SoP), and adding support for full Fresnel eqns there- keeping in mind that in case of birefringent material instead of having <=2 outcomes (t and r), we now may have <=3 (r, o, e)
- propagate new component of Ray (SoP) throughout the code
- adding support for anisotropic materials, defined as indicatrix tensor matrix in place of simple index of refraction scalar
- plus few convenience idealized components, such as ideal polarizer, ideal waveplate
Any feedback welcome.
Originally posted by @caseykneale in https://github.com/microsoft/OpticSim.jl/issues/33#issuecomment-804431676 As for representing polarization - it's tricky. I don't know if anyone has a good way of doing that beyond field equations. The Jones/Mueller matrices are a little niche but they can be broadened out to handle other systems. At one point I was eyeing up adding polarization to local plane interface approximation (LPIA) for a system somewhat similar in spirit to what you all are working on. Just ran out of time. Well - I can do a code dump at some point soon and you all can decide if any of it meshes with your interests or not.
I've just found OpticSim so I don't know much for the underlying infrastructure. One resource to different options for polarization raytracing is "Polarized Light and Optical Systems" by Chipman, Lam and Young. There is enough detail there to guide good decisions about how to implement the support needed for efficient and accurate polarization raytracing. The textbook approach means that it almost provides cookbook guidance for writing code.
In particular, it describes coordinate systems and issues with phase pretty well.
(..) The textbook approach means that it almost provides cookbook guidance for writing code.
In particular, it describes coordinate systems and issues with phase pretty well.
Thanks for the pointer, I will definitely have a look.
@hrobotron, @mattderstine I've wanted to support polarization for a long time but don't (currently) have the background to design the api. I ordered the book you recommended and start thinking about an api. I can't work on it full time because we are focused on improving OpticSim support for repeated structures, but can definitely help.
@BrianGun I had been working on my own raytracing code in Julia before I saw this project. It wasn't around when I started. I've just started looking at it and will be thinking about this more as it is much better than what I did in many regards and seems to have the same basic approach, even to using Makie for graphics. With just the quickest of looks, there are some design decisions being made that may have a major impact on how easy it will be to add polarization later on. In particular, I have noticed that there is some stochastic decisions sometimes made about r & t and the way info is managed with that process. Doing polarization is just like keeping track of r & t except there is more bookkeeping along the way. This means that decisions about detectors and what data to keep when doing a raytrace are very important. Most importantly, I think is that a potentially significant advantage for OpticSim and polarization raytracing is the ability to easily include nanostructured interfaces and meta materials. This is what I would keep in mind while developing the raytracing planning for adding polarization. I'd look at chapters 9,10, 19,20 for how to do polarization raytracing and backtrack to the other chapters to understand the details and applications. I look forward to helping where I can.
@mattderstine Reading the polarization book is a workout in the physical as well as mental sense. It is massive. But it looks excellent so far.
With just the quickest of looks, there are some design decisions being made that may have a major impact on how easy it will be to add polarization later on. In particular, I have noticed that there is some stochastic decisions sometimes made about r & t and the way info is managed with that process. Doing polarization is just like keeping track of r & t except there is more bookkeeping along the way. This means that decisions about detectors and what data to keep when doing a raytrace are very important.
I think that the stochastic dealing with t/r is quite OK for polarization as well. At the interface, we have a reflection, and one or two transmitions depending on birefingence. The Monte Carlo approach is good in dealing with that in a sense that we do not suffer explosion in number of rays (doubling or tripling at each interface).
The question to decide is whether to support sull SoP including partially polarized light (Poincare/Stokes) in each ray, OR to delegate the partially polarized light to Monte Carlo as well and do it stochastically (Jones/Mueller). I think it is better to cover it in the SoP for each ray.
Does the nanostructure you talk about differ in any way compared to naturally birefringent materials? I'd think the basic formalism for a material w.r.t. polarization is changing from scalar refractive index to indicatrix tensor, do the materials you talk about require something better?
The issue with polarization ray tracing through birefringent materials is that rays do explode. The issue I am concerned about is not the pruning of rays to those of interest, but that the bookkeeping of the polarization characteristics along the ray. What I'm really suggesting is that when a ray is pruned, the r or t of what remains should be tracked (and by this I'm assuming these are complex number with amplitude and phase, not R or T for power). The polarization that needs to be tracked includes these properties in multiple dimensions. This is important when the rays are to be combined at some point when all of the amplitudes, phases and k vectors will determine the properties. This even more important if a material or surface is depolarizing. Ensuring the bookkeeping is present enables this to handled efficiently rather than having to trace ever more of an exploding set of rays.
My preference about tracing polarization is to separate the polarization from the idea of refractive index. If one were to do RCWA on these structures the polarization properties come out more naturally than forcing the answer to be in terms of index, whereas the polarization properties can be easily found from the index of more natural materials. In the end it's a matter of bookkeeping and understanding. Keeping things in terms of polarization aberrations that can be aggregated across the system (compare to wavefront aberration for example) make it easier to understand what is happening rather than having to pretend there is an element that represents the system that has a certain refractive index (complex, of course) tensor .
Naturally the choice that works best is problem dependent. I'm just advocating for more general structure of the raytrace so that the interesting problems of our day (AR headsets say) can have the robust foundation needed to generate results that are trusted even when it is difficult to understand them.
I attach a white paper from AiryOptics that shows off their raytrace program. This is just the simplest kinds of operations that I'd like to to be able to use. I'd also like to be able to get these same kinds of plots after going through birefringent-like coatings and materials. For example Example 20.2 of Chipman, Lam & Young. CassegrainWhitePaper.pdf
Hello,
(..) The issue I am concerned about is not the pruning of rays to those of interest, but that the bookkeeping of the polarization characteristics along the ray. What I'm really suggesting is that when a ray is pruned, the r or t of what remains should be tracked (and by this I'm assuming these are complex number with amplitude and phase, not R or T for power). The polarization that needs to be tracked includes these properties in multiple dimensions. This is important when the rays are to be combined at some point when all of the amplitudes, phases and k vectors will determine the properties. This even more important if a material or surface is depolarizing. Ensuring the bookkeeping is present enables this to handled efficiently rather than having to trace ever more of an exploding set of rays.
yes, indeed. This is exactly what I am talking about since the first post in this enhancement, nothing more, nothing less.
My preference about tracing polarization is to separate the polarization from the idea of refractive index. (..)
The ideas are separate. Even the fully isotropic media (scalar index) do have polarization effects via Fresnel equations. Even current OpticSim code (in processintersection()) does talk about them, although taking only approximate result just because that the SoP is not propagated further.
I mentioned indicator matrix, as this is needed and AFAIK sufficient measure how to properly handle anisotropic materials once the SoP is introduced into the code.
I attach a white paper from AiryOptics that shows off their raytrace program. This is just the simplest kinds of operations that I'd like to to be able to use. (..)
Sure, I see no problem with that and the Cassegrain example you provided shall indeed work without any problems.
I am afraid we are are at the point where I or someone else shall show some code. My time allocation is less than poor, but let's see, if I will be struck by procrastination in this direction, I will contribute for your review :-)
I've looked at the polarization textbook. Doesn't look like it would be that difficult to add. But how will it be tested? Do you all have polarization systems with known responses that the OpticSim implementation could be compared to?
But how will it be tested? Do you all have polarization systems with known responses that the OpticSim implementation could be compared to?
I think that a plan-parallel plate of BK7 with non-perpendicular angle of incidence is a good starting point :-) After introducing indicator matrix, next valuable example could be beam displacer using a single piece of prism, demonstrating how the beam splits into three.
I agree with @hrobotron that simple examples could be compared to results calculated directly. For what I had been doing, I was testing against the textbook examples and problems. This was an easy way to check the work. The white paper has examples that could be modeled, at least qualitatively. If you have access to Zemax and CodeV they have some capabilities and side by side testing would be interesting if not definitive. I would also look to the papers and dissertations of Chipman's students. I expect good examples might be found there.
Just confirming I also got the book, starting to look.
I have briefly looked into the formalism of "Polarized Light and Optical Systems" by Chipman, Lam and Young. I welcome the WCS of P matrix. However, the formalism seems to deal only with fully polarised light, which I think is a pity: the common case of unpolarized light would need a stochastic mixture, which I think is a wasteful approach.
To me at this point seems the best way to go along Zhang et al: Three-dimensional polarization ray tracing calculus for partially polarized light DOI:10.1364/OE.25.026973. This is my suggestion and if I will have some spare time (which I do not promise) I'd like to contribute either some initial code, or at least test examples.
It seems there is even a more general approach described in later Li et al: Three-dimensional polarization algebra for all polarization sensitive optical systems DOI:10.1364/OE.26.014109, but here IMHO the added value (i.e. the new cases this is able to deal with compared to previous one) is arguable, given much larger matrices and therefore complexity in this case. (Change my mind -- then please say which practical examples covered by Li&al are not covered by Zhang&al.)
So my suggestion is clear: DOI:10.1364/OE.25.026973. Which is just a well described and presumably proven formalism capturing the spirit of my initial post (i.e. to support also partial polarizations).
I'm going to go silent while I dig into this. My current understanding of the approach by Chipman is that the results from characterizing the optical system can then be used with unpolarized and partially polarized input light to determine the output. However, this is not something that I've spent much time with and so need to look carefully at the details to make sure that I understand the limitations of one approach over the other. Thanks for the detailed references. One question that I know little about his how depolarizing optics are modeled. It would seem to me from basic physical principals that the k-vectors are also changed in this process and so modeling this kind of element becomes stochastic as well. Or is this outside the scope of what we can expect a "general" optical modeling program to cover?
I'm going to go silent while I dig into this. My current understanding of the approach by Chipman is that the results from characterizing the optical system can then be used with unpolarized and partially polarized input light to determine the output. (..)
According to my (very brief) skim of Chipman&al and according to words by which Zhang&al and Li&al refer to Chipman&al, which is consistent with my view, the "ray property" in Chipman&al is polarization matrix which is equivalent to some polarized light, with totally arbitraty polarization, BUT this means that the unpolarized light can not be modelled by a single ray. The unpolarized light is not a single polarization -- it is a stochastic mixture. This and only this reason is why I suggest to advance to Zhang&al. The simplest approach to understand the matter is Stokes vs. Jones (this is without 3D).
One question that I know little about his how depolarizing optics are modeled. It would seem to me from basic physical principals that the k-vectors are also changed in this process and so modeling this kind of element becomes stochastic as well. Or is this outside the scope of what we can expect a "general" optical modeling program to cover?
I am not competent to answer. AFAIK depolarizing optics in real life always change optical mode of the light (geometry of the beam), if made strictly off passive components. But I have no strong opinion and zero knowledge of this. Everywhere I talk about unpolarized light is where the ray has random polarization already at the source. (And along the way through "lens and prism assembly" it can only split into a larger number of more-or-less polarized rays.)
Still working but some first impressions. Either method works fine for partially polarized light as long as there is no depolarizing elements in the system. It's just a matter of converting the Jones matrix to the Mueller matrix and then using the proper stokes vector for the input light. From my quick read of Zhang et al, their comments about the limitations with regard to partial polarization of Chipman's approach can be attributed to the details of polarization raytracing being complicated and the pressure put on authors to emphasize novelty beyond what might really be there.
On the other hand, I take seriously the comparison of the double gauss lens where the value of the 3D coherence method is described as an improvement over an linearization assumption made by Chipman's group. At the same time, this means that I need to dig into the details of the calculations on both sides to see where different approximations are made. This statement is rather surprising to me as when I implemented the Chipman method, it did not appear that there were any approximations, but I need to get to the bottom of this.
One limitation that is not discussed by Zhang et al, is how this would model a depolarizing or scattering element. Here I reference the discussions of approaches described in Chipman sections 27.2.8-27.2.10. These explain strategies to address the depolarizing with raytraces. It is clear that unless the raytrace is a Mueller matrix based approach then depolarization is not well handled. My limited understanding of Zhang suggests that it is not. On the other hand, the scattering the comes with depolarization is challenging to model and so just using a Mueller matrix would not really make a more useful tool.
As I said, this isn't a final conclusion, just a quick update along the way to help me organize my thinking and perhaps get pointers where I might look.
@mattderstine @hrobotron
I created a new branch (#276) for this issue and have started modifying the code to add polarization information to OpticalRay. In the new system you create an optical ray with polarization = nothing, or to a subtype of AbstractPolarization. Currently the only subtype is Chipman defined as an empty stub. As soon as I get the current code working with polarization = nothing then I'll start on the Chipman polarization tracing.
It doesn't look like the polarization tracing will take long to code, unless I'm misreading the book. Once I finish could one or both of you write tests to make sure the code works?
It will simplify life if you make all your modifications to this branch.
@mattderstine @hrobotron
I'm going to rewrite processintersection for FresnelInterface to handle polarization using the Chipman method because this looks like something I can handle with my beginner's understanding of polarization. I have no idea how to do this for the holographic elements. Do either of you know how to do this?
The new code is written so that we can easily add a new polarization type without rewriting any of the trace code. We should only have to rewrite the processinintersection functions. I'll work on abstracting the polarization code out of the processinintersection functions as much as possible so that only a little code has to be rewritten to handle different polarization models.
@hrobotron I had a more careful look at the beginning of the Zhang article and the double Gauss lens. Unlike the their analysis of the first part of the article, they didn't actually implement the polarization approach of Chipman for the double Gauss. I looked carefully at Chapter 9 and there is no paraxial approximation made. All of the analysis is valid over all angles. This realization combined with other unwarranted claims and mistakes in the paper lead me to the conclusion that for nearly all cases, the approach in Chipman is valid and, given the more complete description of the pitfalls of implementation, is what I would choose for a robust polarization raytrace for unpolarized, partially polarized and polarized sources. This approach does not work well for optic systems that have depolarizers. In these very few cases, I think that the approach outlined in 27.2.10 would be appropriate. I have never seen such elements used in an optical design that was complicated enough that the performance wouldn't be better modeled by just coding the Mueller matrices directly. With the proper combination of rays at the end (and with that statement I hide lots of complexity), the Chipman approach would enable modeling of polarimeters, fiber optic isolators for unpolarized light and all manner of polarization performance of birefringent and normal optical elements.
@BrianGun I have not really had time to look at OpticSim, I've been deep in the papers and book. I like your approach although I do suggest you read and understand chapter 8 & 9 before actually implementing the stuff in chapter 10.
Starting with the Fresnel equations is a great choice. Holograms are an advanced topic that I suggest is on the back burner for a while; unless someone else is really pushing for them and can contribute. A more important next step is coatings. I have no idea what is in the package right now for coatings (I really have only ever run the first example, read some of the examples and looked at the issues on GitHubs. I'm hoping to change that soon) but polarization effects from multilayer coatings are a big deal. There are references to polarization changes in holograms and Chipman has a chapter on RCWA that all provide background.
The key point is that your choice of placing the polarization interaction into the processinteraction methods means that nothing breaks as new calculations of the P matrix are added for new surface types and there is no need to implement all of these things at one time.
A bigger issue for the long term is handling the ray traces with strongly birefringent materials. This will lead to bifurcating of the rays at each of these elements. Again, I haven't looked at the OpticSim raytrace, but these materials will require different raytrace calculations.This is the point of chapters 17-20.
I'd recommend a plan something like this (with the expectation that whomever is doing the work should push the functions they need):
- Add the basic code to permit calculation of the P matrix at each surface and manipulation at the end of raytrace
- Add the the code to calculate the P matrix from the fresnel equations for normal surfaces and optical materials, including metals with complex refractive indices.
- Write functions to calculate Jones Pupils and other visualizations of Chapter 11 and 12
- Write functions to handle unpolarized/partially polarized sources and analyze their performance
- Add polarization effects of coatings (this is project that will never really be done)
- Add a real birefringent raytrace
- Add beam combination functions and visualizations
- Plan for other experts to find the package, make improvements to what has been done and add the fancy optics for their specific problems.
I need to figure out how much I can really help, but I certainly would like to write functions that test all of the examples and problems sets from the book. I see two advantages to this approach. The first is that there is a very good chance the answers in the book are correct and the second is that the book then helps explain what the code does to make it easier for people to use.
Thoughts?
@BrianGun @mattderstine
I created a new branch (#276) for this issue and have started modifying the code to add polarization information to OpticalRay. In the new system you create an optical ray with polarization = nothing, or to a subtype of AbstractPolarization. Currently the only subtype is Chipman defined as an empty stub.
Dear Brian, thanks a lot for the quick action. I believe it is very wise to introduce SoP as an abstract class, as our preference and understanding may eventually evolve (although better to choose the "best" one now in order to avoid future massive code rewrites for different classes of interfaces).
It doesn't look like the polarization tracing will take long to code, unless I'm misreading the book. Once I finish could one or both of you write tests to make sure the code works?
It will simplify life if you make all your modifications to this branch.
I will happily help. I will start with taking your branch and refreshing my mind on how to write a proper test.
Dear @mattderstine
Still working but some first impressions. Either method works fine for partially polarized light as long as there is no depolarizing elements in the system. It's just a matter of converting the Jones matrix to the Mueller matrix and then using the proper stokes vector for the input light.
the Mueller/Stokes approach is what I am heading towards, in contrast to Jones, which is simply incapable to support the most ordinary, unpolarized light. Which I believe still shall be the default source and default input for a generic lens assembly for a person who does not care about polarization much.
While you can easily convert Jones matrix to Mueller matrix, you can not do the opposite. And Chipman's P-matrix is just a tool how to handle Jones in 3D in WCS (world or lab coordinate frame). Therefore for this reason I think it's not what we want. Maybe we can draw an inspiration and test cases there, yes. <= Correct me if I am wrong in understanding what P matrix does in Chipman.
I need to read the Zhang (which I have not yet) to see if it really provides something like marriage of Chipman and Mueller, OR if it is more into the direction of your impression.
FYI: @BrianGun
Dear @hrobotron, The Jones approach is actually the more general approach for a raytrace. It supports both polarized and unpolarized light because it can be reduced to Mueller matrices in the end. The claims by Zhang that Jones cannot be used as the basis for modeling systems with unpolarized light is absolutely incorrect.
Let me be clear and explain the process of how this would work. A traditional optical system is modeled (let's just say the telescope in the Airy Optics white paper). The polarization changes for each surface are computed and represented in the P matrix. Because of the formulation of the polarization calculus, it is relatively simple to computer a P matrix for the entire system. This can then be used to generate Jones matrices as a function of pupil position (what Chipman calls the Jones pupil). It is also straightforward to compute the Mueller matrix for each position in the pupil. From this, a source with an arbitrary set of Stokes parameter can be used to find the Stokes parameters at all coordinates in the pupil and in the image plane. If you think about the problem, this telescope is almost certainly for imaging unpolarized light. By using the Jone approach, it clearly shows where the issues are in the design and how the imaging properties will be effected by the polarization changes. And then, when the design is finished, the stokes parameters can be calculated to understand what the polarizing at the detector might be. However, this is probably less important than how the resolution is impacted by the polarization changes.
The Mueller approach does not support the phase information that is tracked in the Jones approach. This is why you cannot go from Mueller to Jones. It is also why you could not model the changes to imaging performance with Mueller as shown in the example .There is less information about the optical system in the Mueller approach. As I said before the only place the Meuller approach does better in a raytrace is when there are depolarizing elements like opal glass or painted surfaces. If this is where your interest lies, then I agree that the Mueller approach is better. If you interest is seeing how systems made of more traditional optical elements work with unpolarized light, then I think the Jones matrix with proper interpretation would be a better investment of your time. If your interest is in applications like polarimeter cameras, then the Jones approach is the clear choice. It will give much better insight into what is happening in the optical design and what is happening to the polarization than a Mueller raytrace, but at the same time easily provides the Mueller matrices needed to examine the Stokes parameters to link it to specific application problems.
Zhang is primarily an alternate formulation of tracking polarization with alternative ways of generating results that mirror the Mueller and Jones matrices. It's primary advantage is that if somehow have a depolarizer that doesn't also scatter it will better model the results than other methods (Chipman, Jones, Mueller, ...). However, such a depolarizer seems unphysical to me. I cannot imaging how to change the entropy of just this parameter without altering the entropy of other parameters. As far as I know, all true depolarizers either change the light in spatial or temporal dimensions as well. Pseudo depolarizers are the exception but these could be modeled with the Jones approach because there is a well defined polarization transformation at each place across the aperture.
Dear @mattderstine
Let me concentrate only to a small but fundamental subset of what you wrote:
The Jones approach is actually the more general approach for a raytrace. It supports both polarized and unpolarized light because it can be reduced to Mueller matrices in the end. The claims by Zhang that Jones cannot be used as the basis for modeling systems with unpolarized light is absolutely incorrect.
Here I can not agree with my current knowledge. You can see in my initial post where I was completely unaware of both Chipman as well as Zhang that I was heading in the direction which has the same basis which you call absolutely incorrect. So, either you have some manipulation of the rays in mind which I do not follow, or one of us is incorrect. I believe there is one simple test which may reveal the truth:
please, can you take one single ray, unit power, zero phase, and write me either Jones vector or Chipman P matrix to represent this ray? But please, really do write numbers, not human language sentences.
If you succeed, I hope I will learn a lot or at least see the direction.
If you fail, then I think our minds will be in tune.
Thank you for your effort.
jones matrix example.pdf Attached is a pdf of a Mathematica Notebook that shows how this is done for a nontrivial example. Something is wrong with the Jones to Mueller matrix conversion, probably with Mathematica's rigid implementation of complex numbers, but I'm not going to spend the time to fix that.
I picked an example with a wire grid polarizer and a zero order waveplate. This kind of optic might be found in an old LCD projection TV. This kind of analysis might have been done to compare the design to measured performance.
The example uses Jones matrices rather the the polarization ray trace matrix of Chipman because the rays directions never changes. Doing it with the P matrix would have just added lots of extra zero elements to the Jones matrices. It shows the basic process. Implicitly a ray is traced through the system. At each interface the Jones matrix is computed (or in my case, they can trivially be stated). Then the system Jones matrix is computed (or the P matrix if that was chosen). This is then converted to a Mueller matrix and then multiplied by the Stokes vector for unpolarized light computing the polarization state at the output.
Pardon the Mathematica, but I'm not yet proficient enough to do symbolic calculations in Julia. Hopefully, in spite of my problems with the Jones to Mueller conversion, you see the process to simulate the systems response with any light that can be represented with Stokes parameters but analyzed with the Jones or P matrix formulation. Even if I did it badly, this shows that the Zhang statement is not correct.
I'm going to move on to learning OpticsSim and seeing if I can make some contributions.
Dear @BrianGun
I created a new branch (#276) for this issue and have started modifying the code to add polarization information to OpticalRay. In the new system you create an optical ray with polarization = nothing, or to a subtype of AbstractPolarization. Currently the only subtype is Chipman defined as an empty stub. As soon as I get the current code working with polarization = nothing then I'll start on the Chipman polarization tracing.
It doesn't look like the polarization tracing will take long to code, unless I'm misreading the book. Once I finish could one or both of you write tests to make sure the code works?
It will simplify life if you make all your modifications to this branch.
I tried to check out this branch, but I do not see any commit except that the branch exist. Is there any problem on my side (between kbd-chair), or do you have your commit with AbstractPolarization still to be pushed?
Thx.
Dear @mattderstine
[jones matrix example.pdf]
thank you very much for your quick response and for elaboration of the example. Please let me recap to make sure our understanding is the same. In Chipman's P-matrix formalism the 2D Jones 2x2 complex matrix is extended to 3D yielding complex 3x3 matrix, representing Jones action in WCS. During the propagation of the ray through N interfaces, at each (n-1,n), the matrix P (and not the 3D Jones vector E) is kept as a property of ray. At the detector, the ray's P matrix is a product of elemental P matrices of each interface intersection. How does a polarization of the source evolve to a polarization at detector is then solved by applying the P matrix, likely from-to respective pupil local coordinate systems (LCS). Is my understanding correct?
I have to repeat, that while I like the idea of P matrix as an end-to-end polarization transfer function in WCS, I still think that raytracing polarization this way is very impractical in OpticSim for the following reasons:
- doubling/tripling of rays. During the propagation, the ray does not know its polarization. For this reason, it is necessary to cover all doubling (r & t) and tripling (r & t_o & t_e) of rays -- exactly as Chipman discusses. The problem is that there can be even infinite amount of these ray forks, which in deterministic case is hard to handle -- may be reduced by a heuristic to avoid cycles, but then it is less realistic. (Simple example: plan-parallel plate nearly perpendicular to the ray; or: a lens with long focus.) What I find especially suboptimal is that even in case that there will be zero power in specific ray fork, it is not known until the very end so the (zero-power) ray still consumes the resources.
- Memory (and possibly also computing speed): the P-matrix is 3x3xComplex (18 numbers). On the other hand, 3D Jones vector is 3xComplex, enhanced with partial polarization adds one number (DoP, degree of polarization), therefore in total 7 numbers (degrees-of-freedom). I find this a bit of waste, especially as most users will be happy with as much rays as possible.
- Compatibility with current OpticSim.jl: prior to starting this Issue thread, I looked into the code and I was actually quite happy with the Monte Carlo implementation of reflect/transmit forks, which maintains constant number of rays, only changes their pseudorandom distribution in a way they will cover more powerful beams with more rays and near zero power rays with nearly no power -- which fits the purpose very well.
Below, I would like to make a proposal how to handle the SoP of each ray. But prior to that, let me say:
- I believe the Chimpan's P-matrix is a very nice idea and we can introduce it at least during development as a test matrix, which shall be able to prove if any other representation of SoP works well. The only thing which I would not like to do is to implement ray forking and explosion in number of rays.
- I have finally looked in to Zhang&al and I share your opinion the added value is not much. I do not think the formalism is especially suited for us.
Let me now propose how to handle the polarization. First, let me express what components related to polarization, power, phase and direction we need to track:
-
thetaorient (plane of polarization/ellipsoid major axis w.r.t. ray's LCS) -
epsilonellipticity -
DoPdegree of polarization (0..1) -
phiphase -
powintensity (power) -
directionor vector k (unit vector, i.e. 2 degrees-of-freedom)
The well-known formalisms do handle the following (sub)sets:
- Jones has
pow,phi,theta,epsilon, lackingDoP- Chipman's P matrix/E vector adds 2 DoF:
direction
- Chipman's P matrix/E vector adds 2 DoF:
- Poincare sphere covers
theta,epsilon,DoP - Stokes vector covers Poincare sphere plus
pow
In the framework I'd like to propose covers all 7 DoFs, i.e. the same as Chimpan's vector E plus DoP. (Please note that I do want to propagate the vector E and not matrix P, in order to be able to evaluate instantaneous ray's power and therefore Monte Carlo redistribution of rays. And to save some memory at the same time.)
Now, what exact representation of the above? Although I like a lot the idea of common WCS for all quantities without changing to LCS of each frame, after some doodling on a paper it seems to me that simplest and still very efficient is actually to switch to ray's LCS at each interface. In order to enable this, we need an un-ambiguous definition of LCS based solely on beam's direction:
k1 = normalize([k3[2]+k3[3], -k3[1]+k3[3], -k3[1]-k3[2]])
k2 = cross(k1, k3)
T = [k1 k2 k3]
This gives us an ortho coord transformation matrix T to go from-to the LCS of a ray at the interface. The T is thus fully defined by current OpticSim's ray property (direction). Then, the new elements to add are: theta, epsilon (perhaps find a better name not resembling eps()), DoP which constitutes the SoP, and phi (or phase) for phase. pow also already exists.
In the LCS, we can start to process the most basic existing interfaces such as Fresnel. I propose to add -- as a means of debug and development helper -- the Chipman's P-matrix in order to validate the results, to one of the AbstractPolarizations. For "production" I'd however advice to put it aside.
Memory-wise the current ray will need to carry 4 more Float values.
FYI: @BrianGun
@hrobotron @mattderstine
A Monte Carlo approach should handle the problem of ray splitting in birefringent materials without causing an explosion of rays. We currently do this to handle reflection/transmission and it seems effective.
A possible advantage of using the P matrix instead of the E vector is that one can compute the response for any type of input polarization without having to rerun the raytrace. This would only work for a single source, i.e., you'd need to compute one P matrix image for each source.
I still need to better understand how P matrices are added, as opposed to being composed by multiplication, to verify that this is true. If it is then you could do a ray trace, propagate the P matrices through the system and store the resulting P matrix at each pixel. Then using that P matrix image you could generate results for aribtrary input polarizations. This would be extremely fast even for large images.
If you have multiple sources then you'd need to create multiple P matrix images. Unless your images are very large or you have many sources it would still be faster to compute the output for a different polarization this way than to redo the raytrace.
Whereas it seems that if you propagate the E vector then your result is only useful for that polarization. If you change the polarization of the source then you have to rerun the entire raytrace.
The extra overhead from computing and propagating the P matrices should be small compared to everything else being done during the raytrace.
Don't know why the branch doesn't have the latest changes I've made locally. I'll look at it today.
Give me a few more days to add the P matrix computations before you start munging on the code so we don't get too out of sync. Then you'll have something to build on and create tests against.
Dear @hrobotron, I have spent too much time on this already and I'm afraid this is going to be a bit blunt.
Forking at e and o is only necessary for birefringent material where the power actually is split. That said, for many problems, there is no other solution than to fork. It doesn't matter about the formalism used for tracking the polarization. I agree that if your problem only needs one of the forks, it would be nice if the code didn't do the extra calculations. It is more important that the code does the needed calculation and it is correct and understandable. (and as a side note, I am not convinced with the approach taken with the stochastic method used with r & t but that's for another thread.). As a justification for forking, I will point to the problem of analyzing the impact of stress birefringence on the "PSF" of an objective lens. This cannot be analyzed unless the e and o rays are traced and recombined to understand the impact. This could be on two ray traces (one for e and one for o) or it could have forking. With a 30 element objective, I'd rather have the code fork for me for the 1000 rays that I need than have to set up 2^29 cases.
Assuming a SoP before the ray is traced has marginal benefit for most problems. It is more useful to understand what happens with variations to the source SoP when assessing the value of design. Requiring separate ray traces for each SoP is very wasteful in this instance.
For most applications, the intermediate P and Q matrices do not need to be saved. This is no different than only saving the ray data at a detector. This means the memory differences of one approach or that other are pretty much unimportant. If enough rays are to be traced that it does matter, the computation to the interesting quantities can be done at the end of each raytrace, for example at a polarization detector. For cases where a few rays are traced to see the evolution, having the full polarization state is better (see last paragraph).
In spite of the difficulty in getting it correct, the polarization raytrace is not the hardest part of the problem. It the most critical and must be absolutely correct and it must provide the capability for all of the analysis that a user needs. Creating and perfecting the code to actually analyze the results is much harder than doing the raytrace. Having the complete information about the polarization properties of the system in one consistent representation makes this implementation much easier and facilitates breaking problems up so that multiple people can address them.
The approach is Chipman is complete, well documented, and addresses nearly all possible problems. It is in book form so that many people can refer to it, study it, and have a common understanding of how all the parts work. While it all may not be computationally optimal, it is correct and users can spend their time understanding their optics problem rather than trying to understand if the issue they have is from a quirk in the code.
I told @BrianGun that I would work on the examples and problems from the book. This is actually a huge undertaking since it will require developing or testing numerous analysis routines. The only way I can possibly accomplish this is to have a reference that at least gives me hints on how to approach these routines. I will simply be unable to do this if we choose some subset of this solution. The key output of this effort is actually not the testing of the code, but the development of the analysis routines that others can use for similar problems.
OpticsSim has the potential to have a huge impact on the way optical design is done. We should be focusing on capabilities and rigor rather than on optimizing for Monte Carlo simulations. Basing code on well explained methods will only make it accessible to more users but also easier for those users to add and share capabilities. CodeV and Zemax suffer from having polarization grafted on. The notebook interface and the structure of OpticSim are ideally suited for solving problems that are difficult or impossible with those codes. Let's push to get that capability in users hands rather than getting bogged down in understanding if the code actually works for their applications.
I can see that in version 2 or 3 there may be a need for computational optimization. Let's wait until then to know what the users need and will provide benefit before we start pruning our options.
@mattderstine @hrobotron
I appreciate the time and care you are both putting into this. Your comments and analysis have helped me to better understand how to incorporate polarization into OpticSim.
It seems to me now that the total amount of code that will need to be rewritten/modified is small, on the order of at most a hundred lines, probably far less. It's tricky primarily because we are doing surgery on an existing system and have to be careful to make sure all the appropriate dependent functions are updated properly when we make a change.
Because the code changes are small and localized I'm less concerned with trying Chipman first and then switching to something entirely new later if Chipman proves insufficient. An initial implementation using Chipman's formulation will be a good way to start because it is well documented, relatively easy to understand, and easy to verify against the worked examples in the book. It won't be that big a deal to switch to something entirely different if we decide to change later.
Modifying the code to incorporate P matrices will be much easier for me to do than for either of you because of my familiarity with the code. I hope to have the P matrix propagation done soon after which I can turn it over to you all to experiment with and verify. Once we learn the limitations/power of the result we can turn the crank again and make necessary improvements or extensions.