
[BUG] *Sparse matrix error when calculating the transformer 63/10*

Open WinApiMan opened this issue 4 months ago • 13 comments

Hi! I encountered a problem. When calculating an asymmetrical short circuit (A) on the 0.4 kV busbar of the transformer with ID=1503, a sparse matrix error occurs. Moreover, when calculating an asymmetrical short circuit on the buses of a similar transformer, ID=1426, no such error occurs. We reconnected the objects: we replaced ID=1426 with ID=1503 and put 1426 in place of 1503. However, the error still occurs precisely where ID=1503 is placed. Perhaps this isn't a bug but my own mistake, and you can help me solve this problem.

Input Data Validity calculate_model 1503.json

calculate_model ID1426.json

calculate_output_model ID1426.json

Screenshots

Image

WinApiMan avatar Oct 21 '25 13:10 WinApiMan

Hi @WinApiMan, thank you for reporting this.

Can you try to load the data into Python and use the data validator (https://power-grid-model.readthedocs.io/en/stable/user_manual/data-validator.html)? This may expose some basic usage problems that may not be caught in the C API.

Can you please also provide the exact PGM version you are running (either release tag or Git SHA) and, ideally, an example script of how you run the PGM? With that, we can try to have a look at this tomorrow.

mgovers avatar Oct 21 '25 15:10 mgovers

FYI, I've been able to reproduce the SparseMatrixError and I also confirmed that our data validator does not signify any errors. We will dive into this now.

mgovers avatar Oct 22 '25 06:10 mgovers

Hi @mgovers, Thanks for the quick reply.

We are using version 1.12.54. I'm attaching a test example of the call script and errors.

Image Image

WinApiMan avatar Oct 22 '25 06:10 WinApiMan

Thank you for your extensive input, it really helps with the investigation. This indeed seems something that requires a deep-dive on our end. We will investigate.

mgovers avatar Oct 22 '25 06:10 mgovers

Hello @WinApiMan,

I want to let you know that we are actively working on this issue. We have found some leads, but nothing conclusive yet. We will communicate with you as soon as we have more concrete progress.

figueroa1395 avatar Oct 29 '25 10:10 figueroa1395

Hi @figueroa1395! Thank you. This issue is quite pressing for us, so we are waiting.

WinApiMan avatar Oct 29 '25 13:10 WinApiMan

@mgovers, @figueroa1395, Hi! We ran a series of further tests and found a pattern. If there is only one node after the transformer (after to_node), we get the sparse matrix error. However, when we added another node after to_node and connected it to to_node, the model was calculated. Maybe this will help you.

calculate_model_test_2041_invalid.json

calculate_model_test_2041_valid.json

WinApiMan avatar Nov 04 '25 13:11 WinApiMan

Hi @WinApiMan,

Thanks for thinking along! The problem is probably caused because:

  • The solvers used in PGM solve in the phase (abc) domain. We also use these solvers to solve short circuit calculations
  • In this particular case the grid is floating: no Dn or Yn connection + no shunt and the fault is 2 phase. In the phase domain this results in a sparse matrix error.
  • However, for short circuit calculations, this particular case is solvable in the sequence domain (012)

We might have wrongfully assumed that we can solve all short circuit calculations in the phase domain. If this is the case the solution will not be trivial, unfortunately.
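
To illustrate the last bullet, here is a small sketch of the symmetrical-component (Fortescue) transform applied to an idealized two-phase fault. The function name `to_sequence` and the 1000 A fault current are assumptions chosen for illustration; this is not PGM's solver code. With no ground return the phase currents sum to zero, so the zero-sequence current vanishes and only the positive- and negative-sequence networks are involved.

```python
import cmath
import math

# Fortescue operator: a rotation by 120 degrees.
A = cmath.exp(2j * math.pi / 3)


def to_sequence(ia, ib, ic):
    """Convert phase currents (a, b, c) to sequence currents (0, 1, 2)."""
    i0 = (ia + ib + ic) / 3
    i1 = (ia + A * ib + A * A * ic) / 3
    i2 = (ia + A * A * ib + A * ic) / 3
    return i0, i1, i2


# Idealized two-phase (b-c) fault on an ungrounded grid:
# phase a carries nothing, phases b and c carry opposite currents.
# The 1000 A magnitude is an arbitrary illustrative value.
i_fault = 1000.0
i0, i1, i2 = to_sequence(0.0, i_fault, -i_fault)

print(f"|I0| = {abs(i0):.3f} A")  # zero-sequence current
print(f"|I1| = {abs(i1):.3f} A")  # positive-sequence current
print(f"|I2| = {abs(i2):.3f} A")  # negative-sequence current
```

In this idealized case the zero-sequence component is zero and the positive- and negative-sequence currents are equal and opposite, which is consistent with the statement that the fault is solvable in the 012 domain even though the grid is floating.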

petersalemink95 avatar Nov 04 '25 14:11 petersalemink95

@mgovers , @petersalemink95, I don't think the problem lies in the asymmetrical short circuit calculation. Here's an example of the same error in the Power Flow and asymmetrical load calculation.

calculate_model PF invalid.json

WinApiMan avatar Nov 05 '25 08:11 WinApiMan

For analysis, we added an additional link after the transformer and before the load to the previous scheme. The calculation started and, at the 30th iteration, showed a numerical error of 0.02, but then the numerical error began to increase and a sparse matrix error occurred.

calculate_model PF invalid_2232.json

WinApiMan avatar Nov 05 '25 09:11 WinApiMan

Hi @WinApiMan,

You are right that a similar thing does happen in power flow calculations. Reason: we cannot do calculations in the phase domain on floating grids. However, there is a difference:

  • for power flow calculations this is expected
  • for short circuit calculations we should be able to calculate the short circuit current using the sequence domain

There are some potential resolutions that we're looking into:

  • re-writing the solvers of short circuit calculations to the sequence domain - this would be a lot of work and is not going to be a "quick fix"
  • Internally adding a large grounding impedance to the transformer - this can be easier to implement and might (to be investigated) also provide a solution in the power flow case
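
The idea behind the second option can be sketched with a toy 3x3 phase admittance matrix. This is an assumed, simplified model (three phases coupled only to each other through an admittance `y`, with an optional grounding admittance `g_ground` per phase); it is not the matrix PGM builds internally. Without any path to ground every row sums to zero and the matrix is singular; a very small grounding admittance (that is, a very large grounding impedance) makes it invertible.

```python
def floating_admittance(y, g_ground=0.0):
    """Toy phase-domain admittance matrix; g_ground = 0 means a floating grid."""
    d = 2 * y + g_ground
    return [
        [d, -y, -y],
        [-y, d, -y],
        [-y, -y, d],
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


y = 1.0
g = 1e-6  # large grounding impedance corresponds to a small admittance

det_floating = det3(floating_admittance(y))
det_grounded = det3(floating_admittance(y, g))

print("floating grid determinant:", det_floating)
print("weakly grounded determinant:", det_grounded)
```

For this toy matrix the determinant equals g * (3y + g)^2, so it is exactly zero when g is zero and small but nonzero otherwise. The small value also hints at why the size of the grounding impedance has to be chosen with numerical conditioning in mind.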

petersalemink95 avatar Nov 06 '25 14:11 petersalemink95

Hi @WinApiMan,

Update from our side: we're investigating the 2nd option I mentioned above - "Internally adding a large grounding impedance to the transformer". Currently we see promising results. We're adding more validation cases to prove that this is the right direction forward. If this is indeed the case we're expecting to have the full implementation available within 2 weeks.

petersalemink95 avatar Nov 13 '25 12:11 petersalemink95

@petersalemink95 , Great, I hope you're right :)

WinApiMan avatar Nov 13 '25 17:11 WinApiMan

@TonyXiang8787, @mgovers. We used the latest PGM version, 1.12.69. The sparse matrix error didn't disappear.

Files

calculate_model_1.json

Image Image

Additional test

calculate_model.json

WinApiMan avatar Nov 20 '25 07:11 WinApiMan

@WinApiMan ~Awesome, that's great to hear!~ EDIT: I mis-read. That's unfortunate. See below for continuation.

Thank you for your patience. It took us some time to come up with a good and stable solution (in particular, one that doesn't potentially break other things).

We will be working on some documentation additions in the near future regarding our findings, as well as some other, closely related improvements

mgovers avatar Nov 20 '25 07:11 mgovers

I apologize, I misread: you said it didn't disappear. That's a shame. I'll do a quick investigation.

mgovers avatar Nov 20 '25 07:11 mgovers

If I reduce max_iterations significantly, I do get another indication of what is happening: IterationDiverge. That points towards calculation instabilities. Setting max_iterations very high may lead to various types of behavior (limit cycles, divergences, ...) and therefore, at some point, the state can no longer be trusted. The real issue is the fact that the calculation does not converge.

I am still trying to investigate the underlying issue, but it seems to be a different problem from the one that this issue is about. If that is indeed the case, maybe we need to close this issue and open a new one.
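
As a generic illustration of why a modest iteration cap is informative, here is a toy fixed-point iteration. It is an assumption-laden sketch: the `IterationDiverge` class below is a local stand-in that only borrows the name of the error mentioned above, and `iterate` is a hypothetical helper, not a PGM function. A converging update stops early; an update stuck in a limit cycle never meets the tolerance, and the cap turns that into an explicit error instead of letting the state drift.

```python
class IterationDiverge(Exception):
    """Local stand-in for a 'did not converge' error (hypothetical)."""


def iterate(step, x0, error_tolerance=1e-8, max_iterations=20):
    """Repeat x <- step(x) until successive values differ by less than the tolerance."""
    x = x0
    for iteration in range(1, max_iterations + 1):
        x_next = step(x)
        if abs(x_next - x) < error_tolerance:
            return x_next, iteration
        x = x_next
    raise IterationDiverge(f"no convergence after {max_iterations} iterations")


# Converging update: fixed point at x = 2.
solution, used = iterate(lambda x: 0.5 * x + 1.0, 0.0, max_iterations=100)
print(f"converged to {solution:.6f} in {used} iterations")

# Limit cycle: the value alternates between 0.25 and 0.75 forever.
try:
    iterate(lambda x: 1.0 - x, 0.25, max_iterations=50)
    cycle_detected = False
except IterationDiverge as error:
    cycle_detected = True
    print("limit cycle case:", error)
```

Raising the cap in the second case would only repeat the same oscillation more often, which mirrors the point that the non-convergence itself is the thing to investigate.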

mgovers avatar Nov 20 '25 08:11 mgovers

@mgovers For analysis, we tried increasing the error_tolerance value, and at a value of 1e-1, an error appeared that we didn't understand

Image

WinApiMan avatar Nov 20 '25 08:11 WinApiMan

I think by that point, you're running into really weird edge cases (undefined behavior; see also https://power-grid-model.readthedocs.io/en/stable/advanced_documentation/terminology.html#undefined-behavior). Usually, that means that you've entered the wrong rabbit hole 😬 Instead, our recommendation is to try to come up with a smaller, simpler version of the grid that reproduces the same behavior (a minimal reproducible case). That's much simpler to reason about, and often is able to pinpoint the exact problem you're running into.

On that note, in the meantime, I've managed to reproduce the issue with a single source, transformer, sym load and asym load. For simplicity, I've also rounded some values to nearest power of 10, as well as remapped the IDs.

{
  "version": "1.0",
  "type": "input",
  "is_batch": false,
  "attributes": {},
  "data": {
    "node": [
      {"id": 0, "u_rated": 10000},
      {"id": 1, "u_rated": 1000}
    ],
    "transformer": [
      {"id": 2, "from_node": 0, "to_node": 1, "from_status": 1, "to_status": 1, "u1": 10000, "u2": 1000, "sn": 100000, "uk": 0.10000000000000001, "pk": 1000, "i0": 0.01, "p0": 100, "winding_from": 0, "winding_to": 1, "clock": 0, "tap_side": 0, "tap_pos": 3, "tap_min": 1, "tap_max": 5, "tap_nom": 3, "tap_size": 100, "uk_min": 0.10000000000000001, "uk_max": 0.10000000000000001, "pk_min": 1000, "pk_max": 1000, "r_grounding_from": 0, "x_grounding_from": 0, "r_grounding_to": 0, "x_grounding_to": 0}
    ],
    "source": [
      {"id": 3, "node": 0, "status": 1, "u_ref": 1, "u_ref_angle": 0, "sk": 100000000, "rx_ratio": 1, "z01_ratio": 1000000}
    ],
    "sym_load": [
      {"id": 4, "node": 0, "status": 1, "type": 0, "p_specified": 100000, "q_specified": 100000}
    ],
    "asym_load": [
      {"id": 5, "node": 1, "status": 1, "type": 0, "p_specified": [50000, 40000, 30000], "q_specified": [20000, 10000, 0]}
    ]
  }
}

At this point, this feels closely related to a separate topic we're already looking into, namely sparse matrix solving stability (see e.g. #1125 and #1167). We're iterating on that again because the original direction didn't seem to be the right one, so it's still work in progress. However, I can't be sure at this point and will have to take this back to the team first.

mgovers avatar Nov 20 '25 08:11 mgovers

@mgovers Thanks for the information. We use parts of a real network for testing, so this was the smallest part that we were able to generate quickly :)

WinApiMan avatar Nov 20 '25 10:11 WinApiMan

We've had some extensive discussions within the team and we found a couple things.

  1. There is at least one data error:
    1. z01_ratio should be of the order of magnitude of 1, not 1e6 (this seems to be a data error)
    2. sk, we expect to be of the order of magnitude of 1e12, not 1e7 ~ 1e8 (we expect this to be data error but we can't be sure)
  2. If these are solved, the underlying source of the iteration diverge changes from the transformers to the links:
    1. Links are known to cause calculation instabilities. This is related to the work in #1125 and #1167. Hopefully that will reduce the amount of edge cases we encounter in the mid- to long-term, but that will not provide a guaranteed workaround.
    2. As a workaround, you can try to replace the links with low-impedance lines, e.g. r1=1e-6, x1=1e-6, c1=0, tan1=0, r0=1e-6, x0=1e-6, c0=0, tan0=0. This seems to resolve the case you sent me.
  3. There are some other things that popped up in the discussion that we are going to take offline, but they probably do not affect you directly.

Can you please see if these things resolve the problems you are encountering?
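
A minimal sketch of the link-to-line workaround, operating on the "data" dictionary of a JSON input like the one posted earlier in this thread. The helper name `links_to_lines` is hypothetical, and the sketch assumes each link entry carries id, from_node, to_node, from_status and to_status; the line parameter values are the ones suggested above. Check the resulting input against the PGM component documentation before relying on it.

```python
def links_to_lines(data, r=1e-6, x=1e-6):
    """Return a copy of `data` in which every link is replaced by a low-impedance line."""
    converted = dict(data)  # shallow copy: the input dictionary is not modified
    lines = list(data.get("line", []))
    for link in data.get("link", []):
        lines.append(
            {
                "id": link["id"],
                "from_node": link["from_node"],
                "to_node": link["to_node"],
                "from_status": link["from_status"],
                "to_status": link["to_status"],
                # Positive-sequence parameters suggested in the thread.
                "r1": r, "x1": x, "c1": 0.0, "tan1": 0.0,
                # Zero-sequence parameters suggested in the thread.
                "r0": r, "x0": x, "c0": 0.0, "tan0": 0.0,
            }
        )
    converted["line"] = lines
    converted["link"] = []
    return converted


# Illustrative input with a single link (IDs are made up for this example).
sample = {
    "node": [{"id": 0, "u_rated": 10000}, {"id": 1, "u_rated": 10000}],
    "link": [{"id": 7, "from_node": 0, "to_node": 1, "from_status": 1, "to_status": 1}],
}
result = links_to_lines(sample)
print(result["line"])
```

The helper reuses the link IDs for the new lines, which keeps references from other components valid as long as no existing line already uses the same ID.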

mgovers avatar Nov 20 '25 11:11 mgovers

We can certainly change this, but it would be an artificial parameter change. We're using real network parameters: Un = 10,000 V, Isc = 5,000 A. If we use 1e12, the current would need to be increased by a factor of 10,000, and I've never encountered such currents in distribution networks in my entire career. Using the z0 coefficient in the source, we simulate a network with an isolated neutral; setting z0 = 1 would result in a network with a grounded neutral, and the problems would (1000%) not arise. We're interested in the library's behavior specifically in such weak and isolated networks. If you need it for your research, we can conduct an experiment with your parameters, but it will be far from realistic.
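
The numbers quoted here can be checked with the standard three-phase relation S_k = sqrt(3) * U_n * I_sc. The sketch below uses only the values from this thread (Un = 10,000 V, Isc = 5,000 A, and the suggested sk of 1e12); the function names are illustrative.

```python
import math


def short_circuit_power(u_n, i_sc):
    """Three-phase short circuit power in VA from line-to-line voltage and current."""
    return math.sqrt(3) * u_n * i_sc


def short_circuit_current(u_n, sk):
    """Three-phase short circuit current in A from line-to-line voltage and power."""
    return sk / (math.sqrt(3) * u_n)


u_n = 10_000.0   # V, from the thread
i_sc = 5_000.0   # A, from the thread

sk_real = short_circuit_power(u_n, i_sc)
i_sc_suggested = short_circuit_current(u_n, 1e12)

print(f"sk for the real network: {sk_real / 1e6:.1f} MVA")
print(f"Isc implied by sk = 1e12: {i_sc_suggested / 1e6:.1f} MA")
print(f"ratio to the real Isc: {i_sc_suggested / i_sc:.0f}")
```

The real network corresponds to an sk of roughly 87 MVA, which matches the order of magnitude (1e7 to 1e8) seen in the input data, while sk = 1e12 implies a current about four orders of magnitude larger.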

WinApiMan avatar Nov 20 '25 11:11 WinApiMan

Unfortunately, we can't quickly replace all the links with lines (the object generation algorithm needs to be changed), but we reduced their number and removed the direct connection between the load and the source. This did produce a workable model on this small portion of the network.

WinApiMan avatar Nov 20 '25 12:11 WinApiMan

@TonyXiang8787, @petersalemink95 or @nitbharambe, can one of you guys please give some input here as power system experts with electrical engineering background?

mgovers avatar Nov 20 '25 12:11 mgovers

Hi @WinApiMan,

The main problem with the network is your specified transformer configuration. You now have a Yyn transformer, which does not allow significant zero-sequence current flow on the secondary side, and there are asymmetrical loads connected on the secondary side.

I would not imagine such a configuration exists in reality. Most likely, I guess the actual transformer configuration is YynD, a three-winding transformer with an internal delta winding which has no connection to the outside. Maybe you can check the spec sheet of the transformer or your database?
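
For readers following along with the JSON input, the sketch below decodes the numeric winding_from, winding_to and clock fields into a vector group string. The integer-to-winding mapping is an assumption based on the PGM WindingType enumeration (0 = wye, 1 = wye with neutral, 2 = delta, 3 = zigzag, 4 = zigzag with neutral); verify it against the documentation for the version you use. The helper name `vector_group` is illustrative.

```python
# Assumed mapping of PGM winding codes to vector group letters.
WINDING_LETTERS = {0: "Y", 1: "YN", 2: "D", 3: "Z", 4: "ZN"}


def vector_group(winding_from, winding_to, clock):
    """Build a vector group string: upper case for the from side, lower case for the to side."""
    high_side = WINDING_LETTERS[winding_from]
    low_side = WINDING_LETTERS[winding_to].lower()
    return f"{high_side}{low_side}{clock}"


# Values taken from the transformer in the minimal reproduction case.
group = vector_group(winding_from=0, winding_to=1, clock=0)
print(group)
```

Under that assumed mapping, the transformer in the minimal case decodes to an ungrounded wye on the 10 kV side and a wye with neutral on the low-voltage side, which is the Yyn configuration discussed here.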

TonyXiang8787 avatar Nov 20 '25 12:11 TonyXiang8787

@TonyXiang8787, @mgovers, we are sending you the diagrams of two substations from the provided model. Current imbalance in the 0.4 kV network is entirely possible. It may not be as severe as shown in the model, but the values are close to reality.

Image Image

WinApiMan avatar Nov 20 '25 13:11 WinApiMan

Hi @WinApiMan,

Thanks for sharing the diagram. The diagram makes sense. However, it still does not show whether the transformer has an internal delta winding or not. That is usually not shown in the schematics. Maybe you need to check the actual spec sheet of the transformer?

TonyXiang8787 avatar Nov 20 '25 13:11 TonyXiang8787

@TonyXiang8787 We simulate a real section of a 10 kV network with loads on the 0.4 kV side. We also have transformers with Y/Yn windings.

Power transformer example:

https://thetimesenergygroup.uz/en/products/power-oil-transformer-tmg-100610-04.htm

Image

WinApiMan avatar Nov 20 '25 17:11 WinApiMan

Hi @WinApiMan,

Thanks for sharing the information and for your patience. I have done a more in-depth investigation and found that we indeed have another design fault in the zero-sequence modelling of transformers, especially for Yyn transformers. I have tried a quick fix and it works in your network with error_tolerance=1e-4 and the default maximum number of iterations.

However, it will take more time to adjust all the tests and add new tests before we can release the fix, so the release will take a while. If you also want to test this beta version yourself, we can build a wheel for you to test. Please indicate which operating system (and CPU architecture) you are using.

TonyXiang8787 avatar Nov 21 '25 12:11 TonyXiang8787

Hi @TonyXiang8787 Thanks for the quick reply! We use Windows x64 and we use your library for C++ (power_grid_model_c.dll). If you provide us with a test version for testing, we will be happy to test it and share the results.

WinApiMan avatar Nov 21 '25 13:11 WinApiMan