
Dispose `Scalar`s implicitly created in `Tensor` operators

Open ds5678 opened this issue 1 year ago • 6 comments

This pull request modifies the Tensor operators involving primitives to immediately dispose the Scalar object they create, rather than waiting for the garbage collector.
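
For illustration, a rough before/after sketch of the pattern (not the exact diff; the method names below are only stand-ins for the primitive-taking overloads in Tensor.Math.cs):

```csharp
using TorchSharp;
using static TorchSharp.torch;

public static class ScalarDisposalSketch
{
    // Before: the Scalar produced by the implicit double -> Scalar conversion
    // is only reclaimed once the garbage collector runs its finalizer.
    public static Tensor AddBefore(Tensor tensor, double value)
        => tensor.add(value);

    // After: the Scalar is created explicitly and disposed as soon as the
    // native call returns, via a using declaration.
    public static Tensor AddAfter(Tensor tensor, double value)
    {
        using Scalar s = value;
        return tensor.add(s);
    }
}
```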

ds5678 avatar Jan 10 '25 03:01 ds5678

@dotnet-policy-service agree

ds5678 avatar Jan 10 '25 04:01 ds5678

Is there anything I need to do for this to be merged?

ds5678 avatar May 27 '25 21:05 ds5678

  • FYI, let me share some of my findings that are not covered by the changes in this PR but are related to missing TorchSharp.Scalar.Dispose calls.
    • TorchSharp torch.Tensor.add_ may internally call another torch.Tensor.add_ overload with an implicit TorchSharp.Scalar conversion from a hardcoded 1.
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Tensor.Math.cs#L106
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Tensor.Math.cs#L129
      • consider the case where a user intentionally calls torch.Tensor.add_ for an in-place op.
      • update (2025/09/05): TorchSharp torch.Tensor.add also has the same symptom.
        • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Tensor.Math.cs#L58
        • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Tensor.Math.cs#L82
      • update (2025/09/05): TorchSharp torch.Tensor.addcdiv{,_} also have the same symptom.
        • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Tensor.Math.cs#L203
        • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Tensor.Math.cs#L228
      • update (2025/09/05): TorchSharp.Modules.BatchNorm.forward also has the same symptom.
        • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/NN/Normalization/BatchNorm.cs#L40
      • a possible workaround I can think of: carefully call torch.Tensor.add_ (or methods of the same kind) with a using-declared TorchSharp.Scalar-typed alpha (or argument of the same kind) passed explicitly; see the sketch after this list.
    • TorchSharp torch.nn.functional.leaky_relu internally calls torch.Tensor.leaky_relu{,_} with an implicit TorchSharp.Scalar conversion from the double-typed negative_slope.
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/NN/Activation/LeakyReLU.cs#L59
      • other modules and functions that directly use .NET-typed arguments in their forward calculation possibly have the same symptom.
      • a possible workaround I can think of: carefully call torch.Tensor.leaky_relu{,_} (or other methods of the same kind) with a using-declared TorchSharp.Scalar-typed negative_slope (or other arguments of the same kind) passed explicitly.
    • TorchSharp torch.arange may internally call another torch.arange overload with implicit TorchSharp.Scalar conversions from a hardcoded 0 and/or 1.
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Factories/Tensor.Factories.cs#L65
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Tensor/Factories/Tensor.Factories.cs#L73
      • a possible workaround I can think of: carefully call torch.arange with using-declared TorchSharp.Scalar-typed {start,stop,step}.
    • TorchSharp.Modules.SGD.step calls torch.Tensor.{add,add_,mul_} with implicit TorchSharp.Scalar conversions from double-typed values.
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Optimizers/SGD.cs#L158
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Optimizers/SGD.cs#L168
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Optimizers/SGD.cs#L172
      • https://github.com/dotnet/TorchSharp/blob/6ceda5336102bfb1e20b65ece1686fa9a8bd9a20/src/TorchSharp/Optimizers/SGD.cs#L181
      • other TorchSharp optimizers possibly have the same symptom.
      • a possible workaround: derive from TorchSharp.Modules.SGD (or another optimizer class) and override step.
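
A minimal sketch of the explicit-Scalar workaround above, assuming the Scalar-typed overloads referenced in the linked sources (Tensor.add_(Tensor, Scalar), Tensor.leaky_relu(Scalar), and torch.arange(Scalar, Scalar, Scalar)):

```csharp
using TorchSharp;
using static TorchSharp.torch;

var a = torch.randn(10);
var b = torch.randn(10);

// torch.Tensor.add_: pass alpha as a using-declared Scalar instead of relying on
// the implicit conversion from a hardcoded 1, so it is disposed deterministically.
using (Scalar alpha = 1.0)
{
    a.add_(b, alpha);
}

// torch.Tensor.leaky_relu: pass negative_slope as an explicit Scalar.
using (Scalar negativeSlope = 0.01)
using (var y = a.leaky_relu(negativeSlope))
{
    // use y ...
}

// torch.arange: pass start/stop/step as explicit Scalars.
using (Scalar start = 0.0, stop = 10.0, step = 1.0)
using (var r = torch.arange(start, stop, step))
{
    // use r ...
}
```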

hiyuh avatar Sep 04 '25 08:09 hiyuh

  • IMHO, relying on implicit conversions to a type that implements IDisposable (I mean TorchSharp.Scalar) inside TorchSharp easily produces serious memory leaks that users cannot work around on their side.
  • most probably, someone will have to do a global screening (and hopefully fixing) pass for TorchSharp.Scalar (and torch.Tensor as well, if possible), temporarily disabling their implicit conversions so the compiler flags every call site; otherwise TorchSharp definitely cannot be used for any long-running application.
    • and most deep-learning-based applications do run for a long time; e.g. long training runs, inference loops at FPS rates.
  • another possible solution I can think of is making TorchSharp.DisposeScope cover TorchSharp.Scalar as well.
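
To make the long-running concern concrete, here is a minimal sketch (assuming, as described above, that the primitive operator overloads create a TorchSharp.Scalar internally): even when every tensor is tracked by a DisposeScope, the implicitly created Scalars are not, so a long-running loop keeps accumulating native Scalar handles until their finalizers eventually run.

```csharp
using TorchSharp;

var x = torch.randn(1000, 1000);

for (int i = 0; i < 1_000_000; i++)
{
    // The scope tracks and disposes the tensors created inside it...
    using var scope = torch.NewDisposeScope();

    // ...but the Scalar implicitly created from 0.01 is not tracked by the
    // scope, so it is only reclaimed whenever the GC finalizes it.
    var y = x * 0.01;
}
```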

hiyuh avatar Sep 05 '25 05:09 hiyuh

@hiyuh if you want to make a pull request onto my branch with your additions, I'd happily merge it so that they can be part of the review for this pull request.

ds5678 avatar Sep 05 '25 07:09 ds5678

@ds5678
Thanks. I'm gradually moving my task from "avoiding memory leaks in my app code" to "fixing leaks in TorchSharp". I'll let you know if my work produces something that helps.

hiyuh avatar Sep 05 '25 08:09 hiyuh