Docs: HuggingFace (NLP) Migration Guide

Open luisquintanilla opened this issue 11 months ago • 3 comments

Add guidance on how to use (NLP) models from HuggingFace

Tokenizers
TorchSharp / ONNX
Tensors

Feb 11 '25 16:02 luisquintanilla

Install dependencies

pip install transformers torch torchvision torchaudio onnxruntime

from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn as nn
import onnxruntime as ort

1. Tokenization

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text = "Hugging Face is great!"
tokens = tokenizer(text, padding=True, truncation=True, return_tensors="pt")
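For anyone porting this step to C#, it may help to see roughly what the BERT tokenizer does to a word. Below is a minimal pure-Python sketch of greedy longest-match-first WordPiece splitting. The vocabulary and the helper names (`wordpiece`, `encode`, `TOY_VOCAB`) are made up for illustration, and the sketch skips the punctuation splitting and Unicode normalization that the real tokenizer performs.

```python
def wordpiece(word, vocab, unk="[UNK]", prefix="##"):
    """Greedy longest-match-first split of one lower-cased word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate  # continuation pieces carry "##"
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word becomes [UNK]
        pieces.append(match)
        start = end
    return pieces


def encode(text, vocab):
    """Lower-case, split on whitespace, WordPiece each word, add [CLS]/[SEP]."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens, [vocab[t] for t in tokens]


# Hypothetical toy vocabulary; the real bert-base-uncased vocab is far larger.
TOY_VOCAB = {t: i for i, t in enumerate(
    ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "hugging", "face",
     "is", "great", "token", "##izer", "##s"]
)}

tokens, ids = encode("Hugging Face tokenizers", TOY_VOCAB)
print(tokens)
print(ids)
```

A C# tokenizer should produce the same IDs as the Python one for the same text, so comparing the two on a handful of sentences is a cheap sanity check during migration.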

2. Load Model (Torch)

model = AutoModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    output = model(**tokens)

3. Convert PyTorch model to ONNX

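torch.onnx.export(
    model,  # Model
    (tokens["input_ids"], tokens["attention_mask"]),  # Inputs
    "bert_model.onnx",  # Output file
    input_names=["input_ids", "attention_mask"],
    output_names=["output"],
    # Mark both batch and sequence length as dynamic so inputs of any length work
    dynamic_axes={
        "input_ids": {0: "batch_size", 1: "sequence_length"},
        "attention_mask": {0: "batch_size", 1: "sequence_length"},
    },
    opset_version=14,  # recent transformers attention ops may need opset 14 or later
)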

4. Run ONNX Model

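ort_session = ort.InferenceSession("bert_model.onnx")
# Feed only the inputs the exported graph declares (token_type_ids was not exported)
input_names = {i.name for i in ort_session.get_inputs()}
onnx_inputs = {k: v.cpu().numpy() for k, v in tokens.items() if k in input_names}
onnx_output = ort_session.run(None, onnx_inputs)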
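The `padding=True` option in step 1 pads every sequence to the longest one in the batch and returns a matching `attention_mask`. When building the ONNX inputs by hand (which a C# port may need to do), that logic can be sketched as below. This is a plain-Python illustration: `pad_batch` is a hypothetical helper, 101, 102 and 0 are the `[CLS]`, `[SEP]` and `[PAD]` IDs of bert-base-uncased, and the other IDs are made up.

```python
def pad_batch(batch_ids, pad_id=0):
    """Right-pad each ID list to the longest one and build the matching mask."""
    max_len = max(len(ids) for ids in batch_ids)
    input_ids, attention_mask = [], []
    for ids in batch_ids:
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Two sequences of different length; IDs 7, 8, 9 are placeholders.
batch = pad_batch([[101, 7, 8, 9, 102], [101, 7, 102]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Both lists are rectangular with shape (batch_size, sequence_length), which is the layout the exported model expects as 64-bit integer tensors.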

5. Convert Output to Tensor (PyTorch)

output_tensor = torch.tensor(onnx_output[0])
print(output_tensor.shape)  # (batch_size, sequence_length, hidden_size)

Use this for better results.

Feb 16 '25 16:02 tarun111111

@tarun111111 This ticket is for documenting migration from the Hugging Face Python world to the C# world of:

System.Numerics.Tensors
Microsoft.ML.Tokenizers
ONNX / TorchSharp / ML.NET

The ONNX conversion is great documentation, but more is needed for this story to be complete:

Tokenizers - how to migrate from Hugging Face tokenizers to the new tokenizers from .NET 9
How to migrate the pipeline from Hugging Face to C#
And much more.

Feb 16 '25 20:02 tjwald

I'd be content if there was a Tokenizer.FromPretrained("tokenizer.json") factory method for my particular scenario :)
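Until such a factory exists, one possible workaround is to pull the vocabulary out of `tokenizer.json` and hand it to a tokenizer that can be built from a vocabulary file. The sketch below assumes the WordPiece/BPE layout, where `model.vocab` is a token-to-ID map; other model types store the vocabulary differently. `export_vocab` and the trimmed sample file are hypothetical, and whether a given .NET tokenizer accepts the resulting `vocab.txt` should be checked against its API.

```python
import json
import tempfile
from pathlib import Path


def export_vocab(tokenizer_json: Path, vocab_txt: Path) -> int:
    """Write the vocab in tokenizer.json to vocab.txt, one token per line in ID order."""
    model = json.loads(tokenizer_json.read_text(encoding="utf-8"))["model"]
    vocab = model["vocab"]  # assumed to be a token -> id mapping
    ordered = sorted(vocab, key=vocab.get)
    vocab_txt.write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return len(ordered)


# Hypothetical, heavily trimmed tokenizer.json written to a temp directory for the demo.
workdir = Path(tempfile.mkdtemp())
sample = {"model": {"type": "WordPiece",
                    "vocab": {"[PAD]": 0, "[UNK]": 1, "hello": 2, "##s": 3}}}
(workdir / "tokenizer.json").write_text(json.dumps(sample), encoding="utf-8")

count = export_vocab(workdir / "tokenizer.json", workdir / "vocab.txt")
lines = (workdir / "vocab.txt").read_text(encoding="utf-8").splitlines()
print(count, lines)
```

This only carries over the vocabulary. Normalizer, pre-tokenizer and post-processor settings in `tokenizer.json` would still need to be matched on the C# side.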

Mar 13 '25 17:03 kzu