How to create a parser for later reuse.
We want to use nom to parse the streaming text output of an LLM, which contains a lot of structured content. We are trying to create a parser once and keep it for later reuse, because we need to match against the streaming output very often.
But when I tried it, I ran into a problem. nom::branch::Choice<T> implements Parser only for T = &mut [A], [A; N], or a tuple (A, B, ..); Vec<A> is not included. In our use case the set of tag names is not fixed, so we tried passing a Vec to nom::branch::alt, but the resulting Choice doesn't implement Parser. If we pass a mutable reference to the Vec's contents instead, the resulting Choice borrows the local variable tags, so it cannot be returned from the function.
The sample code:
use nom::Parser;
use nom::{self, IResult};

fn new_matcher<'a>(
    tag_names: &[&'static str],
) -> impl Parser<&'a str, Output = &'a str, Error = nom::error::Error<&'a str>> {
    let mut tags: Vec<_> = tag_names
        .iter()
        .map(|tag_name: &&str| {
            nom::bytes::streaming::tag::<&str, &'a str, nom::error::Error<&'a str>>(*tag_name)
        })
        .collect();
    // wrong because all_tag_name will borrow local variable
    let all_tag_name = nom::branch::alt(tags.as_mut());
    // wrong because the result Choice doesn't implement nom Parser
    // let all_tag_name = nom::branch::alt(tags);
    all_tag_name
}
I'm wondering what the correct way is to create and save a nom Parser for later use.
Usually, when you get the error that Choice does not implement Parser, it is because not all branches (in a tuple) have the same Output type. (In an array, they have to have the same actual type.)
If you're passing in a Vec, you can always call as_mut_slice() on it, since Choice implements Parser for &mut [A].
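To make the "same actual type" point concrete without pulling in nom, here is a minimal std-only sketch. The names make_tag, match_bb, first_match and BoxedMatcher are made up for illustration and are not nom APIs. Matchers built by the same function share one concrete type and can sit in a Vec directly; matchers of different types have to be unified first, here by boxing them as trait objects.

```rust
// Std-only sketch; `make_tag`, `match_bb`, `first_match` and `BoxedMatcher`
// are illustrative names, not nom APIs.

/// Builds a prefix matcher for `tag`, returning (remaining input, matched tag).
/// Every call returns the same opaque type, so results can share one Vec.
fn make_tag(tag: &'static str) -> impl Fn(&str) -> Option<(&str, &str)> {
    move |input| input.strip_prefix(tag).map(|rest| (rest, &input[..tag.len()]))
}

/// A plain function: a different type than anything `make_tag` returns.
fn match_bb(input: &str) -> Option<(&str, &str)> {
    input.strip_prefix("bb").map(|rest| (rest, "bb"))
}

/// Tries each matcher in order; all elements of the slice share the type `F`.
fn first_match<'i, F>(matchers: &[F], input: &'i str) -> Option<(&'i str, &'i str)>
where
    F: Fn(&str) -> Option<(&str, &str)>,
{
    matchers.iter().find_map(|m| m(input))
}

type BoxedMatcher = Box<dyn Fn(&str) -> Option<(&str, &str)>>;

fn main() {
    // One concrete type for every element, so a plain Vec works.
    let same: Vec<_> = vec!["aa", "bb", "cc"].into_iter().map(make_tag).collect();
    assert_eq!(first_match(&same, "bbfrance"), Some(("france", "bb")));

    // Different concrete types must be unified first, here as trait objects.
    let a: BoxedMatcher = Box::new(make_tag("aa"));
    let b: BoxedMatcher = Box::new(match_bb);
    let mixed = vec![a, b];
    assert_eq!(first_match(&mixed, "aab"), Some(("b", "aa")));
    assert_eq!(first_match(&mixed, "zz"), None);
}
```

In the question's code every tag parser comes out of the same .map(...) closure, so the Vec itself is fine; the remaining problem there is ownership, not mismatched types.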
Edit: This seems to do what you'd want. (Check out this comment's edit history to see a "proper" trait-based implementation.)
use nom::Parser;

fn tag_collection<'a, E>(
    tag_names: &[&'static str],
) -> impl Parser<&'a str, Output = &'a str, Error = E>
where
    E: nom::error::ParseError<&'a str>,
{
    // Build one tag parser per name; they all share the same concrete type.
    let mut tags: Vec<_> = tag_names
        .iter()
        .map(|tag_name: &&str| nom::bytes::tag::<&str, &'a str, E>(*tag_name))
        .collect();
    // `move` transfers ownership of `tags` into the closure, so the returned
    // parser does not borrow a local. `alt` is rebuilt over the slice per call.
    move |input| nom::branch::alt(tags.as_mut_slice()).parse_complete(input)
}

fn main() {
    let input = "bbfranceaaaaabbb";
    let tags = vec!["aa", "bb", "cc"];
    // Create the parser once and keep it around for reuse.
    let mut parser = tag_collection::<nom::error::Error<&str>>(tags.as_slice());
    let result = parser.parse_complete(input).unwrap();
    dbg!(result);
}
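The part that makes this work is that the Vec is moved into the returned closure, so the parser owns its branches instead of borrowing a local. Below is a rough std-only sketch of that ownership pattern with a streaming-flavoured result. Step and new_matcher are hypothetical names, and the matching logic is a simplification that does not reproduce nom's exact semantics.

```rust
// Std-only sketch of the ownership pattern; `Step` and `new_matcher` are
// illustrative names, not part of nom.

#[derive(Debug, PartialEq)]
enum Step<'i> {
    /// A tag matched: (remaining input, matched tag).
    Matched(&'i str, &'i str),
    /// The input so far is a proper prefix of some tag; more data is needed.
    Incomplete,
    /// No tag can match at this position.
    NoMatch,
}

/// Returns a reusable matcher that owns its tag list.
fn new_matcher(tags: Vec<String>) -> impl for<'i> FnMut(&'i str) -> Step<'i> {
    // `move` makes the closure the owner of `tags`, so nothing is borrowed
    // from the caller's stack frame.
    move |input| {
        let mut incomplete = false;
        for tag in &tags {
            if let Some(rest) = input.strip_prefix(tag.as_str()) {
                return Step::Matched(rest, &input[..tag.len()]);
            }
            // A shorter input that is a prefix of the tag may match later.
            if tag.starts_with(input) {
                incomplete = true;
            }
        }
        if incomplete { Step::Incomplete } else { Step::NoMatch }
    }
}

fn main() {
    let mut matcher = {
        // The tag list is a short-lived local; the matcher takes ownership.
        let names = vec!["aa".to_string(), "bb".to_string(), "cc".to_string()];
        new_matcher(names)
    };
    // The same matcher is reused across chunks of streamed text.
    assert_eq!(matcher("bbfranceaaaaabbb"), Step::Matched("franceaaaaabbb", "bb"));
    assert_eq!(matcher("a"), Step::Incomplete);
    assert_eq!(matcher("xyz"), Step::NoMatch);
}
```

Taking the tags by value keeps the returned type free of any borrowed lifetime, so the matcher can be stored in a struct field or passed around freely.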