automatalib PaigeTarjanMinimization.minimizeDFA does not guarantee minimal result

The following code snippet creates and then minimizes a simple DFA:

public static void main(String[] args) throws IOException {
	CompactDFA<String> dfa = new CompactDFA<>(new ArrayAlphabet<>("e1", "e2", "e3"));
	
	int s0 = dfa.addInitialState(true);
	int s1 = dfa.addState(false);
	int s2 = dfa.addState(false);
	
	dfa.addTransition(s0, "e1", s1);
	dfa.addTransition(s1, "e2", s2);
	dfa.addTransition(s2, "e3", s1);
	
	StringBuilder b = new StringBuilder();
	GraphDOT.write(dfa, dfa.getInputAlphabet(), b);
	
	CompactDFA<String> minimalDfa = PaigeTarjanMinimization.minimizeDFA(dfa);
	GraphDOT.write(minimalDfa, minimalDfa.getInputAlphabet(), b);
	
	System.out.println(b.toString());
}

Output:

digraph g {

	s0 [shape="doublecircle" label="0"];
	s1 [shape="circle" label="1"];
	s2 [shape="circle" label="2"];
	s0 -> s1 [label="e1"];
	s1 -> s2 [label="e2"];
	s2 -> s1 [label="e3"];

__start0 [label="" shape="none" width="0" height="0"];
__start0 -> s0;

}
digraph g {

	s0 [shape="doublecircle" label="0"];
	s1 [shape="circle" label="1"];
	s0 -> s1 [label="e1"];
	s1 -> s1 [label="e2"];

__start0 [label="" shape="none" width="0" height="0"];
__start0 -> s0;

}

However since dfa only accepts the empty string its minimal representation would only have one initial accepting state and no transitions. So the output of PaigeTarjanMinimization.minimizeDFA does not seem minimal.

Feb 04 '22 14:02 wytseoortwijn

Dear @wytseoortwijn, I think, this is more of a discussion between partial and total automata. For prefix-closed languages (your automaton accepts the prefix-closed language {epsilon}), the minimal, total automaton representation will always have a sink state that collects all rejected words. Technically, this representation can be further minimized by removing the sink state and relying on the semantics of undefined transitions (which usually reject words).

Your use-case is especially mean, because you have a partial automaton with two equivalent sink states s1 and s2. Unfortunately, AutomataLib has several places where the main algorithm is implemented for generic universal automata only which do not support special semantics such as "undefined transition" = "rejection". For generic automata, states/successors are often characterized structurally, e.g., via their state properties and transition properties of outgoing transitions, and differ from undefined successors. So this is not really an issue with the PaigeTarjan implementation but with the way the automaton structure is interpreted.

We are aware of this issue but haven't found a solution yet that doesn't require re-writing a lot of stuff.

For now, if you run into any issues with this, I would suggest you first decide which canonical representation you want (total vs. partial) and then either complete the automata beforehand (see, e.g. MutableDFAs) or remove any dead-ends afterwards (see, e.g., ShrinkableAutomaton#removeState).

Feb 05 '22 19:02 mtf90

With #73 merged, you may now use DFAs#trim to get an even more compressed view on the automaton (at the cost of introducing partiality).

May 26 '24 21:05 mtf90