libgaba icon indicating copy to clipboard operation
libgaba copied to clipboard

why alignment results are 16 characters shorter than input sequences?

Open chenying2016 opened this issue 8 years ago • 1 comments

In bench.c, we make sequences a and b identical sequences and both of length 10000. And we then add a statement
printf("m(%lld)\n", m);
after line 3674 of file gaba.c. When we compile and run bench.c, it always output m(9984) while the normal output should be m(10000). Is there something going wrong?

chenying2016 avatar Aug 20 '17 07:08 chenying2016

First of all, I'm sorry the bench.c being incorrect in calling the libgaba APIs. The correct one will be the following:

/* root coordinates; align from the heads */
uint32_t asp = 0, bsp = 0;

/* margin */
char const m[96] = {0};

/* build section structs */
gaba_section_t asec = gaba_build_section(0, a, strlen(a));
gaba_section_t bsec = gaba_build_section(2, b, strlen(b));
gaba_section_t msec = gaba_build_section(4, &m[32], 32);

/* keep current pointers in ap and bp */
gaba_section_t const *ap = &asec, *bp = &bsec;

/* fill the body of the banded matrix */
gaba_fill_t *f = gaba_dp_fill_root(
	dp, ap, asp, bp, bsp);

/* keep section with the maximum cell value in m */
gaba_fill_t *m = f;

/* fill-in the tail of the banded matrix */
uint32_t flag = GABA_STATUS_TERM;
do {
	if (f->status & GABA_STATUS_UPDATE_A) {
		ap = &msec;
	}
	if (f->status & GABA_STATUS_UPDATE_B) {
		bp = &msec;
	}
	flag |= f->status & (
		GABA_STATUS_UPDATE_A | GABA_STATUS_UPDATE_B
	);
	f = gaba_dp_fill(dp, f, ap, bp);
	m = (f->max > m->max)? f : m;
} while (!(flag & f->status));

/* fill-in is done, m holds block with the maximum */
printf("score(%lld)\n", m->max);

The behavior of the library comes from its design principle to handle each input sequence as a concatenation of sequence segments. The library forwards the front of the band (or ``extends the band'') till 32-base-long subsequence pair is available (see Fig in https://github.com/ocxtal/libgaba#sequences). So, to align toward the tips of two sequences we have to pass margin or dummy sequences after the sequences of interest.

Thanks,

Hajime Suzuki

ocxtal avatar Aug 21 '17 04:08 ocxtal