lewton Refactor `OggStreamReader` and provide precise seek

Current OggStreamReader has several issues:

We cannot seek the stream at sample-level (closes #16)
get_last_absgp provides insufficient information in some case (closes #84) and may not work for chained streams (closes #87)
Header fields are public, and subject to unwanted change (closes #85)
Behavior for multiplexed streams is unclear (closes #86)
~We cannot obtain the length of stream~ (#95): this is still unimplemented
It won't drop the samples in the first packet even when needed (which is required by spec)
It always requires Seek (Rustaudio/rodio#333) (Well, I know you may disagree about this)

So I implemented a sample-level seek.

Changes in API

OggStreamReader does no longer require io::Seek. Therefore it does not provide seek_absgp_pg.
Instead, new struct SeekableOggStreamReader is provided. This struct comes with seek_absgp, where users can seek the reader in sample-level.
- I had to keep those two structs separate because we need to do the different initialization steps and fields for seekable and unseekable reader (in specific, record the stream bounds).
OggStreamReader and SeekableOggStreamReader is dedicated to only a single stream. If you want to parse chained stream, you have to create a new one.
- Later, I'm going to implement some function like next_stream(self) -> Option<OggStreamReader> that consumes the reader, checks if the next packet exists, and returns if exists.
ident_hdr, comment_hdr, and setup_hdr is no longer public. Now they can be accessed only immutably through the corresponding getter functions.

Issues & Future Task

One major issue is that the page absgp in real-world ogg/vorbis file often contains errors. Specifically, each of the ogg files I have has 3-20 errors, classified to the following two types:

When a page ends with a "long" block and the next page starts with "short" block, then the absgp for the former page is wrong. In specific, all of the ogg files I have use "long" blocks of length 2048 and "short" blocks of length 512, and the absgp for the former page sometimes differs by 448. This type of error was found in almost all the ogg files I have.
The absgp for the last page is less than the number of total samples obtained from the entire stream, subtracted by the number of samples obtained from the last packet. The spec requires to drop the last few frames from the final packet according to the absgp provided in the page, as quoted below, but it seems that real-word ogg files requires to drop more samples.

A granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries

If the absgp provided in the pages has an error, then the seek result may be wrong, especially for the first type of error mentioned above. I'm now wondering that I've mistook some spec of the Vorbis stream, or this is the bug of many Vorbis encoders (all I have of which is based on libvorbis by the way). Yeah, it's only 448 frames in ~44100 fps data, so the diff is 0.01 seconds, so we may not have to care about that though.

Speaking about the incompatibility with the spec, I also realized that most of the ogg files does not follow the spec that " the second finished audio packet must flush the page on which it appears and the third packet begin a fresh page". I implemented the dropping-leading-samples features based on this fact, but it does not actually run for most of the files. Anyway I followed the another requirement in the spec that "Failure to do so should, at worst, cause a decoder implementation to return incorrect positioning information for seeking operations at the very beginning of the stream". At any rate, most of the ogg files starts with absgp 0, which means that we don't have to drop the initial frames at all.

Here are the other tasks and concerns:

Only informal test is done yet for the new seeking function. Maybe should I include it in this repository as a test?
- Also, it is not yet tested against chained / multiplexed streams.
I have not checked the compatibility with async feature yet.
I didn't encountered an audio packet spanning more than one page, so there may be a bug for that case.
The code may not compile with MSRV.

And here are the issues I'll handle at last:

The code may not be well-formatted.
The commits are not rebased yet.

Feb 16 '21 03:02 TonalidadeHidrica

I'm opening a draft PR since it's quite a large change. As soon as I'm done with other tasks I'll switch. But earlier comments and critics are welcome!

Feb 16 '21 03:02 TonalidadeHidrica

Some things I forgot to write:

I'm wondering if I should implement Deref<Target=OggStreamReader> for SeekableOggStreamReader>. It's definitely convenient as it requires less changes on old code, but it is considered as anti-pattern.
This PR is dependent on RustAudio/ogg#22.

Feb 16 '21 07:02 TonalidadeHidrica

@TonalidadeHidrica thanks I'll take a look tomorrow.

Feb 16 '21 07:02 est31

@est31 Thanks a lot, but there's no rush!

Feb 16 '21 07:02 TonalidadeHidrica

You can make a trait containing the common methods of OggStreamReader and SeekableOggStreamReader, impl it for OggStreamReader, and then forward the calls into impl YourTrait for SeekableOggStreamReader. That will also make it more convenient for people extending on Lewton.

May 08 '21 23:05 ghost

Is this still being worked on?

May 10 '21 08:05 ghost

The implementations are working correctly in my product so I'm currently waiting for a review. Also, I'm busy recently so if I receive a review, I will not be able to respond instantly.

May 10 '21 08:05 TonalidadeHidrica

Alright then it's just up the RustAudio to review this then... which seems to be taking a while, this was last commented on by a org member 3 months ago. Do you already have a branch of ogg with those two PRs merged?

May 10 '21 08:05 ghost

@JackRedstonia better-seek branch in my fork is my state-of-the-art branch which includes every improvements I made :)

May 10 '21 08:05 TonalidadeHidrica

based and forked

May 10 '21 08:05 ghost

Just to clarify, the SeekableOggStreamReader::stream_end_pos function returns the byte position of the last logical packet, right? I'm not exactly sure how to go from here to getting the stream duration in granules, as I don't know much about the ogg/vorbis format.

May 10 '21 09:05 ghost

@JackRedstonia Oops, sorry. You're right. stream_end_pos seems to return the final byte position, so I DIDN'T implement it. I'll remove the "fixes" tag. (I guess it is quite easy to implement it though)

May 10 '21 13:05 TonalidadeHidrica

Sorry I'm a bit busy atm. I think first the ogg PRs need to be reviewed, then this PR.

May 10 '21 21:05 est31

On getting the stream length, from my understanding, the steps needed are:

Seek to the final packet's byte position, which we can obtain using stream_end_pos
Read that packet
Get the absolute granule position
The stream length is the absolute granule position plus the packet's length in granules

Is this correct?

May 13 '21 21:05 ghost

@JackRedstonia one also has to take the first packet's absolute granule pos into account, but other than that, yes it's accurate.

May 13 '21 22:05 est31

wait no, one doesn't have to take the length of the last packet into account, but the length of the first packet. Absolute granule positions always refer to the last sample offset of the last packet finished on the page.

May 13 '21 22:05 est31

I don't actually remember the code I wrote few months ago, but now I'm wondering that stream_end_pos really seeks to the beginning of the last packet (or page?) of the current stream? Doesn't it seek to the real "end" of the stream? I've gotta check.

May 14 '21 00:05 TonalidadeHidrica

What is the correct way to use SeekableOggStreamReader? I am currently testing the code in this branch. If I call seek_absgp with certain values, inner_mut().read_dec_packet() returns BadAudio(AudioBadFormat). I can seek to small values (0-5000 works), but if I seek to a larger value, I get the error.

Jul 05 '21 20:07 hf29h8sh321

@hf29h8sh321 I'm very sorry to leave it undocumented. Since I'm quite busy and sleepy I cannot currently give you a precise example now, but perhaps seeing this file might help you. Sorry for the inconvenienve.

Jul 05 '21 21:07 TonalidadeHidrica

I copied that file into my project, and I cannot get any audio data after seeking. Calls to next only return None.

Jul 05 '21 22:07 hf29h8sh321

Hmm, at least it works for almost all of my ogg files, so there should be some difference...

Jul 05 '21 22:07 TonalidadeHidrica

@hf29h8sh321 Did you check out this messy gist of mine? (I'm not confident it would work right now...)

Jul 05 '21 22:07 TonalidadeHidrica

I have narrowed down the issue. I ran the code in the gist, which works with my own files. However, I then modified the code and removed this line: let samples = take_n_samples(&mut reader, 1_500_000, false)?;, and all other lines that depend on that line. Then, check_seek fails with Error: Vorbis bitstream audio decode problem.

It seems like seeking into the middle of the stream without reading it without seeking first causes the error.

Jul 06 '21 00:07 hf29h8sh321

I think if you remove first 2000 bytes or so of the audio file. it should work properly. Thats where all the "garbage" data is stored.

Jul 08 '21 04:07 hsi3322

@hsi3322 Can you elaborate on what you mean by "garbage data"?

Jul 31 '21 04:07 hf29h8sh321

Thanks for keeping this PR alive! This is the single biggest issue of the lewton crate at the moment, I think.

Apr 10 '22 16:04 est31

You're welcome! The seeking is in fact crucial part of my project, and the feature has actually been thoroughly tested through my actually using. I'll resolve the conflict when I have time, and of course review is always welcome, in that case I'll respond to it as soon as I can.

Apr 10 '22 22:04 TonalidadeHidrica