simavr icon indicating copy to clipboard operation
simavr copied to clipboard

Precision of timer

Open messani opened this issue 9 years ago • 46 comments

I think that timers in simavr are not precise. I will try to describe it... I made callback in HOST code:

void callback(struct avr_t * avr, avr_io_addr_t addr, uint8_t v, void *param)
{
  static uint64_t prevCycle = 0;
  uint64_t diff = avr->cycle - prevCycle;
  prevCycle = avr->cycle;

  if (diff != 262144)
  {
    // this should not happen
  }
}

...

avr_register_io_write(avr, 0xff, callback, nullptr);

In AVR code I can call this callback this way:

  _SFR_MEM8(0xff) = 0xff;

I configured divider of TIMER0 to 1024 so it overflows every 1024*256=262144 CPU cycles. If I call callback from ISR(TIMER0_OVF_vect), it works as expected - cycle difference between calls is 262144. The issue is when I need to use output capture register (OCR0A). When I write OCR0A, timer is shifted by small amount of time. Here is callstack right at the moment, when internal timer is shifted:

1  avr_timer_tov         avr_timer.c 169  0x7f0f6f1775ac 
2  avr_timer_configure   avr_timer.c 324  0x7f0f6f177e6c 
3  avr_timer_reconfigure avr_timer.c 339  0x7f0f6f177ef7 
4  avr_timer_write_ocr   avr_timer.c 380  0x7f0f6f1780d9 
5  _avr_set_r            sim_core.c  186  0x7f0f6f165071 
6  _avr_set_ram          sim_core.c  237  0x7f0f6f1652c0 
7  avr_run_one           sim_core.c  1315 0x7f0f6f168df0 
8  avr_callback_run_raw  sim_avr.c   317  0x7f0f6f16df20 
9  avr_run               sim_avr.c   354  0x7f0f6f16e026 
10 main                  main.cpp    83   0x401afb       

At this moment, program counter is at instruction, which is writing to the register OCR0A:

            OCR0A = minHwTimeRaw & 0xff;
     c68:       c7 bc           out     0x27, r12       ; 39

Version: 1f8eab5d47c8b4ade5be05d495717349ced4a60a Simulated mcu: atmega328p

messani avatar Aug 28 '16 17:08 messani

I think that this is duplicate of issue 125

messani avatar Aug 28 '16 19:08 messani

Not sure of if my tests jive with what you were doing (I think there also may be more to the issue? one thing that comes to mind could be a pickle to work out), but I may have found the problem... Try the branch below and get back to us. I'm also going to put this on the other issue you mentioned... as yes, it could very well be part of the problem... I'm also going to check it on some of the code I was messing wih.

https://github.com/bsekisser/simavr/tree/bsekisser-timer-fix

bsekisser avatar Aug 29 '16 19:08 bsekisser

This is the closest to the original code and I checked that it passes the tests... I goofed on the last one.

https://github.com/bsekisser/simavr/tree/bsekisser-timer-fix2

bsekisser avatar Aug 29 '16 22:08 bsekisser

OK. I Believe that I have found cause. There are two similar problems. And there is possible third problem which will happen when AVR frequency is low.

1) Overflow cycle timer registration avr_timer_reconfigure() is called from two places. I think that it must be decided whether the timer is reseted (it runs again from zero) or only reconfigured. It is reconfigured when called from avr_timer_write_ocr and reseted when called from avr_timer_write. To make the story short, I changed the end of function avr_timer_reconfigure to this:

if (p->tov_cycles > 1) {
  if (reset) {
    avr_cycle_timer_register(p->io.avr, p->tov_cycles, avr_timer_tov, p);
    // calling it once, with when == 0 tells it to arm the A/B/C timers if needed
    p->tov_base = 0;
    avr_timer_tov(p->io.avr, p->io.avr->cycle, p);
  }
  else {
    uint64_t orig_tov_base = p->tov_base;
    avr_cycle_timer_register(p->io.avr, p->tov_cycles - (p->io.avr->cycle - orig_tov_base), avr_timer_tov, p);
    // calling it once, with when == 0 tells it to arm the A/B/C timers if needed
    p->tov_base = 0;
    avr_timer_tov(p->io.avr, orig_tov_base, p);
  }
}

After this change interrupt was called correctly periodically.

2) OCR cycle timer registration Same problem applies to OCR timer. Whenever OCR register is changed there is avr_cycle_timer_register called in avr_timer_tov. The second parameter 'when' is p->comp[compi].comp_cycles, but this is not correct. Because of this callback is called late.

3) Possible timing problem avr_timer_t::cs_div_clock can be rounded when simulated microcontroller has low frequency. For example if frequency is 1MHz and timer divider is 1024, this expression: p->cs_div_clock = clock >> p->cs_div[new_cs]; will be rounded. I think that different units should be used .

messani avatar Aug 30 '16 18:08 messani

Not sure if that is the best thing too do...  might be better to leave be and marked as in error... I'll take a look at what you just sent. On Aug 30, 2016 1:27 PM, messani [email protected] wrote:I have to delete some posts. They were wrong. If you have received them on email, ignore them. I will delete this post after your next post to keep this issue clean.

—You are receiving this because you commented.Reply to this email directly, view it on GitHub, or mute the thread.

bsekisser avatar Aug 30 '16 18:08 bsekisser

If I am not mistaken, I think I inadvertently found the issue you are referring to on point three... Check it if you would on the second fix I sent.

As for the other two points... plausable... I don't doubt it, and really could have been part of what I had been seeing... The code I ran did periodically bump the ocrs... while it worked for the most part, I had hints that there were possible issues... just no way to pinpoint it. Test cases demonstrating a fault goes a long way.

bsekisser avatar Aug 30 '16 19:08 bsekisser

3) Possible timing problem Yes - branches are trying to solve it. I think that best would be to use integer instead of float. I did not tried it - I found this bug by reading the code.

1), 2) I will try to make a fix in next few days. But because I don't know your style and I am C++ programmer, my solution probably will not be nice. I have actually already fixed issue 1), so I have to fix only issue 2).

messani avatar Aug 30 '16 19:08 messani

@frej Try this patch. It is dirty - I did not have time to write it properly, but it works for me.

1), 2) should be fixed (i have tested it and interrupts are called in correct time) 3) I have changed timer units from time (float) to ticks (uint32_t) - it should work well for AVR. Problem is asynchronous 32KHz oscillator. But I think that because it is asynchronous, you will get an error even on real hardware.

0001-timer-fix-dirty.zip

messani avatar Aug 31 '16 15:08 messani

Any chance you could fork a branch with this?  that is the prefered way to handle it...  I like some of the Ideas you have and some comments to make.  I can do it for you if you want me to and let busserror know but it would be better if you did it.  Meantime, I'll be trying it out myself.

On Aug 31, 2016 11:58 AM, messani [email protected] wrote:@frej Try this patch. It is dirty - I did not have time to write it properly, but it works for me.

1), 2) should be fixed (i have tested it and interrupts are called in correct time) 3) I have changed timer units from time (float) to ticks (uint32_t) - it should work well for AVR. Problem is asynchronous 32KHz oscillator. But I think that because it is asynchronous, you will get an error even on real hardware.

0001-timer-fix-dirty.zip

—You are receiving this because you commented.Reply to this email directly, view it on GitHub, or mute the thread.

bsekisser avatar Aug 31 '16 17:08 bsekisser

Yes, I can branch this. I did not it because I am not C programmer and as you can see I substituted bool by int (I need to read more code to understand code style). There are some TODOs too which I need to do :) I will try to do it properly and will branch it. In the meantime you can give me a feedback about patch.

messani avatar Aug 31 '16 19:08 messani

On Wed, 31 Aug 2016 12:08:07 -0700 messani [email protected] wrote:

Yes, I can branch this. I did not it because I am not C programmer and as you can see I substituted bool by int (I need to read more code to understand code style). There are some TODOs too which I need to do :) I will try to do it properly and will branch it. In the meantime you can give me a feedback about patch.

That's okay... that is the good thing about forking and branching on github... github makes it easy for everyone to comment and annotate submissions and thus why it is the preferred method.

bsekisser avatar Aug 31 '16 19:08 bsekisser

@messani I just tagged you and frej on a posted branch with your patch and put up my tips thus far.

bsekisser avatar Aug 31 '16 21:08 bsekisser

it is here. Try it.

messani avatar Sep 01 '16 04:09 messani

@bsekisser : I tried to put comments, but it is mixed with another changes. So It is not clear for me. Optimum way is to discus cs_div_clock first and then decide what changes to do in code.

I think that:

  • cs_div_clock should be uint (value of timer divider)
  • I would tolerate that async 32KHz trigger is not accurate. It is not accurate even on real hardware.

messani avatar Sep 01 '16 05:09 messani

If somebody has tests which are failing, send it to me, please...

messani avatar Sep 01 '16 05:09 messani

sorry for being hard on you...  const not a hard rule but I think it good practice...  busserror fairly picky himself and fairly close to what you many expect from him...  will try to get back to you but gotta go for now.

bsekisser avatar Sep 01 '16 18:09 bsekisser

the tests in the simavr test folder...  go in to that and do a 'make' and then 'make run_tests'. so you know, I'll be away till sometime next week...  I may respond but likely won't be feasable to work on the code. On Sep 1, 2016 1:33 AM, messani [email protected] wrote:If somebody has tests which are failing, send it to me, please...

—You are receiving this because you were mentioned.Reply to this email directly, view it on GitHub, or mute the thread.

bsekisser avatar Sep 01 '16 18:09 bsekisser

@bsekisser: No, I want to learn something and I am interested in others ideas and habits. I use const for pointers and references because it protects object from modifying. For int and especially in parameters it distract you when you want to read the code. I will keep rules of @buserror because he is owner of this repository. I read few files and he does not use const in this case.

I tried to run that test and the problem is that the test runs too quickly. I think that there is another bug. It seems that sleep is not working. I will check that.

I have my own application for testing interrupts. I have simple scheduler which uses overflow and OCR interrupt. I can schedule task to any time and it works very well now. Maybe there is another bug.

I am not native English speaker so my sentences sometimes can have a little aggresive atmosphere, so sorry.

messani avatar Sep 01 '16 19:09 messani

You are right - there is still something wrong with interrupts. I have understand the test.

messani avatar Sep 01 '16 19:09 messani

Try this. I think that it is OK now. I don't like casting to float and back but it is probably the best way. Would you suggest something different?

Test still ends with error, but it is caused by rounding error of asynchronous timer.

Program ran for too few cycles (12497336 < 12500000)

I think that we are in boundaries. It is neccessary to change test code if you think that this is OK.

messani avatar Sep 01 '16 19:09 messani

I have to apologize. You were right with the comment:

should this not still be ocr + 1?

I have commited this and I think that it is working well now. Test still fails but I think that it is because of rounding error.

If I run test now, I can see: Program ran for too few cycles (12499289 < 12500000)

If I change line p->cs_div_clock = (uint32_t)((float)avr->frequency * (1 << p->cs_div[new_cs]) / 32768); to p->cs_div_clock = (uint32_t)((float)avr->frequency * (1 << p->cs_div[new_cs]) / 32768) + 1;

test result is: Program ran for too many cycles (12505689 > 12500300)

We can lower an error by changing cs_div_clock to float, but I would not recommend it.

messani avatar Sep 01 '16 20:09 messani

I was thinking again about changing cs_div_clock to float and even tried it. The test passed which is OK. But still it will not work well at all circumstances. For example there will be still error if for example AVR frequency is 1MHz and divider is 1. For that case: 1000000*1/32768 = 30,517578125 Overflow will be after 256 cycles so it will be after 7812,5 cycles. So you will get 0.5 tick error per overflow. This is still OK but if you use CTC mode (which clears timer), timer will work differently when you reset it in different times.

For example: F_CPU=8MHz, divider=1 cs_div_clock=244.140625

OCR2A=2 - it means that it will take 2+1 periods to reset: 3*244.140625=732.421875 - it will be rounded to 732 ticks.

If this loops 10x, number of ticks will be 7320.

When OCR2A=29, it will take 30*244.140625=7324,21875 ticks which will be rounded to 7324 ticks.

So there is 4 ticks error with the same timer.

I know that this issues could be fixed by adding some more floats. But there is still risk of rounding error somewhere.

I would like to know your opinion about it.

messani avatar Sep 01 '16 21:09 messani

@messani, I finally got around to test 35f27285b51784ecee0e51518fb2a6c0b3c152d2 against the project from which the test-case in #125 is extracted. It is now executing correctly, great work!

frej avatar Sep 03 '16 18:09 frej

Everything got too complicated. I read docs about timers and I think that the test is wrong. I think that original test behaved well and my changes are wrong.

This is code from the test:

    int count = 0;
    while (count++ < 100) {
        // we read TCNT1, which should contain some sort of incrementing value
        tcnt = TCNT1;       // read it
        if (tcnt > 10000) {
            TCNT1 = 500;    // reset it arbitrarily
            PORTB ^= 2;     // mark it in the waveform file
        }
        sleep_cpu();        // this will sleep until a new timer2 tick interrupt occurs
    }

If sleep_cpu() is called, one of sleep modes is entered. But none of them has CLK(CPU) enabled (I read doc of atmega328p). So avr->cycle cannot be changed while avr is in sleep mode.

So I came to conclusion that test cannot measure tick count, because clock is stopped while avr is sleeping.

What do you think?

messani avatar Sep 04 '16 13:09 messani

peripheral clocks keep running while sleeping... simavr clock keeps going... or more accurately, skips forward in the event of sleep.

bsekisser avatar Sep 19 '16 19:09 bsekisser

Just catching up here, any progress on that?

buserror avatar Oct 19 '16 09:10 buserror

Commits which I have done helped. But I don't think that the solution is good.

It is written above. I was trying to fix the code so it would pass the test. But I think that test is bad. Test sets up timer to interrupt code 64 times per second. CPU is at sleep mode most of the time of the test. Problem is that when CPU sleeps, CPU clock is not running (at least I think so). So you cannot measure time in CPU cycles. But test does.

Next problem is that the counter does not follow changes in frequency. But I think that simulator was not intended to work in this way.

Main disagreement between me and @bsekisser is that I think that the time should not be calculated as float. I know that there is a rounding error when you don't use float so timer frequency is not equal to 32768Hz but it is at least constant. But if you use float you would experience different errors. I don't know what are properties of all simulated AVRs. I am amateur and I am using atmega328p. You can connect only one crystal to this AVR. So one of clock sources is RC circuit which is inaccurate. If I consider this (which might not apply to all AVRs), it is better if it is inaccurate that if is not deterministic.

messani avatar Oct 19 '16 18:10 messani

This is a very interesting topic actually. I have an idea to make a special frequency counter component and test all the thing thoroughly in this simavr-based environment.

hovercraft-github avatar Oct 26 '16 03:10 hovercraft-github

BTW I'm quite clear that we shouldn't use floats to do calculations, they have nothing to do in simavr really. You can always used fixed point if really necessary, but flats have they own stack of problems and I feel we'll just be pushing the problem further under the carpet anyway.

buserror avatar Oct 26 '16 06:10 buserror

@buserror: I agree. I would get rid of floats too. I use them in my branch to calculate some values, but it was only because I did not want to risk an integer overflow. I think that changes I did in my branch improves behaviour. You can check them. I tried to follow your code style. I am happy with this modification in my own program.

messani avatar Oct 26 '16 16:10 messani