Data granularity
Some of the exposures are large, but those rows might actually be individual policies covering many vehicles. I'll have to investigate or ask.
I don't believe every record is an individual risk. Each row appears to be a unique combination of vehicle_category_code, region_code, vehicle_code, sex_code, age_code, and vehicle_year.
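One quick way to test that reading is to check whether any key combination occurs more than once. The following is a minimal sketch that assumes the rows have been loaded as dictionaries keyed by the column names above; the function name and the sample values are made up for illustration.

```python
from collections import Counter

# Key columns named in this thread.
KEY_COLUMNS = (
    "vehicle_category_code",
    "region_code",
    "vehicle_code",
    "sex_code",
    "age_code",
    "vehicle_year",
)


def duplicate_keys(rows, key_columns=KEY_COLUMNS):
    """Return {key_tuple: count} for key combinations seen more than once."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample rows; the code values are invented for illustration.
sample_rows = [
    {"vehicle_category_code": 1, "region_code": 11, "vehicle_code": "A100",
     "sex_code": "M", "age_code": 3, "vehicle_year": 2008},
    {"vehicle_category_code": 1, "region_code": 11, "vehicle_code": "A100",
     "sex_code": "F", "age_code": 3, "vehicle_year": 2008},
    {"vehicle_category_code": 2, "region_code": 12, "vehicle_code": "B200",
     "sex_code": "M", "age_code": 4, "vehicle_year": 2010},
]

print(duplicate_keys(sample_rows))
```

An empty result on the real data would be consistent with one row per combination, i.e. an aggregated table rather than one row per policy.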

I agree. Also, each individual risk can contribute at most 0.5 of exposure, since the database covers a 6-month period (even though the vast majority of auto policies in Brazil are annual).
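If that cap holds, a row's exposure gives a lower bound on how many vehicles it must contain. A minimal sketch, assuming exposure is measured in vehicle-years and that 0.5 really is the per-vehicle maximum; the function name and the example exposures are hypothetical.

```python
import math

MAX_EXPOSURE_PER_VEHICLE = 0.5  # 6-month window, exposure in vehicle-years


def min_vehicles(exposure, cap=MAX_EXPOSURE_PER_VEHICLE):
    """Smallest number of vehicles that could produce this row's exposure."""
    if exposure <= 0:
        return 0
    # The small tolerance guards against floating-point noise in the ratio.
    return max(1, math.ceil(exposure / cap - 1e-9))


# Hypothetical exposures, for illustration only.
for exposure in (0.3, 0.5, 1.0, 12.4):
    print(exposure, min_vehicles(exposure))
```

Any row with exposure above 0.5 has to be an aggregate of at least two vehicles, which is an easy flag to add before comparing rows.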
The aggregated rows shouldn't make a difference to the output of a GLM, as long as exposure is carried through as a weight or offset. What worries me a bit more is that some of the rows with the largest exposures seem to have a different premium rate than the single-risk rows.
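To put a number on that difference, one option is to compare the exposure-weighted premium rate of rows that could be single risks against rows that must be aggregates. This is only a sketch: the column names `exposure` and `premium` are assumptions (the actual names have not come up in this thread), the 0.5 threshold relies on the 6-month cap, and the sample figures are invented.

```python
def premium_rate(rows):
    """Exposure-weighted premium rate: total premium / total exposure."""
    total_exposure = sum(row["exposure"] for row in rows)
    if total_exposure == 0:
        return None  # no exposure in this group
    return sum(row["premium"] for row in rows) / total_exposure


def rate_by_row_size(rows, cap=0.5):
    """Split rows at the per-vehicle exposure cap and rate each group."""
    small = [row for row in rows if row["exposure"] <= cap]  # may be single risks
    large = [row for row in rows if row["exposure"] > cap]   # must be aggregates
    return {"small": premium_rate(small), "large": premium_rate(large)}


# Hypothetical rows; `exposure` and `premium` are assumed column names.
sample_rows = [
    {"exposure": 0.5, "premium": 400.0},
    {"exposure": 0.25, "premium": 200.0},
    {"exposure": 10.0, "premium": 6000.0},
    {"exposure": 5.0, "premium": 3000.0},
]

print(rate_by_row_size(sample_rows))
```

A large gap between the two rates would not by itself show a data problem, since the big rows could simply be a different mix of business, but it would be worth checking within each rating cell before fitting.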