While parsing a protocol message, the input ended unexpectedly in the middle of a field.
Hi, thanks for this library.
I am trying to parse country-specific pbf files that I downloaded from geofabrik.de, and I consistently get the following error with every country file I try (for example https://download.geofabrik.de/europe/belgium-latest.osm.pbf).
The code for parsing I am using is as follows:
final PbfIterator iterator = new PbfIterator(inputStream, false);
final InMemoryMapDataSet data = MapDataSetLoader.read(iterator, false, false, true);
I get the following error when reading the data with MapDataSetLoader:
java.lang.RuntimeException: error while reading block
at de.topobyte.osm4j.pbf.seq.PbfIterator.ensureBeyondBounds(PbfIterator.java:228)
at de.topobyte.osm4j.pbf.seq.PbfIterator.hasBounds(PbfIterator.java:211)
at de.topobyte.osm4j.core.dataset.MapDataSetLoader.read(MapDataSetLoader.java:78)
at io.referential.data.OsmFileProcessor.main(OsmFileProcessor.java:143)
Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:107)
at com.google.protobuf.GeneratedMessageLite.parsePartialFrom(GeneratedMessageLite.java:1584)
at com.google.protobuf.GeneratedMessageLite.parseFrom(GeneratedMessageLite.java:1671)
at de.topobyte.osm4j.pbf.protobuf.Fileformat$BlobHeader.parseFrom(Fileformat.java:1094)
at de.topobyte.osm4j.pbf.util.PbfUtil.parseHeader(PbfUtil.java:92)
at de.topobyte.osm4j.pbf.util.PbfUtil.parseHeader(PbfUtil.java:80)
at de.topobyte.osm4j.pbf.seq.PbfIterator.advanceBlock(PbfIterator.java:129)
at de.topobyte.osm4j.pbf.seq.PbfIterator.tryAdvanceBlock(PbfIterator.java:120)
at de.topobyte.osm4j.pbf.seq.PbfIterator.ensureBeyondBounds(PbfIterator.java:226)
... 3 common frames omitted
I am using osm4j-core, osm4j-pbf, osm4j-geometry 1.2.0.
Hmm, not sure what's going on there. I've just tried this example here:
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

import de.topobyte.osm4j.core.access.OsmIterator;
import de.topobyte.osm4j.core.dataset.InMemoryMapDataSet;
import de.topobyte.osm4j.core.dataset.MapDataSetLoader;
import de.topobyte.osm4j.pbf.seq.PbfIterator;

public class TestParsePbf
{

	public static void main(String[] args) throws IOException
	{
		String url = "https://download.geofabrik.de/europe/netherlands/zeeland-latest.osm.pbf";
		InputStream input = new URL(url).openStream();
		OsmIterator iterator = new PbfIterator(input, false);
		final InMemoryMapDataSet data = MapDataSetLoader.read(iterator, false,
				false, true);
		System.out.println("number of nodes: " + data.getNodes().size());
		System.out.println("number of ways: " + data.getWays().size());
		System.out.println("number of relations: " + data.getRelations().size());
	}

}
and it runs fine and I get this output:
number of nodes: 4053324
number of ways: 527827
number of relations: 9100
I don't see that I did anything differently from you. Could you still try this with your setup and see whether it produces the same problem you described above?
I tried with a smaller region than the whole of Belgium because it yields results more quickly, and also because I get an OutOfMemoryError unless I increase my heap size to 12G. I have now tried with Belgium anyway and it also went through fine...
number of nodes: 55861654
number of ways: 8599759
number of relations: 89978
I had a hunch that you might be using the files with metadata from https://osm-internal.download.geofabrik.de/, but I can parse those files fine, too.
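Since the exception message mentions truncated input, it might also be worth ruling out that the stream handed to PbfIterator does not start with PBF data at all (for example a partially consumed stream, or a saved error page instead of the file). Here is a minimal sketch of such a check. It assumes, per my understanding of the PBF format description, that a file starts with a 4-byte big-endian BlobHeader length that must be below 64 KiB. The class and method names are made up for illustration and are not part of osm4j.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PbfHeaderCheck
{

	// Assumed limit: the BlobHeader length must be below 64 KiB.
	private static final int MAX_HEADER_LENGTH = 64 * 1024;

	// Reads the first 4 bytes as a big-endian int and checks that the
	// value is a plausible BlobHeader length. Throws an IOException
	// (EOFException if fewer than 4 bytes are available) otherwise.
	public static int readFirstHeaderLength(InputStream in) throws IOException
	{
		DataInputStream data = new DataInputStream(in);
		int length = data.readInt();
		if (length <= 0 || length >= MAX_HEADER_LENGTH) {
			throw new IOException("implausible BlobHeader length: " + length);
		}
		return length;
	}

	public static void main(String[] args) throws IOException
	{
		if (args.length == 0) {
			System.out.println("usage: PbfHeaderCheck <file.osm.pbf>");
			return;
		}
		try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
			System.out.println("first BlobHeader length: "
					+ readFirstHeaderLength(in));
		}
	}

}
```

If this already fails on the downloaded file, the problem is probably in the file or the stream rather than in the parsing code. It only looks at the very beginning, so passing it does not prove that the rest of the file is intact.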
Another hunch: if you're using osm4j in a project with multiple other dependencies, it's possible that you end up with a different version of com.google.protobuf:protobuf-javalite, which might lead to this problem. osm4j-pbf is currently built using com.google.protobuf:protobuf-javalite:3.9.1 (from August 2019). I guess we could upgrade to 3.21.9, especially given that versions before 3.21.7 seem to be vulnerable to a DoS attack.
So what you could try is to read some data from a minimal project such as https://github.com/topobyte/osm4j-examples and see if the problem shows up there too. If not, I guess adding your other dependencies in batches could reveal which one is causing the implicit upgrade (or use something like ./gradlew dependencies to find out).
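To see which protobuf jar actually gets loaded at runtime, something like the following could be run from within the affected project. It is only a sketch that uses standard JDK reflection calls. The class name WhichJar and the method locate are made up for illustration. The file name of the reported jar usually includes the version number.

```java
import java.net.URL;
import java.security.CodeSource;

public class WhichJar
{

	// Returns the location (typically a jar path) that the named class
	// is loaded from, or a short note if that cannot be determined.
	public static String locate(String className)
	{
		try {
			// 'false' avoids running the class's static initializers
			Class<?> c = Class.forName(className, false,
					WhichJar.class.getClassLoader());
			CodeSource source = c.getProtectionDomain().getCodeSource();
			URL location = source == null ? null : source.getLocation();
			if (location == null) {
				return "(no code source, e.g. a JDK class)";
			}
			return location.toString();
		} catch (ClassNotFoundException e) {
			return "(not on classpath)";
		}
	}

	public static void main(String[] args)
	{
		// GeneratedMessageLite appears in the stack trace above
		System.out.println(
				locate("com.google.protobuf.GeneratedMessageLite"));
	}

}
```

If the printed jar is not protobuf-javalite 3.9.1, some other dependency is probably pulling in a different protobuf artifact or version.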
sebkur, thanks for your investigation. I will verify the protobuf version I am using and run the example above. Thank you.
Closing this due to inactivity. Feel free to reopen if you're still experiencing the problem.