osm4j

While parsing a protocol message, the input ended unexpectedly in the middle of a field.

Open kkarski opened this issue 3 years ago • 5 comments

Hi, thanks for this library.

I am trying to parse country-specific PBF files that I downloaded from geofabrik.de, and I consistently get the following error for every country file I try (for example https://download.geofabrik.de/europe/belgium-latest.osm.pbf).

The parsing code I am using is as follows:

final PbfIterator iterator = new PbfIterator(inputStream, false);
final InMemoryMapDataSet data = MapDataSetLoader.read(iterator, false, false, true);

I get the following error when MapDataSetLoader.read is called:

java.lang.RuntimeException: error while reading block
	at de.topobyte.osm4j.pbf.seq.PbfIterator.ensureBeyondBounds(PbfIterator.java:228)
	at de.topobyte.osm4j.pbf.seq.PbfIterator.hasBounds(PbfIterator.java:211)
	at de.topobyte.osm4j.core.dataset.MapDataSetLoader.read(MapDataSetLoader.java:78)
	at io.referential.data.OsmFileProcessor.main(OsmFileProcessor.java:143)
Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field.  This could mean either that the input has been truncated or that an embedded message misreported its own length.
	at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:107)
	at com.google.protobuf.GeneratedMessageLite.parsePartialFrom(GeneratedMessageLite.java:1584)
	at com.google.protobuf.GeneratedMessageLite.parseFrom(GeneratedMessageLite.java:1671)
	at de.topobyte.osm4j.pbf.protobuf.Fileformat$BlobHeader.parseFrom(Fileformat.java:1094)
	at de.topobyte.osm4j.pbf.util.PbfUtil.parseHeader(PbfUtil.java:92)
	at de.topobyte.osm4j.pbf.util.PbfUtil.parseHeader(PbfUtil.java:80)
	at de.topobyte.osm4j.pbf.seq.PbfIterator.advanceBlock(PbfIterator.java:129)
	at de.topobyte.osm4j.pbf.seq.PbfIterator.tryAdvanceBlock(PbfIterator.java:120)
	at de.topobyte.osm4j.pbf.seq.PbfIterator.ensureBeyondBounds(PbfIterator.java:226)
	... 3 common frames omitted

I am using osm4j-core, osm4j-pbf, osm4j-geometry 1.2.0.
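
The exception message above names truncated input as one possible cause, and the trace shows the failure while parsing a BlobHeader. The following is a minimal sketch for ruling that out independently of osm4j and protobuf. It relies on the blob framing described in the OSM PBF format documentation (a 4-byte big-endian BlobHeader length, the BlobHeader itself, then a Blob of `datasize` bytes) and on the documented upper bounds of 64 KiB and 32 MiB. The class name `PbfFramingCheck` and its helper methods are hypothetical and not part of osm4j.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of osm4j: walks the blob framing of a PBF file.
class PbfFramingCheck
{

	// Reads one protobuf varint from buf at pos[0] and advances pos[0].
	static long readVarint(byte[] buf, int[] pos) throws IOException
	{
		long result = 0;
		for (int shift = 0; shift < 64; shift += 7) {
			if (pos[0] >= buf.length) {
				throw new EOFException("varint runs past the end of the BlobHeader");
			}
			byte b = buf[pos[0]++];
			result |= (long) (b & 0x7F) << shift;
			if ((b & 0x80) == 0) {
				return result;
			}
		}
		throw new IOException("malformed varint");
	}

	// Extracts datasize (field 3, varint) from a serialized BlobHeader.
	static long parseDatasize(byte[] header) throws IOException
	{
		int[] pos = { 0 };
		while (pos[0] < header.length) {
			long tag = readVarint(header, pos);
			int field = (int) (tag >>> 3);
			int wireType = (int) (tag & 7);
			if (wireType == 0) {
				long value = readVarint(header, pos);
				if (field == 3) {
					return value;
				}
			} else if (wireType == 2) {
				// Length-delimited field (type or indexdata): skip its payload.
				long len = readVarint(header, pos);
				if (len < 0 || len > header.length - pos[0]) {
					throw new EOFException("field runs past the end of the BlobHeader");
				}
				pos[0] += (int) len;
			} else {
				throw new IOException("unexpected wire type " + wireType);
			}
		}
		throw new IOException("BlobHeader has no datasize field");
	}

	// Returns the number of complete blobs; throws EOFException if the
	// stream ends anywhere other than on a blob boundary.
	static int countBlobs(InputStream in) throws IOException
	{
		DataInputStream data = new DataInputStream(in);
		int blobs = 0;
		while (true) {
			int b0 = data.read();
			if (b0 < 0) {
				return blobs; // clean end of input on a blob boundary
			}
			int headerLength = (b0 << 24) | (data.readUnsignedByte() << 16)
					| (data.readUnsignedByte() << 8) | data.readUnsignedByte();
			if (headerLength <= 0 || headerLength > 64 * 1024) {
				throw new IOException("implausible BlobHeader length " + headerLength);
			}
			byte[] header = new byte[headerLength];
			data.readFully(header);
			long datasize = parseDatasize(header);
			if (datasize < 0 || datasize > 32L * 1024 * 1024) {
				throw new IOException("implausible blob size " + datasize);
			}
			data.readFully(new byte[(int) datasize]);
			blobs++;
		}
	}

	public static void main(String[] args) throws IOException
	{
		if (args.length != 1) {
			System.out.println("usage: PbfFramingCheck <file.osm.pbf>");
			return;
		}
		try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
			System.out.println("complete blobs: " + countBlobs(in));
		} catch (EOFException e) {
			System.out.println("file ends in the middle of a blob (truncated?)");
		}
	}

}
```

If this reports a clean blob count for the downloaded file, the file itself is probably intact and the cause is more likely to be in how the stream is opened or in the dependency setup.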

kkarski avatar Nov 08 '22 12:11 kkarski

Hmm, not sure what's going on there. I've just tried this example here:

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

import de.topobyte.osm4j.core.access.OsmIterator;
import de.topobyte.osm4j.core.dataset.InMemoryMapDataSet;
import de.topobyte.osm4j.core.dataset.MapDataSetLoader;
import de.topobyte.osm4j.pbf.seq.PbfIterator;

public class TestParsePbf
{

	public static void main(String[] args) throws IOException
	{
		String url = "https://download.geofabrik.de/europe/netherlands/zeeland-latest.osm.pbf";
		InputStream input = new URL(url).openStream();
		OsmIterator iterator = new PbfIterator(input, false);

		final InMemoryMapDataSet data = MapDataSetLoader.read(iterator, false,
				false, true);
		System.out.println("number of nodes: " + data.getNodes().size());
		System.out.println("number of ways: " + data.getWays().size());
		System.out.println("number of relations: " + data.getRelations().size());
	}

}

and it runs fine and I get this output:

number of nodes: 4053324
number of ways: 527827
number of relations: 9100

I don't see that I did anything differently from you. Could you still try this with your setup and see whether it produces the same problem you experienced above?

I tried a smaller region than the whole of Belgium because it yields results more quickly, and also because I get an OutOfMemoryError if I do not increase my heap size to 12G. I have now tried Belgium anyway, and it also went through fine:

number of nodes: 55861654
number of ways: 8599759
number of relations: 89978
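
Regarding the heap size mentioned above: the maximum heap is normally raised with the standard JVM option -Xmx (for example -Xmx12g). A minimal sketch for confirming what limit the running JVM actually received; the class name `HeapInfo` is made up for illustration.

```java
// Hypothetical helper: reports the maximum heap the current JVM may use.
class HeapInfo
{

	// Maximum heap in MiB, as reported by the JVM.
	static long maxHeapMiB()
	{
		return Runtime.getRuntime().maxMemory() / (1024 * 1024);
	}

	public static void main(String[] args)
	{
		System.out.println("max heap: " + maxHeapMiB() + " MiB");
	}

}
```

Running this with the same JVM options as the parsing program shows whether the larger heap setting was picked up.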

sebkur avatar Nov 08 '22 17:11 sebkur

I had a hunch that you might be using the files with metadata from https://osm-internal.download.geofabrik.de/, but I can parse those files fine, too.

sebkur avatar Nov 08 '22 17:11 sebkur

Another hunch: if you're using osm4j in a project with multiple other dependencies, it's possible that you end up with a different version of com.google.protobuf:protobuf-javalite, which might lead to this problem. osm4j-pbf is currently built using com.google.protobuf:protobuf-javalite:3.9.1 (from August 2019). I guess we could upgrade to 3.21.9, especially given that versions before 3.21.7 seem to be vulnerable to a DoS attack.
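
One way to test this hunch is to ask the running JVM where it loaded the protobuf runtime from. A minimal sketch, assuming only standard JDK calls; the class name `ProtobufOrigin` is made up for illustration, while `com.google.protobuf.GeneratedMessageLite` is the class that appears in the stack trace above. The jar file name in the printed location usually includes the version.

```java
import java.security.CodeSource;

// Hypothetical helper: reports which jar a class was loaded from.
class ProtobufOrigin
{

	// Describes the origin of the named class without initializing it.
	static String describe(String className)
	{
		try {
			Class<?> c = Class.forName(className, false,
					ProtobufOrigin.class.getClassLoader());
			CodeSource source = c.getProtectionDomain().getCodeSource();
			if (source == null || source.getLocation() == null) {
				return className + " was loaded from the bootstrap class path";
			}
			return className + " was loaded from " + source.getLocation();
		} catch (ClassNotFoundException e) {
			return className + " is not on the class path";
		}
	}

	public static void main(String[] args)
	{
		System.out.println(describe("com.google.protobuf.GeneratedMessageLite"));
	}

}
```

Run inside the affected project, this should print the path of the protobuf jar that is actually in use, which can then be compared with the 3.9.1 version that osm4j-pbf was built against.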

sebkur avatar Nov 10 '22 09:11 sebkur

So what you could try is to read some data from a minimal project such as https://github.com/topobyte/osm4j-examples and see whether the problem shows up there too. If not, I guess adding your other dependencies in batches could reveal which one is causing the implicit upgrade (or use something like ./gradlew dependencies to find out).

sebkur avatar Nov 10 '22 09:11 sebkur

sebkur, thanks for your investigation. I will verify the protobuf version I am using and run the example above. Thank you.

kkarski avatar Nov 14 '22 10:11 kkarski

Closing this due to inactivity. Feel free to reopen if you're still experiencing the problem.

sebkur avatar Jan 07 '25 20:01 sebkur