[BUG] User agent processor gives java.lang.NullPointerException if entry does not contain source field
Describe the bug
With Data Prepper 2.12.1 installed as a Docker container, I get java.lang.NullPointerException when using the User agent processor if an entry does not contain the defined source field.
To Reproduce Steps to reproduce the behavior:
- Have a pipeline like this configured
source:
  http:
    port: 2021
    ssl: false
    health_check_service: true
processor:
  - user_agent:
      source: agent
      target: parsed_agent
sink:
  - opensearch:
      hosts:
        - https://HOST:PORT
        - https://HOST:PORT
        - https://HOST:PORT
      insecure: true
      username: USER
      password: PASS
      index: TEST
- Ingest data which may or may not contain field agent
- Following errors get logged (for each ingested entry without field agent, I believe)
Reading pipelines and data-prepper configuration files from Data Prepper home directory.
/usr/bin/java
Found openjdk version of 17.0
2025-08-22T15:33:33,583 [main] INFO org.opensearch.dataprepper.pipeline.parser.transformer.DynamicConfigTransformer - No transformation needed
2025-08-22T15:33:35,782 [main] INFO org.opensearch.dataprepper.plugins.kafka.extension.KafkaClusterConfigExtension - Applying Kafka Cluster Config Extension.
2025-08-22T15:33:37,104 [main] WARN org.opensearch.dataprepper.plugins.source.loghttp.HTTPSource - Creating http source without authentication. This is not secure.
2025-08-22T15:33:37,105 [main] WARN org.opensearch.dataprepper.plugins.source.loghttp.HTTPSource - In order to set up Http Basic authentication for the http source, go here: https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/http-source#authentication-configurations
2025-08-22T15:33:37,798 [main] INFO org.opensearch.dataprepper.plugins.geoip.extension.GeoIPDatabaseManager - Downloading GeoIP database to /usr/share/data-prepper/data/geoip/blue_database
2025-08-22T15:33:44,367 [main] WARN org.opensearch.dataprepper.core.pipeline.server.config.DataPrepperServerConfiguration - Creating data prepper server without authentication. This is not secure.
2025-08-22T15:33:44,372 [main] WARN org.opensearch.dataprepper.core.pipeline.server.config.DataPrepperServerConfiguration - In order to set up Http Basic authentication for the data prepper server, go here: https://github.com/opensearch-project/data-prepper/blob/main/docs/core_apis.md#authentication
2025-08-22T15:33:44,831 [log-ingest-pipeline-sink-worker-2-thread-1] INFO org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink - Initializing OpenSearch sink
2025-08-22T15:33:44,848 [main] WARN org.opensearch.dataprepper.core.pipeline.server.HttpServerProvider - Creating Data Prepper server without TLS. This is not secure.
2025-08-22T15:33:44,852 [main] WARN org.opensearch.dataprepper.core.pipeline.server.HttpServerProvider - In order to set up TLS for the Data Prepper server, go here: https://github.com/opensearch-project/data-prepper/blob/main/docs/configuration.md#server-configuration
2025-08-22T15:33:44,867 [log-ingest-pipeline-sink-worker-2-thread-1] INFO org.opensearch.dataprepper.plugins.sink.opensearch.ConnectionConfiguration - Using the username provided in the config.
2025-08-22T15:33:44,907 [log-ingest-pipeline-sink-worker-2-thread-1] INFO org.opensearch.dataprepper.plugins.sink.opensearch.ConnectionConfiguration - Using the trust all strategy
2025-08-22T15:33:45,480 [log-ingest-pipeline-sink-worker-2-thread-1] INFO org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink - Initialized OpenSearch sink
2025-08-22T15:33:45,801 [log-ingest-pipeline-sink-worker-2-thread-1] WARN com.linecorp.armeria.common.CommonPools - Failed to register the common worker group as non-blocking for Reactor. Please consider upgrading Reactor to 3.7.0 or newer.
2025-08-22T15:33:45,948 [log-ingest-pipeline-sink-worker-2-thread-1] WARN org.opensearch.dataprepper.plugins.server.CreateServer - Creating http without SSL/TLS. This is not secure.
2025-08-22T15:33:45,949 [log-ingest-pipeline-sink-worker-2-thread-1] WARN org.opensearch.dataprepper.plugins.server.CreateServer - In order to set up TLS for the http, go here: https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-plugins/http-source#ssl
2025-08-22T15:33:46,017 [log-ingest-pipeline-sink-worker-2-thread-1] INFO org.opensearch.dataprepper.plugins.server.CreateServer - HTTP source health check is enabled
2025-08-22T15:33:46,545 [log-ingest-pipeline-sink-worker-2-thread-1] INFO org.opensearch.dataprepper.plugins.source.loghttp.HTTPSource - Started http source on port 2021...
2025-08-22T15:33:50,241 [log-ingest-pipeline-processor-worker-1-thread-1] ERROR org.opensearch.dataprepper.plugins.processor.useragent.UserAgentProcessor - An exception occurred when parsing user agent data from event [org.opensearch.dataprepper.model.log.JacksonLog@79ad1a39] with source key [agent]
java.lang.NullPointerException: null
at java.base/java.util.Objects.requireNonNull(Objects.java:209) ~[?:?]
at org.opensearch.dataprepper.plugins.processor.useragent.UserAgentProcessor.doExecute(UserAgentProcessor.java:57) ~[data-prepper-user-agent-processor-2.12.1.jar:?]
at org.opensearch.dataprepper.model.processor.AbstractProcessor.lambda$execute$0(AbstractProcessor.java:54) ~[data-prepper-api-2.12.1.jar:?]
at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:69) [micrometer-core-1.14.4.jar:1.14.4]
at org.opensearch.dataprepper.model.processor.AbstractProcessor.execute(AbstractProcessor.java:54) [data-prepper-api-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.PipelineRunnerImpl.runProcessorsAndProcessAcknowledgements(PipelineRunnerImpl.java:105) [data-prepper-core-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.PipelineRunnerImpl.runAllProcessorsAndPublishToSinks(PipelineRunnerImpl.java:55) [data-prepper-core-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.ProcessWorker.doRun(ProcessWorker.java:80) [data-prepper-core-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.ProcessWorker.run(ProcessWorker.java:40) [data-prepper-core-2.12.1.jar:?]
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
2025-08-22T15:33:50,251 [log-ingest-pipeline-processor-worker-1-thread-1] ERROR org.opensearch.dataprepper.plugins.processor.useragent.UserAgentProcessor - An exception occurred when parsing user agent data from event [org.opensearch.dataprepper.model.log.JacksonLog@54ba2d01] with source key [agent]
java.lang.NullPointerException: null
at java.base/java.util.Objects.requireNonNull(Objects.java:209) ~[?:?]
at org.opensearch.dataprepper.plugins.processor.useragent.UserAgentProcessor.doExecute(UserAgentProcessor.java:57) ~[data-prepper-user-agent-processor-2.12.1.jar:?]
at org.opensearch.dataprepper.model.processor.AbstractProcessor.lambda$execute$0(AbstractProcessor.java:54) ~[data-prepper-api-2.12.1.jar:?]
at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:69) [micrometer-core-1.14.4.jar:1.14.4]
at org.opensearch.dataprepper.model.processor.AbstractProcessor.execute(AbstractProcessor.java:54) [data-prepper-api-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.PipelineRunnerImpl.runProcessorsAndProcessAcknowledgements(PipelineRunnerImpl.java:105) [data-prepper-core-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.PipelineRunnerImpl.runAllProcessorsAndPublishToSinks(PipelineRunnerImpl.java:55) [data-prepper-core-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.ProcessWorker.doRun(ProcessWorker.java:80) [data-prepper-core-2.12.1.jar:?]
at org.opensearch.dataprepper.core.pipeline.ProcessWorker.run(ProcessWorker.java:40) [data-prepper-core-2.12.1.jar:?]
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
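For anyone trying to reproduce this, the ingest step could be scripted roughly as below. This is only a sketch: it assumes Data Prepper is reachable at http://localhost:2021 and that the http source uses its default /log/ingest path, and the two sample events (class name, field values) are made up for illustration. The first event has the agent field, the second does not, and only the second should hit the error path.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class IngestSample {
    // Two hypothetical events: the first carries "agent", the second omits it.
    static String payload() {
        return "["
            + "{\"message\":\"with agent\",\"agent\":\"Mozilla/5.0 (X11; Linux x86_64)\"},"
            + "{\"message\":\"without agent\"}"
            + "]";
    }

    // Assumption: the http source listens on its default path /log/ingest.
    // Adjust the path if `path` is set in the source configuration.
    static HttpRequest buildRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/log/ingest"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(payload()))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: Data Prepper runs locally with port 2021 published.
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest("http://localhost:2021"), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```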
Environment (please complete the following information):
- OS: Ubuntu 24.04 LTS
- Version: Data Prepper 2.12.1
Additional context
In my use case I ingest log entries which may or may not contain a user agent. I could not figure out any way to pass only entries containing agent information to the User agent processor, so I thought I could pass all entries to it without getting errors.
@StrategiosP, Thanks for making note of this.
Do you mean that the events coming in do not have the source field present?
Would you be able to produce a fix for this? Or perhaps contribute a unit test that replicates the behavior?
@dlvenable Thanks for the response.
Correct, some incoming events do not have the source field present.
Unfortunately I have never programmed in Java so I am not up to the task of contributing code.
Hi @dlvenable! 👋
I'd be happy to take a look at this issue and work on a fix. It seems like we need to add a null check before accessing the source field in the user agent processor.
I'm relatively new to contributing to Data Prepper, but I've worked with Java before. Would you mind if I give it a try? I'll make sure to include unit tests that replicate the behavior as you suggested.
Let me know if this issue is still available or if there's anything specific I should keep in mind while working on it!
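To illustrate the kind of null check meant here, a minimal sketch follows. It is not the actual UserAgentProcessor code: the event is modeled as a plain Map rather than Data Prepper's Event type, and the class and method names (NullSafeSourceLookup, readSource, process) are hypothetical. The idea is simply to look up the source value first and skip parsing when it is absent, instead of letting a requireNonNull call throw.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class NullSafeSourceLookup {
    // Returns the source value as a String, or empty when the key is absent,
    // null, or not a String, so the caller can skip parsing without throwing.
    static Optional<String> readSource(Map<String, Object> event, String sourceKey) {
        Object value = event.get(sourceKey);
        return (value instanceof String) ? Optional.of((String) value) : Optional.empty();
    }

    // Stand-in for the processor step: writes a placeholder result under
    // targetKey only when the source is present; otherwise the copy of the
    // event is returned unchanged. Real user agent parsing is out of scope.
    static Map<String, Object> process(Map<String, Object> event, String sourceKey, String targetKey) {
        Map<String, Object> result = new LinkedHashMap<>(event);
        readSource(event, sourceKey)
            .ifPresent(ua -> result.put(targetKey, Map.of("original", ua)));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> withoutAgent = Map.of("message", "no agent here");
        Map<String, Object> withAgent = Map.of("agent", "Mozilla/5.0 (X11; Linux x86_64)");
        System.out.println(process(withoutAgent, "agent", "parsed_agent"));
        System.out.println(process(withAgent, "agent", "parsed_agent"));
    }
}
```

Whether a missing source should be skipped silently, logged at a lower level, or tagged as a parse failure is a design decision for the maintainers; the sketch only shows the silent-skip variant.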
Hi @dlvenable,
I have gone through the test cases for UserAgentProcessor, and I can see that this scenario is already covered by the test case below.
The exception is only logged by the processor; it is not propagated (I have added the test case output as well). Please let me know if it requires any changes.
Test Case:
@Test
public void testTagsAddedOnParseFailure() {
    when(mockConfig.getSource()).thenReturn(eventKeyFactory.createEventKey("bad_source"));
    when(mockConfig.getCacheSize()).thenReturn(TEST_CACHE_SIZE);
    when(mockConfig.getTarget()).thenReturn("user_agent");

    final String tagOnFailure1 = UUID.randomUUID().toString();
    final String tagOnFailure2 = UUID.randomUUID().toString();
    when(mockConfig.getTagsOnParseFailure()).thenReturn(List.of(tagOnFailure1, tagOnFailure2));

    final UserAgentProcessor processor = createObjectUnderTest();
    final Record<Event> testRecord = createTestRecord(UUID.randomUUID().toString());
    final List<Record<Event>> resultRecord = (List<Record<Event>>) processor.doExecute(Collections.singletonList(testRecord));

    final Event resultEvent = resultRecord.get(0).getData();
    assertThat(resultEvent.containsKey("user_agent"), is(false));
    assertThat(resultEvent.getMetadata().getTags().contains(tagOnFailure1), is(true));
    assertThat(resultEvent.getMetadata().getTags().contains(tagOnFailure2), is(true));
}
Test case log
2025-11-18T05:01:52.921082Z Test worker ERROR An exception occurred when parsing user agent data from event [org.opensearch.dataprepper.model.event.JacksonEvent@57cc535] with source key [bad_source]
java.lang.NullPointerException
at java.base/java.util.Objects.requireNonNull(Objects.java:209)
at org.opensearch.dataprepper.plugins.processor.useragent.UserAgentProcessor.doExecute(UserAgentProcessor.java:57)
at org.opensearch.dataprepper.plugins.processor.useragent.UserAgentProcessorTest.testTagsAddedOnParseFailure(UserAgentProcessorTest.java:138)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
+1 on this issue. I'm running into the same problem, where I'm ingesting data that may or may not have a useragent field populated. I'm running tests with the opensearchproject/data-prepper:2.13.0 (latest) Docker image to deploy data-prepper to possibly replace an existing logstash implementation.
As an end user, my options are:
- Drop using the user_agent processor altogether, or parse it myself
- Create pipelines and routes to avoid sending null values to user_agent
- Use user_agent as-is and accept that I'm going to spam my logs with java.lang.NullPointerException
Option 1 is where I'm at right now. Option 2 is more complicated than I really want to implement, and option 3 is a non-starter since this is a project to replace existing functionality and I can't really justify promoting a solution with so many expected errors to ignore.
I know for a lot of the DP processors, there's a "*_when" conditional option to skip processing. My expertise is on the systems side so I don't know how hard it would be or what time would be required to implement, but I think it would work around the issue (at least in some use cases including mine) to have a ua_when option where we could just use ua_when: /source != null.
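To make the idea concrete, here is a rough sketch of what such a skip condition amounts to. It does not use Data Prepper's expression syntax or any of its classes: events are plain Maps, the condition is an ordinary Java Predicate, and all names (ConditionalProcessing, hasField, selectForProcessing) are hypothetical. It only shows that events failing the condition would bypass the processor rather than reach the failing lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ConditionalProcessing {
    // Hypothetical stand-in for a condition such as `/agent != null`.
    static Predicate<Map<String, Object>> hasField(String key) {
        return event -> event.get(key) != null;
    }

    // Returns only the events that satisfy the condition; the rest would
    // pass through the pipeline without touching the processor.
    static List<Map<String, Object>> selectForProcessing(
            List<Map<String, Object>> events, Predicate<Map<String, Object>> when) {
        List<Map<String, Object>> selected = new ArrayList<>();
        for (Map<String, Object> event : events) {
            if (when.test(event)) {
                selected.add(event);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> events = List.of(
            Map.of("message", "a", "agent", "Mozilla/5.0"),
            Map.of("message", "b"));
        System.out.println(selectForProcessing(events, hasField("agent")).size());
    }
}
```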