[FEATURE] Add `iplocation` function to PPL for IP address geolocation

Open YANG-DB opened this issue 1 year ago • 3 comments

Description: We propose adding a geoip function to OpenSearch's Piped Processing Language (PPL) and SQL to provide built-in IP address geolocation. It would be similar to the functionality used in OpenSearch's geospatial feature, and would let PPL enrich log data with geographical information derived from IP addresses.

Proposed Functionality:

  1. The 'geoip' function should take an IP address as input and return geographical information.
  2. It should support both IPv4 and IPv6 addresses.
  3. The function should return multiple fields including country, region, city, latitude, longitude, and others as available.
  4. It should allow users to specify which geolocation fields to include in the output.
  5. The function should use a regularly updated IP geolocation database for accuracy.
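
To make requirements 1, 2 and 5 more concrete, here is a minimal Java sketch of an IP-to-location lookup over CIDR blocks that handles both IPv4 and IPv6. Everything in it is an assumption for illustration: the class `GeoIpLookupSketch`, its methods, and the in-memory table are hypothetical and are not part of OpenSearch or the geospatial plugin. A real implementation would read from a regularly refreshed geolocation database.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Illustrative only: class and method names are hypothetical, not OpenSearch APIs.
final class GeoIpLookupSketch {

    // One row of a made-up geolocation table: a CIDR block plus its location fields.
    private static final class Entry {
        final byte[] network;
        final int prefixLength;
        final Map<String, Object> fields;

        Entry(byte[] network, int prefixLength, Map<String, Object> fields) {
            this.network = network;
            this.prefixLength = prefixLength;
            this.fields = fields;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Registers a block such as "203.0.113.0/24" or "2001:db8::/32".
    void add(String cidr, Map<String, Object> fields) {
        String[] parts = cidr.split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected address/prefix: " + cidr);
        }
        byte[] network = parse(parts[0]);
        int prefix = Integer.parseInt(parts[1]);
        if (prefix < 0 || prefix > network.length * 8) {
            throw new IllegalArgumentException("Invalid prefix length: " + cidr);
        }
        entries.add(new Entry(network, prefix, fields));
    }

    // Returns the fields of the most specific (longest-prefix) matching block, if any.
    Optional<Map<String, Object>> lookup(String ip) {
        byte[] addr = parse(ip);
        Entry best = null;
        for (Entry e : entries) {
            boolean sameFamily = e.network.length == addr.length;
            if (sameFamily && prefixMatches(addr, e.network, e.prefixLength)
                    && (best == null || e.prefixLength > best.prefixLength)) {
                best = e;
            }
        }
        if (best == null) {
            return Optional.empty();
        }
        return Optional.of(best.fields);
    }

    // Parses an IPv4 or IPv6 literal into raw bytes without triggering a DNS lookup.
    static byte[] parse(String ip) {
        if (ip.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
            String[] octets = ip.split("\\.");
            byte[] out = new byte[4];
            for (int i = 0; i < 4; i++) {
                int value = Integer.parseInt(octets[i]);
                if (value > 255) {
                    throw new IllegalArgumentException("Invalid IPv4 address: " + ip);
                }
                out[i] = (byte) value;
            }
            return out;
        }
        // A string containing ':' is only ever treated as an IPv6 literal by InetAddress.
        if (ip.indexOf(':') >= 0 && ip.matches("[0-9a-fA-F:.]+")) {
            try {
                return InetAddress.getByName(ip).getAddress();
            } catch (UnknownHostException e) {
                throw new IllegalArgumentException("Invalid IPv6 address: " + ip, e);
            }
        }
        throw new IllegalArgumentException("Not an IP literal: " + ip);
    }

    // True when the first prefixLength bits of addr and network are equal.
    private static boolean prefixMatches(byte[] addr, byte[] network, int prefixLength) {
        int fullBytes = prefixLength / 8;
        for (int i = 0; i < fullBytes; i++) {
            if (addr[i] != network[i]) {
                return false;
            }
        }
        int remainingBits = prefixLength % 8;
        if (remainingBits == 0) {
            return true;
        }
        int mask = (0xFF << (8 - remainingBits)) & 0xFF;
        return (addr[fullBytes] & mask) == (network[fullBytes] & mask);
    }
}
```

Longest-prefix matching is one plausible design choice here, since geolocation data is often published as overlapping network ranges; the proposal itself does not prescribe a matching strategy.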

Example Usage:

... | eval geolocation = geoip(ip_field)

This would add a new field 'geolocation' with all available location information for the IP address in 'ip_field'.

... | eval country = geoip(ip_field, "country")
... | eval lat = geoip(ip_field, "lat"), lon = geoip(ip_field, "lon")

This would add new fields with specific geolocation information.

... | eval location_info = geoip(ip_field, "country,region,city,lat,lon")

This would add a new field 'location_info' with multiple pieces of geolocation data.
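
The field-selection argument used in the examples above could be handled roughly as follows. This is a hedged Java sketch, not the actual implementation: `GeoFieldSelectionSketch` and `select` are hypothetical names, and the choices to return all fields when no selection is given and to silently skip unknown field names are assumptions that the proposal does not specify.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: hypothetical helper, not part of the SQL or geospatial plugins.
final class GeoFieldSelectionSketch {

    // Projects a full geolocation record onto a comma-separated field list
    // such as "country,region,city,lat,lon".
    static Map<String, Object> select(Map<String, Object> full, String spec) {
        // Assumption: a missing or blank selection means "return everything".
        if (spec == null || spec.trim().isEmpty()) {
            return new LinkedHashMap<>(full);
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (String raw : spec.split(",")) {
            String name = raw.trim();
            // Assumption: unknown field names are skipped rather than rejected.
            if (full.containsKey(name)) {
                out.put(name, full.get(name));
            }
        }
        return out;
    }
}
```

With a record containing country, region, city, lat and lon, calling `select(record, "country, lat")` would keep only those two entries, in the order they were requested.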

Additional considerations

  • Allow the geospatial OpenSearch plugin to be used for resolving IP addresses to geolocation data

Related resources

  • https://github.com/opensearch-project/geospatial?tab=readme-ov-file

YANG-DB avatar Sep 16 '24 23:09 YANG-DB

[Catch All Triage - 1, 2, 3, 4]

dblock avatar Oct 07 '24 16:10 dblock

I am in the process of implementing this.

kenrickyap avatar Oct 18 '24 21:10 kenrickyap

Hi @YANG-DB ,

What was the intended method of leveraging the geospatial plugin?

Following the example of how the job-scheduler and ml-commons plugins are included, I have been trying to import it directly into the project, but noticed that the geospatial plugin published on Maven has no jar. As such, it does not seem possible to import the plugin directly. Is this assumption correct?

If so, my current plan is to call the endpoint that the geospatial plugin exposes in OpenSearch (documented here) and communicate with it using the OpenSearchRestClient. Would this be a good path forward? Or am I missing something that would make it possible to expose the geospatial plugin?

Thanks!

kenrickyap avatar Oct 24 '24 18:10 kenrickyap

Hi @YANG-DB, after a few discovery and feasibility checks we have updated our approach. Below is the high-level plan along with the proposed code changes. Could you have a look and advise?

High-level idea:

  • Create a new ActionType in the Geo-Spatial plugin that exposes a new action which takes an IP string and returns the appropriate geo details.
  • Update the existing SQL plugin to invoke this new Geo-Spatial action through nodeClient for the geoip function.

Proposed code changes:

GeoSpatial:

  • Create a new TransportAction and register it accordingly:
    • Create a new TransportAction class, similar to GetDatasourceTransportAction, whose sole purpose is to process an incoming IP string with the given provider and return the appropriate geospatial detail fields.
    • Update GeospatialPlugin.getAction( ) to register the newly created action.
  • Create a new sub-module named geo-spatial-client, containing the nodeClientWrapper as the interface for cross-plugin interaction, a few ActionTypes, and the wrapper objects that form the API signature for the return type.
  • Update the Gradle script to publish the geo-spatial-client module as a separate jar.

SQL module:

  • Update the Gradle settings to import geo-spatial-client into the OpenSearch sub-module.
  • Override the existing EvalOperator processing logic:
    • Create a new OpenSearchEvalOperator class which extends the existing EvalOperator with an additional NodeClient class property.
    • Update the OpenSearchIndex class to override the visitEval( ) method and return a new OpenSearchEvalOperator instance instead.
    • Update the OpenSearchEvalOperator to perform the following logic when processing the geoip function:
      • Read the incoming IP string
      • Invoke a call on nodeClient with the appropriate arguments and timeout value
      • Marshal the response and update the evalMap accordingly
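
The three processing steps listed for the eval operator could look roughly like the Java sketch below. It is deliberately built on standard-library types only: `GeoIpEvalSketch` and its `GeoClient` interface are hypothetical stand-ins for the proposed OpenSearchEvalOperator and the NodeClient call, not the real OpenSearch APIs, and writing null on a timeout or failure is an assumption rather than agreed behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: models the proposed flow without depending on OpenSearch classes.
final class GeoIpEvalSketch {

    // Hypothetical stand-in for the cross-plugin client call (NOT the NodeClient API).
    interface GeoClient {
        CompletableFuture<Map<String, Object>> resolve(String ip);
    }

    private final GeoClient client;
    private final long timeoutMillis;

    GeoIpEvalSketch(GeoClient client, long timeoutMillis) {
        this.client = client;
        this.timeoutMillis = timeoutMillis;
    }

    // Returns a copy of the row with targetField set to the resolved geo details.
    Map<String, Object> eval(Map<String, Object> row, String ipField, String targetField) {
        Map<String, Object> out = new LinkedHashMap<>(row);

        // Step 1: read the incoming IP string.
        Object ip = row.get(ipField);

        Map<String, Object> geo = null;
        if (ip != null) {
            try {
                // Step 2: call the client and wait no longer than the configured timeout.
                geo = client.resolve(ip.toString()).get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the calling thread.
                Thread.currentThread().interrupt();
            } catch (ExecutionException | TimeoutException e) {
                // Assumption: a failed or slow lookup yields null instead of failing the query.
                geo = null;
            }
        }

        // Step 3: place the response into the evaluated row.
        out.put(targetField, geo);
        return out;
    }
}
```

Blocking per row with a timeout keeps the sketch simple; whether lookups should instead be batched or cached is a separate design question that this thread does not settle.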

andy-k-improving avatar Nov 12 '24 19:11 andy-k-improving

We don't even have a way to do basic IP address lookups. Why are you working on the next level before we even have a basic way to query the ip field type?

kedbirhan avatar Nov 13 '24 16:11 kedbirhan

Hi @kedbirhan, thanks for the feedback; that indeed makes sense.

For now we are only proposing the high-level changes required for this functionality and have not yet reached the implementation phase.

I believe that by the time the design for this ticket is settled, https://github.com/opensearch-project/sql/issues/3145 should already be wrapped up and provide IP type support.

Thanks,

andy-k-improving avatar Nov 14 '24 00:11 andy-k-improving

@andy-k-improving I really like this idea - can you please create an RFC for the Geospatial plugin suggesting this change? We would need their feedback

YANG-DB avatar Nov 15 '24 18:11 YANG-DB

@YANG-DB see below for the RFC on the Geospatial side: https://github.com/opensearch-project/geospatial/issues/698. I will proceed with the implementation on the Geospatial side.

andy-k-improving avatar Nov 18 '24 18:11 andy-k-improving

The actual PR on the SQL repo: https://github.com/opensearch-project/sql/pull/3228

andy-k-improving avatar Jan 23 '25 18:01 andy-k-improving

resolved by https://github.com/opensearch-project/sql/pull/3604

LantaoJin avatar Jun 16 '25 03:06 LantaoJin