Cannot assign transformation for children columns of type - nested.
PGSync version: 2.0.0 Postgres version: 13.3 Elasticsearch version: 7.4.0 Redis version: 6.2.3 Python version: 3.9.2
Problem Description: I am trying to set explicit mapping for columns that are of children node. The children node is of type "nested". I have transformed the children node at root level. After transformation, explicit mappings are not working.
Eg: Consider example in docs -
[
{
"database": "book",
"index": "book",
"nodes": {
"table": "book",
"children": [
{
"table": "author",
"columns": [
"id",
"name"
],
"relationship": {
"variant": "object",
"type": "one_to_many"
},
"transform": {
"mapping": {
"name": {
"type": "keyword"
}
}
}
}
],
"transform": {
"mapping": {
"author": {
"type": "nested"
}
}
}
}
}
]
If I remove the root level transformation on author of type "nested", internal mapping for "author.name" works. I hope I am using the right syntax to define the schema. I couldn't find any examples of using "nested" type in docs.
I will add some documentation for this. A mapping basically has a field and a type. The field should be one of the columns in the current node.
e.g
"transform": {
"mapping": {
"myfield": {
"type": "keyword"
}
}
}
This means you want myfield to be of type keyword. So you can only apply mappings to fields.
In the example provided, author is a table
Thanks for the speedy reply. Applying type to author as 'nested' is working, as 'nested' is applied on children documents rather than fields. Reference - https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
Sorry about the delay. Do you have a working example with pgsync schema.json and database schema? A transform node should be attached to a node you want to perform the operation on. e.g
[
{
"database": "book",
"index": "book",
"nodes": {
"table": "book",
"transform": {
"mapping": {
"isbn": {
"type": "nested"
}
}
}
}
}
]
So the author transform in your example need to be within the author node/table
any update on this?
@toluaina Apologies for the late reply, had to take care of other engagements. I tried putting the author transformation within the author table. It didn't work as mapping generated for author encapsulates author of object type.
So, if I add author within the author node/table as you shared, the mapping will look like this -
[
{
"database": "book",
"index": "book",
"nodes": {
"table": "book",
"children": [
{
"table": "author",
"columns": [
"id",
"name"
],
"relationship": {
"variant": "object",
"type": "one_to_many"
},
"transform": {
"mapping": {
"name": {
"type": "keyword"
},
"author": {
"type": "nested"
}
}
}
}
]
}
}
]
It will return something like this -
{
"book" : {
"mappings" : {
"properties" : {
"author" : {
"properties" : {
"author" : {
"type" : "nested"
},
"price" : {
"type" : "double"
}
}
}
}
}
}
}
What I wanted was this -
{
"book" : {
"mappings" : {
"properties" : {
"author" : {
"type" : "nested",
"properties" : {
"price" : {
"type" : "double"
}
}
}
}
}
}
}
Please note, if I remove transformation on price from the author node, and add transformation on root level for author of type nested, it works. It doesn't work only if I add the price transformation.
As a workaround for now, I have manually updated the index mapping after bootstrapping via pgsync.
@toluaina Any updates on this? Is there a workaround to create a child table with the type "nested" like @cedzz mentioned above?
I feel this would entail a bit of re-working and introduce too much complexity. I can add a option to the schema.json to specify exact mapping you require and child mapping will be ignored? So you would add your own mapping as you would expect in the schema
I feel this would entail a bit of re-working and introduce too much complexity. I can add a option to the schema.json to specify exact mapping you require and child mapping will be ignored? So you would add your own mapping as you would expect in the schema
Hi @toluaina , yes please, if I could add my own mapping that would help. I am working with generic search pattern, so it looks for nested objects, with its path and key. So I really need to specify child table as nested, otherwise the search fails.
@toluaina any update on this? option to specify exact mapping and ignore child mapping.
I added an option to allow you to specify your own mapping in the schema.json. Here is an example
@toluaina , I tried putting all my mapping under the "mapping" property liked you mentioned. So under nodes I am only mentioning the columns, child tables and relationships. But it not identifying my "analyzers" even though I have put it under the "settings" property.
[
{
"database": "postgres",
"index": "customers",
"settings": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "whitespace",
"filter": [
"trim",
"lowercase"
]
},
"case_insensitive_analyzer": {
"type": "custom",
"filter": [
"lowercase"
],
"tokenizer": "keyword"
}
},
"normalizer": {
"sorting": {
"type": "custom",
"char_filter": [],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mapping": {
"Id": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"CreatedDateTime": {
"type": "date",
"format": "dd-MM-yyyy HH:mm:ss||epoch_millis",
"fields": {
"raw": {
"type": "date"
}
}
},
"UpdatedDateTime": {
"type": "date",
"format": "dd-MM-yyyy HH:mm:ss||epoch_millis",
"fields": {
"raw": {
"type": "date"
}
}
},
"FirstName": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"LastName": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Name": {
"type": "text",
"analyzer": "case_insensitive_analyzer",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"TaxExemptId": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"DateOfBirth": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"CustomerType": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"NotificationPreferences": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Status": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"CompanyName": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"CustomerAddress": {
"type": "nested",
"properties": {
"Id": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"ZipCode": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"AddressType": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Street": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Street2": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"City": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"State": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Country": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
}
}
},
"CustomerContact": {
"type": "nested",
"properties": {
"Id": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"ContactType": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"ContactDetail": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
}
}
},
"AlternateContactDetails": {
"type": "nested",
"properties": {
"Id": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"FirstName": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"LastName": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Email": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Phone": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
}
}
},
"EquipmentInfo": {
"type": "nested",
"properties": {
"EquipmentId": {
"type": "text",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Title": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Year": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Make": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
},
"Model": {
"type": "text",
"analyzer": "analyzer_keyword",
"fields": {
"raw": {
"type": "keyword",
"normalizer": "sorting"
}
}
}
}
}
},
"nodes": {
"table": "Customers",
"schema": "public",
"columns": [
"Id",
"FirstName",
"LastName",
"CompanyName",
"Name",
"CreatedDateTime",
"UpdatedDateTime",
"DateOfBirth",
"IsTaxExempt",
"TaxExemptId",
"DoNotMail",
"DoNotService",
"CustomerType",
"NotificationPreferences",
"Status",
"LastActivityDate",
"IsDeleted"
],
"children": [
{
"table": "CustomerAddress",
"columns": [
"Id",
"CustomerId",
"Name",
"AddressType",
"Street",
"Street2",
"City",
"State",
"Country",
"Latitude",
"Longitude",
"ZipCode",
"IsDefault",
"IsDeleted",
"IsLtr",
"IsVerified"
],
"label": "CustomerAddress",
"relationship": {
"variant": "object",
"type": "one_to_many",
"foreign_key": {
"child": [
"CustomerId"
],
"parent": [
"Id"
]
}
}
},
{
"table": "CustomerContacts",
"columns": [
"Id",
"CustomerId",
"IsPrimary",
"AllowVoice",
"ContactDetail",
"AllowMail",
"AllowText",
"ContactType",
"IsDeleted"
],
"label": "CustomerContact",
"relationship": {
"variant": "object",
"type": "one_to_many",
"foreign_key": {
"child": [
"CustomerId"
],
"parent": [
"Id"
]
}
}
},
{
"table": "AlternateContactDetails",
"columns": [
"Id",
"CustomerId",
"FirstName",
"LastName",
"Description",
"Email",
"Phone",
"IsDeleted"
],
"label": "AlternateContactDetails",
"relationship": {
"variant": "object",
"type": "one_to_many",
"foreign_key": {
"child": [
"CustomerId"
],
"parent": [
"Id"
]
}
}
},
{
"table": "EquipmentInfo",
"columns": [
"EquipmentId",
"CustomerId",
"Title",
"Year",
"Make",
"Model",
"CreatedOn",
"UpdatedOn"
],
"label": "EquipmentInfo",
"relationship": {
"variant": "object",
"type": "one_to_many",
"foreign_key": {
"child": [
"CustomerId"
],
"parent": [
"Id"
]
}
}
},
{
"table": "Notes",
"columns": [
"Id",
"CustomerId",
"NoteType",
"Note",
"IsDeleted",
"CreatedBy",
"CreatedOn",
"UpdatedBy",
"UpdatedOn"
],
"label": "Notes",
"relationship": {
"variant": "object",
"type": "one_to_many",
"foreign_key": {
"child": [
"CustomerId"
],
"parent": [
"Id"
]
}
}
}
]
}
}
]
@joshuafernandes1996 Your key is named settings and should be named setting.
Well spotted @voyc-jean Thanks for that.
Yes it should be setting singular