phoenix-ecto-append-only-log-example

Why? What? Who? How?

Open nelsonic opened this issue 7 years ago • 7 comments

Why?

Having an append-only log is incredibly useful in way more situations than most people realise. Anywhere you would need accountability in data is an excellent candidate for immutability.

  • CRM - where customer data is updated and can be incorrectly altered, having the complete history of a record and being able to "time travel" through the change log is a really good idea.
  • CMS/Blog - being able to "roll back" content means you can invite your trusted readers / stakeholders to edit/improve your content without "fear" of it decaying.
  • E-Commerce - both for journey/story tracking and transaction reliability. Also, the same applies for the Product catalog (which is a specific type of CMS); having version history dramatically increases confidence in the site both from an internal/vendor perspective and from end-users.
    • This is especially useful for the Reviews on e-commerce sites/apps where we want to be able to detect/see where people have updated their review following extended usage. e.g: did the product disintegrate after a short period of time? Did the user give an initially unfavourable (e.g. 3/5 stars) review and with time come to realise that the product is actually exceptionally durable, well-designed and great value-for-money because it has lasted twice as long as any previous product they purchased to perform the same "job to be done"?
  • Chat - a Chat system should allow editing of previously sent messages for typos/inaccuracies, but that edit/revision history should be transparent, not just "message edited" (with no visibility of what changed). If a person deletes a message they should have to provide a comment indicating why they are "breaking" the chain (more on this later).
  • Most other Consumer Web/Mobile Applications - you name the app, I can illustrate exactly how an append-only log is applicable/useful/essential to the reliability/confidence in that app.
    • Forums - any sort of user-generated content where
    • Social Networking - not allowing people to delete a message without leaving a clarifying comment promotes accountability for what people write and in many cases avoids most hate speech.

When a system/db does not have (field/record level) "version control" each update over-writes the state of the record so it's impossible to retrieve it without having to go digging through a backup which is often a multi-day process, cost/time prohibitive or simply unavailable.

We propose that all apps should be built with an Append Only Log at the core by default. This is not a "new" idea. Several large companies have used the "Lambda" or "Kappa" architecture in production with superb results for reliability and scalability. see: http://milinda.pathirage.org/kappa-architecture.com

What?

Instead of using Ecto's standard "CRUD" which allows overwriting and deleting data without "rollback" or "recoverability", we propose writing a thin "proxy" layer between the application code and the PostgreSQL database.
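To make the "proxy" idea concrete, here is a minimal language-neutral sketch in Python. It is not Ecto code and not the tutorial's implementation: the names `AppendOnlyStore`, `Row`, `entry_id` and `row_id` are all hypothetical, and an in-memory list stands in for the PostgreSQL table. It only illustrates the behaviour we are after: the application still calls `insert`/`update`/`get`, but every write appends a new row.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Row:
    """One immutable version of a record (hypothetical shape)."""
    row_id: int    # unique per row, grows with every write
    entry_id: str  # shared by all versions of the same record
    data: dict


class AppendOnlyStore:
    """Hypothetical proxy: looks like CRUD to the caller, but only appends."""

    def __init__(self):
        self._rows = []        # stand-in for the database table
        self._ids = count(1)   # stand-in for an auto-incrementing key

    def insert(self, entry_id, data):
        row = Row(next(self._ids), entry_id, dict(data))
        self._rows.append(row)
        return row

    def update(self, entry_id, changes):
        # An "update" copies the latest version, applies the changes and
        # appends the result as a new row. Nothing is overwritten.
        latest = self.get(entry_id)
        return self.insert(entry_id, {**latest.data, **changes})

    def get(self, entry_id):
        # The current state is the most recently appended version.
        for row in reversed(self._rows):
            if row.entry_id == entry_id:
                return row
        raise KeyError(entry_id)

    def history(self, entry_id):
        # Every version ever written, oldest first ("time travel").
        return [r for r in self._rows if r.entry_id == entry_id]


store = AppendOnlyStore()
store.insert("alice", {"name": "Alice", "city": "London"})
store.update("alice", {"city": "Lisbon"})
```

After these two calls `store.get("alice")` returns the Lisbon version, while `store.history("alice")` still contains the original London row.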

Who?

All developers who have a basic understanding of web development where data is stored in a DB and want to "level up" their knowledge/skills and the reliability of the product they are building with a view to understanding more ("advanced") "distributed" application architecture including the ability to (optionally/incrementally) use IPFS and Blockchain.

How?

The purpose of this tutorial is to:

  • [ ] Make it easy for anyone to build a Phoenix/Ecto based app with an append-only log at its core.
  • [ ] Write a step-by-step guide that shows how to:
    • [ ] create a content type using standard Phoenix generators
      • [ ] Take your pick of what you think is the simplest/most beginner friendly content type. e.g:
        • [ ] Address Book is easy to show value of tracking updates.
  • [x] no additional DB plugins/"add-ons" should be required to make this work, just "stock" PostgreSQL as downloaded or available through a DB-as-a-service provider. e.g. Heroku.
  • [ ] Add our tutorial to this thread once it's "ready": https://elixirforum.com/t/append-only-db/13355
  • [ ] Add link to tutorial to: https://stackoverflow.com/questions/35405671/append-only-data-in-phoenix-ecto-and-postgres

Open Questions:

  • [ ] How to reference the previous version of a record from the latest one. (keen to hear feedback on this) Happy to explore using a hash of the data as the "parent id", thus we would have a "merkle tree" structure for all data, i.e. "Blockchain" but without the "proof of work" (factor): just a chain with the history of a record for accountability/rollback purposes, with no need to waste CPU cycles.
  • [ ] How to delete data (mark an item as deleted) without "destroying" the data.
  • [ ] Note: we would still have to have a "batch process" that is able to "really delete" data to comply with GDPR/"Right to be forgotten". But from the User's perspective we only need to "unlink" the data in the UI and then the batch process will delete it after a specified expiry. Similar to the recycling bin on a Desktop OS.
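As one possible starting point for the first two open questions, here is a small Python sketch (again a stand-in, not Ecto code). All names (`append_version`, `verify`, the `parent`/`hash`/`deleted`/`reason` keys) are hypothetical. Each version stores the SHA-256 hash of the previous version as its "parent id", and a delete is just another appended version carrying a reason. Strictly speaking a linear chain like this is a hash chain rather than a full Merkle tree, but it gives the same tamper-evidence for a single record's history.

```python
import hashlib
import json


def content_hash(body, parent):
    """SHA-256 over the version body plus the hash of the previous version."""
    payload = json.dumps({"body": body, "parent": parent}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_version(log, fields, deleted=False, reason=None):
    """Append a new version that points at the previous one by hash."""
    parent = log[-1]["hash"] if log else None
    body = {"fields": fields, "deleted": deleted, "reason": reason}
    entry = {"parent": parent, "hash": content_hash(body, parent), **body}
    log.append(entry)
    return entry


def verify(log):
    """Recompute every hash; editing an earlier version breaks the chain."""
    parent = None
    for entry in log:
        body = {k: entry[k] for k in ("fields", "deleted", "reason")}
        if entry["parent"] != parent:
            return False
        if entry["hash"] != content_hash(body, parent):
            return False
        parent = entry["hash"]
    return True


log = []
append_version(log, {"name": "Alice", "city": "London"})
append_version(log, {"name": "Alice", "city": "Lisbon"})
# "Deleting" appends a tombstone with a reason; earlier versions stay put.
append_version(log, {}, deleted=True, reason="customer closed account")
```

The UI would treat a record whose latest version has `deleted` set as gone, while the history remains available for rollback until a separate batch process really removes it.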

Similar to: (please use these as a reference when writing the doc(s))

  • https://github.com/dwyl/phoenix-ecto-encryption-example [Quite Technical]
  • https://github.com/dwyl/phoenix-chat-example [more Beginner Friendly]

nelsonic avatar Sep 12 '18 13:09 nelsonic

At present the README.md is a bit sparse at the start of PR #2: https://github.com/dwyl/phoenix-ecto-append-only-log-example/pull/2/files#diff-04c6e90faac2675aa89e2176d2eec7d8R4

I realise that I am responsible for making the intro clear and beginner friendly. I will add to the README.md as soon as I get a chance.

nelsonic avatar Sep 26 '18 14:09 nelsonic

Database Migrations are Always Backward Compatible. "Tables" are only ever extended with new columns. Columns are never deleted or altered so existing code/queries never "break". This is essential for "Zero Downtime Continuous Deployment". A Database Migration can be applied before the Application Server is Updated and the existing/current version of the Application can continue to run like nothing happened.
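A quick way to convince yourself of this is the sketch below. It uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for PostgreSQL, and the `addresses` table and `postcode` column are made-up names for illustration. The point is only that a query written before an additive migration keeps working after it.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("INSERT INTO addresses (name, city) VALUES ('Alice', 'London')")

# Query written against the original schema (the "old" application code).
OLD_QUERY = "SELECT name, city FROM addresses ORDER BY id"
before = conn.execute(OLD_QUERY).fetchall()

# Additive migration: a new nullable column; nothing is dropped or renamed.
conn.execute("ALTER TABLE addresses ADD COLUMN postcode TEXT")

# The old query still runs unchanged against the migrated table.
after = conn.execute(OLD_QUERY).fetchall()

# The "new" application code can start using the extra column.
conn.execute(
    "INSERT INTO addresses (name, city, postcode) VALUES (?, ?, ?)",
    ("Bob", "Lisbon", "1000-001"),
)
rows = conn.execute(
    "SELECT name, postcode FROM addresses ORDER BY id"
).fetchall()
```

Rows written before the migration simply have no value in the new column, so both versions of the application can run side by side during a deploy.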

nelsonic avatar Sep 28 '18 03:09 nelsonic

@Cleop given your knowledge quest in #5 do you feel that you can write the intro to this tutorial and do a general tidy/tightening so we can "publish" it to the Elixir community? Please let me know if you feel confident that you can approach the first 3 questions with a "beginner's mind" and "sell" the idea of an append-only log to people who are "curious" but not yet "convinced".

If you have any questions, as always, I'm happy to answer/clarify. 😕 |> 🤔 |> 😃 |> 🤓 |> 🚀 LMK! 👍 Thanks! ✨

nelsonic avatar Oct 06 '18 08:10 nelsonic

Why?

Append only logs are an attractive approach to database structures because:

  • They allow you to process data in real-time. This makes them ideal for analytics. :bar_chart: 🤔
  • Data is never lost, which makes them reliable for recovery during outages as well as ideal for version control and accountability. :closed_lock_with_key:
  • Columns are never deleted or altered so existing code/queries never "break" :negative_squared_cross_mark: :warning:. This is essential for "Zero Downtime Continuous Deployment".
  • A database migration can be applied before the application server is updated and the existing/current version of the application can continue to run like nothing happened. :sunglasses:
  • Database migrations are always backward compatible. :leftwards_arrow_with_hook:

Here are some examples of use cases where these characteristics may prove useful:

  • CRM - where customer data is updated and can be incorrectly altered, having the complete history of a record and being able to "time travel" through the change log is a really good idea. :clock10: :leftwards_arrow_with_hook: :white_check_mark:
  • CMS/Blog - being able to "roll back" content means you can invite your trusted readers / stakeholders to edit/improve your content without "fear" of it decaying. :lock_with_ink_pen:
  • E-Commerce - both for journey/story tracking and transaction reliability. Also, the same applies for the Product catalog (which is a specific type of CMS); having version history dramatically increases confidence in the site both from an internal/vendor perspective and from end-users. This is especially useful for the reviews on e-commerce sites/apps where we want to be able to detect/see where people have updated their review following extended usage. e.g: did the product disintegrate after a short period of time? Did the user give an initially unfavourable (e.g. 3/5 star) review and over time come to realise that the product is actually exceptionally durable, well-designed and great value-for-money because it has lasted twice as long as any previous product they purchased to perform the same "job to be done"? :star::star::star::star::star:
  • Chat - a chat system should allow editing of previously sent messages for typos/inaccuracies, but that edit/revision history should be transparent, not just "message edited" (with no visibility of what changed). If a person deletes a message they should have to provide a comment indicating why they are "breaking" the chain (more on this later). :pencil2:
  • Social Networking - not allowing people to delete a message without leaving a clarifying comment promotes accountability for what people write. In many cases this can reduce hate speech. :rage2::speech_balloon:
  • Most Other Consumer Web/Mobile Applications - you name the app, there are ways in which an append-only log is applicable/useful/essential to the reliability/confidence in that app. :sparkling_heart:

What?

The Phoenix Append Only Log Example is an immutable database structure.

It is an alternative to using Ecto's standard "CRUD" which allows overwriting and deleting data without "rollback" or "recoverability". In these instances each update over-writes the state of the record so it's impossible to retrieve it without having to go digging through a backup which is often a multi-day process, cost/time prohibitive or simply unavailable.

Instead, in the Phoenix Append Only Log Example existing records are never overwritten and "tables" are only ever extended with new columns. This means that whilst changes can be displayed as you wish to your users, data is never irretrievable or lost to you on the back-end. It is also a time series database, meaning that whatever activity occurs, we also know when it occurred, which makes it easy to understand the chronology of events.

Who?

All developers who have a basic understanding of database storage in web development and want to "level up" their knowledge/skills. Those who want to improve the reliability of the product they are building. Those who want to understand more ("advanced") "distributed" application architecture including the ability to (optionally/incrementally) use IPFS and Blockchain.

Cleop avatar Oct 14 '18 18:10 Cleop

Hi @Cleop thank you for summarising the intro questions/answers.

Please create a Pull Request with the bullet points you have written and an intro paragraph. (Thanks!)

The Why? section should answer the question: "Why should I care about this?" (or "Why should I take the extra steps to use an Append Only Log?")

Seek to answer the question "What's in it for me?" in the first 7 seconds of the readme. This is the question going through the reader's head, and we can either engage them with a reason to continue reading, or they will "bounce".

Bullet points are good and have their place, keep them! Consider an intro paragraph (prose) which answers the questions going through the reader's mind and will "flow" better when a human is reading it.

e.g:


If you have ever used the "undo" functionality in a program, you will have experienced the power of an Append-only Log.


When changes always create a new state (without altering history) we can easily return to the previous state. This is the principle in the Elm Architecture (borrowed by Redux, and so by most React apps) and in Elixir/Haskell too; data is always "transformed", never mutated. This makes it much faster to build reliable/predictable apps because debugging apps is considerably easier when state is always known.


If any of these terms are unclear to you now, never fear, we will be clarifying them below. The main thing to remember is that using an Append-only Log to store your App's data makes it much easier to build the App almost immediately because records are never changed, history is preserved and can easily be referred to, i.e. you have built-in debug-ability/traceability.

Once you have overcome the initial learning curve, you will see that your Apps become easy to reason about and you "unlock" many other possibilities for useful features and functionality that will delight the users! You will get your work done much faster and more reliably, users will be happier with the UX and Product Owners/Managers will be able to see how data is transformed in the app and easily visualise the usage data and "flow" on analytics charts/graphs in realtime!


I'm sure you can re-word/work this to read bettererer.

nelsonic avatar Oct 15 '18 11:10 nelsonic

Intro questions answered by @Cleop's PR #9 🎉

nelsonic avatar Oct 15 '18 17:10 nelsonic

Oops. I closed the issue but it still has unfinished items in the checklist...! (derp!) 😕 Re-opening and ensuring that we cover the checklist items. ⏳

nelsonic avatar Oct 16 '18 10:10 nelsonic