Sprint - May 13 to May 24, 2024
Global Sprint Planning
3 things that might take us down
- Clickhouse fire rages
Team sprint planning
For your team sprint planning copy this template into a comment below for each team.
# Team ___
**Support hero:** ___
## Retro
<!-- Grab the high and low priority items from last time and add whether that item was completed or not -->
-
## Hang over items from previous sprint
<!-- For each item, decide to re-prioritise (and add below) or deprioritise -->
- Item 1. prioritised/deprioritised
## OKR
1. OKR, status (red/yellow/green) and action points if yellow/red
### High priority
-
### Low priority / side quests
-
Team ~Benterprise~ ~what-the-heck-is-siem~ Infra
Hang over items from previous sprint
- 🟡 SOC2 Audit work came in with a fair amount of follow-up work, taking up a good chunk of @danielxnj's time
- 🟡 Switched gear to try Loki as our central log system to tick the SIEM box @danielxnj
- Still a lot of work here to make it performant, get all the logs from different systems in etc.
- 🟢 Noticed big costs in Cloudwatch logs - investigated where it was coming from and dropped the issue (halving the cost) @frankh
- 🟢 Get Vault UI and roles and accounts setup @danielxnj
- We will let this take a back seat to focus on other things atm
- 🟢 New kafka setup (i.e. correct zones and size for migrating over to) @danielxnj
- 🔴 Get some or all logs running into SIEM tool @danielxnj
- Ditched Wazuh as it seemed like a maintenance nightmare
- 🔴 Documentation for Canary deploys so that people can use them @frankh
- 🟢 Capacity planning dashboard with additional alerting @frankh
- 🟢 Verify and then document / publicize the IP allow list stuff @benjackwhite
- 🟢 Rollout security improvement fixes @benjackwhite
- 🟡 CDP v1.5 - Webhooks destination filtered by Action @benjackwhite
OKR
- 💪 Deploy with confidence 🟢
- Finalize our Canary Deploy process 🟡
- 🚨 Improved alerting and monitoring 🟢
- SIEM work has made this a priority
- Deeper Security 🟡
- 💰 Continued cost control 🟢
High priority
- Look into Loki performance as we will base our SIEM plan on it @frankh
- Get as many other log sources into Loki as possible (Kubernetes audit logs, Cloudtrail, SQL/Clickhouse queries, Metabase, HogQL queries) @danielxnj
- Get more useful auditing logs out of Django (auth attempts, access logs, HogQL queries) @benjackwhite
- Get Metabase setup to tick the audit boxes (auditing, SSO etc.) @frankh @danielxnj
- Continue with Soc2 Audit follow up @danielxnj
- Fix all remaining High/medium pen test outcomes @benjackwhite
- Offsite!
Team Data Shack
OKR Q2 2024
Objective
Release data warehouse to everyone
Key Results:
- Integration first experience
- schemas are reliable
- modeling of each integration is clear
- Good automatic roll up views and joins
- Wizard to onboard people
- Establish a solid pattern to build integrations
- Complete data warehouse experience in the rest of the app (insights, feature flags, experiments)
James as a Service -> Clickhouse as a Service
Key Results:
- Better Visibility
- Regularly testing backups
- Monitoring/alerting
- Mutations
- Moves
- Management
- Managing/killing mutations
- Self Serve
- Schema design feedback (James non blocking)
- Schema management
- Automation
- Replace/Upgrade replicas
- Upgrading to 24.04
- Disk configs
Retro
Product
- [x] @EDsCODE Postgres integration all connection options. Spike of customer requests that really want this working. Solved the immediate bugs here. Did not end up building out connection options.
- [x] @EDsCODE ^shifted focus to resolving issues with joined person property filtering
- [x] @Gilbert09 Schema Validation workflow. Ability to edit schema that's interpreted by clickhouse
Infra
- [x] Fix data discrepancies (blocking upgrade to 23.12)
- [ ] Fully upgrade CH clusters to 23.12
- [x] Storage increases across the clusters
- [x] EU Cluster replacement progress (node is up - waiting to start replication)
High priority
- offsite
Product
- making the UI changes to move sources into pipeline 3000
- Hackathon
Infra
- [ ] Hackathon
- [ ] Start EU replication
- [ ] Upgrade EU Cluster to 23.12
- [ ] Increase storage on EU instances
Team Product your Analytics
Support hero: Thomas
Retro
We're still in the middle of it all. HogQL insights are out for everyone. We found a few blind spots (formula + differing results for breakdown options, MAU trends modal, dashboard issues), which we're hard at work on. Dashboards are getting better by the day, fixed a big insight issue today.
- 🟢 100% of all insight results for everyone use the new HogQL backend @mariusandra
- 🟡 Clean up insights - everyone
- 🟡 Fix existing insight bugs - everyone
- 🟡 Fix dashboardLogic @webjunkie
- ⚪ Convert filters to query for the /api/../insight endpoints
- ⚪ Remove all old legacy code @thmsobrmlr
Low priority / side quests
- 🟡 Start developing the new insight features (Sandy joined and worked on this!)
- ⚪ Fix shared dashboards reloads @Twixes
- ⚪ Project environments
Hang over items from previous sprint
Support & insight bugs.
OKR
No major change here.
- HogQL-based querying
- Convert the remaining legacy queries to HogQL and release to public (Thomas, Julian, Marius)
- 🟢 Insights: they are rolled out!! (still some bugs)
- ⚪ Cohorts
- Remove legacy querying backend (Thomas, Julian)
- Clean up or rewrite dashboardLogic (this sprint)
- Convert filters to query (insights, notebooks, activity log, experiments) (this sprint)
- Missing Product Analytics features (Thomas, Julian)
- ⚪ Breakdowns (multiple) in literally everything
- Make a list based on GitHub issues from customer requests... (this sprint)
- ⚪ Fix those issues
- Missing HogQL features (Tom, Marius)
- Type system, JSON (Data Warehouse is on this)
- ⚪ Missing things when building funnels
- Querying and processing performance (Thomas, Julian)
- Global performance overview dashboards
- ⚪ Insights
- ⚪ Exports
- ⚪ Cohort recalculations
- Query request tracing
- ⚪ Possibly query runner Python optimizations
- ⚪ Exports improvements
- ⚪ Identify top 5 query optimizations in terms of impact
- Artificial Hog / Post Intelligence (Michael)
- ⚪ Ask a question to get a magical insight (aware of your taxonomy)
- ⚪ Figure out infra for upgrading queries and models
- ⚪ Product-wide framework for opting into sharing with OpenAI
- Activation (side quest: Michael)
- ⚪ Michael to work with Growth to identify optimizations to getting started with Product Analytics
High priority
For most of the team:
- 1 week offsite
For everyone else:
- Fix all reported and open insight issues
- Fix dashboard and insight logics
Low priority / side quests
Better tracked here: https://github.com/PostHog/meta/issues/200
- Project environments
Team Growth
Retro
Retro items
High priority
- @raquelmsmith
- Out 1.5 days this sprint
- [x] UI for person profiles addon
- [x] General project management & comms for person profiles addon
- [ ] Feature gating for activity panel
- Migrate customers
- [ ] Feature gating for automatic provisioning
- Migrate customers
- @xrdt
- [x] Put tasks into celery
- [x] Put sync_invoices tasks into celery so we can unlock parallelization and process isolation
- Now have grafana metrics
- [ ] Billing admin improvements
- [ ] Make plans_map json a series of selects so it's less error-prone
- [x] Find a way to enhance history messages for inlined objects (CustomerToStripeCustomer relation)
- [ ] compare_prices improvements
- [ ] Add tests, make sure things like compare prices when we are overriding price_id_overrides works.
- @zlwaterfield
- [x] Get teams plan addon shipped
- [x] New Teams Plan in app Billing UI
- [ ] Work through changes to subscribe to all products - billing page, pricing page, activation logic, pay gates, etc. - will do next week
- [x] Better loading states for activate/deactivate subscriptions
- [x] Help Frank get Frank Django 3.10 out w/ upgrade to nginx
- [ ] (nice to have) Do some cleaning of the billing repo - looking at https://github.com/HackSoftware/Django-Styleguide and trying to make the logic cleaner and easier to read / debug
- [x] Has first PR that pulls trust scores out into a service
- [ ] (nice to have) Look into email subscription (this seems like an important topic to make sure we're staying compliant - https://posthog.slack.com/archives/C043VJ93L3B/p1713538664642819)
Q2 Goals
✅=finished 🟡=in progress 🔴=won't finish
- ⚪ Create a flow in product analytics onboarding to fill out a dashboard template using actions (Raquel)
- 🟡 Simplify our subscription flows (Zach, supported by Raquel)
- 🟡 Launch pricing changes (Bianca, Raquel)
- 🟡 Personless events - will help us reach more customers at an affordable price
- Data warehouse - it's becoming pretty useful, we should charge for it
- Session replay - we can reduce costs to improve retention and reach more people
This sprint
- Only have 3 non-offsite days this sprint!
High priority
- @raquelmsmith
- [ ] Get automatic provisioning feature gating out
- [ ] Personless events pricing UI on posthog.com and pricing tables
- @xrdt
- [ ] Move remaining bits of sync_invoices to celery
- [ ] Finish PRs for updating unsubscription flow with auto-payment of outstanding invoices
- [ ] Testing compare_prices
- [ ] Billing admin improvements - make plans_map a dropdown of selects, feature_override reason
- @zlwaterfield
- [ ] Finish the 6 project limit migration
- [ ] Migrating all enterprise customers to free teams addon + documenting for CS
- [ ] Looking into other revenue edge cases in reporting
- [ ] Planning for subscribing to all products next week
Team Web Analytics
Support hero: n/a
Retro
High priority
- 🟢 get session property filters in web analytics rolled out and working for all queries on the page (if not finished this sprint)
Low priority
- 🔴 Rolling out the session property filters for all queries, not just on web analytics
- 🔴 Ploughing through small feature requests for web analytics
Rolled out web analytics (with Beta tag) to 100%, but I'm not happy with query performance. Based on a hunch, I did a spike of using UUIDv7s (cast to UInt128) for the session id. So far it looks like a significant speedup, around 87% on Clickhouse Cloud, but it needs to be tested in a prod-like env. See WIP RFC
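For context on the UUIDv7 idea, here is a rough sketch of the property that makes it attractive: under RFC 9562 the top 48 bits of a UUIDv7 are a Unix millisecond timestamp, so the 128-bit integer form sorts by creation time and the start time can be read straight off the id. The helpers `make_uuid7` and `session_start_ms` are made up for illustration; they are not from the spike or the RFC draft, and the actual sessions table may handle this differently.

```python
import os
import uuid


def make_uuid7(unix_ms: int) -> uuid.UUID:
    """Hypothetical helper: hand-build a UUIDv7 using the RFC 9562 bit layout."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                  # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)      # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp in ms
    value |= 0x7 << 76                          # bits 79..76: version 7
    value |= rand_a << 64                       # bits 75..64: rand_a
    value |= 0b10 << 62                         # bits 63..62: variant
    value |= rand_b                             # bits 61..0: rand_b
    return uuid.UUID(int=value)


def session_start_ms(session_id: str) -> int:
    """Hypothetical helper: read the millisecond timestamp out of a UUIDv7 string."""
    return uuid.UUID(session_id).int >> 80


if __name__ == "__main__":
    earlier = make_uuid7(1_715_558_400_000)
    later = make_uuid7(1_715_558_400_001)
    # The 128-bit integers order by timestamp, whatever the random bits are.
    print(earlier.int < later.int)
    print(session_start_ms(str(earlier)))
```

The same integer is what a cast to a 128-bit unsigned type would hold, which is why ids from a later millisecond always compare greater than ids from an earlier one.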
Hang over items from previous sprint
- deprioritised behind perf work: Rolling out the session property filters for all queries, not just on web analytics
OKR
- Make querying fast enough for large customers
- Do personless events work where necessary (unknown amount of work)
- Iterate on customer feedback
- Product management work
High priority
- finish this spike on uuidv7s
- if it's worth doing
- write the SQL for the new sessions table that uses it
- write and run a proper backfill job
- switch the hogql sessions table over to the new table
- make sure docs / sdks make it clear enough that session_ids should be a UUIDv7
- then roll out session properties elsewhere in the app
- mop up personless events, TBD(mykonos) what this involves
Team Feature Success
Support hero: @neilkakkar
Days off:
Juraj: 5 + 3 days
Neil: 5 + 2 days
Retro
- Surveys branching logic - @jurajmajerik - β
- Focused on a seamless dev environment before working on branching
- Refactoring survey popup code to make improvements easier
- MDE for experiments
- Make sure Phani's onboarding goes well - @neilkakkar β
- Create dumb feature flags service in rust, write down steps to make it production ready - @neilkakkar β -> https://github.com/PostHog/posthog/issues/22131
- Make repeatedly showing dismissed surveys more ergonomic - https://github.com/PostHog/posthog/issues/17863 - @Phanatic 🟡
- Link survey security vulnerability - @Phanatic β
Hang over items from previous sprint
OKRs
- Make sure feature flags can handle 10x current scale
- Polish new experiments UI & collect feedback
- Add most requested surveys functionality
High priority
- Offsite hackathon + wrapup - @jurajmajerik / @neilkakkar
- Surveys popup refactor - @jurajmajerik
- Make sure Dylan's onboarding goes well - @neilkakkar
- Implement the RFC for repeatable surveys - https://github.com/PostHog/meta/pull/203 + some small bugs - @Phanatic
Stretch:
- https://github.com/PostHog/posthog/issues/22131 make some progress on rewrite - @neilkakkar
Low priority / side quests / maybe Neil will get to this next year
- Temporal queues for feature success - @neilkakkar
- Setup instrumentation for flip-flopping problem of experiment significance - @neilkakkar