HouseClick
HouseClick copied to clipboard
House prices app
HouseClick - ClickHouse + Supabase demo
HouseClick is a fake estate agency business, powered by Supabase and ClickHouse via the ClickHouse Foreign Data Wrapper. Supabase for transactional data and application state, ClickHouse for analytics - all the through the Supabase API.
This demo complements the blog post Adding Real Time Analytics to a Supabase Application. This requires you to have the Foreign Data Wrapper for ClickHouse available in your Supabase project.
Important
- This is a demo only.
- The code has no tests and comments are currently minimal - although the code is very simple.
- Rough and ready. Use for inspiration only.
- PRs welcome to improve the code.
- Due to concerns regards image writes, we do not provide a generated dataset. Steps to generate are included.
- This requires the Foreign Data Wrapper (FDW) for ClickHouse to be available in Supabase.


Setup
1. Requirements
- Python 3.10+ - for data generation scripts
- Node v16.15.1+
- Yarn 1.22.19
- Supabase (free tier is sufficient) account or local instance. You will need Foreign Data Wrapper enabled on this instance.
- OpenAPI key. $10 free credit is sufficient to generate 1000 properties. This is used to generated property descriptions.
2. Installation
git clone [email protected]:ClickHouse/HouseClick.git
# setup script dependencies
cd scripts
# optionally create a virutal env e.g. virtualenv -p python3 .venv && source .venv
pip install -r requirements.txt
# Install frontend
cd ../HouseClick
yarn install
3. Loading data into ClickHouse
From the ClickHouse client. This data is around 28m rows and takes several minutes to load on most internet connections. Further details here.
CREATE TABLE uk_price_paid
(
price UInt32,
date Date,
postcode1 LowCardinality(String),
postcode2 LowCardinality(String),
type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0),
is_new UInt8,
duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0),
addr1 String,
addr2 String,
street LowCardinality(String),
locality LowCardinality(String),
town LowCardinality(String),
district LowCardinality(String),
county LowCardinality(String)
)
ENGINE = MergeTree
ORDER BY (postcode1, postcode2, addr1, addr2);
INSERT INTO uk_price_paid
WITH
splitByChar(' ', postcode) AS p
SELECT
toUInt32(price_string) AS price,
parseDateTimeBestEffortUS(time) AS date,
p[1] AS postcode1,
p[2] AS postcode2,
transform(a, ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type,
b = 'Y' AS is_new,
transform(c, ['F', 'L', 'U'], ['freehold', 'leasehold', 'unknown']) AS duration,
addr1,
addr2,
street,
locality,
town,
district,
county
FROM url(
'http://prod.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-complete.csv',
'CSV',
'uuid_string String,
price_string String,
time String,
postcode String,
a String,
b String,
c String,
addr1 String,
addr2 String,
street String,
locality String,
town String,
district String,
county String,
d String,
e String'
) SETTINGS max_http_get_redirects=10;
-- Create parameterized view for an example FDW visualization
CREATE VIEW default.sold_by_duration AS
SELECT
duration AS name,
count() AS value
FROM default.uk_price_paid
WHERE (duration != 'unknown') AND ((postcode1 = {_postcode:String}) OR (district = {_district:String}) OR (town = {_town:String}))
GROUP BY duration
4. Generating Property listings data
This step requires the uk_price_paid data to be loaded into ClickHouse (see previous step).
From the base directory. Adjust settings as required. Below assumes local instance.
export CLICKHOUSE_HOST=localhost
export CLICKHOUSE_PORT=8123
export CLICKHOUSE_USER=default
export CLICKHOUSE_PASSWORD=password
export CLICKHOUSE_PROTO=http
export OPENAI_API_KEY=blah
If you need to adjust the number of properties see here. This script could be improved - specifically, the price to bedrooms extrapolation. Improvements welcome.
The following will generate a file house_prices.csv in the base directory. This process can be slow depending on your connection and speed of the OpenAI API. This step generates image links using the Bing API.
python scripts/generate_data.py
5. Setup Supabase tables
Execute the following from your Supabase project SQL console. This will create the main uk_house_listings table and add a search index.
It also adds the Foreign Data Wrapper for ClickHouse and connects to the earlier created view.
-- create the table
CREATE TABLE uk_house_listings
(
id integer primary key,
date Date,
addr1 varchar(100),
addr2 varchar(100),
street varchar(60),
locality varchar(35),
town varchar(35),
district varchar(40),
county varchar(35),
postcode1 varchar(8),
postcode2 varchar(3),
type varchar(13),
duration varchar(9),
is_new SMALLINT,
price INTEGER,
rooms SMALLINT,
title text,
description text,
urls Text[],
sold boolean,
sold_date Date,
features text
)
-- add a search index
alter table
uk_house_listings
add column
fts tsvector generated always as (to_tsvector('english', description || ' ' || title || ' ' || postcode1 || ' '|| postcode2)) stored;
create index listings_fts on uk_house_listings using gin (fts); -- generate the index
-- install clickhouse wrapper
create extension if not exists wrappers;
create foreign data wrapper clickhouse_wrapper
handler click_house_fdw_handler
validator click_house_fdw_validator;
-- modify the following to point to your ClickHouse instance
create server clickhouse_server
foreign data wrapper clickhouse_wrapper
options (
conn_string 'tcp://default:<password>@<host>:9440/default?connection_timeout=30s&ping_before_query=false&secure=true'
);
create foreign table sold_by_duration (
name text,
value bigint,
postcode1 text, -- parameter column, used for input parameter,
district text,
town text
)
server clickhouse_server
options (
table '(select * from sold_by_duration(_postcode=${postcode1},_district=${district}, _town=${town}))',
rowid_column 'duration'
);
6. Loading data into Supabase
We will need a Supabase account (free tier is sufficient) or instance. Ensure you use a Project API keys for this step which have write access.
The data should of been generated in the previous step and be present in the base directory as house_prices.csv.
From the base directory:
export SUPABASE_URL=url
export SUPABASE_PRIVATE_KEY=write_key
python scripts/import_data.py
7. Running in Development mode
From the HouseClick directory.
Important: The Supabase key here should be the [anon key](https://supabase.com/docs/guides/api/api-keys. You should ensure row access security is configured and this key has only read access to the uk_house_listings table.
export CLICKHOUSE_HOST=http://localhost:8123
export CLICKHOUSE_USER=default
export CLICKHOUSE_PASSWORD=password
export NEXT_PUBLIC_SUPABASE_URL=url
export NEXT_PUBLIC_SUPABASE_ANON_KEY=anon_key
yarn dev
Visit localhost:3000!