Building a Type-Safe OLAP Query Layer with Haskell and Solid.js for Rendering Massive Datasets


We were facing a thorny problem: the business side needed a highly flexible data exploration interface capable of on-the-fly aggregation, filtering, and analysis on billions of rows of log data in our data warehouse. Existing BI tools were either too clunky or couldn’t meet our custom query requirements. The biggest pain point of an in-house solution was the fragile, string-based SQL communication link between the frontend and backend. Any change to a data warehouse table or a minor tweak in the frontend’s query logic could crash the backend API or cause it to return incorrect data, making the entire debugging process a nightmare.

In a real-world project, this kind of fragile contract between the frontend and backend is a massive source of technical debt. The frontend sends a complex JSON structure, which the backend then carefully parses and concatenates into an SQL string. This process is not only susceptible to SQL injection but also completely devoid of type safety. What we needed wasn’t a simple dashboard, but a robust, type-safe, end-to-end query builder, from the user interface all the way down to the data warehouse.

Our initial idea was: could we use a strongly-typed language on the backend to model the query logic itself? Instead of passing JSON, we would pass a structured, type-safe object representing the query’s intent. The backend would receive this object and deterministically and safely compile it into SQL. This line of thought led us directly to Haskell. Its Algebraic Data Types (ADTs) are tailor-made for this class of problem.

On the frontend, we needed extreme rendering performance. When an aggregate query can return tens or even hundreds of thousands of rows, any framework based on a Virtual DOM (like React or Vue) will hit a bottleneck in rendering and updates. We needed a solution that could manipulate the DOM directly and perform fine-grained updates. Solid.js, with its V-DOM-free, compile-time reactivity model, emerged as the ideal choice.

Thus, our technology stack was decided: ClickHouse as the high-performance OLAP data warehouse, Haskell (using the Servant framework) as the type-safe API layer, and Solid.js as the high-performance frontend rendering layer. This combination might seem unconventional, but it was a perfect fit for tackling our two core pain points: type safety and extreme performance.

Architecture Overview

The entire data flow and interaction logic can be represented by the following architecture diagram. The user builds a query object through UI components in the Solid.js frontend, which is sent via an API to the Haskell backend. The backend validates and compiles it into ClickHouse SQL, executes the query, and streams the results back to the frontend to be efficiently rendered by a virtual scrolling list.

sequenceDiagram
    participant User
    participant SolidJS_Frontend as Solid.js UI
    participant Haskell_Backend as Haskell Servant API
    participant ClickHouse_DW as ClickHouse Data Warehouse

    User->>SolidJS_Frontend: Builds query via UI (selects dimensions, metrics, filters)
    SolidJS_Frontend->>SolidJS_Frontend: Syncs UI state to internal query object (Signal)
    SolidJS_Frontend->>Haskell_Backend: POST /query (sends typed query object as JSON)
    Haskell_Backend->>Haskell_Backend: Parses JSON into Haskell ADT (Query)
    Note right of Haskell_Backend: Type safety guarantees; returns 400 on parse failure
    Haskell_Backend->>Haskell_Backend: translateQuery(Query) -> SQL Text
    Haskell_Backend->>ClickHouse_DW: Executes the generated SQL query
    ClickHouse_DW-->>Haskell_Backend: Returns query result stream
    Haskell_Backend-->>SolidJS_Frontend: Streams JSON results back
    SolidJS_Frontend->>SolidJS_Frontend: Receives data and renders to a virtualized list
    SolidJS_Frontend-->>User: Displays query results

Haskell Backend: Building a Type-Safe Query Compilation Layer

The backend’s job is to define the query model and provide an API that can safely transform this model into SQL.

1. Defining the Query with Algebraic Data Types (ADTs)

This is the cornerstone of the entire solution. We use Haskell’s type system to precisely describe the components of an OLAP query.

-- file: src/QueryModel.hs
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}

module QueryModel where

import Data.Aeson (FromJSON, ToJSON)
import Data.Text (Text)
import GHC.Generics (Generic)

-- Supported aggregation functions
data AggregationFunction = Count | Sum | Avg | Min | Max
  deriving (Show, Eq, Generic)

instance FromJSON AggregationFunction
instance ToJSON AggregationFunction

-- A metric, composed of an aggregation function and a column name
data Metric = Metric
  { func :: AggregationFunction,
    column :: Text
  }
  deriving (Show, Eq, Generic)

instance FromJSON Metric
instance ToJSON Metric

-- Filter operators
data FilterOperator = Eq | Neq | Gt | Lt | Gte | Lte | In
  deriving (Show, Eq, Generic)

instance FromJSON FilterOperator
instance ToJSON FilterOperator

-- The value for a filter condition
data FilterValue = TextValue Text | IntValue Int | TextList [Text] | IntList [Int]
  deriving (Show, Eq, Generic)

instance FromJSON FilterValue
instance ToJSON FilterValue

-- A complete filter condition
data Filter = Filter
  { fColumn :: Text,
    fOp :: FilterOperator,
    fValue :: FilterValue
  }
  deriving (Show, Eq, Generic)

instance FromJSON Filter
instance ToJSON Filter

-- Sort direction
data SortDirection = Asc | Desc
  deriving (Show, Eq, Generic)

instance FromJSON SortDirection
instance ToJSON SortDirection

-- A sort condition
data SortBy = SortBy
  { sColumn :: Text,
    sDirection :: SortDirection
  }
  deriving (Show, Eq, Generic)

instance FromJSON SortBy
instance ToJSON SortBy

-- The complete OLAP query model
data OlapQuery = OlapQuery
  { dimensions :: [Text],         -- Grouping dimensions
    metrics :: [Metric],         -- Aggregation metrics
    filters :: [Filter],         -- Filter conditions
    sortBy :: Maybe [SortBy],      -- Sorting
    limit :: Maybe Int,            -- Result limit
    offset :: Maybe Int           -- Result offset
  }
  deriving (Show, Eq, Generic)

instance FromJSON OlapQuery
instance ToJSON OlapQuery

A common pitfall here is designing FilterValue to handle different types of values like strings, numbers, and lists (for IN operations). An ADT is the perfect choice for this.

2. The Translator: From Query Model to SQL

This pure function is the core logic of the backend. It takes an OlapQuery and returns a safe, parameterized SQL query string. We deliberately avoid direct string concatenation of user input; instead, we build the SQL through pattern matching and safe helper functions, eliminating SQL injection at its root.

-- file: src/SqlGenerator.hs
{-# LANGUAGE OverloadedStrings #-}

module SqlGenerator (translateQuery) where

import Data.Text (Text, pack, intercalate, toLower)
import QueryModel

-- The main translation function
translateQuery :: Text -> OlapQuery -> Text
translateQuery tableName query =
  let selectClause' = selectClause (dimensions query) (metrics query)
      fromClause' = "FROM " <> tableName
      whereClause' = whereClause (filters query)
      groupByClause' = groupByClause (dimensions query)
      orderByClause' = orderByClause (sortBy query)
      limitClause' = limitClause (limit query)
      offsetClause' = offsetClause (offset query)
  in intercalate " "
       [ selectClause',
         fromClause',
         whereClause',
         groupByClause',
         orderByClause',
         limitClause',
         offsetClause'
       ]

-- Build the SELECT clause
selectClause :: [Text] -> [Metric] -> Text
selectClause dims mets = "SELECT " <> intercalate ", " (dims <> map metricToSql mets)

-- Build the WHERE clause
whereClause :: [Filter] -> Text
whereClause [] = ""
whereClause fs = "WHERE " <> intercalate " AND " (map filterToSql fs)

-- Build the GROUP BY clause
groupByClause :: [Text] -> Text
groupByClause [] = ""
groupByClause dims = "GROUP BY " <> intercalate ", " dims

-- Build the ORDER BY clause
orderByClause :: Maybe [SortBy] -> Text
orderByClause Nothing = ""
orderByClause (Just []) = ""
orderByClause (Just sbs) = "ORDER BY " <> intercalate ", " (map sortByToSql sbs)

-- Build the LIMIT/OFFSET clauses
limitClause :: Maybe Int -> Text
limitClause Nothing = ""
limitClause (Just l) = "LIMIT " <> pack (show l)

offsetClause :: Maybe Int -> Text
offsetClause Nothing = ""
offsetClause (Just o) = "OFFSET " <> pack (show o)

-- Internal helper functions to convert parts into SQL fragments
metricToSql :: Metric -> Text
metricToSql (Metric func col) = aggFuncToSql func <> "(" <> col <> ") AS " <> toLower (pack (show func)) <> "_" <> col

aggFuncToSql :: AggregationFunction -> Text
aggFuncToSql Count = "count"
aggFuncToSql Sum = "sum"
aggFuncToSql Avg = "avg"
aggFuncToSql Min = "min"
aggFuncToSql Max = "max"

filterToSql :: Filter -> Text
filterToSql (Filter col op val) = col <> " " <> opToSql op <> " " <> valueToSql val

opToSql :: FilterOperator -> Text
opToSql Eq = "="
opToSql Neq = "!="
opToSql Gt = ">"
opToSql Lt = "<"
opToSql Gte = ">="
opToSql Lte = "<="
opToSql In = "IN"

valueToSql :: FilterValue -> Text
valueToSql (TextValue t) = "'" <> t <> "'" -- WARNING: Use parameterized queries in production, this is for demonstration only
valueToSql (IntValue i) = pack (show i)
valueToSql (TextList ts) = "(" <> intercalate ", " (map (\t -> "'" <> t <> "'") ts) <> ")"
valueToSql (IntList is) = "(" <> intercalate ", " (map (pack . show) is) <> ")"

sortByToSql :: SortBy -> Text
sortByToSql (SortBy col dir) = col <> " " <> directionToSql dir

directionToSql :: SortDirection -> Text
directionToSql Asc = "ASC"
directionToSql Desc = "DESC"

Production-Grade Caveat: The valueToSql function above directly interpolates values for clarity. In a real project, this is absolutely the wrong approach. You must use the parameterized query functionality provided by your database driver, sending the query template and values separately to the database. For example, using the query function from postgresql-simple with ? placeholders.

3. Defining the API with Servant

Servant allows us to define the API’s structure using Haskell’s types, which makes the API implementation and documentation generation extremely reliable.

-- file: src/Api.hs
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}

module Api where

import Data.Aeson (Value)
import Data.Text (Text)
import Servant
import QueryModel

-- Define the API type
-- A POST /query endpoint that takes an OlapQuery in the request body
-- and returns an array of JSON objects.
type OlapAPI = "query" :> ReqBody '[JSON] OlapQuery :> Post '[JSON] [Value]

-- Create a proxy
olapApi :: Proxy OlapAPI
olapApi = Proxy

4. API Server Implementation

Here we implement the API handler, which will connect to the database, execute the query, and return the results.

-- file: app/main.hs
{-# LANGUAGE OverloadedStrings #-}

import Network.Wai.Handler.Warp (run)
import Servant
import Api (OlapAPI, olapApi)
import QueryModel (OlapQuery)
import SqlGenerator (translateQuery)
import Data.Aeson (Value)
import Database.ClickHouse       -- Assuming a library like 'clickhouse-haskell' is used
import Control.Monad.IO.Class (liftIO)
import Data.Text (unpack)

-- Mock database configuration
dbConfig :: ClickHouseConnection
dbConfig = undefined -- Configure your ClickHouse connection info here

server :: Server OlapAPI
server = handleQuery
  where
    handleQuery :: OlapQuery -> Handler [Value]
    handleQuery query = do
      -- Log the received query
      liftIO $ putStrLn $ "Received query: " ++ show query
      
      -- Translate the OlapQuery into SQL
      let sql = translateQuery "logs_table" query
      liftIO $ putStrLn $ "Generated SQL: " ++ unpack sql
      
      -- Connect to the database and execute the query
      -- This is a simplified example; a real app needs connection pooling
      -- and more robust error handling.
      result <- liftIO $ runClickHouseQuery dbConfig sql
      
      case result of
        Left err -> do
          liftIO $ putStrLn $ "Query failed: " ++ show err
          throwError err500 { errBody = "Database query failed." }
        Right rows -> return rows -- Assuming the driver returns [Value]

app :: Application
app = serve olapApi server

main :: IO ()
main = run 8080 app

This implementation shows the entire flow: receive the request, translate the query, execute it, and return the result. A common mistake is to handle too much business logic inside the Handler. The correct approach is to abstract it into separate modules and unit test them, especially the SqlGenerator module.

Solid.js Frontend: High-Performance Data Display

The core challenges on the frontend are twofold: 1) building a reactive query state that corresponds to the backend’s type model, and 2) efficiently rendering a potentially massive result set.

1. State Management and Type Definitions

We use TypeScript to define a query object that mirrors the Haskell ADTs and use Solid’s createSignal or createStore to manage this state.

// file: src/queryStore.ts
import { createStore } from 'solid-js/store';

// TypeScript types corresponding to the Haskell ADTs
export type AggregationFunction = 'Count' | 'Sum' | 'Avg' | 'Min' | 'Max';
export type FilterOperator = 'Eq' | 'Neq' | 'Gt' | 'Lt' | 'Gte' | 'Lte' | 'In';
export type FilterValue = string | number | string[] | number[];

export interface Metric {
  func: AggregationFunction;
  column: string;
}

export interface Filter {
  fColumn: string;
  fOp: FilterOperator;
  fValue: FilterValue;
}

// ... other types like SortBy, SortDirection ...

export interface OlapQuery {
  dimensions: string[];
  metrics: Metric[];
  filters: Filter[];
  sortBy: any[] | null;
  limit: number | null;
  offset: number | null;
}

// Use a Solid store to manage the complex query state
const [query, setQuery] = createStore<OlapQuery>({
  dimensions: ['event_type'],
  metrics: [{ func: 'Count', column: '*' }],
  filters: [],
  sortBy: null,
  limit: 1000,
  offset: 0,
});

export { query, setQuery };

2. The Query Builder UI

The job of the UI components is to modify the query store defined above. For example, a dimension selector component:

// file: src/components/DimensionSelector.tsx
import { For } from 'solid-js';
import { query, setQuery } from '../queryStore';

const ALL_DIMENSIONS = ['user_id', 'event_type', 'country_code', 'path'];

export function DimensionSelector() {
  const handleCheckboxChange = (dimension: string, checked: boolean) => {
    if (checked) {
      setQuery('dimensions', (dims) => [...dims, dimension]);
    } else {
      setQuery('dimensions', (dims) => dims.filter((d) => d !== dimension));
    }
  };

  return (
    <div>
      <h4>Dimensions</h4>
      <For each={ALL_DIMENSIONS}>
        {(dim) => (
          <div>
            <input
              type="checkbox"
              id={`dim-${dim}`}
              checked={query.dimensions.includes(dim)}
              onChange={(e) => handleCheckboxChange(dim, e.currentTarget.checked)}
            />
            <label for={`dim-${dim}`}>{dim}</label>
          </div>
        )}
      </For>
    </div>
  );
}

3. Data Fetching and Virtualized Rendering

When the query state changes, we need to trigger an API request. createResource is the standard way to handle async data in Solid.js. The results must be rendered using a virtual list.

```tsx
// file: src/components/ResultsTable.tsx
import { createResource, For, Show } from ‘solid-js’;
import { query } from ‘../queryStore’;
import { createVirtualizer } from ‘@tanstack/solid-virtual’;

const fetchResults = async (q: OlapQuery) => {
try {
const response = await fetch(‘/api/query’, {
method: ‘POST’,
headers: { ‘Content-Type’: ‘application/json’ },
body: JSON.stringify(q),
});
if (!response.ok) {
throw new Error(API Error: ${response.statusText});
}
return (await response.json()) as Record<string, any>[];
} catch (err) {
console.error(“Failed to fetch query results:”, err);
// In a real app, provide better UI feedback for errors
return [];
}
};

export function ResultsTable() {
// createResource will automatically re-fetch when the query signal changes
const [data] = createResource(() => query, fetchResults);

let parentRef: HTMLDivElement | undefined;

const rowVirtualizer = createVirtualizer({
count: () => data()?.length ?? 0,
getScrollElement: () => parentRef,
estimateSize: () => 35, // Estimated height of each row
overscan: 5, // Number of extra items to render
});

const columns = () => {
if (!data() || data()!.length === 0) return [];
return Object.keys(data()


  TOC