May 18, 2026·13 min read

How to Add Personalization to Your App: 5 Patterns for 2026

Five concrete patterns for adding personalization to your app in 2026: re-rank API, edge feature store, signal-driven, generative-on-the-fly, and the full layer.

Alex Shrestha·Founder, ×marble

TL;DR.

  • There are five viable patterns for adding personalization to your app in 2026, ordered by how invasive the integration is: re-rank API, edge feature store, signal-driven, generative-on-the-fly, and the full personalization layer.
  • The re-rank API is the cheapest first move — you keep your existing list endpoint and add one HTTP call that reorders the array per user, typically in <30 ms p50.
  • The edge feature store moves precomputed user vectors to the CDN so reads happen at <10 ms from the browser without a round trip to your origin.
  • Signal-driven personalization treats every click, hover, and scroll as an event and re-ranks the next page load — this is what Spotify, TikTok, and Netflix do.
  • Generative-on-the-fly is the newest pattern: an LLM rewrites copy, headlines, and CTAs for each user segment, but only for slots small enough that latency and cost stay sane.

If you're trying to figure out how to add personalization to your app this year, you don't have one decision to make; you have five. Each pattern below trades implementation effort against how much of the user experience it touches. We've shipped variants of all five and wired them into the reference architecture for real-time personalization we run for our own products. This post walks through each one with a code-style sketch, a real product that uses it, and the trade-off that bites in production.

The patterns are ordered by invasiveness — when you decide how to add personalization to your app, pick the lowest one that solves your problem, not the highest one that looks impressive.

Pattern 1: the re-rank API (start here)

The cheapest way to add personalization to your app is to leave your existing list endpoint alone and add one HTTP call that reorders the array. Every team we've worked with started here when they asked how to add personalization without a rewrite. Your /api/products or /api/feed returns the same 50 items in the same order it always has. You then POST that list, plus the user ID, to a personalization service. It returns the same 50 items in a different order. You render the reordered list.

// before: GET /api/products -> [item1, item2, ...]
const items = await fetchProducts(category)

// after: same fetch + one re-rank call
const items = await fetchProducts(category)
const ranked = await fetch('https://personalize.api/rerank', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({ user_id, items: items.map(i => i.id) }),
}).then(r => r.json())

// map the returned IDs back to full items; skip any ID we don't recognize
const byId = new Map(items.map(i => [i.id, i]))
return ranked.items.map(id => byId.get(id)).filter(Boolean)

When to use it. You have a working app with one or more lists (product grid, feed, search results, "recommended for you" rail). You don't want to rewrite anything. You're willing to add one network call, typically <30 ms, on the critical path.
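Because the re-rank call sits on the critical path, it helps to treat the service's answer as advisory rather than authoritative. The sketch below is one way to do that, not a vendor's API: `applyRanking` is a hypothetical helper that tolerates unknown IDs, duplicates, and missing items, and falls back to your original order when the ranking is empty (for example, after a timeout).

```typescript
/**
 * Reorders `items` to follow `rankedIds`, treating the ranking as advisory:
 * - unknown or duplicate IDs in the response are ignored
 * - items the service left out keep their original relative order at the end
 * - an empty ranking (e.g. after a timeout) returns the original order
 */
function applyRanking<T extends { id: string }>(items: T[], rankedIds: string[]): T[] {
  // Index items by ID once so each lookup is O(1).
  const byId = new Map<string, T>();
  for (const item of items) byId.set(item.id, item);

  const placed = new Set<string>();
  const result: T[] = [];

  // First pass: follow the service's order for IDs we actually have.
  for (const id of rankedIds) {
    const item = byId.get(id);
    if (item !== undefined && !placed.has(id)) {
      result.push(item);
      placed.add(id);
    }
  }

  // Second pass: append anything the service did not mention.
  for (const item of items) {
    if (!placed.has(item.id)) result.push(item);
  }
  return result;
}
```

With this in place, a failed or slow re-rank call can resolve to an empty ID list and the page still renders the unpersonalized order.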

Real example. Algolia's Personalization API uses exactly this shape — you keep using their search endpoint, but pass a userToken and they re-rank results based on prior events (Algolia docs). Most "personalization-as-a-service" vendors land here because it's the only integration most customers can absorb.

Trade-off. The re-rank API only touches order. It can't add an item that wasn't in your list, can't change layout, and can't rewrite copy. You're personalizing on the output of your existing ranking, so if your base ranking is bad, the re-rank can only do so much. The other trade-off is that the user ID has to be known at request time: logged-in users get the full benefit, while anonymous users get a degraded experience until they have a stable token, for example a cookie-based ID they have consented to.

Pattern 2: the edge feature store

The re-rank API works, but it puts a synchronous network call on every page load. The edge feature store removes that round trip by precomputing the user's relevant features and pushing them to a CDN-cached endpoint that the browser reads directly. The personalization happens client-side using the feature vector.

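// at the edge (CloudFront / Cloudflare Workers / Vercel Edge)
export async function GET(req: Request) {
  const userId = req.headers.get('x-user-id')
  if (!userId) return Response.json({}, { status: 401 })
  const features = await edgeKV.get(`user:${userId}:features`)
  return Response.json(features ?? {}, {
    // per-user payload: let the browser cache it, but never a shared cache
    headers: { 'cache-control': 'private, max-age=300', vary: 'x-user-id' },
  })
}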

// in the client
const features = await fetch('/edge/features').then(r => r.json())
const ranked = clientSideRerank(items, features)

The trick is that "features" here is small, maybe {topCategories: [...], affinityVector: [0.2, 0.7, ...], priceTier: 'mid'}, a few KB at most. The server-side personalization job runs every few minutes (or on each interesting signal) and writes these vectors to an edge KV store like Cloudflare Workers KV or Vercel Edge Config, or to a multi-region table such as DynamoDB Global Tables. The browser reads them in well under 10 ms because the CDN already has them at the nearest POP.
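The client-side half of this pattern is left abstract in the sketch above, so here is one way a `clientSideRerank` could look. The feature payload follows the example shape in the text; the item fields (`category`, `priceTier`, `embedding`) and the boost weights are assumptions made up for illustration.

```typescript
// Feature payload shape taken from the example in the text.
interface UserFeatures {
  topCategories: string[];
  affinityVector: number[];
  priceTier: "low" | "mid" | "high";
}

// Hypothetical item shape: the catalog would need to ship these fields.
interface CatalogItem {
  id: string;
  category: string;
  priceTier: "low" | "mid" | "high";
  embedding: number[]; // assumed to share dimensions with affinityVector
}

function dot(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let sum = 0;
  for (let i = 0; i < n; i++) sum += a[i] * b[i];
  return sum;
}

// Affinity match plus two small boosts; the 0.5 and 0.25 weights are arbitrary.
function scoreItem(item: CatalogItem, f: UserFeatures): number {
  const categoryBoost = f.topCategories.indexOf(item.category) >= 0 ? 0.5 : 0;
  const priceBoost = item.priceTier === f.priceTier ? 0.25 : 0;
  return dot(f.affinityVector, item.embedding) + categoryBoost + priceBoost;
}

function clientSideRerank(items: CatalogItem[], f: UserFeatures): CatalogItem[] {
  // Carry the original index so ties keep the server's order.
  return items
    .map((item, index) => ({ item, index, score: scoreItem(item, f) }))
    .sort((x, y) => y.score - x.score || x.index - y.index)
    .map((entry) => entry.item);
}
```

The scoring is deliberately cheap: a dot product and two comparisons per item, which is the kind of work that stays comfortable on the main thread for a list of 50.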

When to use it. You care about Largest Contentful Paint, your app is read-heavy, and the personalization decisions are simple enough to run client-side once you have the feature vector. Marketing landing pages, e-commerce home pages, and content discovery surfaces are all great fits.

Real example. Contentstack's Personalize Edge API ships exactly this pattern — personalization decisions happen at the edge close to the user, not round-tripped to origin. AWS has published a full reference implementation at aws-samples/personalization-apis showing the multi-layer caching topology (CloudFront, API Gateway, Lambda, DynamoDB).

Trade-off. You're pushing decisions to the client, which means the first server-rendered response is not personalized: server-rendered SEO content can use the features only after the client has fetched them, so the first render won't reflect them. You also have to be careful about what's in the feature vector. Anything you ship to the client is visible to the user, so no "we think you're a bargain shopper" labels that would embarrass you in dev tools.

Pattern 3: signal-driven personalization

Patterns 1 and 2 personalize at request time using whatever state happens to be loaded. Signal-driven personalization flips the orientation — every click, hover, dwell, and scroll fires an event, and the personalization layer recomputes the user's state in near real time. The next page load reflects the last action, not a model that was trained yesterday.

// fire a signal on every meaningful interaction
track('item_clicked', { item_id, position, dwell_ms })
track('item_added_to_cart', { item_id, qty })
track('scroll_past', { item_id })

// server side: event stream -> feature update -> re-ranking refresh
// (this is where Kafka / Redpanda / a knowledge graph earns its keep)

The server-side half of this pattern is where things get interesting. You need an event pipeline, a feature store that updates incrementally (not a batch job), and a ranker that respects the new state on the very next request. Most teams use a stream processor (Flink, Spark Streaming, or just a Kafka consumer pool) feeding into Redis, DynamoDB, or a graph database. The reason knowledge graphs have caught on here is that signals naturally form a graph — (user)-[clicked]->(item)-[tagged]->(concept) — and graph traversal at read time is faster than rebuilding embeddings on every event. We wrote about this in knowledge graphs vs vector embeddings.
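To make "a feature store that updates incrementally" concrete, here is a minimal sketch of a per-user affinity state that is updated one event at a time with exponential decay. The event names mirror the `track()` calls above; the `category` field (assumed to come from a catalog lookup on `item_id`), the weights, and the 30-minute half-life are illustrative assumptions, not tuned values.

```typescript
type SignalType = "item_clicked" | "item_added_to_cart" | "scroll_past";

interface Signal {
  type: SignalType;
  category: string; // concept the item is tagged with (assumed enrichment)
  timestampMs: number;
}

interface AffinityState {
  scores: Record<string, number>;
  updatedAtMs: number;
}

// Assumed weights: a cart add counts more than a click, a scroll-past is mildly negative.
const SIGNAL_WEIGHTS: Record<SignalType, number> = {
  item_clicked: 1,
  item_added_to_cart: 3,
  scroll_past: -0.5,
};

const HALF_LIFE_MS = 30 * 60 * 1000; // older interest halves every 30 minutes

// Pure update: decay every existing score by elapsed time, then add the new signal.
function applySignal(state: AffinityState, signal: Signal): AffinityState {
  const elapsed = Math.max(0, signal.timestampMs - state.updatedAtMs);
  const decay = Math.pow(0.5, elapsed / HALF_LIFE_MS);

  const scores: Record<string, number> = {};
  for (const key of Object.keys(state.scores)) {
    scores[key] = state.scores[key] * decay;
  }
  scores[signal.category] = (scores[signal.category] || 0) + SIGNAL_WEIGHTS[signal.type];

  return { scores, updatedAtMs: Math.max(state.updatedAtMs, signal.timestampMs) };
}
```

Because each update only needs the previous state and one event, the same function can run inside a stream consumer and write the result to Redis or DynamoDB without a batch job.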

When to use it. You care about session-level personalization, not day-level. Discovery products (TikTok, Spotify's Home, YouTube Shorts) live or die on this. So do live commerce and any app where the user's interest evolves quickly inside a session.

Real example. TikTok's For You page is the canonical signal-driven personalization layer — every scroll, every pause, every replay is a signal, and the next batch of recommendations reflects what you just did. Spotify's Home page tile order updates within seconds of a like or skip. Our own Vivo daily briefing uses this pattern — your interactions during today's briefing shape tomorrow's.

Trade-off. Cost and complexity. You're running an event pipeline (Kafka or an equivalent), a streaming compute layer, and an online feature store. That's three pieces of infrastructure you didn't need when you only had patterns 1 and 2. The other trade-off is the cold-start problem: for new users you have no signals yet, so the first session falls back to a baseline. This is solvable (we covered it in the cold-start problem and day-zero personalization), but you have to plan for it.
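One simple way to plan for that fallback is to blend the baseline score and the personalized score, shifting weight toward the personalized one as signals accumulate. This is a generic shrinkage-style sketch, not the approach from the cold-start post; the function names and the default `k = 10` are assumptions.

```typescript
// Weight given to the personalized score: 0 with no signals, approaching 1 as
// signals accumulate. `k` is the signal count at which the weight reaches 0.5.
function personalizationWeight(signalCount: number, k: number = 10): number {
  if (signalCount <= 0) return 0;
  return signalCount / (signalCount + k);
}

// Linear blend of the two scores for a single item.
function blendScore(
  personal: number,
  baseline: number,
  signalCount: number,
  k: number = 10
): number {
  const w = personalizationWeight(signalCount, k);
  return w * personal + (1 - w) * baseline;
}
```

A brand-new user gets the pure baseline, and the ranking drifts toward personalized scores within the first few dozen interactions instead of flipping over at an arbitrary threshold.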

Pattern 4: generative-on-the-fly

This pattern is new since 2024 and hit production stability around 2025. Instead of selecting from a finite catalog, you generate the personalized content with an LLM at request time. The page headline, the CTA copy, the empty-state message, the email subject line — all written per user, per session.

// the per-segment copy generation pattern
const userContext = await getUserFeatures(userId)
const copy = await llm.generate({
  // prompt only on the segment, so the cached copy fits everyone who shares the key
  prompt: `Write a 1-sentence CTA for users in this segment:
           ${userContext.segment}.
           Tone: ${brand.tone}. Constraint: 8 words or fewer.`,
  cache: { key: `cta:${userContext.segment}`, ttl: 3600 },
})
return <Button>{copy}</Button>

The cache key matters. You don't generate per-user — that's too expensive and too slow. You generate per-segment, where segments come from the feature store. The 200,000 users in your "first-time visitor, organic search, mobile, US, evening" bucket all see the same generated copy, but a different bucket gets a different generation. We've seen good results bucketing into a few hundred segments and caching aggressively at the edge.
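The bucketing and caching described above can be sketched as two small pieces: a deterministic segment key and a TTL cache in front of the generator. The context fields loosely follow the example bucket in the text; `VisitorContext`, `dayPart`, `segmentKey`, and `TtlCache` are hypothetical names, and the time-of-day boundaries are arbitrary.

```typescript
// Hypothetical context fields, loosely following the example bucket in the text.
interface VisitorContext {
  isFirstVisit: boolean;
  channel: string; // e.g. "organic_search"
  device: string; // e.g. "mobile"
  country: string; // e.g. "US"
  localHour: number; // 0-23
}

function dayPart(hour: number): string {
  if (hour < 6) return "night";
  if (hour < 12) return "morning";
  if (hour < 18) return "afternoon";
  return "evening";
}

// Coarse, deterministic key: many users collapse onto one cache entry.
function segmentKey(ctx: VisitorContext): string {
  return [
    ctx.isFirstVisit ? "first" : "returning",
    ctx.channel,
    ctx.device,
    ctx.country,
    dayPart(ctx.localHour),
  ].join(":");
}

class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAtMs: number }>();

  // `produce` runs only on a miss or after expiry. V may be a Promise, so
  // concurrent requests for one segment can share a single LLM call.
  getOrCreate(key: string, ttlMs: number, nowMs: number, produce: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAtMs > nowMs) return hit.value;
    const value = produce();
    this.entries.set(key, { value, expiresAtMs: nowMs + ttlMs });
    return value;
  }
}
```

If you store Promises, remember to evict an entry whose Promise rejects, otherwise a failed generation stays cached until the TTL runs out.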

When to use it. You have copy-heavy surfaces (marketing pages, transactional emails, in-app banners) where the cost of writing variants by hand is prohibitive. The unit of personalization is content, not order.

Real example. Klaviyo, Iterable, and Braze have all shipped LLM-powered email subject line generation in the last 18 months. Spotify Wrapped's per-user copy in 2024 leaned on generative models: the playlist names and listener archetypes were partly LLM-authored. Our Music product uses generative-on-the-fly for the daily mix descriptions: the songs come from the personalization layer, and the "why this mix today" line is generated.

Trade-off. LLM latency and cost. Even a fast model is around 200 ms for a short generation, and a slow one is 1-2 s. That's incompatible with the critical render path of an app, so you do this offline (precompute per segment), at the edge (cache hard), or in slots that can tolerate latency (after first paint, in emails, in background jobs). The other trade-off is quality control — generative output is non-deterministic, which is fine for low-stakes copy and dangerous for legal disclosures, prices, or anything regulated.
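For the quality-control half of that trade-off, a cheap guard is to validate every generation before it reaches a slot and fall back to hand-written copy when it fails. The sketch below assumes simple rules (a word limit like the "8 words" constraint in the prompt, plus a list of banned patterns); `CopyRules` and `acceptOrFallback` are hypothetical names.

```typescript
interface CopyRules {
  maxWords: number;
  // Patterns you never want in generated copy, e.g. prices or guarantees.
  // Use non-global regexes: a /g flag makes RegExp.test stateful across calls.
  bannedPatterns: RegExp[];
}

// Returns the normalized generated text if it passes every rule, else the fallback.
function acceptOrFallback(generated: string, fallback: string, rules: CopyRules): string {
  // Trim and collapse runs of whitespace so word counting is reliable.
  const text = generated.trim().replace(/\s+/g, " ");
  if (text.length === 0) return fallback;
  if (text.split(" ").length > rules.maxWords) return fallback;
  for (const pattern of rules.bannedPatterns) {
    if (pattern.test(text)) return fallback;
  }
  return text;
}
```

This does not make generation safe for regulated content, but it keeps a prompt that was ignored or a model that invented a discount from shipping to a button.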

Pattern 5: the full personalization layer

The four patterns above each touch one slice of the app. The fifth pattern is the one most teams want and the one most teams regret choosing too early: a full personalization layer that sits between your app and your data and is the source of truth for every personalized decision, including ranking, content, layout, notifications, search, and emails.

// the personalization layer pattern -- one interface, many surfaces
const decision = await personalization.decide({
  user_id,
  surface: 'home_feed',   // or 'email_subject', 'push_copy', 'search_rank'
  candidates: items,      // optional -- layer can also generate them
  context: { device, locale, time_of_day },
})

return render(decision.items, decision.layout, decision.copy)

What makes this a "layer" and not a "service" is that it owns the data model. The user's interaction graph, the catalog graph, the segment definitions, the feature store, the ranker, and the generation hooks all live behind one interface. Your app calls personalization.decide() and gets back a decision — items, copy, layout, ranking, explanations — for any surface.
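To show what "one interface, many surfaces" might mean in types, here is a toy in-memory version of the `decide()` contract. Every name and field below is an illustrative assumption modeled on the sketch above, not a published API, and the layer is synchronous only to keep the example short.

```typescript
type Surface = "home_feed" | "email_subject" | "push_copy" | "search_rank";

interface DecideRequest {
  user_id: string;
  surface: Surface;
  candidates?: string[]; // item IDs; optional because a layer may generate them
  context?: { device?: string; locale?: string; time_of_day?: string };
}

interface Decision {
  items: string[];
  copy?: string;
  layout?: string;
  explanation: string;
}

type SurfaceHandler = (req: DecideRequest) => Decision;

class PersonalizationLayer {
  private handlers = new Map<Surface, SurfaceHandler>();

  register(surface: Surface, handler: SurfaceHandler): void {
    this.handlers.set(surface, handler);
  }

  // Synchronous for brevity; a real layer would almost certainly be async.
  decide(req: DecideRequest): Decision {
    const handler = this.handlers.get(req.surface);
    if (handler === undefined) {
      // Unregistered surface: pass candidates through unchanged.
      return { items: req.candidates || [], explanation: "no handler; default order" };
    }
    return handler(req);
  }
}
```

The point of the shape is that every surface returns the same `Decision` type, including an explanation, so experiments and logs are comparable across home, search, email, and push.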

When to use it. When you have three or more surfaces that all need personalization (home, search, emails, push, etc.), and you've noticed that maintaining patterns 1-4 separately is producing inconsistency — the email recommends one product and the home page another, because each surface has its own copy of the ranking logic. We've written more about this distinction in recommendation engine vs personalization layer.

Real example. This is what Netflix's recommendation infrastructure looks like internally — one decision service serving home rows, similar titles, search, billboards, and the email/push notification team. Spotify's "Personalization" team owns a similar layer. Building one in-house typically takes 18-36 months and a team of more than ten engineers. Most companies should buy.

Trade-off. It's expensive to build, and centralizing this concern means you have one team in the critical path of every surface. The upside is that every personalization decision in the app is consistent, every experiment is comparable, and every new surface gets personalization for free.

Choosing between the five

| Pattern | Implementation effort | Latency added | Surfaces touched | Where it shines |
|---|---|---|---|---|
| Re-rank API | 1-3 days | <30 ms | One list | Drop-in, no rewrite |
| Edge feature store | 1-2 weeks | <10 ms | Many reads | Read-heavy, SEO-sensitive |
| Signal-driven | 4-8 weeks | Negligible at read | Session-level surfaces | Discovery, feeds, live |
| Generative-on-the-fly | 2-4 weeks | Variable, cached | Copy, headlines, emails | Content-heavy |
| Full layer | 12+ months in-house | <50 ms per decide | All | When you have 3+ surfaces |

A practical sequence we recommend: start with pattern 1 to prove that personalization moves your top-line metric. If the lift is real, layer pattern 3 underneath for signal-driven freshness. Add pattern 4 selectively where copy variation pays. Only consider pattern 5 (or buying it) once you're sure you have a long-term personalization roadmap.
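"Prove that personalization moves your top-line metric" usually comes down to comparing a control arm (original order) with a treatment arm (re-ranked). A minimal sketch of that arithmetic, assuming a simple conversion metric and a pooled two-proportion z-score; the `Arm` shape and function names are hypothetical, and a real experiment needs a proper analysis plan.

```typescript
interface Arm {
  users: number;
  conversions: number;
}

function conversionRate(arm: Arm): number {
  return arm.users === 0 ? 0 : arm.conversions / arm.users;
}

// Relative lift of treatment over control; NaN when control has no conversions.
function relativeLift(control: Arm, treatment: Arm): number {
  const base = conversionRate(control);
  if (base === 0) return NaN;
  return (conversionRate(treatment) - base) / base;
}

// Pooled two-proportion z-score; returns 0 when it cannot be computed.
function twoProportionZ(control: Arm, treatment: Arm): number {
  const pooled =
    (control.conversions + treatment.conversions) / (control.users + treatment.users);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / control.users + 1 / treatment.users));
  if (se === 0 || !isFinite(se)) return 0;
  return (conversionRate(treatment) - conversionRate(control)) / se;
}
```

For example, 50 conversions out of 1,000 in control against 60 out of 1,000 in treatment is a 20% relative lift but a z-score just under 1, which is a reminder that an impressive-looking lift on a small sample is not yet proof.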

How ×marble fits in

×marble is the full personalization layer — pattern 5 — as a product. We built it because we got tired of seeing teams reinvent the same knowledge graph, the same signal pipeline, and the same online feature store every time. The interface is one decide() call per surface; the backing engine is a personalization knowledge graph that holds your users, your catalog, and the relationships between them.

You can use ×marble as the full layer from day one, or you can wire it in as the re-rank service behind pattern 1 and grow into the layer as your surfaces multiply. The same engine powers our products — daily AI briefings at Vivo, personalized YouTube at Video, and music discovery at ×marble Music — so we eat our own cooking. See the overview at timesmarble.com or read more in the marketing engineer's personalization stack.

FAQ

What is the easiest way to add personalization to your app?

The easiest way to add personalization to your app is the re-rank API pattern: you keep your existing list endpoint and add one HTTP POST that reorders the items per user. Most teams ship this in 1-3 days using a service like Algolia, Amazon Personalize, or a personalization layer like ×marble. It typically adds <30 ms of latency and requires no changes to the front end beyond using the returned order.

What is a personalization API?

A personalization API is an HTTP service that takes a user identifier plus some context (usually a list of items, a surface name, or a feature vector) and returns a personalized decision — typically a reordered list, a set of recommendations, or a per-user piece of copy. The common shape is a POST /rerank or POST /decide endpoint with <30 ms p50 latency and global edge caching. It's the lowest-friction integration pattern for adding personalization to an existing app.

How do you personalize an app in real time?

Real-time personalization combines a signal pipeline (every click, hover, and scroll fires an event) with an online feature store (Redis, DynamoDB, or a knowledge graph) that updates within seconds, and a low-latency ranker that reads the latest features at request time. The architecture is the same one that powers TikTok's For You page and Spotify's Home — events stream in, features update incrementally, and the next page load reflects the most recent action. See our reference architecture for real-time personalization for the full topology.

How long does it take to build personalization into an app?

A simple re-rank integration takes 1-3 days. An edge feature store takes 1-2 weeks. A signal-driven pipeline with online features takes 4-8 weeks for the first version. A full in-house personalization layer (data model, signal pipeline, ranker, generation, multi-surface decision service) typically takes 18-36 months and a team of more than ten engineers — which is why most companies buy the layer instead of building it.

What is the difference between a recommendation engine and a personalization layer?

A recommendation engine is one component — it answers "what items should I show this user?". A personalization layer is the system that owns every personalized decision across every surface — ranking, copy, layout, notifications, search, emails — behind one interface. The layer typically uses a recommendation engine internally but adds the data model, signal pipeline, and decision API that make it consistent across the whole app. Read the long version in recommendation engine vs personalization layer.
