Why Add an AI Chatbot to Your Webflow Site?
Visitor-facing AI chatbots reduce support ticket volume, qualify leads around the clock, and give users instant answers drawn from your own content. Unlike rigid decision-tree bots, a large-language-model chatbot built on Anthropic's Claude can handle nuanced questions, summarize long pages, and guide visitors toward the right product or service. Because Claude is available through a simple REST API, you can integrate it into a Webflow site without swapping platforms or building a separate app.
High-Level Architecture
The architecture has three pieces. First, a chat widget embedded in Webflow via a custom code block — a floating button in the bottom-right corner that expands into a conversation panel. Second, a lightweight serverless function (Supabase Edge Function, Cloudflare Worker, or Vercel Edge Function) that acts as a proxy between the browser and the Claude API. This proxy is necessary because you must never expose your Anthropic API key in client-side code. Third, an optional knowledge base stored in Supabase with pgvector for retrieval-augmented generation (RAG), so the chatbot answers questions using your specific content rather than general knowledge.
// Supabase Edge Function — /functions/chat/index.ts
import { serve } from "https://deno.land/std@0.177.0/http/server.ts";
import Anthropic from "npm:@anthropic-ai/sdk@0.39.0";

const anthropic = new Anthropic({
  apiKey: Deno.env.get("ANTHROPIC_API_KEY")!,
});

const SYSTEM_PROMPT = `You are the LIVV Studio assistant. Answer questions
about our web design, Webflow development, and creative engineering services.
Be concise, helpful, and friendly. If you don't know something, say so.`;

// CORS headers so the browser on your Webflow domain can call this function
const corsHeaders = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Headers": "authorization, content-type",
};

serve(async (req) => {
  // Answer the preflight request the browser sends before the POST
  if (req.method === "OPTIONS") {
    return new Response("ok", { headers: corsHeaders });
  }

  const { messages } = await req.json();

  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    system: SYSTEM_PROMPT,
    messages, // array of { role, content }
  });

  return new Response(JSON.stringify({
    reply: response.content[0].text,
  }), {
    headers: { ...corsHeaders, "Content-Type": "application/json" },
  });
});

Building the Chat Widget
The front-end widget is vanilla HTML, CSS, and JavaScript injected through Webflow's page-level or project-level custom code. The widget maintains a local array of messages, renders them in a scrollable container, and sends the full conversation history to your proxy on every submission. Streaming the response (using the Claude API's stream option and reading the ReadableStream in the browser) makes the bot feel responsive — tokens appear as they are generated rather than after a multi-second wait.
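Streaming requires the proxy to forward Claude's output as it is generated rather than buffering the full JSON reply. As a sketch, assuming a modified proxy that responds with plain chunked text (a variant not shown in this article), the browser-side reading loop could look like:

```typescript
// Read a chunked text response token-by-token. Assumes the proxy streams
// plain text (e.g. by piping Claude's stream events into the response body).
async function readStream(
  stream: ReadableStream<Uint8Array>,
  onToken: (text: string) => void,
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let full = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const text = decoder.decode(value, { stream: true });
    full += text;   // accumulate the complete reply for conversationHistory
    onToken(text);  // append each chunk to the chat bubble as it arrives
  }
  return full;
}
```

In `sendMessage`, the `res.json()` call would then be replaced with `const reply = await readStream(res.body, appendToCurrentBubble)`, where `appendToCurrentBubble` is a hypothetical helper that updates the last assistant bubble in place.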
// Front-end: send message and render the reply (non-streaming version)
async function sendMessage(userText) {
  appendMessage("user", userText);
  conversationHistory.push({ role: "user", content: userText });

  const res = await fetch(EDGE_FUNCTION_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages: conversationHistory }),
  });
  const { reply } = await res.json();

  appendMessage("assistant", reply);
  conversationHistory.push({ role: "assistant", content: reply });
}

// appendMessage() creates a DOM node inside the chat panel
function appendMessage(role, text) {
  const bubble = document.createElement("div");
  bubble.className = `chat-bubble chat-${role}`;
  bubble.textContent = text; // textContent, not innerHTML, avoids injecting markup
  chatContainer.appendChild(bubble);
  chatContainer.scrollTop = chatContainer.scrollHeight; // keep newest message in view
}

Adding RAG with pgvector for Contextual Answers
To make the chatbot answer questions about your specific content — service pages, case studies, pricing — you need retrieval-augmented generation. The process works like this: chunk your site content into passages of roughly 500 tokens, generate an embedding for each chunk using an embedding model, and store those embeddings in a Supabase table with the pgvector extension. When a user asks a question, embed their query, run a cosine similarity search against your chunks, and inject the top 3–5 results into the Claude system prompt as context. This way Claude grounds its answers in your actual content.
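On the serverless side, the query-time retrieval step might look like the following sketch. The embedding helper and Supabase client are passed in as parameters because the original setup does not define them; `match_chunks` is the SQL function created in this section, and the `RpcClient` shape mirrors supabase-js's `.rpc()` call.

```typescript
// Sketch of query-time retrieval for RAG.
type Chunk = { chunk_text: string; page_url: string; similarity: number };

type RpcClient = {
  rpc: (
    fn: string,
    args: Record<string, unknown>,
  ) => Promise<{ data: unknown; error: unknown }>;
};

// Format the top matches into a context block for the system prompt.
function buildContext(chunks: Chunk[]): string {
  return chunks
    .map((c) => `Source: ${c.page_url}\n${c.chunk_text}`)
    .join("\n---\n");
}

async function retrieveContext(
  query: string,
  supabase: RpcClient,                      // e.g. createClient(url, serviceKey)
  embed: (q: string) => Promise<number[]>,  // assumed helper: 1536-dim vector
): Promise<string> {
  const queryEmbedding = await embed(query);
  // Call the match_chunks SQL function via Supabase's RPC interface
  const { data, error } = await supabase.rpc("match_chunks", {
    query_embedding: queryEmbedding,
    match_count: 5,
  });
  if (error) throw error;
  return buildContext(data as Chunk[]);
}
```

The returned context is then appended to the system prompt, e.g. `` `${SYSTEM_PROMPT}\n\nRelevant site content:\n${context}` ``, before calling `anthropic.messages.create`.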
-- Enable pgvector extension in Supabase
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table for content embeddings
CREATE TABLE content_chunks (
  id BIGSERIAL PRIMARY KEY,
  page_url TEXT NOT NULL,
  chunk_text TEXT NOT NULL,
  embedding VECTOR(1536), -- dimension depends on model
  created_at TIMESTAMPTZ DEFAULT now()
);

-- Similarity search function (<=> is pgvector's cosine distance operator)
CREATE OR REPLACE FUNCTION match_chunks(
  query_embedding VECTOR(1536),
  match_count INT DEFAULT 5
)
RETURNS TABLE (chunk_text TEXT, page_url TEXT, similarity FLOAT)
LANGUAGE plpgsql AS $$
BEGIN
  RETURN QUERY
  SELECT c.chunk_text, c.page_url,
         1 - (c.embedding <=> query_embedding) AS similarity
  FROM content_chunks c
  ORDER BY c.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

Styling the Widget to Match Your Webflow Design
Inject the widget styles through the same custom code block. Use CSS custom properties that reference your Webflow project's design tokens — font family, primary color, border radius — so the chat panel feels native rather than bolted on. Position the widget with fixed positioning in the viewport corner, and use a CSS transition for the open/close animation. Keep the z-index high (9999) so it floats above Webflow interactions and modals.
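As an illustrative sketch (the token values and class names below are placeholders, not pulled from an actual Webflow project), the widget styles could be generated from design tokens and injected alongside the widget markup:

```typescript
// Build the widget CSS from design tokens so the panel matches the site.
type DesignTokens = { primary: string; font: string; radius: string };

function widgetCss(tokens: DesignTokens): string {
  return `
.chat-widget {
  position: fixed;            /* pin to the viewport corner */
  bottom: 24px;
  right: 24px;
  z-index: 9999;              /* float above Webflow interactions and modals */
  font-family: ${tokens.font};
  --chat-primary: ${tokens.primary};
  transition: transform 0.25s ease, opacity 0.25s ease; /* open/close */
}
.chat-bubble { border-radius: ${tokens.radius}; }
.chat-assistant { background: var(--chat-primary); color: #fff; }
`;
}

// In the browser, inject once from the same custom code block:
//   const style = document.createElement("style");
//   style.textContent = widgetCss({
//     primary: "#1a73e8", font: "'Inter', sans-serif", radius: "12px",
//   });
//   document.head.appendChild(style);
```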
Never expose your Anthropic API key in client-side JavaScript. Always proxy requests through a serverless function that stores the key as an environment variable. This is non-negotiable for production deployments.
Rate Limiting and Cost Control
Claude API calls cost money per token, and without guardrails a single abusive user could rack up hundreds of dollars. Implement three layers of protection. First, rate-limit the Edge Function to a maximum number of requests per IP per minute, using a simple in-memory counter (fast, but per-instance in serverless runtimes) or a Supabase table (shared across instances). Second, cap each conversation at a reasonable length (e.g., 20 turns) and show a message like "For more detailed help, contact our team." Third, set a monthly budget alert in the Anthropic console so you are notified before costs exceed your target.
- Rate-limit by IP: max 10 requests per minute per visitor
- Cap conversation length at 20 turns, then surface a CTA
- Set a system prompt token budget — keep it under 1,500 tokens
- Use claude-sonnet-4-20250514 for cost efficiency; reserve Opus for complex queries
- Monitor usage weekly via the Anthropic dashboard
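The first two guardrails can be sketched in the Edge Function like this. Note that an in-memory Map is per-instance in serverless runtimes, so it is only a soft limit; a Supabase table gives a strict global one.

```typescript
// Sketch of the first two guardrails: per-IP rate limiting and a turn cap.
const WINDOW_MS = 60_000;  // one minute
const MAX_REQUESTS = 10;   // per IP per window
const MAX_TURNS = 20;      // user messages per conversation

const hits = new Map<string, number[]>();

// Sliding-window limiter: returns true if this request is allowed.
function allowRequest(ip: string, now: number = Date.now()): boolean {
  // Keep only timestamps inside the current window
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_REQUESTS) {
    hits.set(ip, recent);
    return false;
  }
  recent.push(now);
  hits.set(ip, recent);
  return true;
}

// True once the visitor has used up their turns; surface the CTA instead.
function conversationCapped(messages: { role: string }[]): boolean {
  return messages.filter((m) => m.role === "user").length > MAX_TURNS;
}
```

In the handler, a blocked request would return a 429 and a capped conversation would return the "contact our team" message instead of calling Claude. The client IP is typically read from the `x-forwarded-for` header on these platforms.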
Want an AI chatbot integrated into your Webflow site?
Get a custom chatbot built →
