Add a website

By the end of this guide, your bot will be trained on your website content and ready to answer visitor questions.

~5 minutes · Free

Before you start

Your website must be publicly accessible — DGbot cannot crawl pages behind a login or paywall. If you have private content to add, use a PDF or FAQ upload instead.
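If you are unsure whether a page counts as publicly accessible, a quick anonymous request can give a first hint. The sketch below is only an illustration: the helper names `classify_status` and `check_url` and the user-agent string are made up for this example, and it does not reproduce how DGbot itself fetches pages. It requests the URL without cookies or credentials and maps the HTTP status to a rough verdict.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough accessibility verdict."""
    if 200 <= status < 300:
        return "public"
    if status in (401, 402, 403):
        # Login required, payment required, or the client was refused.
        return "restricted"
    return "unreachable"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Fetch url with no cookies or credentials and classify the response.

    Hypothetical helper for illustration; not part of DGbot.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "accessibility-check/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive here as exceptions.
        return classify_status(err.code)
    except OSError:
        # DNS failures, refused connections, and timeouts.
        return "unreachable"
```

A "public" result is necessary but not sufficient: redirects are followed automatically, so a site that redirects anonymous visitors to a login form can still return a 200 status. Open the page in a private browser window to confirm that the real content is visible.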

Steps

Open Sources in the admin panel

Go to Sources in the sidebar. This is where all your training data lives.


Screenshot: The Sources screen before any sources are added.

Click Add source and select Website URL

Click the Add source button in the top right. A picker appears — select Website URL.

Screenshot: Select Website URL from the source type picker.

Enter your URL and choose a crawl mode

Paste your URL into the input field, then choose how much of your site to crawl. The Crawl modes section below compares the options.

Screenshot: Paste your URL and select a crawl mode. Full site crawl is the most common choice.

Start with your homepage

Entering your homepage URL with Full site crawl automatically discovers and trains on every reachable page on the same domain, subject to your plan's pages-per-source limit. You can always add specific pages as separate sources later.

Click Start training and wait for the Ready badge

Click Start training. DGbot begins crawling and processing your pages in the background.

The source status moves through Queued → Processing → Ready. Most sites finish in 30–60 seconds.

Screenshot: A green Ready badge means training is complete. The bot can now answer from this source.

Large sites take longer

Sites with 500+ pages may take several minutes. Training runs in the background — you don't need to stay on this screen.

Crawl modes

Mode	What it crawls	Best for
Single page	That URL only	A specific pricing or contact page
Linked pages	The URL and all pages linked from it (one level deep)	Product category pages
Full site crawl	The URL and all reachable pages on the same domain	Most websites — recommended default
Sitemap	Uses your sitemap.xml to discover pages	Large sites where crawl takes too long
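
To make the difference between the modes concrete, the sketch below walks a tiny made-up site held in memory. Everything in it is an assumption for illustration: the `SITE` link graph, the `SITEMAP` list, and the mode names `single`, `linked`, `full`, and `sitemap` are hypothetical, and the breadth-first walk is a generic way to model crawl depth, not DGbot's actual crawler.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget", "/"],
    "/products/widget": ["/contact"],
    "/about": [],
    "/contact": [],
    "/orphan": [],  # listed in the sitemap, but nothing links to it
}
SITEMAP = ["/", "/products", "/products/widget", "/about", "/contact", "/orphan"]


def crawl(start, max_depth=None):
    """Breadth-first walk from start; max_depth=None means no depth limit."""
    seen = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links from pages at the depth limit
        for link in SITE.get(page, []):
            if link not in seen:
                seen.append(link)
                queue.append((link, depth + 1))
    return seen


def pages_for_mode(mode, start="/"):
    """Return the pages each (hypothetical) mode would collect."""
    if mode == "single":
        return crawl(start, max_depth=0)
    if mode == "linked":
        return crawl(start, max_depth=1)
    if mode == "full":
        return crawl(start)
    if mode == "sitemap":
        return list(SITEMAP)
    raise ValueError(f"unknown mode: {mode}")
```

On this sample site, `single` returns only `/`, `linked` adds `/products` and `/about`, `full` also reaches `/products/widget` and `/contact`, and only `sitemap` includes `/orphan`, because no page links to it.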

Limits & quotas

Plan	Sources	Pages per source
Free	3	50
Starter	10	200
Pro	25	500
Business	Unlimited	Unlimited
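
If you are planning several sources, the table above can be turned into a quick lookup. This is a minimal sketch: the limits are copied from the table, while the function name `smallest_plan` and its parameters are invented for this example and are not part of any DGbot API.

```python
import math

# Limits copied from the table above: (sources, pages per source).
# math.inf stands in for "Unlimited".
PLAN_LIMITS = {
    "Free": (3, 50),
    "Starter": (10, 200),
    "Pro": (25, 500),
    "Business": (math.inf, math.inf),
}


def smallest_plan(source_count, largest_source_pages):
    """Return the first plan whose limits cover the given needs.

    Plans are checked in table order. Business is unlimited,
    so the loop always finds a match.
    """
    for plan, (max_sources, max_pages) in PLAN_LIMITS.items():
        if source_count <= max_sources and largest_source_pages <= max_pages:
            return plan


print(smallest_plan(1, 40))    # Free
print(smallest_plan(12, 100))  # Pro
```

Both numbers matter: a site with a single 120-page source already needs Starter, even though it uses only one of the available source slots.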

Troubleshooting

Source stuck on Processing: Refresh the page after 2 minutes. If the source is still stuck, click the source card and look for an error message. Common causes: the URL is not publicly accessible, the server blocked the crawler, or the page relies on JavaScript rendering, which DGbot cannot execute.

Pages missing from training: DGbot discovers pages only by following links, so pages reachable only through JavaScript navigation may not be found by Full site crawl. Add them as individual URL sources, or use the Sitemap crawl mode with your sitemap.xml.

Wrong content trained: If a cookie consent overlay or paywall appears before a page's content, DGbot may have trained on the overlay text instead of the content. Remove those pages and re-add their actual content as PDFs.
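
For the missing-pages case, it can help to see which links are present in a page's static HTML. The sketch below uses Python's standard `html.parser` to collect `href` values from `<a>` tags. It assumes that a link-following crawler relies on ordinary anchor tags; the `LinkCollector` class, the `static_links` helper, and the sample markup are hypothetical and do not describe DGbot's internals.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(html):
    """Return the hrefs a parser can see without running JavaScript."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Sample navigation: one plain link and two JavaScript-driven entries.
SAMPLE = """
<nav>
  <a href="/pricing">Pricing</a>
  <button onclick="location.href='/docs'">Docs</button>
  <span data-route="/blog" class="js-nav">Blog</span>
</nav>
"""

print(static_links(SAMPLE))  # ['/pricing']
```

Only `/pricing` shows up, because the other two destinations exist only in script-driven attributes. Pages like `/docs` and `/blog` in this example are the ones to add as individual sources or to list in your sitemap.xml.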
