Add a website

By the end of this guide, your bot will be trained on your website content and ready to answer visitor questions.

~5 minutes · Free

Before you start

Your website must be publicly accessible — DGbot cannot crawl pages behind a login or paywall. If you have private content to add, use a PDF or FAQ upload instead.
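One way to check this requirement before adding a source is to request the page without cookies or credentials and inspect the response. The sketch below uses only Python's standard library. The `looks_public` and `check_url` helpers, the status-code and redirect rules, and the user agent string are assumptions made for illustration; they are not part of DGbot and do not reproduce how DGbot fetches pages.

```python
import urllib.error
import urllib.request


def looks_public(status: int, final_url: str, requested_url: str) -> bool:
    """Heuristic: a 200 response that did not redirect to a login page."""
    if status != 200:
        return False
    # A redirect to a URL containing "login" or "signin" suggests an auth wall.
    redirected = final_url.rstrip("/") != requested_url.rstrip("/")
    lowered = final_url.lower()
    return not (redirected and ("login" in lowered or "signin" in lowered))


def check_url(url: str, timeout: float = 10.0) -> bool:
    """Fetch url anonymously and apply the heuristic above."""
    # Hypothetical user agent string, chosen only for this example.
    request = urllib.request.Request(url, headers={"User-Agent": "access-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return looks_public(response.status, response.geturl(), url)
    except urllib.error.HTTPError as error:
        # 401, 403, 404 and similar responses arrive here as exceptions.
        return looks_public(error.code, url, url)
    except (urllib.error.URLError, TimeoutError):
        # DNS failure, refused connection, or timeout.
        return False
```

Calling `check_url` with your page URL returns `False` for pages that answer with an error status or bounce anonymous visitors to a sign-in page. A `True` result is only a hint, since a paywall can also be drawn by scripts on a page that returns 200.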


Steps

1. Open Sources in the admin panel

Go to Sources in the sidebar. This is where all your training data lives.

Screenshot: The Sources screen before any sources are added.

2. Click Add source and select Website URL

Click the Add source button in the top right. A picker appears — select Website URL.

Screenshot: Select Website URL from the source type picker.

3. Enter your URL and choose a crawl mode

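Paste your URL into the input field, then choose how much of your site to crawl: Single page, Linked pages, Full site crawl, or Sitemap. Each mode is described under Crawl modes.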

Screenshot: Paste your URL and select a crawl mode. Full site crawl is the most common choice.

Tip: Start with your homepage
Entering your homepage URL with Full site crawl discovers and trains on all linked pages automatically. You can always add specific pages as separate sources later.

4. Click Start training and wait for the Ready badge

Click Start training. DGbot begins crawling and processing your pages in the background.

The source status moves through three stages: Queued → Processing → Ready. Most sites finish in 30–60 seconds.

Screenshot: A green Ready badge means training is complete. The bot can now answer from this source.

Note: Large sites take longer
Sites with 500+ pages may take several minutes. Training runs in the background, so you don't need to stay on this screen.

Crawl modes

| Mode | What it crawls | Best for |
|---|---|---|
| Single page | That URL only | A specific pricing or contact page |
| Linked pages | The URL and all pages linked from it (one level deep) | Product category pages |
| Full site crawl | The URL and all reachable pages on the same domain | Most websites (recommended default) |
| Sitemap | Uses your sitemap.xml to discover pages | Large sites where a full crawl takes too long |
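
To make the difference between the modes concrete, the sketch below runs them on a small in-memory link graph. It only illustrates the descriptions above and is not DGbot's crawler: the `SITE` graph, the `discover` function, and the mode names used as strings are all hypothetical, the same-domain restriction is not modeled, and Sitemap mode is reduced to reading a list of URLs passed in by the caller.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/": ["/pricing", "/blog"],
    "/pricing": ["/contact"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": [],
    "/contact": [],
}


def discover(start, mode, sitemap=None):
    """Return the pages a crawl mode would visit, in discovery order."""
    if mode == "sitemap":
        # Sitemap mode: take the listed URLs as-is, without following links.
        return list(sitemap or [])
    # Depth limit: 0 = start page only, 1 = one level of links, None = unlimited.
    max_depth = {"single": 0, "linked": 1, "full": None}[mode]
    seen = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links out of this page
        for link in SITE.get(page, []):
            if link not in seen:
                seen.append(link)
                queue.append((link, depth + 1))
    return seen


for mode in ("single", "linked", "full"):
    print(mode, discover("/", mode))
```

On this toy graph, `single` visits only `/`, `linked` adds `/pricing` and `/blog`, and `full` also reaches `/contact` and `/blog/post-1`, which sit two links away from the homepage.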

Limits & quotas

| Plan | Sources | Pages per source |
|---|---|---|
| Free | 3 | 50 |
| Starter | 10 | 200 |
| Pro | 25 | 500 |
| Business | Unlimited | Unlimited |
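
The limits can be read as a simple lookup: given how many sources you need and the largest page count among them, find the first plan whose limits cover both. The sketch below encodes the limits as data. The `smallest_plan` function is a hypothetical helper for planning, not a DGbot feature.

```python
import math

# (plan, max sources, max pages per source), copied from the limits above.
PLANS = [
    ("Free", 3, 50),
    ("Starter", 10, 200),
    ("Pro", 25, 500),
    ("Business", math.inf, math.inf),
]


def smallest_plan(sources: int, pages_per_source: int) -> str:
    """Return the first plan whose limits cover both numbers."""
    for name, max_sources, max_pages in PLANS:
        if sources <= max_sources and pages_per_source <= max_pages:
            return name
    # Unreachable while the Business plan is unlimited.
    raise ValueError("no plan fits")
```

For example, `smallest_plan(1, 120)` returns `"Starter"`, because a single 120-page source exceeds the Free limit of 50 pages per source.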

Troubleshooting

Source stuck on Processing — Refresh the page after 2 minutes. If still stuck, click the source card and look for an error message. Common causes: the URL is not publicly accessible, the server blocked the crawler, or the page uses JavaScript rendering that DGbot cannot execute.
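
If you suspect the server is turning the crawler away, one thing worth ruling out is a robots.txt rule. The sketch below uses Python's standard `urllib.robotparser` to test a rule set offline. It rests on assumptions: this guide does not say whether DGbot honors robots.txt, and the user agent name `DGbot` and the sample rules are hypothetical. Firewall or bot-protection blocks will not show up in this check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: DGbot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "DGbot" is an assumed user agent name, used only for this example.
print(allowed(ROBOTS_TXT, "DGbot", "https://example.com/pricing"))
print(allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/pricing"))
```

With these sample rules the first call reports that the page is disallowed and the second that it is allowed. To test your own site, paste the contents of your robots.txt into `ROBOTS_TXT`.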

Pages missing from training — DGbot only follows links. Pages linked only via JavaScript navigation may not be discovered by full site crawl. Add them as individual URL sources, or provide your sitemap.xml.
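
The difference between a link a crawler can follow and JavaScript navigation is visible in the page source. The sketch below uses Python's standard `html.parser` to collect `href` values from `<a>` tags, which is roughly what a link-following crawler has to work with. The sample HTML and the `LinkCollector` class are hypothetical, and DGbot's actual discovery logic may differ.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, as a link-following crawler might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip missing hrefs, javascript: pseudo-links, and in-page anchors.
        if href and not href.lower().startswith(("javascript:", "#")):
            self.links.append(href)


# Hypothetical navigation markup with one real link and two JavaScript routes.
SAMPLE_HTML = """
<nav>
  <a href="/pricing">Pricing</a>
  <a href="javascript:void(0)" onclick="go('/plans')">Plans</a>
  <button onclick="location.href='/contact'">Contact</button>
</nav>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
print(collector.links)
```

Only `/pricing` is collected here. `/plans` and `/contact` exist only inside `onclick` handlers, so pages like these are the ones to add as individual sources or list in sitemap.xml.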

Wrong content trained — If a page has a cookie consent overlay or paywall that appears before the content, DGbot may have trained on the overlay text instead. Remove those pages and re-add them as PDFs of the actual content.