Add a website
By the end of this guide, your bot will be trained on your website content and ready to answer visitor questions.
Before you start
Your website must be publicly accessible — DGbot cannot crawl pages behind a login or paywall. If you have private content to add, use a PDF or FAQ upload instead.
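If you are unsure whether a page counts as publicly accessible, a quick anonymous request can tell you before you add it. The sketch below is a rough pre-flight check, not part of DGbot: the function names, the `preflight-check/0.1` user agent, and the login keyword list are all assumptions made for illustration, and DGbot's own crawler may behave differently.

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError


def classify_response(status: int, requested_url: str, final_url: str) -> str:
    """Classify an HTTP result as 'public', 'restricted', or 'unreachable'."""
    if status in (401, 402, 403):
        # Unauthorized, Payment Required, Forbidden: content is gated or blocked.
        return "restricted"
    if status != 200:
        return "unreachable"
    # A redirect to a login-looking URL is a common sign of gated content.
    # The keyword list is a guess for illustration, not an exhaustive rule.
    redirected = final_url.rstrip("/") != requested_url.rstrip("/")
    login_words = ("login", "signin", "sign-in")
    if redirected and any(word in final_url.lower() for word in login_words):
        return "restricted"
    return "public"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Fetch url with no cookies or credentials and classify the outcome."""
    request = Request(url, headers={"User-Agent": "preflight-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return classify_response(response.status, url, response.geturl())
    except HTTPError as error:
        # HTTPError must be caught before OSError, which it inherits from.
        return classify_response(error.code, url, url)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return "unreachable"
```

Calling `check_url("https://example.com/pricing")` from a machine that is not logged in to your site gives a rough signal. A result of `"restricted"` suggests the page is better added as a PDF or FAQ upload.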
Steps
Go to Sources in the sidebar. This is where all your training data lives.
Click the Add source button in the top right. A picker appears — select Website URL.
Paste your URL into the input field, then select a crawl mode to choose how much of your site to crawl. Full site crawl is the most common choice.
Click Start training. DGbot begins crawling and processing your pages in the background.
The source status moves through three stages: Queued → Processing → Ready. Most sites finish in 30–60 seconds.
A green Ready badge means training is complete. The bot can now answer from this source.
Crawl modes
| Mode | What it crawls | Best for |
|---|---|---|
| Single page | That URL only | A specific pricing or contact page |
| Linked pages | The URL and all pages linked from it (one level deep) | Product category pages |
| Full site crawl | The URL and all reachable pages on the same domain | Most websites — recommended default |
| Sitemap | The pages listed in your sitemap.xml | Large sites where a full site crawl takes too long |
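To make the difference between the link-following modes concrete, here is a minimal sketch that models them as a breadth-first traversal with a depth limit. It is an illustration of the table above, not DGbot's actual crawler: the mode keys, the `pages_to_crawl` function, and the in-memory `links` map (standing in for real page fetches) are hypothetical, and applying the same-domain filter to every mode is an assumption.

```python
from collections import deque
from urllib.parse import urlparse

# Depth limit per mode; None means no limit. Sitemap mode is left out because
# it reads a list of URLs instead of following links.
MODE_DEPTH = {"single_page": 0, "linked_pages": 1, "full_site": None}


def pages_to_crawl(start, links, mode):
    """Return the URLs a crawl would visit, in breadth-first order.

    links maps each URL to the URLs it links to (a stand-in for fetching pages).
    """
    max_depth = MODE_DEPTH[mode]
    domain = urlparse(start).netloc
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if max_depth is not None and depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(url, []):
            # Assumption: links to other domains are skipped in every mode.
            if target not in seen and urlparse(target).netloc == domain:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


SITE = {
    "https://a.com/": ["https://a.com/p", "https://b.com/x"],
    "https://a.com/p": ["https://a.com/q"],
    "https://a.com/q": [],
}
```

With this sample site, `single_page` visits only the start URL, `linked_pages` adds `https://a.com/p`, and `full_site` also reaches `https://a.com/q`, which is two links away.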
Limits & quotas
| Plan | Sources | Pages per source |
|---|---|---|
| Free | 3 | 50 |
| Starter | 10 | 200 |
| Pro | 25 | 500 |
| Business | Unlimited | Unlimited |
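If you want to work out which plan a given setup needs, the table can be expressed as a small lookup. This is only a sketch of the arithmetic: `PLAN_LIMITS` and `smallest_plan` are hypothetical names, the numbers are copied from the table above, and the code makes no claim about what DGbot does when a limit is exceeded.

```python
import math

# (max sources, max pages per source), copied from the table above.
# math.inf stands in for "Unlimited". Plans are listed smallest first.
PLAN_LIMITS = {
    "Free": (3, 50),
    "Starter": (10, 200),
    "Pro": (25, 500),
    "Business": (math.inf, math.inf),
}


def smallest_plan(source_count, largest_source_pages):
    """Return the first plan whose limits cover the given usage."""
    for plan, (max_sources, max_pages) in PLAN_LIMITS.items():
        if source_count <= max_sources and largest_source_pages <= max_pages:
            return plan
    return None  # not reached for finite inputs, since Business is unlimited
```

For example, two sources where the largest has 120 pages fit Starter, because the Free plan allows only 50 pages per source.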
Troubleshooting
Source stuck on Processing: Refresh the page after 2 minutes. If the source is still stuck, click the source card and look for an error message. Common causes: the URL is not publicly accessible, the server blocked the crawler, or the page relies on JavaScript rendering that DGbot cannot execute.
Pages missing from training: DGbot discovers pages by following links. Pages reachable only through JavaScript navigation may not be discovered by a full site crawl. Add them as individual URL sources, or provide your sitemap.xml.
Wrong content trained: If a page shows a cookie consent overlay or paywall before its content, DGbot may have trained on the overlay text instead. Remove those pages and re-add the actual content as PDFs.
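The "Pages missing from training" case is easier to diagnose once you see what a link-following crawler can and cannot find in raw HTML. The sketch below uses Python's standard `html.parser` to collect `<a href>` targets from a sample page. It is an approximation of link discovery in general, not DGbot's implementation, and `LinkCollector`, `discover_links`, and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip empty anchors and javascript: pseudo-links.
        if href and not href.startswith(("#", "javascript:")):
            self.links.append(urljoin(self.base_url, href))


def discover_links(html, base_url):
    """Return the links a crawler that does not run JavaScript would see."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


PAGE = """
<a href="/pricing">Pricing</a>
<button onclick="router.push('/careers')">Careers</button>
<a href="#" data-route="/blog">Blog</a>
"""

print(discover_links(PAGE, "https://example.com/"))
# → ['https://example.com/pricing']
```

Only the plain link is found. The Careers and Blog pages are reachable solely through script-driven navigation, so they are the kind of page to add as individual URL sources or list in sitemap.xml.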