This article will provide a deeper view into bot training via Web URLs and points to keep in mind.


In This Article


Steps to training your bot using Web URLs

  1. Go to "Bot Training"
  2. Enter an entire URL (along with Https://) and choose one of the three web crawling mode (explained below)
  3. Wait for the URLs to be fetched and crawled
  4. Select the URLs and hit "Train Bot"
  5. Each URL is trained and added to the table below with its status. (Wait for all URLs to be trained before using the bot)


URL Crawling Modes

1. Exact URL

This is the recommended option for precise training. With the Exact URL method, the bot will crawl the exact URL provided and train itself.

Steps:

  1. Choose the option "Exact URL."
  2. Enter the URL you want to crawl and hit "Get Data."
  3. The URL is crawled and the bot is trained on it and added to the Uploaded Links Table


2. All URLs in this Domain

Train your bot with a broader range of information from a specific domain. The bot will crawl all the pages and links on the specified domain and provide you with the option to select which URLs to train from.

Steps:

  1. Choose the option "All URLs in this domain."
  2. Enter the URL and hit "Get Data."
  3. Wait for the pages to load, and then you'll be presented with a list of available URLs.
  4. Choose the pages that will be relevant to training the bot and hit "Train Bot."

During page selection (Step 4 above), you'll encounter two lists:

  1. New Pages - Fresh URLs not part of the bot's current training data. Selecting them will add them to the "Uploaded Links" table once the training is completed
  2. Existing Pages - URLs already part of the bot's current training dataset and visible in the "Uploaded Links" table below. Selecting them will refresh all the URLs selected

            



3. All URLs with this Path

The bot will crawl all pages on the provided URL and allow you to select which pages to train from, based on the existence of the specified path in the page URL. The further steps are the same as "All URLs in this Domain"


All the links/URLs that the bot has been trained on are visible in the Uploaded link table. 

Trained URLs can be refreshed (retrains the bot on the latest information) or deleted (information will be removed from the bot's knowledge base)

Each URL will have one of these 3 statuses:

  • Getting Data - The bot is training again on this URL ie the URL's information is being refreshed
  • Trained - The bot successfully learned from this URL. The "Last data refreshed at" is also visible which can be used to identify if a data refresh is required
  • Failed - The bot failed to train for this URL. You can either refresh and try again or delete the URL


- URLs are not instantly added to the Uploaded Links table. Wait for all URLs to appear in the table before using the bot

- Maintaining concise & relevant data significantly boosts the bot's performance. Regularly review & remove old URLs from Uploaded Links table for better responses