Website crawling

Frontnow Advisor can crawl your website to collect data on your products, pages, and other relevant information.

The crawled data is used to generate answers to customer questions. You can configure how your website is crawled using the Frontnow REST API.

Authentication

To use the Frontnow REST API, you need to authenticate with an API key. You can obtain an API key from the Frontnow Advisor dashboard.
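As a minimal sketch, here is one way to attach the key to every request using Python's requests library (the library choice and the placeholder key are assumptions; any HTTP client works). The key travels in the API_KEY request header, as documented in the endpoint tables below.

import requests

# The API key is obtained from the Frontnow Advisor dashboard.
# Every call to the Frontnow REST API carries it in the API_KEY header.
session = requests.Session()
session.headers.update({"API_KEY": "your-api-key"})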

Single website crawl

POST https://api.frontnow.com/crawl

Single website crawls using the Frontnow REST API allow you to collect data and generate answers for your website on demand. You can send a single crawl request to the Frontnow API and retrieve the crawled data once the crawl is complete.

Headers

Name       Type     Description
API_KEY*   String   Frontnow API Key to access the REST API.

Request Body

Name           Type      Description
url*           String    The URL of the website to crawl.
ignoreRobots   Boolean   Whether to ignore the robots.txt file. Default: false.
timeout        Integer   The timeout in seconds. Default: 30.
maxPages       Integer   The maximum number of pages to crawl. Default: 1,000.
depth          Integer   The maximum depth to crawl. Default: 2.
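Here's an example request body for an on-demand crawl (the URL and values are illustrative):

{
    "url": "https://example.com",
    "depth": 3,
    "maxPages": 500,
    "timeout": 60,
    "ignoreRobots": false
}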

A successful request returns a response like the following, including a crawl_id you can use to check the crawl's status:

{
    "status": "queued",
    "crawl_id": "1234567890"
}
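As a sketch in Python (the requests library and error handling are assumptions; the response field names follow the example above):

import requests

API_KEY = "your-api-key"  # from the Frontnow Advisor dashboard

# Queue a single on-demand crawl of the website.
response = requests.post(
    "https://api.frontnow.com/crawl",
    headers={"API_KEY": API_KEY},
    json={"url": "https://example.com", "depth": 3},
)
response.raise_for_status()

# The returned crawl_id is used later to check the crawl's status.
crawl_id = response.json()["crawl_id"]
print(f"Crawl queued with id {crawl_id}")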

Scheduled crawling

Frontnow Advisor's REST API also allows you to set up recurring crawlers for your website. This is useful if you want to keep your data up-to-date or collect new data on a regular basis.

To set up a recurring crawler, follow these steps:

  1. Authenticate with an API key. You can obtain an API key from the Frontnow Advisor dashboard.

  2. Send a POST request to the following API endpoint: https://api.frontnow.com/crawl/schedule.

  3. In the request body, include the fields described below.

Scheduled website crawl

POST https://api.frontnow.com/crawl/schedule

Scheduled crawls using the Frontnow REST API allow you to collect data and generate answers for your website on a recurring basis. You can set up a recurring crawler to collect new data or keep existing data up-to-date, ensuring that the answers generated by Frontnow Advisor are always relevant and accurate.

Headers

Name       Type     Description
API_KEY*   String   Frontnow API Key to access the REST API.

Request Body

Name           Type      Description
url*           String    The URL of the website to crawl.
cron*          String    The cron expression that determines when the crawler runs. You can generate one with a tool like crontab.guru.
ignoreRobots   Boolean   Whether to ignore the robots.txt file. Default: false.
timeout        Integer   The timeout in seconds. Default: 30.
maxPages       Integer   The maximum number of pages to crawl. Default: 1,000.
depth          Integer   The maximum depth to crawl. Default: 2.

A successful request returns a confirmation like this:

{
    "status": "success",
    "message": "Recurring crawler has been scheduled successfully",
    "schedule_id": "1234567890"
}

Here's an example request body:

{
    "url": "https://example.com",
    "cron": "0 0 * * *",
    "depth": 3,
    "maxPages": 200,
    "timeout": 60,
    "ignoreRobots": true
}

In this example, the cron expression is set to 0 0 * * *, which means the crawler will run every day at midnight.

  4. The response body will include a schedule_id, which you can use to manage the recurring crawler, as shown in the sketch below.
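As a sketch in Python (the requests library is an assumption), steps 2 through 4 with the example request body above might look like this:

import requests

API_KEY = "your-api-key"  # from the Frontnow Advisor dashboard

# Schedule a crawl that runs every day at midnight ("0 0 * * *").
response = requests.post(
    "https://api.frontnow.com/crawl/schedule",
    headers={"API_KEY": API_KEY},
    json={
        "url": "https://example.com",
        "cron": "0 0 * * *",
        "depth": 3,
        "maxPages": 200,
        "timeout": 60,
        "ignoreRobots": True,
    },
)
response.raise_for_status()

# Keep the schedule_id to manage the recurring crawler later.
schedule_id = response.json()["schedule_id"]
print(f"Recurring crawler scheduled with id {schedule_id}")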

By setting up a recurring crawler with the Frontnow REST API, you keep your data up-to-date and collect new data on a regular basis, making sure that the answers generated by Frontnow Advisor are always relevant and accurate.

Checking Status

You can check the status of a crawl request by sending a GET request to the following API endpoint: https://api.frontnow.com/crawl/{crawl_id}.

In this API endpoint, replace {crawl_id} with the ID returned when the crawl was queued.
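For example, in Python (the requests library is an assumption; the response fields depend on the crawl's state, so the sketch just prints them):

import requests

API_KEY = "your-api-key"   # from the Frontnow Advisor dashboard
crawl_id = "1234567890"    # returned when the crawl was queued

# Poll the status of a previously queued crawl.
response = requests.get(
    f"https://api.frontnow.com/crawl/{crawl_id}",
    headers={"API_KEY": API_KEY},
)
response.raise_for_status()
print(response.json())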

By using the Frontnow REST API to configure website crawling, you can collect data and generate answers to customer questions that are tailored to your website. If you have any questions or need assistance, please don't hesitate to contact us.
