# Website crawling

Frontnow Advisor uses website crawling to collect data and generate answers to customer questions. You can configure the crawling of your website using the Frontnow REST API.

## Authentication

To use the Frontnow REST API, you need to authenticate with an API key. You can obtain an API key from the Frontnow Advisor dashboard.

## Single website crawl

<mark style="color:green;">`POST`</mark> `https://api.frontnow.com/crawl`

Single website crawls using the Frontnow REST API allow you to collect data and generate answers for your website on demand. You can send a single crawl request to the Frontnow API and retrieve the crawled data once the crawl is complete.

#### Headers

| Name                                       | Type   | Description                              |
| ------------------------------------------ | ------ | ---------------------------------------- |
| API\_KEY<mark style="color:red;">\*</mark> | String | Frontnow API Key to access the REST API. |

#### Request Body

| Name                                  | Type    | Description                                              |
| ------------------------------------- | ------- | -------------------------------------------------------- |
| url<mark style="color:red;">\*</mark> | String  | The URL of the website to crawl.                         |
| ignoreRobots                          | Boolean | Whether to ignore the robots.txt file, default is false. |
| timeout                               | Integer | The timeout in seconds, default is 30.                   |
| maxPages                              | Integer | The maximum number of pages to crawl, default is 1,000.  |
| depth                                 | Integer | The maximum depth to crawl, default is 2.                |

{% tabs %}
{% tab title="200: OK Successfully created crawl" %}

<pre class="language-json"><code class="lang-json">{
    "<a data-footnote-ref href="#user-content-fn-1">status</a>": "queued",
    "<a data-footnote-ref href="#user-content-fn-2">crawl_id</a>": "1234567890"
}
</code></pre>

{% endtab %}
{% endtabs %}
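A single crawl request can be sketched in Python using only the standard library. The `build_crawl_payload` and `start_crawl` helper names below are illustrative, not part of an official client; the defaults mirror the request body table above.

```python
import json
import urllib.request

API_KEY = "your-api-key"  # obtained from the Frontnow Advisor dashboard

def build_crawl_payload(url, ignore_robots=False, timeout=30, max_pages=1000, depth=2):
    """Assemble a request body using the documented defaults."""
    return {
        "url": url,
        "ignoreRobots": ignore_robots,
        "timeout": timeout,
        "maxPages": max_pages,
        "depth": depth,
    }

def start_crawl(payload):
    """POST the payload to the single-crawl endpoint and return the parsed response."""
    req = urllib.request.Request(
        "https://api.frontnow.com/crawl",
        data=json.dumps(payload).encode("utf-8"),
        headers={"API_KEY": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # e.g. {"status": "queued", "crawl_id": "1234567890"}

payload = build_crawl_payload("https://example.com")
# response = start_crawl(payload)  # uncomment once you have a valid API key
```

Keep the returned `crawl_id`; it is needed later to check the status of the crawl.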

## Scheduled crawling

Frontnow Advisor's REST API also allows you to set up recurring crawlers for your website. This is useful if you want to keep your data up-to-date or collect new data on a regular basis.

To set up a recurring crawler, you can use the following steps:

1. Authenticate with an API key. You can obtain an API key from the Frontnow Advisor dashboard.
2. Send a POST request to the following API endpoint: `https://api.frontnow.com/crawl/schedule`.
3. In the request body, include the fields listed in the request body table below.

## Scheduled website crawl

<mark style="color:green;">`POST`</mark> `https://api.frontnow.com/crawl/schedule`

Scheduled crawls using the Frontnow REST API allow you to collect data and generate answers for your website on a recurring basis. You can set up a recurring crawler to collect new data or keep existing data up-to-date, ensuring that the answers generated by Frontnow Advisor are always relevant and accurate.

#### Headers

| Name                                       | Type   | Description                              |
| ------------------------------------------ | ------ | ---------------------------------------- |
| API\_KEY<mark style="color:red;">\*</mark> | String | Frontnow API Key to access the REST API. |

#### Request Body

| Name                                   | Type    | Description                                                                                                                                                              |
| -------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| url<mark style="color:red;">\*</mark>  | String  | The URL of the website to crawl.                                                                                                                                         |
| timeout                                | Integer | The timeout in seconds, default is 30.                                                                                                                                   |
| maxPages                               | Integer | The maximum number of pages to crawl, default is 1,000.                                                                                                                  |
| depth                                  | Integer | The maximum depth to crawl, default is 2.                                                                                                                                |
| cron<mark style="color:red;">\*</mark> | String  | The cron expression for the recurring crawler. This expression determines when the crawler will run. You can use a tool like crontab.guru to generate a cron expression. |
| ignoreRobots                           | Boolean | Whether to ignore the robots.txt file, default is false.                                                                                                                 |

{% tabs %}
{% tab title="200: OK Successfully created scheduled crawl" %}

```json
{
    "status": "success",
    "message": "Recurring crawler has been scheduled successfully",
    "schedule_id": "1234567890"
}
```

{% endtab %}
{% endtabs %}

Here's an example request body:

```json
{
    "url": "https://example.com",
    "cron": "0 0 * * *",
    "depth": 3,
    "maxPages": 200,
    "timeout": 60,
    "ignoreRobots": true
}
```

In this example, the cron expression `0 0 * * *` means the crawler will run every day at midnight.

4. The response body will include a `schedule_id`, which you can use to manage the recurring crawler.
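The scheduling flow above can be sketched in Python. As with the single-crawl example, the `schedule_crawl` helper is an illustrative name rather than an official SDK function.

```python
import json
import urllib.request

API_KEY = "your-api-key"  # obtained from the Frontnow Advisor dashboard

def schedule_crawl(payload):
    """POST the scheduling payload and return the parsed JSON response."""
    req = urllib.request.Request(
        "https://api.frontnow.com/crawl/schedule",
        data=json.dumps(payload).encode("utf-8"),
        headers={"API_KEY": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = {
    "url": "https://example.com",
    "cron": "0 0 * * *",   # every day at midnight
    "depth": 3,
    "maxPages": 200,
    "timeout": 60,
    "ignoreRobots": True,
}
# response = schedule_crawl(payload)
# response["schedule_id"] identifies the recurring crawler
```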

By setting up a recurring crawler using the Frontnow REST API, you can keep your data up-to-date and collect new data on a regular basis, ensuring that the answers generated by Frontnow Advisor stay relevant and accurate.

## Checking Status

You can check the status of the crawl request by sending a GET request to the following API endpoint: `https://api.frontnow.com/crawl/{crawl_id}`.

In this API endpoint, replace `{crawl_id}` with the ID of the crawl request.
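A simple status check and polling loop can be sketched as follows. The helper names are illustrative, and the status values (`queued`, `in_progress`, `complete`) are the ones documented for the crawl response above.

```python
import json
import time
import urllib.request

API_KEY = "your-api-key"  # obtained from the Frontnow Advisor dashboard

def is_done(status):
    """Documented status values: "queued", "in_progress", "complete"."""
    return status == "complete"

def get_crawl_status(crawl_id):
    """GET the crawl resource and return its current status string."""
    req = urllib.request.Request(
        f"https://api.frontnow.com/crawl/{crawl_id}",
        headers={"API_KEY": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["status"]

def wait_for_crawl(crawl_id, poll_seconds=10):
    """Poll the status endpoint until the crawl reaches a terminal state."""
    while True:
        status = get_crawl_status(crawl_id)
        if is_done(status):
            return status
        time.sleep(poll_seconds)
```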

By using the Frontnow REST API to configure website crawling, you can collect data and generate answers to customer questions that are tailored to your website. If you have any questions or need assistance, please don't hesitate to contact us.

[^1]: The status of the crawl request. Possible values are `queued`, `in_progress`, and `complete`.

[^2]: The ID of the crawl request.
