Enterprise - Auto-Scrape Sub Pages

What is Automatic Subpage Collection?

When new posts or products appear on a list page, their subpage URLs are automatically added to the group and extracted. With automatic subpage collection, you no longer need to gather or update addresses manually, one by one: updates to the list are detected and the new data is collected for you.

This feature is especially useful for pages that are updated continuously, such as post or product listings. For example, if you connect an online store's list page where new products are registered daily, each new product detail page is collected automatically, so your data stays up to date.


How to Use

1. Create Parent Task

  • Run Listly on the list page and click the [Parts] button.

  • After selecting the extraction area, choose 'Hyperlink' in the extraction options to collect only links to each detail page.

  • This task, which collects only the detail page hyperlinks, is the parent task required for automatic subpage collection. You will need its URL later when connecting the child task, so it is convenient to copy it now.
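To make the role of the parent task concrete, here is a minimal sketch of what "collecting only the hyperlinks" from a list page amounts to. It is not how Listly works internally; the `LinkCollector` class, the `collect_detail_links` function, the sample HTML, and the `shop.example.com` address are all hypothetical and use only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            url = urljoin(self.base_url, href)
            # Keep each detail page URL once, in page order.
            if url not in self.links:
                self.links.append(url)


def collect_detail_links(html, base_url):
    """Hypothetical helper: return the unique absolute links found in html."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical list page with one duplicated entry.
SAMPLE = """
<ul>
  <li><a href="/products/101">Item 101</a></li>
  <li><a href="/products/102">Item 102</a></li>
  <li><a href="/products/101">Item 101 (duplicate)</a></li>
</ul>
"""

links = collect_detail_links(SAMPLE, "https://shop.example.com/list")
print(links)
```

The resulting list of detail page URLs plays the same role as the hyperlink column that the parent task produces.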

2. Create Child Task and Connect to Parent Task

  • On the detail page, select the area that will serve as the reference for group extraction and extract data. (This process is the same as regular Group Extraction.)

  • On the results page, check the tab where the selected data is located, then click the [+Group] button.

  • In the URL input field at the bottom of the group extraction settings window, select [Choose URL from existing task] and paste the URL of the parent task you created earlier.

  • Click the [Preview] button and select the column containing detail page URLs. If 5 URLs appear in the bottom preview window, it's working correctly. After confirming, click the [Done] button.

  • When you register a group this way, the child task (the detail page group extraction task) is connected under the parent task (the hyperlink collection task). Whenever the parent task discovers new links, those URLs are automatically added to the child task and collected. Because the parent task is the one that discovers new links, set the schedule on the parent task, not on the child task.

  • Set the schedule according to your desired frequency and time period.

  • When the parent task runs on the set schedule and new items have been registered on the list page, their URLs are automatically added to the child group.
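The parent/child behavior described above can be pictured as a simple "add only what is new" step on each scheduled run. The sketch below is an illustration under stated assumptions, not Listly's implementation: `find_new_urls`, `run_scheduled_parent`, and the example URLs are hypothetical names chosen for this example.

```python
def find_new_urls(known_urls, scraped_urls):
    """Return the scraped URLs that are not already known, in scraped order."""
    seen = set(known_urls)
    new_urls = []
    for url in scraped_urls:
        if url not in seen:
            seen.add(url)
            new_urls.append(url)
    return new_urls


def run_scheduled_parent(child_group, scraped_urls):
    """Simulate one scheduled parent run: append new URLs to the child group.

    child_group is modified in place; the newly added URLs are returned so a
    caller could extract just those detail pages.
    """
    new_urls = find_new_urls(child_group, scraped_urls)
    child_group.extend(new_urls)
    return new_urls


# Hypothetical state: the child group already holds two detail pages.
child_group = [
    "https://shop.example.com/products/101",
    "https://shop.example.com/products/102",
]

# A later parent run finds one new product at the top of the list page.
latest_links = [
    "https://shop.example.com/products/103",
    "https://shop.example.com/products/102",
    "https://shop.example.com/products/101",
]

added_first = run_scheduled_parent(child_group, latest_links)
added_second = run_scheduled_parent(child_group, latest_links)
print(added_first)
print(added_second)
```

In this sketch the first run adds only the new product URL, and a second run over an unchanged list adds nothing, which matches the idea that only newly discovered links reach the child group.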
