# Tutorial: Scraping data

In this tutorial, we're going to scrape data from this webpage: <https://demo.goless.com/>.

1. To get started, open the extension, go to the dashboard, and click "New Workflow".

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-feabfcbc5971db87ee4e9d6351332115ab2fe288%2F1.jpg?alt=media" alt=""><figcaption></figcaption></figure>

2. Your workflow starts with a trigger. A [trigger](https://docs.goless.com/blocks/general/trigger) defines when and under what conditions your automation runs. By default, the trigger is set to "Manual" mode, meaning the automation runs only when you start it yourself.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-bf0a297551ac74b9811d4b21efc169444e31b7de%2F2.jpg?alt=media" alt=""><figcaption></figcaption></figure>

3. You can select a different trigger or add multiple triggers for your automation. Options include intervals, schedules, the context menu (right-click on a web page), specific dates, browser start-up, and keyboard shortcuts.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-6d85e788f0540c7bd7335013e719a05eb9696636%2F4.jpg?alt=media" alt=""><figcaption></figcaption></figure>

4. Next, add the "[New tab](https://docs.goless.com/blocks/browser/new-tab-block)" block. When the automation starts, this block opens a new tab at the address you specify, which in this case is the page to scrape: <https://demo.goless.com/>.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-14836c4e95c2124251bc50eeef1817751c0ebdaf%2F5.jpg?alt=media" alt=""><figcaption></figcaption></figure>

5. Next, add a "[Loop elements](https://docs.goless.com/blocks/control-flow/loop-elements-block)" block. This block iterates over the matching elements on the page as a list. We need to capture every element with the class `post`, so we set the CSS Selector to `.post`. Later steps refer to this loop by its Loop ID, which is `items` in this tutorial.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-ba9f870b843cc18bc772f73b28d83af366a34850%2F6.jpg?alt=media" alt=""><figcaption></figcaption></figure>

* To find the `.post` selector, open the demo.goless.com site and enable the Element Selector in the extension.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-35cf9c65a0ffcb112d859f9bb163247833c1cde8%2F7.jpg?alt=media" alt=""><figcaption></figcaption></figure>

* Then hover over the desired block to see which class it uses.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-60f073effc852a8ba2ef09da7bccae324fcdeb4f%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>
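To make the selector step concrete, here is a minimal Python sketch of what matching `.post` means: an element matches when `post` is one of the space-separated names in its `class` attribute. The sample markup and the names `ClassMatcher` and `match_class` are hypothetical, chosen for illustration; the real demo page may be structured differently, and this is not how the extension itself is implemented.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<div class="post featured"><h2 class="title">First post</h2></div>
<div class="sidebar"><h2 class="title">Not a post</h2></div>
<div class="post"><h2 class="title">Second post</h2></div>
"""


class ClassMatcher(HTMLParser):
    """Collects start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # A class attribute is a space-separated list of class names.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.matches.append((tag, classes))


def match_class(html, class_name):
    """Return (tag, classes) for every element carrying class_name."""
    parser = ClassMatcher(class_name)
    parser.feed(html)
    return parser.matches


print(len(match_class(SAMPLE_HTML, "post")))  # 2
```

Note that the element with `class="post featured"` still matches, because a class selector only requires the name to be present among the element's classes.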

6. Next, we need to get the title from within each `.post` element. To do this, add a "[Get text](https://docs.goless.com/blocks/web-interaction/get-text-block)" block to the workflow. In its settings, specify the selector `{{ loopData@items }} .title`. This tells the workflow to take the current element from the loop (`items` in our case, which must match the **Loop ID** defined in the Loop elements block) and search inside it for an element with the class `title`.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-cb536a5861c9c6e68b3506cc925ad5fd55ca3791%2F8.jpg?alt=media" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
If you need to capture several fields and export them, set up a [table](https://docs.goless.com/workflow/workflow-table). First, click the table icon in the top-right corner to define the table's columns. Then, in the block settings, select the "Insert to table" checkbox and choose the column that should receive the data.

<img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-ecf4e2489bad6ff32738e9a9bb0e6ce8e851c79c%2Fimage.png?alt=media" alt="" data-size="original"><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-d5c20005e371110979e1ff2629fd60970926861a%2Fimage.png?alt=media" alt="" data-size="original">
{% endhint %}
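The scoped lookup and the table insert described above can be sketched in plain Python. This is an illustration under stated assumptions, not the extension's implementation: the sample markup, the `ScopedTitleParser` class, the `scrape_titles` function, and the `title` column name are all hypothetical. The point it shows is that a `.title` element counts only when it sits inside a `.post` element, and each match becomes one table row.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<h1 class="title">Site heading</h1>
<div class="post"><h2 class="title">First post</h2><p>Body</p></div>
<div class="post"><h2 class="title">Second post</h2><p>Body</p></div>
"""


class ScopedTitleParser(HTMLParser):
    """Collects the text of .title elements that are nested inside .post."""

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, role) per open element; role is "post", "title" or None
        self.rows = []   # one dict per captured title, like rows in a table

    def _inside(self, role):
        return any(r == role for _, r in self.stack)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "post" in classes:
            role = "post"
        elif "title" in classes and self._inside("post"):
            role = "title"
            self.rows.append({"title": ""})  # start a new row
        else:
            role = None
        self.stack.append((tag, role))

    def handle_endtag(self, tag):
        # Ignore stray end tags; otherwise pop back to the matching open tag.
        if tag not in [t for t, _ in self.stack]:
            return
        while self.stack.pop()[0] != tag:
            pass

    def handle_data(self, data):
        if self._inside("title"):
            self.rows[-1]["title"] += data.strip()


def scrape_titles(html):
    parser = ScopedTitleParser()
    parser.feed(html)
    return parser.rows


rows = scrape_titles(SAMPLE_HTML)
print(rows)
```

The page-level heading also has the class `title`, but it is skipped because it is not inside a `.post` element. That is the effect of prefixing the selector with the loop's current element.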

7. To close the loop, add a "[Loop Breakpoint](https://docs.goless.com/blocks/control-flow/loop-breakpoint)" block and specify the ID of the Loop elements block, which is `items` in our case.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-2a18b5aff59a8a7dc6ebad215998195fb980ea39%2F9.jpg?alt=media" alt=""><figcaption></figcaption></figure>

8. The final block is the data export. Add an "[Export data](https://docs.goless.com/blocks/general/export-data)" block to download the gathered data when the workflow finishes.

<figure><img src="https://742850480-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fmtnl2J19CnNVA5fLFAPs%2Fuploads%2Fgit-blob-b588940c891918c015558eec20a8da9bbb512bf1%2F10.jpg?alt=media" alt=""><figcaption></figcaption></figure>

And with that, our workflow setup is complete. When you run it, the automation saves the data from the website to a CSV file.
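For a sense of what a CSV export of table rows looks like, here is a minimal Python sketch using the standard `csv` module. The rows, the `title` column, and the `rows_to_csv` helper are hypothetical, and the exact quoting and line endings produced by the Export data block may differ.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Serialize a list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows, shaped like a table with a single "title" column.
rows = [{"title": "First post"}, {"title": "Second, with a comma"}]
print(rows_to_csv(rows, ["title"]))
```

Values that contain a comma are wrapped in double quotes, so a title such as `Second, with a comma` stays in a single column when the file is opened in a spreadsheet.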
