Using Screaming Frog's CLI mode on a server

Command line crawling with Screaming Frog SEO Spider

Screaming Frog just released SEO Spider v10, with a lot of impressive new features.
Among them is CLI mode: the ability to run the crawler without a GUI (on a server, for example).

Here’s a quick guide on how to get started with Screaming Frog’s CLI mode on a Debian server.

Setup

We’ll assume that you’re correctly logged in to your server, via SSH for instance, and that you have administration (sudo) rights.
Remember to upgrade your system first if needed ;-)

You’ll need to install some dependencies.

sudo apt-get install cabextract xfonts-utils  
wget http://ftp.de.debian.org/debian/pool/contrib/m/msttcorefonts/ttf-mscorefonts-installer_3.6_all.deb  
sudo dpkg -i ttf-mscorefonts-installer_3.6_all.deb  
sudo apt-get install xdg-utils zenity libgconf-2-4 fonts-wqy-zenhei  

Next, let’s download the latest version of Screaming Frog (10.0 at the time of writing):

wget https://download.screamingfrog.co.uk/products/seo-spider/screamingfrogseospider_10.0_all.deb  

Check the official website for an updated link to the latest file.
Once the file is downloaded, launch the installation:

sudo dpkg -i screamingfrogseospider_10.0_all.deb  

Check if everything is OK:

screamingfrogseospider --help  
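If you provision servers with a script, the same check can be automated. Here is a minimal sketch; check_installed is a hypothetical helper name, not part of Screaming Frog, and it only verifies that the command is on your PATH.

```shell
#!/bin/sh
# Sketch: verify that a command is available before trying to crawl.
# check_installed is a hypothetical helper, not a Screaming Frog tool.
check_installed() {
    # command -v returns non-zero when the command cannot be found.
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}
```

Usage: check_installed screamingfrogseospider || exit 1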

Licence

You’ll need to enter a licence to use SF in headless mode.
Simply edit ~/ScreamingFrogSEOSpider/licence.txt and enter your username on the first line, and your key on the second.
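If you would rather not open an editor, the file can also be written from the shell. This is only a sketch: write_licence is a hypothetical helper, and the username and key in the usage line are placeholders to replace with your own.

```shell
#!/bin/sh
# Sketch: write the two-line licence file described above.
# write_licence is a hypothetical helper, not a Screaming Frog command.
write_licence() {
    sf_dir="$1"    # e.g. "$HOME/ScreamingFrogSEOSpider"
    sf_user="$2"   # licence username (first line)
    sf_key="$3"    # licence key (second line)
    mkdir -p "$sf_dir"
    printf '%s\n%s\n' "$sf_user" "$sf_key" > "$sf_dir/licence.txt"
    # The file holds a secret, so make it readable by its owner only.
    chmod 600 "$sf_dir/licence.txt"
}
```

Usage: write_licence "$HOME/ScreamingFrogSEOSpider" "your-username" "YOUR-LICENCE-KEY"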

EULA agreement

At first launch, Screaming Frog’s GUI asks you to agree to the terms and conditions. This can’t be done without a GUI.
However, there’s a workaround.

Edit ~/ScreamingFrogSEOSpider/spider.config and add the following line:

eula.accepted=8  

Save and exit.
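The same change can be scripted, which is handy when you set up several servers. The sketch below assumes the value 8 given above (other versions may expect a different value); accept_eula is a hypothetical helper name. It appends the line only once, so it is safe to run repeatedly.

```shell
#!/bin/sh
# Sketch: add the EULA line to spider.config without an editor.
# accept_eula is a hypothetical helper; 8 is the value used in this guide.
accept_eula() {
    config_dir="$1"                     # e.g. "$HOME/ScreamingFrogSEOSpider"
    config_file="$config_dir/spider.config"
    mkdir -p "$config_dir"
    touch "$config_file"
    # Do nothing if an eula.accepted entry is already present.
    if ! grep -q '^eula\.accepted=' "$config_file"; then
        # Make sure the existing content ends with a newline first.
        if [ -s "$config_file" ] && [ -n "$(tail -c 1 "$config_file")" ]; then
            printf '\n' >> "$config_file"
        fi
        printf 'eula.accepted=8\n' >> "$config_file"
    fi
}
```

Usage: accept_eula "$HOME/ScreamingFrogSEOSpider"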

Start crawling

To start crawling in headless mode, you’ll need to use at least a few arguments:

  • --crawl <url> is the starting URL,
  • --headless is needed, otherwise SF will try to open a GUI (and fail),
  • --save-crawl enables you to save your data to a crawl.seospider file,
  • --output-folder <folder> will save the crawl data to the given folder,
  • --timestamped-output will create a timestamped folder in which your crawl.seospider file will be saved (this is useful to avoid overwriting a previous crawl).

Here’s a minimal example:

screamingfrogseospider --crawl https://www.example.com --headless --save-crawl --output-folder /home/julien/crawls --timestamped-output  
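If you crawl several sites, the command above can be wrapped in a small function. This is a sketch rather than an official recipe: crawl_site and the SF_BIN variable are hypothetical names, and only the arguments listed above are used.

```shell
#!/bin/sh
# Sketch: wrap the headless crawl command shown above.
# crawl_site and SF_BIN are hypothetical names, not Screaming Frog settings.
SF_BIN="${SF_BIN:-screamingfrogseospider}"

crawl_site() {
    url="$1"      # starting URL
    outdir="$2"   # folder that will receive the timestamped crawl folders
    mkdir -p "$outdir"
    "$SF_BIN" --crawl "$url" --headless --save-crawl \
        --output-folder "$outdir" --timestamped-output
}
```

Usage: crawl_site https://www.example.com /home/julien/crawls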

Other options and OS

Check out the Screaming Frog documentation for more details on using CLI mode on other operating systems, and for the full list of available command line arguments.

Many thanks to the guys at Screaming Frog for this awesome release!
