ArchiveBox: Your Web Archiving Solution

ArchiveBox is a self-hosted tool for archiving web content. This guide walks through installing it with the archivo helper script and using it for both local and remote archiving.

1. Installing ArchiveBox

To install ArchiveBox, run the installation script (archivo) with the -i option:

kali-cdmx 192.168.99.14 ~ archivo -i
dpkg-query: no packages found matching docker-compose
Instalando el paquete docker-compose...
Reading package lists... Done
Building dependency tree... Done

[*] Checking links from indexes and archive folders (safe to Ctrl+C)...
[*] [2025-04-15 22:41:58] Writing 0 links to main index...
√ ./index.sqlite3
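
The transcript above suggests the installer checks for docker-compose with dpkg-query and installs it when missing. A minimal sketch of such a check follows, assuming a Debian-based system; the function names package_installed and install_command_for are hypothetical and are not taken from archivo itself.

```shell
#!/bin/sh
# Sketch only: these helper names are assumptions, not part of archivo.

# Succeeds when the named Debian package is fully installed.
package_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

# Prints the command that would install the package if it is missing,
# and prints nothing when the package is already present.
install_command_for() {
    if ! package_installed "$1"; then
        printf 'sudo apt-get install -y %s\n' "$1"
    fi
}

# Usage:
#   install_command_for docker-compose
```

Printing the command instead of running it keeps the check side-effect free, so you can review what would be installed first.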

2. Creating a Web UI Admin User

Once the collection is initialized, the installer prompts you to create an admin user for the Web UI. A blank password is rejected and the prompt repeats:

[√] Done. A new ArchiveBox collection was initialized (0 links).
[+] Creating new admin user for the Web UI...
Username (leave blank to use 'archivebox'):
Email address:
Password:
Password (again):
Error: Blank passwords aren't allowed.
Password:
Password (again):
Superuser created successfully.
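
If you need another admin user later, the same prompt can likely be reached through ArchiveBox's manage createsuperuser subcommand. The sketch below assumes the collection lives in ~/archivebox and the docker-compose service is named archivebox, as in the SSH command in section 5; the wrapper name create_admin and the RUN variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: create_admin and RUN are hypothetical names.
# Prints the command by default; runs it only when RUN=1.
create_admin() {
    set -- sudo docker-compose run archivebox manage createsuperuser
    if [ "${RUN:-0}" = "1" ]; then
        # Assumes the docker-compose.yml sits in ~/archivebox.
        (cd ~/archivebox && "$@")
    else
        printf '%s\n' "$*"
    fi
}

# Usage:
#   create_admin          # show the command
#   RUN=1 create_admin    # actually create the user
```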

3. Running Docker

After the Docker image has been downloaded, the container starts automatically and the Web UI is available at 127.0.0.1:8000 (or at your server's IP, for example 192.168.99.14:8000).

4. Archiving Web Pages


To archive a webpage, run the script again without the -i option:

kali-cdmx 192.168.99.14 ~ archivo

The script will prompt for a URL. Enter it and wait for the download to complete. The archived webpage will then be available on your server.


5. Remote Archiving via SSH

You can also archive pages remotely over SSH. Either run the script "archiv" or execute the following command:

kali-cdmx 192.168.99.14 ~ ssh kali-cdmx "cd ~/archivebox && echo 'Password' | sudo -S docker-compose run archivebox add --depth=0 https://google.com"
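
As an alternative to piping the password into sudo -S, the remote command can be built without any password and run with ssh -t, which allocates a terminal so sudo can prompt on its own. This is a sketch under the same assumptions as the command above (collection in ~/archivebox, service named archivebox); remote_add_command is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: remote_add_command is a hypothetical helper name.

# Prints the remote shell command that adds one URL at crawl depth 0.
# The URL is wrapped in single quotes so characters such as & or ? are
# not interpreted by the remote shell. Assumes the URL itself contains
# no single quote.
remote_add_command() {
    printf "cd ~/archivebox && sudo docker-compose run archivebox add --depth=0 '%s'\n" "$1"
}

# Usage (needs a reachable server; -t gives sudo a terminal for its prompt):
#   ssh -t kali-cdmx "$(remote_add_command 'https://google.com')"
```

With this variant the sudo password is typed at the remote prompt and does not appear in the command line or shell history.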

6. Using the Script

The archiv script reads the target server from the archiveserv environment variable. If the variable is not set, archiv prints the following error:

archiv
Error: The 'archiveserv' environment variable is not set.
Please add the following line to your ~/.zshrc file:
export archiveserv='username@ip_or_hostname'
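
A check like the one behind this error can be sketched as follows. The function name require_server is hypothetical, and the real archiv script may implement the check differently; only the error text is copied from the output above.

```shell
#!/bin/sh
# Sketch only: require_server is a hypothetical name.

# Prints the configured server, or fails with the error shown above
# when the archiveserv variable is unset or empty.
require_server() {
    if [ -z "${archiveserv:-}" ]; then
        echo "Error: The 'archiveserv' environment variable is not set." >&2
        return 1
    fi
    printf '%s\n' "$archiveserv"
}

# Usage:
#   server="$(require_server)" || exit 1
```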

After adding the line and reloading your shell (for example with source ~/.zshrc), check the variable and run archiv again:

mini 10.0.4.180 ~ echo $archiveserv
kali-cdmx
mini 10.0.4.180 ~ archiv
Enter the webpage URL (if missing 'https://', it will be added): 4rji.com
Enter your sudo password:
Executing the following command on server $archiveserv (kali-cdmx):
cd ~/archivebox && echo 'BIGPASSWORD' | sudo -S docker-compose run archivebox add --depth=0 https://4rji.com
[sudo] password for mazapana: Some networks were defined but are not used by any service: dns
Creating archivebox_archivebox_run ...
Creating archivebox_archivebox_run ... done
[i] [2025-04-15 23:28:27] ArchiveBox v0.7.3: archivebox add --depth=0 https://4rji.com
> /data

[+] [2025-04-15 23:28:28] Adding 1 links to index (crawl depth=0)...
> Saved verbatim input to sources/1744759708-import.txt
> Parsed 1 URLs from input (Generic TXT)
> Found 0 new URLs not already in index

[*] [2025-04-15 23:28:28] Writing 0 links to main index...
√ ./index.sqlite3
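
The URL prompt above says a missing 'https://' is added automatically. One way to do that is sketched below; normalize_url is a hypothetical name, and leaving http:// URLs unchanged is an assumption about how the real script behaves.

```shell
#!/bin/sh
# Sketch only: normalize_url is a hypothetical name.

# Prepends https:// when the input has no http:// or https:// scheme.
normalize_url() {
    case "$1" in
        http://*|https://*) printf '%s\n' "$1" ;;
        *) printf 'https://%s\n' "$1" ;;
    esac
}

# Usage:
#   url="$(normalize_url 4rji.com)"
```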

When you run the SSH command from section 5 yourself, replace 'Password' with your actual sudo password.