Converting Your Zotero Website Snapshots to PDF
Converting Zotero website snapshots into PDFs for easier annotation.
While writing my latest report for university, I have been researching some very current topics. Unlike with academic literature, much of the information on these topics is published on websites.
As I began applying the methods mentioned in this video (German), I found it difficult to take notes on the websites I cite.
Zotero can save website snapshots to its library. It does this by saving an HTML snapshot, which preserves the state of the website at the time of capture but is not handy for annotating the website from within Zotero.
Furthermore, snapshots do not work properly when a page relies on JavaScript to display its content or to provide interactive access to information.
Wouldn't it be nice to have a tool that converts those snapshots into PDFs?
Well, you're in luck because I have hacked something together.
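To give a rough idea of the core step, here is a minimal sketch of how a saved HTML snapshot could be printed to PDF with headless Chrome. It is not taken from the script in the repository. The function names and the Chrome binary path are assumptions for illustration. Only the `--headless` and `--print-to-pdf` flags are Chrome's own.

```python
import subprocess
from pathlib import Path


def build_print_command(chrome_binary: str, snapshot: Path, output_pdf: Path) -> list[str]:
    """Build a headless Chrome command that prints a local HTML file to PDF.

    chrome_binary is an assumption: typically "google-chrome" on Linux or
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" on macOS.
    """
    return [
        chrome_binary,
        "--headless",
        f"--print-to-pdf={output_pdf}",
        snapshot.resolve().as_uri(),  # file:// URL pointing at the snapshot
    ]


def snapshot_to_pdf(chrome_binary: str, snapshot: Path, output_pdf: Path) -> None:
    """Run Chrome; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(build_print_command(chrome_binary, snapshot, output_pdf), check=True)
```

Keeping the command construction separate from the call to `subprocess.run` makes it easy to print and inspect the command before letting Chrome touch any files.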
Prerequisites
- A working Python 3.12 installation
- Google Chrome
- Some programming knowledge to adjust the script configuration
- Your Zotero Library ID
- A Zotero API key with Read/Write Access
Disclaimer: This script is a quick solution and should not be taken as an example of good software development practices. I will update it over time, but there appears to be enough demand for a way to tackle this issue that I am sharing it as is.
I have tested the script on macOS and Linux. I do not know whether it is able to run on Windows!
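The library ID and API key from the prerequisites are what the Zotero Web API uses to identify and authorize you. As a hedged illustration of where the two values end up, the sketch below builds (but does not send) a request that would list attachment items. It assumes a personal user library rather than a group library, and the function name is hypothetical. The repository's script may talk to Zotero in a different way.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.zotero.org"


def build_attachment_request(library_id: str, api_key: str, limit: int = 25) -> urllib.request.Request:
    """Prepare a GET request for attachment items in a user library.

    Assumption: library_id is a personal user ID (group libraries use
    /groups/<id> instead of /users/<id>).
    """
    query = urllib.parse.urlencode({"itemType": "attachment", "limit": limit})
    url = f"{API_BASE}/users/{library_id}/items?{query}"
    return urllib.request.Request(
        url,
        headers={"Zotero-API-Key": api_key, "Zotero-API-Version": "3"},
    )


# Sending the request is left out on purpose. It would look like:
#   with urllib.request.urlopen(request) as response:
#       items = json.load(response)
```

If such a request is rejected with an authorization error, the key or its permissions are the first thing to check.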
Running the script
You can find the necessary and, most likely, more up-to-date information in the accompanying repository.
Make sure that you follow the guidelines in the repository's README file.
Local Zotero Library Backup
I strongly recommend creating a backup of your local Zotero library before continuing with this article. I did this by creating a copy of my ~/Zotero folder.
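If you prefer to script the backup, the following is a minimal sketch that copies the data folder to a timestamped directory. The function name and the backup location are assumptions, and it is best run while Zotero is closed so the database is not copied mid-write.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_zotero(source: Path, backup_root: Path) -> Path:
    """Copy the Zotero data folder to a new timestamped folder and return its path."""
    if not source.is_dir():
        raise FileNotFoundError(f"No Zotero folder found at {source}")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-backup-{stamp}"
    # copytree refuses to overwrite an existing target, so old backups stay intact.
    shutil.copytree(source, target)
    return target


# Example call (hypothetical backup location):
#   backup_zotero(Path.home() / "Zotero", Path.home() / "zotero-backups")
```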
Preparing the Python environment
I recommend setting up a virtual environment to run this script.
python3 -m venv myenv && source ./myenv/bin/activate
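Before installing anything, it can be worth confirming that the virtual environment is really active and that the interpreter is recent enough. This is a small optional check, not part of the repository. The helper names are made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """True when running inside a venv: its prefix differs from the base install."""
    return sys.prefix != sys.base_prefix


def python_is_recent(minimum: tuple[int, int] = (3, 12)) -> bool:
    """True when the running interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= minimum


print(f"virtual environment active: {in_virtualenv()}")
print(f"Python 3.12 or newer: {python_is_recent()}")
```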
After this step is done, clone the repository to your target directory by running the following command:
git clone https://github.com/valerius21/zotero-snapshot-to-pdf.git && cd zotero-snapshot-to-pdf
Then, install the dependencies inside the cloned repository with:
python -m pip install -r requirements.txt
Subsequently, copy config.example.toml to config.toml and adjust the parameters inside according to the comments.
cp config.example.toml config.toml
Running the script
If everything is done correctly, change into the repository's root directory and execute:
python convert_pdfs.py
The script will let you know when it's done.
Questions or comments?
If you would like to contribute to the script and/or have any questions, please drop me a DM on X or email me. PRs are welcome. Thank you!