Introduction
Welcome to the documentation of the Gutenberg project!
The home of the Gutenberg project is its GitHub repository: KSIUJ/gutenberg. It hosts the project's source code, the source of this book and the list of issues and planned features.
Book structure
This book is divided into three sections: Admin documentation, API documentation and Internals.
The Admin documentation section is intended for system administrators who manage a self-hosted Gutenberg instance.
The API documentation section documents Gutenberg's REST API and Internet Printing Protocol (IPP) implementation.
The Internals section is mainly intended for Gutenberg's contributors. It explains the codebase structure and describes the design choices and their rationale. It also describes the suggested implementation and design considerations for planned features.
Contributing to this book
The source of this book is in the same GitHub repository as Gutenberg's code, in the docs directory.
The book has been created using mdBook.
These mdBook plugins are also installed:
To contribute, fork this repository and create a pull request with your changes.
Instance setup
note
This document is incomplete
Requirements
- Printer: make the printer available on the server's network
- Linux server: install drivers, configure CUPS
- Linux server: test the lp command
- Check if you have the following commands available: convert (imagemagick), unoconv, gs (ghostscript), pdftk, and bwrap (bubblewrap)
  - Debian/Ubuntu: sudo apt install imagemagick unoconv ghostscript bubblewrap pdftk
  - Arch Linux: sudo pacman -S imagemagick unoconv ghostscript bubblewrap pdftk
- Gutenberg uses uv as the Python project manager. See https://docs.astral.sh/uv/getting-started/installation/ for install instructions.
- You will also need to have yarn or npm to build the web interface.
Setting up the app
First, set the temporary GUTENBERG_ENV environment variable to one of these two values:
export GUTENBERG_ENV=local # local development
export GUTENBERG_ENV=production # production settings
And, if you haven't done it yet, set your $EDITOR variable:
export EDITOR=vim # flamewar starting in 3, 2, 1...
Now, execute the following commands:
export DJANGO_SETTINGS_MODULE=gutenberg.settings.${GUTENBERG_ENV}_settings
git clone https://github.com/KSIUJ/gutenberg.git
cd gutenberg
# Setup the Python virtual environment in .venv and install required packages.
# uv will also download the correct Python version based on pyproject.toml,
# if the version installed on your machine is different.
cd backend
uv sync
cd ../
cd backend
cp ${GUTENBERG_ENV}_settings.py.example ${GUTENBERG_ENV}_settings.py
$EDITOR ${GUTENBERG_ENV}_settings.py # edit the values appropriately
cd ../
yarn install
yarn build
# Execute all Python commands through uv
cd backend
uv run manage.py migrate
uv run manage.py runserver 0.0.0.0:11111
cd ../
# visit localhost:11111 and check if everything works
You will also need to start at least one worker. In the main directory after activating the virtual environment:
cd backend
uv run celery -A gutenberg worker -B -l INFO
For proper deployment (instead of uv run manage.py runserver), see the uWSGI documentation.
IPP features might not work with runserver: a proper front webserver is required (it works with e.g. nginx + uwsgi). This is due to an error in Django (or one of its dependencies): the Expect: 100-continue HTTP header is not handled properly by the development server, although the IPP standard requires it.
Please remember to add both uwsgi (or your server of choice) AND the celery worker (including celery beat) to systemd (or the init system you use).
Example production configs for the systemd, uwsgi and nginx setup are available in the /examples/ directory.
Printer management
This document is intended for Gutenberg instance admins; it explains how to configure printers and manage printing permissions.
note
This document is incomplete
Admin interface
Management actions are performed in the Django admin interface, which can be accessed by appending /admin/ to the instance URL.
The simplest way to access it is to create a superuser account:
uv run manage.py createsuperuser
Printer list
A new printer can be created in the Control > Printers section of the admin interface.
Adding a printer via CUPS
CUPS is the standard printing system on Linux operating systems.
CUPS provides a web interface for managing printers, which can be accessed at http://localhost:631/ on the server where the Gutenberg Celery worker is running.
When adding a CUPS printer, the Printer type field should be set to local cups.
You can use the web interface or the command below to find the list of available printers:
lpstat -v
Use the name (not the URL) from the output of the command above as the value of the Cups printer name field.
Most other fields are optional.
Managing printing permissions
Only users who are in a group listed in the Printer permissions list can access the printer.
important
This restriction also applies to superuser accounts.
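The rule above amounts to a plain group-membership check. The following is a minimal sketch of that rule, not Gutenberg's actual code; the function name and the representation of groups as sets of names are assumptions made for illustration.

```python
def can_use_printer(user_groups: set[str], permitted_groups: set[str]) -> bool:
    """Return True if the user belongs to at least one permitted group.

    Superuser status is deliberately not a parameter: per the rule above,
    superusers get no bypass and must be in a permitted group as well.
    """
    return not user_groups.isdisjoint(permitted_groups)
```

With this check, a printer whose permissions list is empty is not accessible to anyone.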
IPP and REST API overview
Gutenberg provides two APIs for interacting with it: the Internet Printing Protocol (IPP) and the REST API.
See their documentation pages for more details.
The REST API is intended for use in the webapp (UI) component of Gutenberg. In the future a token-based authentication scheme might be implemented for use by other API clients.
Most existing REST API endpoints map to corresponding IPP operations and have similar semantics. This design reduces code duplication in the IPP and REST API modules. The page IPP and REST API comparison contains a table which lists the matching REST API endpoints and IPP operations.
Standard sequences of operations for printing documents
Printing single-document print jobs in a single request
When a print job consists of only a single document, both the REST API and IPP provide a simple way to select print job attributes and upload the file in a single request.
In IPP the Print-Job operation is used for this; the REST API endpoint is POST /api/jobs/submit/.
The print job is started immediately after the upload is complete.
Printing multi-document print jobs
- A new, empty print job is created using the Create-Job IPP operation or the POST /api/jobs/create_job/ endpoint. The print job attributes are supplied in this request, and they are used to print all the files. The server's response includes the job id, which is used for subsequent requests.
- Documents are uploaded sequentially using the Send-Document operation or the POST /api/jobs/:id/upload_artefact/ endpoint.
- To complete the print job and to enqueue it, the client must do one of the following:
  - Set the last-document (IPP) or last (REST API) flag in the last artifact upload request in the previous step.
  - IPP only: execute an additional Send-Document operation with last-document set to true and no document data in the request body.
  - Execute the Close-Job operation or make a request to the POST /api/jobs/:id/run_job/ endpoint.
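The REST API variant of the steps above can be sketched as follows. This is an illustration under stated assumptions, not a client shipped with Gutenberg: the send callable is a hypothetical transport (for example a thin wrapper around an authenticated HTTP session), and the id and file field names are assumed. Only the endpoint paths and the last flag come from the description above.

```python
from typing import Any, Callable, Sequence

# Hypothetical transport: (method, path, payload) -> decoded JSON response.
Send = Callable[[str, str, dict[str, Any]], dict[str, Any]]


def print_multi_document_job(
    send: Send, attributes: dict[str, Any], documents: Sequence[bytes]
) -> Any:
    """Create a job, upload every document, and flag the final upload as last."""
    if not documents:
        raise ValueError("at least one document is required")

    # Step 1: create an empty job; the attributes apply to all documents.
    job = send("POST", "/api/jobs/create_job/", attributes)
    job_id = job["id"]  # assumed name of the job id field in the response

    # Steps 2 and 3: upload sequentially; the `last` flag on the final
    # upload completes and enqueues the job, so run_job is not called.
    for index, document in enumerate(documents):
        is_last = index == len(documents) - 1
        send(
            "POST",
            f"/api/jobs/{job_id}/upload_artefact/",
            {"file": document, "last": is_last},  # "file" is an assumed field name
        )
    return job_id
```

A client that does not know the document count in advance would instead upload every document without the flag and finish with a request to POST /api/jobs/:id/run_job/.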
Internet Printing Protocol (IPP)
The Internet Printing Protocol is an extensible protocol maintained by the Printer Working Group. Check the IPP Guide for an overview of IPP.
Gutenberg implements an IPP server. The IPP operations are not proxied directly to the physical printer but are handled by Gutenberg. Gutenberg verifies printing permissions, manages accounting (print stats and quotas) and processes the supplied documents. This has the following implications:
- Gutenberg might support some IPP operations, attributes or formats that the physical printer does not. (E.g., it might support submitting .docx files, even if the physical printer can only accept PDFs. In this case Gutenberg will convert the document to PDF, which it will then send to the printer).
- Some operations, attributes or formats supported by the physical printer might not be supported when printing via Gutenberg. (E.g., the printer might support stapling the media sheets after a print job is complete, but there is no way to use this feature via Gutenberg).
Supported IPP standards and versions
note
This section is incomplete
Supported IPP operations
note
This section is incomplete
Supported job attributes
note
This section is incomplete
Supported file formats
note
This section is incomplete
IPP endpoint and authentication
note
This section is incomplete
REST API
Gutenberg implements a REST API using Django REST framework.
The endpoint for the REST API is <GUTENBERG_INSTANCE_URL>/api/.
You can explore the API by browsing it. DRF generates interactive HTML views for all routes.
note
The auto-generated documentation is currently incomplete and in some cases displays incorrect schemas.
Authentication
The REST API supports only cookie-based session authentication. This makes it unsuitable for uses other than Gutenberg's webapp. Support for other authentication schemes might be added in the future.
IPP and REST API comparison
This table lists the REST API endpoints and the implemented IPP operations that have similar semantics.
REST API endpoint | IPP operation | Notes |
---|---|---|
GET /api/printers/ | not applicable | In Gutenberg IPP itself is not used for printer discovery. An ipp: (or ipps: ) URI is specific to one printer. More recent updates to IPP add support for the output-device attribute, but it's not used in Gutenberg. |
GET /api/printers/:printer_id/ | Get-Printer-Attributes (RFC 8011) | |
POST /api/jobs/submit/ | Print-Job (RFC 8011) | |
none | Validate-Job (RFC 8011) | |
GET /api/jobs/ | Get-Jobs (RFC 8011) | |
GET /api/job/:id/ | Get-Job-Attributes (RFC 8011) | |
POST /api/jobs/:id/cancel/ | Cancel-Job (RFC 8011) | |
POST /api/jobs/create_job/ | Create-Job (RFC 8011) | |
POST /api/jobs/:id/upload_artefact/ | Send-Document (RFC 8011) | |
POST /api/jobs/:id/run_job/ | Close-Job (PWG 5100.7-2023) | This endpoint also maps to the Send-Document operation, but with last-document set to true and no file provided. This used to be the standard way of finishing a job in IPP v1.1 when the document count is not known in advance. |
none | Identify-Printer (PWG 5100.13) | Implemented as a no-op. |
UX and UI design goals
Goals
Printing and config
- Simple printing should be as easy as possible. File upload should be possible from the main page.
- There should be an indication of what file types are allowed.
Print config
- The printing logic should be ready for manual duplex printing, which involves a two-stage printing process.
- The UI should support settings specific for a given format.
- If printing multiple files at once is possible, it should be possible to edit some settings separately for each file.
- The UI should not be overwhelming; irrelevant settings should be hidden.
- If possible, settings should include a visual explanation. If the preview option is good enough, this might not be necessary.
Preview
- The user should be able to tell exactly what the printed pages will look like.
- It should be easy to tell the order of the resulting pages and the backsides of the pages. For duplex printing, the user should be able to clearly see the difference between the "Two-sided (long edge)" and "Two-sided (short edge)" options.
- The user should never see an out-of-date preview.
IPP
- The IPP feature should be easily discoverable from the main page.
- This feature should be easy to understand for non-technical users.
Print queue
- If there are documents in the user's print queue, an indicator for that should be visible on all/most pages.
- It should be possible to cancel a print job from the print queue.
General UI
- The page should be responsive.
- The UX on mobile and desktop can be different. For example, some options that are always visible on desktop might appear on mobile only after the first file is uploaded.
- A button with the text Print should only be used to start the print job.
UI
Main page
On desktop, the main page will use a two-column layout with a header. The header contains the logo and a user menu.
The left column contains only a card with file upload elements and configuration options. The card will be prominent.
The right column will describe other ways to print: IPP and the REST API. It could also describe what Gutenberg is and how to self-host it.
Print preview
After uploading files, the user should be presented with the buttons Preview and Print.
Clicking Preview takes the user to the preview mode:
On desktop the upload + config card will remain visible and will be moved to the left screen edge; the right side will transform to show the print preview.
On mobile the preview will take the whole screen; there should be a button to close it and go back to modify the print config.
The preview updates automatically after the user changes the configuration. While a new preview is loading, the previous preview is grayed out.
The preview can have multiple display modes:
2D mode
This view shows all the pages in a reasonable order. It should be decided if this order respects the "Reverse order" setting. The user can change the preview orientation (independently of the print settings).
If duplex printing is enabled, the preview shows the backsides next to the front sides. The front pages are always on the left, the back pages on the right. The back page should be oriented as if the sheet was flipped along the common vertical edge (in the selected preview orientation) between the front and back pages. In particular, this means that the right page is rotated 180 degrees if the pages are displayed in landscape mode and the "Two-sided (long edge)" option is selected, or if they are displayed in portrait mode and the "Two-sided (short edge)" option is selected.
The rationale for this is that if the user selects the display orientation that places the front page upright, they will be able to see if the back page is upside down.
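The orientation rule for the back page can be written down as a small function. This is only a sketch of the rule described above: the enum and function names are hypothetical, and the enum values reuse the standard IPP sides keywords purely as labels.

```python
from enum import Enum


class Orientation(Enum):
    PORTRAIT = "portrait"
    LANDSCAPE = "landscape"


class Duplex(Enum):
    LONG_EDGE = "two-sided-long-edge"
    SHORT_EDGE = "two-sided-short-edge"


def back_page_rotation(preview: Orientation, duplex: Duplex) -> int:
    """Return the rotation (in degrees) of the back page in the 2D preview."""
    # The preview flips the sheet along its vertical edge: that is the long
    # edge in portrait orientation and the short edge in landscape.
    vertical_edge = (
        Duplex.LONG_EDGE if preview is Orientation.PORTRAIT else Duplex.SHORT_EDGE
    )
    # If the printer flips along the other edge, the back appears upside down.
    return 0 if duplex is vertical_edge else 180
```

For example, back_page_rotation(Orientation.PORTRAIT, Duplex.SHORT_EDGE) evaluates to 180, which matches the second case named above.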
3D mode
This view shows the pages in an isometric 3D view as if they were printed on paper and stacked. The order should match the order in the stack of printed pages. This requires admin configuration. The backsides are revealed by a 3D rotation on hover.
Planned API extensions
The first two extensions are designed with the #63 Printing Previews feature in mind. The previews will be generated on the server (likely in a Celery worker), not on the client device. This requires uploading documents before the request to start the print job is made. As the user might want to modify the print settings after generating a preview, new endpoints and operations for modifying a print job are needed. Without them, the client would have to create new jobs for each preview, which would require reuploading the documents each time.
important
The URLs for the endpoints below are not final; defaults generated by the Django REST Framework's ViewSet feature should be used where possible.
Job modifications
The goal of this extension is to allow the client to perform the following actions on a job:
- Modify job attributes
- Get the document (file) list
- Delete an uploaded document
- Change the document print order
The first two actions can be implemented using standard IPP operations.
Modify job attributes
RFC 3380 defines the Set-Job-Attributes IPP operation, which does exactly what we need for this action.
The REST API endpoint for this action could be PATCH /api/jobs/:id/. Alternatively a POST action could be added.
While a PATCH request is not required to be idempotent, our implementation of this endpoint probably should be.
See the MDN docs for the PATCH request method.
The IPP operation must be atomic, which means it must either change all requested attributes or none. The REST API endpoint should also be atomic.
The IPP operation is also sparse, meaning the client only has to provide the attributes which it wishes to change. This REST API endpoint should also allow sparse updates so that the addition of a new attribute is not a breaking change.
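The two properties can be illustrated with a short sketch. It assumes a hypothetical per-attribute validator table and plain dictionaries for the job attributes; none of these names come from Gutenberg's codebase.

```python
from typing import Any, Callable, Mapping

Validator = Callable[[Any], bool]


def apply_sparse_update(
    current: Mapping[str, Any],
    changes: Mapping[str, Any],
    validators: Mapping[str, Validator],
) -> dict[str, Any]:
    """Return a copy of `current` with `changes` applied, or raise ValueError.

    Atomic: every change is checked before any of them is applied.
    Sparse: attributes absent from `changes` keep their current values.
    """
    rejected = [
        name
        for name, value in changes.items()
        if name not in validators or not validators[name](value)
    ]
    if rejected:
        raise ValueError(f"rejected attributes: {sorted(rejected)}")
    return {**current, **changes}
```

Because the function returns a new dictionary and never mutates current, a rejected request leaves the stored attributes untouched. Repeating the same request also yields the same result, which gives the idempotent behavior suggested above.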
Unsupported job configuration handling
The IPP operations Print-Job, Validate-Job, Create-Job and Set-Job-Attributes should verify if the selected print configuration is valid and reject the operation if it is not (e.g., if some pair of provided attribute values is conflicting).
The same behavior might not be the optimal solution for the REST API. When the user is modifying the print configuration, it is desirable to store it on the server after each change in the UI (with proper throttling/debouncing on the webapp side). This way refreshing the page will not cause data loss, as the web app can retrieve the stored configuration after the reload.
The suggested behavior in this case is to allow setting syntactically valid attributes that result in a configuration not supported by the selected printer in requests to POST /api/jobs/create_job/ and PATCH /api/jobs/:id/.
If the selected job configuration is invalid, the responses to these requests should indicate operation success (a 2xx status code) but should include an errors field in the response body, indicating the errors in the selected configuration. A human-readable warning message should be included for displaying in the webapp UI. The output could also include the warnings in a structured form, just like it would be returned from the IPP operations.
The server should return a failure response for calls to POST /api/jobs/submit/ and POST /api/jobs/:id/run_job/ if the current job configuration is invalid.
The same validation should also happen when executing IPP operations which start the print job (Print-Job, Send-Document with last-document set to true and Close-Job) and the Validate-Job operation, as the configuration might be invalid if it has been created via IPP but modified using the REST API.
The list of errors could also be retrievable using a new endpoint or could be included in the response to calls to the GET /api/job/:id/ endpoint.
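The difference between saving a configuration and starting a job could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names, the shape of the supported table, the response bodies and the choice of 400 as the failure status. Only the errors field and the success-versus-failure split come from the proposal above.

```python
from typing import Any


def find_config_errors(attributes: dict[str, Any], supported: dict[str, set]) -> list[str]:
    """List human-readable problems; an empty list means the config is valid."""
    return [
        f"{name}={value!r} is not supported by the selected printer"
        for name, value in attributes.items()
        if value not in supported.get(name, set())
    ]


def save_config_response(
    attributes: dict[str, Any], supported: dict[str, set]
) -> tuple[int, dict[str, Any]]:
    """Saving always succeeds (2xx); problems are reported in `errors`."""
    errors = find_config_errors(attributes, supported)
    return 200, {"attributes": attributes, "errors": errors}


def run_job_response(
    attributes: dict[str, Any], supported: dict[str, set]
) -> tuple[int, dict[str, Any]]:
    """Starting a job fails while the stored configuration is invalid."""
    errors = find_config_errors(attributes, supported)
    if errors:
        return 400, {"errors": errors}  # 400 is an assumed failure status
    return 200, {"errors": []}
```

Sharing one validation function between both paths keeps the messages shown while editing consistent with the reason a later run request is refused.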
Get the document list
The IPP operations suitable for this action are Get-Documents and Get-Document-Attributes.
In the REST API the file list could be returned either in the response from GET /api/job/:id/ or using a new ViewSet endpoint supporting the requests to GET /api/job/:id/documents/ and GET /api/job/:id/documents/:doc_id/.
The ViewSet solution follows the convention of providing a 1 to 1 mapping of the REST API endpoints to IPP operations.
Delete document
A simple DELETE /api/job/:id/documents/:doc_id/ endpoint could accomplish this action in the REST API.
There is no IPP operation suitable to accomplish this action:
- The Delete-Document operation defined in the PWG 5100.5-2003 Standard for IPP Document Object has since been obsoleted and must not be implemented. Even if it wasn't, it could not be used for this purpose, as it can only be used by the printer's operators and administrators, not end-users.
- The Cancel-Document operation has undesirable semantics. If the Resubmit-Job operation or another way to rerun jobs is implemented, the previously canceled documents should get printed again.
As such, this endpoint should not be marked as mapping to Delete-Document or Cancel-Document.
Modify document order
For this action a POST /api/job/:id/documents/reorder endpoint could be provided accepting the new document print order in the request body, for example:
{
"order": ["B", "A", "C"]
}
The server should verify that all documents are included exactly once.
There is no standard IPP operation for this action.
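The server-side check on the submitted order can be sketched as a permutation test. The function name and the use of string document IDs are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def validate_reorder(current_ids: Sequence[str], new_order: Sequence[str]) -> list[str]:
    """Return `new_order` as a list if it is a permutation of `current_ids`."""
    # Comparing multisets rejects missing, duplicated and unknown documents.
    if Counter(new_order) != Counter(current_ids):
        raise ValueError("order must list every document of the job exactly once")
    return list(new_order)
```

For the request body shown above and a job with the documents A, B and C, the check passes; a body such as {"order": ["B", "A"]} would be rejected.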
Printing previews in the REST API
This feature is described in PR #63.
The preview request generates (low-quality) images for each page in the PDF, and any additional metadata needed to display the preview. Techniques like CSS Sprites could be used to load all images in a single request.
As both the print and preview requests create the same PDF file, job-scoped caching could be used to avoid generating the same PDF multiple times if the settings and input files have not changed.
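One way to decide whether a previously generated PDF can be reused is to derive a cache key from the job's settings and the content digests of its input files. The sketch below shows this idea only; the function name, the digest list and the choice of SHA-256 over a JSON dump are assumptions, and the key is meant to be stored per job.

```python
import hashlib
import json
from typing import Any, Mapping, Sequence


def preview_cache_key(attributes: Mapping[str, Any], file_digests: Sequence[str]) -> str:
    """Derive a key that changes only when the settings or input files change."""
    payload = json.dumps(
        # The file list stays ordered because document order affects the PDF.
        {"attributes": dict(attributes), "files": list(file_digests)},
        sort_keys=True,  # make the key independent of attribute ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

If the key computed for a print request equals the key stored alongside the last preview, the cached PDF can be sent to the printer without regenerating it.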
Please note that the preprocessing job might take a while to complete, and it is done asynchronously in a Celery worker. The API design should account for this:
- When a new preview is requested, the previous preview generation Celery task should probably be canceled.
- The behavior when a print is requested while the preview is being generated should be specified.
- There needs to be a way for the server to notify the client that the preview is ready. Long-running HTTP requests or server-sent events could be used. Consider separating the endpoints for a preview request and preview image retrieval.
Printing previews via IPP
See PR #69.
Per-document print attribute overrides
This feature is described in PR #94.
This could be an opportunity to implement the PWG 5100.5-2024 – IPP Document Object v1.2 standard.
Issues with the IPP operations for document management actions
The Get-Documents and Get-Document-Attributes operations map directly to the GET /api/job/:id/documents/ and GET /api/job/:id/documents/:doc_id/ endpoints described in the Job modifications section above.
This IPP standard uses sequential document numbers for identifying the documents. The obsoleted Delete-Document operation creates a gap in the numbering.
This numbering also represents the order in which the documents will be processed. This presents an issue for the modify document order action, as the document numbers used by IPP will need to be changed after modifying the document order.
One possible solution to this issue would be to assign new numbers to all the documents if the order gets changed. For example, assume there are three documents A, B, C, initially in this order, identified in IPP by the numbers A:1, B:2, C:3. When the REST API client changes the order to C, B, A, these documents will be assigned the numbers C:4, B:5, A:6. Since in this scenario the number of a document can change, the REST API should use different, persistent document IDs.
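The renumbering scheme can be sketched as follows. The function name and the mapping from persistent document IDs to IPP document numbers are hypothetical. The sketch picks fresh numbers above the highest number currently in use; a real implementation would likely need a per-job counter so that numbers freed by deleted documents are never reused.

```python
def renumber_after_reorder(numbers: dict[str, int], new_order: list[str]) -> dict[str, int]:
    """Assign fresh IPP document numbers following `new_order`.

    `numbers` maps persistent document IDs to current IPP document numbers.
    New numbers start above the highest one in use.
    """
    if sorted(new_order) != sorted(numbers):
        raise ValueError("new_order must contain every document exactly once")
    first_free = max(numbers.values(), default=0) + 1
    return {doc_id: first_free + offset for offset, doc_id in enumerate(new_order)}
```

The persistent IDs (the dictionary keys) never change, so the REST API can keep referring to a document by the same ID across reorders.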
The Cancel-Document operation required by this standard is semantically different from the proposed document delete endpoint. A canceled document should remain in the document list and will get printed if the job is resubmitted. The webapp should display this canceled status in the job file list.