Using the - operand you can read the HTML content from stdin, as fetched by a separate command, such as curl. In this sort of setup, percollate does not know the URL from which the content has been fetched, and relative paths on images, anchors, et cetera won't resolve correctly.
Use the --url option to supply the source's original URL.
```sh
curl https://example.com | percollate pdf - --url=https://example.com
```
-w, --wait
By default, percollate processes URLs in parallel. Use the --wait option to process them sequentially instead, with a pause between items. The delay is specified in seconds, and can be zero.
```sh
percollate epub --wait=1 url1 url2 url3
```
--individual
By default, percollate bundles all web pages in a single file. Use the --individual flag to export each source to a separate file.
--cover
Generate a cover. The option is implicitly enabled when the --title option is provided, or when bundling more than one web page to a single file. Disable this implicit behavior by passing the --no-cover flag.
--toc
Generate a hyperlinked table of contents. The option is implicitly enabled when bundling more than one web page to a single file. Disable this implicit behavior by passing the --no-toc flag.
Applies to pdf and html.
--hyphenate
Hyphenation is enabled by default for pdf, and disabled for epub and html. You can opt into hyphenation with the --hyphenate flag, or disable it with the --no-hyphenate flag.
If you'd like to fetch the HTML with an external command, you can use - as an operand, which stands for stdin (the standard input):
```sh
curl https://example.com/page1 | percollate pdf --url=https://example.com/page1 -
```
Notice we're using the --url option to tell percollate the source of the (now-anonymous) HTML it gets on stdin, so that relative URLs on links and images resolve correctly.
The --css option
The --css option lets you pass a small snippet of CSS to percollate. Here are some common use-cases:
Custom page size / margins
The default page size is A5 (portrait). You can use the --css option to override it using [any supported CSS size](https://www.w3.org/TR/css3-page/#page-size):
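As a sketch, a standard CSS `@page` rule passed through `--css` could switch the paper size. The A3 landscape value below is an illustrative choice, and the invocation is guarded so the snippet does nothing on a machine without percollate installed:

```sh
# Illustrative @page rule; the size is an example value, not a percollate default.
# To also change the margins, something like '@page { size: Letter; margin: 0 }'
# should work the same way.
PAGE_CSS='@page { size: A3 landscape }'

# Only run percollate when it is actually installed.
if command -v percollate >/dev/null 2>&1; then
  percollate pdf --css "$PAGE_CSS" https://example.com
fi
```

Single quotes around the CSS keep the shell from interpreting the braces and semicolons.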
💡 For custom fonts to work correctly, they must be installed on your machine. Custom web fonts currently require you to use a custom CSS stylesheet / HTML template.
Remove the appended hrefs from hyperlinks
The idea with percollate is to make PDFs that can be printed without losing where the hyperlinks point to. However, for some link-heavy pages, the appended hrefs can become bothersome. You can remove them using:
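A sketch of one way to do this, assuming the default stylesheet appends each href as generated content on an `a:after` pseudo-element. That selector is an assumption about the stylesheet, so check it against the version you have installed:

```sh
# Assumption: the appended hrefs are generated content on a:after.
HIDE_HREFS_CSS='a:after { display: none }'

# Only run percollate when it is actually installed.
if command -v percollate >/dev/null 2>&1; then
  percollate pdf --css "$HIDE_HREFS_CSS" https://example.com
fi
```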
Hyphenation is only enabled by default for PDFs, but you can opt in or out of it for any output format with a flag.
When hyphenation is enabled, paragraphs will be justified:
```css
.article__content p {
  text-align: justify;
}
```
If you prefer left-aligned text:
```sh
percollate pdf --css ".article__content p { text-align: left }" http://example.com
```
The --style option
The --style option lets you use your own CSS stylesheet instead of the default one. Here are some common use-cases for this option:
⚠️ TODO add examples here
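One possible use, sketched under assumptions: write a small stylesheet to a file and pass its path to `--style`. The file name `my-style.css` and the rules in it are hypothetical; only the `.article__content p` selector is taken from this document. Because the custom stylesheet replaces the default one, a real stylesheet would need to cover everything you want styled.

```sh
# Write a hypothetical custom stylesheet to the current directory.
cat > my-style.css <<'EOF'
@page {
  size: A4;
  margin: 2cm;
}

.article__content p {
  text-align: left;
}
EOF

# Only run percollate when it is actually installed.
if command -v percollate >/dev/null 2>&1; then
  percollate pdf --style my-style.css https://example.com
fi
```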
The --template option
The --template option lets you use a custom HTML template for the PDF.
💡 The HTML template is parsed with nunjucks, which is a close JavaScript relative of Twig for PHP, Jinja2 for Python and Liquid for Ruby.
Here are some common use-cases:
Customizing the page header / footer
Puppeteer can print some basic information about the page in the PDF. The following CSS class names are available for the header / footer, into which the appropriate content will be injected:
- date: The formatted print date
- title: The document title
- url: The document location (note: this prints the path of the _temporary HTML file_, not the original web page URL)
You can add CSS styles to the header / footer with either the --css option or a separate CSS stylesheet (the --style option).
💡 The header / footer templates do not inherit their styles from the rest of the page (i.e. they are not part of the cascade), so you'll have to write the full CSS you want to apply to them.
Different formats then use different tools to produce the final file.
PDFs are rendered with [puppeteer](https://github.com/GoogleChrome/puppeteer).
EPUBs have external images fetched and bundled together with the HTML of each article. When the --inline option is used, images are instead converted to data URLs and embedded into the HTML.
HTMLs are saved without any further changes. When the --inline option is used, images are converted to data URLs and embedded into the HTML. External images are not otherwise fetched.
Limitations
Percollate inherits the limitations of two of its main components, Readability and Puppeteer (headless Chrome).
The imperative approach Readability takes will not be perfect in each case, especially on HTML pages with atypical markup; you may occasionally notice that it either leaves in superfluous content, or that it strips out parts of the content. You can confirm the problem against Firefox's Reader View. If it shows up there as well, consider [filing an issue on mozilla/readability](https://github.com/mozilla/readability/issues).
Using a browser to generate the PDF is a double-edged sword. On the one hand, you get excellent support for web platform features. On the other hand, print CSS as defined by W3C specifications is only partially implemented, and it seems unlikely that support will be improved any time soon. However, even with modest print support, I think Chrome is the best (free) tool for the job.