pdf_translator

PDF Translator

PDF Translation Utilities.

This repository contains two Python command‑line utilities that convert between PDF and PNG/JPG while preserving the original directory structure.

Both scripts accept either a single file or a directory; a directory is processed recursively.


## 1) pdf2png_tree.py: PDF → PNG/JPG

Convert PDF pages to images while mirroring the input directory tree under the output root.


### Usage
```bash
# Single file
python pdf2png_tree.py /path/to/file.pdf /path/to/outdir

# Directory (recursive)
python pdf2png_tree.py /path/to/input_dir /path/to/outdir
```

### Options

| Option | Default | Description |
|---|---|---|
| `-d`, `--dpi` | `144` | Output resolution in DPI; 200–300 is recommended for print-quality output |
| `--ext` | `png` | Output image format (`png` or `jpg`) |
| `--overwrite` | off | Overwrite existing files instead of skipping them |
| `--per-pdf-subdir` | off | Create a subfolder per PDF (e.g., `<pdf_stem>_png/`) |
| `--suffix` | `_png` | Subfolder suffix used with `--per-pdf-subdir` |
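
To get a feel for what `--dpi` does to output size: PDF page dimensions are measured in points (1/72 inch), so pixel dimensions scale linearly with DPI. The sketch below only illustrates that arithmetic. `pixel_size` is a hypothetical helper, not a function from the script, and the rounding used by the actual renderer may differ.

```python
from __future__ import annotations


def pixel_size(width_pt: float, height_pt: float, dpi: int = 144) -> tuple[int, int]:
    """Return (width, height) in pixels for a page measured in PDF points."""
    scale = dpi / 72  # 72 points per inch
    return round(width_pt * scale), round(height_pt * scale)


# An A4 page is roughly 595 x 842 points.
print(pixel_size(595, 842))       # default 144 DPI: (1190, 1684)
print(pixel_size(595, 842, 300))  # print quality:   (2479, 3508)
```

Going from 144 to 300 DPI roughly quadruples the pixel count per page, which is worth keeping in mind for large directory trees.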

### Output layout

Default (no subfolder):

input/
 └─ reports/quarter1/file.pdf

output/
 └─ reports/quarter1/file_p001.png
                          file_p002.png

With --per-pdf-subdir:

output/
 └─ reports/quarter1/file_png/
      ├─ file_p001.png
      └─ file_p002.png
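
The two layouts above can be reproduced with a few lines of `pathlib`. This is a minimal sketch of the path mapping only, assuming the `_pNNN` page numbering shown in the examples. `page_image_path` and its parameters are hypothetical names for illustration, not the script's actual internals.

```python
from pathlib import Path


def page_image_path(pdf_path, input_root, output_root, page_number,
                    ext="png", per_pdf_subdir=False, suffix="_png"):
    """Return the output image path for one page of one PDF."""
    pdf_path = Path(pdf_path)
    # Directory of the PDF relative to the input root, e.g. reports/quarter1
    rel_dir = pdf_path.parent.relative_to(input_root)
    target_dir = Path(output_root) / rel_dir
    if per_pdf_subdir:
        # e.g. reports/quarter1/file_png/
        target_dir = target_dir / f"{pdf_path.stem}{suffix}"
    # Zero-padded, 1-based page number, e.g. file_p001.png
    return target_dir / f"{pdf_path.stem}_p{page_number:03d}.{ext}"


src = "input/reports/quarter1/file.pdf"
print(page_image_path(src, "input", "output", 1).as_posix())
# output/reports/quarter1/file_p001.png
print(page_image_path(src, "input", "output", 2, per_pdf_subdir=True).as_posix())
# output/reports/quarter1/file_png/file_p002.png
```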

## 2) png2pdf_tree.py: PNG/JPG → PDF

Convert images to PDF while mirroring the input directory tree under the output root.
By default, each image becomes its own PDF.
With `--merge`, all images in a folder are merged into one multi-page PDF.

### Requirements


### Usage

0. Create virtual environment

python3 -m venv env
source env/bin/activate
pip install -r requirements.txt

1. Translate

# Per image (default)
python png2pdf_tree.py /path/to/images /path/to/outdir

# Merge images per folder into one PDF
python png2pdf_tree.py /path/to/images /path/to/outdir --merge

2. Test

pip install -r requirements.test.txt
pytest

3. Deactivate environment

deactivate

### Options

| Option | Default | Description |
|---|---|---|
| `--exts` | `png,jpg,jpeg` | Comma-separated list of image extensions to include |
| `--suffix` | `_converted` | Suffix for output PDF filenames |
| `--overwrite` | off | Overwrite existing PDFs |
| `--merge` | off | Merge images per folder into a single PDF |
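
As an illustration of how `--exts` and `--overwrite` might be interpreted, here is a small sketch using only the standard library. The helper names are hypothetical, and the case-insensitive matching and tolerance for stray dots or spaces are assumptions, not documented behavior of the script.

```python
from __future__ import annotations

from pathlib import Path


def parse_exts(value: str = "png,jpg,jpeg") -> set[str]:
    """Turn 'png,jpg,jpeg' into {'.png', '.jpg', '.jpeg'}."""
    return {
        "." + part.strip().lower().lstrip(".")
        for part in value.split(",")
        if part.strip()  # ignore empty entries such as a trailing comma
    }


def is_selected(path, exts: set[str]) -> bool:
    """True if the file's extension is in the allowed set (case-insensitive)."""
    return Path(path).suffix.lower() in exts


def should_write(target, overwrite: bool = False) -> bool:
    """Skip targets that already exist unless overwriting was requested."""
    return overwrite or not Path(target).exists()


exts = parse_exts()
print(is_selected("A/img1.PNG", exts))   # True
print(is_selected("A/notes.txt", exts))  # False
```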

### Output layout

Per image (default):

input/
 ├─ A/img1.png
 └─ B/C/img2.jpg

output/
 ├─ A/img1_converted.pdf
 └─ B/C/img2_converted.pdf

With --merge:

input/
 ├─ A/img1.png
 ├─ A/img2.png
 └─ B/C/img3.jpg

output/
 ├─ A/A_converted.pdf      # img1 + img2
 └─ B/C/C_converted.pdf    # img3
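
The naming rules in both layouts can be sketched as a planning step that maps each output PDF to the images it would contain. This is an illustrative sketch under stated assumptions: `plan_outputs` is a hypothetical function, and ordering pages by sorted path is an assumption about how merged pages are sequenced.

```python
from collections import defaultdict
from pathlib import Path


def plan_outputs(images, input_root, output_root, suffix="_converted", merge=False):
    """Map each output PDF path to the list of images it would contain."""
    input_root, output_root = Path(input_root), Path(output_root)
    plan = defaultdict(list)
    for image in sorted(Path(p) for p in images):
        # Mirror the image's folder under the output root.
        out_dir = output_root / image.relative_to(input_root).parent
        # Merged PDFs are named after the folder; per-image PDFs after the image.
        stem = image.parent.name if merge else image.stem
        plan[out_dir / f"{stem}{suffix}.pdf"].append(image)
    return dict(plan)


images = ["input/A/img1.png", "input/A/img2.png", "input/B/C/img3.jpg"]
for pdf, pages in plan_outputs(images, "input", "output", merge=True).items():
    print(pdf.as_posix(), [p.name for p in pages])
# output/A/A_converted.pdf ['img1.png', 'img2.png']
# output/B/C/C_converted.pdf ['img3.jpg']
```

Note that in per-image mode this sketch would map `img1.png` and `img1.jpg` in the same folder to the same PDF name; how the real script handles that collision is not specified here.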

### Tips

### License