# PDF Manager

A Node.js Express service for PDF processing operations: split PDFs into multiple documents and rename them based on metadata.

## Features

- **PDF Splitting**: Split a multi-page PDF into separate documents based on a specified page count
- **Metadata-Based Renaming**: Automatically rename split PDFs using provided metadata (name and lastname)
- **ZIP Archive Output**: Receive all processed PDFs in a single ZIP archive
- **Input Validation**: Joi-based request validation for robust error handling
- **File Size Limits**: 10 MB maximum file size for uploads

## API Endpoints

### POST /process-pdf

Process a PDF file by splitting it into multiple documents and renaming them based on metadata.

#### Request

- **Content-Type**: `multipart/form-data`
- **Parameters**:
  - `file` (file, required): The PDF file to process (max 10 MB)
  - `pagesPerDoc` (number, required): Number of pages per output document
  - `metadata` (JSON string, required): Array of metadata objects for naming

#### Metadata Format

```json
[
  {
    "name": "John",
    "lastname": "Doe"
  },
  {
    "name": "Jane",
    "lastname": "Smith"
  }
]
```

The array must contain at least as many metadata entries as there are output documents, where the number of output documents is determined by `pagesPerDoc`.

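The count rule above can be sketched as a small check. This is only an illustration, not the service's actual code: `planSplit` is a hypothetical helper, and it assumes that a trailing partial chunk (when the page count is not a multiple of `pagesPerDoc`) still becomes its own document.

```javascript
// Hypothetical helper: plans the page range of each output document.
// Assumption: a trailing partial chunk still produces a document.
function planSplit(totalPages, pagesPerDoc, metadata) {
  if (!Number.isInteger(totalPages) || totalPages < 1) {
    throw new RangeError('totalPages must be a positive integer');
  }
  if (!Number.isInteger(pagesPerDoc) || pagesPerDoc < 1) {
    throw new RangeError('pagesPerDoc must be a positive integer');
  }

  // One metadata entry is needed per output document.
  const docCount = Math.ceil(totalPages / pagesPerDoc);
  if (metadata.length < docCount) {
    throw new Error(`Need ${docCount} metadata entries, got ${metadata.length}`);
  }

  const ranges = [];
  for (let start = 0; start < totalPages; start += pagesPerDoc) {
    // Zero-based, end-exclusive page range for one output document.
    ranges.push({ start, end: Math.min(start + pagesPerDoc, totalPages) });
  }
  return ranges;
}

const people = [
  { name: 'John', lastname: 'Doe' },
  { name: 'Jane', lastname: 'Smith' },
];
console.log(planSplit(4, 2, people));
```

Under these assumptions, a 4-page PDF with `pagesPerDoc=2` needs two metadata entries, while a 5-page PDF would need three and the call would throw.
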
#### Response

- **Content-Type**: `application/zip`
- **Content-Disposition**: `attachment; filename="processed_pdfs.zip"`
- **Body**: ZIP archive containing the split and renamed PDF files

#### Output Filenames

Files are named in the format `{name}_{lastname}.pdf` (e.g., `john_doe.pdf`, `jane_smith.pdf`).

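A naming scheme like this could be sketched as follows. `outputFilename` is a hypothetical helper; lowercasing is inferred from the examples above, and replacing every other character with `-` is purely an assumption, since the README does not say how the service treats spaces, accents, or duplicate names.

```javascript
// Hypothetical helper: builds an output filename from one metadata entry.
// Assumptions: values are lowercased (as in the README examples) and any run
// of characters other than a-z or 0-9 is replaced with a single "-".
function outputFilename({ name, lastname }) {
  const clean = (value) =>
    String(value).trim().toLowerCase().replace(/[^a-z0-9]+/g, '-');
  return `${clean(name)}_${clean(lastname)}.pdf`;
}

console.log(outputFilename({ name: 'John', lastname: 'Doe' })); // john_doe.pdf
```
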
#### Example Request (using curl)

```bash
curl -X POST http://localhost:3000/process-pdf \
  -F "file=@document.pdf" \
  -F "pagesPerDoc=2" \
  -F 'metadata=[{"name":"John","lastname":"Doe"},{"name":"Jane","lastname":"Smith"}]' \
  -o processed_pdfs.zip
```

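The same request can be sent from Node.js. The sketch below assumes Node.js 20 or higher, where `fetch`, `FormData`, and `Blob` are built in; `buildForm` and `processPdf` are hypothetical names, and the URL matches the curl example above.

```javascript
const { readFile, writeFile } = require('node:fs/promises');

// Builds the multipart form with the three documented fields.
function buildForm(pdfBytes, pagesPerDoc, metadata) {
  const form = new FormData();
  form.append('file', new Blob([pdfBytes], { type: 'application/pdf' }), 'document.pdf');
  form.append('pagesPerDoc', String(pagesPerDoc));
  form.append('metadata', JSON.stringify(metadata));
  return form;
}

// Sends the request and saves the returned ZIP archive to outputPath.
async function processPdf(inputPath, outputPath, pagesPerDoc, metadata) {
  const form = buildForm(await readFile(inputPath), pagesPerDoc, metadata);
  const res = await fetch('http://localhost:3000/process-pdf', {
    method: 'POST',
    body: form,
  });
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  await writeFile(outputPath, Buffer.from(await res.arrayBuffer()));
}
```

With the service running, `processPdf('document.pdf', 'processed_pdfs.zip', 2, [{ name: 'John', lastname: 'Doe' }, { name: 'Jane', lastname: 'Smith' }])` would mirror the curl call.
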
#### Error Responses

- **400 Bad Request**: Invalid metadata JSON, missing file, or validation errors
- **500 Internal Server Error**: PDF processing failed

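To illustrate the kinds of input that lead to a 400, here is a plain-JavaScript stand-in for the request checks. The service itself validates with Joi, so this is not its actual schema: `checkRequest`, its argument shape, and the error messages are all hypothetical.

```javascript
// Plain-JavaScript stand-in for the request checks (the service uses Joi).
// Returns { ok: false, status: 400, error } or { ok: true, value }.
function checkRequest({ hasFile, pagesPerDoc, metadata }) {
  if (!hasFile) {
    return { ok: false, status: 400, error: 'file is required' };
  }

  // Multipart fields arrive as strings, so convert before checking.
  const pages = Number(pagesPerDoc);
  if (!Number.isInteger(pages) || pages < 1) {
    return { ok: false, status: 400, error: 'pagesPerDoc must be a positive integer' };
  }

  let entries;
  try {
    entries = JSON.parse(metadata);
  } catch {
    return { ok: false, status: 400, error: 'metadata is not valid JSON' };
  }

  const valid =
    Array.isArray(entries) &&
    entries.every(
      (e) =>
        e !== null &&
        typeof e === 'object' &&
        typeof e.name === 'string' &&
        typeof e.lastname === 'string'
    );
  if (!valid) {
    return { ok: false, status: 400, error: 'metadata must be an array of { name, lastname }' };
  }

  return { ok: true, value: { pagesPerDoc: pages, metadata: entries } };
}
```
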
## Running Locally

### Prerequisites

- Node.js 20 or higher
- npm or Docker

### Option 1: Using npm

1. Install dependencies:

   ```bash
   npm install
   ```

2. Start the server:

   ```bash
   npm start
   ```

3. The server listens on port 3000, or on the port set in the `PORT` environment variable.

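The port fallback described for the npm option is commonly written along these lines. This is a sketch, not the service's code: `resolvePort` is a hypothetical helper, and falling back to 3000 for invalid values is an assumption.

```javascript
// Hypothetical helper: picks the listening port from the environment.
// Assumption: anything that is not an integer in 1-65535 falls back to 3000.
function resolvePort(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '', 10);
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : 3000;
}

console.log(`Would listen on port ${resolvePort()}`);
```

With a fallback like this, starting the server as `PORT=8080 npm start` would make it listen on port 8080 instead of 3000.
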
### Option 2: Using Docker Compose

1. Start the service:

   ```bash
   docker-compose up
   ```

2. The service will be available at `http://localhost:3000`

3. Stop the service:

   ```bash
   docker-compose down
   ```

### Option 3: Using Docker

1. Build the image:

   ```bash
   docker build -t pdf-manager .
   ```

2. Run the container:

   ```bash
   docker run -p 3000:3000 pdf-manager
   ```

## Dependencies

- **express** (^4.18.2): Web framework
- **multer** (^1.4.5-lts.1): Multipart/form-data handling for file uploads
- **pdf-lib** (^1.17.1): PDF manipulation library
- **archiver** (^7.0.1): ZIP archive creation
- **joi** (^17.11.0): Request validation

## Development

Start the development server with auto-reload:

```bash
npm run dev
```

## License

ISC