
[Task]: Backend support for seed files #2673

Open
@tw4l

Description

Core requirements:

  • Add endpoints to upload and delete a seed list file. On upload: check the storage quota first; validate that the file has the expected .txt extension and is <= 25 MB, that scope type is set to "page" if a seed list is uploaded, and that the seeds array is null/empty; store the file in S3; and save a file db record in a new MongoDB collection
  • Add support for a seed file (referenced by id) to the /crawlconfigs/ POST + PATCH endpoints, and ensure seed files that are no longer referenced are deleted
  • Add background cron job to check for and clean up orphaned seed files (and maybe collection thumbnails as well?)
  • Pass seed file to crawler if provided
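
A rough sketch of the upload checks and the orphan detection described above, as pure functions with no S3 or MongoDB access. All names here (`validate_seed_file_upload`, `find_orphaned_seed_files`, `SeedFileError`, the error strings) are hypothetical and not existing backend code. It assumes 25 MB means 25 * 1024 * 1024 bytes, that a quota of 0 means "no quota", and that the scope type and seeds checks are validations rather than values the backend sets itself.

```python
from pathlib import PurePath
from typing import Iterable, List, Optional, Sequence

# Assumption: "25 MB" is interpreted as 25 MiB.
MAX_SEED_FILE_BYTES = 25 * 1024 * 1024


class SeedFileError(ValueError):
    """Hypothetical error raised when an upload fails a check."""


def validate_seed_file_upload(
    filename: str,
    size_bytes: int,
    storage_used: int,
    storage_quota: int,
    scope_type: str,
    seeds: Optional[Sequence[str]],
) -> None:
    """Raise SeedFileError if the upload should be rejected."""
    # Check the quota first, before any other validation.
    # Assumption: a quota of 0 means no quota is enforced.
    if storage_quota and storage_used + size_bytes > storage_quota:
        raise SeedFileError("storage_quota_reached")

    # Only plain-text seed lists with a .txt extension are accepted.
    if PurePath(filename).suffix.lower() != ".txt":
        raise SeedFileError("invalid_extension")

    if size_bytes > MAX_SEED_FILE_BYTES:
        raise SeedFileError("file_too_large")

    # A seed file implies page scope and no inline seeds.
    if scope_type != "page":
        raise SeedFileError("invalid_scope_type")
    if seeds:
        raise SeedFileError("seeds_must_be_empty")


def find_orphaned_seed_files(
    all_file_ids: Iterable[str], referenced_ids: Iterable[str]
) -> List[str]:
    """Return ids of seed files that no crawl config references."""
    referenced = set(referenced_ids)
    return [file_id for file_id in all_file_ids if file_id not in referenced]
```

The cron job could call something like `find_orphaned_seed_files` with the ids from the new seed file collection and the ids referenced by crawl configs, then delete the returned files from storage and the db.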

Related tasks:

  • Pass lastCrawlStats to GET /crawlconfigs list and detail endpoints
  • Return seed file total size in org storage stats

Context

No response

Metadata

Assignees

Labels

No labels

Type

Projects

Status

Implementing

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
