[content-publishing] Deal with brittle IPFS service #834

JoeCap08055 opened this issue May 20, 2025 · 0 comments

Description

Real-world data from the "Freesky" project has shown us that, at least with a "locally"-deployed Kubo IPFS node, the IPFS service can be slow & brittle. Performance and other issues with the IPFS node can cause problems with content uploading.

Specifically, the recent addition of the /v2/assets/upload endpoint that streams directly to IPFS does not isolate the client from problems with the IPFS service. This endpoint was developed to address 2 issues:

  1. In the /v1/assets/upload endpoint, files were not only initially buffered in application heap memory, but were also cached in Redis (putting extra load on Redis, and also subject to Redis' 512MB limit on the size of any individual value). Streaming directly eliminates both the buffering and the caching.
  2. In the /v1/assets/upload workflow, the client gets an immediate response without confirmation that the asset has reached its final disposition in IPFS, leading to race conditions where the client immediately makes a request to announce previously-uploaded content, but the content cannot yet be verified on IPFS.

Unfortunately, it may be that the direct-stream approach is not robust enough. One advantage of the previous approach was that it utilized task queues with automatic retry to marshal IPFS uploads, and isolated the client from direct IPFS errors.
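For reference, here is a minimal sketch of the kind of retry behavior a BullMQ-backed queue gives us out of the box (the queue name, connection settings, and the pinToIpfs helper are illustrative assumptions, not the actual names used in content-publishing-service):

```typescript
import { Queue, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };

// Placeholder for the real IPFS client call (e.g. via Kubo's HTTP API).
async function pinToIpfs(filePath: string): Promise<void> {
  // ...stream the file to the IPFS node here...
}

// Out-of-band upload queue (hypothetical queue name).
const assetUploadQueue = new Queue('assetUpload', { connection });

// Enqueue an upload with automatic retries and exponential backoff, so a
// flaky IPFS node produces retries instead of a client-facing error.
await assetUploadQueue.add(
  'uploadAsset',
  { filePath: '/tmp/uploads/abc123' },
  { attempts: 5, backoff: { type: 'exponential', delay: 2000 } },
);

// Worker that actually pushes the file to IPFS; throwing triggers a retry.
new Worker('assetUpload', async (job) => pinToIpfs(job.data.filePath), { connection });
```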

Here are some architectural points to consider in devising a more robust solution to this workflow issue:

1. Better Local Caching of Content

We could require content-publishing-service (both -api and -worker) to be provisioned with enough attached storage to temporarily cache uploaded content. To eliminate application heap buffering, we could use the Multer middleware's DiskStorage engine to stream uploaded files directly to disk. Then the worker task queue could handle the IPFS upload out-of-band.
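A minimal sketch of what that might look like in a NestJS controller, assuming Multer's diskStorage engine (the route, field name, and destination path are illustrative):

```typescript
import { Controller, Post, UploadedFile, UseInterceptors } from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { diskStorage } from 'multer';
import { randomUUID } from 'crypto';

@Controller('v2/assets')
export class AssetsController {
  @Post('upload')
  // diskStorage streams the incoming file straight to disk, so it never
  // sits in application heap memory or in Redis.
  @UseInterceptors(
    FileInterceptor('file', {
      storage: diskStorage({
        destination: '/var/cache/content-publishing/uploads', // assumed mount point for the attached storage
        filename: (_req, file, cb) => cb(null, `${randomUUID()}-${file.originalname}`),
      }),
    }),
  )
  uploadAsset(@UploadedFile() file: Express.Multer.File) {
    // file.path points at the on-disk copy; enqueue the out-of-band IPFS
    // upload job here and return to the client immediately.
    return { assetPath: file.path };
  }
}
```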

This approach would benefit from both of the following ideas:

2. Task Dependencies

When uploading assets, if we make the task ID of an uploaded asset deterministically generated from either its DSNP hash or its CID, then we could:

  • Check, before queuing a content announcement task, whether all referenced assets are already present in IPFS; for any that are not, calculate their task IDs and devise a dependency mechanism that waits for those tasks to complete successfully
    • BullMQ has parent/child job dependency capabilities; we might be able to devise an appropriate mechanism there (see the sketch below)
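A minimal sketch of that idea using BullMQ's FlowProducer, assuming job IDs derived from the asset CID (queue names, data shapes, and the CID values are illustrative):

```typescript
import { FlowProducer } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };
const flowProducer = new FlowProducer({ connection });

// Hypothetical helper: derive a deterministic job ID from the asset's CID
// (a DSNP content hash would work equally well).
const assetJobId = (cid: string) => `asset-upload:${cid}`;

// The announcement job is the parent; BullMQ will not process it until all
// of its child asset-upload jobs have completed successfully. Deterministic
// child job IDs also let BullMQ skip re-adding an upload that is already queued.
await flowProducer.add({
  name: 'announceContent',
  queueName: 'contentAnnouncement',
  data: { announcementType: 'broadcast' },
  children: [
    {
      name: 'uploadAsset',
      queueName: 'assetUpload',
      data: { cid: 'bafy...example1' },
      opts: { jobId: assetJobId('bafy...example1') },
    },
    {
      name: 'uploadAsset',
      queueName: 'assetUpload',
      data: { cid: 'bafy...example2' },
      opts: { jobId: assetJobId('bafy...example2') },
    },
  ],
});
```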

3. Batch-upload Specific Ideas

The original approach in content-publishing-service is a multi-step process, modeled after the way many social media apps' posting workflows seem to work, i.e.:

  1. Upload media assets
  2. Compose a post about them
  3. Submit

However, that workflow may not be best for some applications, which might be better served by an all-in-one approach, i.e.:

  1. Compose a post containing media file references
  2. Submit in one call

For instance, the Freesky project, which creates its own content batch files and then immediately uploads them and requests that they be announced on-chain, might be better served by an endpoint that combines the file upload & announcement request in one payload. This would allow Gateway to construct a better task pipeline with dependencies right from the start.
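A minimal sketch of what such a combined endpoint might look like (the route, DTO shape, field names, and storage path are all hypothetical):

```typescript
import { Body, Controller, Post, UploadedFiles, UseInterceptors } from '@nestjs/common';
import { FilesInterceptor } from '@nestjs/platform-express';
import { diskStorage } from 'multer';

// Hypothetical DTO: announcement metadata travels in the same multipart
// request as the asset files.
class CombinedPublishDto {
  announcementType: string;
  content: string; // e.g. a serialized batch file or post body
}

@Controller('v2/content')
export class CombinedPublishController {
  @Post('publish-with-assets')
  @UseInterceptors(
    FilesInterceptor('assets', 20, {
      storage: diskStorage({ destination: '/var/cache/content-publishing/uploads' }),
    }),
  )
  publish(
    @UploadedFiles() assets: Array<Express.Multer.File>,
    @Body() dto: CombinedPublishDto,
  ) {
    // With the assets and the announcement in hand, the service can build
    // the whole task pipeline (asset uploads as children, the announcement
    // as the parent) in one step and return a single tracking reference.
    return {
      referenceId: 'tracking-reference-goes-here',
      announcementType: dto.announcementType,
      assetCount: assets.length,
    };
  }
}
```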

To be clear, both models seem to have a place in the Gateway ecosystem, but we only support the first model right now.
