We are continuing to monitor the changes made to address the recent IPFS deployment issues, and the system is now much more reliable. Users report that deployments succeed with far fewer retries (2-3, compared with 10-20 previously), and we have received no widespread complaints in the last few hours.
Our SRE team has implemented the following recent enhancements:
- Restarted IPFS with optimized connection settings.
- Modified IPFS endpoints to better manage traffic.
- Created new dashboards to monitor errors and connection timeouts in real time.
- Reviewed and tweaked rules to ensure community node traffic is handled efficiently.
These changes have led to a noticeable improvement in deployment success rates. However, some users may still experience occasional connection timeouts, which we are actively addressing. We’re continuing to monitor the system closely and will make additional adjustments as needed. If you encounter any issues, please let us know.
Thank you for your patience and support!
Posted May 14, 2025 - 18:23 UTC
Monitoring
We’ve made substantial progress with the recent IPFS deployment issues, and the system is now demonstrating significantly improved reliability.
Our Site Reliability Engineering team has implemented several key enhancements, including:
- Applied targeted rules to block suspicious traffic and reduce system load.
- Upgraded IPFS Kubo on both testnet and mainnet to include critical stability improvements.
- Adjusted nginx connection limits to eliminate "Cannot assign requested address" errors, improving proxy stability.
- Resolved a misconfigured nginx caching rule that was returning incorrect IPFS hashes for different files (a quick way to verify the content you receive is sketched below).
These improvements have resulted in more consistent and successful IPFS deployments. We continue to actively monitor system performance and are working on further optimizations to maintain long-term stability.
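If you want to confirm that the content you receive matches what other gateways serve for the same CID (relevant to the caching fix above), the sketch below compares response digests across two gateways. This is illustrative only: the primary gateway URL is a placeholder, not one of our actual endpoints, and the script is not part of our tooling.

```python
import hashlib

import requests

# Placeholder gateway URLs for illustration; substitute the gateway you actually use.
PRIMARY_GATEWAY = "https://ipfs.example-gateway.com/ipfs/"
REFERENCE_GATEWAY = "https://ipfs.io/ipfs/"


def content_digest(gateway: str, cid: str) -> str:
    """Fetch a CID from a gateway and return the SHA-256 digest of the response body."""
    resp = requests.get(gateway + cid, timeout=30)
    resp.raise_for_status()
    return hashlib.sha256(resp.content).hexdigest()


def gateways_agree(cid: str) -> bool:
    """Return True if both gateways serve byte-identical content for the given CID."""
    return content_digest(PRIMARY_GATEWAY, cid) == content_digest(REFERENCE_GATEWAY, cid)
```

If the digests ever disagree, please include the CID when you report the issue so we can investigate.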
Thank you for your continued patience and support.
Posted May 13, 2025 - 22:42 UTC
Update
We've upgraded our IPFS infrastructure to address deployment issues caused by memory limits being exceeded. This update includes fixes for resource leaks that were contributing to the problem. We've also blocked several suspicious IP addresses that may have been overloading the system. While IPFS stability has improved, the root issue is not yet fully resolved. We appreciate your continued patience as we work toward a complete fix.
Posted May 12, 2025 - 19:26 UTC
Identified
We've identified an issue in which our internal IPFS proxy's in-memory cache and aggressive retry logic are causing elevated load and intermittent timeouts. Our engineering team is implementing improved exponential back-off in our fetch workflows and is evaluating more durable, decoupled caching solutions to ensure continued stability. We appreciate your continued patience as we work to resolve this.
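As a rough illustration of the exponential back-off approach described above (this is not our internal implementation; the gateway URL and timing parameters are placeholders), a jittered retry loop might look like the following:

```python
import random
import time

import requests

# Placeholder endpoint for illustration only; not our actual proxy or gateway.
IPFS_GATEWAY = "https://ipfs.example-gateway.com/ipfs/"


def fetch_with_backoff(cid: str, max_attempts: int = 5) -> bytes:
    """Fetch an IPFS object, waiting exponentially longer between failed attempts."""
    delay = 1.0  # seconds before the first retry
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.get(IPFS_GATEWAY + cid, timeout=30)
            resp.raise_for_status()
            return resp.content
        except requests.RequestException:
            if attempt == max_attempts:
                raise
            # Back off 1s, 2s, 4s, ... plus jitter so concurrent clients don't retry in lockstep.
            time.sleep(delay + random.uniform(0, 0.5))
            delay *= 2
```

Retrying with jittered back-off like this also reduces the load spikes that aggressive retries were placing on the proxy.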
Posted May 07, 2025 - 14:29 UTC
Investigating
We are currently investigating the issue. In the meantime, deployments may succeed after multiple retries.
Posted May 06, 2025 - 13:20 UTC
This incident affects: Upgrade Indexer - Miscellaneous (IPFS).