Critical Apache Tika Vulnerabilities Surface Despite Previous Patches
A serious security vulnerability has resurfaced in Apache Tika, a popular toolkit for detecting file types and extracting text and metadata from documents. The flaw was first made public last summer and was believed to be fixed, but recent alerts reveal that it reaches more components and is more dangerous than previously thought. Users and developers alike need to act immediately.
Details of the Flaws and Their Impact
The core issue involves two related vulnerabilities, tracked as CVE-2025-54988 and CVE-2025-66516. The first, rated 8.4 out of 10 in severity, was disclosed in August and affects the PDF parsing module in Apache Tika. Tika as a whole extracts and normalizes data from over a thousand file formats, and PDF handling is a key part of that ecosystem. Unfortunately, the module can be exploited through XML External Entity (XXE) injection: a crafted XFA form embedded in a PDF can make the XML parser resolve attacker-controlled external entities, which could allow attackers to read sensitive data or send malicious requests to other systems.
The second vulnerability was revealed last week and carries the maximum severity rating of 10.0. It turns out that the initial flaw isn’t limited to the PDF module. Other components, including the tika-core library and the tika-parsers module, are also vulnerable. This led to a second CVE that covers the broader set of affected components. In practice, users who patched only the PDF module for the first vulnerability may still be at risk if tika-core and the other components were not updated as well.
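For readers unfamiliar with this attack class, the sketch below shows what a typical XXE payload looks like and how a generic Java XML parser can be configured to refuse it. It is an illustration of the general technique using the standard JAXP API, not Tika’s actual fix; the class name XxeHardeningSketch and its helper methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical illustration of generic XXE hardening; not taken from Tika.
final class XxeHardeningSketch {

    // Typical payload shape: the DOCTYPE declares an external entity that
    // points at a local file, and the document body references it.
    static final String PAYLOAD =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE data [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
        + "<data>&xxe;</data>";

    // Builds a parser that refuses any document containing a DOCTYPE,
    // which removes the place where external entities are declared.
    static DocumentBuilder hardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Returns true if the hardened parser refuses the document
    // (this includes documents that are simply malformed).
    static boolean isRejected(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            hardenedBuilder().parse(new ByteArrayInputStream(bytes));
            return false;
        } catch (SAXException e) {
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("XXE payload rejected: " + isRejected(PAYLOAD));
        System.out.println("Plain document rejected: " + isRejected("<data>ok</data>"));
    }
}
```

Rejecting DOCTYPE declarations outright is the bluntest of the common defenses. It suits applications that never need DTDs, but it is only one layer and does not replace upgrading the affected Tika components.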
What Developers and Users Need to Do
Currently, there’s no evidence that attackers are actively exploiting these flaws in the wild. However, security experts warn that this could change quickly, especially once proof-of-concept exploits are published or the patches are reverse-engineered. Because of the severity, it’s crucial for anyone using Apache Tika to update their software as soon as possible.
To mitigate the risks, users should update to the patched versions: tika-core 3.2.2, tika-parser-pdf-module 3.2.2, and, for those still on the legacy 1.x line, tika-parsers 2.0.0. Patching all of these components, not just the PDF module, is what closes the vulnerabilities. However, a challenge remains: many applications do not list Tika explicitly in their build or configuration files, for example because it is pulled in indirectly by another library, and this creates blind spots. In such cases, disabling XML parsing altogether in the application settings could be a safer move until updates are fully applied.
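One way to look for such blind spots is to scan a project’s resolved dependency list for Tika artifacts older than the patched versions named above. The sketch below assumes a Maven build and input lines in the format printed by the dependency:tree goal; the class name TikaDependencyAudit and its methods are hypothetical, and the check only handles plain numeric versions without classifiers or qualifiers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: flags Tika artifacts older than the patched versions.
final class TikaDependencyAudit {

    // Patched versions as listed in this article.
    static final Map<String, String> FIXED = Map.of(
        "tika-core", "3.2.2",
        "tika-parser-pdf-module", "3.2.2",
        "tika-parsers", "2.0.0");

    // Matches coordinates such as "org.apache.tika:tika-core:jar:3.2.1:compile".
    static final Pattern COORD = Pattern.compile(
        "org\\.apache\\.tika:([\\w.-]+):[\\w-]+:(\\d+(?:\\.\\d+)*)");

    // Compares dotted numeric versions; missing parts count as zero.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Returns one "artifact:found -> fixed" entry per outdated Tika artifact.
    static List<String> findOutdated(List<String> treeLines) {
        List<String> hits = new ArrayList<>();
        for (String line : treeLines) {
            Matcher m = COORD.matcher(line);
            if (!m.find()) {
                continue;
            }
            String fixed = FIXED.get(m.group(1));
            if (fixed != null && compareVersions(m.group(2), fixed) < 0) {
                hits.add(m.group(1) + ":" + m.group(2) + " -> " + fixed);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample lines in dependency:tree format; real input would come from the build.
        List<String> sample = List.of(
            "[INFO] +- org.apache.tika:tika-core:jar:3.2.1:compile",
            "[INFO] \\- org.apache.tika:tika-parser-pdf-module:jar:3.2.2:compile");
        findOutdated(sample).forEach(System.out::println);
    }
}
```

In a Maven project, the input could be produced with mvn dependency:tree -Dincludes=org.apache.tika, which also lists Tika when it arrives through another library. Other build tools print dependencies in different formats, so the pattern would need adjusting there.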
Overall, this situation underscores the importance of staying alert to security updates and thoroughly reviewing dependencies. While Apache Tika remains a powerful tool, users must act swiftly to prevent potential exploitation of these serious flaws.