Store cached child els in Attributes, not field by jhy · Pull Request #2307 · jhy/jsoup
This MR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| flow-bin (changelog) | devDependencies | minor | ^0.247.0 -> ^0.274.0 |
| org.postgresql:postgresql (source) | build | patch | 42.7.4 -> 42.7.7 |
| org.liquibase:liquibase-maven-plugin (source) | build | minor | 4.29.2 -> 4.32.0 |
| org.jsoup:jsoup (source) | compile | minor | 1.18.1 -> 1.21.1 |
| net.java.dev.jna:jna | compile | minor | 5.15.0 -> 5.17.0 |
| io.hypersistence:hypersistence-utils-hibernate-70 | compile | patch | 3.10.0 -> 3.10.1 |
| com.diffplug.spotless:spotless-maven-plugin | build | minor | 2.43.0 -> 2.44.5 |
| org.apache.maven.plugins:maven-enforcer-plugin | build | minor | 3.5.0 -> 3.6.0 |
| org.apache.maven.plugins:maven-compiler-plugin | build | minor | 3.13.0 -> 3.14.0 |
Release Notes
flowtype/flow-bin
v0.274.2
v0.274.1
v0.274.0
v0.273.1
v0.272.2
v0.272.1
v0.272.0
v0.271.0
v0.270.0
v0.269.1
v0.268.0
v0.267.0
v0.266.1
v0.266.0
v0.265.3
v0.265.2
v0.265.1
v0.265.0
v0.264.0
v0.263.0
v0.262.0
v0.261.2
v0.261.1
v0.261.0
v0.260.0
v0.259.1
v0.259.0
v0.258.1
v0.258.0
v0.257.1
v0.257.0
v0.256.0
v0.255.0
v0.254.2
v0.254.1
v0.254.0
v0.253.0
v0.252.0
v0.251.1
v0.251.0
v0.250.0
v0.249.0
v0.248.1
v0.248.0
pgjdbc/pgjdbc
v42.7.7
Security
- security: Client Allows Fallback to Insecure Authentication Despite channelBinding=require configuration.
Fix `channel binding required` handling to reject non-SASL authentication. Previously, when channel binding was set to "require", the driver would silently ignore this requirement for non-SASL authentication methods. This could lead to a false sense of security when channel binding was explicitly requested but not actually enforced. The fix ensures that when channel binding is set to "require", the driver will reject connections that use non-SASL authentication methods or when SASL authentication has not completed properly. See the Security Advisory for more detail. Reported by George MacKerron. CVE-2025-49146 has been issued for this vulnerability.
Added
- test: Added ChannelBindingRequiredTest to verify proper behavior of channel binding settings
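For consumers of this upgrade, the fix only changes behavior when a connection explicitly requests channel binding. The following is a minimal sketch, assuming the `channelBinding` setting named in the advisory title is supplied as a connection property through `java.util.Properties`. The endpoint, credentials, and the `ChannelBindingConfig` class are hypothetical placeholders, and `sslmode` is included on the assumption that channel binding is used over TLS.

```java
import java.util.Properties;

public class ChannelBindingConfig {
    // Hypothetical endpoint, for illustration only.
    static final String URL = "jdbc:postgresql://db.example.internal:5432/appdb";

    /** Builds connection properties that request mandatory channel binding. */
    static Properties requireChannelBinding(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Assumption: channel binding is negotiated over a TLS connection.
        props.setProperty("sslmode", "require");
        // The setting whose enforcement the 42.7.7 fix addresses.
        props.setProperty("channelBinding", "require");
        return props;
    }

    public static void main(String[] args) {
        Properties props = requireChannelBinding("app_user", "change-me");
        // With the driver on the classpath, these properties would be passed to
        // java.sql.DriverManager.getConnection(URL, props).
        System.out.println(URL + " channelBinding=" + props.getProperty("channelBinding"));
    }
}
```

According to the release note above, with 42.7.7 a connection configured this way should be rejected when the server falls back to a non-SASL authentication method, rather than silently succeeding.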
v42.7.6
Features
- fix: Enhanced DatabaseMetadata.getIndexInfo() method, added index comment as REMARKS property MR #3513
Performance Improvements
- performance: Improve ResultSetMetadata.fetchFieldMetaData by using IN row values instead of UNION ALL for improved query performance (later reverted) MR #3510
- feat: Use a single simple query for all startup parameters, so groupStartupParameters is no longer needed MR #3613
v42.7.5
Added
- ci: Test with Java 23 MR #3381
Fixed
- regression: revert change in `fc60537` MR #3476
- fix: PgDatabaseMetaData implementation of catalog as param and return value MR #3390
- fix: Support default GSS credentials in the Java Postgres client MR #3451
- fix: return only the transactions accessible by the current_user in XAResource.recover MR #3450
- feat: don't force send extra_float_digits for PostgreSQL >= 12 fix Issue #3432 MR #3446
- fix: exclude "include columns" from the list of primary keys MR #3434
- perf: Enhance the meta query performance by specifying the oid. MR #3427
- feat: support getObject(int, byte[].class) for bytea MR #3274
- docs: document infinity and some minor edits MR #3407
- fix: Added way to check for major server version, fixed check for RULE MR #3402
- docs: fixed remaining paragraphs MR #3398
- docs: fixed paragraphs in javadoc comments MR #3397
- fix: Reuse buffers and reduce allocations in GSSInputStream addresses Issue #3251 MR #3255
- chore: Update Gradle to 8.10.2 MR #3388
- fix: getSchemas() MR #3386
- fix: Update rpm postgresql-jdbc.spec.tpl with scram-client MR #3324
- fix: Clearing thisRow and rowBuffer on close() of ResultSet Issue #3383 MR #3384
- fix: Package was renamed to maven-bundle-plugin MR #3382
- fix: As of version 18 the RULE privilege has been removed MR #3378
- fix: use buffered inputstream to create GSSInputStream MR #3373
- test: get rid of 8.4, 9.0 pg versions and use >= jdk version 17 MR #3372
- Changed docker-compose version and renamed script file in instructions to match the real file name MR #3363
- test: Do not assume "test" database in DatabaseMetaDataTransactionIsolationTest MR #3364
- try to categorize dependencies MR #3362
liquibase/liquibase
v4.32.0
See the Liquibase 4.32.0 Release Notes for the complete set of release information.
v4.31.1
[!IMPORTANT] Liquibase 4.31.1 patches a vulnerability found in the Snowflake driver (CVE-2025-24789) and resolves an issue with include and logicalfilepath reported in 4.31.0 (see 4.31.0 Release Notes)
[!NOTE] See the Liquibase 4.31.1 Release Notes for the complete set of release information.
v4.31.0
[!NOTE] See the Liquibase 4.31.0 Release Notes for the complete set of release information.
v4.30.0
[!IMPORTANT] Liquibase 4.30.0 contains new capabilities and notable enhancements for Liquibase OSS and Pro users including Anonymous Analytics and deprecation of the MacOS dmg installer.
[!NOTE] See the Liquibase 4.30.0 Release Notes for the complete set of release information.
jhy/jsoup
v1.21.1
Changes
- Removed previously deprecated methods. #2317
- Deprecated the `:matchText` pseudo-selector due to its side effects on the DOM; use the new `::textnode` selector and the `Element#selectNodes(String css, Class type)` method instead. #2343
- Deprecated `Connection.Response#bufferUp()` in lieu of `Connection.Response#readFully()` which can throw a checked IOException.
- Deprecated internal methods `Validate#ensureNotNull` (replaced by typed `Validate#expectNotNull`); protected HTML appenders from Attribute and Node.
- If you happen to be using any of the deprecated methods, please take the opportunity now to migrate away from them, as they will be removed in a future release.
Improvements
- Enhanced the `Selector` to support direct matching against nodes such as comments and text nodes. For example, you can now find an element that follows a specific comment: `::comment:contains(prices) + p` will select `p` elements immediately after a `<!-- prices: -->` comment. Supported types include `::node`, `::leafnode`, `::comment`, `::text`, `::data`, and `::cdata`. Node contextual selectors like `::node:contains(text)`, `:matches(regex)`, and `:blank` are also supported. Introduced `Element#selectNodes(String css)` and `Element#selectNodes(String css, Class nodeType)` for direct node selection. #2324
- Added `TagSet#onNewTag(Consumer<Tag> customizer)`: register a callback that’s invoked for each new or cloned Tag when it’s inserted into the set. Enables dynamic tweaks of tag options (for example, marking all custom tags as self-closing, or everything in a given namespace as preserving whitespace).
- Made `TokenQueue` and `CharacterReader` autocloseable, to ensure that they will release their buffers back to the buffer pool, for later reuse.
- Added `Selector#evaluatorOf(String css)`, as a clearer way to obtain an Evaluator from a CSS query. An alias of `QueryParser.parse(String css)`.
- Custom tags (defined via the `TagSet`) in a foreign namespace (e.g. SVG) can be configured to parse as data tags.
- Added `NodeVisitor#traverse(Node)` to simplify node traversal calls (vs. importing `NodeTraversor`).
- Updated the default user-agent string to improve compatibility. #2341
- The HTML parser now allows the specific text-data type (Data, RcData) to be customized for known tags. (Previously, that was only supported on custom tags.) #2326.
- Added `Connection#readFully()` as a replacement for `Connection#bufferUp()` with an explicit IOException. Similarly, added `Connection#readBody()` over `Connection#body()`. Deprecated `Connection#bufferUp()`. #2327
- When serializing HTML, the `<` and `>` characters are now escaped in attributes. This helps prevent a class of mutation XSS attacks. #2337
- Changed `Connection` to prefer using the JDK's HttpClient over HttpUrlConnection, if available, to enable HTTP/2 support by default. Users can disable via `-Djsoup.useHttpClient=false`. #2340
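Because this upgrade crosses the release where the HttpClient transport becomes the default, projects that want to keep the previous behavior may prefer to set the switch in code rather than on the command line. The following is a minimal sketch using only the `jsoup.useHttpClient` system property named in the release notes. The `JsoupTransportToggle` class is hypothetical, and it assumes the property is read when the first request is made, so it should be set early in application startup.

```java
public class JsoupTransportToggle {
    // Property name taken from the jsoup release notes.
    static final String PROPERTY = "jsoup.useHttpClient";

    /** Opts out of the HttpClient transport; intended to run before any request is made. */
    static void preferLegacyTransport() {
        System.setProperty(PROPERTY, "false");
    }

    public static void main(String[] args) {
        preferLegacyTransport();
        // Equivalent to starting the JVM with -Djsoup.useHttpClient=false
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
    }
}
```

Setting the property on the JVM command line, as the release notes describe, avoids any dependence on initialization order.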
Bug Fixes
- The contents of a `script` in a `svg` foreign context should be parsed as script data, not text. #2320
- `Tag#isFormSubmittable()` was updating the Tag's options. #2323
- The HTML pretty-printer would incorrectly trim whitespace when text followed an inline element in a block element. #2325
- Custom tags with hyphens or other non-letter characters in their names now work correctly as Data or RcData tags. Their closing tags are now tokenized properly. #2332
- When cloning an Element, the clone would retain the source's cached child Element list (if any), which could lead to incorrect results when modifying the clone's child elements. #2334
v1.20.1
Changes
- To better follow the HTML5 spec and current browsers, the HTML parser no longer allows self-closing tags (`<foo />`) to close HTML elements by default. Foreign content (SVG, MathML), and content parsed with the XML parser, still supports self-closing tags. If you need specific HTML tags to support self-closing, you can register a custom tag via the `TagSet` configured in `Parser.tagSet()`, using `Tag#set(Tag.SelfClose)`. Standard void tags (such as `<img>`, `<br>`, etc.) continue to behave as usual and are not affected by this change. #2300.
- The following internal components have been deprecated. If you do happen to be using any of these, please take the opportunity now to migrate away from them, as they will be removed in jsoup 1.21.1.
  - `ChangeNotifyingArrayList`, `Document.updateMetaCharsetElement()`, `Document.updateMetaCharsetElement(boolean)`, `HtmlTreeBuilder.isContentForTagData(String)`, `Parser.isContentForTagData(String)`, `Parser.setTreeBuilder(TreeBuilder)`, `Tag.formatAsBlock()`, `Tag.isFormListed()`, `TokenQueue.addFirst(String)`, `TokenQueue.chompTo(String)`, `TokenQueue.chompToIgnoreCase(String)`, `TokenQueue.consumeToIgnoreCase(String)`, `TokenQueue.consumeWord()`, `TokenQueue.matchesAny(String...)`
Functional Improvements
- Rebuilt the HTML pretty-printer, to simplify and consolidate the implementation, improve consistency, support custom Tags, and provide a cleaner path for ongoing improvements. The specific HTML produced by the pretty-printer may be different from previous versions. #2286.
- Added the ability to define custom tags, and to modify properties of known tags, via the `TagSet` tag collection. Their properties can impact both the parse and how content is serialized (output as HTML or XML). #2285.
- `Element.cssSelector()` will prefer to return shorter selectors by using ancestor IDs when available and unique. E.g. `#id > div > p` instead of `html > body > div > div > p`. #2283.
- Added `Elements.deselect(int index)`, `Elements.deselect(Object o)`, and `Elements.deselectAll()` methods to remove elements from the `Elements` list without removing them from the underlying DOM. Also added `Elements.asList()` method to get a modifiable list of elements without affecting the DOM. (Individual Elements remain linked to the DOM.) #2100.
- Added support for sending a request body from an InputStream with `Connection.requestBodyStream(InputStream stream)`. #1122.
- The XML parser now supports scoped xmlns: prefix namespace declarations, and applies the correct namespace to Tags and Attributes. Also, added `Tag#prefix()`, `Tag#localName()`, `Attribute#prefix()`, `Attribute#localName()`, and `Attribute#namespace()` to retrieve these. #2299.
- CSS identifiers are now escaped and unescaped correctly to the CSS spec. `Element#cssSelector()` will emit appropriately escaped selectors, and the QueryParser supports those. Added `Selector.escapeCssIdentifier()` and `Selector.unescapeCssIdentifier()`. #2297, #2305
Structure and Performance Improvements
- Refactored the CSS `QueryParser` into a clearer recursive descent parser. #2310.
- CSS selectors with consecutive combinators (e.g. `div >> p`) will throw an explicit parse exception. #2311.
- Performance: reduced the shallow size of an Element from 40 to 32 bytes, and the NodeList from 32 to 24. #2307.
- Performance: reduced GC load of new StringBuilders when tokenizing input HTML. #2304.
- Made `Parser` instances threadsafe, so that inadvertent use of the same instance across threads will not lead to errors. For actual concurrency, use `Parser#newInstance()` per thread. #2314.
Bug Fixes
- Element names containing characters invalid in XML are now normalized to valid XML names when serializing. #1496.
- When serializing to XML, characters that are invalid in XML 1.0 should be removed (not encoded). #1743.
- When converting a `Document` to the W3C DOM in `W3CDom`, elements with an attribute in an undeclared namespace now get a declaration of `xmlns:prefix="undefined"`. This allows subsequent serialization to XML via `W3CDom.asString()` to succeed. #2087.
- The `StreamParser` could emit the final elements of a document twice, due to how `onNodeCompleted` was fired when closing out the stack. #2295.
- When parsing with the XML parser and error tracking enabled, the trailing `?` in `<?xml version="1.0"?>` would incorrectly emit an error. #2298.
- Calling `Element#cssSelector()` on an element with combining characters in the class or ID now produces the correct output. #1984.
v1.19.1
Changes
- Added support for http/2 requests in `Jsoup.connect()`, when running on Java 11+, via the Java HttpClient implementation. #2257.
  - In this version of jsoup, the default is to make requests via the HttpUrlConnection implementation: use `System.setProperty("jsoup.useHttpClient", "true");` to enable making requests via the HttpClient instead, which will enable http/2 support, if available. This will become the default in a later version of jsoup, so now is a good time to validate it.
  - If you are repackaging the jsoup jar in your deployment (i.e. creating a shaded- or a fat-jar), make sure to specify that as a Multi-Release JAR.
  - If the `HttpClient` impl is not available in your JRE, requests will continue to be made via `HttpURLConnection` (in `http/1.1` mode).
- Updated the minimum Android API Level validation from 10 to 21. As with previous jsoup versions, Android developers need to enable core library desugaring. The minimum Java version remains Java 8. #2173
- Removed previously deprecated class: `org.jsoup.UncheckedIOException` (replace with `java.io.UncheckedIOException`); moved previously deprecated method `Element Element#forEach(Consumer)` to `void Element#forEach(Consumer())`. #2246
- Deprecated the methods `Document#updateMetaCharsetElement(boolean)` and `Document#updateMetaCharsetElement()`, as the setting had no effect. When `Document#charset(Charset)` is called, the document's meta charset or XML encoding instruction is always set. #2247
Improvements
- When cleaning HTML with a `Safelist` that preserves relative links, the `isValid()` method will now consider these links valid. Additionally, the enforced attribute `rel=nofollow` will only be added to external links when configured in the safelist. #2245
- Added `Element#selectStream(String query)` and `Element#selectStream(Evaluator)` methods, that return a `Stream` of matching elements. Elements are evaluated and returned as they are found, and the stream can be terminated early. #2092
- `Element` objects now implement `Iterable`, enabling them to be used in enhanced for loops.
- Added support for fragment parsing from a `Reader` via `Parser#parseFragmentInput(Reader, Element, String)`. #1177
- Reintroduced CLI executable examples, in `jsoup-examples.jar`. #1702
- Optimized performance of selectors like `#id .class` (and other similar descendant queries) by around 4.6x, by better balancing the Ancestor evaluator's cost function in the query planner. #2254
- Removed the legacy parsing rules for `<isindex>` tags, which would autovivify a `form` element with labels. This is no longer in the spec.
- Added `Elements.selectFirst(String cssQuery)` and `Elements.expectFirst(String cssQuery)`, to select the first matching element from an `Elements` list. #2263
- When parsing with the XML parser, XML Declarations and Processing Instructions are directly handled, vs bouncing through the HTML parser's bogus comment handler. Serialization for non-doctype declarations no longer ends with a spurious `!`. #2275
- When converting parsed HTML to XML or the W3C DOM, element names containing `<` are normalized to `_` to ensure valid XML. For example, `<foo<bar>` becomes `<foo_bar>`, as XML does not allow `<` in element names, but HTML5 does. #2276
- Reimplemented the HTML5 Adoption Agency Algorithm to the current spec. This handles mis-nested formatting / structural elements. #2278
Bug Fixes
- If an element has an `;` in an attribute name, it could not be converted to a W3C DOM element, and so subsequent XPath queries could miss that element. Now, the attribute name is more completely normalized. #2244
- For backwards compatibility, reverted the internal attribute key for doctype names to "name". #2241
- In `Connection`, skip cookies that have no name, rather than throwing a validation exception. #2242
- When running on JDK 1.8, the error `java.lang.NoSuchMethodError: java.nio.ByteBuffer.flip()Ljava/nio/ByteBuffer;` could be thrown when calling `Response#body()` after parsing from a URL and the buffer size was exceeded. #2250
- For backwards compatibility, allow `null` InputStream inputs to `Jsoup.parse(InputStream stream, ...)`, by returning an empty `Document`. #2252
- A `template` tag containing an `li` within an open `li` would be parsed incorrectly, as it was not recognized as a "special" tag (which have additional processing rules). Also, added the SVG and MathML namespace tags to the list of special tags. #2258
- A `template` tag containing a `button` within an open `button` would be parsed incorrectly, as the "in button scope" check was not aware of the `template` element. Corrected other instances including MathML and SVG elements, also. #2271
- An `:nth-child` selector with a negative digit-less step, such as `:nth-child(-n+2)`, would be parsed incorrectly as a positive step, and so would not match as expected. #1147
- Calling `doc.charset(charset)` on an empty XML document would throw an `IndexOutOfBoundsException`. #2266
- Fixed a memory leak when reusing a nested `StructuralEvaluator` (e.g., a selector ancestor chain like `A B C`) by ensuring cache reset calls cascade to inner members. #2277
- Concurrent calls to `doc.clone().append(html)` were not supported. When a document was cloned, its `Parser` was not cloned but was a shallow copy of the original parser. #2281
v1.18.3
Bug Fixes
- When serializing to XML, attribute names containing `-`, `.`, or digits were incorrectly marked as invalid and removed. 2235
v1.18.2
Improvements
- Optimized the throughput and memory use throughout the input read and parse flows, with heap allocations and GC down between -6% and -89%, and throughput improved up to +143% for small inputs. Most input sizes will see throughput increases of ~ 20%. These performance improvements come through recycling the backing `byte[]` and `char[]` arrays used to read and parse the input. 2186
- Speed optimized `html()` and `Entities.escape()` when the input contains UTF characters in a supplementary plane, by around 49%. 2183
- The form associated elements returned by `FormElement.elements()` now reflect changes made to the DOM, subsequently to the original parse. 2140
- In the `TreeBuilder`, the `onNodeInserted()` and `onNodeClosed()` events are now also fired for the outermost / root `Document` node. This enables source position tracking on the Document node (which was previously unset). And it also enables the node traversor to see the outer Document node. 2182
- Selected Elements can now be position swapped inline using `Elements#set()`. 2212
Bug Fixes
- `Element.cssSelector()` would fail if the element's class contained a `*` character. 2169
- When tracking source ranges, a text node following an invalid self-closing element may be left untracked. 2175
- When a document has no doctype, or a doctype not named `html`, it should be parsed in Quirks Mode. 2197
- With a selector like `div:has(span + a)`, the `has()` component was not working correctly, as the inner combining query caused the evaluator to match those against the outer's siblings, not children. 2187
- A selector query that included multiple `:has()` components in a nested `:has()` might incorrectly execute. 2131
- When cookie names in a response are duplicated, the simple view of cookies available via `Connection.Response#cookies()` will provide the last one set. Generally it is better to use the Jsoup.newSession method to maintain a cookie jar, as that applies appropriate path selection on cookies when making requests. 1831
- When parsing named HTML entities, base entities should resolve if they are a prefix of the input token (and not in an attribute). 2207
- Fixed incorrect tracking of source ranges for attributes merged from late-occurring elements that were implicitly created (`html` or `body`). 2204
- Follow the current HTML specification in the tokenizer to allow `<` as part of a tag name, instead of emitting it as a character node. 2230
- Similarly, allow a `<` as the start of an attribute name, vs creating a new element. The previous behavior was intended to parse closer to what we anticipated the author's intent to be, but that does not align to the spec or to how browsers behave. 1483
java-native-access/jna
v5.17.0
Features
Bug Fixes
- #1647: Fix calls to jnidispatch on Android with 16KB page size (part 2) - @BugsBeGone.
v5.16.0
Features
- #1626: Add caching of field list and field validation in `Structure` along with more efficient reentrant read-write locking instead of synchronized() blocks - @BrettWooldridge
Bug Fixes
- #1618: Fix calls to jnidispatch on Android with 16KB page size - @Thomyrock
vladmihalcea/hypersistence-utils
v3.10.1
Update description in pom.xml to mention support of Hibernate 6.6 #790
Remove the central-publishing-maven-plugin dependency #789
diffplug/spotless
v2.44.0
Added
- New static method to `DiffMessageFormatter` which allows retrieving diffs with their line numbers (#1960)
- Gradle - Support for formatting shell scripts via shfmt. (#1994)
Fixed
- Fix empty files with biome >= 1.5.0 when formatting files that are in the ignore list of the biome configuration file. (#1989 fixes #1987)
- Fix a regression in BufStep where the same arguments were being provided to every `buf` invocation. (#1976)
Changed
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Enabled.
♻ Rebasing: Whenever MR is behind base branch, or you tick the rebase/retry checkbox.
👻 Immortal: This MR will be recreated if closed unmerged. Get config help if that's undesired.
- If you want to rebase/retry this MR, check this box
This MR has been generated by Renovate Bot.