Roland Shoemaker
59706cdaa8
html: impose open element stack size limit
...
The HTML specification contains a number of algorithms which are
quadratic in complexity by design. Instead of adding complicated
workarounds to prevent these cases from becoming extremely expensive in
pathological cases, we impose a limit of 512 on the size of the stack of
open elements. It is extremely unlikely that non-adversarial HTML
documents will ever hit this limit (but if we see cases of this, we may
want to make the limit configurable via a ParseOption).
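A minimal sketch of the idea, assuming a plain slice-backed stack: the names elementStack, push and errStackTooDeep are hypothetical and are not taken from the package, and how the real parser reports the condition is not stated in this message. Only the limit of 512 comes from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// maxStackSize mirrors the limit of 512 named in the commit message.
const maxStackSize = 512

// errStackTooDeep is a hypothetical error value used only in this sketch.
var errStackTooDeep = errors.New("open element stack size limit exceeded")

// elementStack is a hypothetical stand-in for the parser's stack of open elements.
type elementStack []string

// push appends a tag name, refusing to grow the stack past maxStackSize.
func (s *elementStack) push(tag string) error {
	if len(*s) >= maxStackSize {
		return errStackTooDeep
	}
	*s = append(*s, tag)
	return nil
}

func main() {
	var s elementStack
	// Simulate a pathologically deep document of nested <div> elements.
	for i := 0; i < 1000; i++ {
		if err := s.push("div"); err != nil {
			fmt.Printf("stopped at depth %d: %v\n", len(s), err)
			return
		}
	}
}
```

Capping the depth bounds the cost of any algorithm that walks the whole stack once per token, which is where the quadratic behaviour comes from.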
Thanks to Guido Vranken and Jakub Ciolek for both independently
reporting this issue.
Fixes CVE-2025-47911
Fixes golang/go#75682
Change-Id: I890517b189af4ffbf427d25d3fde7ad7ec3509ad
Reviewed-on: https://go-review.googlesource.com/c/net/+/709876
Reviewed-by: Damien Neil <dneil@google.com >
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
2025-10-07 11:18:01 -07:00
Roland Shoemaker
6ec8895aa5
html: align in row insertion mode with spec
...
Update inRowIM to match the HTML specification. This fixes an issue
where a specific HTML document could cause the parser to enter an
infinite loop when trying to parse a </tbody> and implied </tr> next to
each other.
Fixes CVE-2025-58190
Fixes golang/go#70179
Change-Id: Idcb133c87c7d475cc8c7eb1f1550ea21d8bdddea
Reviewed-on: https://go-review.googlesource.com/c/net/+/709875
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Reviewed-by: Damien Neil <dneil@google.com >
2025-10-07 11:17:53 -07:00
cuishuang
875d966983
all: fix some comments
...
Including mismatched function names/struct names, repeated words, typos, etc.
Change-Id: Ia576274bce6e6fbfe4d2fca6dcd6d31bf00936fb
Reviewed-on: https://go-review.googlesource.com/c/net/+/683875
Auto-Submit: Sean Liao <sean@liao.dev >
Reviewed-by: Mark Freeman <markfreeman@google.com >
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Reviewed-by: Michael Knyszek <mknyszek@google.com >
Reviewed-by: Sean Liao <sean@liao.dev >
2025-09-15 17:28:39 -07:00
Roland Shoemaker
e1fcd82abb
html: properly handle trailing solidus in unquoted attribute value in foreign content
...
The parser properly treats tags like <p a=/> as <p a="/">, but the
tokenizer emits the SelfClosingTagToken token incorrectly. When the
parser is used to parse foreign content, this results in an incorrect
DOM.
Thanks to Sean Ng (https://ensy.zip) for reporting this issue.
Fixes golang/go#73070
Fixes CVE-2025-22872
Change-Id: I65c18df6d6244bf943b61e6c7a87895929e78f4f
Reviewed-on: https://go-review.googlesource.com/c/net/+/661256
Reviewed-by: Neal Patel <nealpatel@google.com >
Reviewed-by: Roland Shoemaker <roland@golang.org >
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Auto-Submit: Gopher Robot <gobot@golang.org >
2025-03-27 12:51:24 -07:00
Pukki
312450e473
html: ensure <search> tag closes <p> and update tests
...
This change ensures that the <search> tag correctly closes an open <p> tag when encountered during parsing.
Changes:
- Added <search> to the list of elements that should close an open <p> tag in parse.go.
- Updated the second list in parse.go to ensure consistency.
- Updated html/atom/gen.go, table.go, and table_test.go accordingly.
- Modified parse_test.go to use strings.Builder instead of bytes.Buffer.
- Updated test error messages to follow Go’s conventions.
- Fixed an accidental colon in the comment in parse.go.
Change-Id: I5835da69f6bb0e14c483e55b7ae82915ae958dc1
Reviewed-on: https://go-review.googlesource.com/c/net/+/655457
Reviewed-by: Damien Neil <dneil@google.com >
Reviewed-by: Ian Lance Taylor <iant@google.com >
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Auto-Submit: Ian Lance Taylor <iant@google.com >
2025-03-12 15:46:46 -07:00
Roland Shoemaker
8e66b04771
html: use strings.EqualFold instead of lowering ourselves
...
Instead of using strings.ToLower and == to check case insensitive
equality, just use strings.EqualFold, even when the strings are only
ASCII. This prevents us unnecessarily lowering extremely long strings,
which can be a somewhat expensive operation, even if we're only
attempting to compare equality with five characters.
Thanks to Guido Vranken for reporting this issue.
Fixes golang/go#70906
Fixes CVE-2024-45338
Change-Id: I323b919f912d60dab6a87cadfdcac3e6b54cd128
Reviewed-on: https://go-review.googlesource.com/c/net/+/637536
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Auto-Submit: Gopher Robot <gobot@golang.org >
Reviewed-by: Roland Shoemaker <roland@golang.org >
Reviewed-by: Tatiana Bradley <tatianabradley@google.com >
2024-12-18 11:24:30 -08:00
yincong
b935f7b5d7
html: avoid endless loop on error token
...
Fixes #70179
Change-Id: I2a0a1fc2e96f7d8eefd0abdf7ef8ba243a6e8645
GitHub-Last-Rev: a601ecd849
GitHub-Pull-Request: golang/net#226
Reviewed-on: https://go-review.googlesource.com/c/net/+/624895
Reviewed-by: Ian Lance Taylor <iant@google.com >
Auto-Submit: Ian Lance Taylor <iant@google.com >
Reviewed-by: Roland Shoemaker <roland@golang.org >
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
2024-12-18 08:05:47 -08:00
Carlana Johnson
511cc3a406
html: add Node.{Ancestors,ChildNodes,Descendants}()
...
Adds iterators for the parents, immediate children, and all children of a Node respectively.
Fixes golang/go#62113
Change-Id: Iab015872cc3a20fe5e7cae3bc90b89cba68cc3f8
GitHub-Last-Rev: d99de580ab
GitHub-Pull-Request: golang/net#215
Reviewed-on: https://go-review.googlesource.com/c/net/+/594195
Reviewed-by: Ian Lance Taylor <iant@google.com >
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Auto-Submit: Ian Lance Taylor <iant@google.com >
Reviewed-by: Damien Neil <dneil@google.com >
2024-10-29 04:13:16 +00:00
Damien Neil
c1f5833288
all: replace deprecated io/ioutil calls
...
The io/ioutil package's features were moved to
the io and os packages in Go 1.16.
x/net depends on Go 1.18. Drop ioutil calls,
so gopls doesn't warn about them.
Change-Id: Ibdb576d94f250808ae285aa142e2fd41e7e9afc9
Reviewed-on: https://go-review.googlesource.com/c/net/+/586244
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Reviewed-by: Ian Lance Taylor <iant@google.com >
2024-05-21 19:59:00 +00:00
Eng Zer Jun
f95a3b3a48
html: fix typo in package doc
...
Change-Id: I3cacfadc0396c8ef85addd062d991bb6f5b0483a
Reviewed-on: https://go-review.googlesource.com/c/net/+/580035
Auto-Submit: Ian Lance Taylor <iant@google.com >
Reviewed-by: Ian Lance Taylor <iant@google.com >
Commit-Queue: Ian Lance Taylor <iant@google.com >
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Auto-Submit: Damien Neil <dneil@google.com >
Reviewed-by: Damien Neil <dneil@google.com >
2024-04-18 22:01:12 +00:00
Maciej Mionskowski
643fd162e3
html: fix SOLIDUS '/' handling in attribute parsing
...
Calling the Tokenizer with HTML elements containing a SOLIDUS (/) character
in the attribute name results in incorrect tokenization.
This is due to violations of the following state transitions in the WHATWG spec:
- https://html.spec.whatwg.org/multipage/parsing.html#attribute-name-state,
where we are not reconsuming the character if '/' is encountered
- https://html.spec.whatwg.org/multipage/parsing.html#after-attribute-name-state,
where we are not switching to the self-closing start tag state
Fixes golang/go#63402
Change-Id: I90d998dd8decde877bd63aa664f3657aa6161024
GitHub-Last-Rev: 3546db808c
GitHub-Pull-Request: golang/net#195
Reviewed-on: https://go-review.googlesource.com/c/net/+/533518
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Auto-Submit: Michael Pratt <mpratt@google.com >
Reviewed-by: Roland Shoemaker <roland@golang.org >
Reviewed-by: David Chase <drchase@google.com >
2024-02-07 19:23:52 +00:00
Dmitri Shuralyov
d23d9bc549
all: update go directive to 1.18
...
Done with:
go get go@1.18
go mod tidy
go fix ./...
Using go1.21.3.
With a manual change to keep golang.org/x/net/context testing itself,
not context in the standard library.
For golang/go#60268.
Change-Id: I00682bf7cf1e3ba4370e2a3e7f63dc245b294a36
Reviewed-on: https://go-review.googlesource.com/c/net/+/534241
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com >
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com >
Reviewed-by: Damien Neil <dneil@google.com >
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org >
2023-10-11 21:58:12 +00:00
Roland Shoemaker
8ffa475fbd
html: only render content literally in the HTML namespace
...
Per the WHATWG HTML specification, section 13.3, only append the literal
content of a text node if we are in the HTML namespace.
Thanks to Mohammad Thoriq Aziz for reporting this issue.
Fixes golang/go#61615
Fixes CVE-2023-3978
Change-Id: I332152904d4e7646bd2441602bcbe591fc655fa4
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/1942896
Reviewed-by: Tatiana Bradley <tatianabradley@google.com >
Run-TryBot: Roland Shoemaker <bracewell@google.com >
Reviewed-by: Damien Neil <dneil@google.com >
TryBot-Result: Security TryBots <security-trybots@go-security-trybots.iam.gserviceaccount.com >
Reviewed-on: https://go-review.googlesource.com/c/net/+/514896
Reviewed-by: Roland Shoemaker <roland@golang.org >
TryBot-Result: Gopher Robot <gobot@golang.org >
Run-TryBot: Damien Neil <dneil@google.com >
2023-08-01 17:41:59 +00:00
Roland Shoemaker
4050002696
html: handle equals sign before attribute
...
Apply the correct normalization when an equals sign appears before an
attribute name (e.g. '<tag =>' -> '<tag =="">'), per WHATWG 13.2.5.32.
Change-Id: Id21b428bd86117dd073c502767386bc718a3fb7b
Reviewed-on: https://go-review.googlesource.com/c/net/+/488695
Auto-Submit: Roland Shoemaker <roland@golang.org >
TryBot-Result: Gopher Robot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
Run-TryBot: Roland Shoemaker <roland@golang.org >
Reviewed-by: Nigel Tao (INACTIVE; USE @golang.org INSTEAD) <nigeltao@google.com >
2023-06-20 17:16:42 +00:00
Roland Shoemaker
eb1572ce7f
html: another shot at security doc
...
Be clearer about the operation of the tokenizer and the parser (and
their differences), and be explicit about the need for re-serialization
when they are being used in security contexts.
Change-Id: Ieb8f2a9d4806fb7a8849a15671667396e81c53b9
Reviewed-on: https://go-review.googlesource.com/c/net/+/484795
Auto-Submit: Roland Shoemaker <roland@golang.org >
Reviewed-by: Damien Neil <dneil@google.com >
Run-TryBot: Roland Shoemaker <roland@golang.org >
TryBot-Result: Gopher Robot <gobot@golang.org >
2023-04-17 17:44:42 +00:00
Roland Shoemaker
08dda57501
html: fix package doc typo
...
Change-Id: Ic16f297e936cf10bafe0656f5db68cd422c430aa
Reviewed-on: https://go-review.googlesource.com/c/net/+/474217
Reviewed-by: Ian Lance Taylor <iant@google.com >
Auto-Submit: Roland Shoemaker <roland@golang.org >
Run-TryBot: Roland Shoemaker <roland@golang.org >
TryBot-Result: Gopher Robot <gobot@golang.org >
2023-03-07 22:09:01 +00:00
Roland Shoemaker
8c4ef2f86b
hmtl: add security section to package comment
...
Adds a short security considerations paragraph to the package comment
detailing the differences between the parser and tokenizer.
Change-Id: I9e6840b20f82ffc6bc4088fffd6b4eda97550c0a
Reviewed-on: https://go-review.googlesource.com/c/net/+/459676
TryBot-Result: Gopher Robot <gobot@golang.org >
Reviewed-by: Damien Neil <dneil@google.com >
Run-TryBot: Roland Shoemaker <roland@golang.org >
Reviewed-by: Rob Pike <r@golang.org >
2023-03-03 16:07:56 +00:00
Nigel Tao
1d46ed8b48
html: have Render escape comments less often
...
Fixes golang/go#58246
Change-Id: I3effbd2afd7e363a42baa4db20691e57c9a08389
Reviewed-on: https://go-review.googlesource.com/c/net/+/469056
TryBot-Result: Gopher Robot <gobot@golang.org >
Run-TryBot: Nigel Tao <nigeltao@golang.org >
Reviewed-by: Bryan Mills <bcmills@google.com >
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com >
Reviewed-by: Damien Neil <dneil@google.com >
2023-02-28 08:42:21 +00:00
Nigel Tao
569fe8158c
html: add "Microsoft Outlook comment" tests
...
This only adds new tests. A follow-up commit will change behavior.
Updates golang/go#58246
Change-Id: I6adf5941d5cfd3c28f7b9328882ac280109ee028
Reviewed-on: https://go-review.googlesource.com/c/net/+/469055
TryBot-Result: Gopher Robot <gobot@golang.org >
Run-TryBot: Nigel Tao <nigeltao@golang.org >
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com >
Reviewed-by: Damien Neil <dneil@google.com >
Reviewed-by: Bryan Mills <bcmills@google.com >
2023-02-23 23:08:33 +00:00
Nigel Tao
39940adcaa
html: parse comments per HTML spec
...
Updates golang/go#58246
Change-Id: Iaba5ed65f5d244fd47372ef0c08fc4cdb5ed90f9
Reviewed-on: https://go-review.googlesource.com/c/net/+/466776
TryBot-Result: Gopher Robot <gobot@golang.org >
Auto-Submit: Nigel Tao <nigeltao@golang.org >
Reviewed-by: Damien Neil <dneil@google.com >
Run-TryBot: Nigel Tao <nigeltao@golang.org >
Reviewed-by: Nigel Tao (INACTIVE; USE @golang.org INSTEAD) <nigeltao@google.com >
2023-02-10 18:21:14 +00:00
cui fliter
415cb6d518
all: fix some comments
...
Change-Id: Iee11c27052222f017b672c06ced9e129ee51619c
Reviewed-on: https://go-review.googlesource.com/c/net/+/465996
Auto-Submit: Ian Lance Taylor <iant@google.com >
Reviewed-by: Ian Lance Taylor <iant@google.com >
Run-TryBot: Ian Lance Taylor <iant@google.com >
Reviewed-by: David Chase <drchase@google.com >
TryBot-Result: Gopher Robot <gobot@golang.org >
2023-02-08 14:49:55 +00:00
Roland Shoemaker
430a433969
html: properly handle exclamation marks in comments
...
Properly handle the case where HTML comments begin with exclamation
marks and have no other content, i.e. "<!--!-->". Previously these
comments would cause the tokenizer to treat everything following them
as part of the comment.
Fixes golang/go#37771
Change-Id: I78ea310debc3846f145d62cba017055abc7fa4e0
Reviewed-on: https://go-review.googlesource.com/c/net/+/442496
Run-TryBot: Roland Shoemaker <roland@golang.org >
TryBot-Result: Gopher Robot <gobot@golang.org >
Reviewed-by: Damien Neil <dneil@google.com >
2022-10-20 16:40:45 +00:00
cui fliter
0b7e1fb9d4
all: fix a few function names on comments
...
Change-Id: I6c853dd402d296701e38289bbc418730b068dde8
Reviewed-on: https://go-review.googlesource.com/c/net/+/441716
Auto-Submit: Ian Lance Taylor <iant@google.com >
Reviewed-by: Joedian Reid <joedian@golang.org >
Reviewed-by: Ian Lance Taylor <iant@google.com >
TryBot-Result: Gopher Robot <gobot@golang.org >
Run-TryBot: Ian Lance Taylor <iant@google.com >
2022-10-12 13:50:44 +00:00
Nigel Tao
0699458419
html: escape comment and doctype tokens' data
...
Fixes golang/go#48237
Change-Id: I309e3ad30684fb71b9b3e67dfac156da08dbc69b
Reviewed-on: https://go-review.googlesource.com/c/net/+/419334
Run-TryBot: Nigel Tao <nigeltao@golang.org >
Reviewed-by: Cherry Mui <cherryyz@google.com >
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com >
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gopher Robot <gobot@golang.org >
2022-07-26 23:03:23 +00:00
Nigel Tao
37e1c6afe0
html: ignore templates nested within foreign content
...
Fixes #46288
Fixes CVE-2021-33194
Change-Id: I2fe39702de8e9aab29965c1526e377a6f9cdf056
Reviewed-on: https://go-review.googlesource.com/c/net/+/311090
Reviewed-by: Filippo Valsorda <filippo@golang.org >
Run-TryBot: Filippo Valsorda <filippo@golang.org >
Trust: Roland Shoemaker <roland@golang.org >
TryBot-Result: Go Bot <gobot@golang.org >
2021-05-20 17:08:46 +00:00
Russ Cox
5f55cee0dc
all: go fmt ./...
...
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Not strictly necessary but will avoid spurious changes
as files are edited.
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: I5b2b7d93424e828a3c5f76ae3f30ab825aca388e
Reviewed-on: https://go-review.googlesource.com/c/net/+/294371
Trust: Russ Cox <rsc@golang.org >
Run-TryBot: Russ Cox <rsc@golang.org >
TryBot-Result: Go Bot <gobot@golang.org >
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com >
Reviewed-by: Ian Lance Taylor <iant@golang.org >
2021-02-20 03:31:24 +00:00
Kunpei Sakai
28c70e62bb
html: port html5lib tests from html5lib/html5lib-tests
...
To reproduce this, execute the following steps in order:
1. git clone git@github.com:html5lib/html5lib-tests.git && git checkout 6ddcf58bea5a01e616911050c173622f84297211
2. cp -Rv html5lib-tests/tree-construction/ testdata/webkit
Change-Id: Id32798b1ff881afad82d87c2fef0841e5223c7e6
Reviewed-on: https://go-review.googlesource.com/c/net/+/263397
Trust: Kunpei Sakai <namusyaka@gmail.com >
Trust: Nigel Tao <nigeltao@golang.org >
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Go Bot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2020-10-29 22:17:08 +00:00
Kunpei Sakai
942e2f445f
html: avoid using raw text mode even when nested noscript tags
...
Assuming "in head noscript" insertion mode, the scripting flag will be disabled.
Thus, even if nested noscript tags exist,
the tokenizer should not go into the raw text mode.
This change makes the following test happy:
<head><noscript><noscript class="foo"><!--foo--></noscript>
Change-Id: I2620e751d8be3d313c3a2e2f992b1e21ce2dc2ee
Reviewed-on: https://go-review.googlesource.com/c/net/+/263878
Trust: Kunpei Sakai <namusyaka@gmail.com >
Trust: Nigel Tao <nigeltao@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2020-10-29 05:50:24 +00:00
Kunpei Sakai
8adf50f3fe
html: avoid using raw text mode if there are raw tags to be ignored in select IM
...
This follows up on https://golang.org/cl/264977
Change-Id: I5d0e2f39173a8bbd07ca53de4df2a7e8772d4197
Reviewed-on: https://go-review.googlesource.com/c/net/+/265960
Trust: Kunpei Sakai <namusyaka@gmail.com >
Trust: Nigel Tao <nigeltao@golang.org >
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Go Bot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2020-10-29 05:33:32 +00:00
Kunpei Sakai
e0495509cf
html: skip tests for behavior outside the parsing algorithm
...
This also updates webkit/tests18.dat to latest.
Change-Id: I4ed37e918a7db63afd8d515dd3a2494699cc5b74
Reviewed-on: https://go-review.googlesource.com/c/net/+/264977
Trust: Kunpei Sakai <namusyaka@gmail.com >
Trust: Nigel Tao <nigeltao@golang.org >
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Go Bot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2020-10-29 03:25:33 +00:00
Kunpei Sakai
d65d470038
html: remove a few attributes from svg attribute adjustments
...
As per the current spec, "contentScriptType", "contentStyleType" and
"externalResourcesRequired" have been removed from the svg attribute adjustments.
See: https://html.spec.whatwg.org/multipage/parsing.html#adjust-svg-attributes
Change-Id: I904914691c3a3c3958868f7e49ba10f7d6f9ec09
Reviewed-on: https://go-review.googlesource.com/c/net/+/263398
Trust: Kunpei Sakai <namusyaka@gmail.com >
Trust: Nigel Tao <nigeltao@golang.org >
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Go Bot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2020-10-20 06:53:57 +00:00
Kunpei Sakai
4f7140c49a
html: add comments indicating that these have been removed from the spec
...
Change-Id: I44e9234fa4dc65f2b23b3a9f31ffe9d9fcefc4f1
Reviewed-on: https://go-review.googlesource.com/c/net/+/261177
Reviewed-by: Nigel Tao <nigeltao@golang.org >
Trust: Nigel Tao <nigeltao@golang.org >
Trust: Emmanuel Odeke <emm.odeke@gmail.com >
2020-10-10 22:47:23 +00:00
Nigel Tao
16171245cf
html: add the RawNode NodeType
...
Fixes golang/go#36350
Change-Id: Ia11b65940949b7da996b194d48372bc6219d4baa
Reviewed-on: https://go-review.googlesource.com/c/net/+/216800
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
2020-02-02 09:46:26 +00:00
Kunpei Sakai
e7e4b65ae6
html: improve coding style
...
Change-Id: I05c0ccbad41f5512f8096b0d15991d7d6b5d726e
Reviewed-on: https://go-review.googlesource.com/c/net/+/209398
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-12-07 00:06:13 +00:00
Kunpei Sakai
51f093181b
html: update adoption agency algorithm
...
See: https://html.spec.whatwg.org/multipage/parsing.html#adoption-agency-algorithm
This follows up on golang.org/cl/205617
Change-Id: I45862eb81ed421b327e216254169355e63698716
Reviewed-on: https://go-review.googlesource.com/c/net/+/210317
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-12-07 00:05:07 +00:00
Kunpei Sakai
1ddd1de85c
html: implement generic raw text element parsing algorithm
...
See: https://html.spec.whatwg.org/multipage/parsing.html#parsing-elements-that-contain-only-text
This follows up on golang.org/cl/205617
Change-Id: Id99054bc25e9ea90bb3f03b15c14c13573520997
Reviewed-on: https://go-review.googlesource.com/c/net/+/210318
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-12-06 10:30:17 +00:00
Kunpei Sakai
afd1edf42a
html: drop <isindex> and <command> specific handlings
...
This commit also adds remaining tests to follow up on golang.org/cl/205617
Change-Id: I8b155f9f605c6a0eb8745c32f5e785f5b4bc1c7e
Reviewed-on: https://go-review.googlesource.com/c/net/+/208937
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-12-06 10:28:45 +00:00
Kunpei Sakai
ffdde10578
html: implement adjusted current node and make parser support foreign fragment
...
This follows up on golang.org/cl/205617
Change-Id: Id94a4fcef6a604936c404f75999ba37321b6c2c0
Reviewed-on: https://go-review.googlesource.com/c/net/+/206121
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-11-25 08:49:36 +00:00
Kunpei Sakai
72fef5d5e2
html: remove "filterres" from svg attribute adjustments
...
This follows up on golang.org/cl/205617
Ref: c0ffd43f89
Change-Id: I0a7399368bb8c28c5bf65adf3614a84ffeb82b8c
Reviewed-on: https://go-review.googlesource.com/c/net/+/206120
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-11-24 23:54:46 +00:00
Kunpei Sakai
8f7fa2680c
html: support #script-(on|off) directives for tests
...
Those directives are now supported by html5lib-tests.
See: e52ff68cc7/tree-construction/README.md
Also, this fixes missing opts on parsing for identical check
Change-Id: I92f2398ebda0477fd7f6bb438c54f3948063c08d
Reviewed-on: https://go-review.googlesource.com/c/net/+/206118
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-11-24 23:31:50 +00:00
Kunpei Sakai
b954d4e333
html: add Main support
...
This follows up on golang.org/cl/205617
Change-Id: Ic4a232c40a69bcd3ba35abdd36bce933f35248ea
Reviewed-on: https://go-review.googlesource.com/c/net/+/206117
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-11-24 23:23:54 +00:00
Kunpei Sakai
7e6e90b9ea
html: port html5lib test data from html5lib/html5lib-tests
...
1. git clone git@github.com:html5lib/html5lib-tests.git && git checkout 88b8ee967f9f6f42fcd84c65af774860a70ecf3c
2. cp -Rv html5lib-tests/tree-construction/ into testdata/webkit
3. Drop the following files, whose tests do not pass yet:
testdata/webkit/foreign-fragment.dat
testdata/webkit/isindex.dat
testdata/webkit/main-element.dat
testdata/webkit/menuitem-element.dat
testdata/webkit/tests11.dat
testdata/webkit/tests16.dat
testdata/webkit/tests19.dat
testdata/webkit/tests25.dat
testdata/webkit/tests5.dat
testdata/webkit/webkit02.dat
Change-Id: Ie60b6e24751a1efb83caf326b7e42f0517ec6b96
Reviewed-on: https://go-review.googlesource.com/c/net/+/205617
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-11-08 06:38:44 +00:00
Kunpei Sakai
6f6bbb1828
html: add Dialog support
...
Change-Id: I16afe71ca444afb03526f94e6743a587cd82a8d4
Reviewed-on: https://go-review.googlesource.com/c/net/+/205618
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-11-08 04:52:52 +00:00
Dario
2ec189313e
html: fix tokenizer error
...
Trailing '<' entities in the text token make the tokenizer fail
for escapable raw text elements like title and textarea
Fixes golang/go#34281
Change-Id: I6fe8f2229b5fd639cf5a02ab1db31f18ea034c8b
GitHub-Last-Rev: 4a9da03177
GitHub-Pull-Request: golang/net#53
Reviewed-on: https://go-review.googlesource.com/c/net/+/196620
Run-TryBot: Kunpei Sakai <kunpei@google.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-10-02 03:54:40 +00:00
Nigel Tao
2e5a9a9514
html: add Tokenizer.Raw comment re byte offsets
...
Change-Id: I2a08f28fcc58869b0e8a3b21b9a9c97da5063014
Reviewed-on: https://go-review.googlesource.com/c/net/+/198357
Reviewed-by: David Symonds <dsymonds@golang.org >
2019-10-02 03:42:24 +00:00
Kunpei Sakai
9ce7a6920f
html: implement ParseWithOptions and ParseFragmentWithOptions
...
This commit newly introduces a type for configuring a parser
called ParseOption, and implements two functions depending on it.
Along with that, this introduces ParseOptionEnableScripting to
enable setting of the scripting flag.
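The shape of such an option type can be sketched with the functional-options pattern. Everything below is a hypothetical stand-in written with lowercase names; it is not the package's implementation, and the default of scripting being enabled is an assumption of the sketch.

```go
package main

import "fmt"

// parser is a hypothetical stand-in that holds only the scripting flag.
type parser struct {
	scripting bool
}

// parseOption configures a parser before parsing starts.
type parseOption func(*parser)

// enableScripting returns an option that sets the scripting flag.
func enableScripting(enable bool) parseOption {
	return func(p *parser) {
		p.scripting = enable
	}
}

// newParser applies the options in order on top of the defaults.
func newParser(opts ...parseOption) *parser {
	p := &parser{scripting: true} // default assumed for this sketch
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func main() {
	fmt.Println(newParser().scripting)
	fmt.Println(newParser(enableScripting(false)).scripting)
}
```

Variadic options keep the zero-option call identical to the old entry point, so further settings can be added later without changing existing callers.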
Fixes golang/go#16318
Change-Id: Ie7fd7d8ce286e22e7f57182fc2ce353bce578db6
Reviewed-on: https://go-review.googlesource.com/c/net/+/174157
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-05-01 00:44:15 +00:00
Tom Anthony
ce75fb3bc6
html: Add missing condition to 'in cell' insertion mode, required by spec
...
In section 12.2.6.4.15 of the spec, there is a condition that the current node is a td or th element, which is not implemented. This can lead to a panic when the open elements stack is popped whilst empty, as outlined in golang/go#30600. This commit implements that check.
Fixes golang/go#30600
Change-Id: I4837815e2edce21b58a985a100d93d146bf71e24
GitHub-Last-Rev: 79084c5a84
GitHub-Pull-Request: golang/net#41
Reviewed-on: https://go-review.googlesource.com/c/net/+/172377
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com >
Reviewed-by: Nigel Tao <nigeltao@golang.org >
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
2019-04-24 02:45:59 +00:00
Kunpei Sakai
574d568418
html: add "in head noscript" im support
...
In the spec 12.2.6.4.5, the "in head noscript" insertion mode is defined.
However, this package's parser does not implement that insertion mode,
because the scripting=false case is not currently considered.
This commit adds a test and support for the "in head noscript"
insertion mode. This change has no effect on the actual behavior.
Updates golang/go#16318
Change-Id: I9314c3342bea27fa2acf2fa7d980a127ee0fbf91
Reviewed-on: https://go-review.googlesource.com/c/net/+/172557
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-04-24 02:42:50 +00:00
Mikio Hara
a33f666f30
html: gofmt -w -s
...
Change-Id: I2da52ff2afbf0417dbe6c08105fafeb168e936ee
Reviewed-on: https://go-review.googlesource.com/c/net/+/169358
Run-TryBot: Mikio Hara <mikioh.public.networking@gmail.com >
TryBot-Result: Gobot Gobot <gobot@golang.org >
Reviewed-by: Daniel Martí <mvdan@mvdan.cc >
2019-03-26 08:36:53 +00:00
Tom Anthony
e3b2ff56ed
html: fix parsing where nested tags of unknown types inadvertently close one another
...
The existing implementation behaves differently from all major browsers when a self-closing element of an unknown tag type is the child of another element of an unknown tag type. The issue appears to be that nested tags of differing unknown types all have an atom value of 0, so `inBodyEndTagOther` incorrectly matches them to one another.
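The mismatch can be illustrated with a toy comparison. The element type and both helper functions are hypothetical and only model the idea that unknown tag names share atom 0; they are not the package's code.

```go
package main

import "fmt"

// element is a hypothetical open-stack entry; atom is 0 for unknown tag names.
type element struct {
	atom int
	name string
}

// sameTagBuggy compares atoms only, so any two unknown tags (atom 0) match.
func sameTagBuggy(e element, atom int, name string) bool {
	return e.atom == atom
}

// sameTag additionally compares the tag names when the atom is 0.
func sameTag(e element, atom int, name string) bool {
	return e.atom == atom && (atom != 0 || e.name == name)
}

func main() {
	outer := element{atom: 0, name: "x-outer"}

	// An end tag for a different unknown element should not match x-outer.
	fmt.Println(sameTagBuggy(outer, 0, "x-inner"))
	fmt.Println(sameTag(outer, 0, "x-inner"))
	fmt.Println(sameTag(outer, 0, "x-outer"))
}
```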
Fixes golang/go#30961
Change-Id: I62b0aa49c027c8432df7d077ffba135201b3b786
GitHub-Last-Rev: fb25181f9a
GitHub-Pull-Request: golang/net#37
Reviewed-on: https://go-review.googlesource.com/c/net/+/168638
Reviewed-by: Nigel Tao <nigeltao@golang.org >
2019-03-24 22:39:53 +00:00