103 Commits

Author SHA1 Message Date
Roland Shoemaker
e1fcd82abb html: properly handle trailing solidus in unquoted attribute value in foreign content
The parser properly treats tags like <p a=/> as <p a="/">, but the
tokenizer emits the SelfClosingTagToken token incorrectly. When the
parser is used to parse foreign content, this results in an incorrect
DOM.

Thanks to Sean Ng (https://ensy.zip) for reporting this issue.

Fixes golang/go#73070
Fixes CVE-2025-22872

Change-Id: I65c18df6d6244bf943b61e6c7a87895929e78f4f
Reviewed-on: https://go-review.googlesource.com/c/net/+/661256
Reviewed-by: Neal Patel <nealpatel@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Gopher Robot <gobot@golang.org>
2025-03-27 12:51:24 -07:00
Pukki
312450e473 html: ensure <search> tag closes <p> and update tests
This change ensures that the <search> tag correctly closes an open <p> tag when encountered during parsing.

Changes:
- Added <search> to the list of elements that should close an open <p> tag in parse.go.
- Updated the second list in parse.go to ensure consistency.
- Updated html/atom/gen.go, table.go, and table_test.go accordingly.
- Modified parse_test.go to use strings.Builder instead of bytes.Buffer.
- Updated test error messages to follow Go’s conventions.
- Fixed an accidental colon in the comment in parse.go.

Change-Id: I5835da69f6bb0e14c483e55b7ae82915ae958dc1
Reviewed-on: https://go-review.googlesource.com/c/net/+/655457
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2025-03-12 15:46:46 -07:00
Roland Shoemaker
8e66b04771 html: use strings.EqualFold instead of lowering ourselves
Instead of using strings.ToLower and == to check case insensitive
equality, just use strings.EqualFold, even when the strings are only
ASCII. This prevents us unnecessarily lowering extremely long strings,
which can be a somewhat expensive operation, even if we're only
attempting to compare equality with five characters.

Thanks to Guido Vranken for reporting this issue.

Fixes golang/go#70906
Fixes CVE-2024-45338

Change-Id: I323b919f912d60dab6a87cadfdcac3e6b54cd128
Reviewed-on: https://go-review.googlesource.com/c/net/+/637536
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Gopher Robot <gobot@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Tatiana Bradley <tatianabradley@google.com>
2024-12-18 11:24:30 -08:00
yincong
b935f7b5d7 html: avoid endless loop on error token
Fixes #70179

Change-Id: I2a0a1fc2e96f7d8eefd0abdf7ef8ba243a6e8645
GitHub-Last-Rev: a601ecd849
GitHub-Pull-Request: golang/net#226
Reviewed-on: https://go-review.googlesource.com/c/net/+/624895
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-12-18 08:05:47 -08:00
Carlana Johnson
511cc3a406 html: add Node.{Ancestors,ChildNodes,Descendants}()
Adds iterators for the parents, immediate children, and all children of a Node respectively.

Fixes golang/go#62113

Change-Id: Iab015872cc3a20fe5e7cae3bc90b89cba68cc3f8
GitHub-Last-Rev: d99de580ab
GitHub-Pull-Request: golang/net#215
Reviewed-on: https://go-review.googlesource.com/c/net/+/594195
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
2024-10-29 04:13:16 +00:00
Damien Neil
c1f5833288 all: replace deprecated io/ioutil calls
The io/ioutil package's features were moved to
the io and os packages in Go 1.16.

x/net depends on Go 1.18. Drop ioutil calls,
so gopls doesn't warn about them.

Change-Id: Ibdb576d94f250808ae285aa142e2fd41e7e9afc9
Reviewed-on: https://go-review.googlesource.com/c/net/+/586244
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-05-21 19:59:00 +00:00
Eng Zer Jun
f95a3b3a48 html: fix typo in package doc
Change-Id: I3cacfadc0396c8ef85addd062d991bb6f5b0483a
Reviewed-on: https://go-review.googlesource.com/c/net/+/580035
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
2024-04-18 22:01:12 +00:00
Maciej Mionskowski
643fd162e3 html: fix SOLIDUS '/' handling in attribute parsing
Calling the Tokenizer with HTML elements containing SOLIDUS (/) character
in the attribute name results in incorrect tokenization.

This is due to violation of the following rule transitions in the WHATWG spec:
- https://html.spec.whatwg.org/multipage/parsing.html#attribute-name-state,
  where we are not reconsuming the character if '/' is encountered
- https://html.spec.whatwg.org/multipage/parsing.html#after-attribute-name-state,
  where we are not switching to self closing state

Fixes golang/go#63402

Change-Id: I90d998dd8decde877bd63aa664f3657aa6161024
GitHub-Last-Rev: 3546db808c
GitHub-Pull-Request: golang/net#195
Reviewed-on: https://go-review.googlesource.com/c/net/+/533518
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2024-02-07 19:23:52 +00:00
Dmitri Shuralyov
d23d9bc549 all: update go directive to 1.18
Done with:

go get go@1.18
go mod tidy
go fix ./...

Using go1.21.3.

With a manual change to keep golang.org/x/net/context testing itself,
not context in the standard library.

For golang/go#60268.

Change-Id: I00682bf7cf1e3ba4370e2a3e7f63dc245b294a36
Reviewed-on: https://go-review.googlesource.com/c/net/+/534241
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
2023-10-11 21:58:12 +00:00
Roland Shoemaker
8ffa475fbd html: only render content literally in the HTML namespace
Per the WHATWG HTML specification, section 13.3, only append the literal
content of a text node if we are in the HTML namespace.

Thanks to Mohammad Thoriq Aziz for reporting this issue.

Fixes golang/go#61615
Fixes CVE-2023-3978

Change-Id: I332152904d4e7646bd2441602bcbe591fc655fa4
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/1942896
Reviewed-by: Tatiana Bradley <tatianabradley@google.com>
Run-TryBot: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
TryBot-Result: Security TryBots <security-trybots@go-security-trybots.iam.gserviceaccount.com>
Reviewed-on: https://go-review.googlesource.com/c/net/+/514896
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
2023-08-01 17:41:59 +00:00
Roland Shoemaker
4050002696 html: handle equals sign before attribute
Apply the correct normalization when an equals sign appears before an
attribute name (e.g. '<tag =>' -> '<tag =="">'), per WHATWG 13.2.5.32.

Change-Id: Id21b428bd86117dd073c502767386bc718a3fb7b
Reviewed-on: https://go-review.googlesource.com/c/net/+/488695
Auto-Submit: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Reviewed-by: Nigel Tao (INACTIVE; USE @golang.org INSTEAD) <nigeltao@google.com>
2023-06-20 17:16:42 +00:00
Roland Shoemaker
eb1572ce7f html: another shot at security doc
Be clearer about the operation of the tokenizer and the parser (and
their differences), and be explicit about the need for re-serialization
when they are being used in security contexts.

Change-Id: Ieb8f2a9d4806fb7a8849a15671667396e81c53b9
Reviewed-on: https://go-review.googlesource.com/c/net/+/484795
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-17 17:44:42 +00:00
Roland Shoemaker
08dda57501 html: fix package doc typo
Change-Id: Ic16f297e936cf10bafe0656f5db68cd422c430aa
Reviewed-on: https://go-review.googlesource.com/c/net/+/474217
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-07 22:09:01 +00:00
Roland Shoemaker
8c4ef2f86b hmtl: add security section to package comment
Adds a short security considerations paragraph to the package comment
detailing the differences between the parser and tokenizer.

Change-Id: I9e6840b20f82ffc6bc4088fffd6b4eda97550c0a
Reviewed-on: https://go-review.googlesource.com/c/net/+/459676
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2023-03-03 16:07:56 +00:00
Nigel Tao
1d46ed8b48 html: have Render escape comments less often
Fixes golang/go#58246

Change-Id: I3effbd2afd7e363a42baa4db20691e57c9a08389
Reviewed-on: https://go-review.googlesource.com/c/net/+/469056
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
2023-02-28 08:42:21 +00:00
Nigel Tao
569fe8158c html: add "Microsoft Outlook comment" tests
This only adds new tests. A follow-up commit will change behavior.

Updates golang/go#58246

Change-Id: I6adf5941d5cfd3c28f7b9328882ac280109ee028
Reviewed-on: https://go-review.googlesource.com/c/net/+/469055
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2023-02-23 23:08:33 +00:00
Nigel Tao
39940adcaa html: parse comments per HTML spec
Updates golang/go#58246

Change-Id: Iaba5ed65f5d244fd47372ef0c08fc4cdb5ed90f9
Reviewed-on: https://go-review.googlesource.com/c/net/+/466776
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Nigel Tao (INACTIVE; USE @golang.org INSTEAD) <nigeltao@google.com>
2023-02-10 18:21:14 +00:00
cui fliter
415cb6d518 all: fix some comments
Change-Id: Iee11c27052222f017b672c06ced9e129ee51619c
Reviewed-on: https://go-review.googlesource.com/c/net/+/465996
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-08 14:49:55 +00:00
Roland Shoemaker
430a433969 html: properly handle exclamation marks in comments
Properly handle the case where HTML comments begin with exclamation
marks and have no other content, i.e. "<!--!-->". Previously these
comments would cause the tokenizer to consider everything following to
also be considered part of the comment.

Fixes golang/go#37771

Change-Id: I78ea310debc3846f145d62cba017055abc7fa4e0
Reviewed-on: https://go-review.googlesource.com/c/net/+/442496
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
2022-10-20 16:40:45 +00:00
cui fliter
0b7e1fb9d4 all: fix a few function names on comments
Change-Id: I6c853dd402d296701e38289bbc418730b068dde8
Reviewed-on: https://go-review.googlesource.com/c/net/+/441716
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
2022-10-12 13:50:44 +00:00
Nigel Tao
0699458419 html: escape comment and doctype tokens' data
Fixes golang/go#48237

Change-Id: I309e3ad30684fb71b9b3e67dfac156da08dbc69b
Reviewed-on: https://go-review.googlesource.com/c/net/+/419334
Run-TryBot: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-07-26 23:03:23 +00:00
Nigel Tao
37e1c6afe0 html: ignore templates nested within foreign content
Fixes #46288
Fixes CVE-2021-33194

Change-Id: I2fe39702de8e9aab29965c1526e377a6f9cdf056
Reviewed-on: https://go-review.googlesource.com/c/net/+/311090
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Trust: Roland Shoemaker <roland@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
2021-05-20 17:08:46 +00:00
Russ Cox
5f55cee0dc all: go fmt ./...
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).

Not strictly necessary but will avoid spurious changes
as files are edited.

Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild

Change-Id: I5b2b7d93424e828a3c5f76ae3f30ab825aca388e
Reviewed-on: https://go-review.googlesource.com/c/net/+/294371
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-02-20 03:31:24 +00:00
Kunpei Sakai
28c70e62bb html: port html5lib tests from html5lib/html5lib-tests
To reproduce this, execute following steps in order:

1. git clone git@github.com:html5lib/html5lib-tests.git && git checkout 6ddcf58bea5a01e616911050c173622f84297211
2. cp -Rv html5lib-tests/tree-construction/ testdata/webkit

Change-Id: Id32798b1ff881afad82d87c2fef0841e5223c7e6
Reviewed-on: https://go-review.googlesource.com/c/net/+/263397
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2020-10-29 22:17:08 +00:00
Kunpei Sakai
942e2f445f html: avoid using raw text mode even when nested noscript tags
Assuming "in head noscript" insertion mode, the scripting flag will be disabled.
Thus, even if nested noscript tags exist,
the tokenizer should not go into the raw text mode.

This change makes the following test happy:
<head><noscript><noscript class="foo"><!--foo--></noscript>

Change-Id: I2620e751d8be3d313c3a2e2f992b1e21ce2dc2ee
Reviewed-on: https://go-review.googlesource.com/c/net/+/263878
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2020-10-29 05:50:24 +00:00
Kunpei Sakai
8adf50f3fe html: avoid using raw text mode if there are raw tags to be ignored in select IM
This follows up on https://golang.org/cl/264977

Change-Id: I5d0e2f39173a8bbd07ca53de4df2a7e8772d4197
Reviewed-on: https://go-review.googlesource.com/c/net/+/265960
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2020-10-29 05:33:32 +00:00
Kunpei Sakai
e0495509cf html: skip tests for behavior outside the parsing algorithm
This also updates webkit/tests18.dat to latest.

Change-Id: I4ed37e918a7db63afd8d515dd3a2494699cc5b74
Reviewed-on: https://go-review.googlesource.com/c/net/+/264977
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2020-10-29 03:25:33 +00:00
Kunpei Sakai
d65d470038 html: remove a few attributes from svg attribute adjustments
As per the current spec, "contentScriptType", "contentStyleType" and
"externalResourcesRequired" have been removed from the svg attribute adjustments.

See: https://html.spec.whatwg.org/multipage/parsing.html#adjust-svg-attributes

Change-Id: I904914691c3a3c3958868f7e49ba10f7d6f9ec09
Reviewed-on: https://go-review.googlesource.com/c/net/+/263398
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2020-10-20 06:53:57 +00:00
Kunpei Sakai
4f7140c49a html: add comments indicating that these have been removed from the spec
Change-Id: I44e9234fa4dc65f2b23b3a9f31ffe9d9fcefc4f1
Reviewed-on: https://go-review.googlesource.com/c/net/+/261177
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Trust: Nigel Tao <nigeltao@golang.org>
Trust: Emmanuel Odeke <emm.odeke@gmail.com>
2020-10-10 22:47:23 +00:00
Nigel Tao
16171245cf html: add the RawNode NodeType
Fixes golang/go#36350

Change-Id: Ia11b65940949b7da996b194d48372bc6219d4baa
Reviewed-on: https://go-review.googlesource.com/c/net/+/216800
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-02-02 09:46:26 +00:00
Kunpei Sakai
e7e4b65ae6 html: improve coding style
Change-Id: I05c0ccbad41f5512f8096b0d15991d7d6b5d726e
Reviewed-on: https://go-review.googlesource.com/c/net/+/209398
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-12-07 00:06:13 +00:00
Kunpei Sakai
51f093181b html: update adoption agency algorithm
See: https://html.spec.whatwg.org/multipage/parsing.html#adoption-agency-algorithm

This follows up on golang.org/cl/205617

Change-Id: I45862eb81ed421b327e216254169355e63698716
Reviewed-on: https://go-review.googlesource.com/c/net/+/210317
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-12-07 00:05:07 +00:00
Kunpei Sakai
1ddd1de85c html: implement generic raw text element parsing algorithm
See: https://html.spec.whatwg.org/multipage/parsing.html#parsing-elements-that-contain-only-text

This follows up on golang.org/cl/205617

Change-Id: Id99054bc25e9ea90bb3f03b15c14c13573520997
Reviewed-on: https://go-review.googlesource.com/c/net/+/210318
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-12-06 10:30:17 +00:00
Kunpei Sakai
afd1edf42a html: drop <isindex> and <command> specific handlings
This commit also adds remaining tests to follow up on golang.org/cl/205617

Change-Id: I8b155f9f605c6a0eb8745c32f5e785f5b4bc1c7e
Reviewed-on: https://go-review.googlesource.com/c/net/+/208937
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-12-06 10:28:45 +00:00
Kunpei Sakai
ffdde10578 html: implement adjusted current node and make parser support foreign fragment
This follows up on golang.org/cl/205617

Change-Id: Id94a4fcef6a604936c404f75999ba37321b6c2c0
Reviewed-on: https://go-review.googlesource.com/c/net/+/206121
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-11-25 08:49:36 +00:00
Kunpei Sakai
72fef5d5e2 html: remove "filterres" from svg attribute adjustments
This follows up on golang.org/cl/205617
Ref: c0ffd43f89

Change-Id: I0a7399368bb8c28c5bf65adf3614a84ffeb82b8c
Reviewed-on: https://go-review.googlesource.com/c/net/+/206120
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-11-24 23:54:46 +00:00
Kunpei Sakai
8f7fa2680c html: support #script-(on|off) directives for tests
Those directives are now supported by html5lib-tests.
See: e52ff68cc7/tree-construction/README.md

Also, this fixes missing opts on parsing for identical check

Change-Id: I92f2398ebda0477fd7f6bb438c54f3948063c08d
Reviewed-on: https://go-review.googlesource.com/c/net/+/206118
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-11-24 23:31:50 +00:00
Kunpei Sakai
b954d4e333 html: add Main support
This follows up on golang.org/cl/205617

Change-Id: Ic4a232c40a69bcd3ba35abdd36bce933f35248ea
Reviewed-on: https://go-review.googlesource.com/c/net/+/206117
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-11-24 23:23:54 +00:00
Kunpei Sakai
7e6e90b9ea html: port html5lib test data from html5lib/html5lib-tests
1. git clone git@github.com:html5lib/html5lib-tests.git && git checkout 88b8ee967f9f6f42fcd84c65af774860a70ecf3c
2. cp -Rv html5lib-tests/tree-construction/ into testdata/webkit
3. Drop unpassed following changes:

testdata/webkit/foreign-fragment.dat
testdata/webkit/isindex.dat
testdata/webkit/main-element.dat
testdata/webkit/menuitem-element.dat
testdata/webkit/tests11.dat
testdata/webkit/tests16.dat
testdata/webkit/tests19.dat
testdata/webkit/tests25.dat
testdata/webkit/tests5.dat
testdata/webkit/webkit02.dat

Change-Id: Ie60b6e24751a1efb83caf326b7e42f0517ec6b96
Reviewed-on: https://go-review.googlesource.com/c/net/+/205617
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-11-08 06:38:44 +00:00
Kunpei Sakai
6f6bbb1828 html: add Dialog support
Change-Id: I16afe71ca444afb03526f94e6743a587cd82a8d4
Reviewed-on: https://go-review.googlesource.com/c/net/+/205618
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-11-08 04:52:52 +00:00
Dario
2ec189313e html: fix tokenizer error
Trailing '<' entities in the text token make the tokenizer fail
for escapable raw text elements like title and textarea

Fixes golang/go#34281

Change-Id: I6fe8f2229b5fd639cf5a02ab1db31f18ea034c8b
GitHub-Last-Rev: 4a9da03177
GitHub-Pull-Request: golang/net#53
Reviewed-on: https://go-review.googlesource.com/c/net/+/196620
Run-TryBot: Kunpei Sakai <kunpei@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-10-02 03:54:40 +00:00
Nigel Tao
2e5a9a9514 html: add Tokenizer.Raw comment re byte offsets
Change-Id: I2a08f28fcc58869b0e8a3b21b9a9c97da5063014
Reviewed-on: https://go-review.googlesource.com/c/net/+/198357
Reviewed-by: David Symonds <dsymonds@golang.org>
2019-10-02 03:42:24 +00:00
Kunpei Sakai
9ce7a6920f html: implement ParseWithOptions and ParseFragmentWithOptions
This commit newly introduces a type for configuring a parser
called ParseOption, and implements two functions depending on it.
Along with that, this introduces ParseOptionEnableScripting to
enable setting of the scripting flag.

Fixes golang/go#16318

Change-Id: Ie7fd7d8ce286e22e7f57182fc2ce353bce578db6
Reviewed-on: https://go-review.googlesource.com/c/net/+/174157
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-05-01 00:44:15 +00:00
Tom Anthony
ce75fb3bc6 html: Add missing condition to 'in cell' insertion mode, required by spec
In section 12.2.6.4.15 of the spec, there is a condition that the current node is a td or th element, which is not implemented. This can lead to a panic when the open elements stack is popped whilst empty, as outlined in golang/go#30600. This commit implements that check.

Fixes golang/go#30600

Change-Id: I4837815e2edce21b58a985a100d93d146bf71e24
GitHub-Last-Rev: 79084c5a84
GitHub-Pull-Request: golang/net#41
Reviewed-on: https://go-review.googlesource.com/c/net/+/172377
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-04-24 02:45:59 +00:00
Kunpei Sakai
574d568418 html: add "in head noscript" im support
In the spec 12.2.6.4.5, the "in head noscript" insertion mode is defined.
However, this package and its parser doesn't have the insertion mode,
because the scripting=false case is not considered currently.

This commit adds a test and a support for the "in head noscript"
insertion mode. This change has no effect on the actual behavior.

Updates golang/go#16318

Change-Id: I9314c3342bea27fa2acf2fa7d980a127ee0fbf91
Reviewed-on: https://go-review.googlesource.com/c/net/+/172557
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-04-24 02:42:50 +00:00
Mikio Hara
a33f666f30 html: gofmt -w -s
Change-Id: I2da52ff2afbf0417dbe6c08105fafeb168e936ee
Reviewed-on: https://go-review.googlesource.com/c/net/+/169358
Run-TryBot: Mikio Hara <mikioh.public.networking@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2019-03-26 08:36:53 +00:00
Tom Anthony
e3b2ff56ed html: fix parsing where nested tags of unknown types inadvertently close one another
The existing implementation behaves differently to all major browsers, for the instance where a self-closing element of an unknown tag type is the child of another element of an unknown tag type. The issue appears to be that nested tags of an differing unknown types will all have an atom value of 0 and `inBodyEndTagOther` will incorrectly match them to one another.

Fixes golang/go#30961

Change-Id: I62b0aa49c027c8432df7d077ffba135201b3b786
GitHub-Last-Rev: fb25181f9a
GitHub-Pull-Request: golang/net#37
Reviewed-on: https://go-review.googlesource.com/c/net/+/168638
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-03-24 22:39:53 +00:00
Brad Fitzpatrick
66a96c8a54 html: remove G+ reference from testdata
A bot inside Google was nagging me about this so change plusone to
minusone to hide it from the bot which doesn't understand that it's
just test data.

Change-Id: I2300a3e8c2ca969d61cae2d6bdc0683743b534e1
Reviewed-on: https://go-review.googlesource.com/c/162888
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-02-26 19:30:03 +00:00
Kunpei Sakai
3a22650c66 html: remove unnecessary break
The ancestor doesn't always match with the first.

Change-Id: I0edcbffab7e19ba1731e849021ffbb7428ec48d7
Reviewed-on: https://go-review.googlesource.com/c/161857
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-02-13 06:11:40 +00:00
Kunpei Sakai
d26f9f9a57 html: update inSelectIM and inSelectInTableIM for the latest spec
Fixes golang/go#27842

Change-Id: I06eb3c0c18be3566bd30a29fca5f3f7e6791d2cc
Reviewed-on: https://go-review.googlesource.com/c/137275
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2019-01-25 09:10:13 +00:00