41 Commits

Author SHA1 Message Date
Mikio Hara
3a7846fea0 html: gofmt -w -s
Change-Id: Ic81b9ab72be34a95e677a1dd40e970f86109eefc
Reviewed-on: https://go-review.googlesource.com/111935
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-08 21:14:25 +00:00
Kunpei Sakai
d41e817464 html: handle rb and rtc elements
Updates golang/go#23071

Change-Id: Ifef79e077801422eb273af3e5a541c85c63bfce4
Reviewed-on: https://go-review.googlesource.com/107575
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2018-04-18 06:21:11 +00:00
Kunpei Sakai
8d16fa6dc9 html: avoid invalid nil pointer access
Updates golang/go#23071

Change-Id: I73d7302c5bde4441aa824093fdcce52e8bb51e31
Reviewed-on: https://go-review.googlesource.com/107379
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2018-04-17 00:37:50 +00:00
Kunpei Sakai
3121141a29 html/atom: add atom.Rb and atom.Rtc
Updates golang/go#23071

Change-Id: I07aae04757e83a3a03681a2ce92e4cab194ef64a
Reviewed-on: https://go-review.googlesource.com/107198
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2018-04-17 00:32:20 +00:00
namusyaka
500e7a4f95 html: add "in template" insertion mode support
See:
https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-intemplate

Updates golang/go#23071

Change-Id: I36529b7cf5d2adf159ed5c471fba9f67890b7eb9
Reviewed-on: https://go-review.googlesource.com/94838
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2018-04-15 21:43:07 +00:00
namusyaka
136a25c244 html: update quotes about the list of active formatting elements
See https://html.spec.whatwg.org/multipage/parsing.html#the-list-of-active-formatting-elements

Updates golang/go#23071

Change-Id: I015c394ed34d721e9e4a4d3e797d06d750c1864e
Reviewed-on: https://go-review.googlesource.com/94837
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-16 17:17:45 +00:00
namusyaka
2e7f24ace3 html: update section numbers
Updates golang/go#23071

See https://html.spec.whatwg.org/multipage/

Change-Id: I1bde6e07ae9270ba7b320474b9bec8ec09a79f16
Reviewed-on: https://go-review.googlesource.com/94355
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2018-02-16 11:01:04 +00:00
Nigel Tao
309822c5b9 html/atom: add atom.Acronym
Fixes golang/go#23507

Change-Id: Id39b28f211dfdb6a5464752b8b62e2528b46286b
Reviewed-on: https://go-review.googlesource.com/91315
Reviewed-by: David Symonds <dsymonds@golang.org>
2018-02-01 03:00:42 +00:00
Nigel Tao
5ccada7d0a html: fix misleading Tokenizer.Token comment
Change-Id: I39359b5fa52faf5b69005ba47b58be3beec16c4e
Reviewed-on: https://go-review.googlesource.com/87515
Reviewed-by: David Symonds <dsymonds@golang.org>
2018-01-12 01:58:58 +00:00
Frederic Guillot
9dfe398356 net/html: add missing package name in doc example
This code snippet should contain the "html" prefix, like the
other examples, to be consistent.

Change-Id: I32428452625c016894aebc2011cde2dd614e6ed9
Reviewed-on: https://go-review.googlesource.com/77830
Reviewed-by: Gabriel Aszalos <gabriel.aszalos@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Gabriel Aszalos <gabriel.aszalos@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-15 15:19:08 +00:00
Kunpei Sakai
278c6cf336 html/atom: sync html table with the current whatwg spec
Some elements and attributes have been removed from the spec,
but are kept here for backwards compatibility.

Also, this makes gen.go support go:generate and write contents into
target files directly.

Finally, this makes `go generate` work on an entire directory.

Change-Id: I8d41840eec69eec1ef08527d8d71786612a3121d
Reviewed-on: https://go-review.googlesource.com/65152
Run-TryBot: Tom Bergan <tombergan@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tom Bergan <tombergan@google.com>
2017-09-26 18:29:41 +00:00
namusyaka
b60f3a9210 html: add main and keygen as special element
Also, the isindex element has been removed from the specification,
so this commit adds a comment about it.

Change-Id: I79a9b1eb9dae8274e2ca498ab73b2e73521d54e9
Reviewed-on: https://go-review.googlesource.com/64230
Reviewed-by: Tom Bergan <tombergan@google.com>
Run-TryBot: Tom Bergan <tombergan@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-20 23:43:30 +00:00
Andy Balholm
f1d3149ecb html/charset: replace EUC-KR test
The old test for EUC-KR was copied from the first web page that I could
find that was encoded in EUC-KR; the new one is the first line of
golang.org/x/text/internal/testtext.Korean.

Change-Id: I3de076256c935088a06138056cde216190766a6d
Reviewed-on: https://go-review.googlesource.com/18063
Reviewed-by: Marcel van Lohuizen <mpvl@golang.org>
2016-01-08 17:00:32 +00:00
Marcel van Lohuizen
68a055e15f html/charset: verify correct UTF-8 behavior
Change-Id: I4083c38468981128c3d74310cd02335c35eafa5d
Reviewed-on: https://go-review.googlesource.com/17966
Reviewed-by: Andy Balholm <andy@balholm.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2015-12-19 10:34:51 +00:00
Marcel van Lohuizen
9b9d6d8d11 html/charset: handle unsupported code points for encoding
Change-Id: I11ffc61623496fae6b32e678c91f7609d71aefe5
Reviewed-on: https://go-review.googlesource.com/17961
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Andy Balholm <andy@balholm.com>
2015-12-17 16:33:40 +00:00
Marcel van Lohuizen
d28a91ad26 html/charset: use x/text/encoding/htmlindex
Saves duplication of work.

Change-Id: I33c715f33cb6cacd8522e480dc96ae71475c5b3c
Reviewed-on: https://go-review.googlesource.com/17805
Reviewed-by: Andy Balholm <andy@balholm.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2015-12-17 11:26:21 +00:00
Marcel van Lohuizen
72b0708b72 html/charmap: update table with latest data
Change-Id: I7ae395999a3e61afa3a6ee15d076edae73d8a83b
Reviewed-on: https://go-review.googlesource.com/17800
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Andy Balholm <andy@balholm.com>
2015-12-14 17:39:18 +00:00
Ian Lance Taylor
05bc443e7e html: remove license references from benchmark test data
The license references puzzle programs that grep for licenses.

Fixes golang/go#13573.

Change-Id: I601fbc6ba2b189b476af1082c48fb02cd72f59d8
Reviewed-on: https://go-review.googlesource.com/17714
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-12-11 03:42:21 +00:00
Dmitri Shuralyov
edab5dc413 html: Use existing standard library interface internally.
Now that Go 1.1 is out, commit 3651a440a7
can be reverted.

Change-Id: I7ac8478aafaa5067630e99cec9eca59792107892
Reviewed-on: https://go-review.googlesource.com/11612
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-06-28 04:31:38 +00:00
Ian Lance Taylor
e0403b4e00 html/charset/testdata: update licensing info in README
Update the licensing info following the instructions at
http://www.w3.org/Consortium/Legal/2008/04-testsuite-copyright.html#howtouse

Fixes golang/go#10398.

Change-Id: Ib37b2a696a5287f41d4e85da4eb7ec1cbb346301
Reviewed-on: https://go-review.googlesource.com/8978
Reviewed-by: Martin Packman <martin.packman@canonical.com>
Reviewed-by: Rob Pike <r@golang.org>
2015-05-08 23:18:43 +00:00
Andy Balholm
6460565bec x/net/html/charset: Change NewReaderByName to NewReaderLabel.
Change-Id: Ic4d1df0c4f7048a3e2472cca09ef9390bcfd149d
Reviewed-on: https://go-review.googlesource.com/4533
Reviewed-by: Rob Pike <r@golang.org>
2015-04-03 23:56:49 +00:00
Dmitry Savintsev
3d87fd621c x/net/html: Sync the html parser and atom with the current whatwg spec
The current documentation as well as set of atoms and attributes has
gotten slightly out of sync with the current state of the WHATWG
html5 specification. The change adds and removes several of the atoms
and attributes, updates the documentation (such as step numbering in
inBodyEndTagFormatting) and modifies the spec URLs to https://

Change-Id: I6dfa52785858c1521301b20b1e585e19a08b1e98
Reviewed-on: https://go-review.googlesource.com/6173
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2015-03-03 04:37:39 +00:00
Andy Balholm
ec18079348 x/net/html/charset: add NewReaderByName
This provides a CharsetReader function for xml.Decoder.

Change-Id: Id00787bbdee90d267d38c84c98a06f9e10d93336
Reviewed-on: https://go-review.googlesource.com/4420
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2015-02-10 23:47:13 +00:00
David Symonds
8aa6e209cb net: add import comments.
Change-Id: Ifab0fdaec1d810d268b7c19ad30f476802203b37
2014-12-09 14:17:11 +11:00
Mikio Hara
ccf541d876 x/net/html/charset: add missing copyright
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/174240043
2014-11-17 10:54:40 +09:00
Mikio Hara
716c3ccf9b x/net/html/charset: fix nacl build
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/177880043
2014-11-17 10:54:21 +09:00
Andrew Gerrand
fbe893ddcd go.net: use golang.org/x/... import paths
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/167030043
2014-11-10 09:04:43 +11:00
Frederick Kelly Mayle III
5755bc4e75 go.net/html: Fix comment handling for "in select" insertion mode
LGTM=andybalholm, nigeltao
R=golang-codereviews, gobot, nigeltao, andybalholm
CC=golang-codereviews
https://golang.org/cl/93680045
2014-06-12 11:53:57 +10:00
Andrew Balholm
4109fccea4 html: handle '<' before a tag
As pointed out at
https://groups.google.com/forum/#!topic/golang-nuts/LJozHIXAAJY,
`<<p>html</p>` was parsed as `&lt;&lt;p&gt;html</p>`.
There was no test case for this. Chrome parses it as `&lt;<p>html</p>`,
and that seems to be correct. We were missing the
"Reconsume the current input character" step at
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tag-open-state

LGTM=nigeltao
R=golang-codereviews, gobot, nigeltao
CC=golang-codereviews, nigeltao
https://golang.org/cl/96060044
2014-05-12 16:42:14 +10:00
Robert Griesemer
a6927df230 go.net: fix various typos
LGTM=adonovan
R=adonovan
CC=golang-codereviews, golang-dev
https://golang.org/cl/97950043
2014-05-02 14:50:26 -07:00
Michael Piatek
4698117464 go.net/html: Expose data read from the input reader but not yet tokenized in Tokenizer.
This allows clients to efficiently reconstruct the original input in the case of ErrBufferExceeded. TestMaxBufferReconstruction now properly verifies this.

R=bradfitz
CC=golang-codereviews
https://golang.org/cl/47770043
2014-01-06 10:51:23 -08:00
Michael Piatek
384e4d292e html: limit buffering during tokenization.
This is optional. By default, buffering is unlimited.

Fixes golang/go#7053

R=bradfitz
CC=golang-codereviews
https://golang.org/cl/43190044
2014-01-03 13:16:55 -08:00
Michael Piatek
480e7b06ec go.net/html: Tokenizer.Raw returns the original input when tokenizer errors occur.
Two tweaks enable this:
1) Updating the raw and data span pointers when Tokenizer.Next is called, even
if an error has occurred. This prevents duplicate data from being returned by
Raw in the common case of an EOF.

2) Treating '</>' as an empty comment token to expose the raw text as a
tokenization event. (This matches the semantics of other non-token events,
e.g., '</ >' is treated as '<!-- -->'.)

Fixes golang/go#7029.

R=golang-codereviews, r, bradfitz
CC=golang-codereviews
https://golang.org/cl/46370043
2014-01-02 10:51:00 -08:00
Andrew Balholm
3f04d1ffd7 go.net/html/charset: add NewReader
NewReader is a convenience function for finding the encoding of
an io.Reader and making a UTF-8 version of that Reader.

R=nigeltao
CC=golang-dev
https://golang.org/cl/43510043
2013-12-19 17:30:38 +11:00
Andrew Balholm
74213743f3 go.net/html/charset: implement the encoding sniffing algorithm
R=nigeltao
CC=golang-dev
https://golang.org/cl/31220043
2013-12-13 16:04:21 +11:00
Andrew Balholm
7eb0b7e953 go.net/html/charset: encoding names
Lookup now returns the canonical name as well as the Encoding.

This will make it easier for users to discover what encoding they
actually have as a return value from functions in this package.
They will also be able to store the name for re-use.

R=nigeltao, mpvl
CC=golang-dev
https://golang.org/cl/30090043
2013-11-23 10:13:36 +11:00
Andrew Balholm
e2719b3103 go.net/html/charset: new package
Implement retrieving encodings by name, according to the names listed
at http://encoding.spec.whatwg.org/#encodings

This is the first step toward implementing the encoding detection
algorithm.

R=nigeltao
CC=golang-dev
https://golang.org/cl/27110043
2013-11-19 21:51:02 +11:00
Nigel Tao
e8489d83dd go.net/html: fix the tokenizer when the underlying io.Reader returns
either (0, nil) or an (n, err) such that n > 0 && err != nil. Both
cases are valid by the io.Reader contract.

R=r
CC=golang-dev
https://golang.org/cl/12513043
2013-08-07 12:55:39 +10:00
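
The two io.Reader cases named in the commit above can be illustrated with a small read loop. This is a minimal sketch using only the standard library, not the tokenizer's actual code. The `quirkyReader` type, the `readAll` helper, and the limit of 100 empty reads are hypothetical choices made for illustration.

```go
package main

import (
	"fmt"
	"io"
)

// quirkyReader is a hypothetical reader that exercises both edge cases:
// its first Read returns (0, nil), and its last chunk arrives with io.EOF.
type quirkyReader struct {
	data  []byte
	calls int
}

func (r *quirkyReader) Read(p []byte) (int, error) {
	r.calls++
	if r.calls == 1 {
		return 0, nil // valid: no progress and no error
	}
	if len(r.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, r.data)
	r.data = r.data[n:]
	if len(r.data) == 0 {
		return n, io.EOF // valid: final bytes and the error together
	}
	return n, nil
}

// readAll (hypothetical helper) consumes the n bytes before inspecting err,
// and gives up after too many consecutive empty reads.
func readAll(r io.Reader) ([]byte, error) {
	var out []byte
	buf := make([]byte, 4)
	empty := 0
	for {
		n, err := r.Read(buf)
		out = append(out, buf[:n]...) // keep the data even if err != nil
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		if n == 0 {
			empty++
			if empty >= 100 { // assumed limit, for illustration only
				return out, io.ErrNoProgress
			}
			continue
		}
		empty = 0
	}
}

func main() {
	got, err := readAll(&quirkyReader{data: []byte("<p>hi</p>")})
	fmt.Printf("%q %v\n", got, err)
}
```

A loop that checked err before using the n bytes would drop the last chunk here, and one that treated (0, nil) as end of input would return nothing at all.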
Andrew Gerrand
46c4a49ebb go.net/html: put escaping tests in escape_test.go
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/11094043
2013-07-10 17:32:24 +10:00
Shenghou Ma
3651a440a7 go.net/html: don't use Go tip io.ByteWriter
So that Go 1.0 users can also use this package.
Fixes golang/go#4931.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/7424044
2013-02-28 16:17:17 +08:00
Nigel Tao
ea127e889c go.net/html: move exp/html and exp/html/atom here to the go.net
sub-repo.

It's a straight copy, except for these modifications:
* "exp/html" and "exp/html/atom" imports were renamed, and
* the "TODO... When this package moves out of exp" comment was
  deleted from atom/atom.go.

The matching change is at https://golang.org/cl/7317043

The rationale was discussed at
https://groups.google.com/d/topic/golang-nuts/Qq5hTQyPuLg/discussion

R=adg, remyoudompheng, dave
CC=golang-dev
https://golang.org/cl/7310063
2013-02-11 11:55:20 +11:00