Skip to content

An unclosed, unescaped <script> tag in markdown should be rendered as text to match GH/GL behavior #2246

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 of 7 tasks
wyattoday opened this issue Aug 2, 2017 · 14 comments
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/proposal The new feature has not been accepted yet but needs to be discussed first.

Comments

@wyattoday
Copy link

wyattoday commented Aug 2, 2017

Description

An unclosed, unescaped <script> tag in markdown breaks all subsequent markdown rendering in gitea. The same problem does not effect more benign tags like <strong>.

Gitea should render the <script> tag "as is" (that is, the text, but not emitting the <script> HTML). That would match the behavior in github.

Here's the raw Markdown file: https://try.gitea.io/wyattoday/simple-respository/raw/new-feature-branch/BrokenRendering.md

Here's the broken rendering: https://try.gitea.io/wyattoday/simple-respository/src/new-feature-branch/BrokenRendering.md

Screenshots

gitea-broken-markdown

@lafriks lafriks added this to the 1.x.x milestone Aug 2, 2017
@stale
Copy link

stale bot commented Feb 13, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

@stale stale bot added the issue/stale label Feb 13, 2019
@wyattoday
Copy link
Author

This issue is still very much alive: https://try.gitea.io/wyattoday/Test1234/src/branch/master/README.md

@stale stale bot removed the issue/stale label Feb 13, 2019
@techknowlogick techknowlogick added the issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented label Feb 27, 2019
@lunny
Copy link
Member

lunny commented Jan 2, 2021

@lunny lunny closed this as completed Jan 2, 2021
@lunny lunny removed this from the 1.x.x milestone Jan 2, 2021
@wyattoday
Copy link
Author

Still very broken.

See : https://try.gitea.io/wyattoday/Test1234/raw/branch/master/README.md

Notice how the lines after the <script> tag are not written out.

The markdown renderer should remove malicious tags (expecially <script>) to match how other systems like Gitlab / Github render markdown.

@zeripath
Copy link
Contributor

zeripath commented Jan 2, 2021

How would you suggest we do this?

How should they be rendered? Please link to an actual example of how you expect it be rendered and explain your rules for how we are supposed to determine what was "supposed to inside" the unclosed tag that we need to sanitize away. Including what you do when the tag is closed.

@wyattoday
Copy link
Author

wyattoday commented Jan 2, 2021

Script tags should be stripped completely. Or render them as text.

Either way, the current markdown behavior of breaking after a tag it doesn't like isn't a good way to handle things.

@zeripath
Copy link
Contributor

zeripath commented Jan 2, 2021

The script tag is stripped along with its contents. Script is a block level element so an unclosed script block contains everything after it.

@wyattoday
Copy link
Author

wyattoday commented Jan 2, 2021

We ran into this issue in real life by writing documentation with <script> being written in some markdown documentation. We wanted it rendered as text. This is how it’s done on GitHub / Gitlab. (See 2 sentences back in this comment for how it’s just rendered as text, no backticks necessary. Just write it and the markdown renders it as text.)

So, the ideal solution is to either strip it out correctly (I.e. if the tag is never closed, then loop back and assume it’s a standalone tag), or ideally match the behavior of GH/GL and render as text.

What gittea does currently is the worst of both worlds (strips it and breaks the rest of the rendering).

@lunny lunny reopened this Jan 3, 2021
@lunny lunny added type/proposal The new feature has not been accepted yet but needs to be discussed first. and removed type/bug labels Jan 3, 2021
@lunny
Copy link
Member

lunny commented Jan 3, 2021

Then, I think the title should be changed to render script tag as text but not it's a break.

@wyattoday wyattoday changed the title An unclosed, unescaped <script> tag in markdown breaks all subsequent markdown rendering. An unclosed, unescaped <script> tag in markdown should be rendered as text to match GH/GL behavior Jan 3, 2021
@zeripath
Copy link
Contributor

zeripath commented Jan 3, 2021

it appears that @wyattoday wants a completely different DOM to the way bluemonday creates it. They almost want the file to be passed through html tidy before sanitizing,

I suspect however that regexp replacing <(/?script[ >]) with &lt;$1 in sanitizer.go would work.

zeripath added a commit to zeripath/gitea that referenced this issue Jan 3, 2021
@milahu
Copy link

milahu commented Oct 1, 2023

Script tags should be stripped completely. Or render them as text.

generally, im voting to ignore these tags in markdown in the blob api:

  • <script>...</script>
  • <style>...</style>
  • <head>...</head>
  • <!doctype html>
  • <html>
  • </html>

this is useful to render html files as markdown in the blob api
by creating a symlink from readme.md to index.html

example:

We wanted it rendered as text. This is how it’s done on GitHub / Gitlab.

both do it wrong.
if you want to render the text <script> then just write &lt;script&gt;
which works in all markdown renderers

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 6, 2025

It's impossible to correctly support it by the current goldmark render used by Gitea, unless someone writes a special HTML tag parser for goldmark to handle "unclosed" tags.

I believe writing `<script>` or &lt;script&gt; is the right approach for all markdown renders.

@wxiaoguang wxiaoguang closed this as not planned Won't fix, can't repro, duplicate, stale Apr 6, 2025
@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 6, 2025

And we can have a simple improvement to output the broken (unclosed) tags: Make markdown render match GitHub's behavior #34129 , it should be good enough for most cases, telling the writer that "you forgot to correctly close or escape the tags".

Image

@wxiaoguang
Copy link
Contributor

Hmm, good news, according to my test, I think the fix should almost behave the same as GitHub.

So I think this issue could be closed as "completed" now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/proposal The new feature has not been accepted yet but needs to be discussed first.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants