RFC - json_validate() #9399

Merged
merged 22 commits into from
Oct 8, 2022

@juan-morales
Contributor Author

Hello @bukka.

Here is my new PR. I changed the name of the function to json_validate, as you were actually the third person to suggest that name 👍

Honestly, I could go further with the code, but I would have to touch the parser more, and you mentioned that I should stay away from json_decode as much as I could. What I had in mind (but did not do) was to move all the hooks that the parser can trigger out of the parser itself and into a different place, giving the developer the chance to reuse the parser with custom function hooks. Doing so would involve moving things around, and I thought that something like that would negatively affect my RFC; in the end, the functionality is what I care about most.

Anyway, hopefully this goes in the direction you suggested.

I don't mind changing the code as many times as needed, seriously.

Thanks in advance, and I will wait for your reply.

PS: I will keep the PR as a draft, and keep the commits as well, until the time comes to squash.

@TimWolla added the RFC label Aug 23, 2022
@juan-morales
Contributor Author

juan-morales commented Aug 25, 2022

@cmb69 @dstogov @bukka @nikic

Whenever you can, could you give me some feedback on this?

Thanks in advance 😄

PS:

  1. Sorry for tagging you specifically, but you are the ones I have been in contact with about this RFC so far
  2. The RFC document still needs some completion, but I would like to know if this implementation is good enough

@bukka
Member

bukka commented Aug 25, 2022

It looks good to me for the RFC purpose. There are some minor things that we can iron out if the RFC gets accepted but otherwise it's fine.

Member

@cmb69 left a comment

Yeah, certainly good enough to proceed with the RFC process. Thank you!

@cmb69 marked this pull request as ready for review August 25, 2022 11:52
@juan-morales changed the title from "RFC - json_validate() - WIP" to "RFC - json_validate()" Aug 25, 2022
ext/json/json.c Outdated
}

if (!(options & PHP_JSON_THROW_ON_ERROR)) {
    JSON_G(error_code) = PHP_JSON_ERROR_NONE;
Member

You're setting the error_code here, but wouldn't it make more sense to return this value from the function? That would mean that instead of returning a boolean, it would have to be an integer, with 0 likely having to mean no error, but that is not something that is unheard of.

Contributor Author

I have to say that returning a boolean was my preferred approach. If you think we should return an int instead, let me know.

Contributor Author

I have been thinking it over, and I really prefer returning bool and being able to call json_last_error() to find out what happened.

Contributor Author

I wrote this down in the RFC as an open question
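
To make the two API shapes discussed in this thread concrete, here is a minimal C sketch. It is not the PR's code: every name (demo_error, demo_check, demo_validate_code, demo_validate_bool, demo_get_last_error) is hypothetical, and demo_check is a toy stand-in for a real parser that only rejects NULL or empty input.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes; 0 means "no error", as suggested in the review. */
typedef enum {
    DEMO_ERROR_NONE = 0,
    DEMO_ERROR_SYNTAX = 1
} demo_error;

/* Global "last error" slot, playing the role of a json_last_error()-style state. */
static demo_error demo_last_error = DEMO_ERROR_NONE;

/* Toy stand-in for a parser: NULL or empty input counts as a syntax error. */
demo_error demo_check(const char *input)
{
    return (input != NULL && input[0] != '\0') ? DEMO_ERROR_NONE
                                                : DEMO_ERROR_SYNTAX;
}

/* Shape A: return the error code directly; no global state is touched. */
int demo_validate_code(const char *input)
{
    return (int)demo_check(input);
}

/* Shape B: return a bool and record the reason in the global slot.
 * The slot is also reset to "none" on success. */
bool demo_validate_bool(const char *input)
{
    demo_last_error = demo_check(input);
    return demo_last_error == DEMO_ERROR_NONE;
}

/* Accessor for the reason recorded by the most recent Shape B call. */
demo_error demo_get_last_error(void)
{
    return demo_last_error;
}
```

Shape A keeps the call free of side effects but makes a successful result falsy (0), which is easy to misread in a condition. Shape B reads naturally in an if statement at the cost of a write to shared state.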

ext/json/json.c Outdated
if (php_json_yyparse(&parser)) {
    php_json_error_code error_code = php_json_parser_error_code(&parser);
    if (!(options & PHP_JSON_THROW_ON_ERROR)) {
        JSON_G(error_code) = error_code;
Member

A validation routine should not have side effects, such as setting a global error_code.

Contributor Author

I will check this and get back to you.

Contributor Author

@juan-morales Aug 26, 2022

@derickr
The reason I kept that code is the use of the json_last_error() function.

Other developers on the mailing list and I find it useful to have this information, to know why a validation failed, so we can get messages like:

  • Malformed UTF-8 characters, possibly incorrectly encoded
  • Syntax error
  • Maximum stack depth exceeded

etc.

I think it is a good thing to have, and so do some other devs on the mailing list.

What do you think?

Member

I think this is a legit solution; an alternative could be to return the error code as int (instead of a bool).

Contributor Author

I wrote this down in the RFC as an open question

@juan-morales
Contributor Author

I will update the PR tomorrow to address all the reviews.

@juan-morales
Contributor Author

juan-morales commented Aug 28, 2022

@bukka I already pushed a change addressing your review.

Can you take a look at the change again?

@juan-morales
Contributor Author

@TysonAndre @bukka wow!!!!! Thanks for such great feedback. I will make the changes later today; I'm traveling at the moment.

@juan-morales
Contributor Author

Thanks in advance. Hopefully I understood the comments correctly... at least the automated tests are green.

@juan-morales
Contributor Author

juan-morales commented Oct 7, 2022

@bukka @TysonAndre @cmb69 , RFC Accepted! 🥳

Should I squash the commits on this PR into one?

Is there anything else I should do besides updating the RFC as stated in step 7.1 of the RFC process (RFC How-To)?

@TimWolla
Member

TimWolla commented Oct 7, 2022

> Is there anything else I should do besides updating the RFC as stated in step 7.1 of the RFC

You should likely send another email to the list, mentioning that the vote finished and summarizing the voting results.

Contributor

@TysonAndre left a comment

I don't see any issues, the implementation looks correct. One optional style nit on the bit flag check.

ZEND_PARSE_PARAMETERS_END();


if ((options != 0) && (options != PHP_JSON_INVALID_UTF8_IGNORE)) {
Contributor

Suggested change
- if ((options != 0) && (options != PHP_JSON_INVALID_UTF8_IGNORE)) {
+ if (options & ~PHP_JSON_INVALID_UTF8_IGNORE) {

This works, but checking for any bits set outside the allowed flags (a bitwise AND with the bitwise NOT of the allowed mask) is shorter and conventional. It also keeps working when combinations of multiple flags are allowed.
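
As a rough illustration of the mask check suggested above, here is a small self-contained C sketch. The flag names and values are hypothetical and are not PHP's real constants; only the `options & ~allowed` pattern is the point.

```c
#include <stdbool.h>

/* Hypothetical flags for illustration only; not PHP's real values. */
#define DEMO_FLAG_INVALID_UTF8_IGNORE (1u << 0)
#define DEMO_FLAG_OTHER               (1u << 1)

/* Every flag the function accepts, OR-ed together.
 * Supporting another flag later only means extending this mask. */
#define DEMO_ALLOWED_FLAGS (DEMO_FLAG_INVALID_UTF8_IGNORE)

/* Returns true when no bit outside the allowed mask is set. */
bool demo_flags_are_valid(unsigned int options)
{
    return (options & ~DEMO_ALLOWED_FLAGS) == 0;
}
```

Unlike comparing against each accepted value one by one, this form rejects an allowed flag combined with a disallowed one without any extra cases.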

@TysonAndre merged commit 2e8699f into php:master Oct 8, 2022
@TimWolla
Member

TimWolla commented Oct 8, 2022

@TysonAndre Will you also make the NEWS/UPGRADING changes?

@TysonAndre
Contributor

> Will you also make the NEWS/UPGRADING changes?

Done, I added documentation of those changes in 4ed8d52 and 68301b1

@staabm
Contributor

staabm commented Jun 27, 2023

@juan-morales do you plan to add a new php.net manual page for this new function?

I would like to add support for it in PHPStan and was wondering which JSON flags are supported

@TimWolla
Member

> do you plan to add a new php.net manual page for this new function?

Based on my experience with PHP 8.2, the manual pages are written “in bulk” from the UPGRADING notes once the release gets nearer. See: php/doc-en#1803

> was wondering which JSON flags are supported

Only PHP_JSON_INVALID_UTF8_IGNORE.

@marksgerasimovs

marksgerasimovs commented Oct 16, 2023

Greetings. I have a small corner case that I don't think gets addressed here: what if a JSON object uses the same key twice? PHP's json_decode would just overwrite the value and keep the last one in order, but Java, for example, would throw an error because the JSON is not valid. I think it is okay that PHP can still parse it, and that json_decode stays silent about the issue in a way, but perhaps a function that validates my JSON should give me information about it. What do you think, guys?

@juan-morales
Contributor Author

Hello @marksgerasimovs, thanks for the message.
This is a very good question. I will raise it on the internals mailing list.

@a1vin1au

a1vin1au commented Jan 6, 2025

Update: Thanks to Ayesh for pointing out that it is valid JSON.

Hi, I just found a bug in both 8.3.15 and 8.4.2.

When the string is "", it returns true, but it should be false since that is not valid JSON, right?

json_validate('""'); // returns true

Screenshots below: I tested the function in Laravel Tinker with PHP 8.3.15 and 8.4.2
image
image

@Ayesh
Member

Ayesh commented Jan 6, 2025

@a1vin1au

> When the string is "", it returns true, but it should be false since that is not valid JSON, right?

"" is a valid JSON string. You can confirm this in JavaScript by running the following in a browser console:

JSON.parse('""'); 

It returns an empty string, which is consistent with PHP's json_decode.

@a1vin1au

a1vin1au commented Jan 6, 2025

@Ayesh
Oh, I never knew that "" was a valid JSON string; I have been working with things like {"a":"b"} the whole time.
Thank you for teaching me something new.

@juan-morales
Contributor Author

Remember, in this context "valid JSON" means JSON that json_decode accepts 😄

@cmb69
Member

cmb69 commented Jan 6, 2025

An empty string ("") is valid JSON according to https://datatracker.ietf.org/doc/html/rfc8259 (as well as https://datatracker.ietf.org/doc/html/rfc7159); it is only invalid for the informational https://datatracker.ietf.org/doc/html/rfc4627.html.
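
The difference between these RFCs is the top-level grammar: RFC 4627 defines a JSON text as an object or an array, while RFC 7159 and RFC 8259 allow any JSON value at the top level. Below is a minimal C sketch of the stricter RFC 4627 top-level rule. The function names are hypothetical, and the check only inspects the first non-whitespace character; it does not validate the rest of the document.

```c
#include <stdbool.h>

/* Skip the four JSON whitespace characters: space, tab, LF, CR. */
const char *demo_skip_ws(const char *s)
{
    while (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r') {
        s++;
    }
    return s;
}

/* RFC 4627 top-level rule only: the text must start an object or an array.
 * This is an assumption-laden toy check, not a full JSON validator. */
bool demo_top_level_ok_rfc4627(const char *text)
{
    char first = *demo_skip_ws(text);
    return first == '{' || first == '[';
}
```

Under this rule a bare string such as "" is rejected at the top level, whereas the newer RFCs accept it, which matches the behaviour described in this thread.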
