JSTL issue

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

JSTL issue

Jean-Louis MONTEIRO
Hi guys,

I have one JSTL issue and I'd need your feedback on it.
https://github.com/eclipse-ee4j/jstl-api/issues/140

Can you guys have a look and let me know what you think?

--
Jean-Louis
Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Mark Thomas-2
On 13/04/2021 20:17, Jean-Louis MONTEIRO wrote:
> Hi guys,
>
> I have one JSTL issue and I'd need your feedback on it.
> https://github.com/eclipse-ee4j/jstl-api/issues/140
>
> Can you guys have a look and let me know what you think?

That looks like a side-effect of the various improvements we made to the
Default Servlet to do a better job of including content with a variety
of (potentially incompatible) encodings.

Generally, I'd expect the BoM to be skipped.

Historically, Tomcat didn't skip the BoM, so the original golden file
was generated on that basis.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Jean-Louis MONTEIRO
Thanks for the answer.

Do you mind adding the comment in the issue?
I can copy/paste if not. It's just to give context to others

Thanks

Le mar. 13 avr. 2021 à 21:44, Mark Thomas <[hidden email]> a écrit :

> On 13/04/2021 20:17, Jean-Louis MONTEIRO wrote:
> > Hi guys,
> >
> > I have one JSTL issue and I'd need your feedback on it.
> > https://github.com/eclipse-ee4j/jstl-api/issues/140
> >
> > Can you guys have a look and let me know what you think?
>
> That looks like a side-effect of the various improvements we made to the
> Default Servlet to do a better job of including content with a variety
> of (potentially incompatible) encodings.
>
> Generally, I'd expect the BoM to be skipped.
>
> Historically, Tomcat didn't skip the BoM, so the original golden file
> was generated on that basis.
>
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
Jean-Louis
Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Jean-Louis MONTEIRO
I've got an answer from JSTL team.
Here it is

Here is my take:

>
>    1. Jakarta Tags Specification Section 7.4 details the <c:import> tag:
>    c:import
>    <https://github.com/eclipse-ee4j/jstl-api/blob/master/spec/src/main/asciidoc/jakarta-stl.adoc#74-cimport>
>
> Within this, it details the following: Character Encoding : The
> <c:import/> for import-encoded.txt does not include an encoding so the
> default encoding is used: ISO-8859-1
>
>    1. If I were to do the following: <c:import url="import-encoded.txt"
>    charEncoding="UTF-16"/> then we would not see the BOM since the proper
>    encoding is used and would see the actual text in the file rather than
>    garbled characters which is what the test is actually looking for, see
>    below.
>
> This test is just checking the following:
>
> <%-- If encoding is not specified, and no encoding is specified in the
>      response of the imported resource, the default of ISO-8859-1
>      will be used. --%>
>
> As to whether or not to strip the BOM when the content type of the file
> uses a BOM and the content type being used does not use a BOM I disagree
> that it should be removed, and here is why:
>
>    -
>
>    The encoding defaults to ISO-8859-1 which as far as I know doesn't use
>    the BOM so just including the text as is in the <c:import/> seems
>    reasonable to me.
>    -
>
>    Tomcat remove the BOM from the file even when the ISO-8859-1 encoding
>    is used to read the UTF-16 encoded text which I don't see how that
>    improves anything. It sounds like this is stripped outside of any Jakarta
>    Tags or Jakarta Pages specification and is done as a specific change to
>    Tomcat since Tomcat last was executed against the TCK.
>
> At this point from a Jakarta Tags perspective, I believe the golden file
> is correct.
>





Le mer. 14 avr. 2021 à 10:16, Jean-Louis MONTEIRO <[hidden email]> a
écrit :

> Thanks for the answer.
>
> Do you mind adding the comment in the issue?
> I can copy/paste if not. It's just to give context to others
>
> Thanks
>
> Le mar. 13 avr. 2021 à 21:44, Mark Thomas <[hidden email]> a écrit :
>
>> On 13/04/2021 20:17, Jean-Louis MONTEIRO wrote:
>> > Hi guys,
>> >
>> > I have one JSTL issue and I'd need your feedback on it.
>> > https://github.com/eclipse-ee4j/jstl-api/issues/140
>> >
>> > Can you guys have a look and let me know what you think?
>>
>> That looks like a side-effect of the various improvements we made to the
>> Default Servlet to do a better job of including content with a variety
>> of (potentially incompatible) encodings.
>>
>> Generally, I'd expect the BoM to be skipped.
>>
>> Historically, Tomcat didn't skip the BoM, so the original golden file
>> was generated on that basis.
>>
>> Mark
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [hidden email]
>> For additional commands, e-mail: [hidden email]
>>
>>
>
> --
> Jean-Louis
>


--
Jean-Louis
Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Mark Thomas-2
On 15/04/2021 09:03, Jean-Louis MONTEIRO wrote:
> I've got an answer from JSTL team.
> Here it is

<snip/>

>>     1. Jakarta Tags Specification Section 7.4 details the <c:import> tag:
>>     c:import
>>     <https://github.com/eclipse-ee4j/jstl-api/blob/master/spec/src/main/asciidoc/jakarta-stl.adoc#74-cimport>
>>
>> Within this, it details the following: Character Encoding : The
>> <c:import/> for import-encoded.txt does not include an encoding so the
>> default encoding is used: ISO-8859-1

</snip>

>> At this point from a Jakarta Tags perspective, I believe the golden file
>> is correct.

Thanks for passing that on.

The Default Servlet improvements were written from the perspective of
including static content with a variety of encodings where the correct
encoding was not always known (or maintaining an accurate mapping of
encoding to resource would impose a significant overhead).

JSTL is coming from the perspective that the encoding of the included
target is always known.

Currently Tomcat provides the 'useBomIfPresent' option to control the
BoM handling. The current values are:

- true  - BoM is stripped if present and any BoM found used to determine
           the encoding used to read the resource. This is the default.

- false - BoM is stripped and resource is read using the configured file
           encoding (which will be the platform default if not explicitly
           configured)

If we wanted to address this and provide a way to allow JSTL to have the
control over the included content required to pass this TCK test then we
could modify 'useBomIfPresent' as follows:

- true   - no change - remains the default

- false  - no change

- ignore - as current false but does not strip the BoM from the output

This would have no impact on existing users but using the new ignore
option would allow the JSTL TCK to pass.

I do wonder if this use case has any real world consequences. For that
to be that case there would need to be an application where:
- JSTL was importing static resources
- the content of static resource started with the same bytes as a valid
   BoM

That seems unlikely as the BoM values look to have been chosen to avoid
this. While it is (very?) unlikely, it isn't impossible so I'm not
against this change. Normally, I'd worry about regressions but the test
case coverage is good in this area.

Any objections to implementing this? Thoughts on a better solution?

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Jean-Louis MONTEIRO
Hi,

I definitely value the feedback and thoughts.

I understand and I agree it's unlikely to be a real world use case.
I am happy to give it a try to improve the flag and submit a PR for it.

I'll make sure to fire a new TCK build with everything (servlet, jsp, etc)
so we have more confidence in the fix.

Thanks

Le jeu. 15 avr. 2021 à 10:57, Mark Thomas <[hidden email]> a écrit :

> On 15/04/2021 09:03, Jean-Louis MONTEIRO wrote:
> > I've got an answer from JSTL team.
> > Here it is
>
> <snip/>
>
> >>     1. Jakarta Tags Specification Section 7.4 details the <c:import>
> tag:
> >>     c:import
> >>     <
> https://github.com/eclipse-ee4j/jstl-api/blob/master/spec/src/main/asciidoc/jakarta-stl.adoc#74-cimport
> >
> >>
> >> Within this, it details the following: Character Encoding : The
> >> <c:import/> for import-encoded.txt does not include an encoding so the
> >> default encoding is used: ISO-8859-1
>
> </snip>
>
> >> At this point from a Jakarta Tags perspective, I believe the golden file
> >> is correct.
>
> Thanks for passing that on.
>
> The Default Servlet improvements were written from the perspective of
> including static content with a variety of encodings where the correct
> encoding was not always known (or maintaining an accurate mapping of
> encoding to resource would impose a significant overhead).
>
> JSTL is coming from the perspective that the encoding of the included
> target is always known.
>
> Currently Tomcat provides the 'useBomIfPresent' option to control the
> BoM handling. The current values are:
>
> - true  - BoM is stripped if present and any BoM found used to determine
>            the encoding used to read the resource. This is the default.
>
> - false - BoM is stripped and resource is read using the configured file
>            encoding (which will be the platform default if not explicitly
>            configured)
>
> If we wanted to address this and provide a way to allow JSTL to have the
> control over the included content required to pass this TCK test then we
> could modify 'useBomIfPresent' as follows:
>
> - true   - no change - remains the default
>
> - false  - no change
>
> - ignore - as current false but does not strip the BoM from the output
>
> This would have no impact on existing users but using the new ignore
> option would allow the JSTL TCK to pass.
>
> I do wonder if this use case has any real world consequences. For that
> to be that case there would need to be an application where:
> - JSTL was importing static resources
> - the content of static resource started with the same bytes as a valid
>    BoM
>
> That seems unlikely as the BoM values look to have been chosen to avoid
> this. While it is (very?) unlikely, it isn't impossible so I'm not
> against this change. Normally, I'd worry about regressions but the test
> case coverage is good in this area.
>
> Any objections to implementing this? Thoughts on a better solution?
>
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
Jean-Louis
Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Christopher Schultz-2
In reply to this post by Mark Thomas-2
Mark,

On 4/15/21 04:57, Mark Thomas wrote:
> If we wanted to address this and provide a way to allow JSTL to have the
> control over the included content required to pass this TCK test then we
> could modify 'useBomIfPresent' as follows:
>
> - true   - no change - remains the default
>
> - false  - no change
>
> - ignore - as current false but does not strip the BoM from the output

I might re-name the "ignore" case to "pass-through" to be perfectly
clear about what's happening. "Ignore" might be mis-interpreted to mean
that the BOM would be removed. "Pass-through" makes it clear that the
BOM will still be sent IMHO.

-chris

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Jean-Louis MONTEIRO
Hi,

Now that we crossed the finish line with TomEE compatibility, I'd like to give back the BOM changes according to this discussion.

I did the updated and created another subclass for DefaultServletEncodingBaseTest

Did some fixes around that test to take the changes into account.
I'll see if I can get this to fully pass



Good news, it's backward compatible as we wanted.
I may post if I need some help or guidance.

As soon as it's done, I believe we want a bugzilla ticket so I can link it to a PR?


Le jeu. 15 avr. 2021 à 19:19, Christopher Schultz <[hidden email]> a écrit :
Mark,

On 4/15/21 04:57, Mark Thomas wrote:
> If we wanted to address this and provide a way to allow JSTL to have the
> control over the included content required to pass this TCK test then we
> could modify 'useBomIfPresent' as follows:
>
> - true   - no change - remains the default
>
> - false  - no change
>
> - ignore - as current false but does not strip the BoM from the output

I might re-name the "ignore" case to "pass-through" to be perfectly
clear about what's happening. "Ignore" might be mis-interpreted to mean
that the BOM would be removed. "Pass-through" makes it clear that the
BOM will still be sent IMHO.

-chris

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]



--
Jean-Louis
Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Christopher Schultz-2
Jean-Louis,

On 5/5/21 09:49, Jean-Louis MONTEIRO wrote:

> Now that we crossed the finish line with TomEE compatibility, I'd like
> to give back the BOM changes according to this discussion.
>
> I did the updated and created another subclass for
> DefaultServletEncodingBaseTest
> image.png
> Did some fixes around that test to take the changes into account.
> I'll see if I can get this to fully pass
>
> image.png
>
> Good news, it's backward compatible as we wanted.
> I may post if I need some help or guidance.
>
> As soon as it's done, I believe we want a bugzilla ticket so I can link
> it to a PR?

All the images were stripped from the list-posting.

Either Bugzilla or GitHub PR is fine.

-chris

> Le jeu. 15 avr. 2021 à 19:19, Christopher Schultz
> <[hidden email] <mailto:[hidden email]>> a
> écrit :
>
>     Mark,
>
>     On 4/15/21 04:57, Mark Thomas wrote:
>      > If we wanted to address this and provide a way to allow JSTL to
>     have the
>      > control over the included content required to pass this TCK test
>     then we
>      > could modify 'useBomIfPresent' as follows:
>      >
>      > - true   - no change - remains the default
>      >
>      > - false  - no change
>      >
>      > - ignore - as current false but does not strip the BoM from the
>     output
>
>     I might re-name the "ignore" case to "pass-through" to be perfectly
>     clear about what's happening. "Ignore" might be mis-interpreted to mean
>     that the BOM would be removed. "Pass-through" makes it clear that the
>     BOM will still be sent IMHO.
>
>     -chris
>
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: [hidden email]
>     <mailto:[hidden email]>
>     For additional commands, e-mail: [hidden email]
>     <mailto:[hidden email]>
>
>
>
> --
> Jean-Louis

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: JSTL issue

Jean-Louis MONTEIRO
Hi,

I went ahead and pull the code into a draft PR.
I'd appreciate any feedback or guidance.

https://github.com/apache/tomcat/pull/418

Lemme know if that's ok.
I believe I should maybe open something in the bug tracker?

Thanks

Le mer. 5 mai 2021 à 16:38, Christopher Schultz <
[hidden email]> a écrit :

> Jean-Louis,
>
> On 5/5/21 09:49, Jean-Louis MONTEIRO wrote:
> > Now that we crossed the finish line with TomEE compatibility, I'd like
> > to give back the BOM changes according to this discussion.
> >
> > I did the updated and created another subclass for
> > DefaultServletEncodingBaseTest
> > image.png
> > Did some fixes around that test to take the changes into account.
> > I'll see if I can get this to fully pass
> >
> > image.png
> >
> > Good news, it's backward compatible as we wanted.
> > I may post if I need some help or guidance.
> >
> > As soon as it's done, I believe we want a bugzilla ticket so I can link
> > it to a PR?
>
> All the images were stripped from the list-posting.
>
> Either Bugzilla or GitHub PR is fine.
>
> -chris
>
> > Le jeu. 15 avr. 2021 à 19:19, Christopher Schultz
> > <[hidden email] <mailto:[hidden email]>> a
> > écrit :
> >
> >     Mark,
> >
> >     On 4/15/21 04:57, Mark Thomas wrote:
> >      > If we wanted to address this and provide a way to allow JSTL to
> >     have the
> >      > control over the included content required to pass this TCK test
> >     then we
> >      > could modify 'useBomIfPresent' as follows:
> >      >
> >      > - true   - no change - remains the default
> >      >
> >      > - false  - no change
> >      >
> >      > - ignore - as current false but does not strip the BoM from the
> >     output
> >
> >     I might re-name the "ignore" case to "pass-through" to be perfectly
> >     clear about what's happening. "Ignore" might be mis-interpreted to
> mean
> >     that the BOM would be removed. "Pass-through" makes it clear that the
> >     BOM will still be sent IMHO.
> >
> >     -chris
> >
> >     ---------------------------------------------------------------------
> >     To unsubscribe, e-mail: [hidden email]
> >     <mailto:[hidden email]>
> >     For additional commands, e-mail: [hidden email]
> >     <mailto:[hidden email]>
> >
> >
> >
> > --
> > Jean-Louis
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
Jean-Louis