[Bug 60963] New: Optimize class loading for unpackWARs=false case

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[Bug 60963] New: Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

            Bug ID: 60963
           Summary: Optimize class loading for unpackWARs=false case
           Product: Tomcat 8
           Version: 8.0.43
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Catalina
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ----

Created attachment 34902
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34902&action=edit
Proposed patch

Attached patch improves the class loading time for unpackWARs=false especially
when the war is build with only uncompressed entries.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #1 from Thomas Meyer <[hidden email]> ---
Depending on the deployed WAR files loading is up to 7 times faster.
Would be nice if somebody from the tomcat developers could have a look at this.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #2 from Thomas Meyer <[hidden email]> ---
I did als upload the patch here:
http://static.217.14.99.88.clients.your-server.de/501

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #3 from Mark Thomas <[hidden email]> ---
Looking at this is on my TODO list. The main question I have is does it still
avoid file locking issues? These will be more obvious Windows (you won't be
able to delete the WAR) but they affect all platforms.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #4 from Christopher Schultz <[hidden email]> ---
It looks like isInWAR in getJarInputStreamWrapper can be leaked.

Will calling getWebResourceSet().closeJarFile undo this performance
optimization?

For ZipInputStreamWithPosition constructor, should you do this instead:

        super(new PushbackCountingInputStream(in, 512));

That seems a little cleaner.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #5 from Thomas Meyer <[hidden email]> ---
Regarding file locking issue: I didn't check this. I'll try to test this.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #6 from Thomas Meyer <[hidden email]> ---
Regarding ZipInputStreamWithPosition:
I think you found the ugly part of this patch. I guess your suggestion wouldn't
work, because ZipInputStream will construct itself with a new
PushBackInputstream and the provided input stream as decorated object. Passing
the input stream as you suggest would result in a PushBackInputStream with an
PushBackInputStreamWithPosition as decorated object, which is wrong for our
intention here.
By setting the decorated object after the super constructor has finished we
replace the PushbsckInputStream with our implementation.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #7 from Christopher Schultz <[hidden email]> ---
(In reply to Thomas Meyer from comment #6)
> Regarding ZipInputStreamWithPosition:
> I think you found the ugly part of this patch. I guess your suggestion
> wouldn't work, because ZipInputStream will construct itself with a new
> PushBackInputstream and the provided input stream as decorated object.
> Passing the input stream as you suggest would result in a
> PushBackInputStream with an PushBackInputStreamWithPosition as decorated
> object, which is wrong for our intention here.
> By setting the decorated object after the super constructor has finished we
> replace the PushbsckInputStream with our implementation.

I'm sorry, I don't follow. What's the problem with the superclass's constructor
using the PushbackCountingInputStream instead of handing it an undecorated
InputStream, and then replacing it after the constructor has completed?

Does the superclass constructor do something with the InputStream that we don't
want to happen to the PushbackCountingInputStream?

I would think that "bytesread" would be incorrect (initialized to zero) if the
superclass constructor performs a read. Don't you want those reads to be
counted?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #8 from Thomas Meyer <[hidden email]> ---
(In reply to Christopher Schultz from comment #7)

> (In reply to Thomas Meyer from comment #6)
> > Regarding ZipInputStreamWithPosition:
> > I think you found the ugly part of this patch. I guess your suggestion
> > wouldn't work, because ZipInputStream will construct itself with a new
> > PushBackInputstream and the provided input stream as decorated object.
> > Passing the input stream as you suggest would result in a
> > PushBackInputStream with an PushBackInputStreamWithPosition as decorated
> > object, which is wrong for our intention here.
> > By setting the decorated object after the super constructor has finished we
> > replace the PushbsckInputStream with our implementation.
>
> I'm sorry, I don't follow. What's the problem with the superclass's
> constructor using the PushbackCountingInputStream instead of handing it an
> undecorated InputStream, and then replacing it after the constructor has
> completed?
Yes, exactly this is what is happening:
"super(new PushbackInputStream(in, 512), new Inflater(true), 512);"
In the super constructor.

> Does the superclass constructor do something with the InputStream that we
> don't want to happen to the PushbackCountingInputStream?

Yes, it decorates the passed InputStream with a PushBackInputStream.

>
> I would think that "bytesread" would be incorrect (initialized to zero) if
> the superclass constructor performs a read. Don't you want those reads to be
> counted?

Theoretically we want those bytes to be counted, but the ZipInputStream doesn't
read anything in it's construction phase, that's why the whole "fast forward
skipping works at all".
And yes this is a somewhat hackish solution but it works.

I did also look at what commons-compress provides but their implementation also
doesn't have a callback for the event "new PK entry in the InputStream reached"

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #9 from Mark Thomas <[hidden email]> ---
I've spent a little time looking at this.

I can confirm that I see similar performance improvements with local testing.

I can also confirm I have not observed any file locking. I haven't looked for
other leaks but if there are any, I expect them to be simple to fix.

The one thing that worries me about this patch is the degree to which it
depends on the JVM internals. While this works with the current Oracle JVM, my
concern is for JVMs from other vendors and future Oracle versions.

I did take a quick look at the ZIP specification [1] and it appears it should
be fairly simple to read the file names and data offsets from the local file
headers. I wonder if writing a parser that extracts just the info we need and
skips the rest might be a better option.

Finally, there are a few changes in the patch that aren't strictly related to
fixing the problem at hand. It is generally better to put that sort of clean-up
in a separate patch (no need to re-submit this patch - the comment is more for
future reference).

[1] https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #10 from Thomas Meyer <[hidden email]> ---
(In reply to Mark Thomas from comment #9)

> The one thing that worries me about this patch is the degree to which it
> depends on the JVM internals. While this works with the current Oracle JVM,
> my concern is for JVMs from other vendors and future Oracle versions.

I ha exactly the same concerns! But I was sure about how much change the
upstream development would accept.

>
> I did take a quick look at the ZIP specification [1] and it appears it
> should be fairly simple to read the file names and data offsets from the
> local file headers. I wonder if writing a parser that extracts just the info
> we need and skips the rest might be a better option.

Yes, I had the same idea while figuring how to abuse the ZipInputStream for
above solution. How hard could it be to parse an header entry...

>
> Finally, there are a few changes in the patch that aren't strictly related
> to fixing the problem at hand. It is generally better to put that sort of
> clean-up in a separate patch (no need to re-submit this patch - the comment
> is more for future reference).

Okay, I always try to separate the relevant changes from unrelated stuff, but
sometimes I miss something while preparing a patch.

>
> [1] https://pkware.cachefly.net/webdocs/casestudies/

Another remark from my side:
The JarInputStream is used to parse each i.e. Jar file once per Webapp class
loader. I tried to understand the verifying of the manifest and possible signed
entries, but I failed. Do you have a better understanding of this topic and can
you say if something did break in this area? How to check?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #11 from Konstantin Kolinko <[hidden email]> ---
(In reply to Thomas Meyer from comment #8)
>
> Theoretically we want those bytes to be counted, but the ZipInputStream
> doesn't read anything in it's construction phase, that's why the whole "fast
> forward skipping works at all".
> And yes this is a somewhat hackish solution but it works.
>

1. I do not know whether this is a concern (I have not looked at the patch),
but note the behaviour that was recently reported as bug 60940:

JarInputStream constructor does read MANIFEST.MF and META-INF/ directory entry
that may precede it.

2. Zip* and Jar* APIs differ in their handling of i18n filenames. JAR
specification uses UTF-8 for file names, Zip uses platform encoding by default.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #12 from Thomas Meyer <[hidden email]> ---
(In reply to Konstantin Kolinko from comment #11)

> (In reply to Thomas Meyer from comment #8)
> >
> > Theoretically we want those bytes to be counted, but the ZipInputStream
> > doesn't read anything in it's construction phase, that's why the whole "fast
> > forward skipping works at all".
> > And yes this is a somewhat hackish solution but it works.
> >
>
> 1. I do not know whether this is a concern (I have not looked at the patch),
> but note the behaviour that was recently reported as bug 60940:
>
> JarInputStream constructor does read MANIFEST.MF and META-INF/ directory
> entry that may precede it.

Yes, I saw this and it's still done after the patch.

What I didn't finish to understand is the validation of the signed MANIFEST.MF
entries. Is it really enough to just read through the whole JAR file with a
JarInputStream with getNextJarEntry() and everything is automagically checked
for correctness?

Do we care that a WAR file contains only correct signed JAR files? I never
really was involved in the topic signed war files and/or containing jar files
inside a war file.

>
> 2. Zip* and Jar* APIs differ in their handling of i18n filenames. JAR
> specification uses UTF-8 for file names, Zip uses platform encoding by
> default.

That is an important hint, I think we should really go for an minimal
functional ZIP "PK entry" parser to extract the offset of each "PK entry
start", to fast forward in the jar file inside the war file.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #13 from Thomas Meyer <[hidden email]> ---
Hi,

I implemented a basic ZIP file parser. An updated patch is here:
http://static.217.14.99.88.clients.your-server.de/501

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #14 from Thomas Meyer <[hidden email]> ---
Hi,

any news on this? Do you want me to attach the patch here? Anything else I can
do?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

Thomas Meyer <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #34902|0                           |1
        is obsolete|                            |

--- Comment #15 from Thomas Meyer <[hidden email]> ---
Created attachment 34980
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34980&action=edit
updated patch

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

Thomas Meyer <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #34980|0                           |1
        is obsolete|                            |

--- Comment #16 from Thomas Meyer <[hidden email]> ---
Created attachment 34981
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34981&action=edit
updated patch

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #17 from Thomas Meyer <[hidden email]> ---
The upcoming commons-compress 1.14 library will have support for fetching the
offsets of the zip local headers -
https://github.com/apache/commons-compress/pull/22

should we use that library in favour of my custom written zip parser?

opinions?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #18 from Mark Thomas <[hidden email]> ---
That is probably overkill in terms of size of the library compared to the
current patch.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[Bug 60963] Optimize class loading for unpackWARs=false case

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60963

--- Comment #19 from Mark Thomas <[hidden email]> ---
Just a quick note to say I haven't forgotten about this. It is still on my TODO
list. I hope to get to it in the next week or so.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

12