Re: excessive commits / autowarming

Barnett, Jeffrey
On closer examination of a longer log snippet, the better question might be: why is MarcImporter opening, closing, and autowarming Searchers all over the place?

INFO: Opening Searcher@776482 main
Jul 6, 2008 12:26:21 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Jul 6, 2008 12:26:21 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@776482 main from Searcher@15e134 main
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jul 6, 2008 12:26:21 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@776482 main
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jul 6, 2008 12:26:21 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@776482 main from Searcher@15e134 main
        queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jul 6, 2008 12:26:21 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@776482 main
        queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jul 6, 2008 12:26:21 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@776482 main from Searcher@15e134 main
        documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jul 6, 2008 12:26:21 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for Searcher@776482 main
        documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Jul 6, 2008 12:26:21 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [Solr] Registered new searcher Searcher@776482 main
Jul 6, 2008 12:26:21 PM org.apache.solr.search.SolrIndexSearcher close
INFO: Closing Searcher@15e134 main
        filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

________________________________________
From: [hidden email] [[hidden email]] On Behalf Of Barnett, Jeffrey [[hidden email]]
Sent: Sunday, July 06, 2008 12:25 PM
To: Wayne Graham; [hidden email]
Subject: [VuFind-Tech] excessive commits

We are trying to get our full set of 7 million records loaded, and along with problems with the records themselves, it seems to take longer and longer to add records.

One thing I notice in the logs is a possibly excessive amount of committing and deduping on very small record sets.
1) is this normal?
2) is there a control to minimize it?
3) is this a good thing, and I just don't realize why?

Sample Log:
Jul 6, 2008 12:08:54 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true)
Jul 6, 2008 12:08:54 PM org.apache.solr.update.DirectUpdateHandler2 doDeletions
INFO: DirectUpdateHandler2 deleting and removing dups for 89 ids
Jul 6, 2008 12:08:54 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@33f652 DirectUpdateHandler2
Jul 6, 2008 12:10:15 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true)
Jul 6, 2008 12:10:15 PM org.apache.solr.update.DirectUpdateHandler2 doDeletions
INFO: DirectUpdateHandler2 deleting and removing dups for 91 ids
Jul 6, 2008 12:10:15 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@81f8ea DirectUpdateHandler2
Jul 6, 2008 12:10:17 PM org.apache.solr.update.DirectUpdateHandler2 doDeletions
INFO: DirectUpdateHandler2 docs deleted=0
Jul 6, 2008 12:10:17 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@fcf912 main

-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
Vufind-tech mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/vufind-tech


Re: excessive commits / autowarming

wsgrah
Administrator
This is a "feature" of Solr 1.3. In 1.2, you could only set a maxDocs for
autocommits in the update handler. This behavior changed a bit in 1.3 to
also include a maxTime element. Essentially, if either of the two criteria
is met, a commit occurs (which is what you're seeing).  I just committed a
change that will extend this to 20 seconds or 10,000 records.

Wayne


--
/**
  * Wayne Graham
  * Earl Gregg Swem Library
  * PO Box 8794
  * Williamsburg, VA 23188
  * 757.221.3112
  * http://swem.wm.edu/blogs/waynegraham/
  */




Re: excessive commits / autowarming

Barnett, Jeffrey
I'm glad to hear this is tunable.  If the change isn't too complex (I assume a solrconfig parameter), could you post it separately (or send me a note) so that it can be customized on a site-by-site (or even run-by-run) basis?

-----Original Message-----
From: Wayne Graham [mailto:[hidden email]]
Sent: Monday, July 07, 2008 9:31 AM
To: Barnett, Jeffrey
Cc: [hidden email]
Subject: Re: excessive commits / autowarming

This is a "feature" of Solr 1.3. In 1.2, you could only set a maxDocs for
autocommits in the update handler. This behavior changed a bit in 1.3 to
also include a maxTime element. Essentially, if either of the two criteria
is met, a commit occurs (which is what you're seeing).  I just committed a
change that will extend this to 20 seconds or 10,000 records.

Wayne

Re: excessive commits / autowarming

wsgrah
Administrator
In solrconfig.xml:

...
<maxTime>20000</maxTime>
...
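
For context, a fuller version of that fragment might look like the sketch below. It assumes the layout of the example solrconfig.xml that ships with Solr, where autoCommit sits inside the updateHandler element. The maxDocs value of 10000 is taken from the 10,000-record figure mentioned earlier in the thread, and maxTime is in milliseconds. Treat it as an illustration, not the exact change that was committed to VuFind.

```xml
<!-- Sketch only: assumes the standard layout of the Solr example solrconfig.xml -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit once this many documents are pending ... -->
    <maxDocs>10000</maxDocs>
    <!-- ... or once this many milliseconds have elapsed, whichever comes first -->
    <maxTime>20000</maxTime>
  </autoCommit>
</updateHandler>
```

For an offline bulk load with no query traffic, raising both limits (or committing only once at the end of the run) should reduce how often a new Searcher is opened and warmed. That is general Solr tuning advice and has not been verified against this particular setup.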

Re: excessive commits / autowarming

Barnett, Jeffrey
Great, thanks.

Are there any other solrconfig tweaks you might recommend for building large indices?  I've scanned and learned from the Solr wiki you pointed to earlier (thanks for that too), but I'm thinking there might be specific things about the VuFind environment that call for special-case settings.  For one thing, I'm doing all of this offline, with no simultaneous query activity.

-----Original Message-----
From: Wayne Graham [mailto:[hidden email]]
Sent: Monday, July 07, 2008 9:58 AM
To: Barnett, Jeffrey
Cc: [hidden email]
Subject: Re: excessive commits / autowarming

In solrconfig.xml:

...
<maxTime>20000</maxTime>
...

Re: excessive commits / autowarming

wsgrah
Administrator
I'm still going through the new stuff in Solr, but there are some
interesting new elements.

ramBufferSizeMB can be set together with maxBufferedDocs. If both are set, the
first one reached triggers a flush. The default is 32MB. You may also
want to look at the httpCaching section if you are using caching. Also,
if you're behind a load balancer, check out the healthcheck section
(uncomment the server-enabled section).

Wayne
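
As a rough sketch of the buffer settings described above: this assumes they live in the indexDefaults section, as in the example solrconfig.xml shipped with Solr 1.3. The element name in the config file is ramBufferSizeMB. The 32 is the default mentioned in the message; the maxBufferedDocs value is purely illustrative.

```xml
<!-- Sketch only: values other than the 32MB default are illustrative -->
<indexDefaults>
  <!-- flush buffered documents to disk when the RAM buffer reaches this size (in MB) -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <!-- ... or when this many documents are buffered, whichever limit is hit first -->
  <maxBufferedDocs>1000</maxBufferedDocs>
</indexDefaults>
```

These limits control when buffered documents are flushed to disk, which is separate from the autoCommit settings that control when a commit opens a new Searcher.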

Re: excessive commits / autowarming

Barnett, Jeffrey
Thanks again.  I'm going to give all of these a try on my next batch of a million records (current batch still loading after 19 hours) and report back on the results.

-----Original Message-----
From: Wayne Graham [mailto:[hidden email]]
Sent: Monday, July 07, 2008 11:55 AM
To: Barnett, Jeffrey
Cc: [hidden email]
Subject: Re: excessive commits / autowarming

I'm still going through the new stuff in Solr, but there are some
interesting new elements.

ramBufferSizeMB can be set together with maxBufferedDocs. If both are set, the
first one reached triggers a flush. The default is 32MB. You may also
want to look at the httpCaching section if you are using caching. Also,
if you're behind a load balancer, check out the healthcheck section
(uncomment the server-enabled section).

Wayne

Re: excessive commits / autowarming

wsgrah
Administrator
Are you running on Slowlaris, er, Solaris? Which JDK are you using? What
kind of processor?

A million records really shouldn't take that long. I'm doing 2 million
on my  desktop in about 90 minutes.

Wayne

Re: excessive commits / autowarming

Barnett, Jeffrey
Solaris 10 on a five-year-old V880 (SPARC) with 6 CPUs and 16GB RAM.
JDK 1.5.0_7

While indexing, we never use more than 20% of one CPU and 6GB of memory, with zero swapping.

We used to get about 400K/hr, but never had more than 2 million loaded at a time.  Time became a problem when we went above that number and/or when we upgraded beyond rev 680 (current rev is 759).

I doubt the hardware or OS is the culprit.  The same configuration runs our production Voyager system for 600 staff and 20000+ faculty and students.  We also run a staff-only Tomcat server on a smaller machine.  Somewhere we just have a bad choice/default of parameters.

-----Original Message-----
From: Wayne Graham [mailto:[hidden email]]
Sent: Monday, July 07, 2008 12:20 PM
To: Barnett, Jeffrey
Cc: [hidden email]
Subject: Re: excessive commits / autowarming

Are you running on Slowlaris, er, Solaris? Which JDK are you using? What
kind of processor?

A million records really shouldn't take that long. I'm doing 2 million
on my  desktop in about 90 minutes.

Wayne

Barnett, Jeffrey wrote:

> Thanks again.  I'm going to give all of these a try on my next batch of a million records (current batch still loading after 19 hours) and report back on the results.
>
> -----Original Message-----
> From: Wayne Graham [mailto:[hidden email]]
> Sent: Monday, July 07, 2008 11:55 AM
> To: Barnett, Jeffrey
> Cc: [hidden email]
> Subject: Re: excessive commits / autowarming
>
> I'm still going through the new stuff in Solr, but there are some
> interesting new elements.
>
> ramBufferedSizeMB (can be set with maxBufferedDocs) If both are set, the
> first one reached triggers a flush. The default is 32MB. You may also
> want to look at the httpCaching section if you are using caching, also
> if you're behind a load balancer, check out the healthcheck section
> (uncomment the server-eneabled section).
>
> Wayne
>
> Barnett, Jeffrey wrote:
>
>> Great, thanks.
>>
>> Are there any other solrconfig tweaks you migh recommend for building large indices?  I've scanned and learned from the solr wiki you pointed to earlier (thanks for that too), but I'm thinking there might be specific things about the vufind environment that might make special case logic apply.  For one thing, I'm doing all of this offline, with no simultaneous query activity.
>>
>> -----Original Message-----
>> From: Wayne Graham [mailto:[hidden email]]
>> Sent: Monday, July 07, 2008 9:58 AM
>> To: Barnett, Jeffrey
>> Cc: [hidden email]
>> Subject: Re: excessive commits / autowarming
>>
>> In solrconfig.xml:
>>
>> ...
>> <maxTime>20000</maxTime>
>> ...
>>
>> Barnett, Jeffrey wrote:
>>
>>
>>> I'm glad to hear this is tuneable.  If the change isn't too complex (I assume a solrconfig parameter), could you post it separately (or send me a note) so that it can be customized on a site by site (or even run by run) basis?
>>>
>>> -----Original Message-----
>>> From: Wayne Graham [mailto:[hidden email]]
>>> Sent: Monday, July 07, 2008 9:31 AM
>>> To: Barnett, Jeffrey
>>> Cc: [hidden email]
>>> Subject: Re: excessive commits / autowarming
>>>
>>> This is a "feature" of Solr 1.3. In 1.2, you only set a maxDocs for
>>> autocommits in the update handler. This behavior changed a bit in 1.3 to
>>> also include a maxTime element. Essentially, if either of the two criteria
>>> is met, a commit occurs (which is what you're seeing).  I just committed a
>>> change that will extend this to 20 seconds or 10,000 records.
>>>
>>> Wayne
>>>
>>> Barnett, Jeffrey wrote:
>>>
>>>
>>>
>>>> On closer examination of a longer log snippet, the better question might be: why is MarcImporter opening, closing, and autowarming Searchers all over the place?
>>>> ....
>>>>
>>>>
>>>>

--
/**
  * Wayne Graham
  * Earl Gregg Swem Library
  * PO Box 8794
  * Williamsburg, VA 23188
  * 757.221.3112
  * http://swem.wm.edu/blogs/waynegraham/
  */



-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
Vufind-tech mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/vufind-tech

Re: excessive commits / autowarming

wsgrah
Administrator
Since you're on an older system with an older Java build, you can get a
little more performance out of tuning the young generation size
(something like -XX:NewRatio=5). Also, you may want to profile your GC.
You can see GC activity by passing -XX:+PrintGC. Also, Solaris has some
customized options for Java (like intimate shared memory and multiple
page size support for large memory).

For a big list of VM options, check out
http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp

Also, there is a lot more processing going on now for more in-depth
faceting. Still, even on my wife's laptop (a four-year-old Centrino with
Java 6 and a gig of RAM), I was getting around 380/second (doing the
math, that's about 1,368,000/hour).
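
For anyone who wants to redo the rate arithmetic with their own numbers, here is a minimal sketch. The function names are made up for illustration, and the only inputs are the figures quoted in this thread (about 380 records/second on the laptop, about 400K records/hour on the v880).

```python
SECONDS_PER_HOUR = 3600


def records_per_hour(records_per_second: float) -> float:
    """Convert an indexing rate from records/second to records/hour."""
    return records_per_second * SECONDS_PER_HOUR


def records_per_second(per_hour: float) -> float:
    """Convert an indexing rate from records/hour to records/second."""
    return per_hour / SECONDS_PER_HOUR


def hours_to_index(total_records: int, rate_per_second: float) -> float:
    """Estimate the hours needed to index total_records at a steady rate."""
    return total_records / records_per_hour(rate_per_second)


if __name__ == "__main__":
    laptop_rate = 380  # records/second, as reported for the laptop
    print(f"{records_per_hour(laptop_rate):,.0f} records/hour")
    print(f"{hours_to_index(1_000_000, laptop_rate):.2f} hours per million records")
    # The older 400K/hour figure expressed per second, for comparison
    print(f"{records_per_second(400_000):.0f} records/second")
```

This assumes a steady rate for the whole run, which a real import with periodic commits will not quite match.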

That makes me think...have you tried running the indexing from solrmarc
on your system? I can help you out with that offline if you'd like to
investigate it.

Wayne

Barnett, Jeffrey wrote:

> Solaris 10 on a five year old v880 (Sparc) 6 cpu, 16GB ram.
> JDK 1.5.0_7
>
> While indexing never use more than 20% of one cpu and 6GB memory, 0 swapping.
>
> We used to get about 400K/hr, but never had more than 2 million loaded at a time.  Time became a problem when we went above that number and/or when we upgraded beyond rev 680 (current rev is 759).
>
> I doubt the hardware or OS is the culprit.  The same configuration runs our production Voyager system for 600 staff and 20000+ faculty and students.  We also run a staff-only Tomcat server on a smaller machine.  Somewhere we just have a bad choice/default of parameters.
>
> -----Original Message-----
> From: Wayne Graham [mailto:[hidden email]]
> Sent: Monday, July 07, 2008 12:20 PM
> To: Barnett, Jeffrey
> Cc: [hidden email]
> Subject: Re: excessive commits / autowarming
>
> Are you running on Slowlaris, er, Solaris? Which JDK are you using? What
> kind of processor?
>
> A million records really shouldn't take that long. I'm doing 2 million
> on my desktop in about 90 minutes.
>
> Wayne
>

--
/**
  * Wayne Graham
  * Earl Gregg Swem Library
  * PO Box 8794
  * Williamsburg, VA 23188
  * 757.221.3112
  * http://swem.wm.edu/blogs/waynegraham/
  */




Re: excessive commits / autowarming

Steven McPhillips
In reply to this post by Barnett, Jeffrey
Hi Jeff,

My input here is purely incidental - I don't have anything much to  
offer in terms of speeding up your jobs, but you might find this info  
useful nonetheless.

On 08/07/2008, at 3:17 AM, Barnett, Jeffrey wrote:

> Solaris 10 on a five year old v880 (Sparc) 6 cpu, 16GB ram.
> JDK 1.5.0_7

Snap! Until last month, we had a similar machine (8-cpu v880) for our
Voyager server as well. We had the 900MHz cpus, from memory. Our
experience has been that the Solr import jobs like higher clock
speeds. In fact, systems like Solr don't really seem to benefit from
the usual Solaris/SPARC strengths of a massive backplane and high
throughput: IO is short random reads, the requests are short-lived,
and they thrive on a high clock speed.

>
> While indexing never use more than 20% of one cpu and 6GB memory, 0
> swapping.

Just checking - is this what plain "top" tells you? In a 6-cpu box, 1
cpu fully utilised will give you about 17% total cpu use, so 20% of a
single cpu would show up as only about 3%. Is this correct?

My guess is that the 20% is 1 cpu fully utilised, plus a bit more for
helper tasks. Solr is multithreaded, but the indexing process is
probably only invoking 1 UpdateHandler, meaning your update jobs can
only run on 1 cpu.
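
To make the two readings of "20%" concrete, here is a rough sketch of the arithmetic for a 6-cpu box. The helper names are hypothetical, and it assumes the tool reports a percentage of the whole machine rather than of a single cpu.

```python
def machine_share(cpus_busy: float, cpu_count: int) -> float:
    """Percent of the whole machine used when cpus_busy cpus are fully loaded."""
    return 100.0 * cpus_busy / cpu_count


def busy_cpus(machine_percent: float, cpu_count: int) -> float:
    """Number of fully loaded cpus implied by a machine-wide percentage."""
    return machine_percent / 100.0 * cpu_count


if __name__ == "__main__":
    cpus = 6
    # One cpu flat out, seen machine-wide
    print(f"1 busy cpu    = {machine_share(1, cpus):.1f}% of the machine")
    # 20% of a single cpu, seen machine-wide
    print(f"0.2 busy cpus = {machine_share(0.2, cpus):.1f}% of the machine")
    # 20% machine-wide, expressed as busy cpus
    print(f"20% machine   = {busy_cpus(20, cpus):.1f} busy cpus")
```

If the 20% is machine-wide, that works out to a little over one cpu, which fits the single-threaded update guess above.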

>
> We used to get about 400K/hr, but never had more than 2 million  
> loaded at a time.  Time became a problem when we went above that  
> number and/or when we upgraded beyond rev 680 (current rev is 759).

Do you get similar system usage loading fewer than 2 million records?
If not, it could be an IO issue with larger import files perhaps.

>
> I doubt the hardware or OS is the culprit.  The same configuration  
> runs our production Voyager system for 600 staff and 20000+ faculty  
> and students.

Maybe, but Solr and Voyager are very different beasts. It would be
better to compare Solr to Solr, so perhaps getting some more
information on how your box performs with different import loads
(100K, 500K, 1 million, 2 million records perhaps) and even different
versions of Java (if possible) would be useful.





----
Steven McPhillips <[hidden email]>
IT Business Systems
National Library of Australia
Try our new catalogue - http://catalogue.nla.gov.au



