Solrmarc - requires filesystem access to solr?

Greg Pendlebury
Now that I've got vufind up on a server I've been playing around with the possibility of removing solr from inside vufind so it can live on its own.

Eg:
'/home/solr' and '/home/vufind'
Instead of
'/home/vufind/' and '/home/vufind/solr'

The issue I want to clarify is why does vufind need to know the physical location of solr when solr is a web service? I've found the solr path in the config file for vufind as well as a part of solrmarc. From looking inside the solrmarc source quickly it _seems_ that solrmarc is reading the schema information and such out of solr's filesystem space directly.

Now not knowing a lot about solr/solrmarc at this stage I had a few thoughts/questions:

* Would it be possible/better for solrmarc to get this information from jetty via the web (if it is even available there), or from a local copy of such information?

* Is solrmarc indexing records straight into solr (filesystem) or doing so via jetty (web)?
* Could this be why (on my windows dev box) solrmarc was putting indexed records in the right spot whilst jetty (with incorrect paths as a windows service) thought the index was empty?

* Is this why jetty must be restarted to find indexed items?

If what I suspect above is true, it sounds like this is fundamental to the way solrmarc works, so (for importing) solr couldn't live in a separate location (or, more significantly, on a separate server). But would this be true for vufind if the import process is ignored? I ask because the local path to solr appears in vufind's config.ini.

Any thoughts appreciated.

Ta,

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841


This email (including any attached files) is confidential and is for the intended recipient(s) only. If you received this email by mistake, please, as a courtesy, tell the sender, then delete this email.

The views and opinions are the originator's and do not necessarily reflect those of the University of Southern Queensland. Although all reasonable precautions were taken to ensure that this email contained no viruses at the time it was sent we accept no liability for any losses arising from its receipt.

The University of Southern Queensland is a registered provider of education with the Australian Government (CRICOS Institution Code No's. QLD 00244B / NSW 02225M)
------------------------------------------------------------------------------
SF.Net email is Sponsored by MIX09, March 18-20, 2009 in Las Vegas, Nevada.
The future of the web can't happen without you.  Join us at MIX09 to help
pave the way to the Next Web now. Learn more and register at
http://ad.doubleclick.net/clk;208669438;13503038;i?http://2009.visitmix.com/
_______________________________________________
Vufind-tech mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/vufind-tech

Re: Solrmarc - requires filesystem access to solr?

wsgrah
Administrator
Greg,

Yes, solrmarc needs to know where to look for the config file so it can start its indexing. The idea behind solrmarc was to make the indexing of marc as fast as possible. To do this, we skip talking to Solr through Jetty, and just talk directly to Solr.

Would it be possible/better for solrmarc to get this information from jetty via the web (if it is even available there), or a local copy of such information.

You do need a way to tell the underlying Java where to look; in the web interface this is done in the Jetty configuration files. The easiest way to do this is to just point to where the files are.

Is solrmarc indexing records straight into solr (filesystem) or doing so via jetty (web)?
Solrmarc indexes records directly into solr on the filesystem, not via jetty.

Could this be why (on my windows dev box) solrmarc was putting indexed records in the right spot whilst jetty (with incorrect paths as a windows service) thought the index was empty?
Yes...when you index with solrmarc, the updateRequestHandler is on the filesystem, and the Jetty interface doesn't know to look for potentially new records until something happens through Jetty to trigger this (e.g. an update request or a Jetty restart).
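
To illustrate the kind of trigger mentioned above: one way to nudge a running Solr into noticing records written outside of Jetty is to send its update handler a commit message over HTTP. The sketch below is only an example under assumptions, not part of solrmarc or VuFind. The URL `http://localhost:8080/solr/update` and the helper names are placeholders, and whether a commit alone is enough may depend on your Solr version.

```python
import urllib.request

# Hypothetical location of the Solr update handler; adjust to your install.
SOLR_UPDATE_URL = "http://localhost:8080/solr/update"


def build_commit_request(update_url=SOLR_UPDATE_URL):
    """Build (but do not send) an HTTP POST carrying Solr's XML commit message."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send_commit(update_url=SOLR_UPDATE_URL, timeout=30):
    """Send the commit to a running Solr and return the HTTP status code."""
    request = build_commit_request(update_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    print(send_commit())
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before pointing the script at a live server.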

And yes, Solr can live on a separate server...in fact, it's probably a good idea. Depending on your server infrastructure, you may also find it easier to maintain in Tomcat. In my benchmarking a couple of years ago, Jetty was only slightly faster. If your server has a slow clock speed, you may also find it quicker to index your files on a desktop and move the resulting index to the server than to index on the slower processor.

Probably the only thing you don't want to do is to be indexing over http to another server...it could potentially bomb your network.

Did I get them all?

Wayne


/**
* @author Wayne Graham
* @web http://www.liquidfoot.com
*/
Franklin P. Jones  - "All women should know how to take care of children. Most of them will have a husband some day."


Re: Solrmarc - requires filesystem access to solr?

Greg Pendlebury
Thanks Wayne, you got the important stuff for indexing :)
 
I guess I could move solrmarc to live with solr instead of with vufind. Don't know why I didn't think of that before.
 
Probably the only question mark for me is the presence of solr's local path in vufind's config.ini?
 

Re: Solrmarc - requires filesystem access to solr?

Wayne Graham
I think we put it in there when we were trying to just use a single ini file...Andrew, is that still used?



Re: Solrmarc - requires filesystem access to solr?

Andrew Nagy-4
Solr's path is needed for VuFind only for the in-development admin module. The admin module will offer a web-based interface for editing the stopwords, synonyms, etc. So it is okay to remove solr from vufind - just make sure that config.ini points to the new location.
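
As a rough illustration of that last point, a small script can confirm that the path recorded in config.ini really points at the relocated Solr directory. Everything specific here is an assumption made for the example: the `[Index]` section, the `local` key and the `/home/solr` path are hypothetical stand-ins, so check your own config.ini for the actual names.

```python
import configparser
import os

# Hypothetical excerpt; the section and key names are assumed, not confirmed.
SAMPLE_INI = """
[Index]
url = http://localhost:8080/solr
local = /home/solr
"""


def solr_local_path(ini_text, section="Index", key="local"):
    """Return the filesystem path to Solr recorded in the ini text."""
    # Interpolation is disabled so a literal '%' in a path is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, key)


def points_at_existing_dir(ini_text):
    """True if the configured Solr path exists as a directory on this machine."""
    return os.path.isdir(solr_local_path(ini_text))


if __name__ == "__main__":
    path = solr_local_path(SAMPLE_INI)
    state = "exists" if points_at_existing_dir(SAMPLE_INI) else "is missing"
    print(path, state)
```

Run on the VuFind host after moving Solr, this would report whether the configured directory is actually there.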

Andrew


Re: Solrmarc - requires filesystem access to solr?

Greg Pendlebury
I suppose the significant dependency, then, is that solr's path must be writable by the vufind user?
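
One quick way to check this is to run a test as the account that serves VuFind. This is only a sketch under assumptions: the `/home/solr/conf` directory and the file names `stopwords.txt` and `synonyms.txt` are guesses based on the files Andrew mentions, and `os.access` reports what the current user may do, so the script has to be run as the vufind user to be meaningful.

```python
import os

# Hypothetical path and file names; substitute the ones from your install.
SOLR_CONF_DIR = "/home/solr/conf"
EDITABLE_FILES = ("stopwords.txt", "synonyms.txt")


def unwritable_files(conf_dir=SOLR_CONF_DIR, names=EDITABLE_FILES):
    """Return the files under conf_dir that the current user cannot write to."""
    problems = []
    for name in names:
        path = os.path.join(conf_dir, name)
        # os.access is also False when the file does not exist.
        if not os.access(path, os.W_OK):
            problems.append(path)
    return problems


if __name__ == "__main__":
    for path in unwritable_files():
        print("not writable (or missing):", path)
```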
 

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

 


From: Andrew Nagy [mailto:[hidden email]]
Sent: Thursday, 11 December 2008 1:55 PM
To: [hidden email]
Cc: Greg Pendlebury; [hidden email]
Subject: Re: [VuFind-Tech] Solrmarc - requires filesystem access to solr?

Solr's path is needed for VuFind only for the in-development admin module.  The admin module will offer a web based interface for editing the stopwords, synonyms, etc.  So it is okay to remove solr from vufind - just make sure that the config.ini points to the new location.

Andrew
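
As a quick check after moving Solr, the path stored in config.ini can be read back and tested. This is only a minimal sketch: the `[Index]` section and `local` key are assumptions about how config.ini is laid out, `/home/solr` is the example location from the original question, and `solr_local_path` is a hypothetical helper, not part of VuFind.

```python
import configparser
import os


def solr_local_path(ini_text, section="Index", key="local"):
    """Return the Solr filesystem path stored in a config.ini-style text.

    The section and key names are assumptions; adjust them to match
    whatever your own config.ini actually uses.
    """
    # strict=False tolerates repeated keys; interpolation=None leaves
    # any '%' characters in values untouched.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, key).strip('"')


# Hypothetical fragment standing in for the real config.ini.
SAMPLE = """
[Index]
engine = Solr
url = http://localhost:8080/solr
local = /home/solr
"""

path = solr_local_path(SAMPLE)
print(path, "exists" if os.path.isdir(path) else "is missing")
```

In practice you would pass the contents of the real config.ini instead of `SAMPLE`.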

On Wed, Dec 10, 2008 at 8:50 PM, Wayne Graham <[hidden email]> wrote:
I think we put it in there when we were trying to just use a single ini file...Andrew, is that still used?


/**
* @author Wayne Graham
* @web http://www.liquidfoot.com
*/
Marie von Ebner-Eschenbach  - "Even a stopped clock is right twice a day."

On Wed, Dec 10, 2008 at 8:12 PM, Greg Pendlebury <[hidden email]> wrote:
Thanks Wayne, you got the important stuff for indexing :)
 
I guess I could move solrmarc to live with solr instead of with vufind. Don't know why I didn't think of that before.
 
Probably the only question mark for me is the presence of solr's local path in vufind's config.ini?
 
Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841
 


From: [hidden email] [mailto:[hidden email]] On Behalf Of Wayne Graham
Sent: Thursday, 11 December 2008 11:04 AM
To: Greg Pendlebury
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Solrmarc - requires filesystem access to solr?

Greg,

Yes, solrmarc needs to know where to look for the config file so it can start its indexing. The idea behind solrmarc was to make the indexing of marc as fast as possible. To do this, we skip talking to Solr through Jetty, and just talk directly to Solr.

Would it be possible/better for solrmarc to get this information from jetty via the web (if it is even available there), or a local copy of such information.

You do need a way to tell the underlying Java where to look; in the web interface this is done in the Jetty configuration files. The easiest way to do this is to just point to where the files are.

Is solrmarc indexing records straight into solr (filesystem) or doing so via jetty (web)?
Solrmarc indexes records directly into solr.

Could this be why (on my windows dev box) solrmarc was putting indexed records in the right spot whilst jetty (with incorrect paths as a windows service) thought the index was empty?
Yes...when you index with solrmarc, the updateRequestHandler works on the filesystem, so the Jetty interface doesn't know to look for new records until something happens through Jetty to trigger this (e.g. an update request or a Jetty restart).

And yes, Solr can live on a separate server...in fact, it's probably a good idea. Depending on your server infrastructure, you may also find it easier to maintain in Tomcat. In my benchmarking a couple of years ago, Jetty was only slightly faster. If your server has a slow clock speed, you may also find it quicker to index your files on a desktop and move the index to the server than to index on the slower processor.

Probably the only thing you don't want to do is to be indexing over http to another server...it could potentially bomb your network.

Did I get them all?

Wayne
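
The "update request" mentioned above as a trigger can be sent to Solr over HTTP instead of restarting Jetty. The sketch below posts a `<commit/>` message to Solr's XML update handler. The base URL is an assumption (use your own host, port and path), and `build_commit_request` and `send_commit` are hypothetical helper names. Whether a commit alone makes directly written records visible may depend on your Solr version, so treat this as something to try rather than a guarantee.

```python
import urllib.request


def build_commit_request(solr_url):
    """Build (but do not send) an HTTP POST asking Solr to commit."""
    update_url = solr_url.rstrip("/") + "/update"
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send_commit(solr_url, timeout=30):
    """Send the commit to a running Solr and return the HTTP status code."""
    request = build_commit_request(solr_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call (hypothetical URL, needs a running Solr):
# send_commit("http://localhost:8080/solr")
```

Only a tiny control message crosses the network here; the bulk indexing itself still happens locally, in line with the advice above about not indexing over http.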


/**
* @author Wayne Graham
* @web http://www.liquidfoot.com
*/
Franklin P. Jones  - "All women should know how to take care of children. Most of them will have a husband some day."


Re: Solrmarc - requires filesystem access to solr?

Andrew Nagy-4
Just the config files, and that part isn't implemented yet. They need to be writable by the apache user, which is generally apache and not vufind, unless you have a dedicated server where you can change the apache user to be the vufind user.
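
One way to confirm the permissions described above is to run a small check as the web server user. This is a minimal sketch: the directory `/home/solr/conf` and the file names `stopwords.txt` and `synonyms.txt` are assumptions based on the files mentioned in this thread, and `unwritable_files` is a hypothetical helper.

```python
import os


def unwritable_files(conf_dir, names):
    """Return the names under conf_dir that the current user cannot write.

    Files that do not exist are reported as unwritable too.
    """
    return [
        name
        for name in names
        if not os.access(os.path.join(conf_dir, name), os.W_OK)
    ]


# Example call (hypothetical path); run it as the apache user so the
# result reflects that user's permissions:
# print(unwritable_files("/home/solr/conf", ["stopwords.txt", "synonyms.txt"]))
```

The check reflects whichever user runs the script, so running it from your own login says nothing about what apache can do.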


Re: Solrmarc - requires filesystem access to solr?

Greg Pendlebury
We do; I'm running apache as 'vufind' and jetty as 'solr'. I was trying to make them live in their own 'silos' (a proof of concept for separate servers), but I guess they can't. I'll set up write access.
 
Thanks for the responses.
 

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

 



Re: Solrmarc - requires filesystem access to solr?

Andrew Nagy-4
You definitely can run vufind on one machine and solr on another. The only tie is the solr config - you can set that up as an NFS mount so that the admin module can access the files.

Andrew

On Wed, Dec 10, 2008 at 11:13 PM, Greg Pendlebury <[hidden email]> wrote:
We do, I'm running apache as 'vufind' and jetty as 'solr'. I was trying to make them live in their own 'silos' (proof-of-concept for separate servers) but I guess they can't. I'll set up write access.
 
Thanks for the responses.
 

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

 


From: Andrew Nagy [mailto:[hidden email]]
Sent: Thursday, 11 December 2008 2:10 PM

To: Greg Pendlebury
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Solrmarc - requires filesystem access to solr?

Just the config files - which I haven't implemented yet.  And they need to be writable by the apache user - which is generally apache and not vufind.  Unless you have a dedicated server in which you can change the apache user to be the vufind user.

On Wed, Dec 10, 2008 at 11:03 PM, Greg Pendlebury <[hidden email]> wrote:
I suppose the significant dependence then is that the path of solr must be writeable to the vufind user?
 

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

 


From: Andrew Nagy [mailto:[hidden email]]
Sent: Thursday, 11 December 2008 1:55 PM
To: [hidden email]
Cc: Greg Pendlebury; [hidden email]

Subject: Re: [VuFind-Tech] Solrmarc - requires filesystem access to solr?

Solr's path is needed for VuFind only for the in-development admin module.  The admin module will offer a web based interface for editing the stopwords, synonyms, etc.  So it is okay to remove solr from vufind - just make sure that the config.ini points to the new location.

Andrew

On Wed, Dec 10, 2008 at 8:50 PM, Wayne Graham <[hidden email]> wrote:
I think we put it in there when we were trying to just use a single ini file...Andrew, is that still used?


/**
* @author Wayne Graham
* @web http://www.liquidfoot.com
*/
Marie von Ebner-Eschenbach  - "Even a stopped clock is right twice a day."

On Wed, Dec 10, 2008 at 8:12 PM, Greg Pendlebury <[hidden email]> wrote:
Thanks Wayne, you got the important stuff for indexing :)
 
I guess I could move solrmarc to live with solr instead of with vufind. Don't know why I didn't think of that before.
 
Probably the only question mark for me is the presence of solr's local path in vufind's config.ini?
 
Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841
 


From: [hidden email] [mailto:[hidden email]] On Behalf Of Wayne Graham
Sent: Thursday, 11 December 2008 11:04 AM
To: Greg Pendlebury
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Solrmarc - requires filesystem access to solr?

Greg,

Yes, solrmarc needs to know where to look for the config file so it can start its indexing. The idea behind solrmarc was to make the indexing of marc as fast as possible. To do this, we skip talking to Solr through Jetty, and just talk directly to Solr.

Would it be possible/better for solrmarc to get this information from jetty via the web (if it is even available there), or a local copy of such information.

You do need a way to tell the underlying Java where to look; in the web interface this is done in the Jetty configuration files. The easiest way to do this is to just point to where the files are.

Is solrmarc indexing records straight into solr (filesystem) or doing so via jetty (web)?
Solrmarc indexes records directly into Solr on the filesystem, not through Jetty.

Could this be why (on my windows dev box) solrmarc was putting indexed records in the right spot whilst jetty (with incorrect paths as a windows service) thought the index was empty?
Yes. When you index with solrmarc, the update request handler works directly on the filesystem, and the Jetty interface doesn't know to look for potentially new records until something happens through Jetty to trigger this (e.g. an update request or a Jetty restart).
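One way to give Jetty that nudge without a restart is to send Solr a commit over HTTP. The sketch below only builds the request; the URL is a hypothetical example and depends on your host, port and core, and whether a bare commit is enough to pick up the new records may depend on your Solr version.

```python
import urllib.request


def build_commit_request(solr_update_url: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST asking Solr to commit."""
    return urllib.request.Request(
        solr_update_url,
        data=b"<commit/>",  # Solr's XML update message for a commit
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical URL: adjust host, port and core to match your install.
request = build_commit_request("http://localhost:8080/solr/update")

# To actually send it against a running Solr:
#     with urllib.request.urlopen(request) as response:
#         print(response.status)
```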

And yes, Solr can live on a separate server...in fact, it's probably a good idea. Depending on your server infrastructure, you may also find it easier to maintain in Tomcat too. In my benchmarking a couple of years ago, Jetty was only slightly faster. You may also find, if your server has a slow clock speed, that it is quicker to index your files on a faster desktop and move the index to the server than to index on the slower processor.

Probably the only thing you don't want to do is to be indexing over http to another server...it could potentially bomb your network.

Did I get them all?

Wayne


/**
* @author Wayne Graham
* @web http://www.liquidfoot.com
*/
Franklin P. Jones  - "All women should know how to take care of children. Most of them will have a husband some day."

On Wed, Dec 10, 2008 at 7:33 PM, Greg Pendlebury <[hidden email]> wrote:

Now that I've got vufind up on a server I've been playing around with the possibility of removing solr from inside vufind so it can live on its own.

Eg:
'/home/solr' and '/home/vufind'
Instead of
'/home/vufind/' and '/home/vufind/solr'

The issue I want to clarify is why does vufind need to know the physical location of solr when solr is a web service? I've found the solr path in the config file for vufind as well as a part of solrmarc. From looking inside the solrmarc source quickly it _seems_ that solrmarc is reading the schema information and such out of solr's filesystem space directly.

Now not knowing a lot about solr/solrmarc at this stage I had a few thoughts/questions:

* Would it be possible/better for solrmarc to get this information from jetty via the web (if it is even available there), or a local copy of such information.

* Is solrmarc indexing records straight into solr (filesystem) or doing so via jetty (web)?
* Could this be why (on my windows dev box) solrmarc was putting indexed records in the right spot whilst jetty (with incorrect paths as a windows service) thought the index was empty?

* Is this why jetty must be restarted to find indexed items?

If what I suspect above is true it sounds like it's a fundamental of the way solrmarc works, so (for importing) solr couldn't live in a separate location (or more significantly, a separate server). But would this be true for vufind if the import process is ignored? Since the local path to solr in vufind's config.ini?

Any thoughts appreciated.

Ta,

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841


This email (including any attached files) is confidential and is for the intended recipient(s) only. If you received this email by mistake, please, as a courtesy, tell the sender, then delete this email.

The views and opinions are the originator's and do not necessarily reflect those of the University of Southern Queensland. Although all reasonable precautions were taken to ensure that this email contained no viruses at the time it was sent we accept no liability for any losses arising from its receipt.

The University of Southern Queensland is a registered provider of education with the Australian Government (CRICOS Institution Code No's. QLD 00244B / NSW 02225M)

------------------------------------------------------------------------------
SF.Net email is Sponsored by MIX09, March 18-20, 2009 in Las Vegas, Nevada.
The future of the web can't happen without you.  Join us at MIX09 to help
pave the way to the Next Web now. Learn more and register at
http://ad.doubleclick.net/clk;208669438;13503038;i?http://2009.visitmix.com/
_______________________________________________
Vufind-tech mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/vufind-tech




Re: Solrmarc - requires filesystem access to solr?

Greg Pendlebury
Ahh, good thinking.
 
Do you think the admin module should check whether this path exists, and limit its functionality accordingly when it doesn't?
 
That way you could use local or remote indexes without as much concern. I'm thinking of a future where we want to add secondary indexes that we might not have ownership of (very far future for us, but worth considering).
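To make the idea concrete, here is a minimal sketch of that kind of check. It is not VuFind code; the function name and the returned keys are made up for illustration, and it only looks at the directory itself rather than at individual files such as the stopwords or synonyms lists.

```python
import os
from typing import Optional


def solr_config_access(local_solr_path: Optional[str]) -> dict:
    """Report what an admin page could do with Solr's config directory."""
    # No path configured, or the path is not visible from this machine:
    # treat the index as remote and switch the editing features off.
    if not local_solr_path or not os.path.isdir(local_solr_path):
        return {"available": False, "can_read": False, "can_edit": False}
    return {
        "available": True,
        "can_read": os.access(local_solr_path, os.R_OK),
        "can_edit": os.access(local_solr_path, os.W_OK),
    }
```

An admin page could hide the editors when `available` is false and show the files read-only when `can_edit` is false.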
 

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

 


From: Andrew Nagy [mailto:[hidden email]]
Sent: Thursday, 11 December 2008 2:26 PM
To: Greg Pendlebury
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Solrmarc - requires filesystem access to solr?

You definitely can run vufind on one machine and solr on another.  The only tie is the solr config files - you can expose them over an NFS mount so that the admin module can access them.

Andrew

On Wed, Dec 10, 2008 at 11:13 PM, Greg Pendlebury <[hidden email]> wrote:
We do, I'm running apache as 'vufind' and jetty as 'solr'. I was trying to make them live in their own 'silos' (proof-of-concept for separate servers) but I guess they can't. I'll set up write access.
 
Thanks for the responses.
 

Greg Pendlebury
Electronic Services Officer (Systems Team)
Division of Academic Information Services
University of Southern Queensland
Phone: +61 7 4631 1501
Fax: +61 7 4631 1841

 


