Extended Dismax


Extended Dismax

Demian Katz
I've done a very simple implementation of eDismax here:

https://github.com/vufind-org/vufind/pull/42

This switches VuFind to use eDismax by default and eliminates a lot of YAML configuration that is no longer necessary. You can still switch back to the old behavior on demand through a new YAML setting. The only use case I can imagine for this is talking to an older Solr index that doesn't support eDismax, but there is no harm in maintaining backward compatibility, at least for the moment!
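
To make the switch concrete, here is a minimal Python sketch of the general idea. It is not the code from the pull request (VuFind itself is written in PHP): the function name `build_solr_params` and the `use_edismax` flag are hypothetical, and the legacy branch is only a rough approximation of building a multi-field query by hand. `defType`, `q` and `qf` are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_solr_params(user_query, fields, use_edismax=True):
    """Return Solr request parameters for a simple multi-field search.

    `fields` maps a Solr field name to a numeric boost (hypothetical input).
    """
    if use_edismax:
        # Let Solr's eDismax parser spread the query across the boosted fields.
        qf = " ".join(f"{name}^{boost}" for name, boost in fields.items())
        return {"defType": "edismax", "q": user_query, "qf": qf}
    # Legacy-style fallback: express the same intent in plain Lucene syntax.
    clauses = [f"{name}:({user_query})^{boost}" for name, boost in fields.items()]
    return {"defType": "lucene", "q": " OR ".join(clauses)}


if __name__ == "__main__":
    fields = {"title": 500, "author": 300}
    print(urlencode(build_solr_params("moby dick", fields)))
    print(urlencode(build_solr_params("moby dick", fields, use_edismax=False)))
```

With eDismax, the field list and boosts travel as a parameter instead of being baked into the query string, which is presumably why much of the per-field query configuration becomes unnecessary.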

Please let me know what you think -- if there are no objections, this can be merged into master soon.

thanks,
Demian

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
Vufind-tech mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/vufind-tech

Re: Extended Dismax

anna headley
How will this change affect search results?

Thanks,
Anna





On Fri, Oct 4, 2013 at 10:45 AM, Demian Katz <[hidden email]> wrote:
I've done a very simple implementation of eDismax here:

https://github.com/vufind-org/vufind/pull/42

This switches VuFind to use eDismax by default and eliminates lots of YAML configuration that is no longer necessary. It still allows the ability to switch back to the old behavior on demand through a new YAML setting, though the only use case I can imagine for this would be if you needed VuFind to talk to an older Solr index that doesn't support eDismax; still, no harm in maintaining backward compatibility for the moment at least!

Please let me know what you think -- if there are no objections, this can be merged into master soon.

thanks,
Demian


Re: Extended Dismax

Demian Katz

I’ve done a bit of testing, and for the most part, you get similar (not always identical) result sets, but in some cases, the relevance ranking is different.  I suspect that all significant differences are for the better – using real Dismax for advanced queries instead of the crazy Lucene-syntax hack we had before is likely eliminating some weird outliers that shouldn’t have been there in the first place.
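
For anyone who wants to repeat this kind of comparison, a minimal Python sketch follows. It assumes you have already collected the ranked record IDs returned for the same query under the old and new configurations; the function name `compare_rankings` and the sample IDs are made up for illustration.

```python
def compare_rankings(old_ids, new_ids):
    """Summarise how two ranked lists of record IDs differ."""
    old_rank = {doc_id: pos for pos, doc_id in enumerate(old_ids)}
    new_rank = {doc_id: pos for pos, doc_id in enumerate(new_ids)}
    shared = [doc_id for doc_id in old_ids if doc_id in new_rank]
    return {
        # Records that dropped out of, or newly entered, the result set.
        "only_old": [d for d in old_ids if d not in new_rank],
        "only_new": [d for d in new_ids if d not in old_rank],
        # Positive shift = the record moved up under the new configuration.
        "moved": {
            d: old_rank[d] - new_rank[d]
            for d in shared
            if old_rank[d] != new_rank[d]
        },
    }


if __name__ == "__main__":
    report = compare_rankings(["a", "b", "c", "d"], ["b", "a", "c", "e"])
    print(report)
```

Separating "different records" from "same records, different order" makes it easier to tell a changed result set apart from a changed relevance ranking.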

 

- Demian

 

From: anna headley [mailto:[hidden email]]
Sent: Monday, October 07, 2013 1:59 PM
To: Demian Katz
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

How will this change affect search results?

Thanks,
Anna

 

 


Re: Extended Dismax

anna headley
It looks like lowercase "and" / "or" are treated as boolean operators by default [1]. I think we should make this false by default in VuFind.
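
The eDismax parameter that controls this is `lowercaseOperators`. As a minimal Python sketch (the helper name `edismax_params` is hypothetical, and this is not VuFind's code), a request that turns the feature off explicitly could be assembled like this:

```python
from urllib.parse import urlencode


def edismax_params(user_query, lowercase_operators=False):
    """Build eDismax request parameters with lowercase operators set explicitly."""
    return {
        "defType": "edismax",
        "q": user_query,
        # Send the literal strings "true" / "false" rather than a Python bool.
        "lowercaseOperators": "true" if lowercase_operators else "false",
    }


if __name__ == "__main__":
    # With the flag off, "and" here is meant to be an ordinary search term.
    print(urlencode(edismax_params("pride and prejudice")))
```

Setting the parameter on every request, rather than relying on the parser's default, keeps the behavior the same regardless of which default a given Solr version ships with.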

Changing things like this which make special words behave differently is very jarring to librarians, so I'm concerned that there may be other things in edismax that "make things easier" (i.e. less transparent) that I'm not seeing, and which actually treat the same input differently in confusing ways.

I will try to look for other potential problems; if anyone else has greater familiarity with the details it would be great to hear some reassurance or specifics.

Anna

[1] http://wiki.apache.org/solr/ExtendedDisMax




On Mon, Oct 7, 2013 at 2:55 PM, Demian Katz <[hidden email]> wrote:

I’ve done a bit of testing, and for the most part, you get similar (not always identical) result sets, but in some cases, the relevance ranking is different.  I suspect that all significant differences are for the better – using real Dismax for advanced queries instead of the crazy Lucene-syntax hack we had before is likely eliminating some weird outliers that shouldn’t have been there in the first place.

 

- Demian

 


Re: Extended Dismax

Demian Katz
Thanks, Anna -- I was aware of the lowercase operator feature but hadn't realized it was on by default! That should definitely be turned off for consistency with past VuFind behavior.

I don't *think* there are any other surprises along these lines, but it definitely doesn't hurt to have more eyes on the problem!

- Demian

From: anna headley [[hidden email]]
Sent: Monday, October 07, 2013 4:24 PM
To: Demian Katz
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

It looks like lowercase and / or are treated as boolean operators by default [1]. I think we should make this false by default in VuFind.

Changing things like this which make special words behave differently is very jarring to librarians, so I'm concerned that there may be other things in edismax that "make things easier" (i.e. less transparent) that I'm not seeing, and which actually treat the same input differently in confusing ways.

I will try to look for other potential problems; if anyone else has greater familiarity with the details it would be great to hear some reassurance or specifics.

Anna

[1] http://wiki.apache.org/solr/ExtendedDisMax





Re: Extended Dismax

Demian Katz
In reply to this post by anna headley

Thanks again, Anna – fixed here:

 

https://github.com/vufind-org/vufind/commit/42df67acb3b8ab90e3733121a725c2a236c006c1

 

Please let me know if you spot any potential issues! Hopefully we can discuss this further at next week’s Summit and perhaps merge to master after the next dev call if no major problems are found.

 

- Demian

 


Re: Extended Dismax

Tod Olson
I'll just chime in that I'm in favor. 

My only concern is that the YAML Query stanzas are already eliminated. If someone goes to production and then discovers something really off, it takes more than flipping a switch to go back to the old behavior. It might be nice to have a period during which the old config is still available in a pinch, but it's not a major concern.

The message is still "edismax++".

-Tod


On Oct 8, 2013, at 8:02 AM, Demian Katz <[hidden email]> wrote:

Thanks again, Anna – fixed here:
 
 
Please let me know if you spot any potential issues! Hopefully we can discuss this further at next week’s Summit and perhaps merge to master after the next dev call if no major problems are found.
 
- Demian
 

Re: Extended Dismax

Demian Katz

Regarding the elimination of the Query stanzas, my motivation here was to make the YAML less confusing for new users; if somebody wants to roll back to legacy behavior, downloading the 2.1 YAML file from Git is not incredibly difficult. We certainly could leave them in for a while, or comment them out before removing them – I’m not totally opposed to either of those possibilities – but as I say, my main motivation here is to get rid of potentially confusing noise before we forget about it and leave it there for longer than we need to.

 

Anyone else have thoughts on this?

 

- Demian

 

From: Tod Olson [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:08 PM
To: Demian Katz
Cc: Tod Olson; anna headley; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

I'll just chime in that I'm in favor. 

 

My only concern is that the YAML Query stanzas are already eliminated. If someone goes production and then discovers something really off, it's more than flipping a switch to go back to the old behavior. It might be nice to have a period while that old config is still available in a pinch, but it's not a major concern.

 

The message is still "edismax++".

 

-Tod

 

 


Re: Extended Dismax

Eoghan Ó Carragáin
Hi,
Definitely, edismax++

I think it is OK to remove the Query stanzas to keep things simple for new users. However, it would be useful to:
  • update the wiki (https://vufind.org/wiki/searches_customizing_tuning_adding) to reflect the edismax config, and perhaps add a "Legacy VuFind/Lucene Behaviour" section. That section could point out that versions prior to 2.2 took a different approach, which is still supported if necessary, and link to an earlier version of searchspecs.yaml (which has always been pretty self-documenting anyway)
  • add a flag in the next upgrade script (if people have customised query field boosts, they may want to evaluate the impact of edismax)
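
As a rough illustration of the second point, an upgrade check could compare local boosts against the shipped defaults and report any differences. The Python sketch below is only an assumption about how such a check might look: it uses simplified dictionaries (search type mapped to field boosts) instead of the real searchspecs.yaml layout, and `find_customised_boosts` is a made-up name.

```python
def find_customised_boosts(defaults, local):
    """Return {search_type: {field: (default_boost, local_boost)}} for each difference."""
    report = {}
    for search_type, local_fields in local.items():
        default_fields = defaults.get(search_type, {})
        changed = {}
        # Look at every field that appears on either side.
        for field in sorted(set(default_fields) | set(local_fields)):
            before = default_fields.get(field)  # None if the field is new locally
            after = local_fields.get(field)     # None if the field was removed locally
            if before != after:
                changed[field] = (before, after)
        if changed:
            report[search_type] = changed
    return report


if __name__ == "__main__":
    defaults = {"Title": {"title": 500, "title_alt": 200}, "Author": {"author": 300}}
    local = {"Title": {"title": 750, "title_alt": 200}, "Author": {"author": 300}}
    for search_type, fields in find_customised_boosts(defaults, local).items():
        print(f"{search_type}: customised boosts {fields}; consider re-testing with edismax")
```

A non-empty report would be the cue for the upgrade script to raise its flag.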

Eoghan












On 9 October 2013 20:11, Demian Katz <[hidden email]> wrote:

Regarding the elimination of the Query stanzas, my motivation here was to make the YAML less confusing for new users; if somebody wants to roll back to legacy behavior, downloading the 2.1 YAML file from Git is not incredibly difficult. We certainly could leave them in for a while, or comment them out before removing them – I’m not totally opposed to either of those possibilities – but as I say, my main motivation here is to get rid of potentially confusing noise before we forget about it and leave it there for longer than we need to.

 

Anyone else have thoughts on this?

 

- Demian

 

From: Tod Olson [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:08 PM
To: Demian Katz
Cc: Tod Olson; anna headley; [hidden email]


Subject: Re: [VuFind-Tech] Extended Dismax

 

I'll just chime in that I'm in favor. 

 

My only concern is that the YAML Query stanzas are already eliminated. If someone goes production and then discovers something really off, it's more than flipping a switch to go back to the old behavior. It might be nice to have a period while that old config is still available in a pinch, but it's not a major concern.

 

The message is still "edismax++".

 

-Tod

 

 

On Oct 8, 2013, at 8:02 AM, Demian Katz <[hidden email]> wrote:



Thanks again, Anna – fixed here:

 

 

Please let me know if you spot any potential issues! Hopefully we can discuss this further at next week’s Summit and perhaps merge to master after the next dev call if no major problems are found.

 

- Demian

 

From: anna headley [mailto:[hidden email]gmail.com] 
Sent: Monday, October 07, 2013 4:25 PM
To: Demian Katz
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

It looks like lowercase and / or are treated as boolean operators by default [1]. I think we should make this false by default in VuFind.

Changing things like this which make special words behave differently is very jarring to librarians, so I'm concerned that there may be other things in edismax that "make things easier" (i.e. less transparent) that I'm not seeing, and which actually treat the same input differently in confusing ways.

I will try to look for other potential problems; if anyone else has greater familiarity with the details it would be great to hear some reassurance or specifics.

 

Anna


[1] http://wiki.apache.org/solr/ExtendedDisMax


 

On Mon, Oct 7, 2013 at 2:55 PM, Demian Katz <[hidden email]> wrote:

I’ve done a bit of testing, and for the most part, you get similar (not always identical) result sets, but in some cases, the relevance ranking is different.  I suspect that all significant differences are for the better – using real Dismax for advanced queries instead of the crazy Lucene-syntax hack we had before is likely eliminating some weird outliers that shouldn’t have been there in the first place.

 

- Demian

 


Re: Extended Dismax (advanced search bug!)

Demian Katz

This makes sense to me.

 

One other very important detail I just noticed: advanced search is currently broken in the edismax branch. I’ll have to take another look at the query builder and see what’s going on. Might have to wait until after next week’s conferences, but I’ll try to squeeze it in this week if I can!

 

- Demian

 

From: Eoghan Ó Carragáin [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:38 PM
To: Demian Katz
Cc: Tod Olson; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

Hi,
Definitely, edismax++

I think it is ok to remove the Query stanzas to keep things simple for new users. However, it'd be useful to:

  • update the wiki (https://vufind.org/wiki/searches_customizing_tuning_adding) to reflect edismax config & perhaps add a "Legacy VuFind/Lucene Behaviour" section which points out that versions prior to 2.2 took a different approach which is still supported if necessary & links to an earlier version of searchspecs.yaml (which has always been pretty self-documenting anyway) 
  • add a flag in the next upgrade script (if people have customised query field boosts, they may want to evaluate the impact of edismax)

Eoghan

 







 

On 9 October 2013 20:11, Demian Katz <[hidden email]> wrote:

Regarding the elimination of the Query stanzas, my motivation here was to make the YAML less confusing for new users; if somebody wants to roll back to legacy behavior, downloading the 2.1 YAML file from Git is not incredibly difficult. We certainly could leave them in for a while, or comment them out before removing them – I’m not totally opposed to either of those possibilities – but as I say, my main motivation here is to get rid of potentially confusing noise before we forget about it and leave it there for longer than we need to.

 

Anyone else have thoughts on this?

 

- Demian

 

From: Tod Olson [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:08 PM
To: Demian Katz
Cc: Tod Olson; anna headley; [hidden email]


Subject: Re: [VuFind-Tech] Extended Dismax

 

I'll just chime in that I'm in favor. 

 

My only concern is that the YAML Query stanzas are already eliminated. If someone goes production and then discovers something really off, it's more than flipping a switch to go back to the old behavior. It might be nice to have a period while that old config is still available in a pinch, but it's not a major concern.

 

The message is still "edismax++".

 

-Tod

 

 

On Oct 8, 2013, at 8:02 AM, Demian Katz <[hidden email]> wrote:

 

Thanks again, Anna – fixed here:

 

 

Please let me know if you spot any potential issues! Hopefully we can discuss this further at next week’s Summit and perhaps merge to master after the next dev call if no major problems are found.

 

- Demian

 


Re: Extended Dismax

Greg Pendlebury-3
In reply to this post by Eoghan Ó Carragáin
Depending on how searches are constructed you may want to take a look at these tickets: https://issues.apache.org/jira/browse/SOLR-2368, particularly this one: https://issues.apache.org/jira/browse/SOLR-2649

We are trying to switch to eDisMax on one of our new systems and have run into the problem with the NOT operator. Users typing "term1 term2 NOT term3" get unexpected results because term1 and term2 get OR'd together instead of AND'd as soon as the NOT operator is present.


From our advanced search screen we can specifically construct a query to work around this, but when users manually type the extra NOT term on the end of an existing search it screws up.

Ta,
Greg



Re: Extended Dismax

Demian Katz

Thanks, Greg – good to know that there are still some rough edges; this makes me a little more inclined to be conservative about the transition…  but we’ll discuss in more detail on the next call.

 

Regarding the OR/NOT problem, is there a ticket for that? I didn’t notice it among the links you shared. In any case, I’ll take a closer look at that issue the next time I work on the edismax branch.

 

- Demian

 

From: Greg Pendlebury [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 10:27 PM
To: Eoghan Ó Carragáin
Cc: Demian Katz; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

Depending on how searches are constructed you may want to take a look at these tickets: https://issues.apache.org/jira/browse/SOLR-2368, particularly this one: https://issues.apache.org/jira/browse/SOLR-2649

We are trying to switch to eDisMax on one of our new systems and have run into the problem with the NOT operator. Users typing "term1 term2 NOT term3" get unexpected results because term1 and term2 get OR'd together instead of AND'd as soon as the NOT operator is present.

For example, it turns this:
http://vufind.org/demo/Search/Results?lookfor=test+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

into this:
http://vufind.org/demo/Search/Results?lookfor=test+OR+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

and we're forced to manually inject the AND to return it to normal:
http://vufind.org/demo/Search/Results?lookfor=test+AND+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

 

From our advanced search screen we can specifically construct a query to work around this, but when users manually type the extra NOT term on the end of an existing search it screws up.

Ta,
Greg
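
To make the difference between the three demo URLs in Greg's message easier to see, the short sketch below decodes the `lookfor` parameter from each one. It uses only the Python standard library; the helper name `lookfor` is made up for this illustration.

```python
from urllib.parse import parse_qs, urlsplit

BASE = "http://vufind.org/demo/Search/Results"
SUFFIX = "&type=AllFields&submit=Find&limit=20&sort=relevance"

# The three URLs quoted in the message above.
urls = [
    BASE + "?lookfor=test+terms+NOT+limits" + SUFFIX,
    BASE + "?lookfor=test+OR+terms+NOT+limits" + SUFFIX,
    BASE + "?lookfor=test+AND+terms+NOT+limits" + SUFFIX,
]


def lookfor(url):
    """Return the decoded search string carried in the lookfor parameter."""
    return parse_qs(urlsplit(url).query)["lookfor"][0]


for url in urls:
    print(lookfor(url))
```

The first string is what the user typed, the second is how it is reportedly interpreted, and the third is the manual workaround.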

 


Re: Extended Dismax (advanced search bug -- FIXED!)

Demian Katz
In reply to this post by Demian Katz

Update: I’ve fixed the advanced search bug, so I believe that the edismax branch is now fully functional. I’ve rebased onto the latest master and pushed a new version to update the pull request. (If this causes problems for anyone who was experimenting with this, just delete your local edismax branch and check it out again).

 

I’ve also confirmed that Greg’s report about bad behavior with NOT and the - operator is a real problem (not that I doubted him, of course). I’m not sure what to do about this; seems we have a few options:

 

1.)    Keep regular dismax the default until this is fixed, but add edismax as an experimental configurable option

2.)    Make edismax the new default but leave “classic” configuration so people can roll back if needed

3.)    Try to hack around it somehow – e.g. tokenize the query and inject ANDs. Obviously not a pretty option, and easily capable of causing as many problems as it solves.
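
To make option 3 a little more concrete, here is a rough sketch of what "tokenize the query and inject ANDs" might look like. This is an illustration under simplifying assumptions and not code from the edismax branch: the function name `inject_ands` is hypothetical, and the sketch deliberately ignores parentheses, fielded queries, and the + and - prefix operators, which is exactly where a hack like this could cause new problems.

```python
import re

# Quoted phrases stay as single tokens; everything else splits on whitespace.
TOKEN_RE = re.compile(r'"[^"]*"|\S+')
OPERATORS = {"AND", "OR", "NOT"}


def inject_ands(query):
    """Insert an explicit AND between adjacent terms that have no operator.

    Hypothetical sketch only: parentheses, fielded queries and the
    +/- prefix operators are not handled.
    """
    out = []
    for token in TOKEN_RE.findall(query):
        # Add AND only when neither neighbor is already an operator.
        if out and token not in OPERATORS and out[-1] not in OPERATORS:
            out.append("AND")
        out.append(token)
    return " ".join(out)


print(inject_ands("test terms NOT limits"))
```

For Greg's example, this rewrites the input into the same form as his manual workaround, and queries that already contain explicit operators pass through unchanged.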

 

Thoughts/preferences?

 

- Demian

 

From: Demian Katz [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:46 PM
To: Eoghan Ó Carragáin
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax (advanced search bug!)

 

This makes sense to me.

 

One other very important detail I just noticed: advanced search is currently broken in the edismax branch. I’ll have to take another look at the query builder and see what’s going on. Might have to wait until after next week’s conferences, but I’ll try to squeeze it in this week if I can!

 

- Demian

 

From: Eoghan Ó Carragáin [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:38 PM
To: Demian Katz
Cc: Tod Olson; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

Hi,
Definitely, edismax++

I think it is ok to remove the Query stanzas to keep things simple for new users. However, it'd be useful to:

  • update the wiki (https://vufind.org/wiki/searches_customizing_tuning_adding) to reflect edismax config & perhaps add a "Legacy VuFind/Lucene Behaviour" section which points out that versions prior to 2.2 took a different approach which is still supported if necessary & links to an earlier version of searchspecs.yaml (which has always been pretty self-documenting anyway) 
  • add a flag in the next upgrade script (if people have customised query field boosts, they may want to evaluate the impact of edismax)

Eoghan

 








 

On 9 October 2013 20:11, Demian Katz <[hidden email]> wrote:

Regarding the elimination of the Query stanzas, my motivation here was to make the YAML less confusing for new users; if somebody wants to roll back to legacy behavior, downloading the 2.1 YAML file from Git is not incredibly difficult. We certainly could leave them in for a while, or comment them out before removing them – I’m not totally opposed to either of those possibilities – but as I say, my main motivation here is to get rid of potentially confusing noise before we forget about it and leave it there for longer than we need to.

 

Anyone else have thoughts on this?

 

- Demian

 

From: Tod Olson [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:08 PM
To: Demian Katz
Cc: Tod Olson; anna headley; [hidden email]


Subject: Re: [VuFind-Tech] Extended Dismax

 

I'll just chime in that I'm in favor. 

 

My only concern is that the YAML Query stanzas are already eliminated. If someone goes production and then discovers something really off, it's more than flipping a switch to go back to the old behavior. It might be nice to have a period while that old config is still available in a pinch, but it's not a major concern.

 

The message is still "edismax++".

 

-Tod

 

 

On Oct 8, 2013, at 8:02 AM, Demian Katz <[hidden email]> wrote:

 

Thanks again, Anna – fixed here:

 

 

Please let me know if you spot any potential issues! Hopefully we can discuss this further at next week’s Summit and perhaps merge to master after the next dev call if no major problems are found.

 

- Demian

 

From: anna headley [mailto:[hidden email]gmail.com
Sent: Monday, October 07, 2013 4:25 PM
To: Demian Katz
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

It looks like lowercase "and" / "or" are treated as boolean operators by default [1]. I think we should turn this off by default in VuFind.

Changing things like this which make special words behave differently is very jarring to librarians, so I'm concerned that there may be other things in edismax that "make things easier" (i.e. less transparent) that I'm not seeing, and which actually treat the same input differently in confusing ways.

I will try to look for other potential problems; if anyone else has greater familiarity with the details it would be great to hear some reassurance or specifics.

 

Anna


[1] http://wiki.apache.org/solr/ExtendedDisMax
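
For illustration only, here is a minimal Python sketch of the request parameters that would turn off lowercase operators, assuming the `lowercaseOperators` parameter documented for eDisMax. The helper name `build_edismax_params` is hypothetical and is not part of VuFind, which would set this in its own configuration.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, lowercase_operators=False):
    """Return Solr request parameters for an eDisMax search.

    Hypothetical helper for illustration; not VuFind code.
    """
    return {
        "defType": "edismax",
        "q": user_query,
        # Solr expects the literal strings "true" / "false" here.
        "lowercaseOperators": "true" if lowercase_operators else "false",
    }


params = build_edismax_params("war and peace")
query_string = urlencode(params)
print(query_string)
```

With the flag off, the lowercase "and" in "war and peace" should be treated as an ordinary search term rather than as an operator.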


 

On Mon, Oct 7, 2013 at 2:55 PM, Demian Katz <[hidden email]> wrote:

I’ve done a bit of testing, and for the most part, you get similar (not always identical) result sets, but in some cases, the relevance ranking is different.  I suspect that all significant differences are for the better – using real Dismax for advanced queries instead of the crazy Lucene-syntax hack we had before is likely eliminating some weird outliers that shouldn’t have been there in the first place.

 

- Demian

 

From: anna headley [mailto:[hidden email]]
Sent: Monday, October 07, 2013 1:59 PM
To: Demian Katz
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

How will this change affect search results?

Thanks,
Anna

 

 

On Fri, Oct 4, 2013 at 10:45 AM, Demian Katz <[hidden email]> wrote:

I've done a very simple implementation of eDismax here:

https://github.com/vufind-org/vufind/pull/42

This switches VuFind to use eDismax by default and eliminates lots of YAML configuration that is no longer necessary. It still allows the ability to switch back to the old behavior on demand through a new YAML setting, though the only use case I can imagine for this would be if you needed VuFind to talk to an older Solr index that doesn't support eDismax; still, no harm in maintaining backward compatibility for the moment at least!

Please let me know what you think -- if there are no objections, this can be merged into master soon.

thanks,
Demian


Re: Extended Dismax (advanced search bug -- FIXED!)

Greg Pendlebury-3
For what it is worth, in our case, my proposal to the business area is that we still switch to eDismax, and:

 1) In the case of searches coming from the advanced search screen we can ensure that they are tokenized correctly with the additional AND operators. It isn't too much extra work in that controlled environment.

 2) For basic searches we primitively search for the word 'NOT' in queries and display contextual help about constructing boolean queries with the new search handler.
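
The two points above could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code used on any real system: the function names `build_advanced_query` and `needs_boolean_help` are hypothetical, and the advanced search form is assumed to hand over its terms as separate lists.

```python
import re


def build_advanced_query(required, excluded=()):
    """Join structured advanced-search terms with explicit operators.

    Because the form supplies terms separately, the AND operators can be
    written out explicitly instead of relying on the default operator.
    """
    query = " AND ".join(required)
    for term in excluded:
        query = f"{query} NOT {term}".strip()
    return query


def needs_boolean_help(query):
    """Crude check: does a basic search contain the uppercase word NOT?"""
    return re.search(r"\bNOT\b", query) is not None


print(build_advanced_query(["term1", "term2"], ["term3"]))
print(needs_boolean_help("term1 term2 NOT term3"))
```

The check is deliberately primitive, as described: it only decides whether to display contextual help and does not rewrite the user's query.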

Nothing has been decided though, and I'm expecting some arguing.

Ta,
Greg


On 11 October 2013 05:52, Demian Katz <[hidden email]> wrote:

Update: I’ve fixed the advanced search bug, so I believe that the edismax branch is now fully functional. I’ve rebased onto the latest master and pushed a new version to update the pull request. (If this causes problems for anyone who was experimenting with this, just delete your local edismax branch and check it out again).

 

I’ve also confirmed that Greg’s report about bad behavior with NOT and the - operator is a real problem (not that I doubted him, of course). I’m not sure what to do about this; seems we have a few options:

 

1.)    Keep regular dismax the default until this is fixed, but add edismax as an experimental configurable option

2.)    Make edismax the new default but leave “classic” configuration so people can roll back if needed

3.)    Try to hack around it somehow – e.g. tokenize the query and inject ANDs. Obviously not a pretty option, and easily capable of causing as many problems as it solves.
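
To make option 3 concrete, a rough Python sketch of "tokenize the query and inject ANDs" might look like this. It is only an illustration of the idea, not the VuFind query builder; the names `TOKEN`, `OPERATORS` and `inject_ands` are made up for the example.

```python
import re

OPERATORS = {"AND", "OR", "NOT"}
# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def inject_ands(query):
    """Insert an explicit AND between adjacent terms that have no operator."""
    out = []
    for tok in TOKEN.findall(query):
        # Add AND only when neither neighbour is already an operator.
        if out and tok not in OPERATORS and out[-1] not in OPERATORS:
            out.append("AND")
        out.append(tok)
    return " ".join(out)


print(inject_ands("test terms NOT limits"))
```

Even this small version shows why the approach is fragile: a field-qualified phrase such as title:"foo bar" would be split in the wrong place, so a real implementation would need a much more careful tokenizer.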

 

Thoughts/preferences?

 

- Demian

 

From: Demian Katz [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:46 PM
To: Eoghan Ó Carragáin
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax (advanced search bug!)

 

This makes sense to me.

 

One other very important detail I just noticed: advanced search is currently broken in the edismax branch. I’ll have to take another look at the query builder and see what’s going on. Might have to wait until after next week’s conferences, but I’ll try to squeeze it in this week if I can!

 

- Demian

 

From: Eoghan Ó Carragáin [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 3:38 PM
To: Demian Katz
Cc: Tod Olson; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

Hi,
Definitely, edismax++

I think it is ok to remove the Query stanzas to keep things simple for new users. However, it'd be useful to:

  • update the wiki (https://vufind.org/wiki/searches_customizing_tuning_adding) to reflect edismax config & perhaps add a "Legacy VuFind/Lucene Behaviour" section which points out that versions prior to 2.2 took a different approach which is still supported if necessary & links to an earlier version of searchspecs.yaml (which has always been pretty self-documenting anyway) 
  • add a flag in the next upgrade script (if people have customised query field boosts, they may want to evaluate the impact of edismax)

Eoghan

Re: Extended Dismax

Greg Pendlebury-3
In reply to this post by Demian Katz
>> is there a ticket for that?

You want this one: https://issues.apache.org/jira/browse/SOLR-2649

My (perhaps imprecise) description would be that for eDisMax, all 'q.op' and 'defaultOperator' settings eventually just become an 'mm' value (eg. q.op=AND becomes mm=100%), and this value is then ignored when OR or NOT operators are observed in the query string.
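
As a toy model of that description (and only of the description, not of Solr's actual parser code), the behaviour could be sketched in Python like this. The function name `effective_mm` is hypothetical, and treating "ignored" as an effective mm of 0% is an assumption based on the OR'd results reported in this thread.

```python
import re

# Uppercase OR / NOT appearing as stand-alone words in the query string.
EXPLICIT_OPERATOR = re.compile(r"(?<!\S)(?:OR|NOT)(?!\S)")


def effective_mm(query, q_op="AND"):
    """Toy model: q.op becomes an mm value, dropped once OR/NOT is seen."""
    base_mm = "100%" if q_op.upper() == "AND" else "0%"
    if EXPLICIT_OPERATOR.search(query):
        # Assumption: ignoring mm leaves every remaining clause optional.
        return "0%"
    return base_mm


print(effective_mm("term1 term2"))
print(effective_mm("term1 term2 NOT term3"))
```

Under this model, adding NOT to an otherwise AND'd query silently makes the other terms optional, which matches the symptom tracked in SOLR-2649.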

Ta,
Greg


On 11 October 2013 00:00, Demian Katz <[hidden email]> wrote:

Thanks, Greg – good to know that there are still some rough edges; this makes me a little more inclined to be conservative about the transition…  but we’ll discuss in more detail on the next call.

 

Regarding the OR/NOT problem, is there a ticket for that? I didn’t notice it among the links you shared. In any case, I’ll take a closer look at that issue the next time I work on the edismax branch.

 

- Demian

 

From: Greg Pendlebury [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 10:27 PM
To: Eoghan Ó Carragáin
Cc: Demian Katz; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

Depending on how searches are constructed you may want to take a look at these tickets: https://issues.apache.org/jira/browse/SOLR-2368, particularly this one: https://issues.apache.org/jira/browse/SOLR-2649

We are trying to switch to eDisMax on one of our new systems and have run into the problem with the NOT operator. Users typing "term1 term2 NOT term3" get unexpected results because term1 and term2 get OR'd together instead of AND'd as soon as the NOT operator is present.

From our advanced search screen we can specifically construct a query to work around this, but when users manually type the extra NOT term on the end of an existing search it screws up.

Ta,
Greg

Re: Extended Dismax

Demian Katz
Makes sense -- I've voted on this one -- if others could too, I'm sure it wouldn't hurt! (I also see that Naomi Dushay of Blacklight fame is already advocating for this one as well...  so we're not alone in needing it!)

- Demian

From: Greg Pendlebury [[hidden email]]
Sent: Thursday, October 10, 2013 4:51 PM
To: Demian Katz
Cc: Eoghan Ó Carragáin; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

>> is there a ticket for that?

You want this one: https://issues.apache.org/jira/browse/SOLR-2649

My (perhaps imprecise) description would be that for eDisMax, all 'q.op' and 'defaultOperator' settings eventually just become an 'mm' value (eg. q.op=AND becomes mm=100%), and this value is then ignored when OR or NOT operators are observed in the query string.

Ta,
Greg


On 11 October 2013 00:00, Demian Katz <[hidden email]> wrote:

Thanks, Greg – good to know that there are still some rough edges; this makes me a little more inclined to be conservative about the transition…  but we’ll discuss in more detail on the next call.

 

Regarding the OR/NOT problem, is there a ticket for that? I didn’t notice it among the links you shared. In any case, I’ll take a closer look at that issue the next time I work on the edismax branch.

 

- Demian

 

From: Greg Pendlebury [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 10:27 PM
To: Eoghan Ó Carragáin
Cc: Demian Katz; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

Depending on how searches are constructed you may want to take a look at these tickets: https://issues.apache.org/jira/browse/SOLR-2368, particularly this one: https://issues.apache.org/jira/browse/SOLR-2649

We are trying to switch to eDisMax on one of our new systems and have run into the problem with the NOT operator. Users typing "term1 term2 NOT term3" get unexpected results because term1 and term2 get OR'd together instead of AND'd as soon as the NOT operator is present.

For example, it turns this:
http://vufind.org/demo/Search/Results?lookfor=test+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

into this:
http://vufind.org/demo/Search/Results?lookfor=test+OR+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

and we're forced to manually inject the AND to return it to normal:
http://vufind.org/demo/Search/Results?lookfor=test+AND+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

 

From our advanced search screen we can specifically construct a query to work around this, but when users manually type the extra NOT term on the end of an existing search it screws up.

Ta,
Greg

Reply | Threaded
Open this post in threaded view
|

Re: Extended Dismax

Tod Olson
Yes, I hate to backpedal, but that bug report gives me pause about edismax in its current state. Both the absurd hit counts and the diminished control over ranking seem problematic, given the pickiness shown in our recent ranking testing. I think we'd need to run some ranking tests locally with edismax before we'd try it while that bug is open.
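
One rough way to frame such a local ranking test is to run the same search against the old and new configurations and measure how much of the first page is shared. This is only a sketch: the function name overlap_at_k and the record ids in the usage line are hypothetical, and it says nothing about how the ids are fetched from Solr or VuFind.

```python
def overlap_at_k(old_ids, new_ids, k=10):
    """Fraction of the top-k record ids shared by two rankings.

    old_ids / new_ids are ranked lists of record ids, e.g. collected
    from the legacy handler and from edismax for the same user query.
    """
    old_top, new_top = set(old_ids[:k]), set(new_ids[:k])
    if not old_top and not new_top:
        return 1.0  # two empty result lists are treated as identical
    return len(old_top & new_top) / max(len(old_top), len(new_top))


# Hypothetical ids, purely for illustration
score = overlap_at_k(["a", "b", "c", "d"], ["b", "a", "e", "f"], k=4)
```

A low score only flags queries worth inspecting by hand; it does not say which ranking is better.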

-Tod

On Oct 11, 2013, at 6:40 AM, Demian Katz <[hidden email]> wrote:

Makes sense -- I've voted on this one -- if others could too, I'm sure it wouldn't hurt! (I also see that Naomi Dushay of Blacklight fame is already advocating for this one as well...  so we're not alone in needing it!)

- Demian

From: Greg Pendlebury [[hidden email]]
Sent: Thursday, October 10, 2013 4:51 PM
To: Demian Katz
Cc: Eoghan Ó Carragáin; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

>> is there a ticket for that?

You want this one: https://issues.apache.org/jira/browse/SOLR-2649

My (perhaps imprecise) description would be that for eDisMax, all 'q.op' and 'defaultOperator' settings eventually just become an 'mm' value (eg. q.op=AND becomes mm=100%), and this value is then ignored when OR or NOT operators are observed in the query string.
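
The description above can be modelled with a few lines of Python. This is a toy model of the reported behaviour, not Solr's actual code: the function effective_mm is hypothetical, "ignored" is assumed to mean the same as mm=0%, and q.op=OR is assumed to map to mm=0%.

```python
import re

# Uppercase operators that, per the description above, cause mm to be dropped
_EXPLICIT_OPERATORS = re.compile(r"\b(?:OR|NOT)\b")


def effective_mm(query, q_op="AND"):
    """Return the mm value that would effectively apply to the query."""
    # Assumption: q.op=AND is translated to mm=100%, q.op=OR to mm=0%
    configured = "100%" if q_op.upper() == "AND" else "0%"
    if _EXPLICIT_OPERATORS.search(query):
        # Assumption: an ignored mm behaves like mm=0% (all terms optional)
        return "0%"
    return configured


plain = effective_mm("test terms")                # configured value applies
with_not = effective_mm("test terms NOT limits")  # mm is dropped
```

Under this model, adding NOT to an existing search silently turns the remaining terms into optional clauses, which matches the OR'd results reported in this thread.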

Ta,
Greg


On 11 October 2013 00:00, Demian Katz <[hidden email]> wrote:
Thanks, Greg – good to know that there are still some rough edges; this makes me a little more inclined to be conservative about the transition…  but we’ll discuss in more detail on the next call.
 
Regarding the OR/NOT problem, is there a ticket for that? I didn’t notice it among the links you shared. In any case, I’ll take a closer look at that issue the next time I work on the edismax branch.
 
- Demian
 
From: Greg Pendlebury [mailto:[hidden email]] 
Sent: Wednesday, October 09, 2013 10:27 PM
To: Eoghan Ó Carragáin
Cc: Demian Katz; [hidden email]

Subject: Re: [VuFind-Tech] Extended Dismax

 

Depending on how searches are constructed, you may want to take a look at these tickets: https://issues.apache.org/jira/browse/SOLR-2368, particularly this one: https://issues.apache.org/jira/browse/SOLR-2649

We are trying to switch to eDisMax on one of our new systems and have run into the problem with the NOT operator. Users typing "term1 term2 NOT term3" get unexpected results because term1 and term2 get OR'd together instead of AND'd as soon as the NOT operator is present.

For example, it turns this:
http://vufind.org/demo/Search/Results?lookfor=test+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

into this:
http://vufind.org/demo/Search/Results?lookfor=test+OR+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

and we're forced to manually inject the AND to return it to normal:
http://vufind.org/demo/Search/Results?lookfor=test+AND+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

From our advanced search screen we can specifically construct a query to work around this, but when users manually type the extra NOT term on the end of an existing search it screws up.
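
A minimal sketch of the "inject the AND" workaround, applied to whatever the user typed before it is sent to Solr. The function inject_and is hypothetical and deliberately naive: it splits on whitespace only, so quoted phrases, parentheses, and +/- prefixes are not handled and would need a real query tokenizer.

```python
OPERATORS = {"AND", "OR", "NOT"}


def inject_and(query):
    """Insert an explicit AND between adjacent bare terms.

    Tokens that already have an operator between them are left alone.
    """
    out = []
    for token in query.split():
        previous = out[-1] if out else None
        if previous is not None and previous not in OPERATORS and token not in OPERATORS:
            out.append("AND")
        out.append(token)
    return " ".join(out)


rewritten = inject_and("test terms NOT limits")
```

For the example search above this yields "test AND terms NOT limits", the same string as in the third URL.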

Ta,
Greg

 

On 10 October 2013 06:38, Eoghan Ó Carragáin <[hidden email]> wrote:

Hi,
Definitely, edismax++

I think it is ok to remove the Query stanzas to keep things simple for new users. However, it'd be useful to:
  • update the wiki (https://vufind.org/wiki/searches_customizing_tuning_adding) to reflect edismax config & perhaps add a "Legacy VuFind/Lucene Behaviour" section which points out that versions prior to 2.2 took a different approach which is still supported if necessary & links to an earlier version of searchspecs.yaml (which has always been pretty self-documenting anyway) 
  • add a flag in the next upgrade script (if people have customised query field boosts, they may want to evaluate the impact of edismax)
Eoghan

 







 

On 9 October 2013 20:11, Demian Katz <[hidden email]> wrote:
Regarding the elimination of the Query stanzas, my motivation here was to make the YAML less confusing for new users; if somebody wants to roll back to legacy behavior, downloading the 2.1 YAML file from Git is not incredibly difficult. We certainly could leave them in for a while, or comment them out before removing them – I’m not totally opposed to either of those possibilities – but as I say, my main motivation here is to get rid of potentially confusing noise before we forget about it and leave it there for longer than we need to.
 
Anyone else have thoughts on this?
 
- Demian
 
From: Tod Olson [mailto:[hidden email]] 
Sent: Wednesday, October 09, 2013 3:08 PM
To: Demian Katz
Cc: Tod Olson; anna headley; [hidden email]

Subject: Re: [VuFind-Tech] Extended Dismax
 
I'll just chime in that I'm in favor. 
 
My only concern is that the YAML Query stanzas are already eliminated. If someone goes into production and then discovers something really off, it's more than flipping a switch to go back to the old behavior. It might be nice to have a period while that old config is still available in a pinch, but it's not a major concern.
 
The message is still "edismax++".
 
-Tod
 
 
On Oct 8, 2013, at 8:02 AM, Demian Katz <[hidden email]> wrote:

 

Thanks again, Anna – fixed here:
 
 
Please let me know if you spot any potential issues! Hopefully we can discuss this further at next week’s Summit and perhaps merge to master after the next dev call if no major problems are found.
 
- Demian
 
From: anna headley [mailto:[hidden email]]
Sent: Monday, October 07, 2013 4:25 PM
To: Demian Katz
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax
 

It looks like lowercase "and" and "or" are treated as boolean operators by default [1]. I think we should make this false by default in VuFind.
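
For reference, the eDismax setting in question is the lowercaseOperators request parameter described on the wiki page cited as [1]. Below is a sketch of the raw Solr request parameters with it switched off. The helper edismax_query_string is hypothetical, and how VuFind would actually pass the setting (for example via its YAML search specs) is not shown here.

```python
from urllib.parse import urlencode


def edismax_query_string(user_query):
    """Build a Solr query string that treats lowercase and/or as plain words."""
    params = {
        "defType": "edismax",           # select the Extended DisMax parser
        "q": user_query,
        "lowercaseOperators": "false",  # only uppercase AND/OR act as operators
    }
    return urlencode(params)


query_string = edismax_query_string("cats and dogs")
```

With the parameter set to false, a search such as "cats and dogs" should be parsed as three ordinary terms rather than as a boolean expression.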

Changing things like this which make special words behave differently is very jarring to librarians, so I'm concerned that there may be other things in edismax that "make things easier" (i.e. less transparent) that I'm not seeing, and which actually treat the same input differently in confusing ways.

I will try to look for other potential problems; if anyone else has greater familiarity with the details it would be great to hear some reassurance or specifics.
 
Anna


[1] http://wiki.apache.org/solr/ExtendedDisMax

 

On Mon, Oct 7, 2013 at 2:55 PM, Demian Katz <[hidden email]> wrote:
I’ve done a bit of testing, and for the most part, you get similar (not always identical) result sets, but in some cases, the relevance ranking is different.  I suspect that all significant differences are for the better – using real Dismax for advanced queries instead of the crazy Lucene-syntax hack we had before is likely eliminating some weird outliers that shouldn’t have been there in the first place.
 
- Demian
 


Reply | Threaded
Open this post in threaded view
|

Re: Extended Dismax

Greg Pendlebury-3
It is quite sad that we have now passed the two year anniversary on this email... but SOLR-2649 looks like it might actually make it into the next release of Solr. Erick Erickson said he was planning on committing it today.

Ta,
Greg


On 18 October 2013 at 14:04, Tod Olson <[hidden email]> wrote:
Yes, I hate to backpedal, but that bug report gives me pause about edismax in its current state. Both the absurd hit counts and the diminished control over ranking seem problematic, given the pickiness shown in our recent ranking testing. I think we'd need to run some ranking tests locally with edismax before we'd try it while that bug is open.

-Tod

Reply | Threaded
Open this post in threaded view
|

Re: Extended Dismax

Demian Katz

That would be great news! Any idea what release it would end up in, assuming the merge occurs?

 

Speaking of Solr tickets that never get resolved, I’d also love to see an end to this one: https://issues.apache.org/jira/browse/SOLR-2798 -- I’ve been trying to see if some Solr developer could give me some quick tips about how I might fix this myself (since I’m totally new to the code base, I figure a rough pointer in the approximate direction would save me a lot of time) but so far my requests have not yielded any responses. Any idea if there’s a better place than the ticket itself and the solr-user list to try to stir up this sort of assistance?

 

- Demian

 

From: Greg Pendlebury [mailto:[hidden email]]
Sent: Sunday, December 13, 2015 8:50 PM
To: Tod Olson
Cc: Demian Katz; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

It is quite sad that we have now passed the two year anniversary on this email... but SOLR-2649 looks like it might actually make it into the next release of Solr. Erick Erickson said he was planning on committing it today.

Ta,

Greg

 

On 18 October 2013 at 14:04, Tod Olson <[hidden email]> wrote:

Yes, I hate to backpedal, but that bug report gives me pause about edismax in its current state. Both the absurd hit counts and the diminished control over ranking see problematic, given the pickiness shown in our recent ranking testing. I think we'd need to run some ranking tests locally with edismax before we'd try it while that bug is open.

 

-Tod

 

On Oct 11, 2013, at 6:40 AM, Demian Katz <[hidden email]> wrote:



Makes sense -- I've voted on this one -- if others could too, I'm sure it wouldn't hurt! (I also see that Naomi Dushay of Blacklight fame is already advocating for this one as well...  so we're not alone in needing it!)

- Demian


From: Greg Pendlebury [[hidden email]]
Sent: Thursday, October 10, 2013 4:51 PM
To: Demian Katz
Cc: Eoghan Ó Carragáin;
[hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

>> is there a ticket for that?

You want this one: https://issues.apache.org/jira/browse/SOLR-2649

My (perhaps imprecise) description would be that for eDisMax, all 'q.op' and 'defaultOperator' settings eventually just become an 'mm' value (eg. q.op=AND becomes mm=100%), and this value is then ignored when OR or NOT operators are observed in the query string.

Ta,
Greg

 

On 11 October 2013 00:00, Demian Katz <[hidden email]> wrote:

Thanks, Greg – good to know that there are still some rough edges; this makes me a little more inclined to be conservative about the transition…  but we’ll discuss in more detail on the next call.

 

Regarding the OR/NOT problem, is there a ticket for that? I didn’t notice it among the links you shared. In any case, I’ll take a closer look at that issue the next time I work on the edismax branch.

 

- Demian

 

From: Greg Pendlebury [mailto:[hidden email]
Sent: Wednesday, October 09, 2013 10:27 PM
To: Eoghan Ó Carragáin
Cc: Demian Katz; 
[hidden email]


Subject: Re: [VuFind-Tech] Extended Dismax

 

Depending on how searches are constructed you may want to take a look at these tickets:https://issues.apache.org/jira/browse/SOLR-2368, particularly this one:https://issues.apache.org/jira/browse/SOLR-2649

We are trying to switch to eDisMax on one of our new systems and run into the problem with the NOT operator. Users typing "term1 term2 NOT term3" get unexpected results because term1 and term2 get OR'd together instead of AND'd as soon as the NOT operator is present.

For example, it turns this:
<a href="http://vufind.org/demo/Search/Results?lookfor=test&#43;terms&#43;NOT&#43;limits&amp;type=AllFields&amp;submit=Find&amp;limit=20&amp;sort=relevance" target="_blank">http://vufind.org/demo/Search/Results?lookfor=test+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

into this:
<a href="http://vufind.org/demo/Search/Results?lookfor=test&#43;OR&#43;terms&#43;NOT&#43;limits&amp;type=AllFields&amp;submit=Find&amp;limit=20&amp;sort=relevance" target="_blank">http://vufind.org/demo/Search/Results?lookfor=test+OR+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

and we're forced to manually inject the AND to return it to normal:
<a href="http://vufind.org/demo/Search/Results?lookfor=test&#43;AND&#43;terms&#43;NOT&#43;limits&amp;type=AllFields&amp;submit=Find&amp;limit=20&amp;sort=relevance" target="_blank">http://vufind.org/demo/Search/Results?lookfor=test+AND+terms+NOT+limits&type=AllFields&submit=Find&limit=20&sort=relevance

 

From our advanced search screen we can specifically construct a query to workaround this, but when users manually type the extra NOT term on the end of an existing search it screws up.

Ta,
Greg

 

On 10 October 2013 06:38, Eoghan Ó Carragáin <[hidden email]> wrote:

Hi,
Definitely, edismax++

I think it is ok to remove the Query stanzas to keep things simple for new users. However, it'd be useful to:

  • update the wiki (https://vufind.org/wiki/searches_customizing_tuning_adding) to reflect edismax config & perhaps add a "Legacy VuFind/Lucene Behaviour" section which points out that versions prior to 2.2 took a different approach which is still supported if necessary & links to an earlier version of searchspecs.yaml (which has always been pretty self-documenting anyway) 
  • add a flag in the next upgrade script (if people have customised query field boosts, they may want to evaluate the impact of edismax)

Eoghan

 






 

On 9 October 2013 20:11, Demian Katz <[hidden email]> wrote:

Regarding the elimination of the Query stanzas, my motivation here was to make the YAML less confusing for new users; if somebody wants to roll back to legacy behavior, downloading the 2.1 YAML file from Git is not incredibly difficult. We certainly could leave them in for a while, or comment them out before removing them – I’m not totally opposed to either of those possibilities – but as I say, my main motivation here is to get rid of potentially confusing noise before we forget about it and leave it there for longer than we need to.

 

Anyone else have thoughts on this?

 

- Demian

 

From: Tod Olson [mailto:[hidden email]
Sent: Wednesday, October 09, 2013 3:08 PM
To: Demian Katz
Cc: Tod Olson; anna headley; 
[hidden email]


Subject: Re: [VuFind-Tech] Extended Dismax

 

I'll just chime in that I'm in favor. 

 

My only concern is that the YAML Query stanzas are already eliminated. If someone goes production and then discovers something really off, it's more than flipping a switch to go back to the old behavior. It might be nice to have a period while that old config is still available in a pinch, but it's not a major concern.

 

The message is still "edismax++".

 

-Tod

 

 

On Oct 8, 2013, at 8:02 AM, Demian Katz <[hidden email]> wrote:

 

Thanks again, Anna – fixed here:

 

 

Please let me know if you spot any potential issues! Hopefully we can discuss this further at next week’s Summit and perhaps merge to master after the next dev call if no major problems are found.

 

- Demian

 

From: anna headley [mailto:[hidden email]gmail.com
Sent: Monday, October 07, 2013 4:25 PM
To: Demian Katz
Cc: 
[hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

It looks like lowercase and / or are treated as boolean operators by default [1]. I think we should make this false by default in VuFind.

Changing things like this which make special words behave differently is very jarring to librarians, so I'm concerned that there may be other things in edismax that "make things easier" (i.e. less transparent) that I'm not seeing, and which actually treat the same input differently in confusing ways.

I will try to look for other potential problems; if anyone else has greater familiarity with the details it would be great to hear some reassurance or specifics.

 

Anna


[1] http://wiki.apache.org/solr/ExtendedDisMax
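
As a rough illustration of the setting discussed above: eDismax exposes a `lowercaseOperators` request parameter, and sending it as `false` should make lowercase "and" / "or" be treated as ordinary search terms. The sketch below only builds the request parameters in Python; the field names in `qf` and the helper name `edismax_params` are hypothetical and are not taken from VuFind's configuration.

```python
from urllib.parse import urlencode


def edismax_params(user_query: str) -> dict:
    """Build Solr request parameters for an eDismax search (illustrative only)."""
    return {
        "defType": "edismax",
        "q": user_query,
        # Hypothetical field names and boosts, for illustration only.
        "qf": "title^2 allfields",
        # Treat lowercase "and" / "or" as plain words rather than operators.
        "lowercaseOperators": "false",
    }


# Encode the parameters as they would appear in a Solr select URL.
query_string = urlencode(edismax_params("war and peace"))
print(query_string)
```

Uppercase AND / OR / NOT would still be parsed as operators with this setting.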

 

On Mon, Oct 7, 2013 at 2:55 PM, Demian Katz <[hidden email]> wrote:

I’ve done a bit of testing, and for the most part, you get similar (not always identical) result sets, but in some cases, the relevance ranking is different.  I suspect that all significant differences are for the better – using real Dismax for advanced queries instead of the crazy Lucene-syntax hack we had before is likely eliminating some weird outliers that shouldn’t have been there in the first place.

 

- Demian

 

From: anna headley [mailto:[hidden email]]
Sent: Monday, October 07, 2013 1:59 PM
To: Demian Katz
Cc: [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

How will this change affect search results?

Thanks,
Anna

 

 

On Fri, Oct 4, 2013 at 10:45 AM, Demian Katz <[hidden email]> wrote:

I've done a very simple implementation of eDismax here:

https://github.com/vufind-org/vufind/pull/42

This switches VuFind to use eDismax by default and eliminates lots of YAML configuration that is no longer necessary. It still allows the ability to switch back to the old behavior on demand through a new YAML setting, though the only use case I can imagine for this would be if you needed VuFind to talk to an older Solr index that doesn't support eDismax; still, no harm in maintaining backward compatibility for the moment at least!

Please let me know what you think -- if there are no objections, this can be merged into master soon.

thanks,
Demian



Re: Extended Dismax

Greg Pendlebury-3
Not sure on the version. Erick was talking about maybe 5.5 but there are backwards compatibility considerations I haven't seen the decision on yet.

With regard to 2798, I'm not sure, sorry. Somewhere in the DisMaxQParser class seems likely, but I would have to run it a few times to test and I don't have the resources for that now in the Xmas rush. The Solr codebase is an unwieldy beast :(

If you want to run the code yourself and tinker with things I would recommend dropping trunk into an IDE that can run the tests for you. TestDisjunctionMaxQuery seems to test the processing of DisMax, but does not have anything related to parsing boosts. If you can manually inject multiple boosts from there you would know whether it is just a case of the parser not handling multiples (which I would assume makes it a much more manageable problem).

I got Erick's attention on 2649 by posting to the solr-dev list. I'm not sure whether I was just lucky though.

Ta,
Greg


On 15 December 2015 at 01:33, Demian Katz <[hidden email]> wrote:

That would be great news! Any idea what release it would end up in, assuming the merge occurs?

 

Speaking of Solr tickets that never get resolved, I’d also love to see an end to this one: https://issues.apache.org/jira/browse/SOLR-2798 -- I’ve been trying to see if some Solr developer could give me some quick tips about how I might fix this myself (since I’m totally new to the code base, I figure a rough pointer in the approximate direction would save me a lot of time) but so far my requests have not yielded any responses. Any idea if there’s a better place than the ticket itself and the solr-user list to try to stir up this sort of assistance?

 

- Demian

 

From: Greg Pendlebury [mailto:[hidden email]]
Sent: Sunday, December 13, 2015 8:50 PM
To: Tod Olson
Cc: Demian Katz; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

It is quite sad that we have now passed the two year anniversary on this email... but SOLR-2649 looks like it might actually make it into the next release of Solr. Erick Erickson said he was planning on committing it today.

Ta,

Greg

 

On 18 October 2013 at 14:04, Tod Olson <[hidden email]> wrote:

Yes, I hate to backpedal, but that bug report gives me pause about edismax in its current state. Both the absurd hit counts and the diminished control over ranking seem problematic, given the pickiness shown in our recent ranking testing. I think we'd need to run some ranking tests locally with edismax before we'd try it while that bug is open.

 

-Tod

 

On Oct 11, 2013, at 6:40 AM, Demian Katz <[hidden email]> wrote:



Makes sense -- I've voted on this one -- if others could too, I'm sure it wouldn't hurt! (I also see that Naomi Dushay of Blacklight fame is already advocating for this one as well...  so we're not alone in needing it!)

- Demian


From: Greg Pendlebury [[hidden email]]
Sent: Thursday, October 10, 2013 4:51 PM
To: Demian Katz
Cc: Eoghan Ó Carragáin; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

>> is there a ticket for that?

You want this one: https://issues.apache.org/jira/browse/SOLR-2649

My (perhaps imprecise) description would be that for eDisMax, all 'q.op' and 'defaultOperator' settings eventually just become an 'mm' value (eg. q.op=AND becomes mm=100%), and this value is then ignored when OR or NOT operators are observed in the query string.
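
To make that description concrete, here is a toy model in Python. It is not Solr's code; it only restates the behaviour described above, under the assumption that q.op=AND maps to mm=100%, q.op=OR maps to mm=0%, and any explicit uppercase OR or NOT in the query string causes the mm value to be ignored. The function name `effective_mm` is made up for this sketch.

```python
import re


def effective_mm(q_op: str, query: str):
    """Toy model of the SOLR-2649 behaviour as described, not Solr's implementation."""
    # q.op is first translated into a minimum-should-match value.
    mm = "100%" if q_op.upper() == "AND" else "0%"
    # Per the description, an explicit OR or NOT operator makes mm be ignored.
    if re.search(r"\b(OR|NOT)\b", query):
        return None
    return mm


print(effective_mm("AND", "term1 term2"))
print(effective_mm("AND", "term1 term2 NOT term3"))
```

In this model the second call returns None, which corresponds to the remaining terms falling back to being optional.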

Ta,
Greg

 

On 11 October 2013 00:00, Demian Katz <[hidden email]> wrote:

Thanks, Greg – good to know that there are still some rough edges; this makes me a little more inclined to be conservative about the transition…  but we’ll discuss in more detail on the next call.

 

Regarding the OR/NOT problem, is there a ticket for that? I didn’t notice it among the links you shared. In any case, I’ll take a closer look at that issue the next time I work on the edismax branch.

 

- Demian

 

From: Greg Pendlebury [mailto:[hidden email]]
Sent: Wednesday, October 09, 2013 10:27 PM
To: Eoghan Ó Carragáin
Cc: Demian Katz; [hidden email]
Subject: Re: [VuFind-Tech] Extended Dismax

 

Depending on how searches are constructed you may want to take a look at these tickets: https://issues.apache.org/jira/browse/SOLR-2368, particularly this one: https://issues.apache.org/jira/browse/SOLR-2649

We are trying to switch to eDisMax on one of our new systems and have run into the problem with the NOT operator. Users typing "term1 term2 NOT term3" get unexpected results because term1 and term2 get OR'd together instead of AND'd as soon as the NOT operator is present.

 

From our advanced search screen we can specifically construct a query to work around this, but when users manually type the extra NOT term on the end of an existing search it screws up.
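
One possible shape for such a workaround, sketched in Python: rewrite the typed query so that adjacent plain terms are joined with an explicit AND before it is sent to Solr. This is only an assumption about how a workaround might look, not the approach used on the system described above; the helper name `add_explicit_and` is made up, and the sketch ignores quoted phrases, parentheses and field prefixes.

```python
OPERATORS = {"AND", "OR", "NOT"}


def add_explicit_and(query: str) -> str:
    """Insert AND between adjacent plain terms; existing operators are left alone."""
    out = []
    for token in query.split():
        prev = out[-1] if out else None
        # Only two plain terms in a row need an AND between them.
        if prev is not None and prev not in OPERATORS and token not in OPERATORS:
            out.append("AND")
        out.append(token)
    return " ".join(out)


print(add_explicit_and("term1 term2 NOT term3"))
```

For the example query this yields "term1 AND term2 NOT term3". Whether that gives the intended results on a given Solr version would still need to be checked against the index.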

Ta,
Greg

 

On 10 October 2013 06:38, Eoghan Ó Carragáin <[hidden email]> wrote:

Hi,
Definitely, edismax++

I think it is ok to remove the Query stanzas to keep things simple for new users. However, it'd be useful to:

  • update the wiki (https://vufind.org/wiki/searches_customizing_tuning_adding) to reflect edismax config & perhaps add a "Legacy VuFind/Lucene Behaviour" section which points out that versions prior to 2.2 took a different approach which is still supported if necessary & links to an earlier version of searchspecs.yaml (which has always been pretty self-documenting anyway) 
  • add a flag in the next upgrade script (if people have customised query field boosts, they may want to evaluate the impact of edismax)

Eoghan

 






 
