use of copyfields: title vs. titleStr


Naomi Dushay
(sorry I have so many questions - it's a side effect of coding ...)

I've been thinking about title vs. titleStr.

title -- tokenized, used for searching, displayed in search results

titleStr -- not tokenized, used for searching, used to sort by title

Am I missing the obvious?

Why are we searching the same data twice?  That is, why does the query  
formula include terms both for title and titleStr?   Both are used in  
default and in fielded title queries.

Why is title used for "getMoreLikeThis" but titleStr used for "did you  
mean" suggestion?

Why are we displaying title instead of titleStr? (e.g. in the Search  
results)

Of course, the same questions apply to our other copy fields
(e.g. author, topic ...)

Naomi Dushay
[hidden email]




-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Vufind-tech mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/vufind-tech

Re: use of copyfields: title vs. titleStr

Andrew Nagy-2
Naomi - We search both title and titleStr because the title field is stemmed.  Searching both fields gives us better relevancy ranking, since exact matches work better with titleStr.

So to answer your question: we use the non-stemmed "string" fields for exact matching and wildcards, and the "text" fields for stemming, character normalization, etc.
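
To make the two-field idea concrete, here is a rough Python sketch. It is only a toy model, not Solr's analysis chain or VuFind's actual query: the one-rule stemmer, the function names and the boost values are all assumptions made up for illustration. It shows how a stemmed "text" match plus a bonus for an exact "string" match can push the exact title to the top.

```python
import re

def naive_stem(token: str) -> str:
    """Toy stemmer (assumption): strips one trailing 's'. Not Solr's stemmer."""
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token

def analyze_text(value: str) -> list[str]:
    """Stand-in for a tokenized, stemmed "text" field such as title."""
    return [naive_stem(t) for t in re.findall(r"[a-z0-9]+", value.lower())]

def analyze_string(value: str) -> str:
    """Stand-in for an untokenized "string" field such as titleStr."""
    return value  # kept verbatim: no tokenizing, stemming or lowercasing

def score(query: str, title: str,
          text_boost: float = 1.0, exact_boost: float = 5.0) -> float:
    """Hypothetical scoring: stemmed term hits plus a bonus for an exact match."""
    doc_terms = set(analyze_text(title))
    hits = sum(1 for term in analyze_text(query) if term in doc_terms)
    total = text_boost * hits
    if analyze_string(query) == analyze_string(title):
        total += exact_boost
    return total

titles = ["The cat in the hat", "Cat", "Cats"]
# Highest score first; all three match the stem "cat", only "Cats" matches exactly.
ranked = sorted(titles, key=lambda t: score("Cats", t), reverse=True)
print(ranked)
```

With the stemmed field alone, all three titles would tie for the query "Cats"; the exact-match bonus is what separates them.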

Does this answer your question?

Andrew
________________________________________
From: [hidden email] [[hidden email]] On Behalf Of Naomi Dushay [[hidden email]]
Sent: Thursday, July 31, 2008 7:51 PM
To: [hidden email]
Subject: [VuFind-Tech] use of copyfields: title vs. titleStr


field config option to preserve subfield order

Naomi Dushay

On Aug 4, 2008, at 5:54 AM, Andrew Nagy wrote:

>> -----Original Message-----
>> From: Naomi Dushay [mailto:[hidden email]]
>> Sent: Friday, August 01, 2008 5:34 PM
>> To: Andrew Nagy
>>
>> (snip)

>> Thinking about:  solrmarc field option to preserve the order in which
>> subfields are encountered in the resulting field value.  This could
>> address some of the issues with the subject display, and possibly
>> improve the more complex titles as well.

From a quick look at the latest solrmarc code, this looks *nearly*
implemented.  I'm not sure how soon I'll have cycles to do this, as I
just regenerated our index and that seems to have broken some stuff
in the UI, but fixing the title and subject displays of subdivided
MARC data in VuFind will be a high priority for us.

This "ordered" value would be for display: an untokenized string
field, preserving punctuation and the like.  Potentially stored but
not indexed.

Then we'd have the indexed version, tokenized, stemmed, etc.

I think.
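
A small Python sketch of the idea, under loud assumptions: the subfield data, the " -- " separator and every function name here are invented for illustration, and none of this is solrmarc code or its configuration syntax. It contrasts a display value that keeps subfields in the order encountered with one that collects them code by code, and derives a separate tokenized value for indexing.

```python
import re

# Hypothetical subject field: (subfield code, value) pairs in record order.
subfields = [("a", "Art"), ("x", "History"), ("z", "France"), ("x", "Exhibitions.")]

def ordered_display(pairs, separator=" -- "):
    """Display value: subfields joined in the order encountered, punctuation kept."""
    return separator.join(value for _code, value in pairs)

def grouped_display(pairs, codes="axz", separator=" -- "):
    """Contrast case (assumption): values collected code by code, losing record order."""
    return separator.join(
        value for code in codes for c, value in pairs if c == code
    )

def index_tokens(value):
    """Indexed version: lowercased and tokenized, punctuation dropped (no stemming here)."""
    return re.findall(r"[a-z0-9]+", value.lower())

print(ordered_display(subfields))
print(grouped_display(subfields))
print(index_tokens(ordered_display(subfields)))
```

The ordered string is what a user would expect to see; the token list is what searching would run against.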

Naomi Dushay
[hidden email]



