Padre Cooler Options
Description
This page lists the options available for tuning ranking with the cool query processor option. For more information about how ranking works, see Funnelback_Ranking_Algorithms.
These options can be set either in the query processor options (collection.cfg) or as CGI parameters (e.g. ...&cool.2=12&cool.3=34...).
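As a rough illustration of the CGI form, the following Python sketch builds the `cool.N` part of a query string from a mapping of cooler option numbers to weights. The helper name `cool_cgi_params` is hypothetical and not part of Funnelback; the weights are the ones used in the example above.

```python
from urllib.parse import urlencode


def cool_cgi_params(weights):
    """Build the cool.N portion of a query string.

    `weights` maps cooler option numbers to weights.
    Hypothetical helper, for illustration only.
    """
    # Sort by option number so the output is deterministic.
    pairs = [(f"cool.{number}", weight) for number, weight in sorted(weights.items())]
    return urlencode(pairs)


print(cool_cgi_params({2: 12, 3: 34}))  # cool.2=12&cool.3=34
```

The resulting string would be appended to the other parameters of the search URL.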
List of cooler options
| Number | Description |
|---|---|
| 0 | content: content weight |
| 1 | onlink: onsite link weight |
| 2 | offlink: offsite link weight |
| 3 | urllen: URL length weight |
| 4 | qie: external evidence (qie) weight |
| 5 | date_proximity: proximity to current date weight |
| 6 | urltype: URL attractiveness (Homepages are favoured. Copyright pages and URLs with lots of punctuation are penalised.) |
| 7 | annie: annotation weight (annie) |
| 8 | domain_weight: weight associated with this domain |
| 9 | geoprox: geographical proximity to origin |
| 10 | nonbin: non-binariness (1 for html, xml, txt, 0 otherwise) |
| 11 | no_ads: freedom from ads |
| 12 | imp_phrase: implicit phrase match score |
| 13 | consistency: consistency of evidence. (Extra reward for docs with non-zero scores on both content and annie.) |
| 14 | log_annie: logarithm of annotation weight (log(annie)) |
| 15 | anlog_annie: absolute-normalised logarithm of annotation weight. |
| 16 | annie_rank: annotation rank = (k - rank) / k, where k = 2 x highest rank requested; if rank > k, rank is set to k |
| 17 | BM25F: field-weighted Okapi score |
| 18 | an_okapi: absolute-normalised Okapi score. |
| 19 | BM25F_rank: field-weighted Okapi rank. |
| 20 | mainhosts: bias in favour of principal servers (web search only). |
| 21 | comp_wt: component collection weighting. (meta collections only). |
| 22 | document_number: document number in the crawl. An early position in the crawl may correlate with importance |
| 23 | host_incoming_link_score |
| 24 | host_click_score |
| 25 | host_linking_hosts_score |
| 26 | host_linked_hosts_score |
| 27 | host_rank_in_crawl_order_score |
| 28 | host_domain_shallowness_score |
| 29 | doc_matches_regex: document matches administrator supplied regex |
| 30 | doc_does_not_match_regex: document does not match administrator supplied regex |
| 31 | titleWords: number of words in title |
| 32 | contentWords: number of indexed words in document |
| 33 | compressionFactor: compressibility of document text |
| 34 | entropy: entropy of document |
| 35 | stopwordFraction: fraction of stopwords in the document |
| 36 | stopwordCover: fraction of stopword list present in the document |
| 37 | averageTermLen: average term length |
| 38 | distinctWords: number of distinct words in the document |
| 39 | maxFreq: frequency of most frequently occurring term |
| 40 | titleWords_neg: Neg number of words in title |
| 41 | contentWords_neg: Neg number of indexed words in document |
| 42 | compressionFactor_neg: Neg compressibility of document text |
| 43 | entropy_neg: Neg entropy of document |
| 44 | stopwordFraction_neg: Neg fraction of stopwords in the document |
| 45 | stopwordCover_neg: Neg fraction of stopword list present in the document |
| 46 | averageTermLen_neg: Neg average term length |
| 47 | distinctWords_neg: Neg number of distinct words in the document |
| 48 | maxFreq_neg: Neg frequency of most frequently occurring term |
| 49 | titleWords_abs: Abs number of words in title |
| 50 | contentWords_abs: Abs number of indexed words in document |
| 51 | compressionFactor_abs: Abs compressibility of document text |
| 52 | entropy_abs: Abs entropy of document |
| 53 | stopwordFraction_abs: Abs fraction of stopwords in the document |
| 54 | stopwordCover_abs: Abs fraction of stopword list present in the document |
| 55 | averageTermLen_abs: Abs average term length |
| 56 | distinctWords_abs: Abs number of distinct words in the document |
| 57 | maxFreq_abs: Abs frequency of most frequently occurring term |
| 58 | titleWords_abs_neg: Neg abs number of words in title |
| 59 | contentWords_abs_neg: Neg abs number of indexed words in document |
| 60 | compressionFactor_abs_neg: Neg abs compressibility of document text |
| 61 | entropy_abs_neg: Neg abs entropy of document |
| 62 | stopwordFraction_abs_neg: Neg abs fraction of stopwords in the document |
| 63 | stopwordCover_abs_neg: Neg abs fraction of stopword list present in the document |
| 64 | averageTermLen_abs_neg: Neg abs average term length |
| 65 | distinctWords_abs_neg: Neg abs number of distinct words in the document |
| 66 | maxFreq_abs_neg: Neg abs frequency of most frequently occurring term |
| 67 | lexical_span_score |
| 68 | doc_matches_cgscope1: Documents which match gscope defined by -cgscope1 (if defined) |
| 69 | doc_matches_cgscope2: Documents which match gscope defined by -cgscope2 (if defined) |
| 70 | doc_does_not_match_cgscope1: Documents which do not match gscope defined by -cgscope1 (if defined) |
| 71 | doc_does_not_match_cgscope2: Documents which do not match gscope defined by -cgscope2 (if defined) |
| 72 | raw_annie: Untransformed annie score linearly scaled to 0..1 |
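The annie_rank entry (option 16) is the only one in the table that gives a formula, so a small worked example may help. The sketch below is one reading of that entry, not Padre's actual implementation: the function name `annie_rank_score` is hypothetical, and treating rank as a positive number that is clamped to k is an assumption.

```python
def annie_rank_score(rank, highest_rank_requested):
    """Compute (k - rank) / k as described for option 16.

    Assumes highest_rank_requested > 0. Hypothetical helper,
    based only on the formula given in the table.
    """
    k = 2 * highest_rank_requested
    # Ranks beyond k are clamped to k, which gives a score of 0.
    clamped = min(rank, k)
    return (k - clamped) / k


print(annie_rank_score(1, 10))   # 0.95
print(annie_rank_score(25, 10))  # 0.0
```

Under this reading, the score falls linearly from just under 1 for the top rank to 0 for any rank at or beyond twice the highest rank requested.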
Values
Values are unbounded, but typical weights range from 0 to 100.
Example
To set the query processor to ignore URL length, but give a high weight to phrase matches implied by the query:
query_processor_options=-cool.3=0 -cool.12=100
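Because the options are set by number, it can be easier to keep the names in a lookup table and generate the configuration line from it. The following Python sketch does this for a few of the options listed above; `COOLER_NUMBERS` and `query_processor_options` are hypothetical names introduced for illustration and are not part of Funnelback.

```python
# A subset of the option numbers from the table on this page.
COOLER_NUMBERS = {"content": 0, "urllen": 3, "imp_phrase": 12}


def query_processor_options(weights):
    """Build a collection.cfg line from a mapping of option name to weight.

    Hypothetical helper, for illustration only.
    """
    flags = [f"-cool.{COOLER_NUMBERS[name]}={weight}" for name, weight in weights.items()]
    return "query_processor_options=" + " ".join(flags)


print(query_processor_options({"urllen": 0, "imp_phrase": 100}))
# query_processor_options=-cool.3=0 -cool.12=100
```

This reproduces the example line above: URL length is ignored and implicit phrase matches receive a high weight.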