How does trec_eval calculate Mean Average Precision (MAP)?


Tag: search-engine,information-retrieval,data-retrieval

I'm using trec_eval to evaluate a search engine. I'd like to know how it calculates the Mean Average Precision (MAP). I'm sure it doesn't calculate a simple average of the average precisions (AP). It seems to be a weighted arithmetic mean, but I can't work out which weights are used.


MAP is indeed a simple arithmetic mean of the AP scores for individual topics. (But remember that AP for an individual topic is computed over all relevant documents for that topic, whether or not they were retrieved. There is a frequently used incorrect definition of AP that computes 'AP' over only the relevant documents that were retrieved, but that is a nonsensical measure, as it rewards retrieving fewer relevant documents.)
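As an illustration of the definition above, here is a minimal Python sketch. It is not trec_eval's own code: the function names, the data layout (a ranked list of document IDs per topic, a set of relevant IDs per topic) and the choice to average over every topic that has at least one relevant document are assumptions made for the example.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """AP for one topic, normalized by ALL relevant documents.

    Assumes `ranking` contains no duplicate document IDs.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where this relevant document appears.
            precision_sum += hits / rank
    # Divide by the total number of relevant documents, not by `hits`.
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Unweighted arithmetic mean of per-topic AP.

    Assumption for this sketch: topics with no relevant documents are
    skipped, and a topic missing from `runs` counts as an empty ranking.
    """
    topics = [t for t, rel in qrels.items() if rel]
    if not topics:
        return 0.0
    ap_scores = [average_precision(runs.get(t, []), qrels[t]) for t in topics]
    return sum(ap_scores) / len(ap_scores)


# Hypothetical toy data: two topics with made-up document IDs.
qrels = {"A": {"d1", "d3", "d9"}, "B": {"d5"}}
runs = {"A": ["d1", "d2", "d3"], "B": ["d4", "d5"]}

ap_a = average_precision(runs["A"], qrels["A"])  # (1/1 + 2/3) / 3
ap_b = average_precision(runs["B"], qrels["B"])  # (1/2) / 1
map_score = mean_average_precision(runs, qrels)  # (ap_a + ap_b) / 2
```

For topic A, dividing by the two relevant documents actually retrieved would give about 0.83 instead of about 0.56, and the never-retrieved d9 would cost nothing. That is the problem with the incorrect definition.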

The "Common Measures" section of the Appendix to each TREC proceedings gives the definitions of the most commonly used trec_eval measures. The TREC proceedings are in the Publications section of the TREC web site.

Ellen Voorhees, TREC project manager, NIST


Automatic Search Using WWW::Mechanize

I am trying to write a Perl script which will automatically key in search variables on this LexisNexis search page and retrieve the search results. I am using the WWW::Mechanize module but I am not sure how to figure out the field name of the search bar itself. This is...

How to get certain information out of arraylist grouped into other lists in Java

I wrote a program that reads multiple (similar) text files out of a folder. I'm splitting the information by space and storing everything in one ArrayList, which contains data like this: key1=hello key2=good key3=1234 ... key15=repetition key1=morning key2=night key3=5678 ... Now I'm looking for a way to get that information...

Keep non-stemmed tokens on Elasticsearch

I'm using a stemmer (for the Brazilian Portuguese Language) when I index documents on Elasticsearch. This is what my default analyzer looks like(nvm minor mistakes here because I've copied this by hand from my code in the server): "analysis":{ "filter":{ "my_asciifolding": { "type":"asciifolding", "preserve_original":true, }, "stop_pt":{ "type": "stop", "ignore_case": true,...

ERROR: index 'products': too many string attributes (current index format allows up to 4 GB)

Got this when tried to index table in database with 25GB of data. Sphinx contains index declaration with following fields: sql_field_string = field_indexer #some keywords sql_field_string = product_name sql_field_string = description sql_attr_float = price sql_field_string = product_url sql_field_string = image_url sql_field_string = sku sql_attr_uint = merchant_id sql_attr_uint = network_id All...

How to make the most match item to the top in ElasticSearch query result

I'm using ElasticSearch to build a e-commerce search engine like or There are some item like: iPhone 6 Case - iPhone 6 Wallet Case , iPhone 6 Leather Case ,Flip Wallet Leather Case Cover with Credit Card Holder For Apple iPhone 6 4.7'' Black iPhone 6 / 6...

PHP mySQL search not working

<?php $username = "root"; $password = ""; $hostname = "localhost"; $db_handle = mysql_connect($hostname, $username, $password) or die ("Could not connect to database"); $selected= mysql_select_db("login", $db_handle); $output=''; if(isset($_POST['search'])){ $searchq = $_POST['search']; $query= "SELECT * FROM PHP_Item WHERE Name LIKE '%searchq%' OR Description LIKE '%serachq%'" or die ("could not search"); $result= mysql_query($query);...

Questions about CACM collection

I'm using CACM document collection. I tried to search more information on this collection online but unfortunately I didn't find what I was looking for. If I've understood correctly, this collection contains documents from a paper journal. As far as this is concerned, I don't understand why every document always...

Natural Language Search (user intent search)

I'm trying to build a search engine that allows my users to search with natural language commands, just like Google Now. Except, my search engine is slightly more constrained, in that it is mainly going to be used within an e-commerce site, and allow the users to search for certain...

How to define a CAS in database as external resource for an annotator in uimaFIT?

I am trying to structure my a data processing pipeline using uimaFit as follows: [annotatorA] => [Consumer to dump annotatorA's annotations from CAS into DB] [annotatorB (should take on annotatorA's annotations from DB as input)]=>[Consumer for annotatorB] The driver code: /* Step 0: Create a reader */ CollectionReader readerInstance= CollectionReaderFactory.createCollectionReader(...

Replacing .html files with .php files while maintaining search engine rankings

I maintain a website that contains a dozen or so .html documents which I have just rewritten to include php code. As search engines currently index the .html documents, I would rather not break those links and I certainly don't want to do anything that will affect my search rankings....

Using Google Custom Search engine with a little privacy

I would like to use a Google Custom Search Engine on my website. With Google's default implementation, you have to put Javascript on each page that has the search box. For privacy reasons, I would like to load that Javascript only for those users who actually use the search engine....

Where can I find a corpus of search engine queries?

I'm interested in training a question-answering system on top of user-generated search queries but so far it looks like such data is not made available. Are there some research centers or industry labs that have compiled corpora of search-engine queries?

Umbraco Omitting Media files from search results

As my title suggests, I am wanting my search code to omit anything in my media folder. Right now, if I search for "test" it will bring back any page with the word test in it as well as any document or image which has test on it as well....

Search functionality outputting as 'array'

I followed a tutorial on how to make a search bar functional and I am not seeing what I'm doing wrong. I am trying to give users the option to search for products. The end result is everything is being out-putted as 'Array'. The correct amount of search results show...

How do I create search like excel in DataGridView?

I used this code to search in DataGridView to find and select a row (no filter)! But, when DataGridView has repetitive values in rows it won't get the next row! How do I go to the next row with every click to Btn_find (Find similar to Excel)? private void button1_Click(object...

calculating tf-idf for web pages

I am new to IR and I would like to calculate tf-idf for webpages. For the "tf" part, I want to calculate see frequency of each word in content of one webpage. For the "idf" part, I want to compare multiple webpages for the content. Is there a tool/API that...

Search box/field design with multiple search locations

Not sure if this question is better suited for a different StackExchange site but, here goes: I have a search page that searches a number of different type of things. All (at the moment) requiring a different input field for each type of search. For example, one might search for...

Elasticsearch two sets of terms against two fields

I'm trying to use Elasticsearch to return docs that have different terms in two fields. Not knowing how to write this it would be something like this: query: field1: "term set #1" field2: "very different term set #2" Ideally the term sets would be arrays of strings. I'd like all...

Multiple Aggregate Ratings of

I have multiple aggregate ratings snippets in one page. Is there a way to make one of them the default one? The one that will be displayed in the results of Search Engines? Thanks all! Update: That webpage is, essentially, the page of a Brand. It contains the aggregate ratings...

Fulltext Search engine, multiple columns, boolean mode

I am making a search engine for an android app that does fulltext search and match against multiple columns against '+word1 +word2' in boolean mode. However, I can't get any search result. E.g. search field type- "open sea" then, Sql will search Match...Against ('+open +sea' IN BOOLEAN MODE) and display...

Euclidean vs Cosine for text data

IF I use tf-idf feature representation (or just document length normalization), then is euclidean distance and (1 - cosine similarity) basically the same? All text books I have read and other forums, discussions say cosine similarity works better for text... I wrote some basic code to test this and found...

Search SharePoint Foundation 2013 Picture Library by terms defined in Keywords field

Since Term Store functionality (and probably most of metadata functionality) isn't available in SharePoint Foundation 2013, I couldn't find a way to search through the pictures using some sort of tagging. Thus I decided to employ something what is available already in Foundation version. When you edit the picture, you...

how google crawls dynamic pages? [closed]

I am about to create an Online Shopping site for my one of the client. I have to make this site SEO Friendly and therefore I must have to understand few things before I proceed to make a custom CMS Based website. As I said I am going to make...

Can anyone help me make the search bar work as I now have the JS prompt? [on hold]

I have created a small program that pulls from the YouTube API which allows you to search for a random video for whatever title you enter when prompted. My goal is to have this work like a search engine. I would like to make my search bar the input instead...

robots.txt allow all except few sub-directories

I want my site to be indexed in search engines except few sub-directories. Following are my robots.txt settings: robots.txt in the root directory User-agent: * Allow: / Separate robots.txt in the sub-directory (to be excluded) User-agent: * Disallow: / Is it the correct way or the root directory rule will...

Disallow specific folders in robots.txt with wildcards

Can i hide specific folders from crawlers with wildcards like: User-agent: * Disallow: /system/ Disallow: /v* I want to hide all folders starts with "v" character. It will work this way?...

Is it necessary to generate sitemaps for old indexed urls?

I have a web site with content from 2001 and I need to remake the sitemap. Question arises: if the old urls have already been indexed do I need to add them again (the same urls) to the sitemap even if not haven't changed? for example: the sitemap have this...