Set Up Solr in Five Minutes

Solr makes it easy to run a full-featured search server. It is easy enough that you can have one running in five minutes.

Installing Solr
Starting Solr
Indexing Data
Searching
Shutdown

Installing Solr

The steps in this article can be followed on Linux and Mac. Windows is not covered.

JDK 7 or later must be installed beforehand. If you use OpenJDK, install openjdk-7.

wget http://download.nextag.com/apache/lucene/solr/5.3.0/solr-5.3.0.tgz  
tar -zxvf solr-5.3.0.tgz  
cd solr-5.3.0  

Starting Solr

Solr ships with an example folder containing samples that can be tried out directly. A brief description of each:

cloud        : SolrCloud example
dih          : Data Import Handler (rdbms, mail, rss, tika)
schemaless   : Schema-less example (schema is inferred from data during indexing)
techproducts : Kitchen sink example providing comprehensive examples of Solr features

Use bin/solr -e [example name] to run one of these bundled examples directly. For example:

bin/solr -e dih  
runs the dih example.

Let's run the techproducts example.

Enter:

bin/solr -e techproducts  

You should see output like the following in the terminal:

Creating Solr home directory /tmp/solrt/solr-5.3.0/example/techproducts/solr

Starting up Solr on port 8983 using command:  
bin/solr start -p 8983 -s "example/techproducts/solr"

Waiting up to 30 seconds to see Solr running on port 8983 [/]  
Started Solr server on port 8983 (pid=12281). Happy searching!

Setup new core instance directory:  
/tmp/solrt/solr-5.3.0/example/techproducts/solr/techproducts

Creating new core 'techproducts' using command:  
http://localhost:8983/solr/admin/cores?action=CREATE&name=techproducts&instanceDir=techproducts

{
  "responseHeader":{
    "status":0,
    "QTime":2060},
  "core":"techproducts"}


Indexing tech product example docs from /tmp/solrt/solr-5.3.0/example/exampledocs  
SimplePostTool version 5.0.0  
Posting files to [base] url http://localhost:8983/solr/techproducts/update using content-type application/xml...  
POSTing file money.xml to [base]  
POSTing file manufacturers.xml to [base]  
POSTing file hd.xml to [base]  
POSTing file sd500.xml to [base]  
POSTing file solr.xml to [base]  
POSTing file utf8-example.xml to [base]  
POSTing file mp500.xml to [base]  
POSTing file monitor2.xml to [base]  
POSTing file vidcard.xml to [base]  
POSTing file ipod_video.xml to [base]  
POSTing file monitor.xml to [base]  
POSTing file mem.xml to [base]  
POSTing file ipod_other.xml to [base]  
POSTing file gb18030-example.xml to [base]  
14 files indexed.  
COMMITting Solr index changes to http://localhost:8983/solr/techproducts/update...  
Time spent: 0:00:00.491

Solr techproducts example launched successfully. Direct your Web browser to http://localhost:8983/solr to visit the Solr Admin UI  

To check that Solr is running, enter:

bin/solr status  

The output will look roughly like this:

Found 1 Solr nodes: 

Solr process 12281 running on port 8983  
{
  "solr_home":"/tmp/solrt/solr-5.3.0/example/techproducts/solr/",
  "version":"5.3.0 1696229 - noble - 2015-08-17 17:10:43",
  "startTime":"2015-09-14T22:41:56.876Z",
  "uptime":"0 days, 0 hours, 1 minutes, 7 seconds",
  "memory":"32.7 MB (%6.7) of 490.7 MB"
}

Solr is now running. You can open http://localhost:8983/solr/ in a browser to take a look.

Indexing Data

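The startup script has already added some sample data to the Solr instance, but we will push in a bit more to see how Solr works.

The example/exampledocs folder contains the XML files used below.

Take a quick look at what these XML files contain. Each one is made up of fields, and each field has a name and a value. For example: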

<add><doc>  
  <field name="id">9885A004</field>
  <field name="name">Canon PowerShot SD500</field>
  <field name="manu">Canon Inc.</field>
...
  <field name="inStock">true</field>
</doc></add>
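
To make the field structure concrete, here is a minimal Python sketch that produces a document in the same <add><doc> shape using only the standard library. The function name build_add_xml is hypothetical and chosen for illustration; it is not part of Solr or its example files.

```python
import xml.etree.ElementTree as ET


def build_add_xml(fields):
    """Return an <add><doc>...</doc></add> string for one document.

    `fields` is a list of (name, value) string pairs. A list is used
    instead of a dict so the same field name can appear more than once.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields:
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


print(build_add_xml([("id", "9885A004"), ("name", "Canon PowerShot SD500")]))
```

Building the XML through ElementTree rather than string concatenation means special characters in values, such as the & in a product name, are escaped automatically.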

This folder also contains a file named post.jar, which provides a quick way to import files into Solr.

cd example/exampledocs  
java -Dc=techproducts -jar post.jar sd500.xml  

The output looks like this:

SimplePostTool version 5.0.0  
Posting files to [base] url http://localhost:8983/solr/techproducts/update using content-type application/xml...  
POSTing file sd500.xml to [base]  
1 files indexed.  
COMMITting Solr index changes to http://localhost:8983/solr/techproducts/update...  
Time spent: 0:00:00.186  

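This shows that the POST request succeeded.

There are two ways to import data into Solr:

1. HTTP
2. Native client

The details are covered in their own chapters, so they are not repeated here.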
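
As a rough illustration of the HTTP route, the Python sketch below prepares the same kind of POST that post.jar reports in its output: XML sent to the techproducts update URL with content type application/xml. The function names are hypothetical, and adding commit=true to the URL is an assumption made here so that the change becomes visible right away; post.jar issues its commit as a separate step.

```python
import urllib.request


def make_update_request(xml_text,
                        base="http://localhost:8983/solr",
                        core="techproducts"):
    """Prepare (but do not send) a POST to the core's update handler."""
    # Assumption: commit immediately instead of committing in a second call.
    url = f"{base}/{core}/update?commit=true"
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

With Solr running, reading sd500.xml into a string and calling send(make_update_request(xml_text)) should index the file in much the same way as the post.jar command above.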

Searching

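Let's see how to retrieve the data we just added.

Because Solr accepts HTTP requests directly, you can interact with it straight from the browser. Enter this in the address bar: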

http://localhost:8983/solr/techproducts/select?q=sd500&wt=json  

This returns JSON like the following:

{
    "responseHeader": {
        "status": 0,
        "QTime": 3,
        "params": {
            "q": "sd500",
            "wt": "json"
        }
    },
    "response": {
        "numFound": 1,
        "start": 0,
        "docs": [
            {
                "id": "9885A004",
                "name": "Canon PowerShot SD500",
                "manu": "Canon Inc.",
                "manu_id_s": "canon",
                "cat": [
                    "electronics",
                    "camera"
                ],
                "features": [
                    "3x zoop, 7.1 megapixel Digital ELPH",
                    "movie clips up to 640x480 @30 fps",
                    "2.0\" TFT LCD, 118,000 pixels",
                    "built in flash, red-eye reduction"
                ],
                "includes": "32MB SD card, USB cable, AV cable, battery",
                "weight": 6.4,
                "price": 329.95,
                "price_c": "329.95,USD",
                "popularity": 7,
                "inStock": true,
                "manufacturedate_dt": "2006-02-13T15:26:37Z",
                "store": "45.19614,-93.90341",
                "_version_": 1512330534874775600
            }
        ]
    }
}

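Good. This confirms that the contents of the sd500.xml file we just imported were indexed correctly.

Now let's do some real searching.

Here is an example that retrieves the documents with inStock = false: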
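
The same query URL can also be assembled in code rather than typed by hand. This is a minimal Python sketch; select_url is a hypothetical helper name, and the defaults simply mirror the host, port, and core used in this article.

```python
from urllib.parse import urlencode


def select_url(q, core="techproducts",
               base="http://localhost:8983/solr", **params):
    """Build a /select URL; extra keyword arguments become URL parameters."""
    query = {"q": q, "wt": "json"}
    query.update(params)
    return f"{base}/{core}/select?{urlencode(query)}"


print(select_url("sd500"))
# prints http://localhost:8983/solr/techproducts/select?q=sd500&wt=json
```

Note that urlencode percent-encodes reserved characters, so a call such as select_url("inStock:false", fl="id,name") yields q=inStock%3Afalse and fl=id%2Cname, which decode to the same values on the server side.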

Request: http://localhost:8983/solr/techproducts/select?q=inStock:false&wt=json&fl=id,name

Response:
{
    "responseHeader": {
        "status": 0,
        "QTime": 3,
        "params": {
            "fl": "id,name",
            "q": "inStock:false",
            "wt": "json"
        }
    },
    "response": {
        "numFound": 4,
        "start": 0,
        "docs": [
            {
                "id": "EN7800GTX/2DHTV/256M",
                "name": "ASUS Extreme N7800GTX/2DHTV (256 MB)"
            },
            {
                "id": "100-435805",
                "name": "ATI Radeon X1900 XTX 512 MB PCIE Video Card"
            },
            {
                "id": "F8V7067-APL-KIT",
                "name": "Belkin Mobile Power Cord for iPod w/ Dock"
            },
            {
                "id": "IW-02",
                "name": "iPod & iPod Mini USB 2.0 Cable"
            }
        ]
    }
}
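
A response of this shape is straightforward to consume from a program. The following Python sketch pulls out the hit count and the documents. The function name extract_docs is hypothetical, and the sample string is a shortened copy of the response above, not fresh output from Solr.

```python
import json


def extract_docs(response_text):
    """Return (numFound, docs) from a Solr JSON select response."""
    body = json.loads(response_text)
    status = body["responseHeader"]["status"]
    if status != 0:
        raise ValueError(f"Solr reported status {status}")
    response = body["response"]
    return response["numFound"], response["docs"]


# Shortened copy of the response shown above, keeping one document.
sample = """
{"responseHeader": {"status": 0, "QTime": 3},
 "response": {"numFound": 4, "start": 0,
              "docs": [{"id": "IW-02",
                        "name": "iPod & iPod Mini USB 2.0 Cable"}]}}
"""

found, docs = extract_docs(sample)
for doc in docs:
    print(doc["id"], doc["name"])
```

numFound is the total number of matches, while docs holds only the page of results that was returned, so the two can differ.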

Other tutorials cover the full range of URL request parameters.

Shutdown

To shut Solr down, just run bin/solr stop.

Solr itself is quite robust, so even an OS or disk crash will generally not corrupt the Solr index.

Original article:

http://www.solrtutorial.com/solr-in-5-minutes.html