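Implementing Large-Scale Data Export with Solr /export
When Solr has to export a very large result set, the export is stream-based: as soon as the server matches the first document, it starts flushing data to the client.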
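As a rough illustration of what a request to the export endpoint looks like, here is a minimal sketch that assembles an /export URL using only the JDK. The class name ExportUrlBuilder, the host, the collection name collection1, and the field names id and price are hypothetical; the sketch only relies on the fact that /export requests carry q, sort, and fl parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class ExportUrlBuilder {

    // Builds "<coreUrl>/export?q=...&sort=...&fl=..." with URL-encoded values.
    // coreUrl is assumed to be the base URL of a core or collection (hypothetical here).
    public static String buildExportUrl(String coreUrl, String q, String sort, String fl) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);       // main query
        params.put("sort", sort); // sort criteria, e.g. "id asc"
        params.put("fl", fl);     // comma-separated list of fields to export

        StringBuilder sb = new StringBuilder(coreUrl).append("/export");
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator).append(e.getKey()).append('=').append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildExportUrl(
                "http://localhost:8983/solr/collection1", "*:*", "id asc", "id,price"));
    }
}
```

The resulting URL can be fetched with any HTTP client and the response read incrementally, which is what makes the streaming behaviour useful on the client side.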
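Every field to be exported must have docValues set to true on its field element in the schema, and the /export request handler must be configured in solrconfig.xml: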
<requestHandler name="/export" class="solr.SearchHandler">
  <lst name="invariants">
    <str name="rq">{!xport}</str>
    <str name="wt">xsort</str>
    <str name="distrib">false</str>
  </lst>
  <arr name="components">
    <str>query</str>
  </arr>
</requestHandler>

The client-side query code is as follows:
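import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.commons.io.FileUtils;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.StreamingResponseCallback;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
// StringUtils comes from Apache Commons Lang; the original does not say which version.

// `fields` (comma-separated field names), `url` (Solr URL) and `outfile` are defined elsewhere.
final String[] fl = StringUtils.split(fields, ",");
// Constructor from older SolrJ releases; newer releases use HttpSolrClient.Builder.
SolrClient client = new HttpSolrClient(url);
// The original snippet does not show how `query` is created; a match-all query is assumed.
SolrQuery query = new SolrQuery("*:*");
query.setDistrib(false);
query.setFields(fields);
query.setRows(9999999);
final PrintWriter writer = new PrintWriter(new OutputStreamWriter(
        FileUtils.openOutputStream(outfile), StandardCharsets.UTF_8));
// Write the header row: field names separated by commas, then a line break.
writer.println(String.join(",", fl));
final AtomicInteger count = new AtomicInteger(0);
QueryResponse result = client.queryAndStreamResponse(query,
        new StreamingResponseCallback() {
            @Override
            public void streamSolrDocument(SolrDocument doc) {
                // process doc (called once per document as it arrives)
            }

            @Override
            public void streamDocListInfo(long numFound, long start, Float maxScore) {
                // writer.println("numFound:" + numFound);
            }
        });
writer.close();
System.out.println("numFound:" + result.getResults().getNumFound());
client.close();

Relevant Solr server-side code: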
Query parser (QP):
ExportQParserPlugin, the query parser used by the export handler
Streaming, sorted output of query results:
SortingResponseWriter
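To give an intuition for how sorted results can be streamed without sorting the whole result set in memory, here is a simplified sketch of one general technique: repeatedly scan the set of matching documents, keep only the next batch of smallest sort values in a bounded heap, emit that batch in order, and clear those documents from the set. This is not Solr's actual code; the class BatchedSortedExport, the method names, the single long sort key, and the batch size are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class BatchedSortedExport {

    // Returns the matching doc ids in ascending order of sortValues[doc],
    // producing them batch by batch so that at most batchSize + 1 entries
    // are held in the heap at any time.
    public static List<Integer> export(long[] sortValues, BitSet matches, int batchSize) {
        BitSet remaining = (BitSet) matches.clone();
        List<Integer> out = new ArrayList<>();
        // Order by sort value, breaking ties by doc id for a stable result.
        Comparator<Integer> byValue = Comparator
                .<Integer>comparingLong(doc -> sortValues[doc])
                .thenComparingInt(doc -> doc);

        while (!remaining.isEmpty()) {
            // Max-heap: the head is the largest of the candidates kept so far.
            PriorityQueue<Integer> heap = new PriorityQueue<>(byValue.reversed());
            for (int doc = remaining.nextSetBit(0); doc >= 0; doc = remaining.nextSetBit(doc + 1)) {
                heap.add(doc);
                if (heap.size() > batchSize) {
                    heap.poll(); // drop the largest, keeping the batchSize smallest
                }
            }
            List<Integer> batch = new ArrayList<>(heap);
            batch.sort(byValue);
            for (int doc : batch) {
                out.add(doc);          // in a real writer this would be flushed to the client
                remaining.clear(doc);  // never visit this doc again
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] sortValues = {50, 10, 40, 20, 30};
        BitSet matches = new BitSet();
        matches.set(0);
        matches.set(1);
        matches.set(2);
        matches.set(4); // doc 3 does not match the query
        System.out.println(export(sortValues, matches, 2)); // prints [1, 4, 2, 0]
    }
}
```

The trade-off in this sketch is repeated passes over the matching set in exchange for bounded memory: each pass costs time proportional to the number of remaining matches, but the first batch can be sent to the client long before the last one is computed.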