Thursday 9 July 2015

Solr Preprocessing and Indexing in WCS

Scenario

Create a custom preprocessor and index the new attribute ‘BESTSELLER’ in the catalog entry document.

Steps for Preprocessing

1. Navigate to ..\IBM\WCDE_ENT70\search\pre-processConfig\MC_10001\DB2 and create a folder ‘bestseller’. 
2. Create a custom preprocessor with the lines of code below and name the file ‘wc-dataimport-preprocess-fullbuild.xml’. It performs the preprocessing (explosion and flattening) of data from the custom bestseller table. (The custom preprocessor is named ‘wc-..-..-fullbuild.xml’ because the preprocessing script looks for this file when it is executed.)

<?xml version="1.0" encoding="UTF-8"?>

<_config:DIHPreProcessConfig xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../xsd/wc-dataimport-preprocess.xsd ">        
 <_config:data-processing-config processor="com.ibm.commerce.foundation.dataimport.preprocess.StaticAttributeDataPreProcessor" masterCatalogId="10001" batchSize="500">
    <_config:table definition="CREATE TABLE XI_BESTSELLER_0_#lang_tag# (CATENTRY_ID BIGINT NOT NULL, BESTSELLER VARCHAR(10240))" name="XI_BESTSELLER_0_#lang_tag#"/>
                <_config:query sql="SELECT TI_CE.CATENTRY_ID CATENTRY_ID, ATTRVALDESC.STRINGVALUE BESTSELLER
                                                 FROM TI_CATENTRY_0 TI_CE, CATENTRYATTR CATENTRYATTR, ATTRVALDESC ATTRVALDESC
                                                  WHERE
                                                                TI_CE.CATENTRY_ID = CATENTRYATTR.CATENTRY_ID
                                                                AND CATENTRYATTR.ATTR_ID = (SELECT ATTR_ID FROM ATTR WHERE IDENTIFIER = 'BESTSELLER')
                                                                AND CATENTRYATTR.ATTRVAL_ID = ATTRVALDESC.ATTRVAL_ID
                                                                AND ATTRVALDESC.LANGUAGE_ID =?language_id?
                                                 ORDER BY CATENTRY_ID"/>
    <_config:mapping>
      <_config:key queryColumn="CATENTRY_ID" tableColumn="CATENTRY_ID"/>
      <_config:column-mapping>
        <_config:column-column-mapping>
          <_config:column-column queryColumn="BESTSELLER" tableColumn="BESTSELLER" />
        </_config:column-column-mapping>
      </_config:column-mapping>
    </_config:mapping>
  </_config:data-processing-config>
 
 </_config:DIHPreProcessConfig>
 
3. Navigate to ..\IBM\WCDE_ENT70\bin and run the preprocessing scripts using the below command –

di-preprocess.bat ..\IBM\WCDE_ENT70\search\pre-processConfig\MC_10001\DB2\bestseller -force true

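4. Validate that the preprocessing completed successfully –
Query the temporary table created by the preprocessor (for example XI_BESTSELLER_0_1, the name used in the join in the indexing steps) and check that it is populated with the expected column values.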
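
The validation above could be done with queries like the following. This is only a minimal sketch, assuming a DB2 database and assuming that #lang_tag# resolved to 1, so that the table is named XI_BESTSELLER_0_1; adjust the table name if your language tag differs.

```sql
-- Assumption: the preprocessor created XI_BESTSELLER_0_1 (#lang_tag# = 1).

-- How many catalog entries received a BESTSELLER value?
SELECT COUNT(*) AS ROW_COUNT
FROM XI_BESTSELLER_0_1;

-- Inspect a few rows to confirm the column values look right.
SELECT CATENTRY_ID, BESTSELLER
FROM XI_BESTSELLER_0_1
ORDER BY CATENTRY_ID
FETCH FIRST 10 ROWS ONLY;
```

A row count of zero usually means that no catalog entry has a value for the BESTSELLER attribute in that language, or that the preprocessor configuration was not picked up.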

Steps for Indexing

1. The schema.xml needs to be customized to add the new field assignment for the data from the preprocessing table. Navigate to ..\IBM\WCDE_ENT70\search\solr\home\MC_10001\en_US\CatalogEntry\conf and edit the schema.xml.
Add the following field within the <fields> tag –

<field name="BESTSELLER" type="wc_keywordText" indexed="true" stored="true" multiValued="true"/>

2. Each data import configuration file (wc-data-config.xml) needs to be modified to pull the BESTSELLER data from the XI table and add it to the index for the catalog entries of the particular store.
Edit the wc-data-config.xml file and make the following changes –

Go to the query section and, inside the ‘select’ list, add XI_BESTSELLER.BESTSELLER BESTSELLER,
Inside ‘FROM CATENTRY’, add LEFT OUTER JOIN XI_BESTSELLER_0_1 XI_BESTSELLER ON (CATENTRY.CATENTRY_ID=XI_BESTSELLER.CATENTRY_ID)

Add the field mapping –

<field column="BESTSELLER" splitBy=";" sourceColName="BESTSELLER"/>

Here, column refers to the schema field in Solr and sourceColName refers to the column returned by the database query.
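
Before editing wc-data-config.xml, the select and join fragments from this step can be tried on their own. The query below is a simplified sketch, not the actual query from wc-data-config.xml, which selects many more columns and joins more tables; it assumes DB2 and the table name XI_BESTSELLER_0_1 used in the join above.

```sql
-- Simplified stand-in for the data import query: only the two
-- fragments added in this step, plus the key column.
SELECT CATENTRY.CATENTRY_ID,
       XI_BESTSELLER.BESTSELLER BESTSELLER
FROM CATENTRY
     LEFT OUTER JOIN XI_BESTSELLER_0_1 XI_BESTSELLER
       ON (CATENTRY.CATENTRY_ID = XI_BESTSELLER.CATENTRY_ID)
ORDER BY CATENTRY.CATENTRY_ID
FETCH FIRST 10 ROWS ONLY;
```

Because the join is a LEFT OUTER JOIN, catalog entries without a BESTSELLER value still appear, with NULL in the BESTSELLER column, so they are not dropped from the index.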

3. Navigate to ..\IBM\WCDE_ENT70\bin and run the indexing scripts using the below command –

di-buildindex.bat -masterCatalogId 10001 -indextype CatalogEntry -localename en_US

4. Validate that the field is indexed into Solr –
Query the CatalogEntry core through the Solr URL and check that the returned documents contain the BESTSELLER field, populated with values.

