I'm trying to index some wiki pages using Solr 7.0, but in the last step the DataImportHandler (DIH) apparently isn't extracting the data. I don't know what is happening, because no error is thrown.
When I call http://localhost:8983/solr/mycore/dataimport?command=full-import, I see two different behaviors.
The response to my first request is:
{
"responseHeader":{
"status":0,
"QTime":75
},
"initArgs":[
"defaults",[
"config","data-config.xml"
]
],
"command":"full-import",
"status":"idle",
"importResponse":"",
"statusMessages":{}
}
The second response, when I just press Enter again, is:
{
"responseHeader":{
"status":0,
"QTime":26
},
"initArgs":[
"defaults",[
"config","data-config.xml"
]
],
"command":"full-import",
"status":"idle",
"importResponse":"",
"statusMessages":{
"Total Requests made to DataSource":"0",
"Total Rows Fetched":"2",
"Total Documents Processed":"0",
"Total Documents Skipped":"0",
"Full Dump Started":"2017-10-28 07:05:31",
"":"Indexing completed. Added/Updated: 0 documents. Deleted 0
documents.",
"Committed":"2017-10-28 07:05:31",
"Time taken":"0:0:0.449"
}
}
As you can see in the second response, DIH finds 2 rows, which is exactly the number of documents I have in my test file wiki.xml. The problem is that DIH isn't turning them into indexed documents, as the message "Indexing completed. Added/Updated: 0 documents. Deleted 0 documents." shows.
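The mismatch between rows fetched and documents added can also be checked programmatically rather than by eye. The following is a minimal Python sketch, assuming the response body is available as a JSON string; the function name `summarize_dih_status` is hypothetical and not part of Solr, and the sample only reproduces the relevant fields of the second response above.

```python
import json
import re


def summarize_dih_status(body: str) -> dict:
    """Pull row and document counts out of a DIH status response (hypothetical helper)."""
    messages = json.loads(body).get("statusMessages", {})
    rows = int(messages.get("Total Rows Fetched", 0))
    processed = int(messages.get("Total Documents Processed", 0))
    # In the response above, the summary line sits under an empty-string key.
    summary = messages.get("", "")
    match = re.search(r"Added/Updated:\s*(\d+)", summary)
    added = int(match.group(1)) if match else 0
    return {
        "rows_fetched": rows,
        "documents_processed": processed,
        "documents_added": added,
        # True when rows were read but nothing reached the index.
        "rows_without_documents": rows > 0 and added == 0,
    }


# Sample built from the fields shown in the second response.
sample = json.dumps({
    "status": "idle",
    "statusMessages": {
        "Total Rows Fetched": "2",
        "Total Documents Processed": "0",
        "": "Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.",
    },
})

result = summarize_dih_status(sample)
print(result)
```

For the first response, where `statusMessages` is empty, the same helper would report zero rows, so the flag only fires in the second case.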
Here is my Solr configuration: git gist. I'm using Windows 10, Solr 7.0 and Lucene 7.0.
What I've tried so far:
- One of the fields I'm trying to extract is the "user", but it is irregular. When the user has an account, the <contributor> XML tag has two subtags, <username> (the user's nickname) and <id> (the user ID). When the user doesn't have an account, <contributor> appears with only one subtag, <ip>. So I tried importing the data without the "user" field.
- I tried extracting only the id and title. To do that, I commented out the other fields in data-config.xml.
Neither of those tests worked.
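To make the <contributor> irregularity concrete, here is a small Python sketch that reads both shapes outside of Solr. It is only an illustration, not a DIH configuration: the sample XML is a simplified stand-in for wiki.xml (no namespaces, <contributor> placed directly under <page>), and the helper names `extract_user` and `extract_pages` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for wiki.xml; the real file's layout is an assumption here.
SAMPLE = """<pages>
  <page>
    <title>First page</title>
    <id>1</id>
    <contributor><username>alice</username><id>42</id></contributor>
  </page>
  <page>
    <title>Second page</title>
    <id>2</id>
    <contributor><ip>192.0.2.10</ip></contributor>
  </page>
</pages>"""


def extract_user(contributor):
    """Return the username if present, otherwise fall back to the IP address."""
    if contributor is None:
        return None
    # findtext returns None when the subtag is missing.
    return contributor.findtext("username") or contributor.findtext("ip")


def extract_pages(xml_text):
    """Build one dict per <page> with its id, title and user."""
    root = ET.fromstring(xml_text)
    return [
        {
            # findtext("id") matches only the page's direct <id> child,
            # not the <id> nested inside <contributor>.
            "id": page.findtext("id"),
            "title": page.findtext("title"),
            "user": extract_user(page.find("contributor")),
        }
        for page in root.iter("page")
    ]


rows = extract_pages(SAMPLE)
for row in rows:
    print(row)
```

Both pages yield a row here whether the contributor has an account or not, which is the behavior I would expect from the import as well.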