Where Sphinx Search v. 0.9.9 ends

My source data kept growing until I ran into a problem with version 0.9.9.

1. INDEXER

While building the index I received the following warning:
WARNING: duplicate document ids found

I checked twice: the source data contained no duplicates.
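
A check of this kind can be sketched as follows. This is only an illustration in C++, not the query or tool I actually used: it assumes the document IDs have already been loaded into memory as 64-bit integers, and the name findDuplicateIds is hypothetical.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Returns every ID that occurs more than once in `ids`.
// Each duplicate is reported once, in the order its first repeat is seen.
// Hypothetical helper for illustration; not part of Sphinx.
std::vector<std::uint64_t> findDuplicateIds(const std::vector<std::uint64_t>& ids) {
    std::unordered_set<std::uint64_t> seen;      // IDs encountered so far
    std::unordered_set<std::uint64_t> reported;  // duplicates already recorded
    std::vector<std::uint64_t> duplicates;
    for (std::uint64_t id : ids) {
        const bool isRepeat = !seen.insert(id).second;
        if (isRepeat && reported.insert(id).second) {
            duplicates.push_back(id);
        }
    }
    return duplicates;
}
```

An empty result means the input really has no repeated IDs, in which case the warning must come from somewhere other than the source data.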

This is a sample of the indexer output from a run in which the above warning occurred:
collected 37556684 docs, 88183.8 MB
collected 203872754 attr values
sorted 203.9 Mvalues, 100.0
sorted 53924.6 Mhits, 100.0

2. SEARCHD

The next problem was that searchd was unable to bring this index up (I received the message “Segmentation fault”).
I observed the following errors in /var/log/messages:
Feb 15 15:54:57 shp01 kernel: searchd[4860]: segfault at 0000000000000000 rip 000000000048291e rsp 00007ffff30ebba0 error 4
Feb 17 08:32:12 shp01 kernel: searchd[2692]: segfault at 0000000000000020 rip 00000000004d627b rsp 00007ffff30eabf0 error 6
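
The number after "error" in these lines is the x86 page-fault error code, and its low three bits say what kind of access failed. The sketch below decodes just those bits; the function name describePageFault is my own and is not part of the kernel or of Sphinx.

```cpp
#include <cstdint>
#include <string>

// Describes the low three bits of an x86 page-fault error code,
// i.e. the number printed after "error" in a kernel segfault line.
//   bit 0: 0 = page not present, 1 = protection violation
//   bit 1: 0 = read access,      1 = write access
//   bit 2: 0 = kernel mode,      1 = user mode
// Hypothetical helper for illustration only.
std::string describePageFault(std::uint32_t errorCode) {
    std::string out;
    out += (errorCode & 0x4) ? "user-mode " : "kernel-mode ";
    out += (errorCode & 0x2) ? "write" : "read";
    out += (errorCode & 0x1) ? ", protection violation" : ", page not present";
    return out;
}
```

For the two lines above, error 4 decodes as a user-mode read of a page that is not present, and error 6 as a user-mode write of a page that is not present. Together with fault addresses of 0x0 and 0x20, this is what a NULL pointer dereference (or an access at a small offset from a NULL pointer) typically looks like, although the log alone does not prove it.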

After examining the sources, I think the problem in the indexer was at:
# in the loop
while ( qDocinfo.GetLength() )
With the following code I noticed that many of the IDs had the value 0:
bool bPBiszero = DOCINFO2ID ( pEntry ) ==0;
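
To show the idea outside the indexer, here is a self-contained sketch that counts zero IDs in a buffer of rows. It makes assumptions that do not come from the Sphinx sources: rows are fixed-stride arrays of 32-bit words, each row starts with a 64-bit document ID, and rowToId is a hypothetical stand-in for DOCINFO2ID rather than the real macro.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for DOCINFO2ID: reads a 64-bit ID from the
// first two 32-bit words of a row (assumed layout, not Sphinx's code).
std::uint64_t rowToId(const std::uint32_t* row) {
    std::uint64_t id = 0;
    std::memcpy(&id, row, sizeof(id));  // memcpy avoids an unaligned/aliasing read
    return id;
}

// Counts rows whose document ID is 0 in a flat buffer of fixed-stride rows.
// `stride` is the row length in 32-bit words; it must be at least 2
// so that every row can hold the 64-bit ID.
std::size_t countZeroIds(const std::vector<std::uint32_t>& rows, std::size_t stride) {
    if (stride < 2) {
        return 0;  // a row this short cannot contain an ID
    }
    std::size_t zeros = 0;
    for (std::size_t off = 0; off + stride <= rows.size(); off += stride) {
        if (rowToId(&rows[off]) == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

A non-zero count on data whose source IDs are all positive would point at the same symptom I saw: IDs being lost somewhere between the source and the sorted docinfo.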

When searchd started, it had the following problems:

# in the method:
bool CSphIndex_VLN::Preread ()  
# in the loop:
for ( DWORD i=1; i=m_pMva.GetNumEntries() )
   // printf("broken index: mva docid verification failed [...]
    continue;
if ( i==0 && DOCINFO2ID(pMva-DOCINFO_IDSIZE)!=uDocID )
    // printf("broken index: mva docid verification failed, (docidfmt: [...]
    if (uDocID == 0)
        continue;

After that, the index worked (but without any consistency guarantee).

The solution is to use the latest version (2.0.4).
