types.rdb - maps 6Mb of memory ... (up from 3) + block based, but multi-block records ok (apparently) + + 512 byte blocks ... storbase.cxx: storeError OStoreSuperBlockPage::create ( OStorePageBIOS &rBIOS, const OStorePageDescriptor &rDescr) { storeError eErrCode = store_E_OutOfMemory; void * p = rtl_allocateMemory (rDescr.m_nSize); if (p != 0) { // Setup initial Page. __store_memset (p, 0, rDescr.m_nSize); // Mark as modified. m_aState.modified(); // Write initial Page. eErrCode = rBIOS.write (0, p, rDescr.m_nSize); if (eErrCode == store_E_None) storbase.cxx: storeError OStorePageBIOS::create (sal_uInt16 nPageSize) eErrCode = m_pSuper->create ( *this, OStorePageDescriptor (nPageSize, nPageSize, nMinSize)); ... store/inc/store/types.h /** PageSize (recommended) default. @see store_openFile() */ #define STORE_DEFAULT_PAGESIZE ((sal_uInt16)0x0100) /** PageSize (enforced) limits. @see store_openFile() */ #define STORE_MINIMUM_PAGESIZE ((sal_uInt16)0x0100) #define STORE_MAXIMUM_PAGESIZE ((sal_uInt16)0x8000) ... storeError SAL_CALL store_openFile ( rtl_uString *pFilename, storeAccessMode eAccessMode, sal_uInt16 nPageSize, storeFileHandle *phFile ) SAL_THROW_EXTERN_C(); ... Used from registry/ + store.hxx: inline storeError createInMemory ( sal_uInt16 nPageSize = STORE_DEFAULT_PAGESIZE ) SAL_THROW(()); registry/source/regimpl.cxx (gdb) up #5 0x4001a69d in ORegistry::initRegistry (this=0x401fdea8, regName=@0xbfffc880, accessMode=4) at /opt/OpenOffice/src680-m129/registry/source/regimpl.cxx:531 531 errCode = rRegFile.create(regName, sAccessMode, REG_PAGESIZE); Crash in: storlckb.cxx #0 0x4046f0c0 in memset () from /lib/libc.so.6 #1 0x4005cdf8 in rtl_zeroMemory () from /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libuno_sal.so.3 #2 0x405288f7 in store::OStoreDirectory::create () from /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libstore.so.3 #3 0x4052f7eb in store_openDirectory () from /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libstore.so.3 #4 0x4001ae00 in ?? () from /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3 #5 0x4002563d in reg_getResolvedKeyName () from /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3 block samller than XYZ string lengths ? Page size selected is 256 Start create xManager's page size 256 fprintf( stderr, "xManager's page size %d\n", m_aDescr.m_nSize ); delete m_pNode; m_pNode = new(m_aDescr.m_nSize) inode(m_aDescr.m_nSize); if (!m_pNode) return store_E_OutOfMemory; fprintf( stderr, "created m_pNode\n"); ie. dies in this 'new' ... source/storpage.hxx: typedef OStoreDirectoryPageData inode; stordata.hxx: struct OStoreDirectoryPageData : public store::OStorePageData (cf. storbase.hxx's OStorePageData ) + the directory has a maxiumum namesize ... Potentially use a smaller STORE_MAXIMUM_NAMESIZE if the block size is smaller ? ** With a MAX_NAMESIZE of 64 bytes + works perfectly for types.rdb ... ** Problem: types.h: __store_FindData: exposes STORE_MAXIMUM_NAMESIZE as part of the (stable?) API ... => to shrink the block size ... + Can we make directory entries variable size ? & leave the MAX_PATH thing in-place ? + looks like the 'Directory' is a file ... (nice) + store.hxx: + Directory API has: + 'traveller' class 'travel' callback. + 'first', 'next' iterators. -> store.inl: store_findFirst, store_findNext + the whole API 'storeFindData' ... + registry lookups - prolly deadly slow - linear search over this file ... - m_nReserved - an index ? + storelcbk.cxx: OStoreDirectory::iterate (*pFindData) + single entry point ... + Crazy [!] + there is a public 'OStoreDirectory' interface, (all inline) store/inc/store/store.hxx + and a private 'OStoreDirectory' (storelckb.hxx) - we can bin all the 'clever' ness, and just use a 'file' to stream directory entries into (most likely) struct OStoreDirectoryPageData : public store::OStorePageData NameBlock m_aNameBlock; - block == inline foo DataBlock m_aDataBlock; sal_uInt8 m_pData[1]; + m_pData[1] == symlink target etc. + [!] hmm ... - lots of hard-coded sizes ... + [!] - no hashing of names ... * Stream API: store_openStream: -> OStoreLockBytes::create 'Stream' is wrapper around 'OStoreLockBytes' + storlckb.cxx (OStoreLockBytes) OStoreLockBytes::create __store_iget - unwinds path ? + returns 'inode' (m_pNode) OStorePageDescriptor: inode->m_aDescr inode == OStoreDirectoryPageData ... * Directory API: + inode ... [ registry code !? ] + hmm ... * Registry API: + Need to provide: + + just a ton of data structures + 1 big write, + multiple reads ... + linear scans ... [directory files] + + + directory at the end of the file ... + ... foo ... + zip file ? * The object to make vari-sized is: + 'OStoreDirectoryPageData' + includes name, data, m_pDirData + is an 'OStorePageData' - implicitely '1 block' + Contains an: + OStoreDirectoryDataBlock + data length and + pointers to inodes + 16+8+1+1 * 4 bytes = 104 bytes. + 'OStoreDirectoryPageObject' + seems to be the in-memory representation of the thing. + 'get' and 'put' methods. + [ gets a Page handle at construct time ] * What code is used to manipulate zip files [!] + package/ - contains the .zip code... + is there .zip support in the URE !? + can we just use this component ? + do we need type data for it ? ... + can we use the 'ZipFile' code ... + * Problems: + directory mgmt: + most of the data is in the 'directory' ... + but we need to be able to extend it dynamically ... + how is that done ? [ nasty scattered linked list ? ] + most simply - yes ... + Represent files - they can grow too ? + chunk lists ? + fragmentation - not an issue - always foo ... + 'blocks' ? - what we want to avoid ... + => file descriptor foo ... + Problems: + multiple directories open concurrently + concurrent (thread-safe?) writes to multiple streams + the registry API allows all of that. + Can we go 1 level up: + to the 'types' & 'services' level ? + why are we using this code to store them ? + Dies during registry foo with: Office/services.rdb/en-US_inprogress_1/services.rdb' succesful! Name too long (103) ... Name too long (103) ... >>>Connection_Impl.writeRegistryServi ie. we need a more imaginative approach here ... + Symlinks - seem to use the data itself as the destination (good) [ nice ] + 5 entry points where we need this. + how do we allocate & chain a new block + how do we subsequently free that (?) + deletefoo... + vari-sized blocks (?) rBIOS.allocate(OStorePageObject) / OStorePageBIOS::free ... STORE_PAGE_NULL + TODO: + check serialization of the extra data ... storlckb.cxx: 1. write name - from 'const rtl_String *' 2. read name - build CRC - requires psz char * 3. read from psz char * -> unicode conversion storpage.cxx: 1. write name - from pSrcName 'rtl_String *' 2. write name - from pDstName 'rtl_String *' OStorePageManager - more friendly ? setName '$VL_value' ==25250== Invalid read of size 4 ==25250== at 0x341997BF: rtl_freeMemory_CUSTOM (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libuno_sal.so.3) ==25250== by 0x341998F3: rtl_freeMemory (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libuno_sal.so.3) ==25250== by 0x3464BD16: store::OStorePageData::operator delete(void*) (storbase.hxx:561) ==25250== by 0x346514F4: store::OStoreLockBytes::~OStoreLockBytes() (storlckb.cxx:408) ==25250== by 0x34643624: store::OStoreObject::release() (object.cxx:128) ==25250== by 0x346515A8: store::OStoreLockBytes::release() (storlckb.cxx:436) ==25250== by 0x346584A2: store_releaseHandle (store.cxx:133) ==25250== by 0x341581FA: (within /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3) ==25250== by 0x34161D3E: (within /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3) ==25250== by 0x3415D65B: (within /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3) ==25250== by 0x8087F63: RegistryKey::setValue(rtl::OUString const&, RegValueType, void*, unsigned long) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x8085F67: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x807CC6E: produceFile(rtl::OString const&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80780C1: sal_main (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80776C7: main (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== Address 0xC92794DC is not stack'd, malloc'd or (recently) free'd ==25250== ==25250== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==25250== at 0x341997BF: rtl_freeMemory_CUSTOM (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libuno_sal.so.3) ==25250== by 0x341998F3: rtl_freeMemory (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libuno_sal.so.3) ==25250== by 0x3464BD16: store::OStorePageData::operator delete(void*) (storbase.hxx:561) ==25250== by 0x346514F4: store::OStoreLockBytes::~OStoreLockBytes() (storlckb.cxx:408) ==25250== by 0x34643624: store::OStoreObject::release() (object.cxx:128) ==25250== by 0x346515A8: store::OStoreLockBytes::release() (storlckb.cxx:436) ==25250== by 0x346584A2: store_releaseHandle (store.cxx:133) ==25250== by 0x341581FA: (within /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3) ==25250== by 0x34161D3E: (within /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3) ==25250== by 0x3415D65B: (within /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/lib/libreg.so.3) ==25250== by 0x8087F63: RegistryKey::setValue(rtl::OUString const&, RegValueType, void*, unsigned long) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x8085F67: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80843A0: AstDeclaration::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80860EA: AstModule::dump(RegistryKey&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x807CC6E: produceFile(rtl::OString const&) (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80780C1: sal_main (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) ==25250== by 0x80776C7: main (in /opt/OpenOffice/src680-m129/solver/680/unxlngi4.pro/bin/idlc) urd files - being generated just fine ... * Nested directories ... (!?) * [ was not using rtl_string correctly - wrt. lengths ... ] * Problems + can we re-work the inode stuff - to handle smaller files (?) + need tool to measure avg. size of entries in our .rdbs ... + measure each stream size & each file-name size ... + add it to a bucket ... - log (?) + with the new name stuff - identical types.rdb 'regview' output + * The 1st pass: - saves 5Mb michael@linux:/opt/OOInstall2/program> ls -l *.rdb -r--r--r-- 1 michael users 4521984 2005-10-04 17:57 /opt/OOInstall/program/services.rdb -rw-r--r-- 1 michael users 6127616 2005-10-05 10:01 /opt/OOInstall/program/types.rdb michael@linux:/opt/OOInstall2/program> ls -l *.rdb -r--r--r-- 1 michael users 2031616 2005-10-10 16:39 services.rdb -r--r--r-- 1 michael users 3407872 2005-10-10 16:37 types.rdb 'stats' command output for Store statistics: types.db 3553 files 3554 dirs 0 links Sizes: bucket - count - percentage 2^ 0 - 0 - 0 2^ 1 - 0 - 0 2^ 2 - 0 - 0 2^ 3 - 0 - 0 2^ 4 - 0 - 0 2^ 5 - 0 - 0 2^ 6 - 19 - 0.53 2^ 7 - 268 - 7.5 2^ 8 - 1436 - 40 2^ 9 - 1156 - 33 2^10 - 467 - 13 2^11 - 150 - 4.2 2^12 - 47 - 1.3 2^13 - 9 - 0.25 2^14 - 0 - 0 2^15 - 1 - 0.028 Name lengths: bucket - count - percentage 2^ 0 - 0 - 0 2^ 1 - 0 - 0 2^ 2 - 35 - 0.49 2^ 3 - 376 - 5.3 2^ 4 - 4975 - 70 2^ 5 - 1667 - 23 2^ 6 - 54 - 0.76 Store statistics: services.rdb: 2436 files 5863 dirs 0 links Sizes: bucket - count - percentage 2^ 0 - 0 - 0 2^ 1 - 0 - 0 2^ 2 - 0 - 0 2^ 3 - 0 - 0 2^ 4 - 1 - 0.041 2^ 5 - 804 - 33 2^ 6 - 1463 - 60 2^ 7 - 139 - 5.7 2^ 8 - 16 - 0.66 2^ 9 - 11 - 0.45 2^10 - 1 - 0.041 2^11 - 0 - 0 2^12 - 0 - 0 2^13 - 1 - 0.041 Name lengths: bucket - count - percentage 2^ 0 - 0 - 0 2^ 1 - 0 - 0 2^ 2 - 797 - 9.6 2^ 3 - 1608 - 19 2^ 4 - 3258 - 39 2^ 5 - 386 - 4.7 2^ 6 - 2212 - 27 2^ 7 - 38 - 0.46 Analysis: types.rdb 98% of files < 2^12 bytes < = 4096 bytes types.rdb 100% of names < 64 bytes services.rdb 99% of files < 2^8 bytes < = 256 bytes services.rdb 99% of names < 64 bytes Largest file == 2^15 bytes == 32k. Block size 512: 4096 = 8*512 [ we have 16 single indirection pages ] Block size 128: 4096 = 32*128 - [ 128bytes = 32 indirections ] Indirection limits: + direct 128 - + single 4096 + double 131k + triple 4Mb Types.rdb: 80% < 512bytes => + 4 direct + 2 single + 1 double + 1 triple => 8x4 = 32bytes [of 128] OStorePageGuard: 64bits ... [ 8 bytes] OStorePageKey: 64bits [ 8 bytes] OStorePageData (base): + m_aGuard - 4 + m_aDescr - 8 + m_aMarked - 4 + m_aUnused - 4 OStoreDirectoryDataBlock: + m_aGuard - 8 + m_aTable - 32 + m_nDataLen - 4 + total - 44bytes OStorePageNameBlock + m_aGuard - 8 + m_aKey - 8 + m_nAttrib - 4 + m_pNameData[] + m_nNameLength - 4 + m_pNameBlock - 4 [!? - should be 8 ] + 32bytes + leaving 32bytes of name string ... types.rdb: 99% names < 32bytes services.rdb: 72% names < 32bytes ** The current embarassment ** Work out a size (bytes) / interface ... + massive [!?] + 3000 .idl files; = 2kb each interface + strip out headers & footers + ~same size ... (6.6Mb) + gzip types.rdb: 720k + gzip text.bz2: 630k ** Valgrind - no errors ** * Needs to be backwards compatible + can open / read old & new store files ... * Screws with header changes ... + hmm ... * virtual accessors on 'OStorePageNameBlock' ? [no] + [macro driven ...]