Commit 7532810

Original vbisam
0 parents  commit 7532810

67 files changed

+60375
-0
lines changed

AUTHORS

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
* Trevor van Bremen <[email protected]> wrote
2+
VBISAM.
3+
4+
* Roger While <[email protected]> autoconf'd/libtoolized
5+
it. Also major code restructure.

COPYING

Lines changed: 339 additions & 0 deletions
Large diffs are not rendered by default.

COPYING.LIB

Lines changed: 504 additions & 0 deletions
Large diffs are not rendered by default.

ChangeLog

Lines changed: 311 additions & 0 deletions
@@ -0,0 +1,311 @@
1+
2007-11-27 Roger While <[email protected]>
2+
3+
* Tidy code, take out caching mechanism.
4+
5+
2007-02-26 Roger While <[email protected]>
6+
7+
* 2.0 release. First release of libtoolized version.
8+
We now have 3 extra directories : bin, tests, libvbisam.
9+
The VBISAM library proper is in libvbisam. The helper
10+
programs are in bin. Test programs are in tests.
11+
12+
Release version is defined in configure.ac
13+
14+
2004-06-07 Trevor van Bremen
15+
16+
1.03
17+
====
18+
Fixes list
19+
----------
20+
vbDataIO.c isinternal.h Makefile:
21+
Mikhail pointed out that VBISAM was illegally performing pointer arithmetic
22+
on voids within vbDataIO.c. I'd thought the gcc switch -Wall would trap
23+
such things but I was wrong! Added -Wpointer-arith to Makefile CFLAGS
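
For illustration, a minimal sketch of the portable pattern this fix implies: cast the void pointer to a byte pointer before adding an offset. The helper name `copy_at_offset` is hypothetical and does not appear in vbDataIO.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of VBISAM: copy `len` bytes from `src`
 * into `buf`, starting `offset` bytes from the beginning of `buf`. */
static void copy_at_offset(void *buf, size_t offset, const void *src, size_t len)
{
    unsigned char *dst;

    assert(buf != NULL && src != NULL);
    /* Writing `buf + offset` directly on a void * is a GNU extension.
     * ISO C needs a pointer to a complete object type, so cast first. */
    dst = (unsigned char *)buf + offset;
    memcpy(dst, src, len);
}
```

With gcc, -Wpointer-arith reports arithmetic performed directly on a `void *`, which is the class of mistake described in this entry.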
24+
isinternal.h:
25+
Mikhail added the correct values for VB_ENDIAN on HPUX and AIX. I've
26+
requested Mikhail to check whether his compilers have a CPU-dependent
27+
manifest constant defined rather than an OS-dependent one.
28+
isopen.c:
29+
Guido pointed out that I was still using the OLD free list format in
30+
the tCountRows() function. This was screwing up the returned value
31+
for the isindexinfo() call!
32+
vbIndexIO.c vbCheck.c:
33+
Guido ALSO pointed out that C-ISAM is not 100% C-ISAM compatible in that
34+
it occasionally 'forgets' to insert the 0x7f in the free-list node
35+
signature. <Sigh>
36+
CvtTo64.c Makefile isHelper.c isbuild.c isinternal.h isopen.c vbCheck.c vbDataIO.c vbIndexIO.c vbKeysIO.c vbNodeMemIO.c vbVarLenIO.c vbisam.h:
37+
Changed the dependency on _FILE_OFFSET_BITS == 64 to ISAMMODE == 1
38+
This is to allow the 64-bit file I/O system calls to function even when
39+
working on C-ISAM compatible file formats thus breaking the 31-bit (2GB)
40+
barrier and extending it to 41-bit!
41+
vbCheck.c:
42+
Fixes made such that all that's required for a successful rebuild of a
43+
fixed length row file is the key descriptor nodes in the index file to
44+
be 'valid'.
45+
Also, set the index map entry for the dictionary node AFTER it may have
46+
been squashed.
47+
istrans.c:
48+
A quick check of strace() output showed that C-ISAM is performing some
49+
rudimentary locking each time it writes to the transaction log file.
50+
I implemented the same scheme for compatibility.
51+
vbLocking.c vbLowLevel.c:
52+
Changes made to iVBLock to directly deal with being interrupted by a
53+
signal during the fcntl call.
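
A minimal sketch of one common way to handle this, assuming a blocking byte-range lock taken with `F_SETLKW`: retry the call while it fails with `EINTR`. The function `lock_range_retry` is hypothetical and is not the actual iVBLock code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical wrapper, not the actual iVBLock: apply a blocking
 * byte-range lock and retry if a signal interrupts the call. */
static int lock_range_retry(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};
    int rc;

    assert(fd >= 0);
    fl.l_type = type;          /* F_RDLCK, F_WRLCK or F_UNLCK */
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;            /* 0 means "through end of file" */

    do {
        rc = fcntl(fd, F_SETLKW, &fl);
    } while (rc == -1 && errno == EINTR);

    return rc;                 /* 0 on success, -1 with errno set */
}
```

Whether a retry or an error return is the right reaction to a signal depends on the caller. This sketch simply retries.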
54+
vbLocking.c:
55+
Made certain that the psKeyCurr[MAXSUBS] of a table are 'valid' within
56+
iVBEnter().
57+
isdelete.c:
58+
Tidyup in preparation for 2PC (iProcessDelete function added)
59+
Handle correct error return for isdelrec+isdelcurr if row isn't on file
60+
Don't try to reinsert a key if iVBKeyDelete fails, it only corrupts the
61+
file further. This should be done INSIDE iVBKeyDelete!
62+
vbCheck.c:
63+
Fixup to correctly 'free' all that was allocated
64+
vbNodeMemIO.c:
65+
If an index had LCOMPRESS but did not have TCOMPRESS, the high value
66+
entry in a node would still contain the TCOMPRESS count! (Ooops)
67+
vbNodeMemIO.c vbKeysIO.c isinternal.h:
68+
I noticed that C-ISAM outperformed VBISAM if the index didn't
69+
make use of LCOMPRESS, TCOMPRESS or DCOMPRESS. I surmised that in
70+
that case, C-ISAM is simply editing the key within the node rather than
71+
completely rebuilding the node from a linked list. I've implemented
72+
a corresponding algorithm (iQuickNodeSave) in this module to bring
73+
VBISAM into the lead again (performance wise). I'm guessing that many/
74+
most savvy C-ISAM developers always use SOME form of key compression and
75+
thus, this will PROBABLY not affect most people, but I could not BEAR
76+
the humiliation of some 25 year old piece of mature code outperforming
77+
my 6 month old creation.
78+
For anyone even MILDLY interested, the biggest performance gains in
79+
VBISAM can be made in the following functions:
80+
iVBNodeSave
81+
iVBTreeLoad
82+
iVBKeyCompare
83+
iVBKeyInsert
84+
iVBKeyDelete
85+
These currently consume the MAJORITY of CPU time and thus even a 'small'
86+
improvement in these functions can reap potentially disproportionate (large)
87+
benefit to overall throughput.
88+
vbNodeMemIO.c:
89+
Fixes to 'force' the iIsTOF and iIsEOF flags within the tree
90+
isHelper.c:
91+
a: Initialize tValue to 0 in ldquad and lValue to 0 in ldlong
92+
b: Globally reverse the VB_ENDIAN content to match BYTE_ORDER
93+
isbuild.c:
94+
isaddindex was not returning an error if the table was not open in
95+
ISEXCLLOCK mode but still did not create the new index
96+
MVTest.c:
97+
Extended to handle multiple indexes
98+
Decided to make rowlen 256 bytes too!
99+
README.64bit:
100+
New file to describe some background for 64 bit operation
101+
isrecover.c:
102+
Include error checking for call to iVBRollMeBack
103+
iswrite.c:
104+
This was writing an INCORRECT transaction on fixed length files until
105+
isreclen got set. Oops!
106+
vbBlockIO.c:
107+
Standardized all file offsets as long long (64 bit)
108+
vbKeysIO.c:
109+
Added in a couple of extra 'checking' algorithms for DEBUG use
110+
Largely rewrote the entire iVBKeyDelete function
111+
vbMemIO.c:
112+
Added a few assertions on VBTREE / VBKEY (de-)allocation
113+
isread.c:
114+
Guido pointed out that I wasn't setting iResult after iVBDataLock and
115+
was therefore returning JUNK
116+
vbNodeMemIO.c:
117+
The new iQuickNodeSave function was not handling keys with ISDUPS set
118+
vbKeysIO.c:
119+
iVBKeyLocateRow was not recursing through all the possible duplicates
120+
in order to find an EXACT match on the row number!
121+
isinternal.h isopen.c isbuild.c vbLowLevel.c:
122+
Moved from using access(2) to using stat(2).
123+
vbLowLevel.c isbuild.c isopen.c isinternal.h vbCheck.c:
124+
Implement the 'sharing' of open file handles since the close() call
125+
implicitly releases *ALL* locks on the handle that were 'owned' by the
126+
process. Changes made to insert a level of indirection on:
127+
Open, Close, Lseek, Read, Write, Lock
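
A minimal sketch of the idea, assuming a small reference-counted table keyed by path name: a second open of the same file reuses the existing descriptor, and the real close() happens only when the last user is done. All names here (`shared_file`, `shared_open`, `shared_close`) are hypothetical. The actual VBISAM indirection layer may identify files differently.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAX_SHARED 16

/* Hypothetical table entry: one real descriptor shared by `refs` users. */
struct shared_file {
    char path[256];
    int  fd;
    int  refs;
};
static struct shared_file table[MAX_SHARED];

/* Return a slot number for `path`, opening the file only on first use. */
static int shared_open(const char *path, int flags)
{
    int i, free_slot = -1;

    assert(path != NULL && strlen(path) < sizeof table[0].path);
    for (i = 0; i < MAX_SHARED; i++) {
        if (table[i].refs > 0 && strcmp(table[i].path, path) == 0) {
            table[i].refs++;            /* already open: share it */
            return i;
        }
        if (table[i].refs == 0 && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;                      /* table full */
    table[free_slot].fd = open(path, flags, 0600);
    if (table[free_slot].fd < 0)
        return -1;
    strcpy(table[free_slot].path, path);
    table[free_slot].refs = 1;
    return free_slot;
}

/* Drop one reference; close the descriptor only when none remain. */
static int shared_close(int slot)
{
    assert(slot >= 0 && slot < MAX_SHARED && table[slot].refs > 0);
    if (--table[slot].refs > 0)
        return 0;                       /* keep the fd and its locks */
    return close(table[slot].fd);
}
```

Comparing path strings is the simplest possible key and misses aliases such as symbolic links. It is used here only to keep the sketch short.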
128+
isinternal.h isHelper.c isopen.c vbLocking.c:
129+
Moved the locks from psVBFile to sVBFile and implemented table-granular
130+
locking. Note that if CISAMLOCKS is defined when compiling VBISAM, the
131+
locking strategy tries to more closely mirror that of C-ISAM. However,
132+
this introduces BUGS just like in C-ISAM! I advise against using the
133+
CISAMLOCKS and would PREFER you fixed the buggy code you have!
134+
isopen.c:
135+
Minor fixup to delay setting psVBFile handles et al to -1
136+
(Was screwing up the freeing of locks!)
137+
isinternal.h isHelper.c isdelete.c isread.c istrans.c vbLocking.c iswrite.c isrewrite.c:
138+
Change locking strategy to eliminate the concept of a 'transactional'
139+
lock. ALL locks are transactional if VBISAM has called isbegin()!
140+
iswrite.c:
141+
Only unlock a row if a transactional lock was applied *AND* the iswrite
142+
is going to FAIL!
143+
vbKeysIO.c:
144+
Changes to keep psVBFile [iHandle]->psKeyCurr [iKeyNumber] up to date on
145+
iVBKeyDelete() calls.
146+
Changes to the debug routines
147+
Fix iVBLocateRow (was NFG if ISDUPS was set!)
148+
isrewrite.c:
149+
Changes to make iRowUpdate work correctly for ISDUPS indexes
150+
vbLocking.c:
151+
Changes to iVBEnter such that it will FAIL with ENOTRANS if isbegin was
152+
not called PRIOR to the VBISAM function that is calling iVBEnter.
153+
isdelete.c:
154+
Make writing the log transaction the LAST function!
155+
Also, use pcWriteBuffer for the deleted row so the log WORKS
156+
vbNodeMemIO.c:
157+
TINY change to delay resetting iIsTOF in iNodeSplit since it was being
158+
screwed up by iVBNodeSave
159+
isrewrite.c:
160+
OK, let's face it. It was a MESS (Still is, but it worx lotz betta!)
161+
isopen.c:
162+
isindexinfo was returning an INCORRECT di_nrecords!
163+
Known Issues
164+
------------
165+
Uncertain (but MOSTLY in vbKeysIO.c):
166+
Two-phase commit (2PC) still needs to be done
167+
vbLowLevel.c:
168+
I believe that incoming signals are still an issue on at least the
169+
tVBRead () function.
170+
(write() *SHOULD* be atomic already but I know that read() is not)
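
One conventional way to address this, sketched under the assumption that a retry is acceptable: loop until the requested byte count has been read, restarting on `EINTR` and continuing after short reads. `read_fully` is a hypothetical name, not the tVBRead implementation.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read up to `count` bytes, retrying after signals
 * and short reads. Returns bytes read (less than `count` only at end of
 * file) or -1 on error. */
static ssize_t read_fully(int fd, void *buf, size_t count)
{
    unsigned char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```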
171+
vbCheck.c:
172+
Still not able to process ISVARLEN files
173+
Since vbCheck doesn't call iVBExit until it's finished all its work, it
174+
becomes a REAL memory hog! (See vbMemIO.c below)
175+
vbMemIO.c:
176+
More of a 'suggestion' than an 'issue'. It's probably wise to 'limit'
177+
the amount of RAM any given VBISAM process can allocate. An environment
178+
variable would be the logical way to do this. Also, it might be cool
179+
to make it so that you could allow a maximum of X bytes per table with
180+
an overall limit of Y bytes for the process as a whole.
181+
Documentation:
182+
Heck, I need to write a whole freaking BOOK!!!
183+
Reminder to self to include the environment variables in the docs
184+
================================================================================
185+
06Jun2004 1.02-beta (D-Day!)
186+
============================
187+
Fixes list
188+
----------
189+
LOTS of changes!!! (Certain that I've NOT covered them all here)
190+
191+
iswrite() was creating a row in the data file BEFORE having created the index
192+
entries associated with it. If an EDUPL error resulted, the data row was not
193+
purged. Fixed by delaying writing the data row till AFTER the indexes were
194+
added.
195+
196+
Management of the data row free lists was a mess. Previously deleted data rows
197+
were *not* being reused.
198+
199+
Interprocess disturbances causing 105 (EBADFILE) errors were common during
200+
iswrite () and iscommit () operations
201+
202+
Using the same transaction log file for multiple processes was causing 105
203+
(EBADFILE) errors
204+
205+
iswrite was inserting a new row into the *WRONG* node if it was exactly
206+
replacing a row that had previously been deleted. (Including the dup number)
207+
208+
iVBEnter () and iVBExit () code fixed to better handle concurrency
209+
(Most specifically, the handling of the iIsDictLocked to determine whether the
210+
iVBExit () call should update the dictionary node transaction number)
211+
Many source modules modified to change handling of iIsDictLocked.
212+
213+
Minor 'touch ups' made to the internal cache handling functions iVBBlockXXXX()
214+
Also, externalized them into their own unique module (vbBlockIO.c) with a switch
215+
possible in isinternal.h to allow selection of caching. (VB_CACHE)
216+
217+
Bug fixed where performing an isread (ISEQUAL) on a table where the index was
218+
ISDUPS would NOT find any matching rows if the FIRST (duplicate number 0) row
219+
of that key value had been deleted.
220+
221+
I re-tested the suite 'performance' again (Thank goodness for strace) to
222+
see what else could be optimized...
223+
In doing so, I found a HUGE saving in the TreeLoad function.
224+
Instead of searching the linked-list of keys in the VBTREE structure, I
225+
implemented a simple array of keys and devised a crude but effective list
226+
bi-section algorithm. VBISAM now beats the pants off the competition!
227+
It's possible that I can optimize this even further and, based
228+
upon the DRAMATIC improvements it's offered thus far, I will probably do so.
229+
Also, I need to phase out the old use of the psKeyFirst and psKeyLast values
230+
held in the VBTREE structure.
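
The array-plus-bisection idea can be sketched as a standard lower-bound binary search over keys held in sorted order. The `key_entry` layout and the fixed-length `memcmp` comparison are assumptions made for illustration. The real VBKEY structure and iVBKeyCompare logic are not shown in this ChangeLog.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified key entry (the real VBKEY differs). */
struct key_entry {
    const unsigned char *key;   /* fixed-length key bytes */
    long row;                   /* row number the key points at */
};

/* Return the index of the first entry whose key is >= target,
 * or `count` if every key is smaller. `keys` must be sorted. */
static size_t key_lower_bound(const struct key_entry *keys, size_t count,
                              const unsigned char *target, size_t key_len)
{
    size_t lo = 0, hi = count;

    assert(keys != NULL || count == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (memcmp(keys[mid].key, target, key_len) < 0)
            lo = mid + 1;       /* target lies in the upper half */
        else
            hi = mid;           /* target is at mid or below */
    }
    return lo;
}
```

A linked-list scan inspects up to n keys per node, while bisection inspects about log2(n), which is consistent with the large saving reported above.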
231+
232+
Fixed up the isrecover () code to truly recover things as it should!
233+
234+
Wrote a simple program (vbRecover) that performs the actual isrecover given the
235+
name of the log file on the command line
236+
237+
Wrote the beginnings of the VBISAM equivalent to bcheck as vbCheck
238+
239+
Known Issues
240+
------------
241+
STILL have not bothered to write iscluster ()
242+
I simply CANNOT see enough relevance to do so
243+
244+
STILL have not bothered to write isaudit ()
245+
Does *ANYONE* out there really USE this?
246+
247+
Need to test the functionality of deleting a row within a transaction...
248+
Should it be possible for an unrelated process to create a row with the SAME
249+
unique index values before the transaction has been 'committed'?
250+
Interestingly enough, C-ISAM Vn 7.2 fails MISERABLY on this issue to the point
251+
where the DataFree list in the index file becomes corrupted.
252+
See the bug files on SF.net titled "When is a bug NOT a bug?"
253+
Note for self to fully implement ACID within VBISAM
254+
This is in the process of being addressed by way of the 2PC (Two Phase Commit)
255+
but is NOT fully implemented in this release!
256+
A strong CAVEAT with respect to the 2PC code is that the index file of a given
257+
table will be left in an inconsistent state for much longer periods than the
258+
competition. Specifically, any key deletion will *NOT* take effect until the
259+
transaction is completed with an isrollback or iscommit call. A system-crash
260+
occurring during a transaction thus leaves the index file a little 'screwed up'.
261+
However, the vbCheck and isrecover system of VBISAM automagically deals with
262+
this issue. Furthermore, it is always good programming practice to make any
263+
transaction as short as possible (thereby minimizing the effect) and it's also
264+
good practice to at least vbCheck (if not completely recover) the affected
265+
tables subsequent to a system crash.
266+
267+
Need to fully test the row release code in isHelper.c (isrelease, isrelrec and
268+
isrelcurr). Specifically, I want to make sure that they are 'transaction-safe'
269+
270+
Many function calls are still made without testing the return value. Most of
271+
these are flagged in the source with a comment containing the word 'BUG'.
272+
273+
Still need to 'complete' the isrecover code to handle the following transaction
274+
types:
275+
VBL_BUILD, VBL_CREINDEX, VBL_DELINDEX, VBL_CLUSTER
276+
VBL_FILEERASE, VBL_RENAME, VBL_SETUNIQUE, VBL_UNIQUEID
277+
278+
Still haven't implemented the virtual file system. If I bother at all with
279+
doing so, it will come *AFTER* the 2PC code.
280+
281+
With all the changes made in this release, I've not yet ascertained whether the
282+
64-bit file I/O is fully functional. I have no specific reason to suspect
283+
otherwise, but YMMV. Let me know if you encounter any issues.
284+
285+
Summary
286+
-------
287+
All in all, quite a valuable release this time IMNSHO. Give it a few weeks' time
288+
to allow things to 'settle' and I'll seriously consider moving the SF.net status
289+
of VBISAM from BETA to whatever comes next...
290+
As always, many thanks to those in the virtual OSS world who have assisted me
291+
to 'polish' VBISAM to its current state. Feature requests are *ALWAYS* welcome
292+
================================================================================
293+
06Jan2004 1.01Beta
294+
==================
295+
Added COPYING.LIB and License line to each module header
296+
(Thanks to Johann von Nepomuk for spotting this huge oversight on my part!)
297+
298+
Changed VBBLOCK based functions in vbLowLevel.c to delay writing back to the
299+
disk files while the index is 'locked' for use. (Limits the number of system
300+
calls performed. This *REALLY* has a HUGE impact if ISEXCLLOCK is used!)
301+
302+
Implemented variable length row processing
303+
HUGE warning... This is not 100% compatible with the competition from IBM.
304+
Therefore, if you use the bcheck utility from IBM, it *MAY* report issues with
305+
the variable length nodes in the index file. This is because the IBM product
306+
uses a *STUPID* algorithm for determining which 'group' a variable length node
307+
with some free space left should belong to. (Namely, IBM appears to have
308+
cutoffs at 200-bytes, 400-bytes, 600-bytes and 800-bytes. I have chosen to use
309+
a logarithmic approach instead with cutoffs at 8-bytes, 32-bytes, 128-bytes and
310+
512-bytes). To put it MILDLY, the varlen code should be considered 'UNSTABLE'
311+
in this release.
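
A sketch of how the logarithmic cutoffs quoted above (8, 32, 128 and 512 bytes) could map a node's remaining free space to a group number. The group numbering and the treatment of the boundaries (a cutoff value belongs to the higher group) are assumptions. The ChangeLog gives only the cutoff values, and `free_space_group` is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical mapping (assumed numbering and boundaries):
 *   group 0: fewer than 8 free bytes
 *   group 1: 8..31      group 2: 32..127
 *   group 3: 128..511   group 4: 512 or more */
static int free_space_group(size_t free_bytes)
{
    static const size_t cutoff[] = { 8, 32, 128, 512 };
    int group = 0;

    while (group < 4 && free_bytes >= cutoff[group])
        group++;
    return group;
}
```

Each cutoff is four times the previous one, so small leftovers are sorted finely while large ones share coarse groups. Linear cutoffs at 200, 400, 600 and 800 bytes would instead put every node with fewer than 200 free bytes into a single group.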
