Can someone point me to an algorithm for implementing a (balanced) binary search tree in an array rather than a linked list? The only operations needed are search/insert.

The linked list is fast - O(log n) search time and O(1) insert time (once the search fails). However, the linked list has greater memory requirement - maintaining separate left and right child pointers (for the braches which are less than or more than the current node). *O(1) insertion time if no attempt is made to keep the search tree balanced.

The array would be slower - O(log n) search as well as O(log n) insert, but would require less space.
----------------
Obviously Binary Search with Insertion Sort into the list would be slower - O(Log n) search and O(n) insertion. Anyone interested in that algorithm?

Thanks.
Posted on 2006-01-02 09:54:48 by V Coder
I am sure
Posted on 2006-01-02 10:53:09 by shism2
How do you make a search algorithm in O(log n) for a linked list?
Posted on 2006-01-02 12:02:02 by QvasiModo
Depends, whats the size of your data? 100 bytes for each item?
If the data is big the linked list is the best way, the two pointers wont make a big diference, but if your data this wont be a good option.
You can group the data in pages and make a linked list of those pages, each page have ahn 32kb,
insert always on the last page, if the last page is full insert a new page.
Posted on 2006-01-02 14:56:10 by Eduardo Schardong
BASIC code. Apologies.

// BINARY SEARCH
IF n=0 THEN n=1: w(1)=value
sab:
m = (l + r)\2: wa$=value(m)
IF w$ = wa$ THEN saf
IF w$ < wa$ THEN r = m - 1 ELSE l = m + 1
IF l <= r THEN sab
// INSERTION into sorted list
incr n: FOR j = n TO l+1 STEP -1: w(j) = w(j-1): NEXT j
w(j)=value
saf: ... // Deal with data found


Yes. This is great for very small sets. I need one for a very large set, so insertion will take increasingly longer. That means a binary search tree is necessary.

The linked list version is great in that it offers O(1) insertion speed, but this results in an unbalanced list. In a worse case scenario, it produces a one-sided tree with height n instead of log2 (n). With random data it may not be too bad. In addition the linked list takes 50% more space than my data itself, and space is badly at a premium.

Now, apparently the array version would mean that the data have to be resorted after each insertion. I am now beginning to wonder if that will slow down the routine too much. Any tips please?
Posted on 2006-01-02 19:36:24 by V Coder
Does anyone know of an algorithm for implementing a (balanced) binary search tree in an array rather than a linked list? The only operations needed are search/insert.

Thanks.
Posted on 2006-01-07 18:17:04 by V Coder
Isn't a binary search tree implemented like an array only a sorted array??

You will need only 2 search functions,

1) one that return the exact place of the number or fail
2) the other that return the position where the number that  you enter will be placed, that is 3,5,9,13 and you insert 7.5, it will return position 2, after that you will only move to the left the 9 and 13 and place in the 7.5.



Also there is a variant if  remember OK, the bin search give "probability of .5 to each side, but if you have 1,2,3,4,5,..., 1000 in the first half and in the second 10000, 20000, 30000,...,10000000 then if you are searching 99999000, it will be best if you start the search more to the right than the half of the array.



You can use perhaps a hinting or caching of the last or last 2 numbers inserted/searched... for "cut" at first hand a part of the array...

For example p is for position p6 mean position-6 the number dosent matter at all ;)...

p0, p1, p2, p3,..., p100

you insert now x1 and say that you will insert it on p45 now the last number is in p101.
now you insert x2 and is compared to x1 and you take the part of the array that is correct, and so on.






Based on the anterior, you can use the following as STEPS insertion...

1) Do a buffer array with implementation of a binary search tree, do the insertions in this array, will be very fast because the size... perhaps use linear search for insertion...

2) When the array is full, commit it to the major array using the anterior way of hinthing about the last number inserted, and because is already ordered, you will be cuting upward your main array of search.

3) you can cut more the major array, if in the minor array, you first insert the 1st number, and then the last number, andd taking that sub-array of the main array like the place to search and insert (tought insertions will affect the major array, especially at the end)

Simple example 1:

buffer array
1,2,4,6,7,8,9,10

major array

5, 15, 40, 50, 54, 62, 70 76, 88, 89, 90, 93, 96, 99, 103, 105

First inserted
1*, 5, 15, 40, 50, 54, 62, 70 76, 88, 89, 90, 93, 96, 99, 103, 105
Last inserted
1*,5, 10*, 5, 15, 40, 50, 54, 62, 70 76, 88, 89, 90, 93, 96, 99, 103, 105
now your place to insert you will not need apply the bin search to the whole array, but to 1*, 5, 10* and move 8 places the rest of the array 15, 40, 50, 54, 62, 70 76, 88, 89, 90, 93, 96, 99, 103, 105 ;)...

Also you can try inserting the first, the last, the after the first, the before the last, and so on, for help reduce de array to "implode" himself (dont know if the correct term).





Dont know if this thing have a name, but I guess it will help a little in performance???



By the way
Posts: 666

Posted on 2006-01-07 22:30:16 by rea

Does anyone know of an algorithm for implementing a (balanced) binary search tree in an array rather than a linked list? The only operations needed are search/insert.

Thanks.

And why not mix them?

Having a structure:
dunno STRUCT
previous dd ?
size dd ?
next dd ?
array (your data type) 4000 dup(?)
dunno ENDS

To search start with the first structure, if not there search in the structure pointed by next, if next is null there are no more rows, so insert, to insert put a row on the last structure and increase the size, if size equals the size of array add a new structure (globalalloc you know what i mean), point its previous to the last structure, and point the last structure next to this new structure.

Working a little more on this idea you can sort this list and put an index to speed up the search.
Posted on 2006-01-08 07:15:45 by Eduardo Schardong
I did some search about binary search tree and saw the concept mentioned of using a search tree implemented in an array. For node i, the left child is index 2i, and the right child is index 2i+1. Inserting an item and then balancing the tree will take a really long time, but it eliminates the need to use a linked list. A linked list will be very quick but unbalanced (which does not bother me). The main problem for my purposes is that it uses extra memory.

I guess nobody knows of an algorithm that I can convert to assembly code?

Thanks anyway.

Cheers
Posted on 2006-01-08 23:07:39 by V Coder
Using a linked structure saves you memory because you don't start with an array of max size (or resizing your array). With the array structure you have to allocate enough memory to hold all the objects you want to add at the beginning, while in a linked structure you allocate memory as you go.

The end case memory usage of a linked BSP is not even 3/2 more than an array BSP

>>Array BSP with 100 objects in it BALANCED
Array 2*100+2 DWORDS long, each dword is a pointer to the data the object holds. The extra dwords are because the last object needs to have space for its children.
You'll need a byte array that holds the balance factor for each array element.
202*4+202
= 1010bytes

>>Linked BSP with 100 objects BALANCED
struct Obj
DWORD ptrToData
DWORD ptrToLeft
DWORD ptrToRight
BYTE balanceFactor ;;
endstruc
3*100*4+100
= 1300bytes


Balancing an array BST with rotations is difficult.
Why even use a BST, can't you just use a sorted array?
It would be easier to implement a quicksort algo then an AVL tree in an array IMO
Insertion = O(1) + Quicksort O(n*logn)
Search  =  O(logn)

If you really want to implement an array bst, I suggest you check out a linked implementation and just realize that when it's trying to get the left or right child you are doing 2*(index) + 1 or 2.

Here's a good LinkedAVL implementation in Java. The code is straight forward so porting it to asm shouldn't be impossible, then converting it to an array AVL would be mostly find and replace.

http://www.cs.sjsu.edu/~smithj/oldclass/146s02/solutions/a2/AVL.java
Posted on 2006-01-09 01:24:51 by r22
I did some search about binary search tree and saw the concept mentioned of using a search tree implemented in an array. For node i, the left child is index 2i, and the right child is index 2i+1.



If the tree is complete in a n-level and balanced, then at the end will be a sorted array ;).


                      6
                    /  \
                  3        11
              /  \      /    \
            1      5    9      13

It will end in something like
1,3,4,6, 9, 11,13



But also a not complete in the n-level will be sorted..

                      6
                    /  \
                  3        11
              /  \      /    \
            1            9

It will end in something like
1,3,6, 9, 11

that is without take care of the position in the array like 2*i+1 or 2*i+2, you can aply a bin-search and it will work.


If not, for the two anterior cases, you will have something like

1)
6 for the header and apart
3,11, 1,5, 9,13

2)
6 for the header
3,11, 1,_, 9,_


See that the header not being a left or right child need be outside of the array.
Posted on 2006-01-09 10:50:17 by rea
rea, the nature of the bst is to be sorted, but BALANCED means the height of all the children is no more than 1 unit apart.

  A
/  \
    B
  /  \
      C
Is an unbalanced tree, it's still sorted but the elements were added in a was that causes the structure of the tree to be inefficient for searching.
Posted on 2006-01-09 14:09:35 by r22
You mean it by this???

But also a not complete in the n-level will be sorted..


I was thinking in reference to the array, how it look (sorted), even that is not complete the bst in the n-th level, also in that way isn't ofered the "extra spaces" betwen the elements like give them the 2i+1 or 2*i+2, tought you will need move blocks of memory when inserting 1 element. Is based on such order of array I suguest the steps for insertion: buffer them, commit-insertingfirst-last each time.





What can I correct is that I fail in the first time in interpretation, the header isn't apart from the array lol, is only that it can not be acceded with 2*i+1 or 2*i+2, if you take apart the header like I do, then those two functions are 2*i and 2*i+1 respectively.




With that last way of implementing the array for a bst, there is no way that the look at first sight is sorted ;), even that the nature of the bst is to be sorted. (Now Im thinking how like is a implementation of a bst in an array, there are now 2 posible way diferent, but still the same?, that is, tought a sorted array have the sorted way of the bst but dosent offer the left-right child-thing, the second way of implementation ofer the left-right-child of the nature of the bst, but dosent ofer at first sight the ordered nature of the bst ;)).
Posted on 2006-01-09 15:06:47 by rea
although no explicit balance conditions are imposed on the tree, each of these operations can be shown to use time O(lg n) on an n-element tree


http://www.ibr.cs.tu-bs.de/courses/ss98/audii/applets/BST/SplayTree-Example.html

Not exactly what you asked for but it may do the job you want and the animation looks good!

Paul.
Posted on 2006-01-09 16:21:53 by pdixon

Using a linked structure saves you memory because you don't start with an array of max size (or resizing your array). With the array structure you have to allocate enough memory to hold all the objects you want to add at the beginning, while in a linked structure you allocate memory as you go.

The end case memory usage of a linked BSP is not even 3/2 more than an array BSP

>>Array BSP with 100 objects in it BALANCED
Array 2*100+2 DWORDS long, each dword is a pointer to the data the object holds. The extra dwords are because the last object needs to have space for its children.
You'll need a byte array that holds the balance factor for each array element.
202*4+202
= 1010bytes

>>Linked BSP with 100 objects BALANCED
struct Obj
DWORD ptrToData
DWORD ptrToLeft
DWORD ptrToRight
BYTE balanceFactor ;;
endstruc
3*100*4+100
= 1300bytes

In the linked list form, the program would allocate the estimated amount of memory needed for the list at the start (eg. 2.5GB) on a machine with that RAM - each data element consists of 4bytes left pointer, 4 bytes right pointer, 16 bytes data.
Insert: Binary Search for the data. If you reach a node with a zero pointer (left or right, depending on whether the data to be inserted is less than or more than the data at that node), then set that pointer to the end of the list. Insert the new data at the end of the list. Reset the end of list pointer to the next available position.

In the array form with binary search and insertion sort, the program would allocate the estimated amount of memory needed at the start (eg. 1.7GB) - each data element consists of 16 bytes data.
Insert: Binary Search for the data. If the data is not found, insert the data at the current position, shifting everything else down. That is the routine I included above I guess that is effectively a bubble sort? Or is it an insertion sort. No matter. This is exponentially slower than the linked list version. And having done a binary search I know exactly where the new data needs to go in the sorted list, so I'm guessing that there is no other more effective way of getting it there. Even heap, shell, quick or other sort will not work faster.

What I want is
In the array form with binary search and insertion sort, the program would allocate the estimated amount of memory needed at the start (eg. 1.7GB) - each data element consists of 16 bytes data.
Insert: Binary Search for the data. If the data is not found. I guess there is no way to keep it balanced without essentially going through the same hassle of comparing with parent and sibling nodes and shifting things around, which works out to slower than insertion/bubble sort, because the implementation above needs no further compares.
Posted on 2006-01-10 18:17:28 by V Coder