Introduction
Monday, May 14, 2001
I have already talked about collections, and how the new ones are all subtly different from VBA.Collection. This time, I do a collection from scratch. It will allow subclasses to by typesafe. It will have keys. It will let you manipulate entries every which way. If a linked list implementation is OK with you, and you're obsessed with feature-richness, you might make this one your favourite base class.
You can see the full code in the abandoned Gregor.Lists project. The following code examples don't cut it, really.
Let me just recap here what features I'm going after:
- Support for keys.
- Optional (as opposed to required) keys.
- Unique keys (as opposed to ambiguous "names").
- Access by index.
- Entries are ordered meaningfully (unlike the dictionary family). The test here is iteration using For-Each.
- Iterating with For-Each gets the value (as opposed to the key/name; not a "DictionaryEntry" either).
None of the framework classes has all of them.
The links
Let's start with the links. A linked list stores information in links. Every link has a reference to the next link. If it references the previous link, too, we speak of a doubly-linked list. My links have a "Value" member that takes objects. You could speed things up a little if you create a custom link class for every collection task (you can do so by inheriting from CLink), but here let's be content with the wrapper classes' ensuring type safety.
Public Class CLink Private m_PrevLink As CLink Private m_Value As Object Private m_NextLink As CLink Public Overloads Sub New(ByVal value As Object) m_Value = value End Sub Public Overloads Sub New(ByVal value As Object, ByVal nextLink As CLink) m_Value = Value m_NextLink = nextLink m_NextLink.m_PrevLink = Me End Sub Public Overloads Sub New(ByVal prevLink As CLink, ByVal value As Object) m_PrevLink = prevLink m_PrevLink.m_NextLink = Me m_Value = value End Sub Public Overloads Sub New(ByVal prevLink As CLink, ByVal value As Object, ByVal nextLink As CLink) m_PrevLink = prevLink m_PrevLink.m_NextLink = Me m_Value = value m_NextLink = nextLink m_NextLink.m_PrevLink = Me End Sub Public Property PrevLink As CLink Get Return m_PrevLink End Get Set m_PrevLink = value End Set End Property Public Property Value As Object Get Return m_Value End Get Set m_Value = value End Set End Property Public Property NextLink As CLink Get Return m_NextLink End Get Set m_NextLink = value End Set End Property End Class
The references to the previous and next links are public; this allows wrapper classes to deal with lists efficiently. Wrappers need not expose the links themselves, of course. Another concern is, if you use linked lists in "raw" mode, you need to get at the references.
If you look at the constructors, you'll see that these links are already fairly usable. When creating a new link, it can easily be fitted into an existing list:
Dim monday As New CLink("Monday") Dim wednesday As New CLink(monday, "Wednesday") Dim tuesday As New CLink(monday, "Tuesday", wednesday)
Keys
The CLink class also has keys (which require four more constructor that aren't documented here), which are stored in a String field and exposed via a property. Keys are guarenteed to be unique by this method (which the key-aware constructors and the "Set" of the Key property call):
Public Function ExistsKey(ByVal sKey As String) As Boolean If Me.Key = sKey Then Return True Dim prev As CLink = Me.PrevLink, naext As CLink = Me.NextLink Do Until prev Is Nothing If StrComp(prev.Key, sKey, CompareMethod.Binary) = 0 Then Return True prev = prev.PrevLink Loop Do Until naext Is Nothing If StrComp(naext.Key, sKey, CompareMethod.Binary) = 0 Then Return True naext = naext.NextLink Loop End Function
Indices
You can also find out the position of a given link in the list (zero-based; you'll see why this property's name is so expressive about it, later):
Public ReadOnly Property IndexZB As Integer Get Dim prev As CLink = Me.PrevLink, i As Integer Do Until prev Is Nothing i += 1 prev = prev.PrevLink Loop Return i End Get End Property
The abstract wrapper class NList
This class encapsulates all the linked list logic, so it feels and smells like a real collection class, which is what we want, after all.
However, the class does not expose methods to add entries or to retrieve them; those are protected, so derived classes need to call them in their own "Add" or "Item" methods. What about "Shadows"? For one, the "Shadows" modifier, which hides a base class's method, is not implemented in Beta 1. Also, the main purpose of "Shadows" is ensure clients' compatibility with a derived class if the base class one day introduces a new member that conflicts with a name in the derived class, and I vow to respect this design principle for the time being. But the true reason to use the Shadows modifier sparingly is the idea of polymorphism. If clients have code that calls a method of the base class, this should also work on a derived class. So Shadows should be used with great caution. If we can't shadow these methods, can we override them instead? Now, overriding cannot change the return type (or arguments' types) of a method, it can only provide a new implementation. Again, it's about protecting the interface. Concluding, derived classes need to provide a few methods of their own (but it's not much work).
The basics
I'll just show you the constructor and the Add method here:
Public MustInherit Class NList Implements IEnumerable Protected m_First As CLink Protected m_Last As CLink Protected m_LowerBound As Integer Protected m_RequireKeys As Boolean Protected Overloads Sub New(ByVal fRequireKeys As Boolean) m_RequireKeys = fRequireKeys End Sub Protected Overloads Sub New(ByVal fRequireKeys As Boolean, ParamArray ByVal values() As Object) m_RequireKeys = fRequireKeys Dim i As Integer If fRequireKeys = True Then For i = 0 To UBound(values) Step 2 Me.BaseAdd(values(i), CType(values(i + 1), String)) Next i Else For i = 0 To UBound(values) Me.BaseAdd(values(i)) Next i End If End Sub Protected Overloads Sub BaseAdd(ByVal link As CLink) If m_RequireKeys = True And link.Key = "" Then Throw New ArgumentException("Must provide a key") End If If Not m_First Is Nothing And link.Key <> "" And m_First.ExistsKey(link.Key) Then Throw New ArgumentException("Key already exists") End If If m_First Is Nothing Then m_First = link m_Last = m_First Else link.PrevLink = m_Last m_Last.NextLink = link m_Last = link End If End Sub Protected Overloads Function BaseAdd(ByVal value As Object) As CLink Dim knew As New CLink(value) Me.BaseAdd(knew) Return knew End Function Protected Overloads Function BaseAdd(ByVal value As Object, ByVal sKey As String) As CLink Dim knew As New CLink(value, sKey) Me.BaseAdd(knew) Return knew End Function ' ... End Class
Inheritors can provide their own ParamArray-initializer, and it's easy to pass the array on to the base (as explained in Changes). When an instance of a derived class is created, you need to say whether you want keys to be required. All methods that take input will then make sure that keys are indeed provided, and that they're unique.
Indexing the list
Since NList does not keep an index for the links, but rather calculates them when needed, you can change the lower bound willy-nilly. This disadvantage is that indexed look-up is slow (using the BaseItem method). However iteration with the enumerator (For-Each; Do-Loop) does not suffer from this. Here's everything about bounds & counts:
Public Property LowerBound As Integer Get Return m_LowerBound End Get Set m_LowerBound = value End Set End Property Public ReadOnly Property UpperBound As Integer Get Return Me.Count + Me.LowerBound - 1 End Get End Property Public ReadOnly Property Count As Integer Get Dim e As New NListEnumerator(Me), c As Integer Do While e.MoveNext = True c += 1 Loop Return c End Get End Property
You can improve the speed of indexed loopup (using For-Next). Keep a field that remembers the last link returned, another one that holds the the difference between the last index requested and the lower bound at that time, and then, when the next index is requested, calculate the offset from the possibly changed lower bound against the next index requested, and substract both numbers, giving you an new offset by which you can iterate in the appropriate direction based on the last link returned. In any case, when iterating over the whole list, there is no advantage from using For-Next anyway. Just keep in mind that indices are calculated.
Other members
NList has a few other properties and methods that do most about everything you'll ever need. Some are public because the type of the elements isn't a concern while others are protected. Most of these members are overloaded. Check out the source in the zip file to find out more. Here's an overview:
- BaseFirst, BaseLast: These return the first and the last CLink object in the list, respectively.
- BaseItem: Gets or sets a CLink object by index (see above for a discussion about indexing) or by key. This property is protected.
- GetIndex, GetKey: Retrieve the index or the key of a link with a given key or at a given position, respectively. These values are based on CLink's properties, but since derived classes do not usually expose CLink objects, these methods are implemented in the base class for convenience. An exception is thrown when invalid input is passed.
- Exists: Tests if either a link with a given key or of a given index exists. Returns a Boolean, so you can use this for testing without catching an exception.
- BaseContains: You pass a value, and it tells you if it's in one of the list's links. Note that there is no actual value comparison for reference types. This one is protected, so inheritors can ensure type safety.
- BaseIndexOf, BaseLastIndexOf: These take a comparer delegate, so clients can search the list for entries (inheritors can provide the compare function in a transparent way for common compare operations). Since the list's bounds are adjustable, no limbo value (like negative one) is returned if the value isn't found; instead, an exception is thrown.
- Remove: Overloaded. Removes a link of a given key, a given index, or a span defined by starting and ending indices.
- BaseAdd, BaseInsertAt: These insert pre-created CLink objects or values at the end or at a given position. Links are not copied. Some overloaded variants of BaseInsertAt take a comparer delegate, so you can insert values into a sorted list without having to resort it (derived classes can provide a compare function behind the scenes to ease things for the client).
- BaseAddList, BaseInsertListAt: Pass a reference to an existing list, and it will be added or inserted. The list passed is not copied; it's rather like joining two lists.
- Shift: Lets you rearrange entries in the list. You can shift a single link or a portion of the list.
- CreateLink: An overloaded protected function that acts as a factory for CLink objects. Inheritors can override it in order to use their own links (that is, objects of a class derived from CLink) with additional properties.
- Filter: Pass a delegate to this method, and you'll be called back in order to check the value (type: Object) of every link. If you return False, the link will be removed.
- Sort: This one will sort the list; pass a delegate to be called back. It works somewhat similiar to a bubble sort. I've used a recursive algorithm.
- BaseCopy: I planned to add Cut and Copy methods to NList (to extract a portion from the list to a new list). However, this is really something that belongs in the derived classes' interfaces, because the type of the returned list is important. So instead I added the BaseCopy method to NList, which takes a reference to an instance of a derived list class (created by the inheritor) as a formal NList parameter, filling in the copied links. This way, it takes just a few lines of code in the derived class (for either the Copy or the Cut method), while type safety is guaranteed for the client. Check out NObjectList below for an example.
Several of these members use the DCompare and DCheck delegates. These aren't strongly typed (using Objects), but in a derived class, you can use different delegates and method signatures in case of BaseInsertAt, BaseIndexOf, and BaseLastIndexOf. Maybe both Sort and Filter should be rewritten as well (making them protected, thus allowing derived classes to use other types), but you can do that yourself. Note, however, that using comparer delegates isn't the standard way of determining object value equality - rather, one would call the (often overridden) Equals method of the objects pointed to by CLink's Value property, or a the redefined Equals method of the links themselves (which might just forward the call to the value object). I'll look into that, so the whole comparer story might be rewritten.
The enumerator
What's more interesting is the enumerator. Originally, NList's enumerator getter function was a MustOverride, in order to make sure that inheritors implement their own enumerator (NListEnumerator's Current property returns CLink objects, not values). I decided to change this. Now, GetEnumerator returns an NListEnumerator object, so users of derived classes can iterate over the links (it's somewhat similiar to a DictionaryEnumerator in that a special intermediate complex type [CLink] is used). However, the method can be overridden, so derived classes can use a more appropriate enumerator (i.e., one that iterates over the values):
Public Overridable Function GetEnumerator() As Collections.IEnumerator Implements IEnumerable.GetEnumerator Return New NListEnumerator(Me) End Function
This is different from other collection classes (like those in System.Collections), which implement an enumerator for you in a not-overidable way (the downside to this is explained in Collections). NList uses NListEnumerator internally. Derived classes can use it, too (it's instantiatable), but you might prefer to derive your own enumerator class from it since its Current method, as mentioned, returns a "raw" CLink object, not the value.
By the way, NListEnumerator is not implemented as an inner class, because it would be shown as a type in a of derived classes, which is not desirable; derived class should have their own enumerator class. You might wonder why can't we make the enumerator protected and extend it to a protected class? But VB.NET does not allow us to do that - it's in the docs, doesn't seem to be a Beta issue. Plus, a protected enumerators aren't what we want; client must be able to use them.
Anyway, here's NListEnumerator:
Public Class NListEnumerator Implements IEnumerator, ICloneable Protected m_List As NList Protected m_Current As CLink Protected m_Started As Boolean Public Sub New(ByVal list As NList) m_List = list End Sub Public ReadOnly Overridable Property Current As Object Implements IEnumerator.Current Get Return m_Current End Get End Property Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext If m_Started = False Then m_Current = m_List.First m_Started = True Else m_Current = m_Current.NextLink End If Return m_Current <> Nothing End Function Public Function MovePrevious() As Boolean If m_Started = False Then m_Current = m_List.Last m_Started = True Else m_Current = m_Current.PrevLink End If Return m_Current <> Nothing End Function Public Sub Reset() Implements IEnumerator.Reset m_Current = Nothing m_Started = False End Sub Public Overridable Function Clone() As Object Implements ICloneable.Clone Dim e As New NListEnumerator(m_List) e.m_Current = m_Current e.m_Started = m_Started Return e End Function End Class
Inheritors need to provide a constructor, override Current and Clone (possibly throwing a System.NotImplementedException) - that's it (see for an example of a derived enumerator below). If inheritors like, they can also iterate in extreme-raw mode using the protected First and Last properties of NList, both returning CLink objects.
Derived classes
Derived classes could implement ICollection, or perhaps even IList. NList doesn't do that because it's a design choice not to expose members like "Item" (type-safety). So you might want to do that, depending on you needs. You'll find some methods in the framework that take IList or ICollection as parameters; but you might as well get by with IEnumerable, which you get with NList (though you're not tight to its implementation). The examples below don't implement additional interfaces, but it's easy to do if need be. Implementing IList in particular has a downside, however: The interface specifies that return values and arguments of methods like Add or Remove be of type Object, which is at odds with the concept of a strongly typed collection.
So here are two examples of derived classes, NObjectList and NStringList. While the first one is very simple, the second one adds some stunning String stuff. The code might look a little verbose, but note that most it is optional.
NObjectList
Public Class NObjectList Inherits NList Public Overloads Sub New(ByVal fRequireKeys As Boolean) MyBase.New(fRequireKeys) End Sub Public Overloads Sub New(ByVal fRequireKeys As Boolean, ParamArray ByVal values() As Object) MyBase.New(fRequireKeys, values) End Sub Public Overloads Sub Add(ByVal value As Object) MyBase.BaseAdd(value) End Sub Public Overloads Sub Add(ByVal value As Object, ByVal sKey As String) MyBase.BaseAdd(value, sKey) End Sub Public Sub AddList(ByVal lst As NObjectList) MyBase.BaseInsertListAt(lst, Me.UpperBound + 1) End Sub Public Overloads Sub InsertAt(ByVal value As Object, ByVal index As Integer) MyBase.BaseInsertAt(value, index) End Sub Public Overloads Sub InsertAt(ByVal value As Object, ByVal index As Integer, ByVal sKey As String) MyBase.BaseInsertAt(value, index, sKey) End Sub Public Overloads Sub InsertAt(ByVal value As Object, ByVal comparer As NList.DCompare) MyBase.BaseInsertAt(value, comparer) End Sub Public Overloads Sub InsertAt(ByVal value As Object, ByVal comparer As Nlist.DCompare, ByVal sKey As String) MyBase.BaseInsertAt(value, comparer, sKey) End Sub Public Sub InsertListAt(ByVal lst As NObjectList, ByVal index As Integer) MyBase.BaseInsertListAt(lst, index) End Sub Public Function Contains(ByVal value As Object) As Boolean Return MyBase.BaseContains(value) End Function Public Function IndexOf(ByVal comparer As NList.DCompare, ByVal value As Object, ByVal nStart As Integer) As Integer Return MyBase.BaseIndexOf(comparer, value, nStart) End Function Public Function LastIndexOf(ByVal comparer As NList.DCompare, ByVal value As Object, ByVal nStart As Integer) As Integer Return MyBase.BaseLastIndexOf(comparer, value, nStart) End Function Public Function Copy(ByVal iFrom As Integer, ByVal iTo As Integer) As NObjectList Dim lst As New NObjectList(m_RequireKeys) MyBase.BaseCopy(iFrom, iTo, lst) Return lst End Function Public Function Cut(ByVal iFrom As Integer, ByVal iTo As Integer) As NObjectList Cut = Me.Copy(iFrom, iTo) MyBase.Remove(iFrom, iTo) End Function Public Default Overloads Property Item(ByVal index As Integer) As Object Get Return MyBase.BaseItem(index).Value End Get Set MyBase.BaseItem(index).Value = value End Set End Property Public Default Overloads Property Item(ByVal sKey As String) As Object Get Return MyBase.BaseItem(sKey).Value End Get Set MyBase.BaseItem(sKey).Value = value End Set End Property Public Overrides Function GetEnumerator As IEnumerator Return New NObjectListEnumerator(Me) End Function Public Class NObjectListEnumerator Inherits NListEnumerator Public Sub New(ByVal lst As NObjectList) MyBase.New(lst) End Sub Public Overrides ReadOnly Property Current() As System.Object Get Return m_Current.Value End Get End Property Public Overrides Function Clone() As System.Object Dim e As New NObjectListEnumerator(CType(m_List, NObjectList)) e.m_Current = m_Current e.m_Started = m_Started Return e End Function End Class End Class
NStringList
Here is NStringList, but I only show you some additional members:
Public Class NStringList Inherits NList Public Overloads Sub New(ByVal sFile As String) MyBase.New(False) Me.FromFile(sFile) End Sub ' ... Public Sub FromFile(ByVal sFile As String) Dim fil As File, sr As StreamReader, sLine As String Try fil = New File(sFile) sr = fil.OpenText Me.Clear sLine = sr.ReadLine Do Until sLine Is Nothing Me.Add(sLine) sLine = sr.ReadLine Loop Catch e As Exception Throw e Finally sr.Close End Try End Sub Public Sub ToFile(ByVal sFile As String) Dim sw As StreamWriter, enm As NStringListEnumerator Try sw = New StreamWriter(sFile) enm = CType(Me.GetEnumerator, NStringListEnumerator) Do While enm.MoveNext = True sw.WriteLine(CType(enm.Current, String)) Loop Catch e As Exception Throw e Finally sw.Close End Try End Sub Public Sub FromString(ByVal s As String, ParamArray ByVal separators() As Char) Dim aStr() As String, i As Integer aStr = s.Split(separators) For i = 0 To UBound(aStr) Me.Add(aStr(i)) Next i End Sub Public Function ToStringEx(ByVal separator As Char) As String Dim aStr(Me.Count) As String, i As Integer For i = 0 To Me.Count - 1 If i > 0 Then aStr(i) = separator aStr(i) &= Me(i) Next i Return New String(aStr) End Function End Class
Delegating to an NList inheritor?
This is not unheard of. Check out the CHistory class in the accompanying sample download. It works like the in my VB6Core component, but got a revamp. Maybe I'll chat about at some point, but not now.